HeteroSVD: Efficient SVD Accelerator on Versal ACAP with Algorithm-Hardware Co-Design

Bibliographic Details
Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-7
Main Authors Luan, Xinya; Lin, Zhe; Shi, Kai; Zhai, Jianwang; Zhao, Kang
Format Conference Proceeding
Language English
Published IEEE 22.06.2025
DOI 10.1109/DAC63849.2025.11132878

Abstract Singular value decomposition (SVD) is a matrix factorization technique widely used in applications such as signal processing and recommendation systems. In general, the time complexity of SVD algorithms is cubic in the problem size, which makes it difficult for them to meet stringent real-time performance requirements. Existing FPGA and GPU solutions fall short of jointly optimizing latency, throughput, and power consumption. To address this issue, this paper proposes HeteroSVD, a heterogeneous reconfigurable accelerator for SVD computation on the Versal ACAP platform. HeteroSVD introduces a system-level SVD decomposition mechanism and an algorithm-hardware co-design method that jointly optimizes SVD ordering and the AI Engine (AIE)-centric dataflow and placement on Versal. Furthermore, to improve the quality of results (QoR) and facilitate micro-architecture selection, we introduce an automatic optimization framework that performs accurate performance modeling and fast design space exploration. Experimental results demonstrate that HeteroSVD reduces latency by 1.98× compared with existing FPGA accelerators and outperforms GPU solutions by up to 7.22× in latency, 1.77× in throughput, and 13.18× in energy efficiency.
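
Illustrative note (not part of the record): the abstract does not state which SVD algorithm HeteroSVD implements. As a rough illustration of why SVD cost grows cubically and what a pair "ordering" can mean, the sketch below assumes a generic one-sided (Hestenes) Jacobi method with a simple row-cyclic pair order. The function name `one_sided_jacobi_svd` and its parameters are hypothetical, and the code should not be read as the authors' design. Each sweep visits O(n^2) column pairs and does O(m) work per pair, which gives roughly cubic work per sweep.

```python
import math


def one_sided_jacobi_svd(a, sweeps=30, tol=1e-12):
    """Generic one-sided Jacobi SVD sketch for an m x n list-of-lists matrix.

    Returns (sigma, cols, v): sigma[j] is the norm of the j-th rotated
    column (a singular value, unsorted), cols[j] is that column (U[:, j]
    scaled by sigma[j]), and v[j] is the j-th column of V.
    """
    m, n = len(a), len(a[0])
    # Store the matrix column-wise so each rotation touches two lists.
    cols = [[float(a[i][j]) for i in range(m)] for j in range(n)]
    v = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]

    for _ in range(sweeps):
        rotated = False
        # Row-cyclic ordering of column pairs (p, q); this is an assumed
        # ordering chosen for simplicity, other orderings are possible.
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(x * x for x in cols[p])
                beta = sum(x * x for x in cols[q])
                gamma = sum(x * y for x, y in zip(cols[p], cols[q]))
                # Skip pairs that are already numerically orthogonal.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                # Rotation angle that makes columns p and q orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (
                    abs(zeta) + math.sqrt(1.0 + zeta * zeta)
                )
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same plane rotation to A's columns and to V.
                for vecs in (cols, v):
                    xp, xq = vecs[p], vecs[q]
                    vecs[p] = [c * x - s * y for x, y in zip(xp, xq)]
                    vecs[q] = [s * x + c * y for x, y in zip(xp, xq)]
        if not rotated:
            break  # every pair was orthogonal: converged

    sigma = [math.sqrt(sum(x * x for x in col)) for col in cols]
    return sigma, cols, v


if __name__ == "__main__":
    sigma, cols, v = one_sided_jacobi_svd([[1, 2], [3, 4], [5, 6]])
    print(sorted(sigma, reverse=True))
```

Because rotations on disjoint column pairs are independent, Jacobi-type methods are often considered a natural fit for parallel hardware; whether and how HeteroSVD exploits this is described in the paper itself, not in this record.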
Author Luan, Xinya
Shi, Kai
Lin, Zhe
Zhai, Jianwang
Zhao, Kang
Author_xml – sequence: 1
  givenname: Xinya
  surname: Luan
  fullname: Luan, Xinya
  email: luanxinya@bupt.edu.cn
  organization: Beijing University of Posts and Telecommunications
– sequence: 2
  givenname: Zhe
  surname: Lin
  fullname: Lin, Zhe
  email: linzh235@mail.sysu.edu.cn
  organization: Sun Yat-sen University
– sequence: 3
  givenname: Kai
  surname: Shi
  fullname: Shi, Kai
  email: shikai@bupt.edu.cn
  organization: Beijing University of Posts and Telecommunications
– sequence: 4
  givenname: Jianwang
  surname: Zhai
  fullname: Zhai, Jianwang
  email: zhaijw@bupt.edu.cn
  organization: Beijing University of Posts and Telecommunications
– sequence: 5
  givenname: Kang
  surname: Zhao
  fullname: Zhao, Kang
  email: zhaokang@bupt.edu.cn
  organization: Beijing University of Posts and Telecommunications
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/DAC63849.2025.11132878
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEL(IEEE/IET Electronic Library )
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 9798331503048
EndPage 7
ExternalDocumentID 11132878
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  funderid: 10.13039/501100001809
IEDL.DBID RIE
IngestDate Wed Oct 01 07:05:15 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 7
ParticipantIDs ieee_primary_11132878
PublicationCentury 2000
PublicationDate 2025-June-22
PublicationDateYYYYMMDD 2025-06-22
PublicationDate_xml – month: 06
  year: 2025
  text: 2025-June-22
  day: 22
PublicationDecade 2020
PublicationTitle 2025 62nd ACM/IEEE Design Automation Conference (DAC)
PublicationTitleAbbrev DAC
PublicationYear 2025
Publisher IEEE
Publisher_xml – name: IEEE
Score 2.2975335
SourceID ieee
SourceType Publisher
StartPage 1
SubjectTerms Field programmable gate arrays
Graphics processing units
Recommender systems
Search problems
Signal processing
Signal processing algorithms
Singular value decomposition
Space exploration
Throughput
Time complexity
Title HeteroSVD: Efficient SVD Accelerator on Versal ACAP with Algorithm-Hardware Co-Design
URI https://ieeexplore.ieee.org/document/11132878
hasFullText 1
inHoldings 1
isFullTextHit
isPrint