UpDLRM: Accelerating Personalized Recommendation using Real-World PIM Architecture

Sitian Chen, Haobin Tan, Amelie Chi Zhou*, Yusen Li, Pavan Balaji

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding; Conference proceeding; peer-review

Abstract

Deep Learning Recommendation Models (DLRMs) have gained popularity in recommendation systems due to their effectiveness in handling large-scale recommendation tasks. The embedding layers of DLRMs have become the performance bottleneck because of their intensive demands on memory capacity and memory bandwidth. In this paper, we propose UpDLRM, which uses real-world processing-in-memory (PIM) hardware, the UPMEM DPU, to boost memory bandwidth and reduce recommendation latency. The parallel nature of the DPU memory can provide high aggregate bandwidth for the large number of irregular memory accesses in embedding lookups, thus offering great potential to reduce inference latency. To fully utilize the DPU memory bandwidth, we further study the embedding table partitioning problem to achieve good workload balance and efficient data caching. Evaluations using real-world datasets show that UpDLRM achieves much lower inference time for DLRMs than both CPU-only and CPU-GPU hybrid counterparts.
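
As a rough illustration of the two ideas the abstract names, the sketch below shows a pooled embedding lookup (the irregular, data-dependent row accesses) and a simple greedy way to spread table rows across memory partitions by access frequency. It is not the UpDLRM partitioning algorithm and does not use the UPMEM SDK; the function names `embedding_bag_sum` and `greedy_balanced_partition`, the access counts, and the greedy heuristic are all assumptions made for illustration.

```python
import heapq
from typing import List, Sequence


def embedding_bag_sum(table: Sequence[Sequence[float]],
                      indices: Sequence[int]) -> List[float]:
    """Gather the rows named by `indices` and sum them element-wise."""
    dim = len(table[0])
    pooled = [0.0] * dim
    for idx in indices:  # irregular, data-dependent row accesses
        row = table[idx]
        for d in range(dim):
            pooled[d] += row[d]
    return pooled


def greedy_balanced_partition(access_counts: Sequence[int],
                              num_partitions: int) -> List[List[int]]:
    """Hypothetical heuristic: place rows, hottest first, on the partition
    with the smallest accumulated access count so far."""
    heap = [(0, p) for p in range(num_partitions)]  # (load, partition id)
    heapq.heapify(heap)
    partitions: List[List[int]] = [[] for _ in range(num_partitions)]
    hottest_first = sorted(range(len(access_counts)),
                           key=lambda r: -access_counts[r])
    for row in hottest_first:
        load, p = heapq.heappop(heap)
        partitions[p].append(row)
        heapq.heappush(heap, (load + access_counts[row], p))
    return partitions


if __name__ == "__main__":
    table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(embedding_bag_sum(table, [0, 2]))

    counts = [10, 1, 1, 8, 2]  # made-up per-row access frequencies
    print(greedy_balanced_partition(counts, 2))
```

In this toy setting each partition stands in for one DPU's memory bank; balancing the summed access counts is one plausible proxy for the workload balance the abstract refers to, and it ignores the caching aspect entirely.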
Original language: English
Title of host publication: DAC '24: Proceedings of the 61st ACM/IEEE Design Automation Conference
Publisher: Association for Computing Machinery (ACM)
Number of pages: 6
ISBN (Electronic): 9798400706011
Publication status: Published - 7 Nov 2024
Event: 61st ACM/IEEE Design Automation Conference, DAC 2024 - San Francisco, United States
Duration: 23 Jun 2024 to 27 Jun 2024
https://dl.acm.org/doi/proceedings/10.1145/3649329 (Conference proceedings)
https://www.dac.com/

Publication series

Name: DAC: Design Automation Conference

Conference

Conference: 61st ACM/IEEE Design Automation Conference, DAC 2024
Abbreviated title: DAC 2024
Country/Territory: United States
City: San Francisco
Period: 23/06/24 to 27/06/24
