TY - GEN
T1 - Max-PIM
T2 - 58th ACM/IEEE Design Automation Conference, DAC 2021
AU - Zhang, Fan
AU - Angizi, Shaahin
AU - Fan, Deliang
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/12/5
Y1 - 2021/12/5
N2 - In-DRAM computing has recently become a promising technique for addressing the notorious 'memory-wall' issue in big data processing. In this work, we propose, for the first time, a novel 'Min/Max-in-memory' algorithm based on iterative XNOR bit-wise comparison, which supports parallel in-memory searching for the minimum and maximum of bulk data stored in DRAM as unsigned and signed integers, fixed-point numbers, and floating-point numbers. We then develop a new processing-in-DRAM architecture, called Max-PIM, that supports complete bit-wise Boolean logic and beyond. Unlike prior works, Max-PIM is optimized with a fast one-cycle XNOR logic-in-DRAM operation and in-memory data transpose, both of which are heavily used and key to efficiently accelerating the proposed Min/Max-in-memory algorithm. Extensive experiments using Max-PIM in big data sorting and graph processing applications show that it can achieve speedups of 50X and 1000X over GPU and CPU, while consuming only 10% and 1% of their energy, respectively. Moreover, compared with recent representative in-DRAM computing platforms, i.e., Ambit [1] and DRISA [2], our design can achieve a 3X-10X speedup.
AB - In-DRAM computing has recently become a promising technique for addressing the notorious 'memory-wall' issue in big data processing. In this work, we propose, for the first time, a novel 'Min/Max-in-memory' algorithm based on iterative XNOR bit-wise comparison, which supports parallel in-memory searching for the minimum and maximum of bulk data stored in DRAM as unsigned and signed integers, fixed-point numbers, and floating-point numbers. We then develop a new processing-in-DRAM architecture, called Max-PIM, that supports complete bit-wise Boolean logic and beyond. Unlike prior works, Max-PIM is optimized with a fast one-cycle XNOR logic-in-DRAM operation and in-memory data transpose, both of which are heavily used and key to efficiently accelerating the proposed Min/Max-in-memory algorithm. Extensive experiments using Max-PIM in big data sorting and graph processing applications show that it can achieve speedups of 50X and 1000X over GPU and CPU, while consuming only 10% and 1% of their energy, respectively. Moreover, compared with recent representative in-DRAM computing platforms, i.e., Ambit [1] and DRISA [2], our design can achieve a 3X-10X speedup.
KW - IMC
KW - In-DRAM Computing
KW - Min/Max
KW - PIM
UR - http://www.scopus.com/inward/record.url?scp=85119406880&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119406880&partnerID=8YFLogxK
U2 - 10.1109/DAC18074.2021.9586096
DO - 10.1109/DAC18074.2021.9586096
M3 - Conference contribution
AN - SCOPUS:85119406880
T3 - Proceedings - Design Automation Conference
SP - 211
EP - 216
BT - 2021 58th ACM/IEEE Design Automation Conference, DAC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 5 December 2021 through 9 December 2021
ER -