TY - GEN
T1 - Deep Neural Network Acceleration in Non-Volatile Memory
T2 - 15th IEEE/ACM International Symposium on Nanoscale Architectures, NANOARCH 2019
AU - Angizi, Shaahin
AU - Fan, Deliang
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/7
Y1 - 2019/7
N2 - Recent algorithmic developments have achieved competitive classification accuracy for neural networks despite constraining the network parameters to ternary or binary representations. These findings reveal significant optimization opportunities to replace computationally intensive, multiplication-based convolution operations with more efficient and less complex operations such as addition. In the hardware implementation domain, processing-in-memory architectures are becoming a promising solution to alleviate the enormous, energy-hungry data communication between memory and processing units, bringing considerable improvements in system performance and energy efficiency when running such large networks. In this paper, we review several of our recent works on Processing-in-Memory (PIM) accelerators based on Magnetic Random Access Memory computational sub-arrays that accelerate the inference mode of quantized neural networks using digital non-volatile memory rather than analog crossbar operation. In this way, we investigate the performance of two distinct in-memory addition schemes against other digital methods based on processing-in-DRAM/GPU/ASIC designs to tackle the DNN power and memory-wall bottlenecks.
AB - Recent algorithmic developments have achieved competitive classification accuracy for neural networks despite constraining the network parameters to ternary or binary representations. These findings reveal significant optimization opportunities to replace computationally intensive, multiplication-based convolution operations with more efficient and less complex operations such as addition. In the hardware implementation domain, processing-in-memory architectures are becoming a promising solution to alleviate the enormous, energy-hungry data communication between memory and processing units, bringing considerable improvements in system performance and energy efficiency when running such large networks. In this paper, we review several of our recent works on Processing-in-Memory (PIM) accelerators based on Magnetic Random Access Memory computational sub-arrays that accelerate the inference mode of quantized neural networks using digital non-volatile memory rather than analog crossbar operation. In this way, we investigate the performance of two distinct in-memory addition schemes against other digital methods based on processing-in-DRAM/GPU/ASIC designs to tackle the DNN power and memory-wall bottlenecks.
KW - Deep neural network acceleration
KW - In-memory computing
KW - Magnetic Random Access Memory
UR - http://www.scopus.com/inward/record.url?scp=85084950947&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084950947&partnerID=8YFLogxK
U2 - 10.1109/NANOARCH47378.2019.181297
DO - 10.1109/NANOARCH47378.2019.181297
M3 - Conference contribution
AN - SCOPUS:85084950947
T3 - NANOARCH 2019 - 15th IEEE/ACM International Symposium on Nanoscale Architectures, Proceedings
BT - NANOARCH 2019 - 15th IEEE/ACM International Symposium on Nanoscale Architectures, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 17 July 2019 through 19 July 2019
ER -