TY - JOUR
T1 - Aligner-D
T2 - Leveraging In-DRAM Computing to Accelerate DNA Short Read Alignment
AU - Zhang, Fan
AU - Angizi, Shaahin
AU - Sun, Jiao
AU - Zhang, Wei
AU - Fan, Deliang
N1 - Publisher Copyright: © 2011 IEEE.
PY - 2023/3/1
Y1 - 2023/3/1
AB - DNA short-read alignment has become a major sequential bottleneck in processing the huge amounts of data generated by next-generation sequencing platforms. In this paper, an energy-efficient and high-throughput Processing-in-Memory (PIM) accelerator based on DRAM (named Aligner-D) is presented to execute DNA short-read alignment with the state-of-the-art Burrows-Wheeler Transform (BWT) alignment algorithm. We first present a PIM design that exploits DRAM's high internal parallelism and throughput, converting each DRAM array into a potent processing unit for alignment tasks. The proposed Aligner-D can efficiently execute the bulk bit-wise XNOR-based matching operation required by the alignment task with an overhead of only 3 transistors per column. We then introduce a highly parallel, customized BWT-based read alignment algorithm that supports both exact and inexact matching. Next, we present how to map the correlated data of the alignment task so as to make maximum use of the parallelism offered by both the new hardware and the algorithm. The experimental results demonstrate that Aligner-D obtains ∼4×, ∼2.45×, ∼3.26×, and ∼1.65× improvement compared with other in-memory computing platforms: Ambit (Seshadri et al., 2017), DRISA-1T1C (Li et al., 2017), DRISA-3T1C (Li et al., 2017), and ReDRAM (Angizi and Fan, 2019), respectively. For DNA short-read alignment, Aligner-D boosts the alignment throughput per watt by ∼20104×, ∼3522×, ∼927×, ∼88×, ∼5.28×, and ∼2.34× over ReCAM, CPU, GPU, FPGA, Ambit, and DRISA, respectively.
KW - DNA short read alignment
KW - DRAM
KW - accelerator
KW - processing-in-memory
UR - http://www.scopus.com/inward/record.url?scp=85148420827&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85148420827&partnerID=8YFLogxK
U2 - 10.1109/JETCAS.2023.3241545
DO - 10.1109/JETCAS.2023.3241545
M3 - Article
AN - SCOPUS:85148420827
SN - 2156-3357
VL - 13
SP - 332
EP - 343
JO - IEEE Journal on Emerging and Selected Topics in Circuits and Systems
JF - IEEE Journal on Emerging and Selected Topics in Circuits and Systems
IS - 1
ER -