TY - GEN
T1 - H-SIMD machine
T2 - 2005 IEEE International Conference on Computer Design: VLSI in Computers and Processors, ICCD 2005
AU - Xu, Xizhen
AU - Ziavras, Sotirios G.
N1 - Copyright 2008 Elsevier B.V., All rights reserved.
PY - 2005
Y1 - 2005
AB - FPGAs (Field-Programmable Gate Arrays) are often used as coprocessors to boost the performance of data-intensive applications [1, 2]. However, mapping algorithms onto multimillion-gate FPGAs is time-consuming and remains a challenge in configurable system design. The communication overhead between the host workstation and the FPGAs is also significant. To address these problems, we propose in this paper the FPGA-based Hierarchical-SIMD (H-SIMD) machine with its codesign of the Hierarchical Instruction Set Architecture (HISA). At each level, HISA instructions are classified into communication instructions or computation instructions. The former are executed by the local controller while the latter are issued to the lower level for execution. Additionally, by using a memory switching scheme and the high-level HISA set to partition the application into coarse-grain tasks, the host-FPGA communication overhead can be hidden. We enlist matrix multiplication (MM) to test the effectiveness of H-SIMD. The test results show sustained high performance.
UR - http://www.scopus.com/inward/record.url?scp=33645239602&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33645239602&partnerID=8YFLogxK
U2 - 10.1109/ICCD.2005.62
DO - 10.1109/ICCD.2005.62
M3 - Conference contribution
AN - SCOPUS:33645239602
SN - 0769524516
SN - 9780769524511
T3 - Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors
SP - 671
EP - 676
BT - Proceedings - 2005 IEEE International Conference on Computer Design
Y2 - 2 October 2005 through 5 October 2005
ER -