Power flow analysis plays an important role in power grid configurations, operating management and contingency analysis. The Newton-Raphson (NR) iterative method is often enlisted for solving power flow analysis problems. However, it involves computation-expensive matrix multiplications (MMs). In this paper we propose an FPGA-based Hierarchical-SIMD (H-SIMD) machine with its codesign of the Hierarchical Instruction Set Architecture (HISA) to speed up MM within each NR iteration. FPGA stands for Field-Programmable Gate Array. HISA is comprised of medium-grain and coarse-grain instructions. The H-SIMD machine also facilitates better mapping of MM onto recent multimillion-gate FPGAs. At each level, any HISA instruction is classified to be of either the communication or computation type. The former are executed by a controller while the latter are issued to lower levels in the hierarchy. Additionally, by using a memory switching scheme and the high-level HISA set to partition applications, the host-FPGA communication overheads can be hidden. Our test results show sustained high performance.