Instruction Fusion for Multiscalar and Many-Core Processors

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


The utilization wall, caused by the breakdown of threshold voltage scaling, hinders performance gains in new-generation microprocessors. We propose an instruction fusion technique for multiscalar and many-core processors to alleviate its impact. With instruction fusion, similar copies of an instruction that would run on multiple pipelines or cores are merged into a single copy for simultaneous execution. Applied to vector code, instruction fusion lets the processor idle early pipeline stages and instruction caches at various times during program execution with minimal performance degradation, while also reducing program size and the required instruction memory bandwidth. Instruction fusion is applied here to a MIPS-based dual-core processor that resembles an ideal multiscalar of degree two. Benchmarking on an FPGA prototype shows a 6–11% reduction in dynamic power dissipation for the targeted applications and a 17–45% decrease in code size, with frequent performance improvements due to higher instruction cache hit rates.
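As a rough illustration of the merging idea described above, the following is a minimal toy sketch, not the authors' implementation. It assumes two cores running equal-length instruction streams in lock step, each with its own register file, and it treats instructions as "similar" only when their text is identical. The names `fuse_streams` and `code_size_reduction` and the instruction strings are hypothetical.

```python
from typing import List, Tuple

# Hypothetical toy model: each core's program is a list of instruction strings.
# An instruction is fused only when both cores hold identical text at the same
# position; the paper's actual similarity criterion may differ.
FusedEntry = Tuple[str, Tuple[int, ...]]


def fuse_streams(core0: List[str], core1: List[str]) -> List[FusedEntry]:
    """Merge two lock-step streams into (instruction, target_cores) entries."""
    if len(core0) != len(core1):
        raise ValueError("this sketch assumes equal-length lock-step streams")
    fused: List[FusedEntry] = []
    for a, b in zip(core0, core1):
        if a == b:
            # One stored copy, fetched once and executed on both cores.
            fused.append((a, (0, 1)))
        else:
            # Differing instructions stay as separate per-core entries.
            fused.append((a, (0,)))
            fused.append((b, (1,)))
    return fused


def code_size_reduction(core0: List[str], core1: List[str]) -> float:
    """Fraction of instruction slots saved by fusion in this toy model."""
    original = len(core0) + len(core1)
    return 1 - len(fuse_streams(core0, core1)) / original


if __name__ == "__main__":
    c0 = ["lw r1, 0(r4)", "add r3, r1, r2", "sw r3, 0(r5)", "addi r6, r6, 1"]
    c1 = ["lw r1, 0(r4)", "add r3, r1, r2", "sw r3, 0(r5)", "addi r6, r6, 2"]
    for instr, cores in fuse_streams(c0, c1):
        print(cores, instr)
    print(code_size_reduction(c0, c1))
```

In this model, each fused entry is fetched and decoded once for both cores, which is how fewer stored instructions and fewer instruction fetches could arise. The reduction it computes depends entirely on the made-up input streams and says nothing about the paper's measured results.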

Original language: English (US)
Pages (from-to): 67-78
Number of pages: 12
Journal: International Journal of Parallel Programming
Issue number: 1
State: Published - Feb 1 2017

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Information Systems


Keywords

  • Benchmarking
  • Instruction fusion
  • Many-core processor
  • Superscalar

