A super-programming technique for large sparse matrix multiplication on PC clusters

Dejiang Jin, Sotirios G. Ziavras

Research output: Contribution to journal › Article › peer-review

9 Scopus citations


The multiplication of large sparse matrices is a basic operation in many scientific and engineering applications. There exist some high-performance library routines for this operation. They are often optimized based on the target architecture. For a parallel environment, it is essential to partition the entire operation into well-balanced tasks and assign them to individual processing elements. Most of the existing techniques partition the given matrices based on some kind of workload estimation. For irregular sparse matrices on PC clusters, however, the workloads may not be well estimated in advance. Any approach other than run-time dynamic partitioning may degrade performance. In this paper, we apply our super-programming approach [24] to parallel large matrix multiplication on PC clusters. In our approach, tasks are partitioned into super-instructions that are dynamically assigned to member computer nodes. Thus, the load balancing logic is separated from the computing logic; the former is taken over by the runtime environment. Our super-programming approach facilitates ease of program development and targets high efficiency in dynamic load balancing. Workloads can be balanced effectively and the optimization overhead is small. The results prove the viability of our approach.
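The general idea of run-time dynamic assignment described in the abstract can be illustrated with a small sketch. This is not the authors' super-programming runtime: it is a single-machine analogy in which threads stand in for cluster nodes, a shared queue stands in for the runtime environment, and each row block of the left matrix plays the role of one coarse task. The dict-of-dicts sparse format and the names `multiply_row_block`, `dynamic_spmm`, `block_size`, and `num_workers` are assumptions made only for this illustration.

```python
import queue
import threading


def multiply_row_block(a_rows, b):
    """Computing logic only: multiply a block of sparse rows of A by sparse B.

    Matrices are dicts of the form {row: {col: value}} (illustrative format).
    """
    out = {}
    for i, row in a_rows.items():
        acc = {}
        for k, a_ik in row.items():
            # Only nonzero entries of row k of B contribute.
            for j, b_kj in b.get(k, {}).items():
                acc[j] = acc.get(j, 0) + a_ik * b_kj
        if acc:
            out[i] = acc
    return out


def dynamic_spmm(a, b, block_size=2, num_workers=3):
    """Load-balancing logic only: workers pull row blocks from a shared queue."""
    tasks = queue.Queue()
    row_ids = sorted(a)
    for start in range(0, len(row_ids), block_size):
        tasks.put({i: a[i] for i in row_ids[start:start + block_size]})

    result = {}
    lock = threading.Lock()

    def worker():
        # A worker that finishes a cheap block simply fetches the next one,
        # so no workload estimate is needed in advance.
        while True:
            try:
                block = tasks.get_nowait()
            except queue.Empty:
                return
            partial = multiply_row_block(block, b)
            with lock:
                result.update(partial)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


a = {0: {0: 1, 2: 2}, 1: {1: 3}, 3: {0: 4}}
b = {0: {1: 5}, 1: {0: 6}, 2: {1: 7}}
product = dynamic_spmm(a, b)
```

The multiplication routine knows nothing about scheduling, and the scheduling loop knows nothing about sparse arithmetic, which mirrors the separation of load balancing from computing logic that the abstract describes. On a real cluster the queue would be replaced by a distributed mechanism, which this sketch does not attempt to model.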

Original language: English (US)
Pages (from-to): 1774-1780
Number of pages: 7
Journal: IEICE Transactions on Information and Systems
Issue number: 7
State: Published - Jul 2004

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence


Keywords

  • Load balancing
  • Matrix multiplication
  • PC cluster
  • Performance evaluation
  • Programming model

