Spark-Based Large-Scale Matrix Inversion for Big Data Processing

Jun Liu, Yang Liang, Nirwan Ansari

Research output: Contribution to journal › Article › peer-review

42 Scopus citations


Matrix inversion is a fundamental operation for solving linear equations in many computational applications, especially in emerging big data applications. However, it is challenging to invert large-scale matrices of extremely high order (several thousands or millions), which are common in most Web-scale systems, such as social networks and recommendation systems. In this paper, we present a lower-upper (LU) decomposition-based block-recursive algorithm for large-scale matrix inversion. We present its implementation on the Spark parallel computing platform, with an optimized data structure, reduced space complexity, and effective matrix multiplication. The experimental evaluation results show that the proposed algorithm inverts large-scale matrices efficiently on a cluster composed of commodity servers and scales to even larger matrices. The proposed algorithm and implementation are intended as a solid foundation for building a high-performance linear algebra library on Spark for big data processing and applications.
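To make the idea of LU-based block-recursive inversion concrete, the following is a minimal single-machine sketch in plain Python. It is not the authors' Spark implementation: it assumes no pivoting is needed (for example, a diagonally dominant matrix), keeps all blocks in nested lists rather than distributed data structures, and every function name is hypothetical. It splits the matrix into four blocks, factors the blocks recursively, and forms the inverse as the product of the inverted upper and lower factors.

```python
# Illustrative sketch only: block-recursive LU inversion without pivoting.
# All helper names are hypothetical; nothing here is distributed.

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

def split(a, h):
    # Return the four blocks A11, A12, A21, A22 with A11 of size h x h.
    return ([r[:h] for r in a[:h]], [r[h:] for r in a[:h]],
            [r[:h] for r in a[h:]], [r[h:] for r in a[h:]])

def join(a11, a12, a21, a22):
    top = [left + right for left, right in zip(a11, a12)]
    bottom = [left + right for left, right in zip(a21, a22)]
    return top + bottom

def invert_lower(l):
    # Inverse of a lower-triangular matrix, computed block-recursively:
    # [[L11, 0], [L21, L22]]^-1 = [[L11^-1, 0], [-L22^-1 L21 L11^-1, L22^-1]]
    n = len(l)
    if n == 1:
        return [[1.0 / l[0][0]]]
    h = n // 2
    l11, _, l21, l22 = split(l, h)
    i11, i22 = invert_lower(l11), invert_lower(l22)
    i21 = [[-x for x in row] for row in matmul(matmul(i22, l21), i11)]
    return join(i11, zeros(h, n - h), i21, i22)

def invert_upper(u):
    # An upper-triangular matrix is the transpose of a lower-triangular one.
    return transpose(invert_lower(transpose(u)))

def lu(a):
    # Block-recursive LU factorization A = L U (assumes no pivoting is needed).
    n = len(a)
    if n == 1:
        return [[1.0]], [[float(a[0][0])]]
    h = n // 2
    a11, a12, a21, a22 = split(a, h)
    l11, u11 = lu(a11)
    u12 = matmul(invert_lower(l11), a12)
    l21 = matmul(a21, invert_upper(u11))
    l22, u22 = lu(sub(a22, matmul(l21, u12)))  # factor the Schur complement
    lower = join(l11, zeros(h, n - h), l21, l22)
    upper = join(u11, u12, zeros(n - h, h), u22)
    return lower, upper

def invert(a):
    # A^-1 = U^-1 L^-1
    lower, upper = lu(a)
    return matmul(invert_upper(upper), invert_lower(lower))
```

In this sketch, almost all of the work reduces to block matrix multiplications, which is the kind of operation that a parallel platform can distribute. A practical implementation would also need pivoting for numerical stability, which is omitted here.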

Original language: English (US)
Article number: 7440788
Pages (from-to): 2166-2176
Number of pages: 11
Journal: IEEE Access
State: Published - 2016

All Science Journal Classification (ASJC) codes

  • General Engineering
  • General Computer Science
  • General Materials Science


Keywords

  • LU decomposition
  • Matrix inversion
  • Spark
  • distributed computing
  • linear algebra
  • parallel algorithm

