Adaptively-Accelerated Parallel Stochastic Gradient Descent for High-Dimensional and Incomplete Data Representation Learning

Wen Qin, Xin Luo, Meng Chu Zhou

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

High-dimensional and incomplete (HDI) interactions among numerous nodes are commonly encountered in Big Data-related applications, such as user-item interactions in a recommender system. Owing to its high efficiency and flexibility, a stochastic gradient descent (SGD) algorithm enables efficient latent feature analysis (LFA) of HDI data for precise representation, thereby supporting knowledge-acquisition tasks such as missing-data estimation. However, LFA on HDI data is a bilinear problem, which makes SGD-based LFA an inherently sequential process: the update to one feature affects the updates to the others, and altering the update sequence can change the training results. Therefore, a parallel SGD algorithm for LFA must be designed with care. Existing parallel SGD-based LFA models suffer from a) a low degree of parallelization and b) slow convergence, which significantly restrict their scalability. To address these issues, this paper presents an Adaptively-accelerated Parallel Stochastic Gradient Descent (AP-SGD) algorithm for LFA that: a) establishes a novel local minimum-based data splitting and scheduling scheme to reduce the scheduling cost among threads, thereby achieving a high degree of parallelization; and b) incorporates the adaptive momentum method into the learning scheme, thereby accelerating convergence by making the learning rate and acceleration coefficient self-adaptive. The convergence of the resulting AP-SGD-based LFA model is theoretically proved. Experimental results on three HDI matrices generated by real industrial applications demonstrate that the AP-SGD-based LFA model outperforms state-of-the-art parallel SGD-based LFA models in both missing-data estimation accuracy and degree of parallelization. Hence, it has the potential to efficiently represent HDI data in industrial scenarios.
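For intuition only, the sketch below shows the kind of update the abstract describes: an SGD step for matrix-factorization-style LFA that visits only the observed entries of a sparse HDI matrix and adds a momentum term with an adaptively scaled step size. The function name, hyper-parameters, and the AdaGrad-like adaptation rule are assumptions made for illustration; this is not the paper's exact AP-SGD scheme, and its local minimum-based parallel data splitting and scheduling is omitted entirely.

```python
import numpy as np

def apsgd_sketch(triples, num_rows, num_cols, k=8, epochs=20,
                 base_lr=0.05, momentum=0.9, reg=0.02, seed=0):
    """Illustrative single-threaded LFA sketch on an HDI matrix given as
    (row, col, value) triples of its known entries.

    Assumption: only the learning rate is adapted here (AdaGrad-style);
    in AP-SGD both the learning rate and the acceleration coefficient are
    self-adaptive, and updates are scheduled across threads.
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((num_rows, k))   # row (e.g., user) latent features
    Q = 0.1 * rng.standard_normal((num_cols, k))   # column (e.g., item) latent features
    vP, vQ = np.zeros_like(P), np.zeros_like(Q)    # momentum buffers
    gP = np.full_like(P, 1e-8)                     # accumulated squared gradients
    gQ = np.full_like(Q, 1e-8)

    for _ in range(epochs):
        for i, j, r in triples:                    # only observed entries are visited
            err = r - P[i] @ Q[j]                  # prediction error on one known entry
            grad_p = -err * Q[j] + reg * P[i]      # regularized instant gradients
            grad_q = -err * P[i] + reg * Q[j]
            gP[i] += grad_p ** 2                   # adapt per-feature step sizes
            gQ[j] += grad_q ** 2
            vP[i] = momentum * vP[i] - (base_lr / np.sqrt(gP[i])) * grad_p
            vQ[j] = momentum * vQ[j] - (base_lr / np.sqrt(gQ[j])) * grad_q
            P[i] += vP[i]                          # momentum-accelerated update
            Q[j] += vQ[j]
    return P, Q

if __name__ == "__main__":
    # Tiny synthetic HDI example: a few observed entries of a 4 x 5 matrix.
    observed = [(0, 0, 5.0), (0, 2, 3.0), (1, 1, 4.0),
                (2, 3, 2.0), (3, 4, 4.5), (1, 4, 1.0)]
    P, Q = apsgd_sketch(observed, num_rows=4, num_cols=5)
    print("estimate of missing entry (0, 1):", float(P[0] @ Q[1]))
```

Because each SGD step touches only one row of P and one row of Q, steps that involve disjoint rows and columns can in principle run concurrently; the paper's local minimum-based splitting and scheduling scheme exploits this structure to assign such non-conflicting blocks of known entries to threads with low coordination cost.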

Original language: English (US)
Pages (from-to): 92-107
Number of pages: 16
Journal: IEEE Transactions on Big Data
Volume: 10
Issue number: 1
DOIs
State: Published - Feb 1 2024

All Science Journal Classification (ASJC) codes

  • Information Systems and Management
  • Information Systems

Keywords

  • Parallel algorithm
  • data science
  • high-dimensional and incomplete data
  • industrial application
  • latent feature analysis
  • parallelization
  • shared-memory
  • stochastic gradient descent
