A Prediction-Sampling-Based Multilayer-Structured Latent Factor Model for Accurate Representation to High-Dimensional and Sparse Data

Di Wu, Xin Luo, Yi He, Mengchu Zhou

Research output: Contribution to journal › Article › peer-review

42 Scopus citations

Abstract

Performing highly accurate representation learning on a high-dimensional and sparse (HiDS) matrix is of great significance in big data-related applications such as recommender systems. A latent factor (LF) model is one of the most efficient approaches to HiDS matrix representation. However, an LF model's representation learning ability relies heavily on the HiDS matrix's known data density, which is extremely low because most entries are missing. To address this issue, this work proposes a prediction-sampling-based multilayer-structured LF (PMLF) model built on two ideas: 1) constructing a loosely connected multilayered LF architecture that increases the known data density of an input HiDS matrix by generating synthetic data layer by layer, and 2) constraining this synthetic data generation process through a random prediction-sampling strategy and nonlinear activations to avoid overfitting. In the experiments, PMLF is compared with six state-of-the-art LF- and deep neural network (DNN)-based models on four HiDS matrices from industrial applications. The results demonstrate that PMLF outperforms its peers in balancing prediction accuracy against computational efficiency.
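The layer-by-layer idea described in the abstract can be illustrated with a toy sketch. This is not the authors' PMLF implementation: the abstract gives no equations, so the SGD-trained LF model, the sigmoid-based activation, and every name and parameter below (`train_lf`, `squash`, `multilayer_lf`, `sample_rate`, `rank`, and so on) are assumptions made purely for illustration. Each layer fits an LF model to the currently known entries, predicts a random sample of the missing entries, passes those predictions through a bounded nonlinear activation, and hands the denser data to the next layer.

```python
import math
import random


def train_lf(entries, n_rows, n_cols, rank=4, epochs=200, lr=0.02, reg=0.05, rng=None):
    """Fit a basic latent factor model to {(i, j): value} with regularized SGD."""
    rng = rng or random.Random(0)
    P = [[rng.uniform(0, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(0, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    items = list(entries.items())
    for _ in range(epochs):
        rng.shuffle(items)
        for (i, j), value in items:
            err = value - predict(P, Q, i, j)
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return P, Q


def predict(P, Q, i, j):
    """Predicted entry (i, j) is the inner product of the two factor vectors."""
    return sum(p * q for p, q in zip(P[i], Q[j]))


def squash(x, lo, hi):
    """Assumed activation: a sigmoid centred on the range midpoint, scaled to [lo, hi]."""
    z = max(-50.0, min(50.0, x - (lo + hi) / 2))  # clamp to keep exp() well-behaved
    return lo + (hi - lo) / (1 + math.exp(-z))


def multilayer_lf(known, n_rows, n_cols, layers=3, sample_rate=0.2,
                  lo=1.0, hi=5.0, seed=0):
    """Stack LF layers; each layer adds sampled synthetic entries for the next one."""
    rng = random.Random(seed)
    data = dict(known)  # original known entries are never overwritten
    densities = []
    for layer in range(layers):
        densities.append(len(data) / (n_rows * n_cols))
        P, Q = train_lf(data, n_rows, n_cols, rng=rng)
        if layer == layers - 1:
            break  # the last layer's factors are the final model
        # Enumerating every missing cell is only viable for a toy matrix.
        missing = [(i, j) for i in range(n_rows) for j in range(n_cols)
                   if (i, j) not in data]
        n_samples = int(sample_rate * len(missing))
        # Random prediction sampling: only a subset of predictions is kept.
        for i, j in rng.sample(missing, n_samples):
            data[(i, j)] = squash(predict(P, Q, i, j), lo, hi)
    return P, Q, data, densities
```

Calling `multilayer_lf(known, n_rows, n_cols)` on a small dictionary of known ratings returns the final factor matrices, the augmented entry set, and the known data density seen by each layer. In this sketch, sampling only a fraction of the predictions and bounding them with the activation is a stand-in for the overfitting constraint the abstract mentions; the paper's actual strategy may differ.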

Original language: English (US)
Pages (from-to): 3845-3858
Number of pages: 14
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 35
Issue number: 3
State: Published - Mar 1 2024

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence

Keywords

  • Deep forest
  • deep learning
  • generalized multilayer structure
  • high-dimensional and sparse (HiDS) data
  • latent factor (LF) model
  • missing data estimation
