Abstract
Performing highly accurate representation learning on a high-dimensional and sparse (HiDS) matrix is of great significance in big data-related applications such as recommender systems. A latent factor (LF) model is one of the most efficient approaches to HiDS matrix representation. However, an LF model's representation learning ability relies heavily on an HiDS matrix's known data density, which is extremely low due to numerous missing data entries. To address this issue, this work proposes a prediction-sampling-based multilayer-structured LF (PMLF) model with twofold ideas: 1) constructing a loosely connected multilayered LF architecture to increase the known data density of an input HiDS matrix by generating synthetic data layer by layer and 2) constraining this synthetic data generating process through a random prediction-sampling strategy and nonlinear activations to avoid overfitting. In the experiments, PMLF is compared with six state-of-the-art LF- and deep neural network (DNN)-based models on four HiDS matrices from industrial applications. The results demonstrate that PMLF outperforms its peers in balancing prediction accuracy and computational efficiency.
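The two ideas named in the abstract can be illustrated with a toy sketch. This is not the authors' PMLF implementation: the abstract gives no algorithmic detail, so everything below is an assumption made for illustration. The function names (`train_lf`, `squash`, `sample_predictions`, `multilayer_lf`), the plain SGD-trained LF model, the uniform random choice of unknown cells, the sigmoid-style activation, and all hyperparameters are hypothetical. The sketch only shows the general pattern of training an LF model, sampling a few of its predictions as synthetic entries, and passing the denser data to the next layer.

```python
import numpy as np

def train_lf(entries, n_rows, n_cols, rank=4, lr=0.02, reg=0.05, epochs=200, seed=0):
    """Fit a basic regularized LF model with SGD on (row, col, value) triples."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_rows, rank))  # row latent factors
    Q = rng.normal(scale=0.1, size=(n_cols, rank))  # column latent factors
    for _ in range(epochs):
        for idx in rng.permutation(len(entries)):
            u, i, r = entries[idx]
            err = r - P[u] @ Q[i]
            pu = P[u].copy()  # keep the old row factor for the column update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
    return P, Q

def squash(x, lo, hi):
    """Assumed nonlinear activation: a sigmoid rescaled to the range (lo, hi)."""
    mid = (lo + hi) / 2.0
    return lo + (hi - lo) / (1.0 + np.exp(-(x - mid)))

def sample_predictions(P, Q, known, n_samples, lo, hi, rng):
    """Pick unknown cells uniformly at random and fill them with squashed predictions."""
    # Enumerating every unknown cell is only feasible for a toy matrix.
    unknown = [(u, i) for u in range(P.shape[0]) for i in range(Q.shape[0])
               if (u, i) not in known]
    if not unknown:
        return []
    picks = rng.choice(len(unknown), size=min(n_samples, len(unknown)), replace=False)
    return [(unknown[k][0], unknown[k][1],
             float(squash(P[unknown[k][0]] @ Q[unknown[k][1]], lo, hi)))
            for k in picks]

def multilayer_lf(entries, n_rows, n_cols, n_layers=3, samples_per_layer=5, seed=0):
    """Stack LF models; each layer passes denser (partly synthetic) data to the next."""
    rng = np.random.default_rng(seed)
    values = [r for _, _, r in entries]
    lo, hi = min(values), max(values)
    data = list(entries)
    for layer in range(n_layers):
        P, Q = train_lf(data, n_rows, n_cols, seed=seed + layer)
        if layer < n_layers - 1:  # the last layer only produces the final model
            known = {(u, i) for u, i, _ in data}
            data = data + sample_predictions(P, Q, known, samples_per_layer, lo, hi, rng)
    return P, Q, data

# Toy 4 x 4 matrix with 6 known entries (made-up values).
observed = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
            (2, 2, 1.0), (3, 1, 2.0), (3, 3, 4.0)]
P, Q, augmented = multilayer_lf(observed, n_rows=4, n_cols=4)
print(len(observed), "known entries grew to", len(augmented))
```

Sampling only a few predictions per layer, and bounding them to the observed value range, is one plausible way to read "constraining this synthetic data generating process"; the published model may constrain it differently.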
Original language | English (US) |
---|---|
Pages (from-to) | 3845-3858 |
Number of pages | 14 |
Journal | IEEE Transactions on Neural Networks and Learning Systems |
Volume | 35 |
Issue number | 3 |
State | Published - Mar 1 2024 |
All Science Journal Classification (ASJC) codes
- Software
- Computer Science Applications
- Computer Networks and Communications
- Artificial Intelligence
Keywords
- Deep forest
- deep learning
- generalized multilayer structure
- high-dimensional and sparse (HiDS) data
- latent factor (LF) model
- missing data estimation