On Generalization Bounds for Neural Networks with Low Rank Layers

Andrea Pinto, Akshay Rangamani, Tomaso Poggio

Research output: Contribution to journal › Conference article › peer-review

Abstract

While previous optimization results have suggested that deep neural networks tend to favor low-rank weight matrices, the implications of this inductive bias on generalization bounds remain underexplored. In this paper, we apply a chain rule for Gaussian complexity (Maurer, 2016a) to analyze how low-rank layers in deep networks can prevent the accumulation of rank and dimensionality factors that typically multiply across layers. This approach yields generalization bounds for rank and spectral norm constrained networks. We compare our results to prior generalization bounds for deep networks, highlighting how deep networks with low-rank layers can achieve better generalization than those with full-rank layers. Additionally, we discuss how this framework provides new perspectives on the generalization capabilities of deep networks exhibiting neural collapse.
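As a rough illustration of why low rank caps the per-layer factors that multiply across depth (our own sketch under a simple product-of-Frobenius-norms bound, not the paper's Gaussian-complexity chain-rule theorem): for any matrix, the Frobenius norm is at most the spectral norm times the square root of the rank, so a rank constraint directly bounds the quantity that enters such products.

\[
\|W_i\|_F \le \sqrt{r_i}\,\|W_i\|_2,
\qquad\text{hence}\qquad
\frac{1}{\sqrt{n}}\prod_{i=1}^{L}\|W_i\|_F
\;\le\;
\frac{1}{\sqrt{n}}\prod_{i=1}^{L}\sqrt{r_i}\,\|W_i\|_2 .
\]

Here \(L\), \(n\), \(W_i\), \(r_i = \operatorname{rank}(W_i)\), and \(\|\cdot\|_2\) denote generic placeholders for depth, sample size, layer weight matrices, layer ranks, and spectral norm. Under a spectral norm constraint \(\|W_i\|_2 \le s_i\), the per-layer factor is \(\sqrt{r_i}\,s_i\) rather than \(\sqrt{d_i}\,s_i\) for a full-rank layer of width \(d_i\); the paper's actual bounds, derived via the chain rule for Gaussian complexity, take a different and sharper form.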

Original language: English (US)
Pages (from-to): 921-936
Number of pages: 16
Journal: Proceedings of Machine Learning Research
Volume: 272
State: Published - 2025
Event: 36th International Conference on Algorithmic Learning Theory, ALT 2025 - Milan, Italy
Duration: Feb 24 2025 to Feb 27 2025

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

Keywords

  • Gaussian complexity
  • Generalization bounds
  • Low rank layers
  • Neural collapse
