Norm-based Generalization Bounds for Sparse Neural Networks

Tomer Galanti, Liane Galanti, Mengjia Xu, Tomaso Poggio

Research output: Contribution to journalConference articlepeer-review

Abstract

In this paper, we derive norm-based generalization bounds for sparse ReLU neural networks, including convolutional neural networks. These bounds differ from previous ones because they consider the sparse structure of the neural network architecture and the norms of the convolutional filters, rather than the norms of the (Toeplitz) matrices associated with the convolutional layers. Theoretically, we demonstrate that these bounds are significantly tighter than standard norm-based generalization bounds. Empirically, they offer relatively tight estimations of generalization for various simple classification problems. Collectively, these findings suggest that the sparsity of the underlying target function and the model's architecture plays a crucial role in the success of deep learning.

Original languageEnglish (US)
JournalAdvances in Neural Information Processing Systems
Volume36
StatePublished - 2023
Event37th Conference on Neural Information Processing Systems, NeurIPS 2023 - New Orleans, United States
Duration: Dec 10 2023Dec 16 2023

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Fingerprint

Dive into the research topics of 'Norm-based Generalization Bounds for Sparse Neural Networks'. Together they form a unique fingerprint.

Cite this