Guaranteed Convergence of Training Convolutional Neural Networks via Accelerated Gradient Descent

Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

In this paper, we study the regression problem of training a one-hidden-layer non-overlapping convolutional neural network (ConvNN) with rectified linear unit (ReLU) activation functions. Given a set of training data containing the inputs (feature vectors) and outputs (labels), the outputs are assumed to be generated from a ConvNN with unknown weights, and our goal is to recover the ground-truth weights by minimizing a non-convex optimization problem whose objective function is the empirical loss. We prove that if the inputs follow a Gaussian distribution, this optimization problem can be solved by the accelerated gradient descent (AGD) algorithm given a well-designed initial point and a sufficient number of samples, and the AGD iterates converge linearly to the ground-truth weights.
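As a rough illustration of the setting described in the abstract (not the paper's exact algorithm, initialization scheme, or constants), the following NumPy sketch plants a ground-truth filter, generates labels from a one-hidden-layer non-overlapping ConvNN on Gaussian inputs, and runs AGD with heavy-ball-style momentum, one common instantiation of accelerated gradient descent. The patch count, patch size, sample count, step size, momentum, and perturbation-based initialization are all illustrative assumptions.

```python
import numpy as np

# Minimal sketch of the abstract's setting. A one-hidden-layer non-overlapping
# ConvNN with a shared filter w applies ReLU to each of K disjoint input
# patches and averages: y = (1/K) * sum_k ReLU(w . x_k).

rng = np.random.default_rng(0)
K, m, N = 4, 8, 5000              # patches, patch size, samples (illustrative)
w_star = rng.standard_normal(m)   # unknown ground-truth filter

X = rng.standard_normal((N, K, m))             # Gaussian inputs, split into patches
y = np.maximum(X @ w_star, 0.0).mean(axis=1)   # noise-free labels from w_star

def loss_and_grad(w):
    z = X @ w                                  # pre-activations, shape (N, K)
    r = np.maximum(z, 0.0).mean(axis=1) - y    # residuals of the empirical loss
    # Gradient of f(w) = (1/2N) sum_n r_n^2; ReLU contributes 1{z > 0} per patch.
    g = (r[:, None, None] * (z > 0)[:, :, None] * X).mean(axis=(0, 1))
    return 0.5 * np.mean(r**2), g

# The paper assumes a well-designed initial point; as a stand-in, this sketch
# simply starts from a small perturbation of the ground truth.
w = w_star + 0.3 * rng.standard_normal(m)
w_prev = w.copy()
eta, beta = 1.0, 0.7              # step size and momentum, hand-tuned here

for t in range(300):
    v = w + beta * (w - w_prev)   # momentum extrapolation
    _, g = loss_and_grad(v)
    w_prev, w = w, v - eta * g    # accelerated gradient descent update

print("filter error after AGD:", np.linalg.norm(w - w_star))
```

Under these assumptions, the filter error decays geometrically across iterations, which mirrors the paper's stated guarantee of linear convergence of the iterates to the ground-truth weights.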

Original language: English (US)
Title of host publication: 2020 54th Annual Conference on Information Sciences and Systems, CISS 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728140841
State: Published - Mar 2020
Externally published: Yes
Event: 54th Annual Conference on Information Sciences and Systems, CISS 2020 - Princeton, United States
Duration: Mar 18, 2020 - Mar 20, 2020

Publication series

Name: 2020 54th Annual Conference on Information Sciences and Systems, CISS 2020

Conference

Conference: 54th Annual Conference on Information Sciences and Systems, CISS 2020
Country/Territory: United States
City: Princeton
Period: 3/18/20 - 3/20/20

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Artificial Intelligence

Keywords

  • accelerated gradient descent
  • convolutional neural networks
  • global optimality
  • linear convergence
