Weekly Seminar – 02/02/2018 – Symmetry-Breaking Convergence Analysis

Gauri Jagatap from Dr. Hegde’s research group presented the paper “Symmetry-Breaking Convergence Analysis of Certain Two-layered Neural Networks with ReLU nonlinearity”. You can find the paper here: https://openreview.net/forum?id=Hk85q85ee


In this paper, the authors use tools from dynamical systems to analyze the nonlinear weight dynamics of two-layered bias-free networks of the form g(x; w) = \sum_{j=1}^K \sigma(w_j^T x), where \sigma(.) is the ReLU nonlinearity. The input x is assumed to follow a Gaussian distribution.
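
To make the network form concrete, here is a minimal NumPy sketch of the forward pass, reading the formula as ReLU applied to the inner product of each w_j with x and then summed over the K hidden nodes. The names `relu`, `g`, and the (K, d) layout of `W` are choices made for this illustration, not notation from the paper.

```python
import numpy as np

def relu(z):
    """ReLU nonlinearity sigma(z) = max(z, 0), applied elementwise."""
    return np.maximum(z, 0.0)

def g(x, W):
    """Bias-free two-layer network: sum_j relu(w_j . x).

    W has shape (K, d), one row per hidden node; x has shape (d,).
    """
    return relu(W @ x).sum()

# Hand-checkable example with K = d = 3 and W equal to the identity.
W = np.eye(3)
x = np.array([1.0, -2.0, 3.0])
print(g(x, W))  # 4.0, since relu([1, -2, 3]) = [1, 0, 3]

# Inputs in the analysis are Gaussian; a sample could be drawn like this.
rng = np.random.default_rng(0)
x_gauss = rng.standard_normal(3)
print(g(x_gauss, W))
```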

The authors show that for K = 1 (a single ReLU), the nonlinear dynamics can be written in closed form and converge to w* with probability at least (1-\epsilon)/2, provided the weights are randomly initialized with a suitable standard deviation (about 1/\sqrt{d}, where d is the input dimension). This is consistent with empirical practice.
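
The K = 1 setting can be explored numerically. The sketch below is an empirical stand-in rather than the paper's closed-form dynamics: it assumes a squared loss between a student ReLU and a teacher ReLU with weights w*, and runs plain gradient descent on a fixed batch of Gaussian samples. The function names, step size, sample count, and number of steps are illustrative assumptions.

```python
import numpy as np

def init_weights(d, rng):
    """Random initialization with standard deviation 1/sqrt(d)."""
    return rng.normal(loc=0.0, scale=1.0 / np.sqrt(d), size=d)

def train_single_relu(w0, w_star, X, lr=0.5, steps=300):
    """Gradient descent on the empirical loss
    0.5 * mean((relu(X w) - relu(X w*))^2), an assumed squared loss."""
    w = np.array(w0, dtype=float)
    target = np.maximum(X @ w_star, 0.0)      # teacher outputs
    for _ in range(steps):
        pre = X @ w                           # student pre-activations
        residual = np.maximum(pre, 0.0) - target
        # ReLU gate (pre > 0) passes the residual only through active samples.
        grad = (X * ((pre > 0) * residual)[:, None]).mean(axis=0)
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
d = 5
w_star = np.zeros(d)
w_star[0] = 1.0                               # teacher weights (illustrative)
X = rng.standard_normal((4000, d))            # Gaussian inputs

w = train_single_relu(init_weights(d, rng), w_star, X)
print("distance to w*:", np.linalg.norm(w - w_star))
```

Because the guarantee is probabilistic, a single random start may or may not end up close to w*; repeating the run over many seeds gives a rough empirical success rate to compare against the bound.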

For networks with many ReLU nodes (K >= 2), they apply their closed-form dynamics and prove that when the teacher parameters w*_j form an orthonormal basis,
(1) a symmetric weight initialization yields a convergence to a saddle point and
(2) a certain symmetry-breaking weight initialization yields global convergence to w* without local minima.
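
One intuition for why symmetry matters can be seen in a simplified experiment. The sketch below uses the crudest possible symmetric start, where all K student rows are identical; this is not necessarily the symmetric initialization analyzed in the paper. Identical rows receive identical gradients, so they can never separate to match K distinct orthonormal teachers. The squared loss, step size, and all names are assumptions made for illustration.

```python
import numpy as np

def grad_step(W, W_star, X, lr=0.1):
    """One gradient step on 0.5 * mean((g(x; W) - g(x; W*))^2)."""
    pre = X @ W.T                                   # (N, K) pre-activations
    out = np.maximum(pre, 0.0).sum(axis=1)          # student output
    target = np.maximum(X @ W_star.T, 0.0).sum(axis=1)
    residual = out - target                         # (N,)
    gate = (pre > 0).astype(float)                  # ReLU gates, (N, K)
    grad = (gate * residual[:, None]).T @ X / X.shape[0]   # (K, d)
    return W - lr * grad

rng = np.random.default_rng(0)
d = K = 3
W_star = np.eye(d)                                  # orthonormal teachers
X = rng.standard_normal((2000, d))                  # Gaussian inputs

# Symmetric start: every student row is the same vector.
W_sym = np.tile(rng.normal(scale=1 / np.sqrt(d), size=d), (K, 1))
# Generic start: independent random rows.
W_rand = rng.normal(scale=1 / np.sqrt(d), size=(K, d))

for _ in range(50):
    W_sym = grad_step(W_sym, W_star, X)
    W_rand = grad_step(W_rand, W_star, X)

# Spread between rows: stays at (numerically) zero for the symmetric start.
print("row spread, symmetric start:", np.ptp(W_sym, axis=0).max())
print("row spread, random start:   ", np.ptp(W_rand, axis=0).max())
```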

They claim that this is the first proof of global convergence for a nonlinear neural network that does not rely on unrealistic assumptions about the independence of ReLU activations.

Location: 2222 Coover Hall

Time: 12:10 – 1:00 PM, Friday, February 2nd.
Please find the notes of the talk here.