Gauri Jagatap from Dr. Hegde’s research group presented the paper “Symmetry-Breaking Convergence Analysis of Certain Two-layered Neural Networks with ReLU nonlinearity”. You can find the paper here: https://openreview.net/forum?id=Hk85q85ee

————————————————-

Abstract:

In this paper, the authors use a dynamical-systems approach to analyze the nonlinear weight dynamics of *two-layered bias-free networks* of the form g(x; w) = \sum_{j=1}^K \sigma(w_j^T x), where \sigma(.) is the ReLU nonlinearity. The input x is assumed to follow a Gaussian distribution.

The authors show that for K = 1 (a single ReLU), the nonlinear dynamics can be written in closed form and converge to w* with probability at least (1-\epsilon)/2, provided the weights are randomly initialized with a suitable standard deviation (1/\sqrt{d}). This agrees with common empirical practice.
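As a rough illustration of the K = 1 setting, the sketch below runs plain gradient descent on a single-ReLU student against a single-ReLU teacher, with Gaussian inputs and an initialization of standard deviation 1/\sqrt{d}. It uses an empirical (sampled) gradient rather than the closed-form dynamics from the paper, and the function names, step size, sample count, and squared-loss objective are all assumptions made for this example. Because convergence is only guaranteed with some probability, a given seed may or may not end near w*.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def empirical_gradient(w, w_star, X):
    """Gradient of 0.5 * mean((relu(X w) - relu(X w_star))**2) with respect to w."""
    pre = X @ w
    residual = relu(pre) - relu(X @ w_star)
    active = (pre > 0).astype(X.dtype)  # ReLU gating of the student
    return X.T @ (residual * active) / X.shape[0]

def train_single_relu(d=10, n_samples=2000, steps=500, lr=0.1, seed=0):
    """Hypothetical training loop; all hyperparameters are illustrative."""
    rng = np.random.default_rng(seed)
    w_star = rng.standard_normal(d)
    w_star /= np.linalg.norm(w_star)          # unit-norm teacher (assumption)
    w = rng.standard_normal(d) / np.sqrt(d)   # initialization with std 1/sqrt(d)
    for _ in range(steps):
        X = rng.standard_normal((n_samples, d))  # fresh Gaussian inputs each step
        w = w - lr * empirical_gradient(w, w_star, X)
    return w, w_star

if __name__ == "__main__":
    w, w_star = train_single_relu()
    print("distance to teacher:", np.linalg.norm(w - w_star))
```

Running it over several seeds and counting how often the final distance is small gives an informal feel for the probabilistic nature of the K = 1 result.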

For networks with many ReLU nodes (K >= 2), they apply their closed-form dynamics and prove that when the teacher parameters w*_j form an orthonormal basis,

(1) a symmetric weight initialization yields convergence to a saddle point and

(2) a certain symmetry-breaking weight initialization yields global convergence to w* without local minima.
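To make the role of symmetry concrete, here is a minimal sketch for K >= 2 with the teacher weights taken to be the standard basis (one particular orthonormal basis). The parameterization w_j = x e_j + y \sum_{j' != j} e_{j'} used in `structured_init` is an assumption chosen for illustration and may not match the paper's exact construction; the loss, step size, and sample count are likewise hypothetical. With x = y every student node starts identical, so every node receives the same gradient and the nodes can never separate to match distinct teachers.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def network(X, W):
    """g(x; w) = sum_j relu(w_j^T x) for each row x of X; W holds one w_j per row."""
    return relu(X @ W.T).sum(axis=1)

def gradient(W, W_star, X):
    """Gradient of 0.5 * mean((g(X; W) - g(X; W_star))**2) with respect to W."""
    residual = network(X, W) - network(X, W_star)           # shape (n,)
    active = (X @ W.T > 0).astype(X.dtype)                  # shape (n, K)
    return (active * residual[:, None]).T @ X / X.shape[0]  # shape (K, d)

def structured_init(K, x, y):
    """Hypothetical init: w_j = x * e_j + y * (sum of the other basis vectors)."""
    return y * np.ones((K, K)) + (x - y) * np.eye(K)

def run(W0, steps=200, lr=0.05, n_samples=5000, seed=0):
    """Full-batch gradient descent on a fixed Gaussian sample (illustrative)."""
    rng = np.random.default_rng(seed)
    K = W0.shape[0]
    W_star = np.eye(K)                       # orthonormal teacher weights
    X = rng.standard_normal((n_samples, K))
    W = W0.copy()
    for _ in range(steps):
        W = W - lr * gradient(W, W_star, X)
    return W, W_star

if __name__ == "__main__":
    for name, (x, y) in {"symmetric": (0.3, 0.3), "symmetry-breaking": (0.5, 0.1)}.items():
        W, W_star = run(structured_init(3, x, y))
        print(name, "distance to teacher:", np.linalg.norm(W - W_star))
```

Comparing the two printed distances shows how the choice of x and y affects the outcome in this toy setup; it is an empirical check only and does not reproduce the paper's proof.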

They claim that this is the first proof of global convergence in a nonlinear neural network that does not rely on unrealistic assumptions about the independence of ReLU activations.

————————————————-

**Location: 2222 Coover Hall**

**Time: 12:10 – 1 PM, Friday, February 2nd.**