Weekly Seminar – 02/23/2018 – Consensus-based distributed stochastic gradient descent method for fixed topology networks

This week Zhanhong Jiang of Mechanical Engineering will speak about Distributed Deep Learning Algorithms.
The abstract:
This talk focuses on developing novel distributed deep learning algorithms to solve challenging learning problems in domains such as robotic networks. Specifically, I will present a consensus-based distributed stochastic gradient descent method for fixed-topology networks. Although some previous work exists on this topic, data parallelism and distributed computation remain insufficiently explored. The proposed method can be used to tackle these issues.
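The abstract does not spell out the update rule, so the following is only a minimal sketch of one common form of consensus-based distributed SGD, not the speaker's algorithm. Each agent averages its parameter with its neighbors on a fixed graph and then takes a local (optionally noisy) gradient step. The function name `consensus_sgd`, the ring mixing matrix `RING4`, the scalar quadratic local losses, and the step size are all assumptions made for illustration.

```python
import random

# Hypothetical fixed topology: a ring of 4 agents. Each row holds the averaging
# weights an agent applies to itself and its two neighbors (doubly stochastic).
RING4 = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]

def consensus_sgd(mixing, targets, steps=200, lr=0.1, noise=0.0, seed=0):
    """Agent i holds the local loss 0.5 * (x - targets[i])**2 (an assumed toy loss).

    Every iteration combines a consensus (neighbor averaging) step with a
    local gradient step; `noise` adds Gaussian noise to mimic stochastic gradients.
    """
    rng = random.Random(seed)
    n = len(targets)
    x = [0.0] * n  # one scalar parameter per agent
    for _ in range(steps):
        # Consensus step: weighted average over the fixed neighbor set.
        mixed = [sum(mixing[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Local gradient at the agent's current estimate, plus optional noise.
        grads = [(x[i] - targets[i]) + noise * rng.gauss(0.0, 1.0) for i in range(n)]
        x = [mixed[i] - lr * grads[i] for i in range(n)]
    return x

if __name__ == "__main__":
    print(consensus_sgd(RING4, [1.0, 2.0, 3.0, 4.0]))
```

In this toy setting the network-wide average of the parameters approaches the minimizer of the summed loss (the mean of the targets), while with a constant step size the individual agents agree only approximately.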

Venue: 2222 Coover Hall

Time: 12.10 – 1.00PM Friday, 23 February.

Weekly Seminar – 02/16/2018 – Interpreting the Deep Learning Models

This week’s speaker for the DSRG seminar series is Aditya Balu from Mechanical Engineering.


Deep learning models can give very accurate predictions. However, unlike traditional shallow models, interpreting such a model’s performance is very difficult. In this talk, I shall discuss several approaches that people have used for interpreting these models. We shall also discuss applications of such methods in different domains, such as manufacturing.

Location: 2222 Coover Hall

Time: 12.00 Noon, Feb. 16th.

Notes: Slides for this seminar are available here.

Weekly Seminar – 02/02/2018 – Symmetry-Breaking Convergence Analysis

Gauri Jagatap from Dr. Hegde’s research group presented the paper “Symmetry-Breaking Convergence Analysis of Certain Two-layered Neural Networks with ReLU nonlinearity”. You can find the paper here: https://openreview.net/forum?id=Hk85q85ee


In this paper, the authors use a dynamical-systems approach to analyze the nonlinear weight dynamics of two-layered bias-free networks of the form g(x; w) = \sum_{j=1}^K \sigma(w_j^T x), where \sigma(.) is the ReLU nonlinearity. The input x is assumed to follow a Gaussian distribution.
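As a small illustration of the network form described above, here is a sketch in plain Python that evaluates g(x; w) as a sum of ReLU units, each applied to the inner product of one weight vector w_j with the input x. The function names `relu` and `two_layer_relu` are chosen here for illustration and do not come from the paper.

```python
def relu(z):
    """ReLU nonlinearity: max(z, 0)."""
    return z if z > 0.0 else 0.0

def two_layer_relu(x, weights):
    """Evaluate g(x; w) = sum_j relu(<w_j, x>) for a bias-free two-layer network.

    `x` is a list of d floats; `weights` is a list of K weight vectors of length d.
    """
    return sum(relu(sum(wi * xi for wi, xi in zip(w, x))) for w in weights)

if __name__ == "__main__":
    # K = 2 units whose weight vectors form an orthonormal basis of R^2.
    print(two_layer_relu([1.0, -2.0], [[1.0, 0.0], [0.0, 1.0]]))
```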

The authors show that for K = 1 (a single ReLU), the nonlinear dynamics can be written in closed form and converge to w* with probability at least (1-\epsilon)/2 if random weight initialization with a proper standard deviation (1/\sqrt{d}) is used, which verifies empirical practice.

For networks with many ReLU nodes (K >= 2), they apply their closed-form dynamics and prove that when the teacher parameters w*_j form an orthonormal basis,
(1) a symmetric weight initialization yields a convergence to a saddle point and
(2) a certain symmetry-breaking weight initialization yields global convergence to w* without local minima.

They claim that this is the first proof of global convergence in a nonlinear neural network without unrealistic assumptions on the independence of ReLU activations.

Location: 2222 Coover Hall

Time: 12.10 – 1 PM, Friday February 2nd.
Please find the notes of the talk here.

Weekly Seminar – 1/19/2018 and 1/26/2018 – Power of Gradient Descent

Invited talk by Dr. Chinmay Hegde of ECpE on:

“The power of gradient descent”

Many of the recent advances in machine learning can be attributed to two reasons: (i) more available data, and (ii) new and efficient optimization algorithms. Curiously, the simplest primitive from numerical analysis — gradient descent — is at the forefront of these newer ML techniques, even though the functions being optimized are often extremely non-smooth and/or non-convex.

In this series of chalk talks, I will discuss some recent theoretical advances that may shed light on why this is happening and how to properly approach the design of new training techniques.

– 12pm to 1pm, Friday, 19th and 26th January

– 2222, Coover Hall


Lecture notes are available here.

Spring 18 Seminar #1

After a hiatus of about five months, we’re finally back in action this semester, with a series of exciting talks lined up! Ardhendu Tripathy, a PhD student with Dr. Aditya Ramamoorthy, has volunteered to share his experience from his recent internship at MERL. Please find the details below:


In the first few minutes I will describe my internship experience with MERL in Summer 2017, followed by a short talk about the work that was done. The basic subject of the internship was privacy-preserving release of datasets. A report about it can be found at https://arxiv.org/abs/1712.07008

In the talk, I will describe the problem framework and show a tradeoff between privacy and utility in a case of synthetic data. This tradeoff can be closely attained by using adversarial neural networks. Following that I will visualize the performance on a contrived privacy problem on the MNIST dataset.

Please find the presentation slides accompanying the talk here.

12th January, Friday (tomorrow), 12pm-1pm.

2222, Coover Hall.

We’re also going to arrange for some refreshments! Join us!

Summer Seminar

With the ILAS 2017 meeting going on at Iowa State University, we had the privilege of inviting two young researchers to give talks to our audience.

Details of the first talk are as follows:

Date: 28 Jul 2017
2:00 PM – 3:30 PM

3043 ECpE Building Addition

Speaker: Ju Sun, Postdoctoral Research Fellow at Stanford University

Title: “When Are Nonconvex Optimization Problems Not Scary?”

For more details, check the department website.

Details for the second talk are as follows:

Date: 28 Jul 2017
3:45 PM – 5:00 PM

3043 ECpE Building Addition

Speaker: Ludwig Schmidt, PhD student at MIT

Title: Faster Constrained Optimization via Approximate Projections (tentative)

Refreshments (food and coffee) will be provided! Join us!

Weekly Seminar – 4/14/2017 – Low Rank and Sparse Signal Processing #4

Charlie Hubbard from Dr. Hegde’s research group will be giving a talk on “Parallel Methods for Matrix Completion”. Please note the venue; it is NOT our usual place.
Date: April 4th, 2017
Time: 3:00 – 4:00 pm
Venue: 3043 Coover Hall
Charlie’s abstract: As a graduate student, you don’t have time to search through the entire Netflix library for a movie you’ll like… you barely have time to watch a movie in the first place! Thankfully, Netflix excels at content recommendation: it can present you with twenty or so movies from its entire library that it knows you’ll enjoy watching (while you do homework). In recent years it has been shown that matrix completion can be a useful tool for content recommendation: given a sparse matrix of user-item ratings, matrix completion can be used to predict the unseen ratings. The problem for large-scale content providers, like Amazon and Netflix, is that the size of their user-item matrices (easily 100,000 x 10,000) makes most matrix completion approaches infeasible. In this talk I will discuss two scalable methods (Jellyfish and Hogwild!) for parallel matrix completion, a GPU-based implementation of Jellyfish, and preliminary results from an unnamed algorithm for parallel inductive matrix completion.
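To make the matrix completion idea concrete, here is a minimal serial sketch that fits a low-rank factorization to the observed ratings with stochastic gradient descent and then predicts an unseen entry. It does not reproduce Jellyfish, Hogwild!, or the inductive algorithm from the talk. The function names `complete_matrix` and `predict`, the toy ratings, and all hyperparameters are assumptions made for illustration.

```python
import random

def complete_matrix(observed, n_rows, n_cols, rank=2, lr=0.02, epochs=3000, seed=0):
    """Fit ratings[i][j] ~ <U[i], V[j]> using SGD over the observed entries only.

    `observed` maps (row, col) -> rating. Returns the factor matrices U and V.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    entries = list(observed.items())
    for _ in range(epochs):
        rng.shuffle(entries)  # visit the observed ratings in random order
        for (i, j), rating in entries:
            err = sum(U[i][k] * V[j][k] for k in range(rank)) - rating
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                # Gradient step on 0.5 * err**2 for this single entry.
                U[i][k] = u - lr * err * v
                V[j][k] = v - lr * err * u
    return U, V

def predict(U, V, i, j):
    """Predicted rating of item j by user i."""
    return sum(a * b for a, b in zip(U[i], V[j]))

if __name__ == "__main__":
    # Toy rank-1 ratings: the value at (2, 1) is hidden from training.
    ratings = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0, (1, 1): 4.0, (2, 0): 3.0}
    U, V = complete_matrix(ratings, n_rows=3, n_cols=2, rank=1)
    print(predict(U, V, 2, 1))
```

Each update touches only one row of U and one row of V, which is the property that makes updates of this kind attractive to run in parallel on large user-item matrices.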
Slides: MC