Although half a century has passed since Frank Rosenblatt's original work on multilayer perceptrons, modern artificial neural networks are still surprisingly similar to his original ideas.
In this talk, I will question one of their most fundamental design aspects. As networks have become much deeper than was possible, or even imagined, in the 1950s, it is no longer clear that the layer-by-layer connectivity pattern is a well-suited architectural choice. In the first part of the talk I will show that randomly removing layers during training can speed up the training process, make it more robust, and ultimately lead to better generalization. We refer to this process as learning with stochastic depth, as the effective depth of the network varies for each minibatch. In the second part of the talk I will propose an alternative connectivity pattern, dense connectivity, which is inspired by the insights obtained from stochastic depth. Dense connectivity leads to substantial reductions in parameter counts, faster convergence, and further improvements in generalization. Finally, I will investigate the question of why deep neural networks are so well suited for natural images and provide evidence that they linearize the underlying sub-manifold into a Euclidean feature space.
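The two ideas in the abstract can be sketched in a few lines. The following is a minimal NumPy toy, not the actual implementations from the talk: the block structure, the linearly decaying survival schedule, the random weights, and the layer widths are all illustrative assumptions. In stochastic depth, each residual block is dropped with some probability during training (so the network's effective depth varies per minibatch) and kept, scaled by its survival probability, at test time; in dense connectivity, each layer receives the concatenation of all preceding feature maps.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, W):
    # Hypothetical stand-in for a residual block: one linear map + ReLU.
    return np.maximum(0, x @ W)

def stochastic_depth_forward(x, weights, p_final, train=True):
    """Forward pass through residual blocks with stochastic depth.

    During training, block l survives with probability p_l (here decaying
    linearly with depth from ~1 down to p_final); a dropped block reduces
    to the identity shortcut, so effective depth varies per minibatch.
    At test time every block is kept, scaled by its survival probability.
    """
    L = len(weights)
    for l, W in enumerate(weights):
        p_l = 1.0 - (l + 1) / L * (1.0 - p_final)  # linear decay
        if train:
            if rng.random() < p_l:
                x = x + residual_block(x, W)  # block active
            # else: identity shortcut only, block skipped this minibatch
        else:
            x = x + p_l * residual_block(x, W)  # expected contribution
    return x

def dense_forward(x, layer_widths, w_rng):
    """Dense connectivity toy: each layer sees all earlier feature maps
    concatenated (weights drawn randomly here purely for illustration)."""
    features = [x]
    for width in layer_widths:
        inp = np.concatenate(features, axis=-1)
        W = w_rng.standard_normal((inp.shape[-1], width))
        features.append(np.maximum(0, inp @ W))
    return np.concatenate(features, axis=-1)
```

Note how the dense version's input width grows with every layer while each layer itself can stay narrow; this is the source of the parameter savings mentioned above.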
Kilian Weinberger is an Associate Professor in the Department of Computer Science at Cornell University. He received his Ph.D. in Machine Learning from the University of Pennsylvania under the supervision of Lawrence Saul, and his undergraduate degree in Mathematics and Computer Science from the University of Oxford. During his career he has won several best paper awards at ICML (2004), CVPR (2004, 2017), AISTATS (2005), and KDD (2014, runner-up award). In 2011 he was awarded the Outstanding AAAI Senior Program Chair Award, and in 2012 he received an NSF CAREER Award. He was elected co-Program Chair for ICML 2016 and for AAAI 2018.
In 2016 he was the recipient of the Daniel M. Lazar '29 Excellence in Teaching Award. Kilian Weinberger's research focuses on machine learning and its applications, in particular learning under resource constraints, metric learning, machine-learned web-search ranking, computer vision, and deep learning. Before joining Cornell University, he was an Associate Professor at Washington University in St. Louis, and before that he worked as a research scientist at Yahoo! Research in Santa Clara.