Optimization Methods for Training Neural Networks | Department of Computer Science

Most high-dimensional nonconvex optimization problems cannot be solved to optimality. It has been observed, however, that deep neural networks have a benign geometry that permits standard optimization methods to find acceptable solutions. Even so, solution times can be exorbitant. In addition, not all minimizers of neural network loss functions are equally desirable, as some lead to prediction systems with better generalization properties than others. In this talk we discuss classical and new optimization methods in light of these observations, and conclude with some open questions.

Bio:
Jorge Nocedal is the Walter P. Murphy Professor of Industrial Engineering and Management Sciences at Northwestern University. He received his B.S. in Physics from the National University of Mexico and his Ph.D. in Mathematical Sciences from Rice University. His research interests are in optimization and its application in machine learning and in disciplines involving differential equations. He specializes in nonlinear optimization, both convex and nonconvex, deterministic and stochastic. The motivation for his current algorithmic and theoretical research stems from applications in image and speech recognition, recommendation systems, and search engines.