2017-12-22 | Chris Junchi Li：Stochastic Approximation for Feature Extraction: Nonconvex Optimization, Online Updating, and Diffusion Approximation
Statistical dimension-reduction and feature-extraction procedures such as Principal Component Analysis (PCA), Independent Component Analysis (ICA), Partial Least Squares (PLS) and Canonical Correlation Analysis (CCA) present significant computational challenges with large-scale data. Online algorithms that updates the estimator by processing streaming data on-the-fly are of tremendous practical and theoretical interests. In this talk, we formulate these statistical procedures as optimization problem with nonconvex statistical structures, and view these online algorithm as a stochastic approximation method for these nonconvex statistical optimization problem. Using standard tools in martingale theory, we for the first time obtain the finite-sample error bound of O( \sqrt(d/N) ). We prove that (i) for PCA, such bound is known to closely match the minimax information lower bound for PCA (L. et al., 2015 Mathematical Programming; L. et al., 2017 NIPS); (ii) for (tensor formulation of) ICA, such bound is consistent with the computational lower bound for spiked tensor model (L., Wang & Liu, 2016 NIPS; L. et al., 2017; Wang, Yang, L. & Liu, 2016); (iii) for PLS and CCA, we propose novel SGD variants that has desirable properties. Time permitting, we further brief the recent progresses on the diffusion approximation methods for generic first-order algorithms. We show that in the continuous-time framework, first-order stochastic gradient method (Hu & L., 2017+) as well as stochastic heavy-ball method (Hu, L. & Su, 2017+) fastly avoid all saddle points within a time complexity that is inverse proportional to the stepsize.
Dr. Junchi Li obtained his B.S. in Mathematics and Applied Mathematics at Peking University in 2009, and his Ph.D. in Mathematics at Duke University in 2014. He has since held several research positions, including the role of visiting postdoctoral research associate at Department of Operations Research and Financial Engineering, Princeton University. His research interests include statistical machine learning and optimization, scalable online algorithms for big data analytics, and stochastic dynamics on graphs and social networks. He has published research articles in both top optimization journals and top machine learning conferences, including an oral presentation paper (1.23%) at NIPS 2017.