Many statistics and machine learning applications can be cast into a composite convex minimization problem. Well-known examples include LASSO-type, SVM and inverse covariance estimation. These problems are well studied and can efficiently be solved by several state-of-the-arts. Recent development in first-order, second-order, and stochastic gradient-type methods has brought a new opportunity to solve many other classes of convex optimization problems in large-scale settings. Unfortunately, so far, such methods require the underlying models to satisfy some structural assumptions such as Lipschitz gradient and restricted strong convexity, which may be failed to hold or may be hard to check.
In this talk, we demonstrate how to exploit an analytical structure hidden in convex optimization for developing solution methods. Our key idea is to generalize a powerful concept so-called “self-concordance” introduced by Y. Nesterov and A. Nemirovskii to a broader class of convex functions. We show that this structure covers many applications in statistics and machine learning. Then, we develop a unified theory for designing numerical methods. We illustrate our theory through Newton-type and proximal Newton-type methods. We note that the proposed theory can further be applied to develop other methods as long as the underlying model is involved with a “generalized self-concordant structure”. We provide some numerical examples in different fields to illustrate our theoretical development.