Abstract:
Sparse generalized linear models (GLMs) are essential tools in machine learning and statistics,
with broad applications in healthcare, finance, engineering, and science.
In this work, we study the global optimization of cardinality-constrained GLMs.
While many scalable approaches rely on convex surrogates, such as the lasso, or other heuristics, these approximations can yield poor
solutions in the presence of high-dimensional or highly correlated features.
This limitation is particularly problematic in high-stakes applications such as healthcare, where accuracy, reliability,
and interpretability are essential.
We therefore focus on certifiably optimal solutions to the exact l0-constrained formulation.
These problems are NP-hard, and a standard approach is to solve them with mixed-integer programming using branch-and-bound (BnB), which requires valid lower bounds at every node of the search tree.
We focus on computing these lower bounds by solving perspective relaxations, which are strong but computationally challenging for large instances.
Interior-point methods are a standard option, but they do not scale well because they rely on solving linear systems, have cubic
worst-case complexity in the number of variables, and are difficult to parallelize on modern hardware such as GPUs.
First-order methods are more scalable and warm-start friendly, but to truly benefit from GPUs they must avoid expensive subroutines that cannot be parallelized.
Moreover, for use inside BnB, they must generate valid lower bounds with strong convergence guarantees, not merely
good primal iterates.
This motivates the central question of our work: can we solve perspective relaxations within BnB and obtain valid lower bounds using a GPU-friendly first-order method with provable linear convergence?
Our answer is positive.
We first reformulate the perspective relaxation as an unconstrained convex composite problem with a novel nonsmooth regularizer implicitly defined by the perspective function.
We then analyze a more general composite structure and uncover two favorable geometric properties.
Under suitable regularity conditions, the primal problem satisfies quadratic growth, while the dual problem satisfies quadratic decay, a dual counterpart to primal quadratic growth.
By exploiting these properties, we propose a gap-based restart scheme that applies broadly to convex composite problems beyond perspective relaxations.
This scheme accelerates a broad class of proximal methods, including fixed-stepsize, linesearch-based, and linesearch-free variants, and gives provable linear convergence rates for both primal and dual objectives and sequences.
For sparse GLMs, we prove that the implicit perspective regularizer satisfies the required geometric conditions and show that both its function value and proximal operator can be evaluated exactly in log-linear time.
This enables efficient, GPU-friendly first-order methods without solving expensive conic subproblems.
Experiments on synthetic and real-world datasets show one-to-two-orders-of-magnitude speedups on CPUs, with an additional order-of-magnitude improvement on GPUs, for computing dual bounds and certifying optimal solutions of large-scale sparse GLMs.
This is joint work with Andrea Lodi and Soroosh Shafiee.