We provide numerical benchmarks for the c-lasso package in comparison to cvxpy. We report run times, achieved minimum function values (taking the function value found by the path algorithm as baseline), and how well the zero-sum constraint is satisfied.
Here, we consider the problem formulations [R1], [R2], and [C1] (see the problem formulation section of the c-lasso documentation for details), where the x_i are the rows of X and l is the squared hinge loss, l(r) = max(1 − r, 0)², used in the classification formulation.
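For illustration, [R1] is the standard constrained Lasso; up to the exact normalization used in the package it reads

$$
\min_{\beta \in \mathbb{R}^d} \; \lVert X\beta - y\rVert_2^2 + \lambda \lVert\beta\rVert_1
\quad \text{subject to} \quad C\beta = 0,
$$

where the zero-sum constraint corresponds to the single-row constraint matrix C = (1, ..., 1), i.e., the coefficients β_j sum to zero.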
We use the `random_data` function in c-lasso to generate X and y, and we use the standard `zeroSum` constraint. We vary the number of samples n and the dimensionality d of the problems. The regularization parameter is fixed to λ = 0.1. This setting does not favor the path algorithm, so the reported performance is a lower bound on the actual speed-up. Since most model selection schemes require computing the entire solution path, the path algorithm would be even more advantageous in that setting.
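As an illustration, a minimal sketch of how one benchmark instance can be set up in c-lasso is shown below; the `model_selection` attribute names follow the package documentation and may differ slightly between versions.

```python
from classo import classo_problem, random_data

# Synthetic instance: random_data returns the design X, the constraint matrix C
# (a single row of ones for the zero-sum constraint), and the response y.
n, d, d_nonzero, k, sigma = 100, 200, 5, 1, 0.5
(X, C, y), true_beta = random_data(n, d, d_nonzero, k, sigma, zerosum=True, seed=1)

# Constrained Lasso [R1] at the fixed regularization level lambda = 0.1.
problem = classo_problem(X, y, C)
problem.model_selection.StabSel = False     # disable the default stability selection
problem.model_selection.LAMfixed = True     # solve at a single, fixed lambda
problem.model_selection.LAMfixedparameters.rescaled_lam = True  # lambda relative to lambda_max
problem.model_selection.LAMfixedparameters.lam = 0.1
problem.solve()

beta = problem.solution.LAMfixed.beta       # estimated coefficients (assumed attribute)
print("zero-sum violation:", abs(C.dot(beta)).max())
```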
The running times of the micro-benchmark were computed with Python 3.9.1 on a MacBook Air laptop running macOS High Sierra, with a 1.8 GHz Intel Core i5 processor and 8 GB of 1600 MHz DDR3 memory.
Run times for R1.
Achieved minimum function values on R1. We observe considerable inaccuracies in the cvxpy solutions.
Satisfaction of the zero-sum constraint.
Run times for R2.
Achieved minimum function values on R2.
Satisfaction of the zero-sum constraint.
Run times for C1.
Achieved minimum function values on C1.
Satisfaction of the zero-sum constraint.
We consider the following optimization schemes in the benchmark:
Path-Alg (piecewise affine path algorithm): This is the default algorithm for non-concomitant problems [R1, R3, C1, C2]. The algorithm uses the fact that the solution path along λ is piecewise affine (as shown, e.g., in [1]). When least squares is used as the objective function, we derive a novel efficient procedure that allows us to also derive the solution for the concomitant problem [R2] along the path with little extra computational overhead.
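A sketch of how the full solution path can be requested for the problem instance defined above (the `PATH` attribute names are taken from the package documentation and are an assumption here):

```python
# Compute the whole solution path with the path algorithm.
problem.model_selection.PATH = True
problem.model_selection.PATHparameters.numerical_method = "Path-Alg"  # assumed attribute/value
problem.solve()
print(problem.solution)  # the printed summary includes the computed path
```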
P-PDS (projected primal-dual splitting): This algorithm is derived from [2] and belongs to the class of proximal splitting algorithms. It extends the classical forward-backward (FB) algorithm (also known as proximal gradient descent) to handle an additional linear equality constraint via projection. In the absence of a linear constraint, the method reduces to FB. This method can solve problem [R1].
DR (Douglas-Rachford-type splitting): This is the most general algorithm and can solve all regression problems [R1-R4]. It is based on Douglas-Rachford splitting in a higher-dimensional product space and makes use of the proximity operators of the perspective of the least-squares objective (see [4, 5]).
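The scheme used for a given task can be chosen through the corresponding `numerical_method` parameter; a sketch, assuming the attribute names from the package documentation:

```python
# Re-solve the fixed-lambda problem with the Douglas-Rachford-type solver;
# "P-PDS" would select the projected primal-dual splitting instead.
problem.model_selection.LAMfixedparameters.numerical_method = "DR"
problem.solve()
```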
For external comparison, we use cvxpy and its underlying conic solver SCS; see [6] for details.
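The external baseline can be reproduced along the following lines with cvxpy and SCS (a sketch reusing X, y, C, and d from the data generation above; the exact objective scaling used in the benchmark may differ):

```python
import cvxpy as cp

# Constrained Lasso [R1] with the zero-sum constraint, solved by the SCS conic solver.
beta = cp.Variable(d)
objective = cp.Minimize(cp.sum_squares(X @ beta - y) + 0.1 * cp.norm1(beta))
prob = cp.Problem(objective, [C @ beta == 0])
prob.solve(solver=cp.SCS)

beta_cvxpy = beta.value  # solution to compare against the c-lasso solvers
```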
- [1] B. R. Gaines, J. Kim, and H. Zhou, Algorithms for Fitting the Constrained Lasso, J. Comput. Graph. Stat., vol. 27, no. 4, pp. 861–871, 2018.
- [2] L. Briceño-Arias and S. L. Rivera, A Projected Primal–Dual Method for Solving Constrained Monotone Inclusions, J. Optim. Theory Appl., vol. 180, no. 3, 2019.
- [3] S. Rosset and J. Zhu, Piecewise linear regularized solution paths, Ann. Stat., vol. 35, no. 3, pp. 1012–1030, 2007.
- [4] P. L. Combettes and C. L. Müller, Perspective M-estimation via proximal decomposition, Electron. J. Stat., vol. 14, 2020.
- [5] P. L. Combettes and C. L. Müller, Regression models for compositional data: General log-contrast formulations, proximal optimization, and microbiome data applications, Stat. Biosci., 2020.
- [6] B. O'Donoghue, E. Chu, N. Parikh, and S. Boyd, Conic optimization via operator splitting and homogeneous self-dual embedding, J. Optim. Theory Appl., vol. 169, no. 3, pp. 1042–1068, 2016.