
# c-lasso

## Numerical benchmarks for c-lasso

We provide numerical benchmarks for the c-lasso package in comparison to cvxpy. We report run times, achieved minimum function values (using the function value found by the path algorithm as baseline), and how well each solver satisfies the zero-sum constraint.

## Table of Contents

- Benchmark set-up
- Results
- Optimization schemes
- References

## Benchmark set-up

### Tested regression and classification problems

Here, we consider the following problem formulations (see the main c-lasso documentation for detailed information):

[R1] Standard constrained Lasso regression:

$$\min_{\beta \in \mathbb{R}^d} \lVert X\beta - y \rVert^2 + \lambda \lVert \beta \rVert_1 \quad \text{s.t.} \quad C\beta = 0$$

[R2] Constrained sparse Huber regression:

$$\min_{\beta \in \mathbb{R}^d} h_{\rho}(X\beta - y) + \lambda \lVert \beta \rVert_1 \quad \text{s.t.} \quad C\beta = 0$$

where $h_{\rho}$ is the Huber function with parameter $\rho$, applied componentwise to the residuals.

[C1] Constrained sparse classification with Square Hinge loss (L1-Regularized Square-Hinge SVM):

$$\min_{\beta \in \mathbb{R}^d} \sum_{i=1}^{n} l(y_i x_i^{\top}\beta) + \lambda \lVert \beta \rVert_1 \quad \text{s.t.} \quad C\beta = 0$$

where the $x_i$ are the rows of $X$ and $l$ is defined as:

$$l(r) = \max(1 - r,\, 0)^2$$

### Synthetic data generation and test problem set-ups

We use the `random_data` function in c-lasso to generate X and y, and we use the standard zero-sum constraint. We vary the number of samples n and the dimensionality d of the problems. The regularization parameter is fixed to λ = 0.1. This setting does not favor the path algorithm, so the reported performance is a lower bound on the actual speed-up. Since most model selection schemes require computing the entire solution path, the path algorithm would be even more preferable in those settings.
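For illustration, here is a minimal sketch of how such a test problem can be set up, following the usage shown in the c-lasso package README; exact attribute names (e.g., `LAMfixedparameters`) may differ across package versions.

```python
from classo import classo_problem, random_data

# Problem sizes are illustrative; the benchmark varies n and d.
n, d, d_nonzero, k, sigma = 100, 200, 5, 1, 0.5

# random_data returns the design X, the constraint matrix C
# (the zero-sum constraint via zerosum=True) and the response y,
# together with the ground-truth coefficient vector.
(X, C, y), sol = random_data(n, d, d_nonzero, k, sigma,
                             zerosum=True, seed=1)

problem = classo_problem(X, y, C)

# Solve at a single fixed regularization value, lambda = 0.1
# (relative to lambda_max), instead of a model selection scheme.
problem.model_selection.StabSel = False
problem.model_selection.LAMfixed = True
problem.model_selection.LAMfixedparameters.lam = 0.1
problem.solve()

print(problem.solution)
```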

## Results

The run times of the micro-benchmark have been computed using Python 3.9.1 on a MacBook Air laptop running macOS High Sierra, with a 1.8 GHz Intel Core i5 processor and 8 GB of 1600 MHz DDR3 memory.

### R1

*Figure: run times on R1.*

*Figure: achieved minimum function values on R1. We observe considerable inaccuracies in the cvx solutions.*

*Figure: satisfaction of the zero-sum constraint on R1.*

### R2

*Figure: run times on R2.*

*Figure: achieved minimum function values on R2.*

*Figure: satisfaction of the zero-sum constraint on R2.*

### C1

*Figure: run times on C1.*

*Figure: achieved minimum function values on C1.*

*Figure: satisfaction of the zero-sum constraint on C1.*

## Optimization schemes

We consider the following schemes in the benchmark.

### Path algorithms (Path-Alg, pa)

This is the default algorithm for non-concomitant problems [R1, R2, C1, C2]. The algorithm uses the fact that the solution path along λ is piecewise-affine (as shown, e.g., in [1]). When Least-Squares is used as the objective function, we derive a novel efficient procedure that also yields the solution of the concomitant problem [R3] along the path with little extra computational overhead.
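As a sketch, continuing the set-up example above, the path algorithm can be requested explicitly for the fixed-λ solve; the `numerical_method` attribute and its location are taken from the c-lasso README and may vary by version.

```python
# Force the path algorithm for the fixed-lambda solve
# (assumed attribute name, per the c-lasso README).
problem.model_selection.LAMfixedparameters.numerical_method = "Path-Alg"
problem.solve()
```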

### Projected primal-dual splitting method (pds)

This algorithm is derived from [2] and belongs to the class of proximal splitting algorithms. It extends the classical Forward-Backward (FB) algorithm (aka proximal gradient descent) to handle an additional linear equality constraint via projection. In the absence of a linear constraint, the method reduces to FB. This method can solve problem [R1].
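For intuition, the projection step is cheap for the zero-sum constraint: projecting onto the hyperplane {β : 1ᵀβ = 0} simply removes the mean. A minimal numpy sketch, purely illustrative and not the package's internal implementation:

```python
import numpy as np

def project_zero_sum(beta):
    """Euclidean projection onto {b : sum(b) = 0}.

    The projection onto {b : a^T b = 0} is b - (a^T b / ||a||^2) a;
    with a = (1, ..., 1) this reduces to subtracting the mean.
    """
    return beta - beta.mean()

beta = np.array([0.5, -0.2, 0.3])
print(project_zero_sum(beta))  # sums to zero (up to floating point)
```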

### Douglas-Rachford-type splitting method (dr)

This algorithm is the most general one and can solve all regression problems [R1-R4]. It is based on Douglas-Rachford splitting in a higher-dimensional product space and makes use of the proximity operators of the perspective of the LS objective (see [4, 5]).

### Operator splitting conic solver (SCS) in CVX (cvx)

For external comparison, we use cvxpy with its underlying operator splitting conic solver SCS. For more information, see [6].
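As an illustration of this external baseline, [R1] can be written directly in cvxpy and handed to SCS. This is a minimal sketch of such a reference implementation, not the exact benchmark script; note that the λ scaling here is absolute, whereas c-lasso's fixed λ is relative to λ_max.

```python
import cvxpy as cp
import numpy as np

# Small random instance; the benchmark uses random_data from c-lasso.
n, d, lam = 100, 200, 0.1
rng = np.random.default_rng(1)
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

beta = cp.Variable(d)
# [R1]: least squares + l1 penalty, subject to the zero-sum constraint.
objective = cp.Minimize(cp.sum_squares(X @ beta - y) + lam * cp.norm1(beta))
constraints = [cp.sum(beta) == 0]
problem = cp.Problem(objective, constraints)
problem.solve(solver=cp.SCS)

# Minimum function value and constraint satisfaction quality.
print(problem.value, np.abs(beta.value.sum()))
```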

## References