Stata package: lassopack

lassopack is a suite of programs for regularized regression methods suitable for the high-dimensional setting where the number of predictors, p, may be large and possibly greater than the number of observations, n.

The package consists of three main programs:

  • lasso2 implements lasso, square-root lasso, elastic net, ridge regression, adaptive lasso and post-estimation OLS. The lasso (Least Absolute Shrinkage and Selection Operator, Tibshirani 1996), the square-root lasso (Belloni et al. 2011) and the adaptive lasso (Zou 2006) are regularization methods that use L1-norm penalization to achieve sparse solutions: of the full set of p predictors, typically most will have coefficients set to zero. Ridge regression (Hoerl & Kennard 1970) relies on L2-norm penalization; the elastic net (Zou & Hastie 2005) uses a mix of L1 and L2 penalization.
  • cvlasso supports K-fold cross-validation and h-step-ahead rolling cross-validation (for time-series and panel data) to choose the penalization parameters for all the implemented estimators.
  • rlasso implements theory-driven penalization for the lasso and square-root lasso, following the methodology of Belloni et al. (2012, 2013, 2014, 2016), and can be applied to cross-section and panel data. In addition, rlasso implements the Chernozhukov et al. (2013) sup-score test of joint significance of the regressors, which is suitable for the high-dimensional setting.
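
To make the penalty terminology above concrete, the elastic net objective can be sketched in a generic textbook form. The symbols λ (overall penalty level) and α (mixing weight), as well as the scaling of the loss and penalty terms, are assumptions made for illustration; lassopack's own parameterization may scale these terms differently.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
  \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - x_i'\beta\bigr)^2
  \;+\; \lambda\Bigl[\alpha\,\lVert\beta\rVert_1
  \;+\; \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\Bigr],
\qquad
\lVert\beta\rVert_1 = \sum_{j=1}^{p}\lvert\beta_j\rvert,
\quad
\lVert\beta\rVert_2^2 = \sum_{j=1}^{p}\beta_j^2 .
```

In this form, α = 1 gives the lasso (pure L1 penalty), α = 0 gives ridge regression (pure L2 penalty), and intermediate values give the elastic net. The adaptive lasso additionally applies predictor-specific weights inside the L1 term, and the square-root lasso replaces the mean squared error with its square root.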

The main purpose of lassopack is to facilitate prediction and model selection with large-p (i.e., “wide”) data sets. lassopack also underlies the routines developed in pdslasso.
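
As a rough illustration of how the three programs might be called, here is a minimal sketch. It assumes lassopack is already installed and uses Stata's bundled auto data set; the choice of outcome and predictors is hypothetical and serves only to show the calling pattern, not a recommended specification.

```stata
* Assumes lassopack is installed, e.g. via: ssc install lassopack
sysuse auto, clear

* Hypothetical outcome and predictors, chosen only for illustration
local y price
local x mpg weight length turn displacement gear_ratio headroom trunk

* lasso2: lasso coefficient path over a grid of lambda values
lasso2 `y' `x'

* lasso2: elastic net (alpha strictly between 0 and 1) and square-root lasso
lasso2 `y' `x', alpha(0.5)
lasso2 `y' `x', sqrt

* cvlasso: 5-fold cross-validation; lopt re-estimates the model at the
* lambda that minimizes the mean squared prediction error
cvlasso `y' `x', nfolds(5) seed(123) lopt

* rlasso: theory-driven penalization, no cross-validation step
rlasso `y' `x'
```

With only a handful of predictors this is not a high-dimensional problem, so the example shows syntax rather than a realistic use case; the same calls apply unchanged when the predictor list is much longer.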