##
When would you want to use **lassopack**?
#

**lassopack** is a suite of programs for regularized regression methods suitable for the high-dimensional setting where the number of predictors,
\(p\)
, may be large and possibly greater than the number of observations,
\(N\)
.

### High-dimensional models #

The regularized regression methods implemented in **lassopack** can deal with situations where the number of regressors is large or may even exceed the number of observations under the assumption of sparsity.

High-dimensionality can arise when (see Belloni et al., 2014):

- There are many variables available for each unit of observation. For example, in cross-country regressions the number of observations is naturally limited by the number of countries, whereas the number of potentially relevant explanatory variables is often large.
- There are only few observed variables, but the functional form through which these regressors enter the model is unknown. We can then use a large set of transformations (e.g. dummy variables, interaction terms and polynomials) to approximate the true functional form.

### Model selection #

Identifying the true model is a fundamental problem in applied econometrics. A standard approach is to use hypothesis testing to identify the correct model (e.g. *general-to-specific* approach). However, this is problematic if the number of regressors is large due to many false positives. Furthermore, sequential hypothesis testing induces a pre-test bias.

Lasso, elastic net and square-root lasso set some coefficient estimates to exactly zero, and thus allow for simultaneous estimation and model selection. The adaptive lasso is known to exhibit good properties as a model selector as shown by Zou (2006).

### Prediction #

If there are many predictors, OLS is likely to suffer from overfitting: good in-sample fit (large
\(R^2\)
), but poor *out-of-sample* prediction performance. Regularized regression methods tend to outperform OLS in terms of *out-of-sample* prediction.

Regularization techniques exploit the *variance-bias-tradeoff*: they reduce the complexity of the model (through shrinkage or by dropping variables). In doing so, they introduce a bias, but also reduce the variance of the prediction, which can result in improved prediction performance.

### Forecasting with time-series or panel data #

**lassopack** can also applied to time-series or panel data. For example, Medeiros & Mendes (2016) prove model selection consistency of the adaptive lasso when applied to time-series data with non-Gaussian, heteroskedastic errors.