When would you want to use pdslasso? #
pdslasso and ivlasso are routines for estimating structural parameters in linear models with many controls and/or many instruments. The routines use methods for estimating sparse high-dimensional models, specifically the lasso (Least Absolute Shrinkage and Selection Operator, Tibshirani 1996) and the square-root lasso (Belloni et al. 2011, 2014).
The purpose of pdslasso is to improve causal inference when the aim is to assess the effect of one or a few (possibly endogenous) regressors on the outcome variable. pdslasso lets you select control variables and/or instruments.
Many control variables #
The primary interest in an econometric analysis often lies in one or a few regressors, for which we want to estimate the causal effect on an outcome variable. However, to allow for a causal interpretation we need to control for confounding factors. Lasso-type techniques can be employed to appropriately select controls and thus improve the robustness of causal inference.
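One common way to write this setting down is a linear model with a low-dimensional regressor of interest and a high-dimensional set of candidate controls. The notation below is introduced here for illustration only and does not come from the package documentation: $y_i$ is the outcome, $d_i$ the regressor of interest, $x_i$ a vector of $p$ candidate controls (with $p$ possibly large relative to the sample size $n$), and $\alpha$ the structural parameter.

```latex
\begin{aligned}
y_i &= \alpha\, d_i + x_i'\beta + \varepsilon_i, \\
d_i &= x_i'\gamma + v_i, \qquad i = 1, \dots, n .
\end{aligned}
```

Under a sparsity assumption, only a small number of the elements of $\beta$ and $\gamma$ are non-zero, which is what makes lasso-based selection of controls feasible.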
Many instruments #
High-dimensional instruments can arise when there is an inherently large number of potentially relevant instruments or when it is unclear how these instruments should be specified (e.g. dummy variables, interaction effects).
Methods #
Two approaches are implemented in pdslasso and ivlasso:
- The post-double-selection methodology of Belloni et al. (2012, 2013, 2014, 2015, 2016).
- The post-regularization methodology of Chernozhukov, Hansen and Spindler (2015).
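To make the post-double-selection idea concrete, here is a minimal Python sketch on simulated data. It is not the pdslasso implementation: the helper names (`lasso_select`, `residualize`, `post_double_selection`) are hypothetical, the lasso uses a fixed penalty `lam` rather than the data-driven penalization provided by rlasso, and no standard errors are computed. The sketch runs one lasso of the outcome on the controls and one of the regressor of interest on the controls, takes the union of the selected controls, and then estimates the effect by OLS using that union.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_select(cols, y, lam, sweeps=200):
    """Indices of columns with non-zero lasso coefficients.

    Minimises (1/(2n)) * RSS + lam * ||b||_1 by cyclic coordinate descent.
    Assumes y and every column are already centered; columns are not rescaled.
    """
    n, p = len(y), len(cols)
    b = [0.0] * p
    r = list(y)  # current residual y - X b
    z = [dot(c, c) / n for c in cols]
    for _ in range(sweeps):
        for j, c in enumerate(cols):
            rho = dot(c, r) / n + z[j] * b[j]
            new = soft_threshold(rho, lam) / z[j]
            if new != b[j]:
                delta = new - b[j]
                r = [ri - delta * ci for ri, ci in zip(r, c)]
                b[j] = new
    return {j for j in range(p) if b[j] != 0.0}

def residualize(v, cols):
    """Residual of v after least-squares projection on cols (Gram-Schmidt)."""
    basis = []
    for c in cols:
        q = list(c)
        for u in basis:
            coef = dot(q, u) / dot(u, u)
            q = [qi - coef * ui for qi, ui in zip(q, u)]
        basis.append(q)
    out = list(v)
    for u in basis:
        coef = dot(out, u) / dot(u, u)
        out = [oi - coef * ui for oi, ui in zip(out, u)]
    return out

def post_double_selection(y, d, controls, lam):
    """Return (estimate of the effect of d on y, sorted selected control indices)."""
    y, d = center(y), center(d)
    cols = [center(c) for c in controls]
    # Step 1 and 2: select controls that predict y, and controls that predict d.
    selected = lasso_select(cols, y, lam) | lasso_select(cols, d, lam)
    keep = [cols[j] for j in sorted(selected)]
    # Step 3: OLS of y on d and the union of selected controls,
    # computed by partialling the controls out of both y and d.
    y_res, d_res = residualize(y, keep), residualize(d, keep)
    return dot(d_res, y_res) / dot(d_res, d_res), sorted(selected)

# Simulated example (assumed design): true effect of d is 1.0;
# control 0 affects y, control 1 affects both d and y.
random.seed(0)
n, p = 200, 20
controls = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]
d = [controls[1][i] + random.gauss(0, 1) for i in range(n)]
y = [1.0 * d[i] + 2.0 * controls[0][i] + controls[1][i] + random.gauss(0, 1)
     for i in range(n)]

alpha_hat, selected = post_double_selection(y, d, controls, lam=0.3)
print(alpha_hat, selected)
```

Taking the union of both selected sets is the key design choice: a control that matters for the regressor of interest but is missed by the outcome lasso still enters the final regression.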
For instrumental variable estimation, ivlasso implements weak-identification-robust hypothesis tests and confidence sets using the Chernozhukov et al. (2013) sup-score test.
The implementation of these methods in pdslasso and ivlasso requires the Stata program rlasso (available in the separate Stata module lassopack), which provides lasso and square-root lasso estimation with data-driven penalization.