PDSLASSO

When would you want to use pdslasso?

pdslasso and ivlasso are routines for estimating structural parameters in linear models with many controls and/or many instruments. The routines use methods for estimating sparse high-dimensional models, specifically the lasso (Least Absolute Shrinkage and Selection Operator, Tibshirani 1996) and the square-root lasso (Belloni et al. 2011, 2014).

The purpose of pdslasso is to improve causal inference when the aim is to assess the effect of one or a few (possibly endogenous) regressors on the outcome variable. pdslasso can be used to select control variables and/or instruments.

Many control variables

The primary interest in an econometric analysis often lies in one or a few regressors, for which we want to estimate the causal effect on an outcome variable. However, to allow for a causal interpretation we need to control for confounding factors. Lasso-type techniques can be employed to appropriately select controls and thus improve the robustness of causal inference.
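As a minimal sketch of the many-controls case, the call below follows the pdslasso syntax `pdslasso depvar d_varlist (hd_controls_varlist)`: the regressor of interest comes directly after the outcome, and the candidate controls go inside the parentheses. The choice of Stata's built-in auto dataset and of these particular variables is an assumption made purely for illustration. With only eight candidate controls this is not a genuinely high-dimensional problem, and the estimates should not be read causally. The example assumes that pdslasso and lassopack are already installed.

```stata
* Illustrative only: the dataset and variable choices are assumptions,
* not an example taken from the pdslasso documentation.
sysuse auto, clear

* Outcome: price. Regressor of interest: mpg.
* Candidate controls (in parentheses) are selected by the lasso.
pdslasso price mpg (headroom trunk weight length turn displacement gear_ratio foreign)
```

Only the variables listed in parentheses are subject to selection; the coefficient on mpg is the structural parameter of interest.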

Many instruments

High-dimensional instruments can arise when there is an inherently large number of potentially relevant instruments or when it’s unclear how these instruments should be specified (e.g. dummy variables, interaction effects).

Methods

Two approaches are implemented in pdslasso and ivlasso:

  1. The post-double-selection methodology of Belloni et al. (2012, 2013, 2014, 2015, 2016).
  2. The post-regularization methodology of Chernozhukov, Hansen and Spindler (2015).

For instrumental variable estimation, ivlasso implements weak-identification-robust hypothesis tests and confidence sets using the Chernozhukov et al. (2013) sup-score test.

The implementation of these methods in pdslasso and ivlasso requires the Stata program rlasso (available in the separate Stata module lassopack), which provides lasso and square-root lasso estimation with data-driven penalization.
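
Because rlasso ships with lassopack, both modules need to be present before pdslasso or ivlasso can run. A minimal installation sketch, assuming both packages are obtained from the SSC archive:

```stata
* Install the dependency first (provides rlasso), then pdslasso/ivlasso.
* Assumes an internet connection and access to the SSC archive.
ssc install lassopack
ssc install pdslasso

* Check that the commands are now on the ado-path.
which rlasso
which pdslasso
```

If either `which` call reports that the command is not found, the corresponding module did not install correctly.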