# Partial Linear IV Model #

### Preparations #

We load the data, define global macros and set the seed.

. use https://statalasso.github.io/dta/AJR.dta, clear
. global Y logpgp95
. global D avexpr
. global Z logem4
. global X lat_abst edes1975 avelf temp* humid* steplow-oilres
. set seed 42
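The commands in this walkthrough assume that ddml and rforest are already installed. A minimal setup sketch, assuming both packages are still distributed through SSC under these names:

```stata
* One-time setup; skip if the packages are already installed.
* Assumption: both packages are available on SSC under these names.
ssc install ddml, replace
ssc install rforest, replace

* Confirm that Stata can locate both commands.
which ddml
which rforest
```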


### Step 1: Initialization #

Since the data set is very small (64 observations), we use 30 cross-fitting folds, so that each learner is trained on almost the full sample.

. ddml init iv, kfolds(30)


### Step 2: Adding learners #

The partially linear IV model has three conditional expectations: $$E[Y|X]$$, $$E[D|X]$$ and $$E[Z|X]$$. For each of these reduced-form equations, we add two learners: regress and rforest.
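For reference, the model behind these three conditional expectations can be sketched as follows. The notation is the one commonly used in the double/debiased machine learning literature; the symbols $$\theta$$, $$g$$, $$m$$, $$U$$ and $$V$$ do not appear in the text above and are introduced here for illustration only.

$$
Y = \theta D + g(X) + U, \qquad E[U \mid X, Z] = 0,
$$

$$
Z = m(X) + V, \qquad E[V \mid X] = 0.
$$

Writing $$\tilde{Y} = Y - E[Y|X]$$, $$\tilde{D} = D - E[D|X]$$ and $$\tilde{Z} = Z - E[Z|X]$$ for the residualized variables, with each conditional expectation replaced by its cross-fitted estimate, the resulting estimator is, up to the constant that the final regression includes,

$$
\hat{\theta} = \frac{\sum_{i} \tilde{Z}_i \tilde{Y}_i}{\sum_{i} \tilde{Z}_i \tilde{D}_i}.
$$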

We need to add the option vtype(none) for rforest to work with ddml since rforest’s predict command doesn’t support variable types.

. ddml E[Y|X]: reg $Y $X
. ddml E[Y|X], vtype(none): rforest $Y $X, type(reg)
. ddml E[D|X]: reg $D $X
. ddml E[D|X], vtype(none): rforest $D $X, type(reg)
. ddml E[Z|X]: reg $Z $X
. ddml E[Z|X], vtype(none): rforest $Z $X, type(reg)
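Before cross-fitting, it can help to confirm that all six learners were registered against the intended equations. A minimal sketch, assuming the installed ddml version provides the describe subcommand:

```stata
* List the model setup and the learners registered for each equation.
* Assumption: the describe subcommand is available in the installed ddml version.
ddml describe
```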


### Step 3/4: Cross-fitting and estimation #

We use the shortstack option to combine the base learners. Short-stacking is a computationally cheaper alternative to stacking. Whereas stacking relies on cross-validated predicted values to obtain the relative weights for the base learners, short-stacking uses the cross-fitted predicted values.
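As an illustration, the short-stacking weights for $$E[Y|X]$$ can be written as the solution to a constrained least-squares problem over the cross-fitted predictions. This is a sketch only: the notation is introduced here, and the constraints (non-negative weights that sum to one) are an assumption about the final estimator, which may differ across ddml versions.

$$
\hat{w} = \arg\min_{w} \sum_{i=1}^{n} \Big( Y_i - \sum_{j=1}^{J} w_j \, \hat{\ell}_j^{(-k(i))}(X_i) \Big)^2
\quad \text{subject to} \quad w_j \ge 0, \ \sum_{j=1}^{J} w_j = 1.
$$

Here $$\hat{\ell}_j^{(-k(i))}(X_i)$$ denotes the prediction of learner $$j$$ for observation $$i$$, obtained from a fit that excludes the fold $$k(i)$$ containing that observation, and $$J = 2$$ in this example (regress and rforest). The same problem is solved separately for $$E[D|X]$$ and $$E[Z|X]$$. Because the cross-fitted predictions are needed anyway, no additional cross-validation step is required.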

. qui ddml crossfit, shortstack

. ddml estimate, robust

DDML estimation results:
spec  r     Y learner     D learner         b        SE     Z learner
 opt  1    Y2_rforest    D2_rforest     0.772  ( 0.207)
  ss  1  [shortstack]          [ss]     0.716  ( 0.196)          [ss]
opt = minimum MSE specification for that resample.

Shortstack DDML model
y-E[y|X]  = logpgp95_ss_1                          Number of obs   =        64
D-E[D|X,Z]= avexpr_ss_1
Z-E[Z|X]  = logem4_ss_1
------------------------------------------------------------------------------
             |               Robust
    logpgp95 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      avexpr |   .7158468   .1958356     3.66   0.000     .3320162    1.099677
       _cons |  -.0308525   .0914993    -0.34   0.736    -.2101878    .1484828
------------------------------------------------------------------------------


### Manual estimation #

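If you are curious what ddml does in the background, list the estimates for all learner combinations. The minimum-MSE specification (here specification 8) can then be reproduced by hand with ivreg, where Y2_rf, D2_rf and Z2_rf abbreviate the stored variables Y2_rforest_1, D2_rforest_1 and Z2_rforest_1: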

. ddml estimate, allcombos spec(8) rep(1) robust

DDML estimation results:
spec  r     Y learner     D learner         b        SE     Z learner
   1  1        Y1_reg        D1_reg     0.378  ( 0.125)        Z1_reg
   2  1        Y1_reg        D1_reg    -0.187  ( 1.573)    Z2_rforest
   3  1        Y1_reg    D2_rforest     2.413  ( 3.594)        Z1_reg
   4  1        Y1_reg    D2_rforest     0.083  ( 0.475)    Z2_rforest
   5  1    Y2_rforest        D1_reg     0.123  ( 0.207)        Z1_reg
   6  1    Y2_rforest        D1_reg    -1.749  ( 4.690)    Z2_rforest
   7  1    Y2_rforest    D2_rforest     0.783  ( 0.504)        Z1_reg
*  8  1    Y2_rforest    D2_rforest     0.772  ( 0.207)    Z2_rforest
  ss  1  [shortstack]          [ss]     0.716  ( 0.196)          [ss]
* = minimum MSE specification for that resample.

Min MSE DDML model, specification 8
y-E[y|X]  = Y2_rforest_1                           Number of obs   =        64
D-E[D|X,Z]= D2_rforest_1
Z-E[Z|X]  = Z2_rforest_1
------------------------------------------------------------------------------
             |               Robust
    logpgp95 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      avexpr |    .772314   .2068282     3.73   0.000     .3669382     1.17769
       _cons |  -.0119092   .1009289    -0.12   0.906    -.2097263    .1859079
------------------------------------------------------------------------------

. ivreg Y2_rf (D2_rf = Z2_rf), robust

Instrumental variables 2SLS regression          Number of obs     =         64
                                                F(1, 62)          =      13.94
                                                Prob > F          =     0.0004
                                                R-squared         =          .
                                                Root MSE          =     .80209

------------------------------------------------------------------------------
             |               Robust
Y2_rforest_1 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
D2_rforest_1 |    .772314   .2068282     3.73   0.000     .3588703    1.185758
       _cons |  -.0119092   .1009289    -0.12   0.906    -.2136633    .1898448
------------------------------------------------------------------------------
Instrumented: D2_rforest_1
Instruments: Z2_rforest_1
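
Both outputs report the same point estimate and standard error for avexpr but slightly different confidence intervals. The ddml table shows a z statistic, whereas the ivreg table shows a t statistic with 62 degrees of freedom (see F(1, 62) above). A quick check, using only the numbers printed in the two tables, should reproduce the two lower bounds (about .367 and .359):

```stata
* Lower bound using the normal critical value (as in the ddml table)
display .772314 - invnormal(0.975) * .2068282

* Lower bound using the t critical value with 62 degrees of freedom (as in the ivreg table)
display .772314 - invttail(62, 0.025) * .2068282
```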