Partial Linear IV Model #
Preparations #
We load the data, define global macros and set the seed.
. use https://statalasso.github.io/dta/AJR.dta, clear
. global Y logpgp95
. global D avexpr
. global Z logem4
. global X lat_abst edes1975 avelf temp* humid* steplow-oilres
. set seed 42
Step 1: Initialization #
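Since the data set is very small (64 observations), we use 30 cross-fitting folds. Each fold then holds out only two or three observations, so every learner is trained on almost the full sample.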
. ddml init iv, kfolds(30)
Step 2: Adding learners #
The partially linear IV model has three conditional expectations: \(E[Y|X]\) , \(E[D|X]\) and \(E[Z|X]\) . For each reduced form equation, we add two learners: regress and rforest.
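For reference, the model behind these three conditional expectations can be sketched as follows. The notation is the usual one from the double/debiased machine learning literature and is not used elsewhere on this page: \(\theta\), \(g\), \(m\), \(U\), \(V\) and the tilde variables are introduced here for illustration only.

\[
Y = \theta D + g(X) + U, \qquad Z = m(X) + V, \qquad E[U \mid X, Z] = 0, \qquad E[V \mid X] = 0 .
\]

Writing \(\tilde{Y} = Y - \hat{E}[Y|X]\), \(\tilde{D} = D - \hat{E}[D|X]\) and \(\tilde{Z} = Z - \hat{E}[Z|X]\) for the cross-fitted residuals, the textbook estimator of \(\theta\) is an IV regression of \(\tilde{Y}\) on \(\tilde{D}\) with instrument \(\tilde{Z}\):

\[
\hat{\theta} = \frac{\sum_{i=1}^{n} \tilde{Z}_i \tilde{Y}_i}{\sum_{i=1}^{n} \tilde{Z}_i \tilde{D}_i} .
\]

This formula omits the constant; the final regressions reported further down include a _cons term, so the reported coefficient need not coincide exactly with this no-constant version.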
We need to add the option vtype(none) for rforest to work with ddml, since rforest’s predict command doesn’t support variable types.
. ddml E[Y|X]: reg $Y $X
Learner Y1_reg added successfully.
. ddml E[Y|X], vtype(none): rforest $Y $X, type(reg)
Learner Y2_rforest added successfully.
. ddml E[D|X]: reg $D $X
Learner D1_reg added successfully.
. ddml E[D|X], vtype(none): rforest $D $X, type(reg)
Learner D2_rforest added successfully.
. ddml E[Z|X]: reg $Z $X
Learner Z1_reg added successfully.
. ddml E[Z|X], vtype(none): rforest $Z $X, type(reg)
Learner Z2_rforest added successfully.
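Before starting the cross-fitting step, it can be useful to check that all six learners were registered as intended. A minimal sketch, assuming the installed version of ddml provides the describe subcommand (its output is not reproduced here):

```stata
* Summarize the current ddml model: model type, variables,
* and the learners added for E[Y|X], E[D|X] and E[Z|X].
ddml describe
```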
Step 3/4: Cross-fitting and estimation #
We use the shortstack option to combine the base learners. Short-stacking is a computationally cheaper alternative to stacking: whereas stacking relies on cross-validated predicted values to obtain the relative weights for the base learners, short-stacking uses the cross-fitted predicted values.
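The difference can be made concrete with a sketch of the short-stacking problem for \(E[Y|X]\). The notation is illustrative and not taken from this page: \(J\) is the number of base learners (here \(J = 2\), reg and rforest), \(k(i)\) is the fold containing observation \(i\), and \(\hat{\ell}_j^{(-k(i))}(X_i)\) is learner \(j\)'s prediction for observation \(i\) from the fit that excludes that fold. The non-negativity and sum-to-one constraints are an assumption about the final learner; other choices are possible.

\[
\hat{w} = \arg\min_{w_1, \dots, w_J} \sum_{i=1}^{n} \Big( Y_i - \sum_{j=1}^{J} w_j \, \hat{\ell}_j^{(-k(i))}(X_i) \Big)^2
\quad \text{subject to} \quad w_j \ge 0, \quad \sum_{j=1}^{J} w_j = 1 .
\]

Short-stacking solves this problem once on the full sample, reusing predictions that cross-fitting produces anyway. Regular stacking instead solves an analogous problem inside each cross-fitting training sample, which requires an additional layer of cross-validation per fold. The same construction applies to \(E[D|X]\) and \(E[Z|X]\).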
. qui ddml crossfit, shortstack
. ddml estimate, robust
DDML estimation results:
spec  r     Y learner     D learner         b         SE     Z learner
 opt  1    Y2_rforest    D2_rforest     0.772    (0.207)
  ss  1  [shortstack]          [ss]     0.716    (0.196)          [ss]
opt = minimum MSE specification for that resample.
Shortstack DDML model
y-E[y|X]  = logpgp95_ss_1                          Number of obs   =        64
D-E[D|X,Z]= avexpr_ss_1
Z-E[Z|X]  = logem4_ss_1
------------------------------------------------------------------------------
             |               Robust
    logpgp95 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      avexpr |   .7158468   .1958356     3.66   0.000     .3320162    1.099677
       _cons |  -.0308525   .0914993    -0.34   0.736    -.2101878    .1484828
------------------------------------------------------------------------------
Manual estimation #
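If you are curious what ddml does in the background, the allcombos option lists the estimates for every combination of learners, and spec(8) together with rep(1) selects the specification and resample to display in full: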
. ddml estimate, allcombos spec(8) rep(1) robust
DDML estimation results:
spec  r     Y learner     D learner         b         SE     Z learner
   1  1        Y1_reg        D1_reg     0.378    (0.125)        Z1_reg
   2  1        Y1_reg        D1_reg    -0.187    (1.573)    Z2_rforest
   3  1        Y1_reg    D2_rforest     2.413    (3.594)        Z1_reg
   4  1        Y1_reg    D2_rforest     0.083    (0.475)    Z2_rforest
   5  1    Y2_rforest        D1_reg     0.123    (0.207)        Z1_reg
   6  1    Y2_rforest        D1_reg    -1.749    (4.690)    Z2_rforest
   7  1    Y2_rforest    D2_rforest     0.783    (0.504)        Z1_reg
*  8  1    Y2_rforest    D2_rforest     0.772    (0.207)    Z2_rforest
  ss  1  [shortstack]          [ss]     0.716    (0.196)          [ss]
* = minimum MSE specification for that resample.
Min MSE DDML model, specification 8
y-E[y|X]  = Y2_rforest_1                           Number of obs   =        64
D-E[D|X,Z]= D2_rforest_1
Z-E[Z|X]  = Z2_rforest_1
------------------------------------------------------------------------------
             |               Robust
    logpgp95 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      avexpr |    .772314   .2068282     3.73   0.000     .3669382     1.17769
       _cons |  -.0119092   .1009289    -0.12   0.906    -.2097263    .1859079
------------------------------------------------------------------------------
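Specification 8 can be replicated by hand with a 2SLS regression on the three variables listed in the header above. The point estimate and the robust standard error match the ddml output:
. ivreg Y2_rf (D2_rf = Z2_rf), robust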
Instrumental variables 2SLS regression          Number of obs     =         64
                                                F(1, 62)          =      13.94
                                                Prob > F          =     0.0004
                                                R-squared         =          .
                                                Root MSE          =     .80209
------------------------------------------------------------------------------
             |               Robust
Y2_rforest_1 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
D2_rforest_1 |    .772314   .2068282     3.73   0.000     .3588703    1.185758
       _cons |  -.0119092   .1009289    -0.12   0.906    -.2136633    .1898448
------------------------------------------------------------------------------
Instrumented: D2_rforest_1
 Instruments: Z2_rforest_1
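The coefficient and standard error agree between the two outputs, but the confidence intervals do not. A short worked calculation suggests why: ddml reports a z statistic, while ivreg reports a t statistic with 62 degrees of freedom. The critical values below are rounded and are supplied here for illustration rather than taken from the output.

\[
\text{ddml (normal):} \quad 0.772314 \pm 1.960 \times 0.2068282 \approx [\,0.3669,\; 1.1777\,]
\]

\[
\text{ivreg } (t_{62}\text{):} \quad 0.772314 \pm 1.999 \times 0.2068282 \approx [\,0.3589,\; 1.1858\,]
\]

Both intervals reproduce the reported bounds to four decimal places, so the discrepancy comes from the reference distribution and not from the estimates themselves.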