Flexible Partially Linear IV Model
Preparations
We load the data, define global macros and set the seed.
. use https://statalasso.github.io/dta/BLP_CHS.dta, clear
. global Y y
. global D price
. global X hpwt air mpd space
. global Z Zbase*
. set seed 42
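Since $Z holds the wildcard Zbase*, it can be worth checking which variables it matches before adding learners. The following is a minimal sketch that uses only built-in Stata commands and assumes the data and macros above are already in memory:

```stata
* Sketch: confirm that the global macros resolve to variables in the data.
* describe expands the wildcard in $Z and lists every matching instrument.
describe $Y $D $X $Z

* Summary statistics for the outcome, treatment and controls.
summarize $Y $D $X
```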
Step 1: Initialization
We initialize the model.
. ddml init ivhd
Step 2: Add learners
We add learners for \(E[Y|X]\) in the usual way.
. ddml E[Y|X]: reg $Y $X
Learner Y1_reg added successfully.
. ddml E[Y|X]: pystacked $Y $X, type(reg)
Learner Y2_pystacked added successfully.
There are some peculiarities that we need to bear in mind when adding learners for \(E[D|Z,X]\) and \(E[D|X]\). The reason is that the estimation of \(E[D|X]\) depends on the estimation of \(E[D|Z,X]\). More precisely, we first obtain the fitted values \(\hat{D}=\hat{E}[D|Z,X]\) and then fit these against \(X\) to estimate \(E[\hat{D}|X]\).
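This dependence can be motivated by the law of iterated expectations. As a sketch, with \(p(Z,X)\) as assumed shorthand for \(E[D|Z,X]\) (the symbol is introduced here for illustration only):

\[
E[D|X] \;=\; E\big[\,E[D|Z,X]\,\big|\,X\,\big] \;=\; E\big[\,p(Z,X)\,\big|\,X\,\big].
\]

Replacing \(p(Z,X)\) with its fitted values \(\hat{D}\) gives the feasible target \(E[\hat{D}|X]\), which is consistent with the learner for \(E[D|X]\) being fitted on \(\hat{D}\) rather than on \(D\) itself.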
When adding learners for \(E[D|Z,X]\), we need to provide a name for each learner using learner(name).
. ddml E[D|Z,X], learner(Dhat_reg): reg $D $X $Z
Learner Dhat_reg added successfully.
. ddml E[D|Z,X], learner(Dhat_pystacked): pystacked $D $X $Z, type(reg)
Learner Dhat_pystacked added successfully.
When adding learners for \(E[D|X]\), we explicitly refer to the learner from the previous step (e.g., learner(Dhat_reg)) and also provide the name of the treatment variable (vname($D)). Finally, we use the placeholder {D} in place of the dependent variable. As the output shows, ddml registers the new learner under the same name with the suffix _h (e.g., Dhat_reg_h).
. ddml E[D|X], learner(Dhat_reg) vname($D): reg {D} $X
Learner Dhat_reg_h added successfully.
. ddml E[D|X], learner(Dhat_pystacked) vname($D): pystacked {D} $X, type(reg)
Replacing existing learner Dhat_pystacked_h...
Learner Dhat_pystacked_h added successfully.
Step 3-4: Cross-fitting and estimation
That’s it. Now we can move to cross-fitting and estimation.
. ddml crossfit
Cross-fitting E[Y|X,Z] equation: y
Cross-fitting fold 1 2 3 4 5 ...completed cross-fitting
Cross-fitting E[D|X,Z] and E[D|X] equation: price
Cross-fitting fold 1 2 3 4 5 ...completed cross-fitting
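After cross-fitting, the cross-fitted variables should be available as ordinary variables in memory. A minimal check, assuming the naming pattern that appears in the estimation output on this page (learner name plus a resample suffix such as _1), might look like this:

```stata
* Sketch: verify that the cross-fitted variables used later exist.
* The names are copied from the estimation output on this page;
* confirm variable stops with an error if any of them is missing.
confirm variable Y2_pystacked_1 Dhat_pystacked_1 Dhat_pystacked_h_1

* Inspect their range and the number of non-missing observations.
summarize Y2_pystacked_1 Dhat_pystacked_1 Dhat_pystacked_h_1
```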
. ddml estimate, robust
DDML estimation results:
spec  r     Y learner     D learner         b        SE    DH learner
 opt  1  Y2_pystacked Dhat_pystac~d    -0.098  ( 0.008) Dhat_pystac~h
opt = minimum MSE specification for that resample.
Min MSE DDML model
y-E[y|X]  = Y2_pystacked_1                         Number of obs   =      2217
E[D|X,Z]  = Dhat_pystacked_1
E[D|X]    = Dhat_pystacked_h_1
Orthogonalised D = D - E[D|X]; optimal IV = E[D|X,Z] - E[D|X].
------------------------------------------------------------------------------
             |               Robust
       share | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       price |  -.0979042   .0075006   -13.05   0.000     -.112605   -.0832033
       _cons |   .0033532   .0215627     0.16   0.876    -.0389089    .0456154
------------------------------------------------------------------------------
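The last line of the header describes the estimator. As a sketch in assumed shorthand (the symbols \(\ell\), \(m\), \(p\) and \(\theta\) are introduced here for illustration and are not part of the ddml output), write \(\ell(X)=E[Y|X]\), \(m(X)=E[D|X]\) and \(p(Z,X)=E[D|X,Z]\). The coefficient on price then corresponds to an IV regression of \(Y-\ell(X)\) on the orthogonalised treatment \(D-m(X)\), instrumented by \(p(Z,X)-m(X)\), whose population analogue is:

\[
\theta \;=\; \frac{\operatorname{Cov}\big(p(Z,X)-m(X),\; Y-\ell(X)\big)}{\operatorname{Cov}\big(p(Z,X)-m(X),\; D-m(X)\big)}.
\]

In the estimate, each conditional expectation is replaced by the cross-fitted values from the learner listed in the header.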
Manual estimation
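To see what ddml does in the background, we display the results for all learner combinations and then reproduce the minimum-MSE specification (here specification 8) by hand from the cross-fitted variables.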
. ddml estimate, allcombos spec(8) rep(1) robust
DDML estimation results:
 spec  r     Y learner     D learner         b        SE    DH learner
    1  1        Y1_reg      Dhat_reg    -0.137  ( 0.012)    Dhat_reg_h
    2  1        Y1_reg      Dhat_reg     0.369  ( 0.207) Dhat_pystac~h
    3  1        Y1_reg Dhat_pystac~d    -0.089  ( 0.005)    Dhat_reg_h
    4  1        Y1_reg Dhat_pystac~d    -0.114  ( 0.009) Dhat_pystac~h
    5  1  Y2_pystacked      Dhat_reg    -0.096  ( 0.011)    Dhat_reg_h
    6  1  Y2_pystacked      Dhat_reg    -0.212  ( 0.087) Dhat_pystac~h
    7  1  Y2_pystacked Dhat_pystac~d    -0.042  ( 0.004)    Dhat_reg_h
*   8  1  Y2_pystacked Dhat_pystac~d    -0.098  ( 0.008) Dhat_pystac~h
* = minimum MSE specification for that resample.
Min MSE DDML model, specification 8
y-E[y|X]  = Y2_pystacked_1                         Number of obs   =      2217
E[D|X,Z]  = Dhat_pystacked_1
E[D|X]    = Dhat_pystacked_h_1
Orthogonalised D = D - E[D|X]; optimal IV = E[D|X,Z] - E[D|X].
------------------------------------------------------------------------------
             |               Robust
       share | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       price |  -.0979042   .0075006   -13.05   0.000     -.112605   -.0832033
       _cons |   .0033532   .0215627     0.16   0.876    -.0389089    .0456154
------------------------------------------------------------------------------
. gen Dtilde = $D - Dhat_pystacked_h_1
. gen Zopt = Dhat_pystacked_1 - Dhat_pystacked_h_1
. ivreg Y2_pystacked_1 (Dtilde=Zopt), robust
Instrumental variables 2SLS regression          Number of obs     =      2,217
                                                F(1, 2215)        =     170.38
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1175
                                                Root MSE          =     1.0152
------------------------------------------------------------------------------
             |               Robust
Y2_pystack~1 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      Dtilde |  -.0979042   .0075006   -13.05   0.000    -.1126131   -.0831953
       _cons |   .0033532   .0215627     0.16   0.876     -.038932    .0456385
------------------------------------------------------------------------------
Instrumented: Dtilde
Instruments: Zopt
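Because there is one instrument for one endogenous regressor and the regression includes a constant, the 2SLS slope can also be computed as a ratio of sample covariances. The following sketch does this with built-in Stata commands. It assumes Dtilde and Zopt were generated as above; the scalar names cov_yz, cov_dz and theta_manual are illustrative only.

```stata
* Sketch: reproduce the ivreg slope as cov(Zopt, Y) / cov(Zopt, Dtilde).
* Restrict both calculations to observations where all three variables
* are non-missing, so that the sample matches the one used by ivreg.
quietly correlate Y2_pystacked_1 Zopt ///
    if !missing(Y2_pystacked_1, Dtilde, Zopt), covariance
scalar cov_yz = r(cov_12)    // sample covariance of outcome and instrument

quietly correlate Dtilde Zopt ///
    if !missing(Y2_pystacked_1, Dtilde, Zopt), covariance
scalar cov_dz = r(cov_12)    // sample covariance of treatment and instrument

scalar theta_manual = cov_yz / cov_dz
display "Covariance-ratio estimate: " theta_manual
```

The displayed value should agree with the coefficient on Dtilde up to rounding. The standard error is not reproduced by this shortcut.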