Flexible Partially Linear IV Model

Preparations

We load the data, define global macros and set the seed.

. use https://statalasso.github.io/dta/BLP_CHS.dta, clear
. global Y y
. global D price
. global X hpwt air mpd space
. global Z Zbase*
. set seed 42

Step 1: Initialization

We initialize the model.

. ddml init ivhd
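
The initialization step is also where the sample splitting is configured. As a hedged sketch, the call below sets the number of cross-fitting folds and the number of resamples explicitly. The values are illustrative assumptions, not the settings used in this walkthrough, and the call would replace the one above rather than follow it.

. * illustrative alternative to the call above; the option values are assumptions
. ddml init ivhd, kfolds(5) reps(3)

With more than one resample, cross-fitting is repeated on different random splits, and the r column of the estimation output identifies the resample.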

Step 2: Add learners

We add learners for $E[Y|X]$ in the usual way.

. ddml E[Y|X]: reg $Y $X
Learner Y1_reg added successfully.

. ddml E[Y|X]: pystacked $Y $X, type(reg)
Learner Y2_pystacked added successfully.

There are some peculiarities to bear in mind when adding learners for $E[D|Z,X]$ and $E[D|X]$. The estimation of $E[D|X]$ depends on the estimation of $E[D|Z,X]$: we first obtain the fitted values $\hat{D}=\hat{E}[D|Z,X]$ and then fit these against $X$ to estimate $E[\hat{D}|X]$.
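
The second step rests on the law of iterated expectations: because $X$ is part of the conditioning set $(Z,X)$, projecting the first-step conditional expectation onto $X$ recovers $E[D|X]$. In symbols (a sketch of the reasoning, not necessarily the notation ddml uses internally):

$$
E\big[\,E[D|Z,X]\,\big|\,X\,\big] = E[D|X],
\qquad\text{so}\qquad
\hat{E}[\hat{D}|X] \ \text{estimates}\ E[D|X]
\quad\text{with}\quad \hat{D}=\hat{E}[D|Z,X].
$$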

When adding learners for $E[D|Z,X]$, we need to provide a name for each learner using learner(name).

. ddml E[D|Z,X], learner(Dhat_reg): reg $D $X $Z
Learner Dhat_reg added successfully.

. ddml E[D|Z,X], learner(Dhat_pystacked): pystacked $D $X $Z, type(reg)
Learner Dhat_pystacked added successfully.

When adding learners for $E[D|X]$, we explicitly refer to the learner from the previous step (e.g., learner(Dhat_reg)) and also provide the name of the treatment variable (vname($D)). Finally, we use the placeholder {D} in place of the dependent variable; it stands for the fitted values $\hat{D}$ from the referenced learner.

. ddml E[D|X], learner(Dhat_reg) vname($D): reg {D} $X
Learner Dhat_reg_h added successfully.

. ddml E[D|X], learner(Dhat_pystacked) vname($D): pystacked {D} $X, type(reg)
Replacing existing learner Dhat_pystacked_h...
Learner Dhat_pystacked_h added successfully.
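
Before cross-fitting, it can be worth checking that every conditional expectation has the intended learners attached. A minimal sketch, assuming the installed ddml version provides the describe subcommand (its output is omitted here):

. * summarize the model set-up and the learners added so far
. ddml describe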

Step 3-4: Cross-fitting and estimation

That’s it. Now we can move to cross-fitting and estimation.

. ddml crossfit
Cross-fitting E[Y|X,Z] equation: y
Cross-fitting fold 1 2 3 4 5 ...completed cross-fitting
Cross-fitting E[D|X,Z] and E[D|X] equation: price
Cross-fitting fold 1 2 3 4 5 ...completed cross-fitting

. ddml estimate, robust

DDML estimation results:
spec  r     Y learner     D learner         b        SE    DH learner
 opt  1  Y2_pystacked Dhat_pystac~d    -0.098  ( 0.008) Dhat_pystac~h
opt = minimum MSE specification for that resample.

Min MSE DDML model
y-E[y|X]  = Y2_pystacked_1                         Number of obs   =      2217
E[D|X,Z]  = Dhat_pystacked_1
E[D|X]    = Dhat_pystacked_h_1
Orthogonalised D = D - E[D|X]; optimal IV = E[D|X,Z] - E[D|X].
------------------------------------------------------------------------------
             |               Robust
       share | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       price |  -.0979042   .0075006   -13.05   0.000     -.112605   -.0832033
       _cons |   .0033532   .0215627     0.16   0.876    -.0389089    .0456154
------------------------------------------------------------------------------

Manual estimation

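If you are curious what ddml does in the background, you can replicate the estimate by hand. We first list all learner combinations and display specification 8, the minimum-MSE specification, in full: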

. ddml estimate, allcombos spec(8) rep(1) robust

DDML estimation results:
spec  r     Y learner     D learner         b        SE    DH learner
   1  1        Y1_reg      Dhat_reg    -0.137  ( 0.012)    Dhat_reg_h
   2  1        Y1_reg      Dhat_reg     0.369  ( 0.207) Dhat_pystac~h
   3  1        Y1_reg Dhat_pystac~d    -0.089  ( 0.005)    Dhat_reg_h
   4  1        Y1_reg Dhat_pystac~d    -0.114  ( 0.009) Dhat_pystac~h
   5  1  Y2_pystacked      Dhat_reg    -0.096  ( 0.011)    Dhat_reg_h
   6  1  Y2_pystacked      Dhat_reg    -0.212  ( 0.087) Dhat_pystac~h
   7  1  Y2_pystacked Dhat_pystac~d    -0.042  ( 0.004)    Dhat_reg_h
*  8  1  Y2_pystacked Dhat_pystac~d    -0.098  ( 0.008) Dhat_pystac~h
* = minimum MSE specification for that resample.

Min MSE DDML model, specification 8
y-E[y|X]  = Y2_pystacked_1                         Number of obs   =      2217
E[D|X,Z]  = Dhat_pystacked_1
E[D|X]    = Dhat_pystacked_h_1
Orthogonalised D = D - E[D|X]; optimal IV = E[D|X,Z] - E[D|X].
------------------------------------------------------------------------------
             |               Robust
       share | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       price |  -.0979042   .0075006   -13.05   0.000     -.112605   -.0832033
       _cons |   .0033532   .0215627     0.16   0.876    -.0389089    .0456154
------------------------------------------------------------------------------

We then construct the orthogonalised treatment and the optimal instrument from the cross-fitted values named in the output and run an IV regression. Here Y2_pystacked_1 holds the orthogonalised outcome, as the y-E[y|X] line of the output indicates.


. gen Dtilde = $D - Dhat_pystacked_h_1

. gen Zopt = Dhat_pystacked_1 - Dhat_pystacked_h_1

. ivreg Y2_pystacked_1 (Dtilde=Zopt), robust

Instrumental variables 2SLS regression          Number of obs     =      2,217
                                                F(1, 2215)        =     170.38
                                                Prob > F          =     0.0000
                                                R-squared         =     0.1175
                                                Root MSE          =     1.0152

------------------------------------------------------------------------------
             |               Robust
Y2_pystack~1 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      Dtilde |  -.0979042   .0075006   -13.05   0.000    -.1126131   -.0831953
       _cons |   .0033532   .0215627     0.16   0.876     -.038932    .0456385
------------------------------------------------------------------------------
Instrumented: Dtilde
 Instruments: Zopt
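
The ivreg coefficient and standard error match the ddml output; the confidence intervals differ slightly because ivreg uses t rather than z critical values. Since the regression is just-identified (one orthogonalised treatment, one instrument) and includes a constant, the IV coefficient reduces to a ratio of sample covariances, $\hat{\theta} = \widehat{Cov}(\tilde{Z},\tilde{Y}) / \widehat{Cov}(\tilde{Z},\tilde{D})$, where $\tilde{Y}$, $\tilde{D}$ and $\tilde{Z}$ denote Y2_pystacked_1, Dtilde and Zopt. The sketch below checks this with Stata's built-in correlate command. The scalar names cov_zy and cov_zd are hypothetical, and it assumes none of the three variables has missing values.

. * covariance of the optimal instrument with the orthogonalised outcome
. quietly correlate Zopt Y2_pystacked_1, covariance
. scalar cov_zy = r(cov_12)
. * covariance of the optimal instrument with the orthogonalised treatment
. quietly correlate Zopt Dtilde, covariance
. scalar cov_zd = r(cov_12)
. * the ratio should reproduce the Dtilde coefficient reported by ivreg
. display cov_zy / cov_zd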