## DDML Algorithm

DDML estimators proceed in two stages:

1. Cross-fitting to estimate conditional expectation functions.
2. Second stage estimation based on Neyman orthogonal scores.

Chernozhukov et al. (2018) show that cross-fitting ensures that we can leverage a large class of machine learners for causal inference – including popular machine learners such as random forests or gradient boosting. Cross-fitting ensures independence between the estimation error from the first step and the regression residual in the second stage.

To illustrate the estimation methodology, consider the Partially Linear Model:

$\begin{aligned} Y &= aD + g(\bm{X}) + U \\ D &= m(\bm{X}) + V \end{aligned}$

Under conditional orthogonality, we can write

$a = \frac{E\left[\big(Y - \ell(\bm{X})\big)\big(D - m(\bm{X})\big)\right]}{E\left[(D - m(\bm{X}))^2\right]},$

where $$m(\bm{X})\equiv E[D\vert \bm{X}]$$ and $$\ell(\bm{X})\equiv E[Y\vert \bm{X}]$$ .
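
The following short derivation sketches where the expression for $$a$$ comes from. It assumes that the conditional orthogonality condition takes the form $$E[U\vert D, \bm{X}] = 0$$ and that $$E[V^2] > 0$$ ; neither assumption is spelled out in the text above, so treat this as an illustration rather than the formal argument.

$\begin{aligned} \ell(\bm{X}) &= E[Y\vert \bm{X}] = a\,m(\bm{X}) + g(\bm{X}) && \text{(since } E[U\vert \bm{X}] = 0\text{)} \\ Y - \ell(\bm{X}) &= a\big(D - m(\bm{X})\big) + U = aV + U \\ E\left[\big(Y - \ell(\bm{X})\big)\big(D - m(\bm{X})\big)\right] &= a\,E[V^2] + E[UV] = a\,E[V^2] && \text{(since } E[UV] = 0\text{)} \end{aligned}$

Here $$E[UV] = 0$$ holds because $$V = D - m(\bm{X})$$ is a function of $$(D, \bm{X})$$ . Dividing both sides by $$E[V^2] = E\left[(D - m(\bm{X}))^2\right]$$ gives the expression for $$a$$ .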

DDML uses cross-fitting to estimate the conditional expectation functions, which are then used to obtain the DDML estimate of $$a$$ .

To implement cross-fitting, we randomly split the sample into $$K$$ evenly sized folds, denoted as $$I_1,\ldots, I_K$$ . For each fold $$k$$ , the conditional expectations $$\ell$$ and $$m$$ are estimated using only observations not in the $$k$$ th fold – i.e., in $$I^c_k\equiv I \setminus I_k$$ – resulting in $$\hat{\ell}_{I^c_{k}}$$ and $$\hat{m}_{I^c_{k}}$$ , respectively, where the subscript $${I^c_{k}}$$ indicates the subsample used for estimation.

The out-of-sample predictions for an observation $$i$$ in the $$k$$ th fold are then computed via $$\hat{\ell}_{I^c_{k}}(\bm{X}_i)$$ and $$\hat{m}_{I^c_{k}}(\bm{X}_i)$$ . Repeating this procedure for all $$K$$ folds then allows for computation of the DDML estimator for $$a$$ :

$\hat{a}_n = \frac{\frac{1}{n}\sum_{i=1}^n \big(Y_i-\hat{\ell}_{I^c_{k_i}}(\bm{X}_i)\big)\big(D_i-\hat{m}_{I^c_{k_i}}(\bm{X}_i)\big)}{\frac{1}{n}\sum_{i=1}^n \big(D_i-\hat{m}_{I^c_{k_i}}(\bm{X}_i)\big)^2},$

where $$k_i$$ denotes the fold of the $$i$$ th observation.
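
The cross-fitting procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the package's implementation: the function names `fit_bin_means`, `ddml_plm`, and `simulate_plm` are hypothetical, the learner is a simple piecewise-constant regression standing in for a machine learner such as a random forest, and the simulated data (a single covariate on $$[0, 1)$$ , true $$a = 0.5$$ ) are assumptions made only for the example.

```python
import math
import random


def fit_bin_means(x, y, n_bins=20):
    """Hypothetical stand-in learner: piecewise-constant fit of y on x in [0, 1)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for xi, yi in zip(x, y):
        b = min(int(xi * n_bins), n_bins - 1)
        sums[b] += yi
        counts[b] += 1
    overall = sum(y) / len(y)  # fallback for bins with no training data
    means = [s / c if c > 0 else overall for s, c in zip(sums, counts)]

    def predict(x_new):
        return means[min(int(x_new * n_bins), n_bins - 1)]

    return predict


def ddml_plm(y, d, x, n_folds=5, seed=0):
    """Cross-fitted estimate of a in the partially linear model."""
    n = len(y)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    # Random assignment to K folds of (nearly) equal size.
    fold_of = [0] * n
    for pos, i in enumerate(order):
        fold_of[i] = pos % n_folds

    y_res = [0.0] * n
    d_res = [0.0] * n
    for k in range(n_folds):
        # Fit both conditional expectations on observations outside fold k.
        train = [i for i in range(n) if fold_of[i] != k]
        x_train = [x[i] for i in train]
        ell_hat = fit_bin_means(x_train, [y[i] for i in train])
        m_hat = fit_bin_means(x_train, [d[i] for i in train])
        # Out-of-sample residuals for observations inside fold k.
        for i in range(n):
            if fold_of[i] == k:
                y_res[i] = y[i] - ell_hat(x[i])
                d_res[i] = d[i] - m_hat(x[i])

    numerator = sum(yr * dr for yr, dr in zip(y_res, d_res)) / n
    denominator = sum(dr * dr for dr in d_res) / n
    return numerator / denominator


def simulate_plm(n=2000, a=0.5, seed=1):
    """Assumed data-generating process, for illustration only."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    d = [xi ** 2 + rng.gauss(0, 1) for xi in x]  # m(X) = X^2
    y = [a * di + math.sin(2 * math.pi * xi) + rng.gauss(0, 1)
         for di, xi in zip(d, x)]  # g(X) = sin(2*pi*X)
    return y, d, x


y, d, x = simulate_plm()
a_hat = ddml_plm(y, d, x)
print(f"estimated a: {a_hat:.3f}")
```

With this simulated sample the estimate should land close to the assumed true value of 0.5. Swapping `fit_bin_means` for a more flexible learner leaves the rest of the sketch unchanged, since the second stage only uses the out-of-sample residuals.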