Replication of glmnet and StataCorp’s lasso #
Use Stata’s auto dataset with missing data dropped. The variable price1000
is used to illustrate scaling effects.
. sysuse auto, clear
. drop if rep78==.
. gen double price1000 = price/1000
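Dropping the cars with missing rep78 leaves 69 observations, which is the N used in the rescaling below; a quick check:
. count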
Replication of glmnet #
To load the data into R for comparison with glmnet, use the following
commands. The packages haven and tidyr need to be installed.
> auto <- haven::read_dta("http://www.stata-press.com/data/r9/auto.dta")
> auto <- tidyr::drop_na(auto)
> n <- nrow(auto)
> price <- auto$price
> X <- auto[, c("mpg", "rep78", "headroom", "trunk", "weight", "length",
+     "turn", "displacement", "gear_ratio", "foreign")]
> X$foreign <- as.integer(X$foreign)
> X <- as.matrix(X)
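As a quick check (not required for the replication), n should equal 69, matching the Stata sample after the observations with missing rep78 are dropped:
> n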
Replication of StataCorp’s lasso #
Replication of StataCorp’s lasso and elasticnet requires only the rescaling of lambda by 2N. With N=69, the lasso2 lambda of 138000 corresponds to a StataCorp lambda of 138000/(2*69) = 1000.
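As a quick check of the arithmetic:
. di 138000/(2*69)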
. lasso2 price mpg-foreign, lambda(138000)
. lasso linear price mpg-foreign, grid(1, min(1000))
. lassoselect lambda = 1000
. lassocoef, display(coef, penalized)
. lasso2 price mpg-foreign, alpha(0.6) lambda(138000)
. elasticnet linear price mpg-foreign, alphas(0.6) grid(1, min(1000))
. lassoselect alpha = 0.6 lambda = 1000
. lassocoef, display(coef, penalized)
Notes on invariance and objective function #
glmnet uses the same definition of the lasso L1 penalty as StataCorp’s lasso, so lasso2’s default parameterization again requires only rescaling by 2N. When the lglmnet option is used, the L1 penalty should be provided using the glmnet definition. To estimate in R, load glmnet and use the following commands:
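> library("glmnet")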
> r<-glmnet(X,price,alpha=1,lambda=1000,thresh=1e-15)
To achieve the same results with lasso2:
. lasso2 price mpg-foreign, lambda(138000)
. lasso2 price mpg-foreign, lambda(1000) lglmnet
The R code below uses glmnet to estimate an elastic net model. lasso2 with
the lglmnet option will replicate it.
> r<-glmnet(X,price,alpha=0.6,lambda=1000,thresh=1e-15)
. lasso2 price mpg-foreign, alpha(0.6) lambda(1000) lglmnet
lasso2’s default parameterization of the elastic net (like StataCorp’s
elasticnet) is not invariant to scaling:
. lasso2 price mpg-foreign, alpha(0.6) lambda(138000)
. lasso2 price1000 mpg-foreign, alpha(0.6) lambda(138)
When lasso2 uses the glmnet parameterization of the elastic net via the
lglmnet option, results are invariant to scaling: the only difference is
that the coefficients change by the same factor of proportionality as the
dependent variable.
. lasso2 price mpg-foreign, alpha(0.6) lambda(1000) lglmnet
. lasso2 price1000 mpg-foreign, alpha(0.6) lambda(1) lglmnet
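To see the proportionality directly, one can store and compare the penalized coefficient vectors from the two runs. The sketch below repeats the two commands above and assumes that lasso2 posts the penalized coefficients in e(b) when a single lambda is specified; each coefficient from the price1000 regression should be 1/1000 of its counterpart from the price regression.
. lasso2 price mpg-foreign, alpha(0.6) lambda(1000) lglmnet
. mat b_price = e(b)
. lasso2 price1000 mpg-foreign, alpha(0.6) lambda(1) lglmnet
. mat b_price1000 = e(b)
. mat list b_price
. mat list b_price1000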
The reason the default lasso2 parameterization (like StataCorp’s) is not invariant to scaling is that the penalty on the L2 norm is affected by scaling, and this in turn affects the relative weights on the L1 and L2 penalties. The example below shows how to reparameterize so that the default lasso2 parameterization for the elastic net replicates the glmnet parameterization. It uses the scaling above, where the dependent variable is price1000 and the glmnet lambda = 1.
Note: The large-sample standard deviation of price1000
is equal to 2.8912586.
. qui sum price1000
. di r(sd) * 1/sqrt( r(N)/(r(N)-1))
The lasso2 alpha = alpha(glmnet)*SD(y) / (1 - alpha(glmnet) + alpha(glmnet)*SD(y)). In this example, alpha = 0.81262488.
. di (0.6*2.8912586)/( 1-0.6 + 0.6*2.8912586)
The lasso2 lambda = 2N*lambda(glmnet) * (alpha(glmnet) + (1-alpha(glmnet))/SD(y)). In this example, lambda = 101.89203.
. di 2*69*( 0.6 + (1-0.6)/2.8912586)
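The same conversion can be computed from the stored results of summarize instead of typing the numbers by hand. The snippet below simply restates the two formulas above with alpha(glmnet) = 0.6 and lambda(glmnet) = 1; the scalar names sdy, alpha2 and lambda2 are arbitrary.
. qui sum price1000
. scalar sdy = r(sd)/sqrt(r(N)/(r(N)-1))
. scalar alpha2 = (0.6*sdy)/(1 - 0.6 + 0.6*sdy)
. scalar lambda2 = 2*r(N)*1*(0.6 + (1-0.6)/sdy)
. di alpha2
. di lambda2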
First estimate with lasso2 using the glmnet parameterization, and then replicate using the default lasso2/StataCorp parameterization:
. lasso2 price1000 mpg-foreign, alpha(0.6) lambda(1) lglmnet
. lasso2 price1000 mpg-foreign, alpha(.81262488) lambda(101.89203)