Comparison glmnet

Replication of glmnet and StataCorp’s lasso #

Use Stata’s auto dataset with missing data dropped. The variable price1000 is used to illustrate scaling effects.

. sysuse auto, clear
. drop if rep78==.
. gen double price1000 = price/1000

Replication of glmnet #

To load the data into R for comparison with glmnet, use the following commands. The packages haven and tidyr need to be installed.

> auto <- haven::read_dta("http://www.stata-press.com/data/r9/auto.dta")
> auto <- tidyr::drop_na()
> n <- nrow(auto)
> price <- auto$price
> X <- auto[, c("mpg", "rep78", "headroom", "trunk", "weight", "length",
"turn", "displacement", "gear_ratio", "foreign")]
> X$foreign <- as.integer(X$foreign)
> X <- as.matrix(X)

Replication of StataCorp’s lasso #

Replication of StataCorp’s lasso and elasticnet requires only the rescaling of lambda by 2N. N=69; so the lasso2 lambda becomes 138000/(2x69) = 1000.

. lasso2 price mpg-foreign, lambda(138000)
. lasso linear price mpg-foreign, grid(1, min(1000))
. lassoselect lambda = 1000
. lassocoef, display(coef, penalized)

. lasso2 price mpg-foreign, alpha(0.6) lambda(138000)
. elasticnet linear price mpg-foreign, alphas(0.6) grid(1, min(1000))
. lassoselect alpha = 0.6 lambda = 1000
. lassocoef, display(coef, penalized)

Notes on invariance and objective function #

glmnet uses the same definition of the lasso L1 penalty as StataCorp’s lasso, so lasso2’s default parameterization again requires only rescaling by 2N. When the lglmnet option is used with the lglmnet option, the L1 penalty should be provided using the glmnet definition. To estimate in R, load glmnet with library("glmnet") and use the following command:

> r<-glmnet(X,price,alpha=1,lambda=1000,thresh=1e-15)

To achieve the same results with lasso2:

. lasso2 price mpg-foreign, lambda(138000)
. lasso2 price mpg-foreign, lambda(1000) lglmnet

The R code below uses glmnet to estimate an elastic net model. lasso2 with the lglmnet option will replicate it.

> r<-glmnet(X,price,alpha=0.6,lambda=1000,thresh=1e-15)

. lasso2 price mpg-foreign, alpha(0.6) lambda(1000) lglmnet

lasso2’s default parameterization of the elastic net (like StataCorp’s elasticnet) is not invariant to scaling:

. lasso2 price mpg-foreign, alpha(0.6) lambda(138000)
. lasso2 price1000 mpg-foreign, alpha(0.6) lambda(138)

When lasso2 uses the glmnet parameterization of the elastic net via the lglmnet option, results are invariant to scaling: the only difference is that the coefficients change by the same factor of proportionality as the dependent variable.

. lasso2 price mpg-foreign, alpha(0.6) lambda(1000) lglmnet
. lasso2 price1000 mpg-foreign, alpha(0.6) lambda(1) lglmnet

The reason that the default lasso2 paramaterization is (like StataCorp’s) no invariant to scaling because the penalty on L2 norm is influenced by scaling, and this in turn affects the relative weights on the L1 and L2 penalties. The example below shows how to reparameterize so that the default lasso2 parameterization for the elastic net replicates the glmnet parameterization. The example using the scaling above, where the dependent variable is price1000 and the glmnet lambda=1.

Note: The large-sample standard deviation of price1000 is equal to 2.8912586.

. qui sum price1000
. di r(sd) * 1/sqrt( r(N)/(r(N)-1))

The lasso2 alpha = alpha(lglmnet)xSD(y) / (1-alpha(glmnet) + alpha(glmnet)xSD(y)). In this example, alpha = 0.81262488.

. di (0.6*2.8912586)/( 1-0.6 + 0.6*2.8912586)

The lasso2 lambda = 2N*lambda(lglmnet) * (alpha(lglmnet) + (1-alpha(lglmnet))/SD(y)). In this example, lambda = 101.89203.

. di 2*69*( 0.6 + (1-0.6)/2.8912586)

lasso2 using the glmnet and then replicated using the lasso2/StataCorp parameterization:

. lasso2 price1000 mpg-foreign, alpha(0.6) lambda(1) lglmnet
. lasso2 price1000 mpg-foreign, alpha(.81262488) lambda(101.89203)