Replication of glmnet and StataCorp’s lasso #
Use Stata’s auto dataset with missing data dropped. The variable price1000
is used to illustrate scaling effects.
. sysuse auto, clear
. drop if rep78==.
. gen double price1000 = price/1000
Replication of glmnet #
To load the data into R for comparison with glmnet
, use the following
commands. The packages haven
and tidyr
need to be installed.
> auto <- haven::read_dta("http://www.stata-press.com/data/r9/auto.dta")
> auto <- tidyr::drop_na()
> n <- nrow(auto)
> price <- auto$price
> X <- auto[, c("mpg", "rep78", "headroom", "trunk", "weight", "length",
"turn", "displacement", "gear_ratio", "foreign")]
> X$foreign <- as.integer(X$foreign)
> X <- as.matrix(X)
Replication of StataCorp’s lasso #
Replication of StataCorp’s lasso and elasticnet requires only the rescaling
of lambda by 2N. N=69; so the lasso2
lambda becomes 138000/(2x69) = 1000.
. lasso2 price mpg-foreign, lambda(138000)
. lasso linear price mpg-foreign, grid(1, min(1000))
. lassoselect lambda = 1000
. lassocoef, display(coef, penalized)
. lasso2 price mpg-foreign, alpha(0.6) lambda(138000)
. elasticnet linear price mpg-foreign, alphas(0.6) grid(1, min(1000))
. lassoselect alpha = 0.6 lambda = 1000
. lassocoef, display(coef, penalized)
Notes on invariance and objective function #
glmnet
uses the same definition of the lasso L1 penalty as StataCorp’s
lasso
, so lasso2
’s default parameterization again requires only rescaling by
2N. When the lglmnet
option is used with the lglmnet
option, the L1 penalty
should be provided using the glmnet
definition. To estimate in R, load
glmnet
with library("glmnet")
and use the following command:
> r<-glmnet(X,price,alpha=1,lambda=1000,thresh=1e-15)
To achieve the same results with lasso2
:
. lasso2 price mpg-foreign, lambda(138000)
. lasso2 price mpg-foreign, lambda(1000) lglmnet
The R code below uses glmnet
to estimate an elastic net model. lasso2
with
the lglmnet
option will replicate it.
> r<-glmnet(X,price,alpha=0.6,lambda=1000,thresh=1e-15)
. lasso2 price mpg-foreign, alpha(0.6) lambda(1000) lglmnet
lasso2
’s default parameterization of the elastic net (like StataCorp’s
elasticnet
) is not invariant to scaling:
. lasso2 price mpg-foreign, alpha(0.6) lambda(138000)
. lasso2 price1000 mpg-foreign, alpha(0.6) lambda(138)
When lasso2
uses the glmnet
parameterization of the elastic net via the
lglmnet
option, results are invariant to scaling: the only difference is
that the coefficients change by the same factor of proportionality as the
dependent variable.
. lasso2 price mpg-foreign, alpha(0.6) lambda(1000) lglmnet
. lasso2 price1000 mpg-foreign, alpha(0.6) lambda(1) lglmnet
The reason that the default lasso2
paramaterization
is (like StataCorp’s) no invariant to scaling because the penalty on L2 norm is influenced by
scaling, and this in turn affects the relative weights on the L1 and L2
penalties. The example below shows how to reparameterize so that the
default lasso2
parameterization for the elastic net replicates the glmnet
parameterization. The example using the scaling above, where the dependent
variable is price1000
and the glmnet
lambda=1.
Note: The large-sample standard deviation of price1000
is equal to 2.8912586.
. qui sum price1000
. di r(sd) * 1/sqrt( r(N)/(r(N)-1))
The lasso2
alpha = alpha(lglmnet)xSD(y) / (1-alpha(glmnet) +
alpha(glmnet)xSD(y)). In this example, alpha = 0.81262488.
. di (0.6*2.8912586)/( 1-0.6 + 0.6*2.8912586)
The lasso2
lambda = 2N*lambda(lglmnet) * (alpha(lglmnet) +
(1-alpha(lglmnet))/SD(y)). In this example, lambda = 101.89203.
. di 2*69*( 0.6 + (1-0.6)/2.8912586)
lasso2
using the glmnet
and then replicated using the lasso2
/StataCorp
parameterization:
. lasso2 price1000 mpg-foreign, alpha(0.6) lambda(1) lglmnet
. lasso2 price1000 mpg-foreign, alpha(.81262488) lambda(101.89203)