This week we will be talking about model selection and regularization, in particular lasso and ridge regression.
James, G. et al. (2021). “An Introduction to Statistical Learning with Applications in R” (ISLR). Chapter 6.2 (pg. 237-242). Note: For ISLR readings, don’t get caught up in the math.
Josh Starmer. (2018). “Regularization Part 1: Ridge (L2) Regression”. Video materials from StatQuest. Note: I usually watch his videos at x1.5 speed.
Here is the R code we will review in class, with many additional questions! Remember to review it in detail after class: Download
Check out the in-class activity we did for this week: Download
(The answers for this are here: Download)
Here, I provide a simple example to understand why this happens. Let's think about the simplest scenario: just one data point $(x, y)$ and one predictor (we won't take the intercept into account, because it doesn't affect the argument). As seen in class, the objective function that ridge regression is trying to minimize is the following:

$$f(\beta) = (y - \beta x)^2 + \lambda \beta^2$$

Then, to find the optimal $\hat{\beta}$, we take the derivative with respect to $\beta$ and set it equal to zero:

$$\frac{df}{d\beta} = -2x(y - \beta x) + 2\lambda\beta = 0 \quad\Longrightarrow\quad \hat{\beta}_{\text{ridge}} = \frac{xy}{x^2 + \lambda}$$

In this case, for non-zero values of $xy$, the ridge estimate is never exactly zero: increasing $\lambda$ only grows the denominator, so the coefficient shrinks toward zero but reaches it only in the limit $\lambda \to \infty$.

In the case of lasso, now, assuming a positive value for $\beta$ (so that $|\beta| = \beta$), the objective function is:

$$f(\beta) = (y - \beta x)^2 + \lambda \beta$$

The first order condition (FOC) for this objective function is:

$$\frac{df}{d\beta} = -2x(y - \beta x) + \lambda = 0 \quad\Longrightarrow\quad \hat{\beta}_{\text{lasso}} = \frac{2xy - \lambda}{2x^2}$$

Now, we can actually set $\hat{\beta}_{\text{lasso}}$ exactly to zero with a finite penalty: whenever $\lambda \ge 2xy$, the FOC no longer has a positive solution, and the minimum of the objective lands at the kink of $|\beta|$ at zero. This is why lasso can perform variable selection (dropping predictors entirely) while ridge only shrinks coefficients.
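The one-data-point derivation above can be checked numerically. This is a minimal sketch (in Python rather than the R code used in class, purely for convenience) that brute-forces the minimum of each objective over a fine grid and compares it to the closed-form solutions $\hat{\beta}_{\text{ridge}} = xy/(x^2+\lambda)$ and $\hat{\beta}_{\text{lasso}} = \max(0, (2xy-\lambda)/(2x^2))$; the specific values of $x$, $y$, and $\lambda$ are arbitrary choices for illustration.

```python
import numpy as np

# One data point (x, y), one predictor, no intercept; penalty strength lam.
x, y, lam = 2.0, 3.0, 1.0

# Brute-force minimum over a fine grid of candidate coefficients.
betas = np.linspace(-2.0, 2.0, 400001)  # step size 1e-5
ridge_loss = (y - betas * x) ** 2 + lam * betas ** 2
lasso_loss = (y - betas * x) ** 2 + lam * np.abs(betas)
beta_ridge_grid = betas[np.argmin(ridge_loss)]
beta_lasso_grid = betas[np.argmin(lasso_loss)]

# Closed-form solutions from the derivation above.
beta_ridge = x * y / (x ** 2 + lam)                      # xy / (x^2 + lambda)
beta_lasso = max(0.0, (2 * x * y - lam) / (2 * x ** 2))  # zero once lam >= 2xy

print(beta_ridge_grid, beta_ridge)  # both approx. 1.2
print(beta_lasso_grid, beta_lasso)  # both approx. 1.375

# With a large enough penalty (lam >= 2xy = 12 here), lasso is exactly zero,
# while ridge is merely shrunk toward zero and stays non-zero.
big_lam = 13.0
beta_lasso_big = max(0.0, (2 * x * y - big_lam) / (2 * x ** 2))
beta_ridge_big = x * y / (x ** 2 + big_lam)
print(beta_lasso_big, beta_ridge_big)  # 0.0 vs. a small positive number
```

Note how the grid search agrees with the closed forms, and how only the lasso coefficient hits exactly zero once the penalty crosses the $2xy$ threshold.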