This week we will be talking about model selection and regularization, in particular lasso and ridge regression.
James, G. et al. (2021). “An Introduction to Statistical Learning with Applications in R” (ISLR). Chapter 6.2 (pg. 237-242). Note: For ISLR readings, don’t get caught up in the math.
Josh Starmer. (2018). “Regularization Part 1: Ridge (L2) Regression”. Video materials from StatQuest. Note: I usually watch his videos at x1.5 speed.
Here is the R code we will review in class, with many additional questions! Remember to review it in detail after class: Download
Check out the in-class activity we did for this week: Download
(The answers for this are here: Download)
Here, I provide a simple example to understand why this happens. Let's think about the simplest scenario: just one data point $(x, y)$ and one predictor (we won't take the intercept into account, because it doesn't affect the argument). As seen in class, the objective function that ridge regression is trying to minimize is the following:

$$f(\beta) = (y - \beta x)^2 + \lambda \beta^2$$

Then, to find the optimal $\hat{\beta}$, we take the derivative with respect to $\beta$ and set it equal to zero:

$$\frac{df}{d\beta} = -2x(y - \beta x) + 2\lambda\beta = 0 \quad\Longrightarrow\quad \hat{\beta}_{\text{ridge}} = \frac{xy}{x^2 + \lambda}$$

In this case, for non-zero values of $xy$, the ridge estimate is never exactly zero: increasing $\lambda$ only grows the denominator, so the coefficient shrinks toward zero but reaches it only in the limit $\lambda \to \infty$.

In the case of lasso, now, assuming a positive value for $\beta$ (so that $|\beta| = \beta$), the objective function is:

$$f(\beta) = (y - \beta x)^2 + \lambda \beta$$

The first order condition (FOC) for this objective function is:

$$\frac{df}{d\beta} = -2x(y - \beta x) + \lambda = 0 \quad\Longrightarrow\quad \hat{\beta}_{\text{lasso}} = \frac{2xy - \lambda}{2x^2}$$

Now, we can actually set $\hat{\beta}_{\text{lasso}}$ exactly to zero with a finite penalty: whenever $\lambda \ge 2xy$, the FOC no longer has a positive solution, and the minimum of the objective lands at the kink of $|\beta|$ at zero. This is why lasso can perform variable selection (dropping predictors entirely) while ridge only shrinks coefficients.
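The one-data-point derivation above can be checked numerically. This is a minimal sketch (in Python rather than the R code used in class, purely for convenience) that brute-forces the minimum of each objective over a fine grid and compares it to the closed-form solutions $\hat{\beta}_{\text{ridge}} = xy/(x^2+\lambda)$ and $\hat{\beta}_{\text{lasso}} = \max(0, (2xy-\lambda)/(2x^2))$; the specific values of $x$, $y$, and $\lambda$ are arbitrary choices for illustration.

```python
import numpy as np

# One data point (x, y), one predictor, no intercept; penalty strength lam.
x, y, lam = 2.0, 3.0, 1.0

# Brute-force minimum over a fine grid of candidate coefficients.
betas = np.linspace(-2.0, 2.0, 400001)  # step size 1e-5
ridge_loss = (y - betas * x) ** 2 + lam * betas ** 2
lasso_loss = (y - betas * x) ** 2 + lam * np.abs(betas)
beta_ridge_grid = betas[np.argmin(ridge_loss)]
beta_lasso_grid = betas[np.argmin(lasso_loss)]

# Closed-form solutions from the derivation above.
beta_ridge = x * y / (x ** 2 + lam)                      # xy / (x^2 + lambda)
beta_lasso = max(0.0, (2 * x * y - lam) / (2 * x ** 2))  # zero once lam >= 2xy

print(beta_ridge_grid, beta_ridge)  # both approx. 1.2
print(beta_lasso_grid, beta_lasso)  # both approx. 1.375

# With a large enough penalty (lam >= 2xy = 12 here), lasso is exactly zero,
# while ridge is merely shrunk toward zero and stays non-zero.
big_lam = 13.0
beta_lasso_big = max(0.0, (2 * x * y - big_lam) / (2 * x ** 2))
beta_ridge_big = x * y / (x ** 2 + big_lam)
print(beta_lasso_big, beta_ridge_big)  # 0.0 vs. a small positive number
```

Note how the grid search agrees with the closed forms, and how only the lasso coefficient hits exactly zero once the penalty crosses the $2xy$ threshold.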