# Resources

### Refresher from STA 301

These notes provide a refresher on the material covered in STA 301:

### R resources

For those who want an R refresher, these resources can be useful:

Cheatsheets for R are amazing! Here are two of my favorites, so you can look up the functions we will be using frequently, explained in a simple way:

### Machine Learning

Here is a list of interesting readings if you want to explore more about the topics we cover in class and how they relate to the real world:

# FAQ

1) Is there a specific format I should follow when emailing a professor?

This is an important question, and not only for professors! (Disclaimer: This is part of a much larger hidden curriculum, so don’t worry if you don’t get it right straight away).

Useful tips for emailing a professor (or really anyone in a professional setting):

• Use an informative subject: In this case, “[STA 235] Your subject” is a good idea so I can immediately identify emails from students.

• Always include an appropriate greeting: Most students have this down, but “Hey” (or no greeting at all, when you are starting an email chain) is usually not considered professional. When addressing professors, a good greeting is usually “Dear Prof. Smith” or even “Hi Dr. Smith” (both work fine!). Try to avoid “Mr.” or “Ms.” (and stay far, far away from “Miss” or “Mrs.”) when referring to a professor.

• Quickly introduce yourself: Reminding professors who you are is always a good idea (especially in the first email you ever send). We are usually handling multiple sections and hundreds of students, so it’s a good way to help us remember names. Something like “I’m in your Tuesday 10am class” works great, because I can pin down the section.

• Be clear about the motive of the email: Feel free to use bold fonts, underlining, etc., and try to keep emails succinct. That way you can be sure your point is coming across, and you are more likely to get an appropriate response.

2) Should I go to office hours?

The answer is almost always YES! Office hours are a great way to check your knowledge or ask any questions you might have in a judgment-free zone. Even if you don’t know what to ask but feel somewhat lost, we can work backwards, try to pinpoint some of the key elements that might be confusing, and start from there.

3) What do I do if I get stuck with some code?

An important part of this course is for you to learn how to “teach” yourself how to code. How’s that? New tools (and languages!) are coming out more rapidly than you’ll ever be able to cover in one class, so learning how to find answers by yourself is an important asset. If you Google an error that you got in R, you’ll most likely find the answer on very useful websites like Stack Overflow (believe me, I do it all the time, and so does everyone who codes).

However, I don’t want you to get stuck either! If you don’t find the answer on the internet quickly (say, within 10 minutes), please reach out to the instruction team on Canvas. We are here to help!
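For example, here is one of the most common beginner errors in R: calling a function on an object that was never created (the object name `grades` is just illustrative). The error text, minus your own object name, is exactly what you would paste into a search engine. The sketch below uses `tryCatch` so the script keeps running while we capture the message:

```r
# 'grades' was never defined, so mean(grades) would normally stop with:
#   Error in mean(grades) : object 'grades' not found
# tryCatch captures the error message instead of halting the script:
msg <- tryCatch(mean(grades), error = function(e) conditionMessage(e))
msg  # "object 'grades' not found" -- this is the text you'd search for
```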

4) Notation is too hard! Can we get a cheat-sheet for this?

Ask and you shall receive! I created this short cheat sheet which hopefully helps. Let me know if you want me to include something else!

5) Why can lasso regression set coefficients to 0 and not ridge regression?

This is a great question! To answer it, we need to look at the optimization problems we are solving. Let’s start with a simple one-predictor regression, to make this easier.

As you remember, ridge regression is trying to solve the following problem:

$$\min_{\beta} \sum_i(y_i - \beta_0 - \beta_1 x_i)^2 + \lambda \beta_1^2 = \min_{\beta} F(\beta)$$

Taking the first order conditions (FOC) to solve this problem (meaning, differentiating and setting the derivative to 0), we get:

$$\frac{\partial F}{\partial \beta_1} = \sum 2\beta_1x^2 - \sum 2(y - \beta_0)x + 2\lambda\beta_1 = 0$$

Now solving for $\beta_1$:

$$\beta_1(\sum 2x^2 + 2\lambda) = \sum 2(y - \beta_0)x$$

$$\beta_1 = \frac{\sum (y - \beta_0)x}{\sum x^2 + \lambda}$$

So, as we see in this case, the only way $\beta_1$ reaches 0 at the optimum by manipulating $\lambda$ is if $\lambda \rightarrow \infty$, which never actually happens (asymptotically, $\beta_1$ can get very close to 0, but it is never exactly 0).
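Here is a quick numeric sketch of this ridge solution (simulated data, purely illustrative; the data are centered so $\beta_0$ drops out of the formula):

```r
# Numeric check of the ridge closed-form solution above.
# Data are centered so the intercept beta_0 drops out.
set.seed(235)
x <- rnorm(100)
y <- 2 * x + rnorm(100)
x <- x - mean(x)
y <- y - mean(y)

# beta_1 = sum((y - beta_0) * x) / (sum(x^2) + lambda), with beta_0 = 0
ridge_beta1 <- function(lambda) sum(y * x) / (sum(x^2) + lambda)

ridge_beta1(0)     # the OLS estimate (roughly 2 here)
ridge_beta1(100)   # shrunk toward 0
ridge_beta1(1e9)   # tiny, but still not exactly 0
```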

Now, let’s look at the same optimization problem but for Lasso:

$$\min_{\beta} \sum_i(y_i - \beta_0 - \beta_1 x_i)^2 + \lambda |\beta_1| = \min_{\beta} F(\beta)$$

Taking the first order conditions (FOC) to solve this problem (meaning, differentiating and setting the derivative to 0), we get (assuming $\beta_1>0$; it works the same way for $\beta_1<0$):

$$\frac{\partial F}{\partial \beta_1} = \sum 2\beta_1x^2 - \sum 2(y - \beta_0)x + \lambda = 0$$

Now solving for $\beta_1$:

$$\beta_1\sum 2x^2 = \sum 2(y - \beta_0)x - \lambda$$

$$\beta_1 = \frac{\sum (y - \beta_0)x - \lambda/2}{\sum x^2}$$

So, as we see in this case, $\beta_1$ reaches 0 at the optimal point if $\lambda = 2\sum (y - \beta_0)x$, which is a precise, finite value (and for any larger $\lambda$, the optimum stays at exactly $\beta_1 = 0$).
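Again, here is a quick numeric sketch of the lasso solution (simulated, illustrative data, centered so $\beta_0$ drops out). Note the contrast with ridge: a finite $\lambda$ is enough to push the coefficient to exactly 0:

```r
# Numeric check of the lasso closed-form solution above.
# Data are centered so the intercept beta_0 drops out.
set.seed(235)
x <- rnorm(100)
y <- 0.3 * x + rnorm(100)
x <- x - mean(x)
y <- y - mean(y)

z <- sum(y * x)  # the term sum((y - beta_0) * x), with beta_0 = 0

# Soft-thresholding: combines the beta_1 > 0 and beta_1 < 0 cases
lasso_beta1 <- function(lambda) sign(z) * max(0, abs(z) - lambda / 2) / sum(x^2)

lasso_beta1(0)           # the OLS estimate
lasso_beta1(abs(z))      # shrunk, but not yet 0
lasso_beta1(2 * abs(z))  # exactly 0, at a finite lambda
```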