In this module, you’ll learn to compute linear regression models in R. Feel free to skip review sections if you are confident in your knowledge.

Videos: 20 min

Readings: 0 min

Activities: 30 min

Check-ins: 3

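```
library(tidyverse)       # provides %>% and ggplot2
library(palmerpenguins)  # provides the penguins data

penguins %>%
  ggplot(aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point() +
  stat_smooth(method = "lm")  # overlay the fitted least-squares line
```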

``## `geom_smooth()` using formula 'y ~ x'``

`## Warning: Removed 2 rows containing non-finite values (stat_smooth).`

`## Warning: Removed 2 rows containing missing values (geom_point).`
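
The three outputs below look like a printed `lm` fit, its `summary()`, and its `tidy()` table. A minimal sketch of code that would produce output of this shape follows; the object name `my_model` is an assumption (the page does not show it), and `tidy()` is assumed to come from the broom package.

```r
library(tidyverse)
library(palmerpenguins)
library(broom)  # assumed source of tidy()

# `my_model` is an assumed name; the formula matches the Call shown in the output
my_model <- penguins %>%
  lm(bill_length_mm ~ bill_depth_mm, data = .)

my_model           # prints the call and the two coefficients
summary(my_model)  # adds residuals, standard errors, t tests, and R-squared
tidy(my_model)     # returns the coefficient table as a tibble
```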

```
##
## Call:
## lm(formula = bill_length_mm ~ bill_depth_mm, data = .)
##
## Coefficients:
##   (Intercept)  bill_depth_mm
##       55.0674        -0.6498
```

```
##
## Call:
## lm(formula = bill_length_mm ~ bill_depth_mm, data = .)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -12.8949  -3.9042  -0.3772   3.6800  15.5798
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)    55.0674     2.5160  21.887  < 2e-16 ***
## bill_depth_mm  -0.6498     0.1457  -4.459 1.12e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.314 on 340 degrees of freedom
## (2 observations deleted due to missingness)
## Multiple R-squared: 0.05525, Adjusted R-squared: 0.05247
## F-statistic: 19.88 on 1 and 340 DF, p-value: 1.12e-05
```

```
## # A tibble: 2 x 5
##   term          estimate std.error statistic  p.value
##   <chr>            <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)     55.1       2.52      21.9  6.91e-67
## 2 bill_depth_mm   -0.650     0.146     -4.46 1.12e- 5
```

**Question 1: Code**

What is the `data = .` argument in the `lm()` function?

What happens if you switch the order of `bill_length_mm` and `bill_depth_mm` in the `lm()` formula?

What object type was returned by `summary()`? What about by `tidy()`?

**Question 2: Interpretation**

What is the equation for the regression line?

Penguin Bob has a bill that is 5 mm deeper than Penguin Judy’s. How much longer do you expect Penguin Bob’s bill to be?

Is the relationship between bill length and bill depth statistically significant?
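
If you want to check your reasoning numerically, one option is to ask the fitted model for predictions at two bill depths that differ by 5 mm. This is only a sketch: `fit` and `new_penguins` are hypothetical names, and the depths 15 and 20 are arbitrary illustrative values.

```r
library(palmerpenguins)

# Refit the simple regression (base R call, no pipe needed)
fit <- lm(bill_length_mm ~ bill_depth_mm, data = penguins)

# Two hypothetical penguins whose bill depths differ by 5 mm
new_penguins <- data.frame(bill_depth_mm = c(15, 20))

# Predicted bill lengths, then the deeper bill minus the shallower one
predicted <- predict(fit, newdata = new_penguins)
diff(predicted)
```

Because the model is a straight line, the difference does not depend on which two depths you pick, only on the 5 mm gap between them.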

**Question 3: A more complex model**

Run the following code, and explore the results:

```
my_model_2 <- penguins %>%
  lm(bill_length_mm ~ bill_depth_mm:species, data = .)

my_model_3 <- penguins %>%
  lm(bill_length_mm ~ bill_depth_mm*species, data = .)
```
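
One way to explore these fits is to print their coefficient tables and then line up the model-level statistics side by side. The sketch below assumes the broom package for `tidy()` and `glance()`; the name `comparison` is hypothetical, and the two models are refit only so the block runs on its own.

```r
library(tidyverse)
library(palmerpenguins)
library(broom)  # assumed source of tidy() and glance()

# Same fits as in the block above, repeated so this sketch is self-contained
my_model_2 <- penguins %>%
  lm(bill_length_mm ~ bill_depth_mm:species, data = .)
my_model_3 <- penguins %>%
  lm(bill_length_mm ~ bill_depth_mm*species, data = .)

# Coefficient tables: compare which terms each formula creates
tidy(my_model_2)
tidy(my_model_3)

# One row of fit statistics per model (`comparison` is a hypothetical name)
comparison <- list(model_2 = my_model_2, model_3 = my_model_3) %>%
  map(glance) %>%
  bind_rows(.id = "model") %>%
  select(model, r.squared, adj.r.squared, sigma, df.residual)

comparison
```

Adding the first model to the list in the same way would put all three fits in one table.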

Make a plot illustrating `my_model_2`. *(Hint: what needs to change in the aesthetic of the plot above?)*

Which model of the three explains the most variance in the response variable?

Do the three species of penguin have the same average bill length? How do you know?

Do the three species of penguin have the same bill shape (i.e., the relationship between length and depth)? How do you know?