Simple Linear Regression with distinct x values – Everything Is OK

Suppose we have 3 observations, \((x_1, y_1) = (1, 2)\), \((x_2, y_2) = (2, 4)\), and \((x_3, y_3) = (3, 5)\).

Our model is

\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \\ \varepsilon_i \sim \text{Normal}(0, \sigma^2) \]

The design matrix is \(X = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}\)

We know that we can use the following code to find the estimates of \(\beta_0\) and \(\beta_1\):

X <- cbind(
  c(1, 1, 1),  # intercept column
  c(1, 2, 3)   # observed x values
)
y <- matrix(c(2, 4, 5))

# least-squares estimates: beta_hat = (X'X)^{-1} X'y
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y
beta_hat
##           [,1]
## [1,] 0.6666667
## [2,] 1.5000000
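As a sanity check (a sketch assuming the same toy data), R's built-in `lm()` reproduces the estimates from the matrix formula:

```r
x <- c(1, 2, 3)
y <- c(2, 4, 5)
fit <- lm(y ~ x)  # ordinary least squares via R's built-in fitter
coef(fit)         # intercept 0.6666667, slope 1.5
```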

Here is a picture of the RSS as a function of \(\beta_0\) and \(\beta_1\), with our estimates \((\hat{\beta}_0, \hat{\beta}_1)\) shown with a red point:

\[RSS = \sum_{i = 1}^n (y_i - \hat{y}_i)^2 = \{2 - (\beta_0 + \beta_1 \cdot 1)\}^2 + \{4 - (\beta_0 + \beta_1 \cdot 2)\}^2 + \{5 - (\beta_0 + \beta_1 \cdot 3)\}^2\]
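As a numerical companion to this surface (a sketch using the data above, with `rss()` a helper we define here), the sum can be evaluated directly; it attains its minimum of \(1/6\) at the least-squares estimates:

```r
# RSS for the data x = (1, 2, 3), y = (2, 4, 5)
rss <- function(b0, b1) {
  x <- c(1, 2, 3)
  y <- c(2, 4, 5)
  sum((y - (b0 + b1 * x))^2)
}

rss(2/3, 3/2)        # 1/6, the minimum, at (beta0_hat, beta1_hat)
rss(2/3 + 0.1, 3/2)  # larger: moving away from the estimates increases RSS
```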

Simple Linear Regression with one x value – Everything Is Broken

Suppose we have 3 observations, \((x_1, y_1) = (2, 2)\), \((x_2, y_2) = (2, 4)\), and \((x_3, y_3) = (2, 5)\), so every observation has the same x value.

Our model is

\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \\ \varepsilon_i \sim \text{Normal}(0, \sigma^2) \]

The design matrix is \(X = \begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 1 & 2 \end{bmatrix}\)

We know that there is not a unique \(\hat{\beta}\) that minimizes RSS because the columns of \(X\) are not linearly independent: the second column is 2 times the first, so \(X^\top X\) is singular and cannot be inverted.
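To see the failure concretely (a small sketch using the same constant-x data), the cross-product matrix \(X^\top X\) is singular, so the normal-equations code from the first section breaks down:

```r
X <- cbind(
  c(1, 1, 1),  # intercept column
  c(2, 2, 2)   # x values: all identical, so this column is 2 times the first
)
XtX <- t(X) %*% X
det(XtX)         # 0 (up to floating point): XtX has no inverse
try(solve(XtX))  # solve() reports a singular system instead of returning estimates
```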

Here is a picture of the RSS as a function of \(\beta_0\) and \(\beta_1\):

\[RSS = \sum_{i = 1}^n (y_i - \hat{y}_i)^2 = \{2 - (\beta_0 + \beta_1 \cdot 2)\}^2 + \{4 - (\beta_0 + \beta_1 \cdot 2)\}^2 + \{5 - (\beta_0 + \beta_1 \cdot 2)\}^2\]

Note that there is no unique pair \((\beta_0, \beta_1)\) that minimizes RSS: the RSS depends on the coefficients only through the common fitted value \(\beta_0 + 2\beta_1\), so every pair on the line \(\beta_0 + 2\beta_1 = \bar{y} = 11/3\) attains the same minimum.
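We can verify numerically (a sketch with the same data, using a small helper `rss()` defined here) that coefficient pairs with the same value of \(\beta_0 + 2\beta_1\) give identical RSS:

```r
# RSS for the constant-x data: x = (2, 2, 2), y = (2, 4, 5)
rss <- function(b0, b1) {
  x <- c(2, 2, 2)
  y <- c(2, 4, 5)
  sum((y - (b0 + b1 * x))^2)
}

# three different coefficient pairs, all with b0 + 2*b1 = 11/3 = mean(y)
rss(11/3, 0)        # 14/3
rss(11/3 - 2, 1)    # 14/3
rss(11/3 - 20, 10)  # 14/3
```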