- Understand the least squares method.
- Learn how to solve overdetermined systems.
- Derive the least squares solution mathematically.
- Apply the method to regression and optimization problems.
Definition of Least Squares
The **least squares method** finds the best approximate solution to an overdetermined system (one with more equations than unknowns).
For a system \( A\mathbf{x} = \mathbf{b} \), where \( A \) is \( m \times n \) with \( m > n \), an exact solution may not exist. The least squares method finds \( \mathbf{x} \) that minimizes:
\[ \| A\mathbf{x} - \mathbf{b} \| \]
Derivation of Least Squares Solution
To minimize \( \| A\mathbf{x} - \mathbf{b} \|^2 \), differentiate with respect to \( \mathbf{x} \) and set the derivative to zero:
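\[ A^T A \mathbf{x} = A^T \mathbf{b} \]
This equation is called the **normal equation**. When \( A^T A \) is invertible, which holds exactly when the columns of \( A \) are linearly independent, it gives the unique least squares solution:
\[ \mathbf{x} = (A^T A)^{-1} A^T \mathbf{b} \]
Proof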
We define the error function:
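\[ E(\mathbf{x}) = \| A\mathbf{x} - \mathbf{b} \|^2 = (A\mathbf{x} - \mathbf{b})^T (A\mathbf{x} - \mathbf{b}) = \mathbf{x}^T A^T A \mathbf{x} - 2 \mathbf{b}^T A \mathbf{x} + \mathbf{b}^T \mathbf{b} \]
Taking the gradient with respect to \( \mathbf{x} \) and setting it to zero:
\[ \nabla E(\mathbf{x}) = 2 A^T A \mathbf{x} - 2 A^T \mathbf{b} = 2 A^T (A\mathbf{x} - \mathbf{b}) = \mathbf{0} \]
Dividing by 2 and rearranging, we obtain: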
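\[ A^T A \mathbf{x} = A^T \mathbf{b} \]
Because \( A^T A \) is positive semidefinite, \( E \) is convex, so any \( \mathbf{x} \) satisfying this equation is a global minimizer.
Solving Overdetermined Systems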
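In an overdetermined system \( A\mathbf{x} = \mathbf{b} \), the least squares solution is the vector \( \mathbf{x} \) that minimizes the Euclidean norm \( \| \mathbf{r} \| \) of the residual \( \mathbf{r} = A\mathbf{x} - \mathbf{b} \). At the minimizer the residual is orthogonal to the columns of \( A \), that is, \( A^T \mathbf{r} = \mathbf{0} \), which is the normal equation again.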
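The procedure above can be sketched in a few lines of plain Python: form \( A^T A \) and \( A^T \mathbf{b} \), then solve the resulting square system. This is a minimal illustration, not a production solver; the function names (`normal_equations`, `solve_linear`, `least_squares`, `residual_norm`) and the 3-by-2 test system are hypothetical choices made for this example, and the singularity tolerance `1e-12` is an arbitrary assumption.

```python
from math import sqrt

def normal_equations(A, b):
    """Return (A^T A, A^T b) for a matrix A given as a list of rows."""
    m, n = len(A), len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    return AtA, Atb

def solve_linear(M, v):
    """Solve the square system M x = v by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [[float(e) for e in M[i]] + [float(v[i])] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:  # assumed tolerance
            raise ValueError("A^T A is singular: the columns of A are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def least_squares(A, b):
    """Least squares solution of A x = b via the normal equation."""
    AtA, Atb = normal_equations(A, b)
    return solve_linear(AtA, Atb)

def residual_norm(A, b, x):
    """Euclidean norm of the residual A x - b."""
    return sqrt(sum((sum(a * xi for a, xi in zip(row, x)) - bi) ** 2
                    for row, bi in zip(A, b)))

# Hypothetical overdetermined system with no exact solution.
A = [[1, 0], [0, 1], [1, 1]]
b = [1, 1, 1]
x = least_squares(A, b)
print(x, residual_norm(A, b, x))
```

For larger or ill-conditioned problems, a library routine such as NumPy's `numpy.linalg.lstsq` is usually a better choice than forming \( A^T A \) explicitly, because forming the product squares the condition number of the problem.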
Applications of Least Squares
- Linear Regression: Finding the best-fit line in data analysis.
- Optimization: Approximating solutions when exact ones are infeasible.
- Signal Processing: Noise reduction and signal reconstruction.
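As an illustration of the linear regression application, the sketch below fits a line \( y = c_0 + c_1 t \) to data points. Each point contributes one row \( [1, \ t_i] \) of \( A \) and one entry \( y_i \) of \( \mathbf{b} \), so the normal equation is a 2-by-2 system that can be solved in closed form. The function name `fit_line` and the data values are made up for this example.

```python
def fit_line(ts, ys):
    """Least squares fit of y = c0 + c1 * t; returns (c0, c1).

    The normal equation for the design matrix with rows [1, t_i] is
        [ n      sum(t)   ] [c0]   [ sum(y)   ]
        [ sum(t) sum(t^2) ] [c1] = [ sum(t*y) ]
    and is solved here with Cramer's rule.
    """
    n = len(ts)
    s_t = sum(ts)
    s_tt = sum(t * t for t in ts)
    s_y = sum(ys)
    s_ty = sum(t * y for t, y in zip(ts, ys))
    det = n * s_tt - s_t * s_t
    if det == 0:
        raise ValueError("all t values are equal, so the line is not determined")
    c0 = (s_tt * s_y - s_t * s_ty) / det
    c1 = (n * s_ty - s_t * s_y) / det
    return c0, c1

# Hypothetical noisy measurements lying roughly on y = 1 + 2t.
ts = [0, 1, 2, 3]
ys = [1.1, 2.9, 5.2, 6.8]
c0, c1 = fit_line(ts, ys)
print(f"intercept = {c0:.2f}, slope = {c1:.2f}")
```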
Examples
Example 1: Solve the overdetermined system using least squares:
\[ A = \begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 1 & 2 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 4 \\ 2 \\ 6 \end{bmatrix} \]
Step 1: Compute \( A^T A \) and \( A^T \mathbf{b} \):
\[ A^T A = \begin{bmatrix} 3 & 2 \\ 2 & 6 \end{bmatrix} \]
\[ A^T \mathbf{b} = \begin{bmatrix} 4 + 2 + 6 \\ 4 - 2 + 12 \end{bmatrix} = \begin{bmatrix} 12 \\ 14 \end{bmatrix} \]
Step 2: Solve \( A^T A \mathbf{x} = A^T \mathbf{b} \):
\[ \begin{bmatrix} 3 & 2 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 12 \\ 14 \end{bmatrix} \]
\[ x_1 = \frac{22}{7} \approx 3.14, \quad x_2 = \frac{9}{7} \approx 1.29 \]
Exercises
- Question 1: Find the least squares solution for: \[ A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \\ 1 & -1 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} \]
- Question 2: Explain why \( A^T A \) must be invertible for a unique least squares solution.
- Question 3: Derive the normal equation for least squares.
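- Answer 1: \( x_1 = \frac{4}{3}, \ x_2 = \frac{1}{3} \). Here \( A^T A = \begin{bmatrix} 6 & 3 \\ 3 & 6 \end{bmatrix} \) and \( A^T \mathbf{b} = \begin{bmatrix} 9 \\ 6 \end{bmatrix} \). This system happens to be consistent, so the residual is zero.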
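- Answer 2: \( A^T A \) is invertible exactly when the columns of \( A \) are linearly independent. In that case the normal equation \( A^T A \mathbf{x} = A^T \mathbf{b} \) has exactly one solution. Otherwise \( A \) has a nonzero null vector \( \mathbf{z} \), and adding any multiple of \( \mathbf{z} \) to a minimizer leaves \( A\mathbf{x} \) unchanged, so there are infinitely many least squares solutions.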
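- Answer 3: Expand \( \| A\mathbf{x} - \mathbf{b} \|^2 = \mathbf{x}^T A^T A \mathbf{x} - 2 \mathbf{b}^T A \mathbf{x} + \mathbf{b}^T \mathbf{b} \), take the gradient \( 2 A^T A \mathbf{x} - 2 A^T \mathbf{b} \), and set it to zero to obtain \( A^T A \mathbf{x} = A^T \mathbf{b} \).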
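A hand-computed least squares answer can be checked by testing whether \( A^T (A\mathbf{x} - \mathbf{b}) \) is the zero vector. The sketch below does this in exact rational arithmetic for the data of Question 1; the helper names `normal_residual` and `solve_two_unknowns` are hypothetical, and the solver assumes \( A \) has exactly two columns.

```python
from fractions import Fraction

def normal_residual(A, b, x):
    """Return A^T (A x - b); it is the zero vector exactly at a least squares solution."""
    r = [sum(a * xi for a, xi in zip(row, x)) - bi for row, bi in zip(A, b)]
    n = len(A[0])
    return [sum(A[k][j] * r[k] for k in range(len(A))) for j in range(n)]

def solve_two_unknowns(A, b):
    """Solve the 2-by-2 normal equation exactly using Cramer's rule."""
    p = sum(row[0] * row[0] for row in A)          # (A^T A)[0][0]
    q = sum(row[0] * row[1] for row in A)          # (A^T A)[0][1] = (A^T A)[1][0]
    s = sum(row[1] * row[1] for row in A)          # (A^T A)[1][1]
    u = sum(row[0] * bi for row, bi in zip(A, b))  # (A^T b)[0]
    v = sum(row[1] * bi for row, bi in zip(A, b))  # (A^T b)[1]
    det = Fraction(p * s - q * q)
    if det == 0:
        raise ValueError("A^T A is singular: no unique least squares solution")
    return [(s * u - q * v) / det, (p * v - q * u) / det]

# Data from Question 1.
A = [[2, 1], [1, 2], [1, -1]]
b = [3, 2, 1]
x = solve_two_unknowns(A, b)
print("solution:", x)
print("A^T (A x - b):", normal_residual(A, b, x))
```

Passing any other candidate vector to `normal_residual` returns a nonzero result, which makes arithmetic slips in a hand calculation easy to spot.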