Least Squares Method

Understand the least squares method.
Learn how to solve overdetermined systems.
Derive the least squares solution mathematically.
Apply the method to regression and optimization problems.

Definition of Least Squares
Derivation of Least Squares Solution
Solving Overdetermined Systems
Applications of Least Squares
Examples

Definition of Least Squares

The **least squares method** is a technique for finding the best approximate solution to an overdetermined system (one with more equations than unknowns).

For a system \( A\mathbf{x} = \mathbf{b} \), where \( A \) is \( m \times n \) with \( m > n \), an exact solution generally does not exist. The least squares method finds the \( \mathbf{x} \) that minimizes the Euclidean norm of the residual:

\[ \| A\mathbf{x} - \mathbf{b} \| \]

Derivation of Least Squares Solution

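To minimize \( \| A\mathbf{x} - \mathbf{b} \|^2 \) (which has the same minimizer as \( \| A\mathbf{x} - \mathbf{b} \| \)), take the gradient with respect to \( \mathbf{x} \) and set it to zero:

\[ A^T A \mathbf{x} = A^T \mathbf{b} \]

This equation is called the **normal equation**. When the columns of \( A \) are linearly independent, \( A^T A \) is invertible and the least squares solution is unique: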

\[ \mathbf{x} = (A^T A)^{-1} A^T \mathbf{b} \]
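
The formula above can be turned into a short program. The following is a minimal sketch, not a production solver: it assumes \( A \) is given as a list of rows with linearly independent columns, uses exact rational arithmetic from the standard library, and the helper names (`normal_equations`, `solve_square`, `least_squares`) and the toy data are made up for illustration.

```python
from fractions import Fraction


def normal_equations(A, b):
    """Return (A^T A, A^T b) for a matrix A given as a list of rows."""
    cols = list(zip(*A))  # columns of A, i.e. rows of A^T
    AtA = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    Atb = [sum(p * q for p, q in zip(ci, b)) for ci in cols]
    return AtA, Atb


def solve_square(M, rhs):
    """Solve the square system M x = rhs exactly by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot_row = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot_row is None:
            raise ValueError("A^T A is singular; the columns of A are dependent")
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def least_squares(A, b):
    """Least squares solution of A x = b via the normal equation."""
    AtA, Atb = normal_equations(A, b)
    return solve_square(AtA, Atb)


# Hypothetical toy system: three equations, two unknowns.
A = [[1, 0], [0, 1], [1, 1]]
b = [1, 1, 1]
x = least_squares(A, b)
print(x)
```

Forming \( A^T A \) explicitly is fine for small, well-conditioned problems like this one. For larger or ill-conditioned floating-point problems, methods based on QR or SVD factorizations are usually preferred.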

Proof

We define the error function:

\[ E(\mathbf{x}) = \| A\mathbf{x} - \mathbf{b} \|^2 \]

Taking the gradient with respect to \( \mathbf{x} \) and setting it to zero:

\[ \frac{d}{d\mathbf{x}} (A\mathbf{x} - \mathbf{b})^T (A\mathbf{x} - \mathbf{b}) = 2 A^T (A\mathbf{x} - \mathbf{b}) = 0 \]

Dividing by 2 and rearranging, we obtain the normal equation:

\[ A^T A \mathbf{x} = A^T \mathbf{b} \]
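
For readers who want the differentiation step spelled out, one way to see it is to expand the quadratic first. This is the standard expansion, written here as a supplement to the proof; it uses only the symmetry of \( A^T A \):

\[ E(\mathbf{x}) = (A\mathbf{x} - \mathbf{b})^T (A\mathbf{x} - \mathbf{b}) = \mathbf{x}^T A^T A \mathbf{x} - 2 \mathbf{b}^T A \mathbf{x} + \mathbf{b}^T \mathbf{b} \]

\[ \nabla E(\mathbf{x}) = 2 A^T A \mathbf{x} - 2 A^T \mathbf{b} = 2 A^T (A\mathbf{x} - \mathbf{b}) \]

The second derivative (Hessian) of \( E \) is \( 2 A^T A \), which is positive semidefinite, so any point where the gradient vanishes is a minimum rather than a maximum or saddle point.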

Solving Overdetermined Systems

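In an overdetermined system \( A\mathbf{x} = \mathbf{b} \), the least squares solution is the vector \( \mathbf{x} \) that minimizes the norm \( \| \mathbf{r} \| \) of the residual \( \mathbf{r} = A\mathbf{x} - \mathbf{b} \). Equivalently, by the normal equation, the residual at the solution satisfies \( A^T \mathbf{r} = \mathbf{0} \): it is orthogonal to every column of \( A \).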
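
The normal equation says that \( A^T (A\mathbf{x} - \mathbf{b}) = \mathbf{0} \) at the least squares solution, so the residual is orthogonal to the columns of \( A \). The sketch below checks this on a made-up toy system whose least squares solution is \( (2/3, 2/3) \), and compares the squared residual norm with that of an arbitrary other candidate. The data and the helper names are illustrative assumptions.

```python
from fractions import Fraction


def residual(A, x, b):
    """Residual r = A x - b for a matrix A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) - bi for row, bi in zip(A, b)]


def squared_norm(v):
    """Squared Euclidean norm of a vector."""
    return sum(c * c for c in v)


# Hypothetical toy system: three equations, two unknowns.
A = [[1, 0], [0, 1], [1, 1]]
b = [1, 1, 1]
x_ls = [Fraction(2, 3), Fraction(2, 3)]    # least squares solution of this system
x_other = [Fraction(1), Fraction(1, 2)]    # an arbitrary competing candidate

r = residual(A, x_ls, b)

# A^T r: dot product of each column of A with the residual.
At_r = [sum(row[j] * ri for row, ri in zip(A, r)) for j in range(len(A[0]))]

print("A^T r =", At_r)
print("squared residual norm at x_ls   :", squared_norm(r))
print("squared residual norm at x_other:", squared_norm(residual(A, x_other, b)))
```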

Applications of Least Squares

Linear Regression: Finding the best-fit line in data analysis.
Optimization: Approximating solutions when exact ones are infeasible.
Signal Processing: Noise reduction and signal reconstruction.
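
As an illustration of the linear regression case, fitting a line \( y = c + m t \) to data points is a least squares problem whose matrix \( A \) has rows \( [1, t_i] \). The sketch below solves the resulting 2-by-2 normal equation directly. The function name `fit_line` and the four data points are invented for this example and do not come from any real dataset.

```python
from fractions import Fraction


def fit_line(ts, ys):
    """Least squares intercept c and slope m for the model y = c + m * t."""
    n = len(ts)
    s_t = sum(ts)
    s_tt = sum(t * t for t in ts)
    s_y = sum(ys)
    s_ty = sum(t * y for t, y in zip(ts, ys))
    # Normal equation for the matrix with rows [1, t]:
    #   [ n    s_t  ] [c]   [ s_y  ]
    #   [ s_t  s_tt ] [m] = [ s_ty ]
    det = Fraction(n * s_tt - s_t * s_t)
    if det == 0:
        raise ValueError("all t values are equal; the line is not determined")
    # Cramer's rule for the 2-by-2 system.
    c = Fraction(s_y * s_tt - s_t * s_ty) / det
    m = Fraction(n * s_ty - s_t * s_y) / det
    return c, m


# Made-up sample points (t, y), for illustration only.
ts = [0, 1, 2, 3]
ys = [1, 2, 2, 4]
c, m = fit_line(ts, ys)
print("intercept:", c, "slope:", m)
```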

Examples

Example 1: Solve the overdetermined system using least squares:

\[ A = \begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 1 & 2 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 4 \\ 2 \\ 6 \end{bmatrix} \]

Step 1: Compute \( A^T A \) and \( A^T \mathbf{b} \):

\[ A^T A = \begin{bmatrix} 3 & 2 \\ 2 & 6 \end{bmatrix}, \quad A^T \mathbf{b} = \begin{bmatrix} 12 \\ 14 \end{bmatrix} \]

Step 2: Solve \( A^T A \mathbf{x} = A^T \mathbf{b} \):

\[ \begin{bmatrix} 3 & 2 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 12 \\ 14 \end{bmatrix} \]

\[ x_1 = \frac{22}{7} \approx 3.14, \quad x_2 = \frac{9}{7} \approx 1.29 \]
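
The arithmetic in Example 1 can be checked mechanically. The sketch below recomputes \( A^T A \) and \( A^T \mathbf{b} \) from the given \( A \) and \( \mathbf{b} \), solves the 2-by-2 system with Cramer's rule in exact rational arithmetic, and prints the residual. The variable names are illustrative choices.

```python
from fractions import Fraction

# Data from Example 1.
A = [[1, 1], [1, -1], [1, 2]]
b = [4, 2, 6]

cols = list(zip(*A))  # columns of A, i.e. rows of A^T
AtA = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
Atb = [sum(p * q for p, q in zip(ci, b)) for ci in cols]

# Cramer's rule for the 2-by-2 normal equation.
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
x1 = Fraction(Atb[0] * AtA[1][1] - AtA[0][1] * Atb[1], det)
x2 = Fraction(AtA[0][0] * Atb[1] - AtA[1][0] * Atb[0], det)

# Residual r = A x - b at the computed solution.
r = [row[0] * x1 + row[1] * x2 - bi for row, bi in zip(A, b)]

print("A^T A =", AtA)
print("A^T b =", Atb)
print("x1 =", x1, " x2 =", x2)
print("residual =", r)
```

Since the system is overdetermined, the residual is generally nonzero; what matters is that it is orthogonal to both columns of \( A \).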

Exercises

Question 1: Find the least squares solution for:
Question 2: Explain why \( A^T A \) must be invertible for a unique least squares solution.
Question 3: Derive the normal equation for least squares.

Answer 1: \( x_1 = 1.2, x_2 = 0.8 \).
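Answer 2: The least squares solutions are exactly the solutions of \( A^T A \mathbf{x} = A^T \mathbf{b} \). This square system has exactly one solution if and only if \( A^T A \) is invertible, which happens precisely when the columns of \( A \) are linearly independent. Otherwise the normal equation has infinitely many solutions, all with the same residual norm.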
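Answer 3: The normal equation is derived by minimizing \( E(\mathbf{x}) = \| A\mathbf{x} - \mathbf{b} \|^2 \): setting the gradient \( 2 A^T (A\mathbf{x} - \mathbf{b}) \) to zero gives \( A^T A \mathbf{x} = A^T \mathbf{b} \).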
