
Linear Regression and Correlation

In this section, you will:

  • Understand the concepts of correlation and regression.
  • Compute and interpret the correlation coefficient.
  • Learn the simple linear regression model.
  • Estimate regression coefficients using least squares.
  • Visualize regression lines and data correlation.

Definition of Correlation

Correlation measures the strength and direction of the relationship between two variables. For example, hours spent studying and test scores tend to rise together, a relationship quantified in Example 1 below.

  • If two variables increase together, they have a positive correlation.
  • If one variable increases while the other decreases, they have a negative correlation.
  • If there is no systematic relationship, they are uncorrelated.

Correlation Coefficient

The **Pearson correlation coefficient** (\( r \)) measures the strength of a linear relationship:

\[ r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}} \]

Interpretation of \( r \), which always lies in the interval \([-1, 1]\):

| Correlation Coefficient (\( r \)) | Interpretation |
|---|---|
| \( 1.0 \) | Perfect positive correlation |
| \( 0.5 \) to \( 1.0 \) | Strong positive correlation |
| \( 0.0 \) to \( 0.5 \) | Weak positive correlation |
| \( 0.0 \) | No linear correlation |
| \( -0.5 \) to \( 0.0 \) | Weak negative correlation |
| \( -1.0 \) to \( -0.5 \) | Strong negative correlation |
| \( -1.0 \) | Perfect negative correlation |
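
As a quick sanity check of the formula, here is a minimal Python sketch (our own illustration; the function name `pearson_r` and the sample data are not from the original post) that computes \( r \) directly from the definition:

```python
import math

def pearson_r(xs, ys):
    """Compute the Pearson correlation coefficient from the definition."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of cross-deviations
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations
    den = math.sqrt(sum((x - x_bar) ** 2 for x in xs)
                    * sum((y - y_bar) ** 2 for y in ys))
    return num / den

# The study-hours data used in Example 1 later in this post
x = [2, 4, 6, 8, 10]
y = [60, 65, 70, 75, 80]
print(pearson_r(x, y))  # 1.0 -- these points lie exactly on a line
```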

Simple Linear Regression

Linear regression is a method for modeling the relationship between two variables using a straight-line equation:

\[ y = b_0 + b_1 x + \varepsilon \]

where:

  • \( y \) = dependent variable (response)
  • \( x \) = independent variable (predictor)
  • \( b_0 \) = intercept (value of \( y \) when \( x = 0 \))
  • \( b_1 \) = slope (change in \( y \) for a unit change in \( x \))
  • \( \varepsilon \) = error term
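
To make the role of the error term \( \varepsilon \) concrete, here is a small simulation sketch (entirely our own illustration; the coefficients and noise level are made up) that generates data from this model:

```python
import random

# Hypothetical true coefficients and noise standard deviation (illustrative only)
b0_true, b1_true, noise_sd = 55.0, 2.5, 3.0

random.seed(42)  # make the random draws reproducible
xs = [2, 4, 6, 8, 10]
# Each observation is the value on the line plus a random error term
ys = [b0_true + b1_true * x + random.gauss(0, noise_sd) for x in xs]
print(ys)
```

In practice we observe only the noisy \( (x_i, y_i) \) pairs; least squares, described next, recovers estimates of \( b_0 \) and \( b_1 \) from them.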

Least Squares Estimation

The regression coefficients (\( b_0 \) and \( b_1 \)) are estimated using the **least squares method**:

\[ b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \]

\[ b_0 = \bar{y} - b_1 \bar{x} \]

Derivation

The sum of squared residuals is given by:

\[ S = \sum (y_i - (b_0 + b_1 x_i))^2 \]

Setting the partial derivatives of \( S \) with respect to \( b_0 \) and \( b_1 \) equal to zero yields the normal equations:

\[ \frac{\partial S}{\partial b_0} = -2 \sum \big(y_i - b_0 - b_1 x_i\big) = 0 \]

\[ \frac{\partial S}{\partial b_1} = -2 \sum x_i \big(y_i - b_0 - b_1 x_i\big) = 0 \]

Solving this pair of equations simultaneously gives:

\[ b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \]

\[ b_0 = \bar{y} - b_1 \bar{x} \]
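
The closed-form estimates translate directly into code. The following Python sketch (our own illustration; the function name `least_squares` is not from the original post) implements the two formulas:

```python
def least_squares(xs, ys):
    """Estimate the intercept b0 and slope b1 by ordinary least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of cross-deviations over sum of squared x-deviations
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    # Intercept: forces the line through the point of means (x_bar, y_bar)
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = least_squares([2, 4, 6, 8, 10], [60, 65, 70, 75, 80])
print(b0, b1)  # 55.0 2.5 -- matches Example 1 below
```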

Visualization of Regression Line and Correlation
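
The figure shown here in the original post does not survive in text form. As a stand-in, the following matplotlib sketch (assuming `numpy` and `matplotlib` are installed) reproduces the idea: a scatter of the Example 1 data with the fitted regression line overlaid.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([2, 4, 6, 8, 10])
y = np.array([60, 65, 70, 75, 80])

# Fit y = b0 + b1*x; np.polyfit returns the highest-degree coefficient first
b1, b0 = np.polyfit(x, y, 1)

plt.scatter(x, y, label="Data")
plt.plot(x, b0 + b1 * x, color="red", label=f"y = {b0:.1f} + {b1:.1f}x")
plt.xlabel("Study Hours (x)")
plt.ylabel("Test Score (y)")
plt.title("Regression Line and Correlation")
plt.legend()
plt.show()
```

The closer the points cluster around the fitted line, the closer \( |r| \) is to 1.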

Examples

Example 1: A researcher collects the following data on study hours (\( x \)) and test scores (\( y \)):

| Study Hours (\( x \)) | Test Score (\( y \)) |
|---|---|
| 2 | 60 |
| 4 | 65 |
| 6 | 70 |
| 8 | 75 |
| 10 | 80 |

Compute the regression equation.

  • \( \bar{x} = 6 \), \( \bar{y} = 70 \)
  • \( \sum (x_i - \bar{x})(y_i - \bar{y}) = 100 \), \( \sum (x_i - \bar{x})^2 = 40 \)
  • \( b_1 = \frac{100}{40} = 2.5 \)
  • \( b_0 = 70 - 2.5 \times 6 = 55 \)

Regression equation:

\[ y = 55 + 2.5x \]
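
The fitted line can be used for prediction by direct substitution. For instance, a student who studies for 7 hours (a value chosen here for illustration) would be predicted to score

\[ y = 55 + 2.5 \times 7 = 72.5 \]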

Exercises

  • Question 1: Given the dataset \( (1,2), (2,4), (3,6), (4,8) \), compute \( b_0 \) and \( b_1 \).
  • Question 2: If the correlation coefficient between height and weight is \( r = 0.8 \), interpret its meaning.
  • Question 3: Find the regression line for \( (1,3), (2,5), (3,7) \).
  • Answer 1: \( b_0 = 0 \), \( b_1 = 2 \).
  • Answer 2: \( r = 0.8 \) indicates a strong positive linear relationship: taller individuals tend to be heavier.
  • Answer 3: \( y = 1 + 2x \).
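
To check the regression answers programmatically, you can reuse the `least_squares` sketch from earlier in this post (this snippet assumes that function is already defined in scope):

```python
# Assumes the least_squares function from the sketch above is defined
print(least_squares([1, 2, 3, 4], [2, 4, 6, 8]))  # (0.0, 2.0)  ->  y = 2x
print(least_squares([1, 2, 3], [3, 5, 7]))        # (1.0, 2.0)  ->  y = 1 + 2x
```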
