Linear Regression
Simple Linear Regression Model
General Form:
\[y = a + bx + \varepsilon\]
- y = dependent variable (response variable)
- x = independent variable (predictor variable)
- a = y-intercept (constant term)
- b = slope (regression coefficient)
- \(\varepsilon\) = random error term (accounts for scatter about the line)
Estimated Regression Line (Least Squares Line):
\[\hat{y} = a + bx\]
- \(\hat{y}\) = predicted value of y
- Minimizes the sum of squared residuals
Calculation of Regression Coefficients
Slope (b):
\[b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}\]
Alternative formula for slope:
\[b = \frac{S_{xy}}{S_{xx}}\]
Y-Intercept (a):
\[a = \bar{y} - b\bar{x}\]
- n = number of data points
- \(\bar{x}\) = mean of x values = \(\frac{\sum x}{n}\)
- \(\bar{y}\) = mean of y values = \(\frac{\sum y}{n}\)
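These computational formulas can be sketched directly in Python. The data below are hypothetical values chosen only for illustration:

```python
# Hypothetical sample data (illustration only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# Slope: b = (n*Σxy - Σx*Σy) / (n*Σx² - (Σx)²)
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# Intercept: a = ȳ - b*x̄
a = sum_y / n - b * (sum_x / n)
print(b, a)  # b ≈ 0.6, a ≈ 2.2 for this data
```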
Sum of Squares
Sum of Squares for x (\(S_{xx}\)):
\[S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}\]
Sum of Squares for y (\(S_{yy}\)):
\[S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}\]
Sum of Cross Products (\(S_{xy}\)):
\[S_{xy} = \sum xy - \frac{\sum x \sum y}{n}\]
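The shortcut sums above give the same slope as the raw-sum formula. A minimal sketch, again with made-up data:

```python
x = [1, 2, 3, 4, 5]   # hypothetical data
y = [2, 4, 5, 4, 5]
n = len(x)

S_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
S_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
S_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

b = S_xy / S_xx   # identical to the raw-sum slope formula
print(S_xx, S_yy, S_xy, b)  # 10.0 6.0 6.0 0.6
```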
Residuals and Error Analysis
Residual:
\[e_i = y_i - \hat{y}_i\]
- \(e_i\) = residual for observation i
- \(y_i\) = observed value
- \(\hat{y}_i\) = predicted value
Sum of Squared Errors (SSE):
\[SSE = \sum (y_i - \hat{y}_i)^2 = \sum e_i^2\]
Alternative formula for SSE:
\[SSE = S_{yy} - b \cdot S_{xy}\]
Total Sum of Squares (SST):
\[SST = \sum (y_i - \bar{y})^2 = S_{yy}\]
Regression Sum of Squares (SSR):
\[SSR = \sum (\hat{y}_i - \bar{y})^2\]
Relationship:
\[SST = SSR + SSE\]
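The decomposition SST = SSR + SSE can be verified numerically. A sketch using hypothetical data (a and b are the least-squares estimates for this particular data):

```python
x = [1, 2, 3, 4, 5]       # hypothetical data
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6           # least-squares fit for this data
n = len(x)
y_bar = sum(y) / n
y_hat = [a + b * xi for xi in x]

SSE = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # ≈ 2.4
SSR = sum((yh - y_bar) ** 2 for yh in y_hat)            # ≈ 3.6
SST = sum((yi - y_bar) ** 2 for yi in y)                # ≈ 6.0
# Decomposition holds up to floating-point rounding
assert abs(SST - (SSR + SSE)) < 1e-9
```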
Standard Error of Estimate
Standard Error of the Estimate (\(s_e\) or \(s_{y/x}\)):
\[s_e = \sqrt{\frac{SSE}{n-2}}\]
- Measures the typical distance data points fall from the regression line
- Denominator uses (n-2) degrees of freedom for simple linear regression
- Units are the same as the dependent variable y
Alternative formula:
\[s_e = \sqrt{\frac{\sum(y_i - \hat{y}_i)^2}{n-2}}\]
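A short sketch of the standard error computation, with the same style of hypothetical data:

```python
import math

x = [1, 2, 3, 4, 5]       # hypothetical data
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6           # least-squares fit for this data
n = len(x)

SSE = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(SSE / (n - 2))   # sqrt(2.4 / 3) ≈ 0.894, in units of y
```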
Correlation
Correlation Coefficient
Pearson Correlation Coefficient (r):
\[r = \frac{S_{xy}}{\sqrt{S_{xx} \cdot S_{yy}}}\]
Alternative formula:
\[r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}\]
- Range: -1 ≤ r ≤ +1
- r = +1: perfect positive linear correlation
- r = -1: perfect negative linear correlation
- r = 0: no linear correlation
- r is dimensionless (no units)
Relationship between r and slope b:
\[r = b \cdot \frac{\sqrt{S_{xx}}}{\sqrt{S_{yy}}}\]
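Both formulas for r, and the relationship to the slope, can be checked in a few lines (data are hypothetical):

```python
import math

x = [1, 2, 3, 4, 5]   # hypothetical data
y = [2, 4, 5, 4, 5]
n = len(x)
S_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
S_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
S_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

r = S_xy / math.sqrt(S_xx * S_yy)        # ≈ 0.775 for this data
b = S_xy / S_xx
# Check the relationship r = b * sqrt(S_xx / S_yy)
assert abs(r - b * math.sqrt(S_xx / S_yy)) < 1e-12
```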
Coefficient of Determination
Coefficient of Determination (\(r^2\) or \(R^2\)):
\[r^2 = \frac{SSR}{SST} = \frac{SST - SSE}{SST} = 1 - \frac{SSE}{SST}\]
- Represents the proportion of variance in y explained by x
- Range: 0 ≤ \(r^2\) ≤ 1
- Expressed as percentage: multiply by 100
- For simple linear regression: \(r^2\) = (correlation coefficient)\(^2\)
Interpretation:
- \(r^2\) = 0.85 means 85% of the variation in y is explained by the linear relationship with x
- Remaining 15% is unexplained variance
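A numeric check of \(r^2 = 1 - SSE/SST\), using hypothetical data whose least-squares fit is a = 2.2, b = 0.6:

```python
x = [1, 2, 3, 4, 5]   # hypothetical data
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6       # least-squares fit for this data
n = len(x)
y_bar = sum(y) / n
y_hat = [a + b * xi for xi in x]

SSE = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
SST = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - SSE / SST   # ≈ 0.6: 60% of the variation in y is explained
```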
Multiple Linear Regression
Multiple Regression Model
General Form:
\[y = a + b_1x_1 + b_2x_2 + \dots + b_kx_k + \varepsilon\]
- y = dependent variable
- \(x_1, x_2, \dots, x_k\) = independent variables
- a = y-intercept
- \(b_1, b_2, \dots, b_k\) = partial regression coefficients
- k = number of independent variables
- \(\varepsilon\) = random error term
Adjusted Coefficient of Determination
Adjusted \(R^2\):
\[R_{adj}^2 = 1 - \frac{SSE/(n-k-1)}{SST/(n-1)}\]
Alternative formula:
\[R_{adj}^2 = 1 - (1-R^2)\frac{n-1}{n-k-1}\]
- n = number of observations
- k = number of independent variables
- Adjusts for the number of predictors in the model
- Penalizes addition of non-significant variables
- Used for comparing models with different numbers of predictors
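A quick numeric sketch of the second formula. The values of \(R^2\), n, and k below are made up purely for illustration:

```python
# Hypothetical values, purely for illustration
R2, n, k = 0.85, 30, 3

R2_adj = 1 - (1 - R2) * (n - 1) / (n - k - 1)
print(round(R2_adj, 4))  # 0.8327: slightly below R2, as expected
```

Note that adding a useless predictor raises k without lowering SSE much, so \(R^2_{adj}\) falls even though \(R^2\) never decreases.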
Standard Error for Multiple Regression
Standard Error of the Estimate:
\[s_e = \sqrt{\frac{SSE}{n-k-1}}\]
- Denominator uses (n-k-1) degrees of freedom
- k = number of independent variables
Prediction and Confidence Intervals
Prediction Using Regression
Point Estimate:
\[\hat{y} = a + bx\]
- Substitute the value of x into the regression equation
- Valid only within the range of the original data (interpolation)
- Extrapolation beyond data range is unreliable
Confidence Interval for Mean Response
Confidence Interval for \(E(y|x_0)\):
\[\hat{y} \pm t_{\alpha/2, n-2} \cdot s_e \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}\]
- \(x_0\) = specific value of x
- \(t_{\alpha/2, n-2}\) = t-value for desired confidence level with (n-2) degrees of freedom
- Estimates the mean value of y for a given x
Prediction Interval for Individual Response
Prediction Interval for individual y value:
\[\hat{y} \pm t_{\alpha/2, n-2} \cdot s_e \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}\]
- Wider than confidence interval for mean response
- Accounts for individual variation plus uncertainty in mean
- Note the "1+" under the square root compared to confidence interval
Hypothesis Testing in Regression
Testing Significance of Slope
Null Hypothesis:
\[H_0: \beta = 0\]
- \(\beta\) = population slope (estimated by b)
- Tests whether there is a significant linear relationship between x and y
Test Statistic:
\[t = \frac{b - 0}{s_b}\]
Standard Error of Slope (\(s_b\)):
\[s_b = \frac{s_e}{\sqrt{S_{xx}}}\]
- Compare calculated t-value to critical t-value with (n-2) degrees of freedom
- Reject \(H_0\) if |t| > \(t_{\alpha/2, n-2}\)
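The test statistic for the slope, sketched with hypothetical data:

```python
import math

x = [1, 2, 3, 4, 5]   # hypothetical data
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6       # least-squares fit for this data
n = len(x)
S_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
SSE = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(SSE / (n - 2))

s_b = s_e / math.sqrt(S_xx)
t = (b - 0) / s_b   # ≈ 2.12; |t| < t_{0.025,3} ≈ 3.18, so do not reject H0 at 5%
```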
Testing Significance of Correlation
Null Hypothesis:
\[H_0: \rho = 0\]
- \(\rho\) = population correlation coefficient
Test Statistic:
\[t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\]
- r = sample correlation coefficient
- n = sample size
- Degrees of freedom = n-2
- Equivalent to testing if slope b = 0 in simple linear regression
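The equivalence of the two tests can be verified numerically: the t statistic computed from r equals the t statistic computed from the slope (hypothetical data):

```python
import math

x = [1, 2, 3, 4, 5]   # hypothetical data
y = [2, 4, 5, 4, 5]
n = len(x)
S_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
S_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
S_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

r = S_xy / math.sqrt(S_xx * S_yy)
t_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)   # correlation test

b = S_xy / S_xx
s_e = math.sqrt((S_yy - b * S_xy) / (n - 2))
t_b = b / (s_e / math.sqrt(S_xx))                    # slope test
assert abs(t_r - t_b) < 1e-9                         # same statistic
```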
Nonlinear Regression
Transformations to Linear Form
Exponential Model:
\[y = ae^{bx}\]
Linearization: Take natural logarithm of both sides
\[\ln(y) = \ln(a) + bx\]
- Plot ln(y) vs. x
- Slope = b, intercept = ln(a)
Power Model:
\[y = ax^b\]
Linearization: Take logarithm of both sides
\[\ln(y) = \ln(a) + b\ln(x)\]
- Plot ln(y) vs. ln(x)
- Slope = b, intercept = ln(a)
Logarithmic Model:
\[y = a + b\ln(x)\]
- Already in linear form
- Plot y vs. ln(x)
Reciprocal Model:
\[y = \frac{1}{a + bx}\]
Linearization:
\[\frac{1}{y} = a + bx\]
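As a worked check of the first transformation, the sketch below generates noise-free data from a hypothetical exponential model \(y = 2e^{0.5x}\) and recovers a and b by fitting ln(y) against x:

```python
import math

# Hypothetical data generated from y = 2*e^(0.5x), noise-free for illustration
x = [1.0, 2.0, 3.0, 4.0]
y = [2 * math.exp(0.5 * xi) for xi in x]

# Linearize: ln(y) = ln(a) + b*x, then apply ordinary least squares
ln_y = [math.log(yi) for yi in y]
n = len(x)
S_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
S_xy = sum(xi * li for xi, li in zip(x, ln_y)) - sum(x) * sum(ln_y) / n

b = S_xy / S_xx                                 # slope of the linearized fit
a = math.exp(sum(ln_y) / n - b * sum(x) / n)    # intercept is ln(a), so exponentiate
print(a, b)  # recovers a ≈ 2.0, b ≈ 0.5
```

The same pattern applies to the power model (plot ln(y) vs. ln(x)) and the reciprocal model (plot 1/y vs. x).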
Important Notes and Conditions
Assumptions of Linear Regression
- Linearity: Relationship between x and y is linear
- Independence: Observations are independent
- Homoscedasticity: Constant variance of residuals
- Normality: Residuals are normally distributed (especially important for small samples)
- No multicollinearity: Independent variables are not highly correlated (for multiple regression)
Correlation vs. Causation
- Correlation does not imply causation
- A significant correlation indicates association, not necessarily cause-and-effect
- Confounding variables may influence both x and y
Outliers and Influential Points
- Outlier: Point with large residual (unusual y-value)
- Influential point: Point whose removal significantly changes regression equation
- Leverage: Points with extreme x-values have high leverage
- Always examine scatter plots and residual plots
Interpolation vs. Extrapolation
- Interpolation: Predicting within the range of observed data (generally reliable)
- Extrapolation: Predicting outside the range of observed data (unreliable and risky)
- Relationship may not hold beyond observed data range