In mathematics, regression is one of the important topics in statistics. The process of determining the relationship between two variables is called regression. It is also a statistical analysis method that can be used to assess the association between two different variables. Here, we will study regression and work through some example problems.
Regression Definition
Regression is a statistical analysis technique for assessing the association between two variables. It is used to discover a mathematical relationship between them.
Regression Formula
The formula for regression is as follows:
Regression Equation (y) = a + bx
Slope (b) = (N∑XY − (∑X)(∑Y)) / (N∑X² − (∑X)²)
Intercept (a) = (∑Y − b(∑X)) / N
Where,
x and y are the variables.
b = the slope of the regression line, also called the regression coefficient
a = the intercept, the point where the regression line crosses the y-axis.
N = Number of values or elements
X = First Score
Y = Second Score
∑XY = Sum of the product of the first and Second Scores
∑X = Sum of First Scores
∑Y = Sum of Second Scores
∑X² = Sum of squared First Scores.
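The sums defined above are enough to compute the slope and intercept in code. The following is a minimal Python sketch, assuming the standard least squares formulas b = (N∑XY − ∑X∑Y) / (N∑X² − (∑X)²) and a = (∑Y − b∑X) / N. The function name `fit_line` and the sample data are illustrative assumptions, not part of the original text.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    sum_x = sum(xs)                               # sum of first scores
    sum_y = sum(ys)                               # sum of second scores
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of products
    sum_x2 = sum(x * x for x in xs)               # sum of squared first scores
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denom      # slope
    a = (sum_y - b * sum_x) / n                   # intercept
    return a, b

# Hypothetical data lying exactly on the line y = 1 + 2x
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
```

Because the sample points lie exactly on a line, the sketch should recover an intercept of 1 and a slope of 2 up to floating point rounding.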
Regression Equation
Regression is a statistical measure that attempts to determine the strength of the relationship between one dependent variable and one independent variable. It is a statistical technique used to explain the behavior of a dependent variable. Linear regression is used to find the straight line that best fits the data, called the least squares regression line.
Regression Line Formula
Suppose y is a dependent variable and x is an independent variable. Then the equation for the regression line is:
⇒ y′ = a0 + a1 x
where a0 is a constant, a1 is the regression coefficient, x is the value of the independent variable, and y' is the predicted value of the dependent variable.
Regression Coefficient
Simple hypothesis testing involving the statistical significance of a single regression coefficient is conducted in the same manner in the multiple regression model as in the simple regression model.
The equation for the regression line is:
⇒ y′ = a0 + a1 x
where a0 is the constant intercept,
a1 is the regression coefficient, and also the slope of the regression line,
x is the value of the x variable,
and y′ is the predicted value of the y variable.
Regression Coefficient Formula
The formula for calculating the regression coefficient is
a1 = r (sy / sx)
where, a1 is the regression coefficient
r = correlation between the x and y variables.
sx = standard deviation of the x variable.
sy = standard deviation of the y variable.
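The relationship between the regression coefficient, the correlation and the two standard deviations can be sketched in Python as follows. This is a minimal illustration that assumes sample standard deviations (n − 1 in the denominator) and the Pearson correlation. The function name `slope_from_correlation` and the data are hypothetical.

```python
import statistics

def slope_from_correlation(xs, ys):
    """Compute a1 = r * sy / sx for paired observations."""
    n = len(xs)
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sx = statistics.stdev(xs)   # sample standard deviation of x
    sy = statistics.stdev(ys)   # sample standard deviation of y
    # Sample covariance, then Pearson correlation r = cov / (sx * sy)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    r = cov / (sx * sy)
    return r * sy / sx

# Hypothetical data; the least squares slope for these points is 0.6
slope = slope_from_correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(slope, 6))
```

The result agrees with the slope obtained from the summation formula, since both are the same least squares estimate written in different forms.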
Interpreting Regression Coefficients
The interpretation of a regression coefficient in a multiple regression equation is a little more difficult than in simple regression. A simple regression equation represents a line, while a multiple regression equation represents a plane or hyperplane. In multiple regression, the coefficient βk, k = 1, 2, ..., i, has several interpretations. It may be interpreted as the change in y corresponding to a unit change in xk when all other predictor variables are held constant.
Simple Regression
Simple regression analysis involves a single independent (predictor) variable and a single dependent variable. Regression analysis treats one variable as the predictor and the other as the response, whereas correlation does not distinguish between independent and dependent variables. In simple regression analysis, there is no partialling out of other variables because no other variables are included in the regression.
The equation of the probabilistic simple regression model is y = β0 + β1x1 + ε
where y is the value of the dependent variable,
x1 is the value of the independent variable,
β0 is the population y intercept,
β1 is the population slope, and
ε is the error of prediction.
Multiple Regression
Regression analysis with two or more independent variables, or with at least one nonlinear predictor, is called multiple regression analysis. Multiple regression analysis is similar to simple regression analysis. However, it is more complex conceptually and computationally.
The equation of the probabilistic multiple regression model is
y = β0 + β1x1 + β2x2 + ... + βixi + ε
where y is the value of the dependent variable,
β0 is the regression constant,
β1, β2, ..., βi are the partial regression coefficients for independent variables 1, 2, ..., i respectively, and
'i' is the number of independent variables.
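One common way to estimate the coefficients of the multiple regression equation is to solve the normal equations (X'X)β = X'y. The sketch below does this in plain Python with Gaussian elimination. It is an illustration under stated assumptions rather than a production routine: the function names `solve_linear_system` and `fit_multiple` and the data are hypothetical, and the data are generated without error so that the true coefficients are known.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b] stored as floats
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_multiple(rows, ys):
    """Estimate [b0, b1, ..., bi] from the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(row) for row in rows]   # leading 1 for the constant b0
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * y for r, y in zip(X, ys)) for i in range(k)]
    return solve_linear_system(XtX, Xty)

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [1, 3, 4, 6, 8]
coeffs = fit_multiple(rows, ys)
print([round(c, 6) for c in coeffs])
```

Each fitted coefficient can be read as the change in y for a unit change in its own predictor while the other predictor is held constant. The singular-system check fails when predictors are perfectly collinear.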
Multiple Regression Assumptions
Some assumptions of multiple regression:
The relationship between each of the predictor variables and the dependent variable is linear, and the error is normally distributed and uncorrelated with the predictors.
The mean value of the error term is zero.
The predictors are not strongly multicollinear. Multicollinearity happens when two or more predictors contain much of the same information.
Regression Test
In software development, the purpose of regression testing is to confirm that a recent program change has not adversely affected existing features. This testing is done to make sure that new code changes do not have side effects on existing functionality, that is, that the program has not regressed. A tool such as TestComplete provides the features needed to make regression testing fully automated and thus helps to produce high-quality products.
Non Linear Regression
Nonlinear regression is a form of regression analysis in which observational data are modeled by a function which is a nonlinear combination of the model parameters and depends on one or more independent variables. Nonlinear regression is characterized by the fact that the prediction equation depends nonlinearly on one or more unknown parameters.
A nonlinear regression model is of the form:
Yi = f(xi, θ) + εi
for i = 1, 2, 3, ..., n
where the Yi are the responses,
f is a known function of the covariate vector xi = (xi1, xi2, ..., xik) and the parameter vector θ = (θ1, θ2, ..., θp), and the εi are random errors, usually assumed to be uncorrelated with mean zero and constant variance.
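Nonlinear models usually have no closed-form solution, so the parameters are estimated iteratively. The sketch below applies the Gauss-Newton method to a one-parameter model f(x, θ) = exp(θx). Both the model and the choice of Gauss-Newton are assumptions made for illustration. The text above allows a full parameter vector θ, and other optimisers are equally valid. The function name and the data are hypothetical.

```python
import math

def gauss_newton_exp(xs, ys, theta=0.0, iterations=50, tol=1e-12):
    """Fit y = exp(theta * x) by Gauss-Newton least squares."""
    for _ in range(iterations):
        # Residuals y - f(x, theta) at the current estimate
        residuals = [y - math.exp(theta * x) for x, y in zip(xs, ys)]
        # Derivative of f with respect to theta at each x
        jac = [x * math.exp(theta * x) for x in xs]
        denom = sum(j * j for j in jac)
        if denom == 0:
            break
        step = sum(j * r for j, r in zip(jac, residuals)) / denom
        theta += step
        if abs(step) < tol:
            break
    return theta

# Hypothetical error-free data generated with theta = 0.5
xs = [0, 1, 2, 3]
ys = [math.exp(0.5 * x) for x in xs]
theta_hat = gauss_newton_exp(xs, ys, theta=0.3)
print(round(theta_hat, 6))
```

The starting value matters for iterative methods like this one. A poor initial guess can slow convergence or prevent it.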
R Logistic Regression
Linear regression is used to measure the degree of relationship between a dependent variable and an independent variable. If the response variable is binary, we need to consider generalized linear models instead. One of the most commonly used generalized linear models is logistic regression. Logistic regression may be applied to a data set where there is a nonlinear relationship between the response variable and one or more predictor variables. It is commonly used for predicting the probability of occurrence of an event, based on several predictor variables that may be either numerical or categorical.
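To make the idea of predicting an event probability concrete, here is a minimal Python sketch that fits a logistic model with one predictor, P(y = 1 | x) = 1 / (1 + exp(−(b0 + b1x))), by gradient ascent on the log-likelihood. The fitting method, learning rate, iteration count, function names and data are all assumptions chosen for illustration. Statistical packages typically use more efficient algorithms.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, iterations=20000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1*x) by gradient ascent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Gradient of the mean log-likelihood with respect to b0 and b1
        grad0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)) / n
        grad1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in zip(xs, ys)) / n
        b0 += learning_rate * grad0
        b1 += learning_rate * grad1
    return b0, b1

# Hypothetical binary outcomes that overlap, so the estimates stay finite
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(xs, ys)
p_low = sigmoid(b0 + b1 * 1)    # estimated probability of the event at x = 1
p_high = sigmoid(b0 + b1 * 6)   # estimated probability of the event at x = 6
print(round(p_low, 3), round(p_high, 3))
```

The fitted probability rises with x because events become more frequent at larger x in the sample data.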
Logistic Regression Assumptions
In logistic regression, no assumptions are made about the distributions of the explanatory variables. However, the explanatory variables should not be highly correlated with one another because this causes problems with estimation.
Stepwise Logistic Regression
There are several procedures for variable selection implemented in statistics packages, such as backward elimination, forward selection and stepwise selection. Stepwise selection of variables is widely used in linear regression, and most software packages offer an option for stepwise logistic regression. Employing a stepwise selection procedure can provide a fast and effective means to screen a large number of variables and to fit a number of logistic regression equations. The result of stepwise logistic regression depends substantially on the significance levels chosen for variables entering and staying in the model and on the selection criteria used.
Anova Regression
ANOVA (Analysis of Variance) consists of calculations that provide information about levels of variability within a regression model and form a basis for tests of significance. ANOVA is a special case of regression analysis, one where the independent variables are categorical rather than continuous. The ANOVA calculations for multiple regression are nearly identical to the calculations for simple linear regression, except that the degrees of freedom are adjusted to reflect the number of explanatory variables included in the model.
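For simple linear regression, the ANOVA calculations split the total sum of squares (SST) into a regression part (SSR) and an error part (SSE), with 1 and n − 2 degrees of freedom respectively. The sketch below computes these quantities and the resulting F statistic in Python. The function name, dictionary keys and data are hypothetical.

```python
def anova_simple_regression(xs, ys):
    """Return the ANOVA sums of squares and F statistic for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                  # least squares slope
    a = mean_y - b * mean_x        # least squares intercept
    fitted = [a + b * x for x in xs]
    sst = sum((y - mean_y) ** 2 for y in ys)               # total variability
    ssr = sum((f - mean_y) ** 2 for f in fitted)           # explained by the line
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))    # left in the residuals
    df_model, df_error = 1, n - 2  # one explanatory variable
    f_stat = (ssr / df_model) / (sse / df_error)
    return {"sst": sst, "ssr": ssr, "sse": sse, "f": f_stat}

# Hypothetical data with some scatter around the fitted line
table = anova_simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print({k: round(v, 4) for k, v in table.items()})
```

With p explanatory variables, the same decomposition applies with p and n − p − 1 degrees of freedom, which is the adjustment described above.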
Poisson Regression Model
Poisson regression analysis is a regression technique for modeling dependent variables that describe count data, so Poisson regression models are appropriate when the response variable is a count. In the Poisson regression model, the random component has a Poisson distribution and the mean is linked to the linear predictor by a logarithmic function:
ln(μi) = β0 + β1x1i + β2x2i + ... + βpxpi
The tests and inferences on the Poisson model are carried out in the same way as for the logistic model.
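The log link can be illustrated with a small maximum likelihood fit. The Python sketch below estimates β0 and β1 in ln(μ) = β0 + β1x by Newton-Raphson iteration. The choice of algorithm, the function name and the data are assumptions for illustration. With a single 0/1 predictor, the fitted means should reproduce the average count in each group, which gives a simple check.

```python
import math

def fit_poisson(xs, ys, iterations=25):
    """Fit ln(mu) = b0 + b1*x by Newton-Raphson on the Poisson log-likelihood."""
    b0 = math.log(sum(ys) / len(ys))   # start from the overall mean count
    b1 = 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector: sum of (y - mu) * (1, x)
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum((y - m) * x for x, y, m in zip(xs, ys, mus))
        # Information matrix: sum of mu * (1, x)(1, x)'
        h00 = sum(mus)
        h01 = sum(m * x for x, m in zip(xs, mus))
        h11 = sum(m * x * x for x, m in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        if det == 0:
            raise ValueError("information matrix is singular")
        # Newton step: solve the 2x2 system explicitly
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical counts: group x = 0 averages 2, group x = 1 averages 6
xs = [0, 0, 1, 1]
ys = [1, 3, 4, 8]
b0, b1 = fit_poisson(xs, ys)
mean_group0 = math.exp(b0)
mean_group1 = math.exp(b0 + b1)
print(round(mean_group0, 4), round(mean_group1, 4))
```

Exponentiating β1 gives the multiplicative change in the expected count for a unit increase in x, which is the usual way to read coefficients under a log link.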
Regression Assumptions
Some points about the assumption of a linear relationship between the independent and dependent variables:
If the relationship between independent variables and the dependent variable is not linear, the results of the regression analysis will under-estimate the true relationship.
Standard multiple regression can only accurately estimate the relationship between dependent and independent variables if the relationships are linear in nature.
Examples of Regression
Listed below are some examples of regression.
Solved Example
Question:
Determine the regression equation by computing the regression slope coefficient and the intercept for the data in the table given below.
X Values | Y Values |
55 | 52 |
60 | 54 |
65 | 56 |
70 | 58 |
80 | 62 |
For the given data set, solve for the regression slope and intercept values.
Solution:
Let us count the number of values.
N = 5
Determine the values of XY and X² for each pair.
X Value | Y Value | X*Y | X*X |
55 | 52 | 2860 | 3025 |
60 | 54 | 3240 | 3600 |
65 | 56 | 3640 | 4225 |
70 | 58 | 4060 | 4900 |
80 | 62 | 4960 | 6400 |
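Determine the following values: ∑X, ∑Y, ∑XY, ∑X².
∑X = 330
∑Y = 282
∑XY = 18760
∑X² = 22150
Substitute the values in the slope formula:
Slope (b) = (N∑XY − (∑X)(∑Y)) / (N∑X² − (∑X)²)
= (5 × 18760 − 330 × 282) / (5 × 22150 − 330²)
= (93800 − 93060) / (110750 − 108900)
= 740 / 1850
= 0.4
Substitute the values in the intercept formula:
Intercept (a) = (∑Y − b(∑X)) / N
= (282 − 0.4 × 330) / 5
= 150 / 5
= 30
Substitute the regression coefficient and intercept values in the regression equation: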
Regression Equation(y) = a + bx
= 30 + 0.4x
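As a quick check of this result, the short Python sketch below plugs each X value from the table into y = 30 + 0.4x and compares the prediction with the observed Y value. The helper name `predict` and the extra input x = 75 are illustrative assumptions.

```python
xs = [55, 60, 65, 70, 80]
ys = [52, 54, 56, 58, 62]

def predict(x, a=30.0, b=0.4):
    """Predicted y from the fitted regression equation y = a + b*x."""
    return a + b * x

# Residual = observed value minus predicted value
residuals = [y - predict(x) for x, y in zip(xs, ys)]
max_abs_residual = max(abs(r) for r in residuals)
print(max_abs_residual)

# The same equation can be used for an x value not in the table
y_at_75 = predict(75)
print(y_at_75)
```

All residuals are zero up to floating point rounding, so every point in this data set lies exactly on the fitted line. Real data rarely fit this perfectly.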