Open App

CA Foundation Exam > CA Foundation Notes > Quantitative Aptitude for CA Foundation > Chapter Notes: Correlation And Regression

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation PDF Download

Table of contents
Chapter Overview
Introduction
Bivariate Data
Correlation Analysis
Regression Analysis
Properties of Regression Lines

Chapter Overview

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Introduction

In the previous chapter, we looked at different statistical measures related to univariate distribution, which deals with a single variable like height, weight, marks, profit, or wages.
Sometimes, it is essential to analyze more than one variable at the same time. For example:

A businessman might want to know how much investment is needed to achieve a specific level of profit.
A student may question whether improving their performance in a selection test would enhance their chances of doing well in a final exam.
To answer these types of questions, we need to study multiple variables simultaneously.

Correlation Analysis and Regression Analysis are two methods for examining data from a multivariate distribution, which involves more than one variable.

When focusing on just two variables, like x and y, we analyze what's known as bivariate distribution.

Correlation analysis helps us find out if there is a relationship between the two variables x and y, or if they are independent from each other.
For example, if x represents profit and y represents investment for a business, or if x and y are the marks in Statistics and Mathematics for a group of students, we may be interested in knowing if these variables are linked.
The strength of the correlation between x and y can be measured using various methods such as:

Product Moment Correlation Coefficient
Rank Correlation Coefficient
Coefficient of Concurrent Deviations

It is crucial to be careful about the cause-and-effect relationship in correlation analysis.
Sometimes, x and y might appear to be related due to the influence of a third variable, even if there is no direct causal link between them.
On the other hand, regression analysis is focused on predicting the value of the dependent variable based on a known value of the independent variable.
his assumes there is a mathematical relationship between the two variables, typically looking at their average relationship.

Question for Chapter Notes: Correlation And Regression

Try yourself:

Which statistical measure is used to find the strength of the correlation between two variables?

A.
Product Moment Correlation Coefficient
B.
Mean
C.
Standard Deviation
D.
Median

View Solution

Bivariate Data

When data is collected on two variables at the same time, it is called bivariate data. The frequency distribution that comes from this data is known as Bivariate Frequency Distribution.
For example, let’s say x and y represent the scores in Math and Statistics for a group of 30 students. The bivariate data would be represented as pairs (x_i, y_i) for i = 1, 2, …, 30.
Here, (x₁, y₁) refers to the scores in Mathematics and Statistics for the student with Roll Number 1, (x₂, y₂) for the student with Roll Number 2, and so forth. Finally, (x₃₀, y₃₀) indicates the score pair for the student with Roll Number 30.
Similar to Univariate Distribution, we also need to create a frequency distribution for bivariate data. This distribution considers the classification of both variables at the same time.
Typically, we arrange the data with a horizontal classification for x and a vertical classification for y.
This type of distribution is known as Bivariate Frequency Distribution, Joint Frequency Distribution, or Two-way classification of the two variables x and y.

ILLUSTRATIONS:

Example 17.1: Prepare a Bivariate Frequency table for the following data relating to the marks in Statistics (x) and Mathematics (y):

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Take mutually exclusive classification for both the variables, the first class interval being 0-4 for both.

Solution:
From the given data, we find that
Range for x = 19–1 = 18
Range for y = 19–1 = 18
We take the class intervals 0-4, 4-8, 8-12, 12-16, 16-20 for both the variables. Since the first pair of marks is (15, 13) and 15 belongs to the fourth class interval (12-16) for x and 13 belongs to the fourth class interval for y, we put a stroke in the (4, 4)-th cell. We carry on giving tally marks till the list is exhausted.

Table 17.1

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

We note, from the above table, that some of the cell frequencies (fij) are zero. Starting from the above Bivariate Frequency Distribution, we can obtain two types of univariate distributions which are known as:
(a) Marginal distribution.
(b) Conditional distribution.

If we consider the distribution of Statistics marks along with the marginal totals presented in the last column of Table 17.1, we get the marginal distribution of marks in Statistics. Similarly, we can obtain one more marginal distribution of Mathematics marks. The following table shows the marginal distribution of marks of Statistics.

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

We can find the mean and standard deviation of marks in Statistics from Table 17.2. They would be known as marginal mean and marginal SD of Statistics marks. Similarly, we can obtain the marginal mean and marginal SD of Mathematics marks. Any other statistical measure in respect of x or y can be computed in a similar manner.
If we want to study the distribution of Statistics Marks for a particular group of students, say for those students who got marks between 8 to 12 in Mathematics, we come across another univariate distribution known as conditional distribution.

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

We may obtain the mean and SD from the above table. They would be known as conditional mean and conditional SD of marks of Statistics. The same result holds for marks in Mathematics. In particular, if there are m classifications for x and n classifications for y, then there would be altogether (m + n) conditional distribution.

Correlation Analysis

While studying two variables at the same time, if it is found that the change in one variable is reciprocated by a corresponding change in the other variable either directly or inversely, then the two variables are known to be associated or correlated. Otherwise, the two variables are known to be dissociated or uncorrelated or independent. There are two types of correlation.
(i) Positive correlation
(ii) Negative correlation

If two variables move in the same direction i.e. an increase (or decrease) on the part of one variable introduces an increase (or decrease) on the part of the other variable, then the two variables are known to be positively correlated. As for example, height and weight yield and rainfall, profit and investment etc. are positively correlated.
On the other hand, if the two variables move in the opposite directions i.e. an increase (or a decrease) on the part of one variable results a decrease (or an increase) on the part of the other variable, then the two variables are known to have a negative correlation. The price and demand of an item, the profits of Insurance Company and the number of claims it has to meet etc. are examples of variables having a negative correlation.
The two variables are known to be uncorrelated if the movement on the part of one variable does not produce any movement of the other variable in a particular direction. As for example, Shoe-size and intelligence are uncorrelated.
We consider the following measures of correlation:
(a) Scatter diagram
(b) Karl Pearson’s Product moment correlation coefficient
(c) Spearman’s rank correlation co-efficient
(d) Co-efficient of concurrent deviations

(a) SCATTER DIAGRAM

This is a simple diagrammatic method to establish correlation between a pair of variables. Unlike product moment correlation co-efficient, which can measure correlation only when the variables are having a linear relationship, scatter diagram can be applied for any type of correlation – linear as well as non-linear i.e. curvilinear. Scatter diagram can distinguish between different types of correlation although it fails to measure the extent of relationship between the variables.

Each data point, which in this case a pair of values (xi, yi) is represented by a point in the rectangular axes of cordinates. The totality of all the plotted points forms the scatter diagram. The pattern of the plotted points reveals the nature of correlation. In case of a positive correlation, the plotted points lie from lower left corner to upper right corner, in case of a negative correlation the plotted points concentrate from upper left to lower right and in case of zero correlation, the plotted points would be equally distributed without depicting any particular pattern. The following figures show different types of correlation and the one to one correspondence between scatter diagram and product moment correlation coefficient.

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

(b) Karl Pearson’s Product Moment Correlation Coefficient

This is by for the best method for finding correlation between two variables provided the relationship between the two variables is linear. Pearson’s correlation coefficient may be defined as the ratio of covariance between the two variables to the product of the standard deviations of the two variables. If the two variables are denoted by x and y and if the corresponding bivariate data are (x_i, y_i) for i = 1, 2, 3, ….., n, then the coefficient of correlation between x and y, due to Karl Pearson, in given by :

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

where

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

A single formula for computing correlation coefficient is given by

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

In case of a bivariate frequency distribution, we have

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

where x_i = Mid-value of the ith class interval of x.
y_j = Mid-value of the jth class interval of y
f_io = Marginal frequency of x
f_oj = Marginal frequency of y
f_ij = frequency of the (i, j)th cell
Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Properties of Correlation Coefficient

(i) The Coefficient of Correlation is a unit-free measure.
This means that if x denotes height of a group of students expressed in cm and y denotes their weight expressed in kg, then the correlation coefficient between height and weight would be free from any unit.
(ii) The coefficient of correlation remains invariant under a change of origin and/or scale of the variables under consideration depending on the sign of scale factors.
This property states that if the original pair of variables x and y is changed to a new pair of variables u and v by effecting a change of origin and scale for both x and y i.e. Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation where a and c are the origins of x and y and b and d are the respective scales and then we have

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation r_xy and r_uv being the coefficient of correlation between x and y and u and v respectively, (17.10) established, numerically, the two correlation coefficients remain equal and they would have opposite signs only when b and d, the two scales, differ in sign.

(iii) The coefficient of correlation always lies between –1 and 1, including both the limiting values i.e.

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Example 17.2:
Calculate the correlation coefficient between x and y using the following data:

n = 10
Σxy = 220
Σx² = 200
Σy² = 262

The values of Σx and Σy are:

Σx = 40
Σy = 50

Solution:
From the given data, we have by applying (17.5),
Thus there is a good amount of positive correlation between the two variables x and y.
Alternately
Thus applying formula (17.1), we get
As before, we draw the same conclusion.

Example 17.3: Find product moment correlation coefficient from the following information:

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Solution:
In order to find the covariance and the two standard deviation, we prepare the following table:
We have
Thus the correlation coefficient between x and y in given by
We find a high degree of negative correlation between x and y. Also, we could have applied formula (17.5) as we have done for the first problem of computing correlation coefficient.
Sometimes, a change of origin reduces the computational labor to a great extent. This we are going to do in the next problem.

Example 17.4: The following data relate to the test scores obtained by eight salesmen in an aptitude test and their daily sales in thousands of rupees: Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Solution:
Let the scores and sales be denoted by x and y respectively. We take a, origin of x as the average of the two extreme values i.e. 54 and 70. Hence a = 62 similarly, the origin of y is taken as b = 24 + 35 / 2 ≅ 30
Since correlation coefficient remains unchanged due to change of origin, we have
In some cases, there may be some confusion about selecting the pair of variables for which correlation is wanted. This is explained in the following problem.

Example 17.5: Examine whether there is any correlation between age and blindness on the basis of the following data:

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Solution:
Let us denote the mid-value of age in years as x and the number of blind persons per lakh as y. Then as before, we compute correlation coefficient between x and y.
The correlation coefficient between age and blindness is given by
which exhibits a very high degree of positive correlation between age and blindness.

Example 17.6: Coefficient of correlation between x and y for 20 items is 0.4. The AM’s and SD’s of x and y are known to be 12 and 15 and 3 and 4 respectively. Later on, it was found that the pair (20, 15) was wrongly taken as (15, 20). Find the correct value of the correlation coefficient.

Solution:
Hence, corrected ∑xy= 3696 – 20 × 15 + 15 × 20 = 3696
Also, Sx²= 9
= (∑x²/ 20) – 122 = 9
∑x² = 3060
Similarly, Sy² = 16
Thus corrected – wrong value + correct value.
= 20 × 12 – 15 + 20
= 245
Similarly corrected ∑y = 20 × 15 – 20 + 15 = 295
Corrected ∑x² = 3060 – 15² + 20² = 3235
Corrected ∑y² = 4820 – 20² + 15² = 4645
Thus corrected value of the correlation coefficient by applying formula (17.5)

Example 17.7: Compute the coefficient of correlation between marks in Statistics and Mathematics for the bivariate frequency distribution shown in Table 17.6

Solution:
For the sake of computational advantage, we effect a change of origin and scale for both the variable x and y.
Where x_i and y_j denote respectively the mid-values of the x-class interval and y-class interval respectively. The following table shows the necessary calculation on the right top corner of each cell, the product of the cell frequency, corresponding u value and the respective v value has been shown. They add up in a particular row or column to provide the value of f_iju_iv_j for that particular row or column.
A single formula for computing correlation coefficient from bivariate frequency distribution is given by
The value of r shown a good amount of positive correlation between the marks in Statistics and Mathematics on the basis of the given data.

Example 17.8: Given that the correlation coefficient between x and y is 0.8, write down the correlation coefficient between u and v where
(i) 2u + 3x + 4 = 0 and 4v + 16y + 11 = 0
(ii) 2u – 3x + 4 = 0 and 4v + 16y + 11 = 0
(iii) 2u – 3x + 4 = 0 and 4v – 16y + 11 = 0
(iv) 2u + 3x + 4 = 0 and 4v – 16y + 11 = 0

Solution:
Using (17.10), we find that
i.e. r_xy = r_uv if b and d are of same sign and r_uv = –r_xy when b and d are of opposite signs, b and d being the scales of x and y respectively. In (i), u = (–2) + (-3/2) x and v = (–11/4) + (–4)y.
Since b = –3/2 and d = –4 are of same sign, the correlation coefficient between u and v would be the same as that between x and y i.e. r_xy = 0.8 =r_uv
In (ii), u = (–2) + (3/2)x and v = (–11/4) + (–4)y
Hence b = 3/2 and d = –4 are of opposite signs and we have r_uv = –r_xy = –0.8
Proceeding in a similar manner, we have r_uv = 0.8 and – 0.8 in (iii) and (iv).

(c) Spearman’s Rank Correlation Coefficient

When we need finding correlation between two qualitative characteristics, say, beauty and intelligence, we take recourse to using rank correlation coefficient. Rank correlation can also be applied to find the level of agreement (or disagreement) between two judges so far as assessing a qualitative characteristic is concerned. As compared to product moment correlation coefficient, rank correlation coefficient is easier to compute, it can also be advocated to get a first hand impression about the correlation between a pair of variables.

Spearman’s rank correlation coefficient is given by

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

where r_R denotes rank correlation coefficient and it lies between –1 and 1 inclusive of these two values.
d_i= x_i – y_i represents the difference in ranks for the i-th individual and n denotes the number of individuals.
In case u individuals receive the same rank, we describe it as a tied rank of length u. In case of a tied rank, formula (17.11) is changed to

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

In this formula, t_j represents the jth tie length and the summation Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation extends over the lengths of all the ties for both the series.

Example 17.9: compute the coefficient of rank correlation between sales and advertisement expressed in thousands of rupees from the following data:

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Solution:
Let the rank given to sales be denoted by x and rank of advertisement be denoted by y. We note that since the highest sales as given in the data, is 95, it is to be given rank 1, the second highest sales 90 is to be given rank 2 and finally rank 8 goes to the lowest sales, namely 68. We have given rank to the other variable advertisement in a similar manner. Since there are no ties, we apply formula (17.11).
Since n = 8 and applying formula (17.11), we get.
The high positive value of the rank correlation coefficient indicates that there is a very good amount of agreement between sales and advertisement.

Example 17.10: Compute rank correlation from the following data relating to ranks given by two judges in a contest:

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Solution:
We directly apply formula (17.11) as ranks are already given.
The rank correlation coefficient is given by
The very low value (almost 0) indicates that there is hardly any agreement between the ranks given by the two Judges in the contest.

Example 17.11: Compute the coefficient of rank correlation between Eco. marks and stats. Marks as given below:

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Solution:
This is a case of tied ranks as more than one student share the same mark both for Economics and Statistics. For Eco. the student receiving 80 marks gets rank 1 one getting 62 marks receives rank 2, the student with 60 receives rank 3, student with 56 marks gets rank 4 and since there are two students, each getting 50 marks, each would be receiving a common rank, the average of the next two ranks 5 and 6 i.e. 5 + 6 / 2 i.e. 5.50 and lastly the last rank..
7 goes to the student getting the lowest Eco marks. In a similar manner, we award ranks to the students with stats marks.
For Economics mark there is one tie of length 2 and for stats mark, there are two ties of lengths 2 and 3 respectively.

Example 17.12: For a group of 8 students, the sum of squares of differences in ranks for Mathematics and Statistics marks was found to be 50 what is the value of rank correlation coefficient?

Solution:
As given n = 8 and
Hence the rank correlation coefficient between marks in Mathematics and Statistics is given by

Example 17.13: For a number of towns, the coefficient of rank correlation between the people living below the poverty line and increase of population is 0.50. If the sum of squares of the differences in ranks awarded to these factors is 82.50, find the number of towns.

Solution:
∴ n = 10 as n must be a positive integer.

Example 17.14: While computing rank correlation coefficient between profits and investment for 10 years of a firm, the difference in rank for a year was taken as 7 instead of 5 by mistake and the value of rank correlation coefficient was computed as 0.80. What would be the correct value of rank correlation coefficient after rectifying the mistake?

Solution:
We are given that n = 10,
Hence rectified value of rank correlation coefficient
= 0.95

(d) Coefficient of Concurrent Deviations

A very simple and casual method of finding correlation when we are not serious about the magnitude of the two variables is the application of concurrent deviations. This method involves in attaching a positive sign for a x-value (except the first) if this value is more than the previous value and assigning a negative value if this value is less than the previous value. This is done for the y-series as well. The deviation in the x-value and the corresponding y-value is known to be concurrent if both the deviations have the same sign.

Denoting the number of concurrent deviation by c and total number of deviations as m (which must be one less than the number of pairs of x and y values), the coefficient of concurrent deviation is given by

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

If (2c–m) >0, then we take the positive sign both inside and outside the radical sign and if (2c–m) <0, we are to consider the negative sign both inside and outside the radical sign.

Like Pearson’s correlation coefficient and Spearman’s rank correlation coefficient, the coefficient of concurrent deviations also lies between –1 and 1, both inclusive.

Example 17.15: Find the coefficient of concurrent deviations from the following data.

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Solution:
In this case, m = number of pairs of deviations = 7
c = No. of positive signs in the product of deviation column = Number of concurrent deviations =2
(Since we take negative sign both inside and outside of the radical sign)
Thus there is a negative correlation between price and demand.

Regression Analysis

In regression analysis, we are concerned with the estimation of one variable for a given value of another variable (or for a given set of values of a number of variables) on the basis of an average mathematical relationship between the two variables (or a number of variables). Regression analysis plays a very important role in the field of every human activity. A businessman may be keen to know what would be his estimated profit for a given level of investment on the basis of the past records. Similarly, an outgoing student may like to know her chance of getting a first class in the final University Examination on the basis of her performance in the college selection test.
When there are two variables x and y and if y is influenced by x i.e. if y depends on x, then we get a simple linear regression or simple regression. y is known as dependent variable or regression or explained variable and x is known as independent variable or predictor or explanator. In the previous examples since profit depends on investment or performance in the University Examination is dependent on the performance in the college selection test, profit or performance in the University Examination is the dependent variable and investment or performance in the selection test is the Independent variable.
In case of a simple regression model if y depends on x, then the regression line of y on x in given by
y = a + bx …………………… (17.14)
Here a and b are two constants and they are also known as regression parameters. Furthermore, b is also known as the regression coefficient of y on x and is also denoted by b yx. We may define the regression line of y on x as the line of best fit obtained by the method of least squares and used for estimating the value of the dependent variable y for a known value of the independent variable x.
The method of least squares involves in minimizing

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation where yi demotes the actual or observed value and , the estimated value of yi for a given value of x_i, e_i is the difference between the observed value and the estimated value and ei is technically known as error or residue. This summation intends over n pairs of observations of (x_i, y_i). The line of regression of y or x and the errors of estimation are shown in the following figure.

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Minimisation of (17.15) yields the following equations known as ‘Normal Equations’

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Solving there two equations for b and a, we have the “least squares” estimates of b and a as

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

After estimating b, estimate of a is given by
Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Substituting the estimates of b and a in (17.14), we get

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

There may be cases when the variable x depends on y and we may take the regression line of x on y as

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Unlike the minimization of vertical distances in the scatter diagram as shown in figure (17.7) for obtaining the estimates of a and b, in this case we minimize the horizontal distances and get the following normal equation in a^ and b^ , the two regression parameters :

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

A single formula for estimating b is given by

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

The standardized form of the regression equation of x on y, as in (17.20), is given by

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Example 17.15: Find the two regression equations from the following data:

Hence estimate y when x is 13 and estimate also x when y is 15.

Solution:
On the basis of the above table, we have
The regression line of y on x is given by
y = a + bx
Thus the estimated regression equation of y on x is
y = 4.7178 + 0.8145x
When x = 13, the estimated value of y is given by yˆ = 4.7178 + 0.8145 × 13 = 15.3063
The regression line of x on y is given by
= 1.0743
and
Thus the estimated regression line of x on y is
x = –4.3601 + 1.0743y
When y = 15, the estimate value of x is given by
xˆ = – 4.3601 + 1.0743 × 15
= 11.75

Example 17.16: Marks of 8 students in Mathematics and statistics are given as:

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Find the regression lines. When marks of a student in Mathematics are 90, what are his most likely marks in statistics?

Solution:
We denote the marks in Mathematics and Statistics by x and y respectively. We are to find the regression equation of y on x and also of x or y. Lastly, we are to estimate y when x = 90. For computation advantage, we shift origins of both x and y.
The regression coefficients b (or b_yx) and b’ (or b_xy) remain unchanged due to a shift of origin.
Applying (17.25) and (17.26), we get
and
The regression line of y on x is
y = –10.4571 + 1.1406x
and the regression line of x on y is
x = 36.2281 + 0.5129y
For x = 90, the most likely value of y is

Example 17.17: The following data relate to the mean and SD of the prices of two shares in a stock Exchange:

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Coefficient of correlation between the share prices = 0.48
Find the most likely price of share A corresponding to a price of ₹60 of share B and also the most likely price of share B for a price of ₹ 50 of share A.

Solution:
Denoting the share prices of Company A and B respectively by x and y, we are given that
and r = 0.48
The regression line of y on x is given by y = a + bx
Thus the regression line of y on x i.e. the regression line of price of share B on that of share A is given by
y = ₹ (34.24 + 0.54x)
When x = ₹ 50, = ₹ (34.24 + 0.54 × 50)
= ₹ 61.24
= The estimated price of share B for a price of ₹ 50 of share A is ₹ 61.24
Again the regression line of x on y is given by
Hence the regression line of x on y i.e. the regression line of price of share A on that of share B in given by
x = ₹ (19.25 + 0.4267y)
When y = ₹ 60, ˆx = ₹ (19.25 + 0.4267 × 60) = ₹ 44.85

Example 17.18: The following data relate the expenditure or advertisement in thousands of rupees and the corresponding sales in lakhs of rupees.

Find an appropriate regression equation.

Solution:
Since sales (y) depend on advertisement (x), the appropriate regression equation is of y on x i.e. of sales on advertisement. We have, on the basis of the given data,
Thus, the regression line of y or x i.e. the regression line of sales on advertisement is given by
y = 6.4927 + 1.4643x

Question for Chapter Notes: Correlation And Regression

Try yourself:

Which of the following best describes the concept of Bivariate Data?

A.
Data collected on two variables at different times
B.
Data collected on two variables at the same time
C.
Data collected on multiple variables at different times
D.
Data collected on multiple variables at the same time

View Solution

Properties of Regression Lines

We consider the following important properties of regression lines:
(i) The regression coefficients remain unchanged due to a shift of origin but change due to a shift of scale.
This property states that if the original pair of variables is (x, y) and if they are changed to the pair (u, v) where

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

(ii) The two lines of regression intersect at the point where x and y are the variables under consideration.
According to this property, the point of intersection of the regression line of y on x and the regression line of x on y is i.e. the solution of the simultaneous equations in x and y.

(iii) The coefficient of correlation between two variables x and y is the simple geometric mean of the two regression coefficients. The sign of the correlation coefficient would be the common sign of the two regression coefficients.
This property says that if the two regression coefficients are denoted by b_yx (=b) and b _xy (=b’) then the coefficient of correlation is given by Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

If both the regression coefficients are negative, r would be negative and if both are positive, r would assume a positive value.

Example 17.19: If the relationship between two variables x and u is u + 3x = 10 and between two other variables y and v is 2y + 5v = 25, and the regression coefficient of y on x is known as 0.80, what would be the regression coefficient of v on u?

Solution:
u + 3x = 10

Example 17.20: For the variables x and y, the regression equations are given as 7x – 3y – 18 = 0 and 4x – y – 11 = 0
(i) Find the arithmetic means of x and y.
(ii) Identify the regression equation of y on x.
(iii) Compute the correlation coefficient between x and y.
(iv) Given the variance of x is 9, find the SD of y.

Solution:
(i) Since the two lines of regression intersect at the point , replacing x and y by respectively in the given regression equations, we get
Solving these two equations, we get
Thus the arithmetic means of x and y are given by 3 and 1 respectively.
(ii) Let us assume that 7x – 3y – 18 = 0 represents the regression line of y on x and 4x – y – 11 = 0 represents the regression line of x on y.
Now 7x – 3y – 18 = 0
Since , our assumptions are correct. Thus, truly represents the regression line of y on x.
(iii) Since r² = 7 / 12

The document Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation is a part of the CA Foundation Course Quantitative Aptitude for CA Foundation.

All you need of CA Foundation at this link: CA Foundation

	Quantitative Aptitude for CA Foundation 114 videos\|164 docs\|98 tests

Quantitative Aptitude for CA Foundation

114 videos|164 docs|98 tests

Join Course for Free

Top Courses for CA Foundation

View all

FAQs on Correlation And Regression Chapter Notes - Quantitative Aptitude for CA Foundation

1. What is bivariate data and how is it used in correlation analysis?

Ans. Bivariate data involves two variables that are analyzed to determine the relationship between them. In correlation analysis, bivariate data helps in assessing the strength and direction of the relationship, using correlation coefficients to quantify how closely the two variables are related.

2. How do you interpret correlation coefficients?

Ans. Correlation coefficients range from -1 to 1. A coefficient close to 1 indicates a strong positive relationship, meaning as one variable increases, the other also tends to increase. A coefficient close to -1 indicates a strong negative relationship, meaning as one variable increases, the other tends to decrease. A coefficient around 0 suggests no correlation between the variables.

3. What is the purpose of regression analysis in statistics?

Ans. Regression analysis is used to model the relationship between a dependent variable and one or more independent variables. It helps in predicting the value of the dependent variable based on the values of the independent variables, allowing for understanding of how changes in predictors influence the outcome.

4. What are the key properties of regression lines?

Ans. Key properties of regression lines include their ability to minimize the sum of squared differences between observed and predicted values, their slope indicating the rate of change in the dependent variable for a unit change in the independent variable, and their intercept representing the value of the dependent variable when all independent variables are zero.

5. How can correlation and regression analysis be applied in real-world scenarios?

Ans. Correlation and regression analysis can be applied in various fields such as economics to assess the impact of price changes on demand, in healthcare to analyze the relationship between lifestyle factors and health outcomes, and in marketing to understand consumer behavior and predict sales based on advertising spend.

Related Exams

CA Foundation

About this Document

	4.92/5 Rating
	Dec 22, 2024 Last updated

Document Description: Chapter Notes: Correlation And Regression for CA Foundation 2024 is part of Quantitative Aptitude for CA Foundation preparation. The notes and questions for Chapter Notes: Correlation And Regression have been prepared according to the CA Foundation exam syllabus. Information about Chapter Notes: Correlation And Regression covers topics like Chapter Overview, Introduction, Bivariate Data, Correlation Analysis, Regression Analysis, Properties of Regression Lines and Chapter Notes: Correlation And Regression Example, for CA Foundation 2024 Exam. Find important definitions, questions, notes, meanings, examples, exercises and tests below for Chapter Notes: Correlation And Regression.

Introduction of Chapter Notes: Correlation And Regression in English is available as part of our Quantitative Aptitude for CA Foundation for CA Foundation & Chapter Notes: Correlation And Regression in Hindi for Quantitative Aptitude for CA Foundation course. Download more important topics related with notes, lectures and mock test series for CA Foundation Exam by signing up for free. CA Foundation: Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Description

Full syllabus notes, lecture & questions for Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation - CA Foundation | Plus excerises question with solution to help you revise complete syllabus for Quantitative Aptitude for CA Foundation | Best notes, free PDF download

Information about Chapter Notes: Correlation And Regression

In this doc you can find the meaning of Chapter Notes: Correlation And Regression defined & explained in the simplest way possible. Besides explaining types of Chapter Notes: Correlation And Regression theory, EduRev gives you an ample number of questions to practice Chapter Notes: Correlation And Regression tests, examples and also practice CA Foundation tests

	Quantitative Aptitude for CA Foundation 114 videos\|164 docs\|98 tests

Quantitative Aptitude for CA Foundation

114 videos|164 docs|98 tests

Join Course for Free

Download as PDF

Explore Courses for CA Foundation exam

Top Courses for CA Foundation

Explore Courses

Signup for Free!

Signup to see your scores go up within 7 days! Learn & Practice with 1000+ FREE Notes, Videos & Tests.

Start learning for Free

10M+ students study on EduRev

Summary

Semester Notes

practice quizzes

ppt

study material

past year papers

MCQs

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Objective type Questions

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

Important questions

Correlation And Regression Chapter Notes | Quantitative Aptitude for CA Foundation

video lectures

Previous Year Questions with Solutions

mock tests for examination

Sample Paper

Extra Questions

shortcuts and tricks

Exam

pdf

Free

Viva Questions

;

Additional Information about Chapter Notes: Correlation And Regression for CA Foundation Preparation

Chapter Notes: Correlation And Regression Free PDF Download

The Chapter Notes: Correlation And Regression is an invaluable resource that delves deep into the core of the CA Foundation exam. These study notes are curated by experts and cover all the essential topics and concepts, making your preparation more efficient and effective. With the help of these notes, you can grasp complex subjects quickly, revise important points easily, and reinforce your understanding of key concepts. The study notes are presented in a concise and easy-to-understand manner, allowing you to optimize your learning process. Whether you're looking for best-recommended books, sample papers, study material, or toppers' notes, this PDF has got you covered. Download the Chapter Notes: Correlation And Regression now and kickstart your journey towards success in the CA Foundation exam.

Importance of Chapter Notes: Correlation And Regression

The importance of Chapter Notes: Correlation And Regression cannot be overstated, especially for CA Foundation aspirants. This document holds the key to success in the CA Foundation exam. It offers a detailed understanding of the concept, providing invaluable insights into the topic. By knowing the concepts well in advance, students can plan their preparation effectively. Utilize this indispensable guide for a well-rounded preparation and achieve your desired results.

Chapter Notes: Correlation And Regression

Chapter Notes: Correlation And Regression Notes offer in-depth insights into the specific topic to help you master it with ease. This comprehensive document covers all aspects related to Chapter Notes: Correlation And Regression. It includes detailed information about the exam syllabus, recommended books, and study materials for a well-rounded preparation. Practice papers and question papers enable you to assess your progress effectively. Additionally, the paper analysis provides valuable tips for tackling the exam strategically. Access to Toppers' notes gives you an edge in understanding complex concepts. Whether you're a beginner or aiming for advanced proficiency, Chapter Notes: Correlation And Regression Notes on EduRev are your ultimate resource for success.

Chapter Notes: Correlation And Regression CA Foundation Questions

The "Chapter Notes: Correlation And Regression CA Foundation Questions" guide is a valuable resource for all aspiring students preparing for the CA Foundation exam. It focuses on providing a wide range of practice questions to help students gauge their understanding of the exam topics. These questions cover the entire syllabus, ensuring comprehensive preparation. The guide includes previous years' question papers for students to familiarize themselves with the exam's format and difficulty level. Additionally, it offers subject-specific question banks, allowing students to focus on weak areas and improve their performance.

Study Chapter Notes: Correlation And Regression on the App

Students of CA Foundation can study Chapter Notes: Correlation And Regression alongwith tests & analysis from the EduRev app, which will help them while preparing for their exam. Apart from the Chapter Notes: Correlation And Regression, students can also utilize the EduRev App for other study materials such as previous year question papers, syllabus, important questions, etc. The EduRev App will make your learning easier as you can access it from anywhere you want. The content of Chapter Notes: Correlation And Regression is prepared as per the latest CA Foundation syllabus.

Education Revolution

Signup to see your scores go up within 7 days!

Access 1000+ FREE Docs, Videos and Tests

Continue with Google

Takes less than 10 seconds to signup