Method of Least Squares, Business Mathematics and Statistics

# Method of Least Squares, Business Mathematics and Statistics - Business Mathematics and Statistics - B Com

Method of Least Squares : If a straight line fitted to the data serves as a satisfactory trend, perhaps the most accurate method of fitting it is that of least squares. This method is designed to accomplish two results.
(i) The sum of the vertical deviations from the straight line must equal zero.
(ii) The sum of the squares of all deviations must be less than the sum of the squares for any other conceivable straight line.
There will be many straight lines which can meet the first condition. Among all these lines, only one will satisfy the second condition. It is because of this second condition that this method is known as the method of least squares. It may be mentioned that a line fitted to satisfy the second condition will automatically satisfy the first condition.
The formula for a straight-line trend can most simply be expressed as

Yc = a + bX
where X represents the time variable, Yc is the dependent variable for which trend values are to be calculated, and a and b are the constants of the straight line to be found by the method of least squares.
Constant a is the Y-intercept. This is the distance between the origin (O) and the point where the trend line intersects the Y-axis. It shows the value of Y when X = 0. Constant b indicates the slope, which is the change in Y for each unit change in X.
Let us assume that we are given observations of Y for n years. We wish to find the values of constants a and b in such a manner that the two conditions laid down above are satisfied by the fitted equation.
Mathematical reasoning suggests that, to obtain the values of constants a and b according to the Principle of Least Squares, we have to solve simultaneously the following two equations.

∑Y = na + b∑X ...(i)
∑XY = a∑X + b∑X² ...(ii)

Solution of the two normal equations yields the following values for the constants a and b :
b = (n∑XY – ∑X∑Y) / (n∑X² – (∑X)²)

and a = (∑Y – b∑X) / n
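
As a rough illustration of how the two normal equations can be solved, the Python sketch below computes a and b from paired observations of X and Y. The function name `fit_line` and its arguments are assumptions made for this example and do not come from the text above.

```python
def fit_line(xs, ys):
    """Solve the two normal equations for the line Yc = a + bX."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all X values are identical; slope is undefined")

    # Closed-form solution of  sum(Y) = n*a + b*sum(X)
    #                     and  sum(XY) = a*sum(X) + b*sum(X^2)
    b = (n * sum_xy - sum_x * sum_y) / denom
    a = (sum_y - b * sum_x) / n
    return a, b


# Three points lying exactly on Y = 1 + 2X.
a, b = fit_line([0, 1, 2], [1, 3, 5])
print(a, b)
```

Because the three sample points lie exactly on a line, the fitted constants should reproduce that line's intercept and slope.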

Least Squares Long Method : It makes use of the above-mentioned two normal equations without attempting to shift the time variable to a convenient mid-year. This method is illustrated by the following example.

Illustration : Fit a linear trend curve by the least-squares method to the following data :
Year    Production (Kg.)
2001    3
2002    5
2003    6
2004    6
2005    8
2006    10
2007    11
2008    12
2009    13
2010    15

Solution : The first year, 2001, is taken as the origin (X = 0), so 2002 becomes 1, 2003 becomes 2, and so on. The various steps are outlined in the following table.

----------------------------------------------------
Year        Production Y    X       XY      X²
(1)         (2)             (3)     (4)     (5)
----------------------------------------------------
2001        3               0       0       0
2002        5               1       5       1
2003        6               2       12      4
2004        6               3       18      9
2005        8               4       32      16
2006        10              5       50      25
2007        11              6       66      36
2008        12              7       84      49
2009        13              8       104     64
2010        15              9       135     81
----------------------------------------------------
Total       89              45      506     285
----------------------------------------------------

The above table yields the following values for various terms mentioned below :
n = 10, ∑X = 45, ∑X² = 285, ∑Y = 89, and ∑XY = 506

Substituting these values in the two normal equations, we obtain
89 = 10a + 45b ...(i)
506 = 45a + 285b ...(ii)
Multiplying equation (i) by 9 and equation (ii) by 2, we obtain
801 = 90a + 405b ...(iii)
1012 = 90a + 570b ...(iv)
Subtracting equation (iii) from equation (iv), we obtain
211 = 165b or b = 211/165 = 1.28
Substituting the value of b in equation (i), we obtain
89 = 10a + 45 × 1.28
89 = 10a + 57.60
10a = 89 – 57.6
10a = 31.4
a = 31.4/10 = 3.14
Substituting these values of a and b in the linear equation, we obtain the following trend line
Yc = 3.14 + 1.28X

Inserting various values of X in this equation, we obtain the trend values as below :

-----------------------------------------------------------------
Year    Observed Y    a       b × X       Yc (Col. 3 plus Col. 4)
(1)     (2)           (3)     (4)         (5)
-----------------------------------------------------------------
2001    3             3.14    1.28 × 0    3.14
2002    5             3.14    1.28 × 1    4.42
2003    6             3.14    1.28 × 2    5.70
2004    6             3.14    1.28 × 3    6.98
2005    8             3.14    1.28 × 4    8.26
2006    10            3.14    1.28 × 5    9.54
2007    11            3.14    1.28 × 6    10.82
2008    12            3.14    1.28 × 7    12.10
2009    13            3.14    1.28 × 8    13.38
2010    15            3.14    1.28 × 9    14.66
-----------------------------------------------------------------
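
The trend values for this illustration, and the two conditions stated at the outset, can be checked with a few lines of Python. This is only a sketch: it uses the rounded constants a = 3.14 and b = 1.28 from the worked solution, and the names `observed`, `trend` and `sum_of_squares` are assumptions of this example.

```python
# Observed production for 2001-2010, with 2001 coded as X = 0.
observed = [3, 5, 6, 6, 8, 10, 11, 12, 13, 15]
a, b = 3.14, 1.28  # rounded constants from the worked solution


def sum_of_squares(a, b):
    """Sum of squared vertical deviations of the data from the line a + bX."""
    return sum((y - (a + b * x)) ** 2 for x, y in enumerate(observed))


trend = [a + b * x for x in range(len(observed))]
deviations = [y - yc for y, yc in zip(observed, trend)]

print([round(yc, 2) for yc in trend])
print(round(sum(deviations), 6))  # condition (i): close to zero

# Condition (ii): shifting or tilting the line increases the sum of squares.
print(sum_of_squares(a, b) < sum_of_squares(a + 0.5, b))
print(sum_of_squares(a, b) < sum_of_squares(a, b + 0.1))
```

The two comparisons at the end try only two alternative lines, so they illustrate condition (ii) rather than prove it.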

Least Squares Short-Cut Method : We can take any other year as the origin, and for that year X would be 0. Considerable saving of both time and effort is possible if the origin is taken in the middle of the whole time span covered by the series. The origin would then be located at the mean of the X values, and the sum of the X values would equal 0. The two normal equations would then be simplified to

∑Y = Na or a = ∑Y / N ...(i)
and ∑XY = b∑X² or b = ∑XY / ∑X² ...(ii)
Two cases of the short-cut method are given below. In the first case there is an odd number of years, while in the second case the number of observations is even.
Illustration : Fit a straight line trend on the following data :
Year  1996  1997  1998  1999  2000  2001  2002  2003  2004
Y        4     7     7     8     9    11    13    14    17
Solution : Since we have 9 observations, the origin is taken at the middle year, 2000, for which X is assumed to be 0.

------------------------------
Year    Y     X     XY    X²
------------------------------
1996    4     –4    –16   16
1997    7     –3    –21   9
1998    7     –2    –14   4
1999    8     –1    –8    1
2000    9     0     0     0
2001    11    1     11    1
2002    13    2     26    4
2003    14    3     42    9
2004    17    4     68    16
------------------------------
Total   90    0     88    60
------------------------------

Thus n = 9, ∑Y = 90, ∑X = 0, ∑XY = 88, and ∑X² = 60
Substituting these values in the two normal equations, we get
90 = 9a or a = 90/9 or a = 10
88 = 60b or b = 88/60 or b = 1.47
Trend equation is : Yc = 10 + 1.47 X
Inserting the various values of X, we obtain the trend values as below :
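
A minimal Python sketch of the short-cut computation for this illustration follows. It codes X as deviations from the middle year, so that ∑X = 0, and then applies a = ∑Y/n and b = ∑XY/∑X². The function name `fit_line_centered` is an assumption of this example. The constants are kept unrounded, so the printed trend values may differ slightly from those obtained with b = 1.47.

```python
def fit_line_centered(xs, ys):
    """Short-cut least squares; requires X coded so that sum(X) == 0."""
    if sum(xs) != 0:
        raise ValueError("X must be measured from the middle of the series")
    a = sum(ys) / len(ys)
    b = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return a, b


years = list(range(1996, 2005))
ys = [4, 7, 7, 8, 9, 11, 13, 14, 17]
xs = [year - 2000 for year in years]  # origin at the middle year, 2000

a, b = fit_line_centered(xs, ys)
for year, x in zip(years, xs):
    print(year, round(a + b * x, 2))
```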

Solution : Here there are two mid-years, viz., 2006 and 2007. The mid-point of the two years is assumed to be 0 and a period of six months is treated as the unit. On this basis the calculations are as shown below:

----------------------------------------------
Years    Observed Y    X     XY       X²
----------------------------------------------
2003     6.7           –7    –46.9    49
2004     5.3           –5    –26.5    25
2005     4.3           –3    –12.9    9
2006     6.1           –1    –6.1     1
2007     5.6           1     5.6      1
2008     7.9           3     23.7     9
2009     5.8           5     29.0     25
2010     6.1           7     42.7     49
----------------------------------------------
Total    47.8          0     8.6      168
----------------------------------------------

From the above computations, we get the following values.
n = 8, ∑Y = 47.8, ∑X = 0, ∑XY = 8.6, ∑X² = 168
Substituting these values in the two normal equations, we obtain
47.8 = 8a or a = 47.8/8 or a = 5.98
and 8.6 = 168b or b = 8.6/168 or b = 0.051
The equation for the trend line is : Yc = 5.98 + 0.051X
Trend values generated by this equation are below :
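
For an even number of years, the coding of X in half-year units can be sketched as below. The helper name `half_year_codes` is an assumption made for illustration; for the eight years 2003 to 2010 it returns the odd integers from –7 to 7 used in the table.

```python
def half_year_codes(years):
    """Code consecutive years as odd integers centred on the series midpoint.

    The unit is six months, so adjacent years differ by 2 and sum(X) == 0.
    Intended for an even number of consecutive years.
    """
    total = years[0] + years[-1]  # twice the midpoint, kept as an integer
    return [2 * year - total for year in years]


years = list(range(2003, 2011))
ys = [6.7, 5.3, 4.3, 6.1, 5.6, 7.9, 5.8, 6.1]
xs = half_year_codes(years)

a = sum(ys) / len(ys)
b = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
trend = [round(a + b * x, 2) for x in xs]

print(xs)
print(round(a, 3), round(b, 3))
print(trend)
```

Because the unit of X is six months, b here is the change per half-year; the change per year is 2b.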

Second Degree Parabola
The simplest example of a non-linear trend is the second degree parabola, whose equation is written in the form :
Yc = a + bX + cX²
When numerical values for a, b and c have been derived, the trend value for any year may be computed by substituting in the equation the value of X for that year. The values of a, b and c can be determined by solving the following three normal equations simultaneously:
(i) ∑Y = Na + b∑X + c∑X²
(ii) ∑XY = a∑X + b∑X² + c∑X³
(iii) ∑X²Y = a∑X² + b∑X³ + c∑X⁴
Note that the first equation is merely the summation of the given function, the second is the summation of X multiplied into the given function, and the third is the summation of X² multiplied into the given function.
When the time origin is taken at the middle of the series (the middle year, or midway between the two middle years), ∑X and ∑X³ would both be zero. In that case the equations are reduced to :
(i) ∑Y = Na + c∑X²
(ii) ∑XY = b∑X²
(iii) ∑X²Y = a∑X² + c∑X⁴
The value of b can now be obtained directly from equation (ii), and the values of a and c by solving equations (i) and (iii) simultaneously. Thus,
b = ∑XY / ∑X²
c = (N∑X²Y – ∑X²∑Y) / (N∑X⁴ – (∑X²)²)
a = (∑Y – c∑X²) / N

Illustration : The price of a commodity during 2000 – 2005 is given below. Fit a parabola Y = a + bX + cX² to this data. Estimate the price of the commodity for the year 2010 :

Year   Price   Year   Price
2000   100      2003    140
2001   107      2004    181
2002   128      2005    192

Also plot the actual and trend values on graph.
Solution : To determine the values of a, b and c, we solve the following normal equations:

∑Y = Na + b∑X + c∑X²
∑XY = a∑X + b∑X² + c∑X³
∑X²Y = a∑X² + b∑X³ + c∑X⁴

-----------------------------------------------------------------------------------
Year    Y      X     X²    X³     X⁴     XY      X²Y     Yc
-----------------------------------------------------------------------------------
2000    100    –2    4     –8     16     –200    400     97.724
2001    107    –1    1     –1     1      –107    107     110.406
2002    128    0     0     0      0      0       0       126.660
2003    140    +1    1     +1     1      +140    140     146.486
2004    181    +2    4     +8     16     +362    724     169.884
2005    192    +3    9     +27    81     +576    1728    196.854
-----------------------------------------------------------------------------------
N = 6 ∑Y = 848 ∑X = 3 ∑X² = 19 ∑X³ = 27 ∑X⁴ = 115 ∑XY = 771 ∑X²Y = 3099 ∑Yc = 848.014
-----------------------------------------------------------------------------------

848 = 6a + 3b + 19c ...(i)
771 = 3a + 19b + 27c ...(ii)
3,099 = 19a + 27b + 115c ...(iii)
Multiplying Eqn. (ii) by 2 and subtracting Eqn. (i), we get
35b + 35c = 694 ...(iv)
Multiplying Eqn. (ii) by 19 and Eqn. (iii) by 3, and subtracting the latter from the former, we get
5352 = 280b + 168c ...(v)
Solving Eqns. (iv) and (v), we get
c = 1.786
Substituting the value of c in Eqn. (iv), we get
b = 18.04 [35b + (35 × 1.786) = 694]
Putting the values of b and c in Eqn. (i), we get
a = 126.66 [848 = 6a + (3 × 18.04) + (19 × 1.786)]
Thus a = 126.66, b = 18.04 and c = 1.786
Substituting the values in the equation, we get
Yc = 126.66 + 18.04X + 1.786X²
When X = –2, Y = 126.66 + 18.04(–2) + 1.786(–2)²
= 126.66 – 36.08 + 7.144 = 97.724
When X = –1, Y = 126.66 + 18.04(–1) + 1.786(–1)²
= 126.66 – 18.04 + 1.786 = 110.406
When X = 0, Y = 126.66
When X = 1, Y = 126.66 + 18.04 + 1.786 = 146.486
When X = 2, Y = 126.66 + 18.04(2) + 1.786(2)²
= 126.66 + 36.08 + 7.144 = 169.884
When X = 3, Y = 126.66 + 18.04(3) + 1.786(3)²
= 126.66 + 54.12 + 16.074 = 196.854
Price for 2010 (X = 8), Y = 126.66 + 18.04(8) + 1.786(8)²
= 126.66 + 144.32 + 114.304 = 385.284
Thus the likely price of the commodity for the year 2010 is Rs. 385.284.
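
The same three normal equations can also be solved numerically. The Python sketch below is one possible way of doing so, not the method used in the worked solution: it swaps in Cramer's rule for the step-by-step elimination shown above. The names `det3` and `fit_parabola` are assumptions of this example. Since nothing is rounded along the way, the constants may differ from the hand computation in the last digits.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_parabola(xs, ys):
    """Solve the three normal equations for Yc = a + bX + cX^2 (Cramer's rule)."""
    n = len(xs)
    sx = sum(xs)
    sx2 = sum(x ** 2 for x in xs)
    sx3 = sum(x ** 3 for x in xs)
    sx4 = sum(x ** 4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x ** 2 * y for x, y in zip(xs, ys))

    coeffs = [[n, sx, sx2], [sx, sx2, sx3], [sx2, sx3, sx4]]
    rhs = [sy, sxy, sx2y]
    d = det3(coeffs)
    if d == 0:
        raise ValueError("normal equations are singular")

    solution = []
    for col in range(3):
        # Replace one column of the coefficient matrix with the right-hand side.
        m = [row[:] for row in coeffs]
        for r in range(3):
            m[r][col] = rhs[r]
        solution.append(det3(m) / d)
    return tuple(solution)


xs = [-2, -1, 0, 1, 2, 3]  # 2002 taken as the origin
ys = [100, 107, 128, 140, 181, 192]
a, b, c = fit_parabola(xs, ys)
print(round(a, 2), round(b, 2), round(c, 3))
print(round(a + b * 8 + c * 8 ** 2, 2))  # estimate for 2010 (X = 8)
```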
The graph of the actual and trend values is given below:

Conversion of Annual Trend Equation to Monthly Trend Equation
Fitting a trend line by least squares to monthly data may be excessively time consuming. It is more convenient to compute the trend equation from annual data and then convert this trend equation to a monthly trend equation.
There are two possible situations: (i) the Y units are annual totals, for example, the total number of passenger cars sold; (ii) the Y units are monthly averages, for example, the average monthly wholesale price index.

Where Data are Annual Totals
A trend equation operative on an annual level is to be reduced to a monthly level. The constant a is expressed in terms of annual Y values. To express it in terms of monthly values, we must divide it by 12. Similarly, b is to be divided by 12 to convert the annual change to a monthly level. But this division gives only the change between the same month of two consecutive years, whereas we want the change between two consecutive months. Therefore b is to be divided by 12 once again. Consequently, to convert an annual trend equation to a monthly trend equation when the data are expressed as annual totals, we divide a by 12 and b by 144.

Where the Data are given as monthly averages per year
In this case, the Y values are already on a monthly level. Therefore, the a value remains unchanged in the conversion process. The b value in this case shows the change on a monthly level, but from a month in one year to the corresponding month in the following year. Here, it is necessary only to convert the b value so that it measures the change between consecutive months, by dividing it by 12.
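
The two conversion rules can be summarised in a short Python sketch. The function names and the sample equation Yc = 144 + 72X are hypothetical, chosen only so that the divisions come out evenly. The sketch only rescales a and b as described above; it does not attempt to shift the origin to the middle of a month.

```python
def monthly_from_annual_totals(a, b):
    """Y values are annual totals: divide a by 12 and b by 144."""
    return a / 12, b / 144


def monthly_from_monthly_averages(a, b):
    """Y values are monthly averages: a is unchanged, b is divided by 12."""
    return a, b / 12


# Hypothetical annual trend equation Yc = 144 + 72X (X in years).
print(monthly_from_annual_totals(144, 72))     # (12.0, 0.5)
print(monthly_from_monthly_averages(144, 72))  # (144, 6.0)
```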

Merits
(i) This method has no place for subjectivity since it is a mathematical method of measuring trend.
(ii) This method gives the line of best fit because from this line the sum of the positive and negative deviations is zero and the total of the squares of these deviations is minimum.

Limitations

The best practicable use of mathematical trends is for describing movements in time series. It does not provide a clue to the causes of such movements. Therefore, forecasting on this basis may be quite risky.
Forecasting will be valid if there is a functional relationship between the variable under consideration and time for a particular trend. But if trend describes the past behaviour, it hardly throws light on the causes which may influence the future behaviour.
The other limitation is that if some items are added to the original data, a new equation has to be obtained.

Curvilinear Trend
Sometimes, the time series may not be represented by a straight-line trend. Such trends are known as curvilinear trends. If the curvilinear trend can be represented by a straight line on semi-log paper, or by polynomials of second or higher degree, or by a double logarithmic function, then the method of least squares is also applicable to such cases.
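
As one hedged illustration of the semi-log case, the Python sketch below fits a trend of the form Y = A·B^X by applying the straight-line normal equations to log Y. The function name `fit_exponential` and the sample series are hypothetical and are not taken from the text above.

```python
import math


def fit_exponential(xs, ys):
    """Fit Y = A * B**X by least squares on log Y = log A + X * log B."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    sum_x = sum(xs)
    sum_l = sum(logs)
    sum_xl = sum(x * l for x, l in zip(xs, logs))
    sum_x2 = sum(x * x for x in xs)

    # Same closed-form solution as for a straight line, with log Y in place of Y.
    slope = (n * sum_xl - sum_x * sum_l) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_l - slope * sum_x) / n
    return math.exp(intercept), math.exp(slope)


# Hypothetical series that doubles every period: Y = 5 * 2**X.
xs = [-2, -1, 0, 1, 2]
ys = [1.25, 2.5, 5, 10, 20]
A, B = fit_exponential(xs, ys)
print(round(A, 4), round(B, 4))
```

Note that this approach minimises the squared deviations of log Y rather than of Y itself, so it is not identical to a direct least-squares fit of the exponential curve.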

The document Method of Least Squares, Business Mathematics and Statistics | Business Mathematics and Statistics - B Com is a part of the B Com Course Business Mathematics and Statistics.

## FAQs on Method of Least Squares, Business Mathematics and Statistics - Business Mathematics and Statistics - B Com

 1. What is the Method of Least Squares? Ans. The Method of Least Squares is a statistical technique used to find the best-fitting curve or line that represents a set of data points. It minimizes the sum of the squared differences between the observed and predicted values. This method is commonly used in regression analysis to determine the relationship between variables and make predictions.
 2. How does the Method of Least Squares work? Ans. The Method of Least Squares works by minimizing the sum of the squared differences between the observed values and the predicted values. It calculates the equation of a line or curve that best fits the data points by finding the values of the parameters that minimize the sum of the squared residuals. The residuals are the differences between the observed values and the predicted values.
 3. What is the importance of the Method of Least Squares in business mathematics and statistics? Ans. The Method of Least Squares is crucial in business mathematics and statistics as it provides a way to analyze and interpret data. It helps in determining the relationship between variables, making predictions, and estimating the parameters of a model. Businesses can use this method to analyze trends, forecast future outcomes, and make informed decisions based on statistical analysis.
 4. Can the Method of Least Squares be used for non-linear relationships? Ans. Yes, the Method of Least Squares can be used for non-linear relationships. While it is commonly used for linear regression, it can also be applied to non-linear models by transforming the data or using non-linear regression techniques. However, it is important to note that the assumptions and interpretation may differ for non-linear models compared to linear models.
 5. What are the limitations of the Method of Least Squares? Ans. The Method of Least Squares has a few limitations. Firstly, it assumes that the relationship between variables is linear, which may not always be the case in real-world scenarios. Additionally, it assumes that the errors or residuals have constant variance and are normally distributed. Violation of these assumptions can affect the accuracy of the results. Furthermore, outliers in the data can heavily influence the results, so it is important to identify and handle them appropriately.
