Absolute Measures of Dispersion
These measures are expressed in the unit of the variable being considered. They include: range, mean deviation, standard deviation and quartile deviation.
Relative Measures of Dispersion
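Relative measures are unit-free and are used for comparing distributions. They include: coefficient of range, coefficient of mean deviation, coefficient of variation and coefficient of quartile deviation.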
Differences between Absolute and Relative Measures
Characteristics of an Ideal Measure of Dispersion
For a given set of observations, range may be defined as the difference between the largest and the smallest observations. Thus, if L and S denote the largest and smallest observations respectively, then we have
Range = L – S
The corresponding relative measure of dispersion, known as coefficient of range, is given by
Coefficient of range = (L – S) / (L + S) × 100
For a grouped frequency distribution, range is defined as the difference between the two extreme class boundaries. The corresponding relative measure of dispersion is given by the ratio of the difference between the two extreme class boundaries to the total of these class boundaries, expressed as a percentage.
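We may note the following important result in connection with range:
Result: Range remains unaffected by a change of origin but is affected in the same ratio by a change of scale, i.e., if y = a + bx, a and b being constants, then
Range of y = |b| × Range of x ………………………(14.2.1)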
Example 14.2.1: Following are the wages of 8 workers expressed in Rupees.
82, 96, 52, 75, 70, 65, 50, 70. Find the range and also its coefficient.
Solution: The largest and the smallest wages are L = ₹ 96 and S = ₹ 50
Thus range = ₹ 96 – ₹ 50 = ₹ 46
Coefficient of range = (96 – 50) / (96 + 50) × 100
= 31.51
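As a quick numerical check of Example 14.2.1, the sketch below computes the range and its coefficient for the eight wages. The function name `range_and_coefficient` is only illustrative and does not come from the original notes.

```python
def range_and_coefficient(values):
    """Return (range, coefficient of range in percent) for raw observations."""
    largest, smallest = max(values), min(values)
    value_range = largest - smallest
    coefficient = (largest - smallest) / (largest + smallest) * 100
    return value_range, coefficient


wages = [82, 96, 52, 75, 70, 65, 50, 70]  # wages in rupees from Example 14.2.1
wage_range, wage_coefficient = range_and_coefficient(wages)
print(wage_range, round(wage_coefficient, 2))  # 46 31.51
```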
Example 14.2.2: What is the range and its coefficient for the following distribution of weights?
Solution: The lowest class boundary is 49.50 kg and the highest class boundary is 74.50 kg. Thus we have
Range = 74.50 kg – 49.50 kg
= 25 kg
Also, coefficient of range = (74.50 – 49.50) / (74.50 + 49.50) × 100
= (25 / 124) × 100
= 20.16
Example 14.2.3 : If the relationship between x and y is given by 2x+3y=10 and the range of x is ₹ 15, what would be the range of y?
Solution: Since 2x + 3y = 10,
therefore y = 10/3 – (2/3)x.
Applying (14.2.1), the range of y is given by
Range of y = |b| × Range of x = (2/3) × ₹ 15 = ₹ 10
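The effect of a linear change of variable on the range can be checked on concrete numbers. The sketch below assumes a hypothetical set of x values whose range is 15 (the example does not list the actual observations) and uses exact fractions so that no rounding is involved.

```python
from fractions import Fraction


def value_range(values):
    """Difference between the largest and the smallest observation."""
    return max(values) - min(values)


# Hypothetical x values with range 15; they are NOT given in the example.
x_values = [0, 4, 9, 15]
# 2x + 3y = 10  =>  y = (10 - 2x) / 3, i.e. a = 10/3 and b = -2/3.
y_values = [Fraction(10 - 2 * x, 3) for x in x_values]

range_x = value_range(x_values)
range_y = value_range(y_values)
print(range_x, range_y)  # 15 10
```

Any other set of x values with range 15 gives the same range for y, since only |b| = 2/3 matters.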
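Mean deviation about A is the arithmetic mean of the absolute deviations of the observations from A. For a grouped frequency distribution, mean deviation about A is given by
$$MD_A = \frac{1}{N}\sum f_i \,|x_i - A|$$
where x_i and f_i denote the mid-value and frequency of the i-th class interval and
N = ∑f_i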
In most cases we take A as mean or median and accordingly, we get mean deviation about mean or mean deviation about median.
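A relative measure of dispersion based on mean deviation is the coefficient of mean deviation, given by
Coefficient of mean deviation = (Mean deviation about A / A) × 100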
Mean deviation takes its minimum value when the deviations are taken from the median. Also mean deviation remains unchanged due to a change of origin but changes in the same ratio due to a change in scale i.e. if y = a + bx, a and b being constants, then
MD of y = |b| × MD of x ………………………(14.2.4)
Example 14.2.4: What is the mean deviation about mean for the following numbers?
5, 8, 10, 10, 12, 9.
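Solution:
The mean is given by
x̄ = (5 + 8 + 10 + 10 + 12 + 9) / 6 = 54 / 6 = 9
Thus mean deviation about mean is given by
(|5 – 9| + |8 – 9| + |10 – 9| + |10 – 9| + |12 – 9| + |9 – 9|) / 6
= (4 + 1 + 1 + 1 + 3 + 0) / 6
= 10 / 6
= 1.67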
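The computation in Example 14.2.4 can be reproduced with a few lines of Python. This is a minimal sketch; the helper name `mean_deviation` is an illustrative choice, not something defined in the notes.

```python
def mean_deviation(values, about):
    """Arithmetic mean of the absolute deviations of `values` from `about`."""
    return sum(abs(v - about) for v in values) / len(values)


numbers = [5, 8, 10, 10, 12, 9]
mean = sum(numbers) / len(numbers)          # 54 / 6 = 9.0
md_about_mean = mean_deviation(numbers, mean)
print(mean, round(md_about_mean, 2))        # 9.0 1.67
```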
Example 14.2.5: Find the mean deviation about median and also the corresponding coefficient for the following profits (in ₹ '000) of a firm during a week.
82, 56, 75, 70, 52, 80, 68.
Solution:
The profits in thousand rupees are denoted by x. Arranging the values of x in ascending order, we get
52, 56, 68, 70, 75, 80, 82.
Therefore, Me = 70. Thus, median profit = ₹ 70,000.
Thus mean deviation about median = ∑|x – Me| / n
= (18 + 14 + 2 + 0 + 5 + 10 + 12) / 7 (in ₹ '000)
= 61/7 (in ₹ '000)
= ₹ 8714.29
Coefficient of mean deviation = (MD about median / Median) × 100
= (8714.29 / 70000) × 100
= 12.45
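Example 14.2.5 can be verified in the same way, and the same data also illustrate the remark that mean deviation is smallest when taken about the median. The sketch below uses Python's standard `statistics` module; the helper `mean_deviation` is an illustrative name.

```python
import statistics


def mean_deviation(values, about):
    """Arithmetic mean of the absolute deviations of `values` from `about`."""
    return sum(abs(v - about) for v in values) / len(values)


profits = [82, 56, 75, 70, 52, 80, 68]        # in thousand rupees
median = statistics.median(profits)           # 70
md_median = mean_deviation(profits, median)   # 61 / 7 thousand rupees
coefficient = md_median / median * 100

# For comparison: the mean deviation about the arithmetic mean (69).
md_mean = mean_deviation(profits, statistics.mean(profits))

print(round(md_median * 1000, 2), round(coefficient, 2))  # 8714.29 12.45
print(md_median <= md_mean)                               # True
```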
Example 14.2.6: Compute the mean deviation about the arithmetic mean for the following data:
Also find the coefficient of the mean deviation about the AM.
Solution: We apply formula (14.1.2), as these data refer to a grouped frequency distribution. The AM is given by
x̄ = ∑f_i x_i / N = 3.88
Thus, MD about AM is given by
∑f_i |x_i – x̄| / N
= 42.88 / 25
= 1.72
Coefficient of MD about its AM = (MD about AM / AM) × 100
= (1.72 / 3.88) × 100
= 44.33
Example 14.2.7: Compute the coefficient of mean deviation about median for the following distribution:
Solution: We need to compute the median weight in the first stage
Hence, mean deviation about median
= ∑f_i |x_i – Me| / N
= 405 / 50 kg
= 8.10 kg
Coefficient of mean deviation about median = (Mean deviation about median / Median) × 100
= (8.10 / 62.50) × 100
= 12.96
Example 14.2.8: If x and y are related as 4x+3y+11 = 0 and mean deviation of x is 5.40, what is the mean deviation of y?
Solution: Since 4x + 3y + 11 = 0,
therefore y = –11/3 – (4/3)x.
Hence MD of y = |b| × MD of x
= (4/3) × 5.40
= 7.20
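Result (14.2.4) can be checked on concrete numbers as well. The sketch below assumes a small hypothetical data set for x (Example 14.2.8 gives only the mean deviation, not the observations), transforms it with 4x + 3y + 11 = 0, and confirms that the mean deviation is multiplied by |b| = 4/3. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def mean_deviation_about_mean(values):
    """Mean of absolute deviations from the arithmetic mean, computed exactly."""
    mean = Fraction(sum(values), len(values))
    return sum(abs(v - mean) for v in values) / len(values)


# Hypothetical x values, used only to illustrate the scaling rule.
x_values = [1, 4, 6, 9]
# 4x + 3y + 11 = 0  =>  y = (-11 - 4x) / 3, i.e. b = -4/3.
y_values = [Fraction(-11 - 4 * x, 3) for x in x_values]

md_x = mean_deviation_about_mean(x_values)
md_y = mean_deviation_about_mean(y_values)
print(md_x, md_y)                                      # 5/2 10/3

# The figure from Example 14.2.8 itself: MD of x = 5.40.
md_y_example = Fraction(4, 3) * Fraction("5.40")
print(float(md_y_example))                             # 7.2
```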
Although mean deviation is an improvement over range as a measure of dispersion, it is difficult to compute and, furthermore, it does not lend itself to further mathematical treatment. The best measure of dispersion is usually the standard deviation, which does not share these demerits of range and mean deviation.
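Standard deviation for a given set of observations is defined as the root mean square deviation when the deviations are taken from the AM of the observations. If a variable x assumes n values x1, x2, x3, …, xn, then its standard deviation (s) is given by
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}} \qquad (14.2.5)$$
For a grouped frequency distribution, the standard deviation is given by
$$s = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{N}} \qquad (14.2.6)$$
(14.2.5) and (14.2.6) can be simplified to the following forms:
$$s = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2} \ \text{(ungrouped data)}, \qquad s = \sqrt{\frac{\sum f_i x_i^2}{N} - \bar{x}^2} \ \text{(grouped frequency distribution)} \qquad (14.2.7)$$
Sometimes the square of standard deviation, known as variance, is regarded as a measure of dispersion. We have, then,
$$\text{Variance} = s^2 = \frac{\sum (x_i - \bar{x})^2}{n} = \frac{\sum x_i^2}{n} - \bar{x}^2$$
A relative measure of dispersion using standard deviation is given by the coefficient of variation (CV), which is defined as the ratio of standard deviation to the corresponding arithmetic mean, expressed as a percentage:
$$CV = \frac{SD}{AM} \times 100$$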
ILLUSTRATIONS:
Example 14.2.9: Find the standard deviation and the coefficient of variation for the following numbers: 5, 8, 9, 2, 6
Solution: For these five numbers, ∑x_i = 30 and ∑x_i² = 25 + 64 + 81 + 4 + 36 = 210, so the AM is 30 / 5 = 6.
Applying (14.2.7), we get the standard deviation as
s = √(210/5 – 6²) = √6 = 2.45
The coefficient of variation is
CV = 100 × SD / AM
= 100 × 2.45 / 6
= 40.83
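Example 14.2.9 can be checked numerically. The sketch below applies the short-cut form "mean of squares minus square of mean" and cross-checks it against `statistics.pstdev`, the population standard deviation in Python's standard library. The function name `sd_and_cv` is illustrative.

```python
import math
import statistics


def sd_and_cv(values):
    """Population SD via the short-cut formula, and CV in percent."""
    n = len(values)
    mean = sum(values) / n
    variance = sum(v * v for v in values) / n - mean ** 2
    sd = math.sqrt(variance)
    return sd, sd / mean * 100


numbers = [5, 8, 9, 2, 6]
sd, cv = sd_and_cv(numbers)
agrees_with_library = math.isclose(sd, statistics.pstdev(numbers))
print(round(sd, 2), round(cv, 2), agrees_with_library)  # 2.45 40.82 True
```

At full precision the coefficient of variation is about 40.82; the worked solution shows 40.83 because it rounds the SD to 2.45 before dividing by the mean.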
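Example 14.2.10: Show that for any two numbers a and b, standard deviation is given by |a – b| / 2.
Solution: For two numbers a and b, AM is given by
$$\bar{x} = \frac{a + b}{2}$$
The variance is
$$s^2 = \frac{(a - \bar{x})^2 + (b - \bar{x})^2}{2} = \frac{\frac{(a-b)^2}{4} + \frac{(a-b)^2}{4}}{2} = \frac{(a-b)^2}{4} \;\Rightarrow\; s = \frac{|a - b|}{2}$$
(The absolute sign is taken, as SD cannot be negative.)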
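Example 14.2.11: Prove that for the first n natural numbers, SD is $\sqrt{\frac{n^2 - 1}{12}}$.
Solution: For the first n natural numbers, AM is given by
$$\bar{x} = \frac{1 + 2 + \dots + n}{n} = \frac{n(n+1)}{2n} = \frac{n+1}{2}$$
Thus, SD of first n natural numbers is
$$s = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2} = \sqrt{\frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4}} = \sqrt{\frac{n^2 - 1}{12}}$$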
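The two closed forms from Examples 14.2.10 and 14.2.11 can be compared against a direct computation. The sketch below uses `statistics.pstdev` (population standard deviation) as the reference; the function names and the pair (3, 11) are illustrative choices.

```python
import math
import statistics


def sd_first_n_naturals(n):
    """Closed form for the SD of 1, 2, ..., n: sqrt((n^2 - 1) / 12)."""
    return math.sqrt((n * n - 1) / 12)


def sd_two_numbers(a, b):
    """Closed form for the SD of two numbers: |a - b| / 2."""
    return abs(a - b) / 2


# Compare the closed form with a direct computation for n = 1, ..., 50.
natural_numbers_check = all(
    math.isclose(
        statistics.pstdev(list(range(1, n + 1))),
        sd_first_n_naturals(n),
        abs_tol=1e-12,
    )
    for n in range(1, 51)
)
two_numbers_check = math.isclose(statistics.pstdev([3, 11]), sd_two_numbers(3, 11))
print(natural_numbers_check, two_numbers_check)  # True True
```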
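We consider the following formula for computing standard deviation from a grouped frequency distribution with a view to saving time and computational labour:
$$s = \sqrt{\frac{\sum f_i d_i^2}{N} - \left(\frac{\sum f_i d_i}{N}\right)^2} \times C$$
where $d_i = \frac{x_i - A}{C}$, A being an assumed origin and C the scale (usually the common class length).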
Example 14.2.12: Find the SD of the following distribution:
Solution:
Applying (14.2.7), we get the SD of weight as
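Properties of standard deviation
SD remains unaffected by a change of origin but is affected in the same ratio by a change of scale, i.e., if y = a + bx, then s_y = |b| × s_x.
If two groups contain n1 and n2 observations, with AMs x̄1 and x̄2 and SDs s1 and s2 respectively, then the combined SD is given by
$$s = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}} \qquad (14.2.13)$$
where,
$$d_1 = \bar{x}_1 - \bar{x}, \qquad d_2 = \bar{x}_2 - \bar{x}$$
and
$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = \text{combined AM}$$
This result can be extended to more than 2 groups. For k > 2 groups, we have
$$s = \sqrt{\frac{\sum n_i s_i^2 + \sum n_i d_i^2}{\sum n_i}}, \qquad d_i = \bar{x}_i - \bar{x}, \qquad \bar{x} = \frac{\sum n_i \bar{x}_i}{\sum n_i}$$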
Example 14.2.13: If AM and coefficient of variation of x are 10 and 40 respectively, what is the variance of (15–2x)?
Solution: Let y = 15 – 2x.
Then applying (14.2.4), we get
s_y = 2 × s_x ………………………………… (1)
As given, cv_x = coefficient of variation of x = 40 and x̄ = 10.
Thus cv_x = (s_x / x̄) × 100 ⇒ 40 = (s_x / 10) × 100 ⇒ s_x = 4
From (1), s_y = 2 × 4 = 8
Therefore, variance of (15 – 2x) = s_y² = 64
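The arithmetic of Example 14.2.13 is short enough to script. The sketch below recovers the SD of x from its mean and coefficient of variation, then applies the scale rule for y = a + bx; both function names are illustrative.

```python
def sd_from_cv(mean, cv_percent):
    """Invert CV = (SD / AM) * 100 to recover the SD."""
    return mean * cv_percent / 100


def variance_after_linear_change(sd_x, b):
    """Variance of y = a + b*x; the origin a has no effect."""
    return (abs(b) * sd_x) ** 2


sd_x = sd_from_cv(mean=10, cv_percent=40)           # 4.0
variance_y = variance_after_linear_change(sd_x, b=-2)
print(sd_x, variance_y)                             # 4.0 64.0
```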
Example 14.2.14: Compute the SD of 9, 5, 8, 6, 2.
Without any more computation, obtain the SD of
Solution:
The SD of the original set of observations is given by
s = √(∑x²/n – x̄²) = √(210/5 – 6²) = √6 = 2.45
If we denote the original observations by x and the observations of sample I by y, then we have
In case of sample II, x and y are related as
y = 10x
= 0 + (10)x
so that s_y = |10| × s_x = 10 × 2.45 = 24.50
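Example 14.2.15: For a group of 60 boy students, the mean and SD of statistics marks are 45 and 2 respectively. The same figures for a group of 40 girl students are 55 and 3 respectively. What are the mean and SD of marks if the two groups are pooled together?
Solution: As given, n1 = 60, x̄1 = 45, s1 = 2, n2 = 40, x̄2 = 55, s2 = 3.
Thus the combined mean is given by
x̄ = (n1 x̄1 + n2 x̄2) / (n1 + n2) = (60 × 45 + 40 × 55) / (60 + 40)
= 49
Thus d1 = x̄1 – x̄ = 45 – 49 = –4 and d2 = x̄2 – x̄ = 55 – 49 = 6
Applying (14.2.13), we get the combined SD as
s = √[(60 × 2² + 40 × 3² + 60 × (–4)² + 40 × 6²) / (60 + 40)]
= √30
= 5.48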
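The pooling of two groups in Example 14.2.15 can be sketched as follows. The function accepts any number of groups as (size, mean, SD) triples; its name and signature are illustrative, not taken from the notes.

```python
import math


def combined_mean_and_sd(groups):
    """Pool groups given as (n, mean, sd) triples into one mean and one SD."""
    total_n = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total_n
    # Each group contributes its own variance plus the squared shift
    # of its mean from the combined mean.
    variance = sum(n * (s ** 2 + (m - mean) ** 2) for n, m, s in groups) / total_n
    return mean, math.sqrt(variance)


boys = (60, 45, 2)
girls = (40, 55, 3)
pooled_mean, pooled_sd = combined_mean_and_sd([boys, girls])
print(pooled_mean, round(pooled_sd, 2))  # 49.0 5.48
```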
Example 14.2.16: The mean and standard deviation of the salaries of the two factories are provided below :
(i) Find the combined mean salary and standard deviation of salary.
(ii) Examine which factory has the more consistent structure so far as satisfying its employees is concerned.
Solution: Here we are given
Thus the combined mean salary and the combined standard deviation of salary are ₹ 4880 and ₹ 98.58 respectively.
(ii) In order to find the more consistent structure, we compare the coefficients of variation of the two factories.
We would say factory A is more consistent if CV_A < CV_B. Otherwise factory B would be more consistent.
Thus we conclude that factory A has more consistent structure.
Example 14.2.17: A student computes the AM and SD for a set of 100 observations as 50 and 5 respectively. Later on, she discovers that she has made a mistake in taking one observation as 60 instead of 50. What would be the correct mean and SD if
(i) The wrong observation is left out?
(ii) The wrong observation is replaced by the correct observation?
Solution: As given, n = 100, x̄ = 50, s = 5
Wrong observation = 60, correct observation = 50
Hence ∑x = n x̄ = 100 × 50 = 5000
and ∑x² = n (s² + x̄²) = 100 × (5² + 50²) = 252500
(i) Sum of the 99 observations = 5000 – 60 = 4940
AM after leaving the wrong observation = 4940/99 = 49.90
Sum of squares of the observations after leaving the wrong observation
= 252500 – 60² = 248900
Variance of the 99 observations = 248900/99 – (49.90)²
= 2514.14 – 2490.01
= 24.13
∴ SD of 99 observations = 4.91
(ii) Sum of the 100 observations after replacing the wrong observation by the correct observation = 5000 – 60 + 50 = 4990
AM = 4990/100 = 49.90
Corrected sum of squares = 252500 + 50² – 60² = 251400
Corrected SD = √(251400/100 – (49.90)²)
= √(2514 – 2490.01)
= √23.99
= 4.90
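The correction procedure in Example 14.2.17 works entirely from the totals ∑x and ∑x², which can be recovered from the reported mean and SD. The sketch below follows that route at full floating-point precision; the variable and function names are illustrative.

```python
import math

n, mean, sd = 100, 50, 5
wrong, correct = 60, 50

total = n * mean                        # sum of observations: 5000
total_sq = n * (sd ** 2 + mean ** 2)    # sum of squares: 252500


def mean_and_sd(count, total, total_sq):
    """Mean and population SD from a count, a sum and a sum of squares."""
    m = total / count
    return m, math.sqrt(total_sq / count - m ** 2)


# (i) the wrong observation is left out
mean_dropped, sd_dropped = mean_and_sd(n - 1, total - wrong, total_sq - wrong ** 2)
# (ii) the wrong observation is replaced by the correct one
mean_fixed, sd_fixed = mean_and_sd(
    n, total - wrong + correct, total_sq - wrong ** 2 + correct ** 2
)
print(round(mean_dropped, 2), round(sd_dropped, 2))  # 49.9 4.92
print(round(mean_fixed, 2), round(sd_fixed, 2))      # 49.9 4.9
```

Carried at full precision, case (i) gives an SD of about 4.92. The worked solution shows 4.91 because it rounds the mean to 49.90 before squaring it.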
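Quartile deviation (Qd), also known as the semi-interquartile range, is defined as
Qd = (Q3 – Q1) / 2 ………………………(14.2.14)
A relative measure of dispersion using quartiles is given by the coefficient of quartile deviation, which is
Coefficient of quartile deviation = (Q3 – Q1) / (Q3 + Q1) × 100 ………………………(14.2.15)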
Quartile deviation provides the best measure of dispersion for open-end classification. It is also less affected by extreme observations or sampling fluctuations. Like other measures of dispersion, quartile deviation remains unaffected by a change of origin but is affected in the same ratio by a change of scale.
Example 14.2.18 : Following are the marks of the 10 students : 56, 48, 65, 35, 42, 75, 82, 60, 55, 50. Find quartile deviation and also its coefficient.
Solution:
After arranging the marks in an ascending order of magnitude, we get 35, 42, 48, 50, 55, 56, 60, 65, 75, 82
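First quartile (Q1) = (n + 1)/4 th observation
= (10 + 1)/4 th observation
= 2.75th observation
= 2nd observation + 0.75 × difference between the 3rd and the 2nd observation
= 42 + 0.75 × (48 – 42)
= 46.50
Third quartile (Q3) = 3(n + 1)/4 th observation
= 8.25th observation
= 8th observation + 0.25 × difference between the 9th and the 8th observation
= 65 + 0.25 × 10
= 67.50
Thus applying (14.2.14), we get the quartile deviation as
(Q3 – Q1) / 2 = (67.50 – 46.50) / 2 = 10.50
Also, using (14.2.15), the coefficient of quartile deviation
= (Q3 – Q1) / (Q3 + Q1) × 100 = (67.50 – 46.50) / (67.50 + 46.50) × 100 = 18.42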
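The positional rule used in Example 14.2.18, where the p-th quantile sits at position p(n + 1) with linear interpolation between neighbours, can be sketched as below. The helper `quartile_by_position` is an illustrative name, and the sketch assumes the position falls between 1 and n, which holds for the quartiles of this data set.

```python
def quartile_by_position(sorted_values, p):
    """Value at 1-based position p * (n + 1), interpolating linearly.

    Assumes 1 <= p * (n + 1) <= n, so both neighbours exist when needed.
    """
    position = p * (len(sorted_values) + 1)
    lower_rank = int(position)              # rank of the observation just below
    fraction = position - lower_rank
    lower = sorted_values[lower_rank - 1]
    if fraction == 0:
        return lower
    upper = sorted_values[lower_rank]
    return lower + fraction * (upper - lower)


marks = sorted([56, 48, 65, 35, 42, 75, 82, 60, 55, 50])
q1 = quartile_by_position(marks, 0.25)      # 2.75th observation
q3 = quartile_by_position(marks, 0.75)      # 8.25th observation
quartile_deviation = (q3 - q1) / 2
coefficient = (q3 - q1) / (q3 + q1) * 100
print(q1, q3, quartile_deviation, round(coefficient, 2))  # 46.5 67.5 10.5 18.42
```

Other software may place quartiles at slightly different positions, so results for small samples can differ from those produced by this (n + 1) rule.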
Example 14.2.19 : If the quartile deviation of x is 6 and 3x + 6y = 20, what is the quartile deviation of y?
Solution: 3x + 6y = 20
⇒ y = 20/6 – (3/6)x
Therefore, quartile deviation of y = |–3/6| × quartile deviation of x
= (1/2) × 6
= 3.
Example 14.2.20: Find an appropriate measure of dispersion from the following data:
Solution: Since this is an open-end classification, the appropriate measure of dispersion would be quartile deviation, as quartile deviation does not take into account the first twenty-five per cent and the last twenty-five per cent of the observations.
Here a denotes the first Class Boundary
Thus quartile deviation of wages is given by
Example 14.2.21: The mean and variance of 5 observations are 4.80 and 6.16 respectively. If three of the observations are 2, 3 and 6, what are the remaining observations?
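Solution: Let the remaining two observations be a and b. Then, as given,
(2 + 3 + 6 + a + b) / 5 = 4.80
⇒ a + b = 13 ...........(1)
and (2² + 3² + 6² + a² + b²) / 5 – (4.80)² = 6.16
⇒ a² + b² = 97 ...........(2)
From (1), we get a = 13 – b ...........(3)
Eliminating a from (2) and (3), we get
(13 – b)² + b² = 97
⇒ b² – 13b + 36 = 0
⇒ b = 4 or 9
From (3), a = 9 or 4
Thus the remaining observations are 4 and 9.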
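The algebra of Example 14.2.21 reduces to finding two numbers from their sum and their sum of squares. The sketch below does this with exact fractions for the given mean and variance; it assumes, as in the example, that the resulting quadratic has real roots.

```python
import math
from fractions import Fraction

n = 5
mean = Fraction("4.80")
variance = Fraction("6.16")
known = [2, 3, 6]

# Totals for the two unknown observations a and b.
rest_sum = n * mean - sum(known)                                        # a + b
rest_sum_sq = n * (variance + mean ** 2) - sum(v * v for v in known)    # a^2 + b^2
product = (rest_sum ** 2 - rest_sum_sq) / 2                             # a * b

# a and b are the roots of t^2 - (a + b) t + ab = 0.
root_of_discriminant = math.sqrt(rest_sum ** 2 - 4 * product)
a = (rest_sum + root_of_discriminant) / 2
b = (rest_sum - root_of_discriminant) / 2
print(rest_sum, rest_sum_sq, a, b)  # 13 97 9.0 4.0
```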
Example 14.2.22: After shift of origin and change of scale, a frequency distribution of a continuous variable with equal class length takes the following form of the changed variable (d):
If the mean and standard deviation of the original frequency distribution are 54.12 and 2.1784 respectively, find the original frequency distribution.
Solution: We need to find the origin A and scale C from the given conditions.
Once A and C are known, the mid-values x_i would be known. Finally, we convert the mid-values to the corresponding class boundaries by using the formula:
LCB = x_i – C/2 and UCB = x_i + C/2
On the basis of the given data, we find that
Example 14.2.23: Compute coefficient of variation from the following data:
Solution: What is given in this problem is a less-than cumulative frequency distribution. We first need to convert it to a frequency distribution and then compute the coefficient of variation.
The AM is given by:
The standard deviation is
Thus the coefficient of variation is given by
= 48.83
Example 14.2.24: You are given the distribution of wages in two factories A and B
State in which factory, the wages are more variable.
Solution: As explained in Example 14.2.16, we need to compare the coefficients of variation of A (i.e. v_A) and of B (i.e. v_B).
If v_A > v_B, then the wages of factory A would be more variable. Otherwise, the wages of factory B would be more variable, where v_A = (s_A / x̄_A) × 100 and v_B = (s_B / x̄_B) × 100.
For Factory A
For Factory B
As v_A > v_B, the wages of factory A are more variable.