Introduction
- The chapter discusses measures of central tendency, a numerical method to summarize data.
- Common examples include average marks, average rainfall, average production, and average income.
- Central tendency helps evaluate and compare data efficiently, such as the economic condition of farmers in a village.
- Three primary measures of central tendency are:
- Arithmetic Mean
- Median
- Mode
- Other types include Geometric Mean and Harmonic Mean, but the focus is on the three main averages.
Arithmetic Mean
The Arithmetic Mean is calculated by adding all values and dividing by the number of observations. For example, for monthly incomes of six families: 1600, 1500, 1400, 1525, 1625, 1630, the mean is:
- Sum = 1600 + 1500 + 1400 + 1525 + 1625 + 1630 = 9,280
- Mean = 9,280 / 6 ≈ Rs 1,546.67 (about Rs 1,547)
- The formula for Arithmetic Mean:
- X̄ = (X1 + X2 + X3 + ... + XN) / N
- Where X̄ = arithmetic mean and N = total number of observations.
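As a quick check of the formula, the six family incomes above can be averaged in Python. This is only a minimal sketch; the function name `arithmetic_mean` is illustrative and does not come from the chapter.

```python
def arithmetic_mean(values):
    """Return the sum of the observations divided by their count (N)."""
    if not values:
        raise ValueError("at least one observation is required")
    return sum(values) / len(values)


incomes = [1600, 1500, 1400, 1525, 1625, 1630]  # monthly incomes in Rs
total = sum(incomes)
mean_income = arithmetic_mean(incomes)

print(total)                  # 9280
print(round(mean_income, 2))  # 1546.67
```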
Calculating Arithmetic Mean
Ungrouped Data
- Direct Method: Sum all observations and divide by the total count. Example: for the marks 40, 50, 55, 78, 58:
- Mean = (40 + 50 + 55 + 78 + 58) / 5 = 56.2
- Assumed Mean Method: Useful for large datasets. Assume a mean (A), calculate the deviations d = X - A, and find the actual mean using:
- X̄ = A + (Σd / N)
Grouped Data
- Discrete Series: Multiply each observation by its frequency, sum the products, and divide by the total frequency:
- Formula: X̄ = Σ(fX) / Σf
- Continuous Series: Use the mid-points of the class intervals as X, then proceed as for a discrete series.
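The three calculation routes described here can be sketched in Python. The function names are illustrative, the assumed mean A = 50 is an arbitrary choice, and the values passed to the grouped-data helpers in the usage note are hypothetical, since the text gives no grouped data set for this section.

```python
def assumed_mean(values, a):
    """Assumed-mean method: A + (sum of deviations from A) / N."""
    deviations = [x - a for x in values]
    return a + sum(deviations) / len(values)


def discrete_series_mean(values, frequencies):
    """Mean of a discrete series: sum(f * X) / sum(f)."""
    if len(values) != len(frequencies):
        raise ValueError("values and frequencies must have the same length")
    total_frequency = sum(frequencies)
    return sum(f * x for x, f in zip(values, frequencies)) / total_frequency


def class_midpoints(class_limits):
    """Mid-point of each (lower, upper) class, used as X for a continuous series."""
    return [(lower + upper) / 2 for lower, upper in class_limits]


marks = [40, 50, 55, 78, 58]                # marks from the direct-method example
print(round(assumed_mean(marks, 50), 2))    # 56.2, same as the direct method
```

For a continuous series, the mid-points feed straight into the discrete formula: with the hypothetical classes `[(0, 10), (10, 20), (20, 30)]` and frequencies `[2, 5, 3]`, `discrete_series_mean(class_midpoints(...), [2, 5, 3])` works on the mid-points 5, 15, and 25.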
Properties of Arithmetic Mean
- The sum of deviations from the mean is always zero: Σ(X - X̄) = 0, where X̄ is the arithmetic mean.
- The mean is sensitive to extreme values, which can significantly affect its value.
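Both properties can be checked numerically. The sketch below reuses the five marks from the direct-method example; the extreme value 500 is a hypothetical observation added only to show the effect of an outlier.

```python
values = [40, 50, 55, 78, 58]
mean = sum(values) / len(values)

# Property 1: deviations from the mean cancel out.
# Floating-point arithmetic may leave a tiny residue, so compare against a tolerance.
deviation_sum = sum(x - mean for x in values)
print(abs(deviation_sum) < 1e-9)  # True

# Property 2: one extreme value (hypothetical) pulls the mean sharply upward.
with_outlier = values + [500]
mean_with_outlier = sum(with_outlier) / len(with_outlier)
print(round(mean, 2))               # 56.2
print(round(mean_with_outlier, 2))  # 130.17
```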
Weighted Arithmetic Mean
- When different items have different importance, weights can be assigned to calculate a weighted mean:
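- Formula: Weighted Mean = (W1 * P1 + W2 * P2) / (W1 + W2), where P1 and P2 are the item values (for example, prices) and W1 and W2 are their weights. With more than two items, this generalizes to Σ(WX) / ΣW.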
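A minimal Python sketch of the weighted mean follows. The prices and weights are hypothetical figures chosen for easy arithmetic, and `weighted_mean` is an illustrative name.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(W * X) / sum(W)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical example: two prices (P1, P2) with weights (W1, W2).
prices = [80, 20]
weights = [3, 1]
print(weighted_mean(prices, weights))  # 65.0
print(sum(prices) / len(prices))       # 50.0 (unweighted mean, for comparison)
```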
Median
Median is the positional value that divides a data distribution into two equal parts.
It separates values into:
- Values greater than or equal to the median
- Values less than or equal to the median
- The median is the middle element when the data set is ordered by magnitude.
- It remains unaffected by extreme values (outliers).
Computation of Median:
- Sort the data from smallest to largest.
- Find the middle value.
Example 1: For the data set 5, 7, 6, 1, 8, 10, 12, 4, and 3:
- Sorted order: 1, 3, 4, 5, 6, 7, 8, 10, 12
- The median (middle score) is 6.
If the number of values is even, the median is the arithmetic mean of the two middle values.
Example 2: For marks of 20 students:
- Data: 25, 72, 28, 65, 29, 60, 30, 54, 32, 53, 33, 52, 35, 51, 42, 48, 45, 47, 46, 33
- Sorted order: 25, 28, 29, 30, 32, 33, 33, 35, 42, 45, 46, 47, 48, 51, 52, 53, 54, 60, 65, 72
- Two middle values: 45 and 46
- Median = (45 + 46) / 2 = 45.5 marks
To find the position of the median in an ordered data set, use the formula:
- Position of median = (N + 1) / 2
- Where N = number of items.
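The sort-and-pick procedure, including the even-count rule, can be sketched as follows. The function is a plain illustration applied to the data of Examples 1 and 2.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if N is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one observation is required")
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]                        # the (N + 1) / 2 th item
    return (ordered[middle - 1] + ordered[middle]) / 2


example_1 = [5, 7, 6, 1, 8, 10, 12, 4, 3]
example_2 = [25, 72, 28, 65, 29, 60, 30, 54, 32, 53,
             33, 52, 35, 51, 42, 48, 45, 47, 46, 33]

print(median(example_1))  # 6
print(median(example_2))  # 45.5
```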
Discrete Series: For discrete data, locate the median using cumulative frequency.
Example: Income data:
- Income (in Rs): 10, 20, 30, 40
- Number of persons: 2, 4, 10, 4
- Cumulative frequency table:
10: 2
20: 6
30: 16
40: 20
- Median position: (20 + 1) / 2 = 10.5
- Median income is Rs 30, the first income whose cumulative frequency (16) reaches the 10.5th item.
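The cumulative-frequency lookup for a discrete series can be sketched as below. The helper name is illustrative, and it assumes the values are already listed in ascending order, as in the income example.

```python
from itertools import accumulate


def discrete_series_median(values, frequencies):
    """Return the first value whose cumulative frequency reaches position (N + 1) / 2.

    Assumes `values` are in ascending order.
    """
    cumulative = list(accumulate(frequencies))
    position = (cumulative[-1] + 1) / 2
    for value, cf in zip(values, cumulative):
        if cf >= position:
            return value


incomes = [10, 20, 30, 40]  # income in Rs
persons = [2, 4, 10, 4]     # number of persons

print(list(accumulate(persons)))                 # [2, 6, 16, 20]
print(discrete_series_median(incomes, persons))  # 30
```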
Continuous Series: Locate the median class, which is the class in which the (N/2)th item lies.
Use the formula:
- Median = L + [(N/2 - cf) / f] * h
- Where:
L = lower limit of the median class
cf = cumulative frequency of the class before the median class
f = frequency of the median class
h = class interval size
Example: Daily wages of factory workers:
- Daily wages (in Rs): 55-60, 50-55, 45-50, 40-45, 35-40, 30-35, 25-30, 20-25
- Number of workers: 7, 13, 15, 20, 30, 33, 28, 14
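- Frequency and cumulative frequency table (classes in ascending order, N = 160):
20-25: 14 (cumulative 14)
25-30: 28 (cumulative 42)
30-35: 33 (cumulative 75)
35-40: 30 (cumulative 105)
40-45: 20 (cumulative 125)
45-50: 15 (cumulative 140)
50-55: 13 (cumulative 153)
55-60: 7 (cumulative 160)
- Median class is 35-40, because the N/2 = 80th item lies in it (the cumulative frequency rises from 75 to 105).
- Median daily wage = 35 + [(80 - 75) / 30] * 5 = Rs 35.83.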
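The interpolation formula can be verified on the daily-wage data with a short Python sketch. The function name is illustrative, and the code assumes equal-width classes listed in ascending order.

```python
from itertools import accumulate


def continuous_series_median(lower_limits, frequencies, width):
    """Median = L + [(N/2 - cf) / f] * h for equal-width classes in ascending order."""
    cumulative = list(accumulate(frequencies))
    half = cumulative[-1] / 2
    for i, cf_here in enumerate(cumulative):
        if cf_here >= half:                                # median class found
            cf_before = cumulative[i - 1] if i > 0 else 0  # cf of the preceding class
            return lower_limits[i] + (half - cf_before) / frequencies[i] * width


lower_limits = [20, 25, 30, 35, 40, 45, 50, 55]  # lower class limits in Rs
workers = [14, 28, 33, 30, 20, 15, 13, 7]        # frequency of each class

print(list(accumulate(workers)))
# [14, 42, 75, 105, 125, 140, 153, 160]
print(round(continuous_series_median(lower_limits, workers, 5), 2))  # 35.83
```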
Quartiles
Divide the data into four equal parts.
- Q1 (first quartile): 25% of items below it
- Q2 (second quartile/median): 50% of items below it
- Q3 (third quartile): 75% of items below it
Percentiles
Percentiles divide the data into a hundred equal parts.
- P50 is the median value.
- Example: Securing the 82nd percentile means you scored better than 82% of candidates.
Calculation of Quartiles: Similar method as for median.
- Q1 = size of the [(N + 1) / 4]th item
- Q3 = size of the [3(N + 1) / 4]th item
Example 5: Marks of ten students:
- Marks: 22, 26, 14, 30, 18, 11, 35, 41, 12, 32
- Sorted: 11, 12, 14, 18, 22, 26, 30, 32, 35, 41
- Q1 = (10 + 1) / 4 = 2.75th item = 2nd item + 0.75 * (3rd item - 2nd item) = 12 + 0.75 * (14 - 12) = 13.5 marks.
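The positional rule with linear interpolation can be sketched in Python as follows. The helper names are illustrative, and the code assumes the computed positions fall inside the data, which holds for this example. The Q3 value is not stated in the text; it is obtained here by applying the same rule.

```python
def item_at_position(ordered, position):
    """Value at a 1-based, possibly fractional position, using linear interpolation."""
    whole = int(position)
    fraction = position - whole
    lower = ordered[whole - 1]
    if fraction == 0:
        return lower
    upper = ordered[whole]
    return lower + fraction * (upper - lower)


def quartiles(values):
    """Q1 and Q3 as the (N + 1) / 4 th and 3(N + 1) / 4 th items of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = item_at_position(ordered, (n + 1) / 4)
    q3 = item_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3


marks = [22, 26, 14, 30, 18, 11, 35, 41, 12, 32]
q1, q3 = quartiles(marks)
print(q1)  # 13.5
print(q3)  # 32.75
```

Library routines often use other interpolation conventions, so their quartiles may differ slightly from this textbook rule.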
Mode
Mode is a statistical measure that represents the most frequently occurring value in a data set. It is particularly useful when you want to identify the typical value around which the maximum concentration of items occurs.
- The term "mode" derives from the French word la Mode, meaning the most fashionable or popular value in a distribution.
- Mode is denoted by Mo and is defined as the value that appears most often in the data.
Computation of Mode:
- For a discrete series, the mode is the number that appears most frequently. For example, in the data set 1, 2, 3, 4, 4, 5, the mode is 4 because it appears twice, more than any other number.
- In a frequency distribution, for instance:
Variable: 10, 20, 30, 40, 50
Frequency: 2, 8, 20, 10, 5
The mode is 30, since it has the highest frequency (20).
- A data set can be:
- Unimodal: One mode (e.g., the example above).
- Bimodal: Two modes.
- Multimodal: More than two modes.
- No mode: If all values occur with equal frequency (e.g., 1, 1, 2, 2, 3, 3, 4, 4 has no mode).
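The cases above can be told apart by counting occurrences. The sketch below uses `collections.Counter`; the function name `modes` and the bimodal sample are illustrative, and returning an empty list for "no mode" is a convention chosen for this example.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] when every value is equally frequent."""
    counts = Counter(values)
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []  # no value stands out, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 3, 4, 4, 5]))        # [4]     unimodal
print(modes([1, 1, 2, 3, 3]))           # [1, 3]  bimodal (hypothetical data)
print(modes([1, 1, 2, 2, 3, 3, 4, 4]))  # []      no mode

# For a frequency distribution, the mode is the value with the highest frequency.
variable = [10, 20, 30, 40, 50]
frequency = [2, 8, 20, 10, 5]
print(variable[frequency.index(max(frequency))])  # 30
```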
Continuous Series:
- In continuous frequency distributions, the modal class is the class with the highest frequency.
- Mode can be calculated using the following formula:
Mo = L + (D1 / (D1 + D2)) * h
Where:
L = lower limit of the modal class
D1 = frequency of the modal class - frequency of the class before it
D2 = frequency of the modal class - frequency of the class after it
h = width of the class interval
- For example, to calculate the modal income from a cumulative frequency table:
- Convert the cumulative frequency to an exclusive frequency distribution.
- Identify the modal class (where the frequency is highest).
- Using the formula, calculate the mode.
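These steps can be sketched in Python. The text gives no income table for this example, so the sketch reuses the daily-wage classes from the median section as stand-in data, entered as "less than" cumulative frequencies. The function names are illustrative, and the code assumes equal-width classes and a single modal class.

```python
def frequencies_from_cumulative(cumulative):
    """Recover class frequencies from 'less than' cumulative frequencies."""
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]


def continuous_series_mode(lower_limits, frequencies, width):
    """Mo = L + (D1 / (D1 + D2)) * h for the class with the highest frequency."""
    i = frequencies.index(max(frequencies))  # modal class
    before = frequencies[i - 1] if i > 0 else 0
    after = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    d1 = frequencies[i] - before
    d2 = frequencies[i] - after
    return lower_limits[i] + d1 / (d1 + d2) * width


lower_limits = [20, 25, 30, 35, 40, 45, 50, 55]
cumulative = [14, 42, 75, 105, 125, 140, 153, 160]  # stand-in data (daily wages)

frequencies = frequencies_from_cumulative(cumulative)
print(frequencies)  # [14, 28, 33, 30, 20, 15, 13, 7]
print(continuous_series_mode(lower_limits, frequencies, 5))  # 33.125
```

Here the modal class is 30-35 (frequency 33), so D1 = 33 - 28 = 5 and D2 = 33 - 30 = 3.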
Relative Position of Averages
The relationship between the three measures of central tendency is as follows:
If Me is the arithmetic mean, Mi is the median, and Mo is the mode, then:
- Me > Mi > Mo when the distribution is positively skewed (long tail to the right), and Me < Mi < Mo when it is negatively skewed (long tail to the left).
- In either case, the median lies between the arithmetic mean and the mode.
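The ordering can be illustrated with Python's standard `statistics` module. Both data sets below are hypothetical, built so that one has a long right tail and the other is its mirror image.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: a few large values stretch the upper tail.
right_skewed = [1, 2, 2, 2, 3, 4, 5, 9, 17]
# Mirror image of the same data, giving a long lower tail instead.
left_skewed = [18 - x for x in right_skewed]

me_r, mi_r, mo_r = mean(right_skewed), median(right_skewed), mode(right_skewed)
me_l, mi_l, mo_l = mean(left_skewed), median(left_skewed), mode(left_skewed)

print(me_r > mi_r > mo_r)  # True: Me > Mi > Mo
print(me_l < mi_l < mo_l)  # True: Me < Mi < Mo
```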
Conclusion
- Measures of central tendency summarize data by providing a single representative value.
- Arithmetic mean is the most commonly used average, easy to calculate, but can be affected by extreme values.
- Median is more robust in the presence of outliers.
- Mode is useful for qualitative data and can be easily computed graphically.
- Choosing the appropriate average depends on the purpose of analysis and the nature of the data distribution.