Important Formulas: Statistics - Quantitative Reasoning for GMAT

Statistics is the branch of mathematics concerned with collecting, analysing, interpreting, presenting and organising data. In statistical theory, a statistic is a function of a sample whose definition does not depend on the sample's underlying distribution: it can be computed from the observed data alone, without knowing any unknown parameters.

The important statistics formulas are listed below:

Mean

  • Definition: The arithmetic mean (or average) of a set of numbers is the sum of the numbers divided by the count of numbers. The mean is a measure of the central tendency of the data.
  • Formula:
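    Population mean: μ = (Σ x_i) ÷ N
    Sample mean: x̄ = (Σ x_i) ÷ n
    Grouped data: Mean = (Σ f_i m_i) ÷ (Σ f_i), where m_i is the mid-point and f_i the frequency of class i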
  • Remarks: Use the population mean formula when the data represent an entire population. Use the sample mean as an estimator when the data are a sample from a larger population. For grouped data, multiply each class mid-point by its class frequency, add the products, and divide by the total frequency.

Example (simple data):

Find the mean of the numbers 2, 4, 7 and 9.

Sum of the numbers.

2 + 4 + 7 + 9 = 22

Number of observations.

4

Mean = Sum ÷ Number of observations.

Mean = 22 ÷ 4 = 5.5
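
The calculation above can be checked with a short Python sketch. The `grouped_mean` helper and the grouped data (classes 0-10, 10-20, 20-30 with frequencies 2, 5, 3) are illustrative assumptions, not part of the original notes.

```python
from statistics import mean

def grouped_mean(midpoints, frequencies):
    """Mean of grouped data: sum of (frequency * mid-point) / total frequency."""
    total_frequency = sum(frequencies)
    if total_frequency == 0:
        raise ValueError("total frequency must be positive")
    weighted_sum = sum(f * m for f, m in zip(frequencies, midpoints))
    return weighted_sum / total_frequency

# Simple data from the worked example: (2 + 4 + 7 + 9) / 4
simple_mean = mean([2, 4, 7, 9])
print(simple_mean)  # 5.5

# Hypothetical grouped data: mid-points 5, 15, 25 with frequencies 2, 5, 3
grouped = grouped_mean([5, 15, 25], [2, 5, 3])
print(grouped)  # 16.0
```

The grouped result is only an approximation of the true mean, since every observation in a class is treated as if it sat at the class mid-point.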

Median

  • Definition: In a sorted list (ascending or descending), the median is the middle value that divides the data into two equal halves. The median is often more representative than the mean for skewed distributions.
  • Formulas:
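    Odd n: Median = value at position (n + 1) ÷ 2 in the sorted list
    Even n: Median = [value at position n ÷ 2 + value at position (n ÷ 2) + 1] ÷ 2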
  • Remarks: For an odd number of observations the median is the middle value. For an even number of observations the median is the average of the two middle values. For grouped data, compute the median by linear interpolation inside the median class.

Example (odd n):

Find the median of 3, 8, 11, 14, 20.

Number of observations = 5, which is odd.

Median is the 3rd value (middle value) in the sorted list.

Median = 11

Example (even n):

Find the median of 4, 6, 9, 13.

Number of observations = 4, which is even.

Median = average of 2nd and 3rd values = (6 + 9) ÷ 2

Median = 15 ÷ 2 = 7.5
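
As a rough sketch of the odd and even cases, the following Python function sorts the data and picks the middle position. The name `median_by_position` is an assumption made for illustration; the standard library's `statistics.median` is used as a cross-check.

```python
from statistics import median

def median_by_position(values):
    """Median via the middle position(s) of the sorted list."""
    if not values:
        raise ValueError("median of empty data is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the single middle value (0-based index n // 2)
        return ordered[mid]
    # Even n: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_by_position([3, 8, 11, 14, 20]))  # 11
print(median_by_position([4, 6, 9, 13]))       # 7.5
print(median([4, 6, 9, 13]))                   # 7.5
```

Sorting first matters: applying the position rule to unsorted data gives a wrong answer.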

Mode

  • Definition: The mode of a data set is the value that occurs most frequently. A distribution may be unimodal, bimodal, or multimodal depending on the number of modes.
  • Grouped data: For frequency distributions, the mode lies inside the modal class (the class with the highest frequency). A commonly used formula for the mode in grouped data is the linear interpolation formula shown below.
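    Mode = L + [(f1 - f0) ÷ (2f1 - f0 - f2)] × h, where L is the lower limit of the modal class, f1 is the frequency of the modal class, f0 and f2 are the frequencies of the preceding and following classes, and h is the class width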
  • Remarks: In discrete ungrouped data, the mode is the value with the maximum count. For grouped data, use the modal-class formula with the preceding and following class frequencies and the class width.

Example (ungrouped):

Find the mode of 2, 5, 2, 9, 5, 2.

Frequency of 2 is 3, frequency of 5 is 2, frequency of 9 is 1.

Mode = 2 (most frequent value)

Example (grouped - illustration of method):

Identify the modal class and apply the grouped mode formula using the modal class lower limit, frequency of modal class, frequencies of neighbouring classes and class width.
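
A minimal Python sketch of both cases follows. The ungrouped data come from the example above; the grouped data (classes 0-10, 10-20, 20-30 with frequencies 4, 10, 6) and the helper name `grouped_mode` are hypothetical, chosen only to illustrate the interpolation formula Mode = L + [(f1 - f0) ÷ (2f1 - f0 - f2)] × h.

```python
from statistics import multimode

def grouped_mode(lower_limit, f_modal, f_prev, f_next, class_width):
    """Interpolated mode inside the modal class.

    lower_limit: lower limit L of the modal class
    f_modal:     frequency f1 of the modal class
    f_prev:      frequency f0 of the preceding class
    f_next:      frequency f2 of the following class
    class_width: class width h
    """
    denominator = 2 * f_modal - f_prev - f_next
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 equals 0")
    return lower_limit + (f_modal - f_prev) * class_width / denominator

# Ungrouped data: multimode returns every value tied for the highest count
modes = multimode([2, 5, 2, 9, 5, 2])
print(modes)  # [2]

# Hypothetical grouped data: modal class 10-20 with f1 = 10, f0 = 4, f2 = 6
estimate = grouped_mode(10, 10, 4, 6, 10)
print(estimate)  # 16.0
```

Here the estimate lands above the class mid-point of 15 because the following class (frequency 6) is heavier than the preceding one (frequency 4).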

Standard Deviation

  • Definition: Standard deviation measures the dispersion or spread of values about the mean. It is the square root of the variance and has the same units as the data.
  • Population and sample: Population standard deviation uses denominator N (population size); sample standard deviation uses denominator (n - 1) to correct bias (Bessel's correction).
  • Formula:
    Population: σ = √[ Σ (x_i - μ)² ÷ N ]
    Sample: s = √[ Σ (x_i - x̄)² ÷ (n - 1) ]
  • Remarks: Taking the square root returns the measure to the same units as the original data. For large samples the difference between dividing by n and by n - 1 is small, but for small samples use n - 1 so that the sample variance is an unbiased estimator of the population variance.

Example (sample standard deviation):

Data: 3, 7, 7, 19.

Compute the sample mean.

Mean = (3 + 7 + 7 + 19) ÷ 4 = 36 ÷ 4 = 9

Compute squared deviations from the mean and sum them.

(3 - 9)² + (7 - 9)² + (7 - 9)² + (19 - 9)² = 36 + 4 + 4 + 100 = 144

Sample variance = Sum of squared deviations ÷ (n - 1).

Sample variance = 144 ÷ 3 = 48

Sample standard deviation = √48 = 4√3 ≈ 6.928
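
The steps of this example can be reproduced in Python. This is a sketch that follows the worked calculation step by step; the variable names are illustrative, and `statistics.stdev` (which divides by n - 1) serves as a cross-check.

```python
import math
from statistics import stdev

data = [3, 7, 7, 19]
n = len(data)

# Step 1: sample mean
sample_mean = sum(data) / n                                  # 9.0

# Step 2: sum of squared deviations from the mean
sum_sq_dev = sum((x - sample_mean) ** 2 for x in data)       # 144.0

# Step 3: divide by n - 1 (Bessel's correction) for the sample variance
sample_variance = sum_sq_dev / (n - 1)                       # 48.0

# Step 4: square root gives the sample standard deviation
sample_sd = math.sqrt(sample_variance)                       # about 6.928

# For comparison: treating the same data as a whole population (divide by N)
population_sd = math.sqrt(sum_sq_dev / n)                    # 6.0

print(sample_variance, round(sample_sd, 3), population_sd)
print(math.isclose(sample_sd, stdev(data)))                  # True
```

With only four observations, the choice of denominator changes the result noticeably (about 6.928 versus 6.0), which is why the population/sample distinction matters for small data sets.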

Variance

  • Definition: Variance is the expectation of the squared deviation of a random variable from its mean. It quantifies dispersion by averaging squared distances from the mean.
  • Formula (population):
    σ² = Σ (x_i - μ)² ÷ N
  • Alternative (computational) formula: For population variance, Var(X) = (Σx² ÷ N) - (mean)². This formula is often useful for manual calculation when Σx and Σx² are known.
  • Relation: Standard deviation = √(variance).

Example (population variance using computational formula):

Data (population): 2, 4, 6.

Compute mean.

Mean = (2 + 4 + 6) ÷ 3 = 12 ÷ 3 = 4

Compute Σx².

Σx² = 2² + 4² + 6² = 4 + 16 + 36 = 56

Population variance = (Σx² ÷ N) - (mean)².

Population variance = (56 ÷ 3) - 4² = 18.666... - 16 = 2.666... = 8/3

Population standard deviation = √(8/3) ≈ 1.633
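
To confirm that the computational formula agrees with the definition, the sketch below evaluates both with exact fractions. Using `fractions.Fraction` is a choice made here to avoid rounding; the variable names are illustrative.

```python
import math
from fractions import Fraction
from statistics import pvariance

data = [2, 4, 6]
n = len(data)

mean_value = Fraction(sum(data), n)                        # 4
mean_of_squares = Fraction(sum(x * x for x in data), n)    # 56/3

# Computational formula: (sum of x^2 / N) - mean^2
var_computational = mean_of_squares - mean_value ** 2

# Definition: average squared deviation from the mean
var_definition = sum((x - mean_value) ** 2 for x in data) / n

print(var_computational)   # 8/3
print(var_definition)      # 8/3
print(round(math.sqrt(var_computational), 3))  # 1.633

# Cross-check against the standard library (divides by N)
print(math.isclose(pvariance(data), 8 / 3))    # True
```

Both routes give exactly 8/3. With floating-point data the computational formula can lose precision when the mean is large relative to the spread, so the definition form is the safer default in code.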

Other useful formulas and measures

  • Weighted mean: When observations have different weights, weighted mean = (Σ w_i x_i) ÷ (Σ w_i). Use for averages where items contribute unequally.
  • Geometric mean: For n positive numbers, geometric mean = (Π x_i)^(1/n). Useful for growth rates and multiplicative processes.
  • Harmonic mean: For positive numbers, harmonic mean = n ÷ (Σ 1/x_i). Useful when averaging rates or ratios.
  • Coefficient of variation (CV): CV = (Standard deviation ÷ Mean) × 100%. Use CV to compare relative variability between data sets with different units or means.
  • Percentiles and quartiles: The p-th percentile divides data so that p% of observations are at or below that value. The 25th, 50th and 75th percentiles are the first quartile (Q1), median (Q2) and third quartile (Q3) respectively.
  • Mean of combined groups: For two groups with means μ1, μ2 and sizes n1, n2, combined mean = (n1μ1 + n2μ2) ÷ (n1 + n2).
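
The formulas in this list translate directly into short Python functions. This is a sketch: the function names and all sample inputs are assumptions for illustration, and the CV here uses the sample standard deviation (n - 1 denominator).

```python
import math
from statistics import mean, stdev

def weighted_mean(values, weights):
    """(sum of w_i * x_i) / (sum of w_i)."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

def geometric_mean(values):
    """n-th root of the product; all values must be positive."""
    return math.prod(values) ** (1 / len(values))

def harmonic_mean(values):
    """n / (sum of 1/x_i); all values must be positive."""
    return len(values) / sum(1 / x for x in values)

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

def combined_mean(n1, mean1, n2, mean2):
    """Mean of two groups pooled together."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

# Hypothetical inputs
print(weighted_mean([80, 90], [1, 3]))                     # 87.5
print(geometric_mean([2, 8]))                              # 4.0
print(round(harmonic_mean([40, 60]), 6))                   # 48.0
print(round(coefficient_of_variation([3, 7, 7, 19]), 2))   # 76.98
print(combined_mean(10, 50, 30, 70))                       # 65.0
```

The harmonic mean case mirrors a common rate question: covering the same distance at 40 and then at 60 units of speed gives an average speed of 48, not 50.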

Practical notes and interpretation

  • Choice of measure: Use mean for symmetric distributions without outliers, median for skewed distributions or when outliers are present, and mode when the most frequent value is of interest.
  • Units: Mean and median retain the units of the data. Variance has squared units; standard deviation returns to original units.
  • Comparisons: Use coefficient of variation to compare spread across different datasets. Use quartiles and interquartile range (IQR = Q3 - Q1) to measure spread robustly against outliers.
  • Grouped data caution: All grouped-data formulas rely on class mid-points or linear interpolation; precision depends on class width and distribution within classes.
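
The robustness of the IQR can be illustrated with a small Python sketch. The two data sets are hypothetical, and quartile conventions vary between textbooks and software; this example assumes the "inclusive" method of `statistics.quantiles`, so other conventions may give slightly different quartiles.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range Q3 - Q1 using the inclusive quartile method."""
    q1, _q2, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

def value_range(values):
    """Plain range: maximum minus minimum."""
    return max(values) - min(values)

clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]   # 9 replaced by an extreme value

print(quantiles(clean, n=4, method="inclusive"))   # [3.0, 5.0, 7.0]
print(iqr(clean), value_range(clean))              # 4.0 8
print(iqr(with_outlier), value_range(with_outlier))  # 4.0 89
```

The single extreme value stretches the range from 8 to 89 while the IQR stays at 4.0, because the quartiles depend only on the middle of the sorted data.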

Summary: The primary measures of central tendency are mean, median and mode. Measures of dispersion include variance and standard deviation. Use the appropriate formula for population or sample data, apply grouped-data formulas when necessary, and choose the measure that best represents the data context and the decision task at hand.

FAQs on Important Formulas: Statistics - Quantitative Reasoning for GMAT

1. What are some important formulas in probability and statistics?
Ans. For equally likely outcomes, the probability of an event is P(A) = Number of favorable outcomes / Total number of outcomes. The mean (average) is μ = Σx / n, where Σx is the sum of all the values and n is the total number of values. The population variance is σ^2 = Σ(x - μ)^2 / n, where Σ(x - μ)^2 is the sum of the squared differences from the mean and n is the total number of values; for a sample, divide by n - 1 instead of n.
2. How do you calculate the probability of an event?
Ans. When all outcomes are equally likely, use the formula P(A) = Number of favorable outcomes / Total number of outcomes. The favorable outcomes are those that satisfy the event, and the total is the count of all possible outcomes. Dividing the first by the second gives the probability of the event.
3. What is the mean (average) and how is it calculated in statistics?
Ans. The mean, also known as the average, is a measure of central tendency in statistics. It represents the typical value in a set of data. To calculate the mean, you add up all the values in the data set and then divide the sum by the total number of values. The formula for calculating the mean is: μ = Σx / n, where Σx represents the sum of all the values and n is the total number of values. The mean provides a useful measure to understand the average value of a dataset.
4. What is variance and how is it calculated in statistics?
Ans. Variance measures the dispersion or spread of a dataset. It quantifies how much the values deviate from the mean. To calculate the population variance, subtract the mean from each value, square the result, sum the squared differences, and divide by the total number of values: σ^2 = Σ(x - μ)^2 / n, where Σ(x - μ)^2 is the sum of the squared differences from the mean and n is the total number of values. For a sample drawn from a larger population, divide by n - 1 instead of n.
5. How are probability and statistics related?
Ans. Probability and statistics are closely related fields. Probability deals with the likelihood of events occurring, while statistics involves the collection, analysis, interpretation, presentation, and organization of data. Probability theory provides a foundation for statistical analysis by quantifying uncertainty and randomness. Statistical methods, on the other hand, utilize probability concepts to make inferences, draw conclusions, and estimate parameters based on observed data. Probability and statistics are essential for understanding and predicting various phenomena in the real world, including in fields such as finance, medicine, and social sciences.