Descriptive Statistics-tabular, graphical and numerical methods | Management Optional Notes for UPSC

Introduction

The term "statistics" means different things to different people. To some, it is a single number that summarizes a dataset; to others, it refers to numerical measurements or counts. Mathematicians use a statistic to summarize data concisely, treating it as a summary of an event. For example, the number of observations, denoted 'n', is a statistic that gives the size of a dataset. Statistical knowledge also applies to daily life, helping individuals make decisions from the varied information available to them. In the behavioral sciences, however, "statistics" has a different role: it focuses primarily on drawing inferences about populations from the quantitative and qualitative information at hand.

  • The term "statistics" can be defined in two distinct manners. In its singular form, "Statistics" refers to what is commonly known as statistical methods, while in its plural form, it denotes "data."
  • In this unit, we use the term "statistics" in its singular sense: a branch of science concerned with the collection, classification, analysis, and interpretation of statistical data.

The discipline of statistics can be broadly categorized into two main branches:

i) Descriptive Statistics, and ii) Inferential Statistics

  • Descriptive Statistics: Most observations vary, particularly those related to human behavior. Attributes such as attitude, intelligence, and personality are widely acknowledged to differ among individuals. To define a group meaningfully, or to identify it from its observations or scores, those observations must be expressed accurately. Descriptive statistics is the branch of statistics that describes the data obtained, and through these descriptions specific population groups are defined by their characteristics. It encompasses classification, tabulation, diagrammatic and graphical presentation of data, and measures of central tendency and variability. These measures help researchers discern patterns in the data or scores and so describe the phenomena under study. Parameters of the distribution, which are single estimates summarizing the distribution of the data, are fundamental to describing the distribution comprehensively.

Essentially, descriptive statistics involves two primary operations:

(i) Organization of data, and (ii) Summarization of data

Organisation of Data

There are four primary statistical methods for organizing data:

1. Classification

  • Classification involves arranging data into groups based on similarities. It summarizes the frequency of individual scores or score ranges for a variable. In its simplest form, a distribution displays each value of a variable alongside the number of occurrences for each value.
  • Once data are collected, organizing them facilitates drawing conclusions and making informed decisions. A clearer understanding of data emerges when raw data are organized into a frequency distribution, which illustrates the number of cases falling within specific class intervals or score ranges.

Frequency Distribution with Ungrouped Data and Grouped Data:

  • Ungrouped Frequency Distribution: Ungrouped data can be represented by listing all score values and tallying the occurrences of each score.
  • Grouped Frequency Distribution: When there is a wide range of score values, making it challenging to visualize the data clearly, a grouped frequency distribution is constructed. This method organizes data into classes, showing the number of observations that fall within each class interval.

Construction of Frequency Distribution:

To prepare a frequency distribution, several factors must be determined:

  • Range of the given data, calculated as the difference between the highest and lowest scores.
  • Number of class intervals, typically between 5 and 30.
  • Size of each class interval, known as the class width and denoted by 'i'. Class intervals should be of uniform width, and the width should be a convenient number such as 2, 3, 5, 10, or 20.
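
The construction steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `grouped_frequency`, the sample scores, and the choice to start the first class at a multiple of the width are all assumptions made for the example, and the classes use inclusive limits for whole-number scores.

```python
def grouped_frequency(scores, width):
    """Count whole-number scores in inclusive class intervals of a given width."""
    low, high = min(scores), max(scores)
    data_range = high - low                  # highest score minus lowest score
    start = (low // width) * width           # assumed convention: start at a multiple of the width
    n_classes = (high - start) // width + 1  # enough classes to cover the highest score
    table = []
    for k in range(n_classes):
        lower = start + k * width
        upper = lower + width - 1            # inclusive upper limit
        freq = sum(lower <= s <= upper for s in scores)
        table.append((lower, upper, freq))
    return data_range, table

# Hypothetical raw scores, invented for illustration only.
scores = [12, 15, 17, 21, 22, 22, 25, 28, 31, 33, 34, 38, 40, 41, 47]
data_range, table = grouped_frequency(scores, width=5)

print("Range:", data_range)
for lower, upper, freq in table:
    print(f"{lower}-{upper}: {freq}")
```

With a width of 5 these scores fall into eight classes, which is within the 5 to 30 intervals suggested above.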

Methods for Describing Class Limits:

Three methods for describing class limits are:

  • Exclusive Method: Classes are formed such that the upper limit of one class becomes the lower limit of the next class, assuming that the upper limit of a class is exclusive.
  • Inclusive Method: Classes are formed without overlapping limits, including scores equal to the upper limit of each class. This method is preferred for whole number measurements.
  • True or Actual Class Method: Class limits are defined mathematically, extending 0.5 units below and above the score's face value on a continuum. These limits are referred to as true or actual class limits.
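
The difference between the three methods is easiest to see at a class boundary. The sketch below is illustrative only; the helper names and the example classes (10-20, 20-30, 10-19) are assumptions, and the unit of measurement is taken to be 1.

```python
def in_exclusive_class(score, lower, upper):
    """Exclusive method: the upper limit belongs to the next class."""
    return lower <= score < upper

def in_inclusive_class(score, lower, upper):
    """Inclusive method: a score equal to the upper limit stays in the class."""
    return lower <= score <= upper

def true_limits(lower, upper, unit=1):
    """True (actual) limits: extend half a unit below and above the stated limits."""
    half = unit / 2
    return lower - half, upper + half

# A score of 20 under the exclusive classes 10-20 and 20-30:
print(in_exclusive_class(20, 10, 20))  # False, 20 is excluded from 10-20
print(in_exclusive_class(20, 20, 30))  # True, 20 opens the next class

# A score of 19 under the inclusive class 10-19:
print(in_inclusive_class(19, 10, 19))  # True

# True limits of the inclusive class 10-19:
print(true_limits(10, 19))             # (9.5, 19.5)
```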

Types of Frequency Distributions

A frequency distribution can be organized in several ways, depending on the requirements of the statistical analysis or study. Three common types are:

  • Relative Frequency Distribution: A relative frequency distribution represents the proportion of the total number of cases observed at each score value or within score value intervals.
  • Cumulative Frequency Distribution: In some cases, investigators may want to determine the number of observations less than a specific value. This can be achieved by calculating the cumulative frequency, which sums up the frequencies for a particular class interval and all preceding intervals.
  • Cumulative Relative Frequency Distribution: A cumulative relative frequency distribution expresses the cumulative frequency of any score or class interval as a proportion of the total number of cases.
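
As a rough illustration, the three distributions can all be derived from one list of class frequencies. The frequencies below are hypothetical and the variable names are chosen for this sketch only.

```python
from itertools import accumulate

# Hypothetical frequencies for five consecutive class intervals.
freqs = [2, 5, 8, 4, 1]
total = sum(freqs)

# Relative frequency: each class as a proportion of all cases.
relative = [f / total for f in freqs]

# Cumulative frequency: running total of this class and all preceding classes.
cumulative = list(accumulate(freqs))

# Cumulative relative frequency: cumulative frequency as a proportion of all cases.
cum_relative = [c / total for c in cumulative]

for f, r, c, cr in zip(freqs, relative, cumulative, cum_relative):
    print(f"f={f:2d}  rel={r:.2f}  cum={c:2d}  cum_rel={cr:.2f}")
```

The last cumulative frequency always equals the total number of cases, so the last cumulative relative frequency is 1.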

2. Tabulation

Data can be presented in the form of a table or a graph, with tabulation being the process of organizing classified data into a table. Tabular presentation enhances the comprehensibility of data and makes it suitable for further statistical analysis. A table consists of several components:

  • Table Number: When multiple tables are included in an analysis, each should be assigned a unique number for reference and identification. The number is typically centered at the top of the table.
  • Title of the Table: Every table should have a clear and concise title that describes its content. The title is placed centrally at the top of the table or just below/after the table number.
  • Caption: Captions are concise headings for columns, which may include headings and sub-headings. They are positioned in the middle of the columns, providing clarity on the data categories such as gender, location, or socioeconomic status.
  • Stub: Stubs are brief headings for rows, providing context for the data presented in each row.
  • Body of the Table: The main section of the table contains the numerical data arranged according to the captions and stubs.
  • Head Note: This note, positioned at the extreme right below the title, explains the units of measurement used in the table.
  • Footnote: Footnotes are qualifying statements placed below the table, providing additional information or clarifications not covered in the title, caption, or stubs.
  • Source of Data: It is important to mention the source of the data used in the table, typically placed at the end of the table to provide credibility and transparency.

3. Graphical Representation of Data


  • The purpose of creating a frequency distribution is to offer a structured approach to interpreting data. To enhance this interpretation, the information from a frequency distribution is often depicted in graphical or diagrammatic formats. Graphical presentation of frequency distributions involves plotting frequencies on a visual platform formed by horizontal and vertical lines, known as a graph. 
  • A graph is constructed using two perpendicular lines called the X and Y-axes, with appropriate scales indicated. The horizontal line, known as the abscissa, represents one variable, while the vertical line, the ordinate, represents the corresponding frequencies. 

Various types of graphs are used to convey statistical information effectively, including histograms, frequency polygons, frequency curves, and cumulative frequency curves.

  • Histogram: This method is widely used for illustrating continuous frequency distributions graphically. In a histogram, each class interval's upper limit serves as the lower limit for the next interval. The histogram consists of a series of rectangles, with the width representing the class interval and the height indicating the corresponding frequency.
  • Frequency Polygon: To construct a frequency polygon, the abscissa is drawn from point 'O' to point 'X', and the ordinate from 'O' to 'Y'. The class intervals are marked on the abscissa by their exact limits or midpoints. Each frequency is then plotted above the midpoint of its class interval, at a height read from the ordinate, and the points are joined by straight lines to form the polygon.
  • Frequency Curve: A frequency curve is a smooth, freehand curve drawn through the points of a frequency polygon. Its purpose is to reduce random or erratic fluctuations present in the data, providing a clearer representation of the distribution.
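
The points of a frequency polygon can be computed before any drawing is done. The sketch below assumes equal-width classes and that each frequency is plotted at the class midpoint. Closing the polygon with a zero-frequency point one class width beyond each end is a common convention assumed here, not something stated in these notes. The function name and data are hypothetical.

```python
def polygon_points(classes, freqs):
    """Return (midpoint, frequency) pairs, closed by zero-frequency end points."""
    width = classes[0][1] - classes[0][0]          # assumes all classes share this width
    mids = [(lo + hi) / 2 for lo, hi in classes]   # x-coordinate of each plotted point
    points = list(zip(mids, freqs))
    # Assumed convention: anchor the polygon to the axis at both ends.
    return [(mids[0] - width, 0)] + points + [(mids[-1] + width, 0)]

# Hypothetical exclusive-method classes and their frequencies.
classes = [(10, 20), (20, 30), (30, 40)]
freqs = [3, 5, 4]

points = polygon_points(classes, freqs)
print(points)  # [(5.0, 0), (15.0, 3), (25.0, 5), (35.0, 4), (45.0, 0)]
```

Joining these points with straight lines gives the polygon; smoothing the same points freehand gives the frequency curve.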

Cumulative Frequency Curve or Ogive

The graph representing a cumulative frequency distribution is called a cumulative frequency curve or ogive. There are two types of ogives based on the type of cumulative frequencies:

  • 'Less Than' Ogive: In this type, the cumulative frequencies less than each class boundary are plotted against the upper class boundaries. It is an increasing curve sloping upwards from left to right.
  • 'More Than' Ogive: Here, the cumulative frequencies greater than each class boundary are plotted against the lower class boundaries. It is a decreasing curve sloping downwards from left to right.
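
The coordinates for both ogives can be sketched from the same frequency table. The class boundaries and frequencies below are hypothetical. Here 'more than' is read as 'at or above the lower boundary', which is an assumption about how ties at a boundary are counted.

```python
from itertools import accumulate

# Hypothetical class boundaries (four classes) and their frequencies.
boundaries = [10, 20, 30, 40, 50]
freqs = [3, 5, 4, 3]
total = sum(freqs)

# 'Less than' ogive: cumulative frequency plotted against each upper boundary.
less_than = list(zip(boundaries[1:], accumulate(freqs)))

# 'More than' ogive: cases at or above each lower boundary,
# i.e. the total minus everything in the preceding classes.
preceding = accumulate([0] + freqs[:-1])
more_than = [(b, total - p) for b, p in zip(boundaries[:-1], preceding)]

print(less_than)  # [(20, 3), (30, 8), (40, 12), (50, 15)]
print(more_than)  # [(10, 15), (20, 12), (30, 7), (40, 3)]
```

The first list rises from left to right and the second falls, matching the shapes described above.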

4. Diagrammatic Representation of Data

A diagram serves as a visual tool for presenting statistical data in a simple and easily understandable manner. Diagrammatic presentation is solely focused on visually representing the data, whereas graphic presentation can be utilized for further analytical purposes. Various forms of diagrams include:

  • Bar Diagram: This type of diagram is particularly useful for representing categorical data. Each bar corresponds to a category, with the variable displayed on the horizontal axis and the frequency on the vertical axis. The height of each bar represents the frequency or value of the variable.
  • Subdivided Bar Diagram: Subdivided bar diagrams are employed for studying sub-classifications within a dataset. Each bar is divided and shaded according to the sub-categories of the data. The proportion of each sub-class is reflected by the portion of the bar it occupies.
  • Multiple Bar Diagram: Multiple bar diagrams are used to compare two or more sets of related phenomena or variables. Within each group, the bars for the different sets are drawn side by side without gaps, and different colors or shades are used to distinguish them.
  • Pie Diagram: Also known as an angular diagram, a pie chart consists of a circle divided into sectors corresponding to the frequencies of variables in the distribution. Each sector's size is proportional to the frequency of the variable it represents. The circle, representing 360 degrees, is divided proportionally based on percentages. After calculating the angles for each component, segments are drawn in the circle, with different colors or shades used to distinguish between them.
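
The angle calculation for a pie diagram is simple proportional arithmetic: each sector gets the share of 360 degrees that matches its share of the total. A minimal sketch, using invented category labels and counts:

```python
def pie_angles(freqs):
    """Map each category to its sector angle in degrees."""
    total = sum(freqs.values())
    return {name: 360 * f / total for name, f in freqs.items()}

# Hypothetical category counts, for illustration only.
counts = {"A": 10, "B": 20, "C": 30, "D": 60}
angles = pie_angles(counts)

for name, angle in angles.items():
    print(f"{name}: {angle:.0f} degrees")  # A: 30, B: 60, C: 90, D: 180
```

The angles always sum to 360, which is a quick check before drawing the sectors.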

Summary of Data

In the preceding section, we discussed the tabulation and graphical representation of data. However, in research, merely tabulating data may not suffice, especially when comparing multiple series of the same type to identify trends in variables. For such comparisons, it becomes necessary to delve deeper into the characteristics of the data, which is achieved through summary statistics. The frequency distribution of collected data can differ in terms of measures of central tendency and the extent of spread around the central value. These differences constitute the components of summary statistics.

Measures of Central Tendency

Central tendency refers to the middle point of a distribution, where values tend to cluster around a central value. Measures of central tendency aim to capture this tendency accurately. A good measure of central tendency should be clearly defined, easy to comprehend, based on all observations, and resistant to fluctuations in sampling. The three most commonly used measures of central tendency are:

  • Arithmetic Mean: This is the average obtained by dividing the sum of all values by the total number of values. It is widely used and useful for further statistical analysis.
  • Median: The median is the middle value of a dataset arranged in order of magnitude, dividing it into two equal parts. It is not influenced by extreme values.
  • Mode: The mode is the value in a distribution with the highest frequency, representing the most typical value.
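
The three measures can be computed with Python's standard `statistics` module. The scores below are hypothetical and include one extreme value (40) to show, as a sketch, how the mean shifts while the median barely moves.

```python
import statistics

# Hypothetical scores; 40 is a deliberately extreme value.
scores = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(scores)      # sum of values divided by their number: 63 / 7
median = statistics.median(scores)  # middle value of the ordered scores
mode = statistics.mode(scores)      # most frequent value

print(mean, median, mode)           # 9 4 3

# Dropping the extreme value pulls the mean down sharply; the median shifts only slightly.
trimmed = scores[:-1]
print(statistics.mean(trimmed))     # about 3.83
print(statistics.median(trimmed))   # 3.5
```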

Measures of Dispersion

Knowing only the central tendency of data is insufficient for a complete understanding. Measures of dispersion quantify the spread or variability of data. The most commonly used measures of dispersion include:

  • Range: This is the difference between the largest and smallest values in the distribution.
  • Average Deviation: It is the arithmetic mean of the absolute differences between each score and the mean. Absolute values are used because the signed deviations from the mean always sum to zero.
  • Standard Deviation: This is the most stable index of variability, calculated as the square root of the variance, which is the mean of the squared deviations from the mean. Standard deviation is less affected by sampling fluctuations compared to other measures of dispersion.
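
A short worked example may help compare the three measures. The scores are hypothetical. The variance follows the definition given above (the mean of the squared deviations), which corresponds to `statistics.pvariance` and `statistics.pstdev` in Python's standard library rather than the sample versions that divide by n - 1.

```python
import statistics

# Hypothetical scores chosen so the arithmetic is easy to follow.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mean = statistics.mean(scores)                      # 5

value_range = max(scores) - min(scores)             # 9 - 2 = 7

# Signed deviations cancel out, so average deviation uses absolute values.
signed_sum = sum(x - mean for x in scores)          # 0
avg_deviation = sum(abs(x - mean) for x in scores) / n   # 12 / 8 = 1.5

variance = sum((x - mean) ** 2 for x in scores) / n      # 32 / 8 = 4.0
std_dev = variance ** 0.5                                # 2.0

print(value_range, avg_deviation, variance, std_dev)

# Cross-check against the standard library's population formulas.
print(statistics.pvariance(scores), statistics.pstdev(scores))
```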

The document Descriptive Statistics-tabular, graphical and numerical methods | Management Optional Notes for UPSC is a part of the UPSC Course Management Optional Notes for UPSC.

FAQs on Descriptive Statistics-tabular, graphical and numerical methods - Management Optional Notes for UPSC

1. What is the importance of descriptive statistics in organizing data?
Descriptive statistics plays a crucial role in organizing data by providing a summary of the main features of a dataset. It helps in understanding the data distribution, central tendency, variability, and shape. By using tabular, graphical, and numerical methods, descriptive statistics allows us to visualize and analyze the data, making it easier to draw meaningful conclusions and make informed decisions.
2. What are the different methods of organizing data in descriptive statistics?
Descriptive statistics uses various methods to organize data. Tabular methods involve presenting data in tables, such as frequency tables, where the data is grouped into intervals or categories along with their corresponding frequencies. Graphical methods include creating charts and graphs, such as histograms, bar graphs, and pie charts, which visually represent the data distribution. Numerical methods involve calculating summary measures like measures of central tendency (mean, median, mode) and measures of variability (range, variance, standard deviation).
3. How can tabular methods be used to organize data effectively?
Tabular methods, such as frequency tables, are effective in organizing data as they provide a clear overview of the distribution of values or categories. By grouping the data into intervals or categories and displaying the corresponding frequencies, tabular methods help identify patterns, outliers, and gaps in the data. They also make it easier to compare different groups or variables within the dataset. Additionally, tabular methods can be used to calculate cumulative frequencies, relative frequencies, and percentages, providing further insights into the data.
4. What are the benefits of using graphical methods to organize data?
Graphical methods offer several benefits in organizing data. Firstly, they provide a visual representation of the data distribution, making it easier to understand patterns, trends, and outliers. Graphs and charts allow for quick comparisons between different groups or variables, facilitating data analysis and interpretation. Moreover, graphical methods enhance data presentation by making it more engaging and accessible to a wider audience. They also help in identifying any discrepancies or errors in the data by visually inspecting the graph or chart.
5. How do numerical methods contribute to the organization of data in descriptive statistics?
Numerical methods in descriptive statistics contribute to the organization of data by providing summary measures that describe the central tendency, variability, and shape of the data. Measures of central tendency, such as the mean, median, and mode, give insights into the typical or average value of the data. Measures of variability, such as the range, variance, and standard deviation, quantify the spread or dispersion of the data. These numerical measures help in summarizing the data in a concise and meaningful way, facilitating comparisons and decision-making.