
Collection of Data | Statistics for SSC CGL

Table of contents
Introduction
Sources of Data
Differences Between Primary and Secondary Data
Statistical Methods of Data Collection
Sampling Methods
What are Common Challenges in Data Collection?

Introduction

Data collection is the process of gathering information in an organized way in order to answer specific questions, test hypotheses, and evaluate outcomes. It is a key topic for SSC CGL Tier 2 aspirants and the foundation of any research process: the quality of the data collected directly affects the reliability and validity of the findings. This chapter covers the sources of data, the differences between primary and secondary data, the statistical methods used for data collection along with their merits and demerits, the main sampling methods, and common challenges in data collection.

Sources of Data

Data can be gathered from two main sources:

Primary Source of Data
- Definition: Primary data is original data collected directly from its source by the researcher for a specific research purpose.
- Example: Data collected through surveys, experiments, or direct observations conducted by the researcher.
Secondary Source of Data
- Definition: Secondary data is data that has already been collected by someone else and is available for use.
- Example: Data from books, journals, government reports, and databases created by other researchers or institutions.

Differences Between Primary and Secondary Data

Aspect	Primary Data	Secondary Data
Originality	Original and collected firsthand	Already exists and can be readily accessed
Specificity	Specific to the researcher’s current study and tailored to research needs	May not perfectly match the researcher’s specific needs and may require adjustment or additional context
Cost and Time	Expensive and time-consuming, requiring significant resources and effort	Generally less expensive and quicker compared to collecting primary data

Question for Collection of Data

Try yourself:

Which of the following best defines primary data?

A. Data collected by someone else and available for use.
B. Original data collected directly from its source by the researcher.
C. Data from books, journals, government reports, and databases.
D. Data collected through surveys, experiments, or direct observations conducted by the researcher.


Statistical Methods of Data Collection

1. Direct Personal Investigation

The investigator personally collects data from the respondents through direct interaction. Because the investigator is closely involved, the data tends to be accurate and reliable, although it remains open to the investigator's personal bias.

Merits:

Originality: Data collected is original and unique.
Reliability: Direct collection ensures high reliability and accuracy.
Accuracy: Detailed and precise information is obtained.
Detailed Information: In-depth data can be gathered.
Elasticity: The method is flexible and adaptable to various situations.

Demerits:

Coverage Limitation: Difficult to cover large or dispersed populations.
Cost: Often costly due to the resources required.
Personal Bias: Investigator’s biases can influence the data.
Limited Scope: May not cover all necessary areas due to time and resource constraints.

2. Indirect Oral Investigation

Data is collected not from the persons directly concerned but from knowledgeable third parties, such as witnesses or experts, who provide information based on their experience. The quality of the data therefore depends on how accurately these respondents can report.


Merits:

Wide Coverage: Can gather information from a broad area.
Rapid Collection: Quick way to collect data.
Cost-Effective: Generally less expensive than direct methods.
Bias-Free: Reduces the risk of investigator bias.

Demerits:

Accuracy Issues: Data may be less accurate due to reliance on second-hand information.
Bias in Responses: Respondents may have their own biases.
General Conclusions: Information may lack detail and specificity.

3. Information from Local Sources or Correspondents

Local correspondents or individuals are appointed to gather data from different locations. This method is often used when information is needed from multiple regions.

Merits:

Economical: Cost-effective compared to extensive fieldwork.
Wide Coverage: Capable of covering large geographical areas.
Continuity: Provides ongoing data collection.
Special Purposes: Suitable for specific, localized studies.

Demerits:

Originality Loss: Data may lose originality due to being second-hand.
Lack of Uniformity: Variability in data quality and collection methods.
Personal Bias: Correspondents' biases can affect data quality.

4. Information Through Questionnaires and Schedules

Data is collected using questionnaires and schedules mailed to informants or administered by enumerators.

(a) Mailing Method: Suitable when:

The area of study is large.
Informants are educated.

(b) Enumerator’s Method: Suitable when:

The inquiry requires specialised and skilled investigators.
Enumerators are well-versed in the local language and cultural norms.

Factors to Consider:

Ability of the collecting organization
Objective and scope
Method of collection
Time and condition of organization
Definition of the unit
Accuracy

5. Census Method

Data is collected covering every item of the population related to the problem under investigation.

Merits:

Reliable and accurate
Less biased
Extensive information
Study of diverse characteristics
Suitable for complex investigation
Indirect investigation

Demerits:

Costly
Large manpower needed
Not suitable for large-scale investigations

6. Sample Method

Data is collected about a sample, and conclusions are drawn about the entire population based on the sample.

Merits:

Economical
Time-saving
Information error reduction
Large investigations possible
Administrative convenience
More scientific

Demerits:

Partial conclusions
Possible wrong conclusions
Difficulty in selecting a representative sample
Specialized knowledge required

Question for Collection of Data

Try yourself:

Which method of data collection involves collecting data from respondents through direct interaction with the investigator?

A. Indirect Oral Investigation
B. Information from Local Sources
C. Census Method
D. Direct Personal Investigation


Sampling Methods

1. Random Sampling

Random sampling gives every member of a population an equal chance of being selected. This minimizes selection bias and makes the sample more likely to be representative of the population.

(a) Lottery Method:

Each member of the population is assigned a unique number. Numbers are then randomly drawn to select the sample.

Example: In a class of 30 students, each student is assigned a number from 1 to 30. Numbers are drawn randomly to select 5 students for a survey.

(b) Tables of Random Numbers:

Description: A pre-generated table of random numbers is used to select members from the population.
Example: Using a table, a researcher might randomly select students from a list of 100 based on random numbers that fall within the list.
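
The lottery method can be sketched in a few lines of Python. This is only an illustration of the class example above: the function name `lottery_sample` and the `seed` parameter are assumptions introduced here, and the standard library's pseudo-random generator stands in for physically drawing numbered slips.

```python
import random

def lottery_sample(population, k, seed=None):
    """Draw k distinct members so that each is equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(population, k)  # sampling without replacement

# Roll numbers 1 to 30, as in the class example
roll_numbers = list(range(1, 31))
selected = lottery_sample(roll_numbers, 5, seed=42)
print(sorted(selected))
```

Sampling without replacement matters here: no student can be drawn twice, just as a slip is not returned to the box once drawn.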

2. Purposeful or Deliberate Sampling

Purposeful sampling involves selecting specific individuals based on certain criteria deemed important for the study. This is often used when specific insights are required. The researcher identifies individuals who are thought to have particular knowledge or experience relevant to the study.

Example: In a study about expert opinions on climate change, researchers might only select environmental scientists and policy makers.

3. Stratified or Mixed Sampling

Stratified sampling involves dividing the population into distinct sub-groups (strata) and then randomly sampling from each stratum. This ensures that all sub-groups are represented.
The population is divided based on characteristics like age, gender, income, etc. A random sample is drawn from each stratum.

Example: In a survey of a city's population, the city might be divided into age groups (e.g., 18-30, 31-50, 51+) and a random sample taken from each group to ensure representation across all ages.
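
A minimal sketch of stratified sampling, assuming the age groups from the example and a made-up list of 200 residents. The names `stratified_sample`, `age_group`, and `residents`, and the 10% sampling fraction, are illustrative choices, not part of the notes.

```python
import random
from collections import defaultdict

def stratified_sample(people, stratum_of, fraction, seed=None):
    """Randomly sample the same fraction of members from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in people:
        strata[stratum_of(person)].append(person)  # group members by stratum
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

def age_group(person):
    """Assign a resident to one of the three age strata."""
    if person["age"] <= 30:
        return "18-30"
    if person["age"] <= 50:
        return "31-50"
    return "51+"

# Hypothetical residents with ages spread between 18 and 77
residents = [{"id": i, "age": 18 + (i * 7) % 60} for i in range(200)]
sample = stratified_sample(residents, age_group, fraction=0.1, seed=1)
print(sorted({age_group(p) for p in sample}))
```

Because the random draw happens separately inside each stratum, every age group is guaranteed to appear in the sample, which a single random draw from the whole city would not guarantee.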

4. Systematic Sampling

Systematic sampling involves selecting every nth item from a list of the population. A starting point is selected at random, and then every nth item in the list is chosen.

Example: In a list of 1000 names, a starting point is chosen at random from the first 10 names, and every 10th person after it is selected, giving a sample of 100.
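
Systematic sampling maps directly onto list slicing. The sketch below assumes the random start is drawn from the first 10 positions of the list; `systematic_sample` and the placeholder names are hypothetical.

```python
import random

def systematic_sample(items, interval, seed=None):
    """Pick a random start within the first interval, then every nth item."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random position 0 .. interval-1
    return items[start::interval]

# A list of 1000 placeholder names
names = [f"Person {i}" for i in range(1, 1001)]
chosen = systematic_sample(names, 10, seed=7)
print(len(chosen))  # 100
```

Only the starting point is random; after that the selection is fully determined, which is why the method can mislead if the list has a repeating pattern that matches the interval.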

5. Cluster Sampling

Cluster sampling involves dividing the population into clusters, which could be geographic or organizational units, and then randomly selecting some of those clusters. All members of the selected clusters are included in the sample.

Example: In a national survey, cities (clusters) are randomly selected, and all residents within those cities are surveyed.
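
As a rough illustration, cluster sampling randomizes the choice of clusters rather than the choice of individuals. The city names and resident labels below are invented for the sketch, and `cluster_sample` is a hypothetical helper.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # pick cluster names
    return {name: clusters[name] for name in chosen}

# Hypothetical cities (clusters) and their residents
cities = {
    "City A": ["a1", "a2", "a3"],
    "City B": ["b1", "b2"],
    "City C": ["c1", "c2", "c3", "c4"],
    "City D": ["d1", "d2"],
}
picked = cluster_sample(cities, 2, seed=3)
for city, residents_in_city in picked.items():
    print(city, residents_in_city)
```

Compare this with stratified sampling: there, every group contributes a few members; here, a few groups contribute all of their members.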

6. Quota Sampling

Quota sampling involves dividing the population into groups and selecting individuals from each group until a pre-set number (quota) for that group is reached. It is similar to stratified sampling but does not involve random selection within groups.

Example: A survey requires 100 respondents with specific quotas for gender, age, and occupation. Once quotas are met, sampling stops.
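
The sketch below shows one way quota sampling might work in practice: respondents are taken in the order they happen to arrive, and each is accepted only while their group's quota is unfilled. The quotas, the `gender` field, and the respondent list are made up for illustration.

```python
def quota_sample(respondents, group_of, quotas):
    """Accept respondents in arrival order until every quota is filled."""
    counts = {group: 0 for group in quotas}
    sample = []
    for person in respondents:  # arrival order, not a random draw
        group = group_of(person)
        if counts.get(group, 0) < quotas.get(group, 0):
            sample.append(person)
            counts[group] += 1
        if counts == quotas:  # all quotas met, so sampling stops
            break
    return sample

# Hypothetical stream of respondents; every third one is male
respondents = [{"id": i, "gender": "F" if i % 3 else "M"} for i in range(20)]
quotas = {"F": 3, "M": 2}
sample = quota_sample(respondents, lambda p: p["gender"], quotas)
print([p["id"] for p in sample])  # [0, 1, 2, 3, 4]
```

Note that no random number generator appears anywhere: whoever arrives first fills the quota, which is exactly what separates quota sampling from stratified sampling.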

7. Convenience Sampling

Convenience sampling involves selecting individuals who are easiest to reach or most convenient for the researcher. The sample is chosen based on ease of access rather than randomness.

Example: Surveying people at a local grocery store because it is easily accessible, rather than attempting to reach a broader population.

Reliability of Sampling Data

The reliability of sampling data can be influenced by several factors:

Size of the Sample: Larger samples generally provide more reliable estimates of the population and reduce sampling error.
Method of Sampling: The choice of sampling method impacts how representative and unbiased the sample is. Random methods tend to be more reliable compared to non-random methods.
Skills of Correspondents and Enumerators: The effectiveness and competence of those collecting the data play a crucial role in data reliability. Skilled individuals are more likely to gather accurate and consistent information.
Training of Enumerators: Proper training ensures that data collection is done uniformly and correctly, minimizing errors and biases.
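
The effect of sample size can be made concrete with the standard error of a sample mean, which equals the population standard deviation divided by the square root of the sample size. The sketch assumes simple random sampling from a large population; the standard deviation of 10 is an arbitrary illustrative value.

```python
import math

def standard_error(std_dev, n):
    """Standard error of the sample mean for a simple random sample of size n."""
    return std_dev / math.sqrt(n)

# Quadrupling the sample size halves the standard error
for n in (25, 100, 400):
    print(n, standard_error(10.0, n))
# 25 2.0
# 100 1.0
# 400 0.5
```

This is why larger samples give more reliable estimates, but with diminishing returns: each halving of the error needs four times as many respondents.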

Important Organizations for Data Collection

Several key organizations are involved in collecting, processing, and publishing statistical data:

1. NSSO (National Sample Survey Office, formerly the National Sample Survey Organisation):

Role: Conducts extensive surveys on various socio-economic issues in India.
Focus: Economic and social data, including employment, consumption, and health statistics.

2. RBI (Reserve Bank of India):

Role: Collects and publishes financial and economic data.
Focus: Monetary policy, financial stability, and banking sector statistics.

3. Registrar General of India:

Role: Conducts the Census of India and collects vital statistics.
Focus: Population data, demographic information, and vital statistics like births and deaths.

4. DGCIS (Directorate General of Commercial Intelligence and Statistics):

Role: Collects and publishes trade statistics.
Focus: Trade, commerce, and industrial data.

5. Labour Bureau:

Role: Gathers data on labor and employment.
Focus: Employment statistics, wage rates, and labor market trends.

Question for Collection of Data

Try yourself:

Which sampling method involves dividing the population into distinct sub-groups and then randomly sampling from each stratum?

A. Random Sampling
B. Purposeful Sampling
C. Stratified Sampling
D. Systematic Sampling


What are Common Challenges in Data Collection?

Some challenges come up repeatedly when collecting data. Understanding them makes them easier to avoid.

Data Quality Issues: Poor data quality is a major obstacle to the effective use of data-driven technologies such as machine learning, so ensuring data quality should be a priority.
Inconsistent Data: When dealing with different data sources, the same information can appear with different formats, units, or even spellings. These variations often arise during company mergers or relocations. If inconsistencies are not resolved systematically, they diminish the value of the data over time.
Data Downtime: Data is vital for the decisions and operations of data-centric businesses. However, there may be brief periods when data is unreliable or unavailable. This unavailability of data can lead to customer complaints and subpar analytical results.
Ambiguous Data: Even with careful oversight, errors can occur in large databases or data lakes, particularly with rapidly streaming data. These errors can involve unnoticed spelling mistakes, formatting issues, or misleading column headings. Unclear data can pose various challenges for reporting and analytics.
Duplicate Data: Modern businesses have to manage streaming data, local databases, and cloud data lakes. They may also have application and system silos, leading to significant duplication and overlap. Duplicate contact details, for instance, can greatly impact the customer experience.
Too Much Data: While the emphasis is on data-driven analytics and its advantages, having too much data can be a data quality issue. There is a risk of getting lost in a large volume of data when searching for information relevant to analytical efforts.
Inaccurate Data: For heavily regulated sectors like healthcare, data accuracy is essential. Improving data quality for situations like COVID-19 and future pandemics is crucial. Inaccurate information does not provide an accurate representation of the situation and cannot be used to plan the best course of action.
Hidden Data: Most businesses use only a portion of their data, with the remainder lost in data silos or discarded in data graveyards. As a result, they miss opportunities to develop new products, enhance services, and streamline processes.
Finding Relevant Data: Identifying relevant data is not always straightforward. Various factors need to be considered, such as domain relevance, demographics, and time period. Data that is irrelevant in any of these respects adds noise and hinders effective analysis.
Deciding the Data to Collect: Determining what data to collect is a crucial aspect of data collection and should be prioritized. Choices regarding the subjects covered by the data, data sources, and required information depend on the goals or objectives of data usage.
Dealing With Big Data: Big data involves extremely large and complex data sets, posing challenges in storage, analysis, and deriving results. Traditional data processing tools may not be adequate for handling big data effectively.
Low Response and Other Research Issues: Poor design and low response rates are identified as challenges in data collection, particularly in health surveys using questionnaires. Establishing an incentivized data collection program can help improve response rates.
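
Two of the challenges above, inconsistent data and duplicate data, are often handled together by normalising records before comparing them. The following is a minimal sketch under stated assumptions: the contact records are invented, and treating the email address as the identifying key is an illustrative choice rather than a general rule.

```python
def normalise(record):
    """Bring name and email into one consistent format."""
    return {
        "name": " ".join(record["name"].split()).title(),  # fix spacing and case
        "email": record["email"].strip().lower(),
    }

def deduplicate(records):
    """Keep the first record seen for each normalised email address."""
    seen = set()
    unique = []
    for record in map(normalise, records):
        if record["email"] not in seen:
            seen.add(record["email"])
            unique.append(record)
    return unique

# Hypothetical contact list merged from two sources
contacts = [
    {"name": "asha  rao", "email": "Asha.Rao@example.com "},
    {"name": "Asha Rao", "email": "asha.rao@example.com"},
    {"name": "VIKRAM SINGH", "email": "vikram@example.com"},
]
cleaned = deduplicate(contacts)
print(cleaned)
```

Normalising first is the important step: the two Asha Rao entries only look like duplicates once spacing and letter case have been made consistent.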

In conclusion, collecting high-quality data is essential for accurate analysis and decision-making. Ensure data consistency, accuracy, and reliability by choosing appropriate methods and regularly cleaning the data to remove errors and duplicates. Properly training data collectors helps maintain high standards, and using advanced tools can efficiently manage large data sets. Effective data management makes information easily accessible and valuable, enhancing its overall usability.


FAQs on Collection of Data - Statistics for SSC CGL

1. What are the common sources of data for statistical analysis?

Ans. The common sources of data for statistical analysis include surveys, experiments, observations, and existing databases.

2. What is the difference between primary and secondary data?

Ans. Primary data is collected directly from the source for a specific research purpose, while secondary data is already available and collected by someone else for a different purpose.

3. What are the statistical methods used for data collection?

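Ans. Statistical methods of data collection include direct personal investigation, indirect oral investigation, information from local sources or correspondents, questionnaires and schedules, the census method, and the sample method.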

4. What are some common challenges in data collection?

Ans. Common challenges in data collection include non-response bias, measurement error, sampling bias, and data processing errors.

5. How can sampling methods be used to collect data effectively?

Ans. Sampling methods such as random sampling ensure that every member of the population has an equal chance of being selected, leading to a representative sample for analysis.

About this Document

	Dec 26, 2024 Last updated
