Introduction
Data collection is the systematic process of gathering information for analysis. It plays a vital role in research, decision-making, and problem-solving across various fields. From surveys and interviews to sensors and digital platforms, data collection methods continue to evolve, driving innovation and insights in today's data-driven world.
What are the Sources of Data?
Statistical data can be obtained from two sources:
- Primary data
- Secondary data
Primary Data
The important points of primary data are:
- The enumerator (the person who collects the data) may gather the data by conducting an inquiry or investigation. Such data are called Primary Data, as they are based on first-hand information.
- Primary data are original, do not need to be adapted to the study because they are collected for it, and are costly to obtain.
Secondary Data
- If the data have been collected and processed by another agency, they are called Secondary Data. Published data are usually secondary.
- They already exist and are therefore not original.
- They often need to be adapted to suit the aim of the study at hand.
- Secondary data are inexpensive to obtain.
How do we collect Data?
It is done in the following ways:
Surveys
- A survey is a method of gathering information from individuals.
- It aims to describe characteristics such as cost, quality, and usefulness (in the case of a product) or popularity, honesty, and loyalty (in the case of a candidate).
Preparation of Instrument
The most common instrument used in surveys is the questionnaire or interview schedule. The questionnaire is either self-administered by the respondent or administered by the enumerator or a trained investigator. While drawing up the questionnaire or interview schedule, the following points should be kept in mind:
- The questionnaire should not be lengthy.
- The sequence of questions should move from general to specific.
- Questions should not be ambiguous.
- Questions should not use double negatives.
- Questions should not be leading.
- Questions should not suggest alternatives to the answer.
Mode of Data Collection
The purpose of asking questions in a survey is to collect data. There are three ways of collecting data:
- Personal Interviews
- Mailing (questionnaire) Surveys
- Telephone Interviews
Personal Interviews
In this method, the researcher plays the central role, conducting the interviews face-to-face with the respondents. Personal interviews are preferred for several reasons:
- Highest Response Rate
- Allows use of all types of questions
- Better suited to open-ended questions
- Allows clarification of ambiguous questions
The personal interview has some demerits too:
- Most expensive
- Possibility of influencing respondents
- More time-consuming
Mailing Questionnaire
In this method, data are collected by mail. The questionnaire is mailed to each respondent with a request to complete and return it by a given date.
The advantages of this method are:
- Least expensive
- The only method to reach remote areas
- No influence on respondents
- Maintains anonymity of respondents
- Best for sensitive questions
The disadvantages of mail surveys are:
- Cannot be used with respondents who are unable to read and write
- Long response time
- Does not allow explanation of ambiguous questions
- Reactions cannot be watched
Telephone Interviews
In telephone interviews, the investigator asks questions over the telephone.
The advantages of telephone interviews are:
- Relatively low cost
- Relatively less influence on respondents
- Relatively high response rate.
The disadvantages of this method are:
- Limited use
- Reactions cannot be watched
- Possibility of influencing respondents
Pilot Survey
- Once the questionnaire is ready, it is advisable to try it out on a small group. This trial is known as the Pilot Survey or Pre-Testing of the questionnaire.
- The pilot survey gives a preliminary idea of how the survey will work.
- It helps to pre-test the questionnaire and reveal its shortcomings and drawbacks.
- It also helps in assessing the suitability of the questions, the clarity of the instructions, the performance of the enumerators, and the cost and time the actual survey will require.
Census and Sample Surveys
Census - A survey that covers every element of the population is known as a Census or the Method of Complete Enumeration.
- The essential feature of this method is that it covers every individual unit in the whole population.
Sample Survey - A sample is a section of the population from which information is to be obtained. A good (representative) sample is generally smaller than the population and is capable of providing reasonably accurate information about the population at a lower cost and in less time.
- Most surveys are sample surveys, which are preferred in statistics for several reasons.
- A sample can provide reasonably reliable and accurate information at a lower cost and in less time.
- How is a sample drawn? There are two main types of sampling:
- Random Sampling
- Non-random Sampling
Random Sampling
- It is also known as the lottery method.
- In random sampling, the individual units that make up the sample are selected from the population at random.
- Every unit has an equal chance of being chosen, so the units that are selected do not differ systematically from those that are not.
- Random number tables are used to ensure that every single unit in the population has an equal chance of selection.
- These tables are available in published form or can be generated with suitable software packages.
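The lottery method can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the sampling frame of 300 numbered households, the sample size of 30, and the helper name simple_random_sample are all assumptions made for the example. The standard-library call random.Random.sample draws without replacement, so no unit is picked twice and every unit has the same chance of selection.

```python
import random

def simple_random_sample(frame, k, seed=None):
    """Draw k distinct units so that every unit in the frame is equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, k)  # sampling without replacement

# Hypothetical sampling frame: 300 households numbered 1 to 300
households = list(range(1, 301))
sample = simple_random_sample(households, k=30, seed=42)
print(sorted(sample))
```

Fixing the seed plays the role of keeping the page of the random number table on record, so that the same draw can be repeated and checked.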
Non-random sampling
- In this method, the units of the population do not all have the same chance of being selected.
- The convenience or judgement of the investigator plays a crucial role in the selection of the sample.
- Units are chosen mainly on the basis of judgement, purpose, convenience, or quota.
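As one illustration of non-random selection, the sketch below fills a quota in the order in which respondents happen to be met. The respondent list, the urban/rural grouping, the quota sizes, and the function name quota_sample are hypothetical and chosen only for the example.

```python
def quota_sample(units, group_of, quotas):
    """Take units in the order encountered until each group's quota is filled."""
    remaining = dict(quotas)  # copy so the caller's quotas are not modified
    chosen = []
    for unit in units:
        group = group_of(unit)
        if remaining.get(group, 0) > 0:
            chosen.append(unit)
            remaining[group] -= 1
    return chosen

# Hypothetical respondents, listed in the order the investigator meets them
respondents = [
    ("R1", "urban"), ("R2", "urban"), ("R3", "rural"),
    ("R4", "urban"), ("R5", "rural"), ("R6", "rural"),
]
picked = quota_sample(
    respondents,
    group_of=lambda r: r[1],
    quotas={"urban": 2, "rural": 1},
)
print(picked)
```

Respondents met after the quotas are full can never enter the sample, which is exactly why the chances of selection are unequal under this method.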
Sampling and Non-sampling Errors
Sampling errors
- Sampling error refers to the difference between the sample estimate and the actual value of a population characteristic.
- It arises because only a sample drawn from the population is observed, not the whole population.
- In other words, the difference between the actual parameter of the population and its estimate is known as sampling error.
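The definition above can be made concrete with a small calculation. The sketch assumes the sign convention "sampling error = sample estimate minus population parameter" and uses the mean as the characteristic being estimated; the ten income values are invented for illustration and the function name sampling_error is hypothetical.

```python
from statistics import mean

def sampling_error(sample, population):
    """Sample mean minus population mean (sign convention assumed here)."""
    return mean(sample) - mean(population)

# Hypothetical population: monthly incomes of 10 households (illustrative values)
population = [500, 550, 600, 650, 700, 750, 800, 850, 900, 950]

# One possible sample of four households
sample = [500, 600, 700, 800]

print("population mean:", mean(population))
print("sample mean:", mean(sample))
print("sampling error:", sampling_error(sample, population))
```

A different sample of four households would generally give a different estimate and therefore a different sampling error.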
Non-sampling errors
Non-sampling errors are more serious than sampling errors. Sampling error can be reduced by taking a larger sample, whereas non-sampling error is difficult to reduce. Even a Census can contain non-sampling errors.
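The effect of sample size on sampling error can be checked with a small simulation. This is a sketch under stated assumptions: the population is synthetic (10,000 values drawn from a normal distribution), the sample sizes 25 and 400 and the 200 repetitions are arbitrary, and average_abs_error is a hypothetical helper. It says nothing about non-sampling errors, which a larger sample does not remove.

```python
import random
from statistics import mean

def average_abs_error(population, n, trials, rng):
    """Average absolute sampling error of the mean over repeated random samples of size n."""
    true_mean = mean(population)
    errors = [
        abs(mean(rng.sample(population, n)) - true_mean)
        for _ in range(trials)
    ]
    return mean(errors)

rng = random.Random(0)  # fixed seed so the run is reproducible

# Synthetic population, for illustration only
population = [rng.gauss(50, 10) for _ in range(10_000)]

small = average_abs_error(population, n=25, trials=200, rng=rng)
large = average_abs_error(population, n=400, trials=200, rng=rng)
print(f"average error with n=25:  {small:.2f}")
print(f"average error with n=400: {large:.2f}")
```

In a run like this the average error for the larger sample should come out clearly smaller than for the smaller one.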
Some of the non-sampling errors are:
- Errors in Data Acquisition: This type of error stems from recording inaccurate responses.
- Non-Response Errors: Non-response occurs when an interviewer is unable to contact a person listed in the sample or when a person in the sample refuses to respond. In this case, the observed sample may not be representative.
- Sampling Bias: Sampling bias occurs when the sampling plan is such that some members of the target population cannot possibly be included in the sample.
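Sampling bias can be illustrated with a sampling frame that leaves out part of the population. In the sketch below, a telephone-only frame excludes households without a phone. All figures are synthetic, and the assumption that phone owners spend more on average is made purely for the example.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed so the run is reproducible

# Synthetic population of (has_phone, monthly_spending) pairs, illustration only
population = (
    [(True, rng.gauss(120, 15)) for _ in range(6_000)]
    + [(False, rng.gauss(80, 15)) for _ in range(4_000)]
)
true_mean = mean(spend for _, spend in population)

# Telephone-only frame: households without a phone can never be selected
phone_frame = [unit for unit in population if unit[0]]

biased = mean(spend for _, spend in rng.sample(phone_frame, 2_000))
unbiased = mean(spend for _, spend in rng.sample(population, 2_000))

print(f"population mean:        {true_mean:.1f}")
print(f"phone-only sample mean: {biased:.1f}")
print(f"full-frame sample mean: {unbiased:.1f}")
```

Both samples have the same size, yet only the one drawn from the complete frame should land near the population mean. Enlarging the phone-only sample would not correct the gap, which is why this kind of error is classed as non-sampling error.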
Census of India and NSSO
- The Census of India and the National Sample Survey Organisation (NSSO) are two important agencies at the national level that collect, process, and tabulate data.
- The Census of India produces the most comprehensive and continuous demographic record of the population.
- The NSSO was established by the Government of India to conduct nationwide surveys on socio-economic issues.
- The NSSO provides periodic estimates of education, school enrolment, utilization of educational services, employment, unemployment, manufacturing and service sector enterprises, morbidity, maternity, child care, utilization of the public distribution system, etc.