
All questions of Data Analysis for Class 10 Exam

Which of the following is the most unstable average?
  • a)
    Median
  • b)
    Arithmetic mean
  • c)
    Mode
  • d)
    Geometric mean
Correct answer is option 'C'. Can you explain this answer?

Rajeev Kumar answered
  • Mode: The word mode is derived from the French word “la Mode”, which signifies the most fashionable value of a distribution because it is repeated the highest number of times in the series. The mode is the most frequently observed data value. It is denoted by Mo.
    • The mode is seldom used; its computation is easy, but it is highly unstable and may change with minor shifts in the frequencies from one interval to another. 
    • However, there are situations in which only the mode can be used. 
    • For example, if a shoe company wants to know which size of shoe it should produce more of, it would use the mode as the measure of central tendency: the most frequently sold shoe size is the mode.
  • Arithmetic mean: The arithmetic mean is the most commonly used measure of central tendency. It is defined as the sum of the values of all observations divided by the number of observations, and is usually denoted by X̄. In general, if there are N observations X1, X2, X3, ..., XN, then the arithmetic mean is given by
    Mean = X̄ = ΣX / N, where ΣX = sum of all observations and N = total number of observations. 
  • Median is that positional value of the variable which divides the distribution into two equal parts, one part comprises all values greater than or equal to the median value and the other comprises all values less than or equal to it. The Median is the “middle” element when the data set is arranged in order of the magnitude. Since the median is determined by the position of different values, it remains unaffected if, say, the size of the largest value increases. The median can be easily computed by sorting the data from smallest to largest and finding out the middle value.
Hence, we conclude that Mode is the most unstable average.
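The instability of the mode can be sketched with a small example. The shoe sizes below are hypothetical; the point is that a couple of extra observations can move the mode to a different value while the mean and median barely shift:

```python
from statistics import mean, median, mode

shoe_sizes = [7, 8, 8, 9, 9, 9, 10, 10, 11]

print(mean(shoe_sizes))
print(median(shoe_sizes))  # 9
print(mode(shoe_sizes))    # 9 (occurs three times)

# Adding just two observations of size 10 moves the mode from 9 to 10,
# while the mean and median change little.
shoe_sizes += [10, 10]
print(mode(shoe_sizes))    # 10
```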

A researcher administers an achievement test to assess and indicate the possible effect of an independent variable in his/her study. The distribution of scores on the test is found to be negatively skewed. On the basis of this, what can be stated with regard to the difficulty level of the test?
  • a)
    The test is very easy
  • b)
    The test is very difficult
  • c)
    The test is neither easy nor difficult
  • d)
    The test is easy and needs normalization
Correct answer is option 'A'. Can you explain this answer?

Jack Sanders answered
Understanding the Negative Skewness of the Distribution
Negative skewness in the distribution of test scores indicates that more students scored higher on the test compared to those who scored lower. This means that the majority of students found the test relatively easy, leading to a concentration of scores at the higher end of the scale.

Implication for the Difficulty Level of the Test
Based on the negative skewness of the distribution, it can be inferred that:
- The test is very easy: Since a large number of students scored well on the test, it suggests that the test may have been too easy for the sample population. This is further supported by the fact that the distribution is negatively skewed, indicating an abundance of high scores.
Therefore, the most appropriate conclusion to draw from the negative skewness of the distribution of scores on the achievement test is that the test is very easy. This information can be valuable for the researcher in evaluating the effectiveness of the test in assessing the impact of the independent variable under study.

At the stage of data analysis in which quantitative techniques have been used by a researcher, the evidence warrants the rejection of Null Hypothesis (H0). Which of the following decisions of the researcher will be deemed appropriate?
  • a)
    Rejecting the (H0) and also the substantive research hypothesis
  • b)
    Rejecting the (H0) and accepting the substantive research hypothesis
  • c)
    Rejecting the (H0) without taking any decision on the substantive research hypothesis
  • d)
    Accepting the (H0) and rejecting the substantive research hypothesis
Correct answer is option 'B'. Can you explain this answer?

The null hypothesis is the standard method for supporting the substantive research hypothesis. Like any hypothesis, a substantive hypothesis is about the relation between two or more variables. It is called “substantive” because it has not yet been operationalized. An operational hypothesis is phrased to show how the variables are manipulated and measured.
  • H0 (null hypothesis): A tentative assumption is made about the parameter. This assumption is called the null hypothesis and is denoted by H0 (null hypothesis).
  • H1 (alternate hypothesis): An alternative hypothesis (denoted by H1), which is the opposite of what is stated in the null hypothesis.
The hypothesis-testing procedure involves using sample data to determine whether or not H0 can be rejected. If H0 is rejected, the statistical conclusion is that the alternative hypothesis H1 is true; this rejection is taken as evidence in favor of the research hypothesis.
At the stage of data analysis in which quantitative techniques have been used by a researcher, the evidence warrants the rejection of the Null Hypothesis (H0). Here, the decision of the researcher which is deemed to be appropriate will be Rejecting the (H0) and accepting the substantive research hypothesis 
  1. The above statement is true in the context of hypothesis testing, as it is only the null hypothesis that can be tested directly.
  2. To test the null hypothesis, a researcher may use statistical methods such as ANOVA.
  3. At the data-analysis stage, the null hypothesis is used to find out whether the research hypothesis is maintainable.
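The reject/retain decision described above can be sketched numerically. This is a minimal hand-rolled pooled-variance t-test on hypothetical control and treatment scores; a real analysis would use a statistics package and exact p-values:

```python
from statistics import mean, stdev
from math import sqrt

# Hypothetical scores for two independent groups
control   = [52, 55, 48, 50, 53, 49, 51, 54]
treatment = [60, 63, 58, 61, 64, 59, 62, 65]

n1, n2 = len(control), len(treatment)
# Pooled-variance t statistic for two independent samples
sp2 = ((n1 - 1) * stdev(control) ** 2 + (n2 - 1) * stdev(treatment) ** 2) / (n1 + n2 - 2)
t = (mean(treatment) - mean(control)) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Critical t for alpha = 0.05 (two-tailed) with 14 df is about 2.145
if abs(t) > 2.145:
    print("Reject H0 -> accept the substantive (alternative) hypothesis")
else:
    print("Fail to reject H0 -> no decision in favour of H1")
```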

A university teacher administers a self-made test for the summative evaluation of his/her students. The distribution of scores of students is found to be positively skewed.
What inference he/she should make about the difficulty level of this test?
  • a)
    The test is much too easy for students
  • b)
    The test is difficult for students
  • c)
    The test is quite interesting to students
  • d)
    The test is favouring those students who are low rankers
Correct answer is option 'B'. Can you explain this answer?

Rajeev Kumar answered
Evaluation: it is a systematic process through which one can determine the extent of achievement of the instructional objectives. It is a comprehensive process and continuous in nature.
Key Points
Summative evaluation:
  • It is done at the end of the program
  • To make a decision
  • It’s a kind of achievement test
  • The process is sometimes normative, sometimes criterion-based
  • The methods are basically oral reports, projects, term papers, teacher-made achievement tests, etc.
  • It is used for assigning a grade and final judgment
  • Feedback is also given after the evaluation
Skewness:
  • Skewness is a measure of symmetry.
  • For the normal distribution, skewness is zero.
  • If the tail of the curve extends to the right, the distribution is positively skewed; if the tail extends to the left, it is negatively skewed.
  • Galton's formula for skewness: (Q1 + Q3 − 2Q2) / (Q3 − Q1)
Important Point:
Positive Skewness:
  • A positively skewed distribution is composed of mostly small observations and a few relatively large or extreme observations.
  • For example, test scores on a very difficult exam probably follow a positively skewed distribution.
  • This is because most students will score quite low and only a few students will score high.
  • For example, out of 120 students, 115 might score 20–50 out of 100, while fewer than 5 score 70–80.
  • For this skewed distribution, the median is the one to use since it is not influenced by a few relatively very high scores. 
  • In a positively skewed distribution, the mean is greater than the median, since the mean is influenced by a few relatively very large scores.
Negative Skewness:
  • A negatively skewed distribution is the distribution composed of mostly large observations and a few relatively small observations.
  • For example, test scores on a very easy test of some general education courses will probably follow a negatively skewed distribution.
  • Most students will do very well, but a few students may never come to class and will score very poorly, so test scores will be composed of many high scores and a few relatively very low scores.
  • In this case, the median should be used instead of the mean since it is not influenced by a few relatively very low scores. 
  • In a negatively skewed distribution, the mean is less than the median, since the mean is influenced by a few relatively very low scores.
Thus, a positively skewed score distribution means that the test is difficult for students.
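The mean/median relationships stated above can be checked directly. The two score sets below are hypothetical, chosen to mimic a difficult test (positive skew) and an easy test (negative skew):

```python
from statistics import mean, median

# Difficult test -> positive skew: many low scores, a few very high ones
hard_test = [20, 22, 25, 25, 28, 30, 32, 35, 78, 85]
print(mean(hard_test) > median(hard_test))  # True -- mean pulled up by the high scores

# Easy test -> negative skew: many high scores, a few very low ones
easy_test = [15, 22, 80, 82, 85, 85, 88, 90, 92, 95]
print(mean(easy_test) < median(easy_test))  # True -- mean pulled down by the low scores
```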
Additional Information
Difficulty Level:
  • Cheang & Hasni (1998) defined difficulty index as the ratio of the number of students that can answer the questions correctly to the total number of the students who sit for the exams.
  • The formula is, Difficulty index = B / J, where, B = Number of students that answer the questions correctly and J = Total number of the students who sit for the exams.
  • For subjective tests, Difficulty index = Average score / Range of full marks
  • If the index value is more than 0.8, the item is considered too easy; if it is less than 0.5, the item is considered hard.
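The difficulty index formula above is a single ratio; a minimal sketch with a hypothetical class of 40 students:

```python
def difficulty_index(correct: int, total: int) -> float:
    """Difficulty index = B / J: students answering correctly over students sitting the exam."""
    return correct / total

# Hypothetical item: only 10 of 40 students answered correctly
p = difficulty_index(correct=10, total=40)
print(p)  # 0.25 -> below 0.5, so this item counts as hard
```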

Given below are two statements, one is labelled as Assertion A and the other is labelled as Reason R
Assertion A: The value of correlation coefficient is in the range of ‐1 to +1.
Reason R: Correlation between two variables doesn't help in predicting the value of a variable even if we know the value of another variable.
In light of the above statements, choose the correct answer from the options given below
  • a)
    Both A and R are true and R is the correct explanation of A
  • b)
    Both A and R are true but R is NOT the correct explanation of A
  • c)
    A is true but R is false
  • d)
    A is false but R is true
Correct answer is option 'C'. Can you explain this answer?

Daniel Foster answered
Understanding the Assertion and Reason
The question involves evaluating two statements about the correlation coefficient and its implications.
Assertion A:
- The value of the correlation coefficient is in the range of -1 to +1.
- This statement is true. The correlation coefficient (denoted as 'r') measures the strength and direction of a linear relationship between two variables. A value of +1 indicates a perfect positive correlation, -1 indicates a perfect negative correlation, and 0 indicates no correlation.
Reason R:
- Correlation between two variables doesn't help in predicting the value of a variable even if we know the value of another variable.
- This statement is false. While correlation does not imply causation, a strong correlation can indeed help in making predictions. For example, if two variables are highly correlated, knowing the value of one can provide a good estimate of the other.
Conclusion
Given the evaluations:
- Both A and R are true and R is the correct explanation of A: False (R is false).
- Both A and R are true but R is NOT the correct explanation of A: False (R is false).
- A is true but R is false: True (This is the correct option).
- A is false but R is true: False (A is true).
Thus, the correct answer is option 'C': A is true but R is false.
This conclusion highlights that while correlation can be predictive, the assertion correctly states the range of the correlation coefficient, making it a valid statement on its own.
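Both points above can be illustrated with a standard-library sketch on hypothetical data: r stays within [-1, +1], and a strong correlation does let us predict one variable from the other via the regression line:

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson's r: sample covariance over the product of sample standard deviations."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]            # perfectly linear: y = 2x (hypothetical data)
r = pearson_r(x, y)
print(round(r, 6))               # 1.0 -- a perfect positive correlation

# With r known, the least-squares line predicts y from a new x:
slope = r * stdev(y) / stdev(x)
intercept = mean(y) - slope * mean(x)
print(round(slope * 6 + intercept, 6))  # 12.0 -- predicted y at x = 6
```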

Given below are two statements:
Statement I: In order to ensure authenticity, a researcher has to indicate original sources from which data are culled.
Statement II: In data analysis and interpretation the researcher has to evince the necessary rigour and appropriateness of procedures.
In the light of the above statements, Choose the most appropriate answer from the options given below:
  • a)
    Both Statement I and Statement II are correct
  • b)
    Both Statement I and Statement II are incorrect
  • c)
    Statement I is correct but Statement II is incorrect
  • d)
    Statement I is incorrect but Statement II is correct
Correct answer is option 'A'. Can you explain this answer?

Rajeev Kumar answered
Research is defined as the creation of new knowledge and/or the use of existing knowledge in a new and creative way so as to generate new concepts, methodologies, and understandings. This could include synthesis and analysis of previous research to the extent that it leads to new and creative outcomes.
Statement I: In order to ensure authenticity, a researcher has to indicate original sources from which data are culled.
Explanation:
  • Research ethics are the set of ethics that govern how scientific and other research is performed at research institutions such as universities, and how it is disseminated. Ethics are broadly the set of rules, written and unwritten, that govern our expectations of our own and others’ behavior.
  • Respecting intellectual property is one of the fundamental codes of ethics to be followed while doing research.
  • A researcher should never plagiarise, or copy, other people’s work and try to pass it off as their own.
  • A researcher should always ask for permission before using other people’s tools or methods, unpublished data, or results. Not doing so is plagiarism.
  • A researcher needs to respect copyrights and patents, together with other forms of intellectual property, and always acknowledge others’ contributions. Acknowledgment is the best way to avoid any risk of plagiarism.
  • Thus, in order to ensure authenticity, a researcher has to indicate original sources from which data are culled. The statement I is correct.
Statement II: In data analysis and interpretation the researcher has to evince the necessary rigor and appropriateness of procedures.
Explanation:
  • A researcher should aim to avoid bias in any aspect of his/her research, including design, data analysis, interpretation, and peer review.
  • This means that a researcher needs to report his/her research honestly and that this applies to the methods, data, and results.
  • A researcher should not make up any data, including extrapolating unreasonably from some results or do anything which could be construed as trying to mislead anyone. It is better to undersell than over-exaggerate findings.
  • The data must be appropriate, trustworthy, and adequate for drawing inferences.
  • The data must reflect good homogeneity.
  • Proper analysis should be done through statistical methods.
  • The researcher must remain cautious about the errors that can possibly arise in the process of interpreting results.
  • The researcher should be well equipped with and must know the correct use of statistical measures for drawing inferences concerning his study.
  • Thus, in data analysis and interpretation, the researcher has to evince the necessary rigor and appropriateness of procedures. Statement II is correct.
Thus, option A is the correct answer.

Which company was recently implicated in a global data theft crime?
  • a)
    Amazon
  • b)
    Google
  • c)
    Cisco
  • d)
    Cambridge Analytica
Correct answer is option 'D'. Can you explain this answer?

Quantronics answered
Key Points
  • Cambridge Analytica company was recently implicated in a global data theft crime.
  • Cambridge Analytica started in 2013 as a British political consulting firm that combined data mining, data analysis, and data brokerage for strategic communication during elections.
  • The CEO of Cambridge Analytica was Alexander Nix.
Additional Information
Some important events in which Cambridge Analytica was involved:
  • 2014 => Involved in 44 US political races.
  • 2015 => Performed data analysis services for Ted Cruz's presidential campaign.
  • 2016 => Worked for Donald Trump's presidential campaign.
  • 2016 => Worked for the Leave.EU campaign.
  • March 2018 => Many newspapers reported that Cambridge Analytica (CA) had harvested the personal data of millions of Facebook users, collected under the pretext of academic research, and used it for political purposes.

Which among the following is a software for the analysis of qualitative data?
  • a)
    NVivo
  • b)
    R
  • c)
    SPSS
  • d)
    STATA
Correct answer is option 'A'. Can you explain this answer?

Rajeev Kumar answered
Qualitative Data Analysis Software is a system that supports a wide range of processes in content analysis, transcription analysis, discourse analysis, coding, text interpretation, recursive abstraction, and grounded theory methodology, helping researchers interpret information so as to make informed decisions.
Key Points
NVivo software:
  • NVivo is a software program used for qualitative and mixed-methods research.
  • Specifically, it is used for the analysis of unstructured text, audio, video, and image data, including (but not limited to) interviews, focus groups, surveys, social media, and journal articles.
  • It is produced by QSR International. As of July 2014, it is available for both Windows and Macintosh operating systems.
Additional Information
R is a programming language for statistical computing and graphics.
  • It compiles and runs on a wide variety of UNIX platforms, Windows, and macOS.
  • Qualitative data analysis tools, by contrast, help organize, process, and analyze qualitative data for actionable insights.
  • Qualitative data analysis software is used across a wide range of sectors and industries, such as healthcare, the legal industry, and e-commerce businesses.
SPSS is a software program. It stands for Statistical Package for the Social Sciences, and it is used by various kinds of researchers for complex statistical data analysis.
  • The SPSS software package was created for the management and statistical analysis of social science data.
STATA is a powerful statistical software developed by StataCorp for data manipulation, visualization, statistics, and automated reporting.
  • It enables users to analyze, manage, and produce graphical visualizations of data.
  • It is primarily used by researchers in the fields of economics, biomedicine, and political science to examine data patterns.

Comprehension:
Directions: Consider the following data and answer questions:

Which one of the following is the cumulative frequency of the entire data set?
  • a)
    150
  • b)
    160
  • c)
    140
  • d)
    120
Correct answer is option 'A'. Can you explain this answer?

Rajeev Kumar answered
Key Points
  • In Statistics, a cumulative frequency is defined as the total of frequencies, that are distributed over different class intervals. It means that the data and the total are represented in the form of a table in which the frequencies are distributed according to the class interval.
  • The cumulative frequency is calculated by adding each frequency from a frequency distribution table to the sum of its predecessors. The last value will always be equal to the total for all observations since all frequencies will already have been added to the previous total.
  •  A table that displays the cumulative frequencies that are distributed over various classes is called a cumulative frequency distribution or cumulative frequency table.
  • There are two types of cumulative frequency - the less-than type and the greater-than type. Cumulative frequency is used to know the number of observations that lie above (or below) a particular value in a given data set.
  • In general, the less-than type of cumulative frequency is taken as the cumulative frequency for the whole data set.
  • Therefore the cumulative frequency of the entire dataset is 150.
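The running-total idea above can be sketched in a few lines. The class frequencies here are illustrative (the original table is not reproduced in this extract); they are chosen only to be consistent with the quoted total of 150:

```python
from itertools import accumulate

# Illustrative class intervals and frequencies (hypothetical, not the original table)
classes     = ["21-30", "31-40", "41-50", "51-60", "61-70", "71-80", "81-90"]
frequencies = [8, 22, 39, 43, 18, 13, 7]

# Cumulative frequency: each entry is the frequency plus all of its predecessors
cumulative = list(accumulate(frequencies))
print(cumulative)      # [8, 30, 69, 112, 130, 143, 150]
print(cumulative[-1])  # 150 -- the last value equals the total of all observations
```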

When the researcher does not know the identity of the experimental and placebo groups, the study is termed as
  • a)
    Blind
  • b)
    Double-blind
  • c)
    Panel
  • d)
    Cohort
Correct answer is option 'B'. Can you explain this answer?

Rajeev Kumar answered
The experimental research is a scientific approach to research in which the researcher manipulates/controls one or more variables and measures its effect on the other variables. The strength of this research is its internal validity (causality). However, the influence of experimenter bias can threaten the internal validity of the research.
Controlling the Researcher’s Bias: The element of experimenter-bias effect can be controlled through the process of blinding. It is the process used in experimental research by which study participants, persons providing the treatment, data collectors and data analysts are kept unaware of group assignment (control vs treatment). There can be varying degrees of blinding such as single-blind, double-blind, triple-blind, etc.
  1. Single-blind: 
    1. In this case, the participants are unaware if they are part of the experimental group or the control group. It can be used, for example, during the test phase of a new drug, where the subjects do not know if they are actually receiving it or not.
  2. Double-blind:
    1. It describes an experimental procedure in which neither the participant nor the experimenter is aware of which group (experimental or control) each participant belongs to.
    2. It uses a rigorous way of experimenting in an attempt to minimize subjective biases on the part of the experimenter and on the part of the participant and obtain a more valid result.
    3. It is most commonly utilized in medical studies that investigate the effectiveness of drugs.
Note:
Longitudinal Research: This type of research is used to study the same sample over a longer period of time. These may be used to study behavioural changes, attitudinal changes or religious effects that may have a long time effect on the selected sample.
For example, to study the impact of the Reservation Policy of the Govt. of India in overcoming inequalities since independence.
Hence, when the researcher does not know the identity of the experimental and placebo groups, the study is termed as a double-blind study.

Comprehension:
Directions: Consider the following data and answer questions:

Which one of the following is the mode value for the given data set?
  • a)
    52.61
  • b)
    45.22
  • c)
    39.81
  • d)
    55.21
Correct answer is option 'A'. Can you explain this answer?

Quantronics answered
Key Points:
  • ​In statistics, the mode is the value that is repeatedly occurring in a given set. We can also say that the value or number in a data set, which has a high frequency or appears more frequently, is called mode or modal value. It is one of the three measures of central tendency, apart from mean and median. For example, the mode of the set {3, 7, 8, 8, 9}, is 8.
  • Therefore, for a finite number of observations, we can easily find the mode. A set of values may have one mode or more than one mode or no mode at all.

The grouped-data mode formula gives:
Mode = L1 + [Δ1 / (Δ1 + Δ2)] × i = 52.61
where,
L1 = lower class boundary of the modal class
Δ1 = difference in frequency between the modal class and the preceding class
Δ2 = difference in frequency between the modal class and the following class
i = width of the modal class
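The grouped-mode formula can be sketched as a small function. The frequencies below are hypothetical (the original table is not reproduced in this extract, so the quoted 52.61 cannot be re-derived here):

```python
def grouped_mode(l1: float, f_modal: int, f_pre: int, f_post: int, width: float) -> float:
    """Mode = L1 + [d1 / (d1 + d2)] * i for a grouped frequency distribution."""
    d1 = f_modal - f_pre   # difference with the preceding class
    d2 = f_modal - f_post  # difference with the following class
    return l1 + d1 / (d1 + d2) * width

# Hypothetical modal class 51-60 (frequency 43), neighbouring frequencies 39 and 18
print(round(grouped_mode(l1=50.5, f_modal=43, f_pre=39, f_post=18, width=10), 2))  # 51.88
```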

In order to understand the classroom teaching-learning process, which of the following research tool is most appropriate?
  • a)
    Rating Scale
  • b)
    Questionnaire
  • c)
    Observation Schedule
  • d)
    Interview Schedule
Correct answer is option 'C'. Can you explain this answer?

Rajeev Kumar answered
In order to understand the classroom teaching-learning process Observation Schedule is most appropriate.
Observation Schedule:
  • Here the data is collected based on observation
  • It can be a structured or unstructured, controlled or uncontrolled observation 
  • The observer may be a member of the observed group, or may play the role of observer only.
  • It is inexpensive
  • It is suitable for getting current information
  • Subjects are easily available
  • The work can be started or stopped at any time 
  • For example, understanding the classroom teaching-learning process
Which one of the following is a non‐parametric statistic?
  • a)
    F ‐ statistic
  • b)
    t ‐ statistic
  • c)
    Pearson's correlation
  • d)
    Spearman's correlation
Correct answer is option 'D'. Can you explain this answer?

Rajeev Kumar answered
The non-parametric approach is a statistical method that makes no assumptions about the sample's characteristics (its parameters) or whether the observed data is quantitative or qualitative.
Key Points:
  • Certain descriptive statistics, statistical models, inference, and statistical tests are examples of nonparametric statistics. 
  • The model structure of nonparametric approaches is determined from data rather than being established a priori.
  • By contrast, the normal distribution model and the linear regression model are examples of parametric methods, since they assume a specific model structure in advance.
  • Ordinal data is sometimes used in nonparametric statistics which means it does not rely on numbers but rather on a ranking or order of sorts.
  • The Spearman rank-order correlation coefficient is a nonparametric statistics measure of the strength and direction of the relationship between two variables assessed on an ordinal scale.
  • The test is used for ordinal variables or continuous data that fails to meet the assumptions required for the Pearson's product-moment correlation to be conducted.
​Thus, Spearman's correlation is a non‐parametric statistic. 
Additional Information:
  • F‐ statistic: An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. The F statistic simply compares the combined effect of all variables. 
  • t ‐ statistic: The t-value expresses the magnitude of the difference in terms of the variation in your sample data. 
  • Pearson's correlation: This correlation coefficient is a single number that measures both the strength and direction of the linear relationship between two continuous variables.
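The rank-based idea behind Spearman's correlation can be sketched with the standard library alone (a real analysis would use a statistics package; the data here are hypothetical and tie-free for simplicity):

```python
def ranks(xs):
    """Rank each value (1 = smallest); assumes no ties for simplicity."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(xs, ys):
    """Spearman's rho via the classic formula: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# A monotone but non-linear relationship: Spearman still sees a perfect +1
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]   # y = x**3
print(spearman(x, y))      # 1.0
```

Because only the order of the values matters, this works for ordinal data where Pearson's assumptions fail.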

If you want to compare the price of wheat over a period, which index will you use?
  • a)
    Volume Index
  • b)
    Aggregate Index
  • c)
    Both (1) and (2)
  • d)
    Price Index
Correct answer is option 'D'. Can you explain this answer?

Rajeev Kumar answered
Price Index:
  • Price index is an economic variable that is used to measure the price changes for commodities.
  • It helps in measuring the relative price changes, consisting of a series of numbers so that comparison can be done over a period of time.
  • It is a valuable economic measure used to check the average differences in prices.
  • It was originally developed to determine wage changes in order to see the effect on the standard of living.
  • The index is still widely used to measure the cost differences across different countries.
Additional Information
1. Volume Index:
  • A volume index is most commonly presented as a weighted average of the proportionate changes in the quantities of a specified set of goods or services between two periods of time; volume indices may also compare the relative levels of activity in different countries.
2. Aggregate Index:
  • Aggregate index is calculated by adding all elements in the composite for the given period and then dividing this result by the sum of the elements during the base period.
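The simple aggregate index described above can be sketched in a few lines, using hypothetical commodity prices (wheat among them) for the base and current periods:

```python
def aggregate_index(current: list, base: list) -> float:
    """Sum of current-period prices over sum of base-period prices, times 100."""
    return sum(current) / sum(base) * 100

base_prices    = [20, 50, 30]   # wheat, rice, sugar in the base year (hypothetical)
current_prices = [25, 55, 40]   # the same commodities in the current year

print(aggregate_index(current_prices, base_prices))  # 120.0 -> prices rose 20% overall
```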

In Data Processing, what does the abbreviation SAP stand for?
  • a)
    Systems, Applications, Products
  • b)
    Sales, Allocations, Purchases
  • c)
    Systems, Authorizations, Programs
  • d)
    Systems, Algorithms, Processes
Correct answer is option 'A'. Can you explain this answer?

Rajeev Kumar answered
Important Points
  • SAP is one of the world’s leading producers of software for the management of business data processes.
  • SAP provides “future-proof Cloud ERP solutions that will power the next generation of business”.
  • SAP can boost your organization's efficiency and productivity by automating repetitive tasks, making better use of your time, money, and resources.
Key Points
  • An SAP number is a unique six-digit number used by a municipality to identify a vendor in its system.

Comprehension:
Directions: Consider the following data and answer questions:

Which one of the following is the relative frequency in percentage for class limit 41-50 from the given data set?
  • a)
    42%
  • b)
    26%
  • c)
    16%
  • d)
    24%
Correct answer is option 'B'. Can you explain this answer?

Rajeev Kumar answered
  • Relative frequency can be defined as the number of times an event occurs divided by the total number of events occurring in a given scenario. The relative frequency formula is given as Relative Frequency = Subgroup frequency/ Total frequency.
  • Relative Frequency = f/ n*100, where, f is the number of times the data occurred in an observation. n = total frequencies.
  • Relative frequency is simply the class frequency (fi) expressed as a proportion of the total frequency (N) of a given distribution. It is sometimes expressed as a percentage of the total frequency. The sum of all relative frequencies in a given distribution is equal to 1 (or 100%).

Therefore, the frequency for the class limits 41–50 is 39, which is 26% of the total frequency of 150.
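The formula above is a one-liner per class. The frequencies are illustrative (the original table is missing); they reproduce only the quoted values f = 39 and n = 150:

```python
# Illustrative class frequencies; index 2 corresponds to the 41-50 class
frequencies = [8, 22, 39, 43, 18, 13, 7]
n = sum(frequencies)                          # total frequency

# Relative frequency in percent: f / n * 100 for each class
rel = [round(f / n * 100, 2) for f in frequencies]
print(n)       # 150
print(rel[2])  # 26.0 -> the 41-50 class (f = 39) is 26% of the total
```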

Given below are two statements:
Statement I: Use of multivariate statistics in social research has increased due to the availability of statistical software.
Statement II: Multivariate statistics are easier to comprehend as compared to the bi‐variate statistics.
In light of the above statements, choose the correct answer from the options given below:
  • a)
    Both Statement I and Statement II are true
  • b)
    Both Statement I and Statement II are false
  • c)
    Statement I is true but Statement II is false
  • d)
    Statement I is false but Statement II is true
Correct answer is option 'C'. Can you explain this answer?

Rajeev Kumar answered
Multivariate indicates that multiple variables - often several dependent variables - are analysed simultaneously to produce a single result.
Key Points
Statement I:
Use of multivariate statistics in social research has increased due to the availability of statistical software.
  • One of the most useful tools for determining links and analyzing patterns among big collections of data is multivariate analysis.
  • With the advent and proliferation of computers in the mid-1950s, multivariate analysis began to play an increasingly important role in social research.
  • Statistical software allows researchers to use the multivariate analysis approach to perform very complex statistical analyses with the help of computers.
Thus Statement I is true.
Statement II: Multivariate statistics are easier to comprehend as compared to bivariate statistics.
  • Bivariate statistics is Inferential statistics that deals with the relationship between two variables.  
  • It looks at how one variable compares to another or how one variable influences another.
  • When there are more than two variables in a data set, multivariate analysis is the more advanced form of statistical analysis required, which makes it more complex than bivariate statistics.
Thus Statement II is false. 
Therefore, Statement I is true but Statement II is false.

Comprehension:
Directions: Consider the following data and answer questions:

Which one of the following is the cumulative frequency for the class limit 61-70 from the given data set?
  • a)
    10
  • b)
    116
  • c)
    130
  • d)
    140
Correct answer is option 'C'. Can you explain this answer?

Quantronics answered
Key Points
  • Cumulative frequency analysis is the analysis of the frequency of occurrence of values of a phenomenon less than a reference value.
  • The phenomenon may be time- or space-dependent. Cumulative frequency is also called the frequency of non-exceedance.
  • Technically, the cumulative frequency of a class is the sum of its frequency and the frequencies of all classes below it in a frequency distribution. In other words, you add up each value and all of the values that came before it.
  • Therefore the cumulative frequency of the entire dataset is 150.
  • The cumulative frequency for the class limit 61 - 70 is 130.
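The running-total calculation can be sketched in Python. Since the original data table is not reproduced here, the class intervals and frequencies below are illustrative placeholders chosen so that the totals match the stated answers (130 for the 61-70 class, 150 overall):

```python
from itertools import accumulate

# Hypothetical class intervals and their frequencies (not the original data set)
intervals = ["31-40", "41-50", "51-60", "61-70", "71-80"]
frequencies = [20, 40, 50, 20, 20]  # total = 150

# Cumulative frequency: running total of the frequencies up to each class
cumulative = list(accumulate(frequencies))

for interval, cf in zip(intervals, cumulative):
    print(f"{interval}: {cf}")
# The cumulative frequency of the last class equals the total number of observations
```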

Comprehension:
Directions: Consider the following data and answer questions:

Which one of the following is the arithmetic mean value for the given data set?
  • a)
    46.87
  • b)
    52.26
  • c)
    49.22
  • d)
    51.23
Correct answer is option 'D'. Can you explain this answer?

Quantronics answered
Key Points
  • The arithmetic mean is the number obtained by dividing the sum of the elements of a set by the number of values in the set. In everyday language it is simply called the average; "arithmetic mean" and "average" mean the same thing.
  • The arithmetic mean may be either- Simple Arithmetic Mean, or Weighted Arithmetic Mean.
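Both variants can be sketched in a few lines of Python (the values and weights below are invented for illustration, since the original data table is not reproduced here):

```python
# Simple arithmetic mean: sum of the observations divided by their count
values = [40, 48, 52, 55, 61]  # illustrative values, not the original data set
simple_mean = sum(values) / len(values)

# Weighted arithmetic mean: each value is weighted, e.g. by its class frequency
weights = [2, 3, 5, 3, 2]
weighted_mean = sum(v * w for v, w in zip(values, weights)) / sum(weights)

print(simple_mean)    # 51.2
print(weighted_mean)  # 51.4
```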

Which one of the following frequency distribution is negatively skewed?
  • a)
    A
  • b)
    B
  • c)
    C
  • d)
    D
Correct answer is option 'D'. Can you explain this answer?

Quantronics answered
Key Points:
  • A negatively skewed distribution is the distribution composed of mostly large observations and a few relatively small observations. For example, a test score on a very easy test of some general education courses will probably follow a negatively skewed distribution. A negatively skewed (also known as left-skewed) distribution is a type of distribution in which more values are concentrated on the right side (tail) of the distribution graph while the left tail of the distribution graph is longer.
  • Most of the students will do very well, but a few students may never come to class and will score very poorly, so test scores will be composed of many high scores and a few relatively very low scores.
  • In this case, Median should be used instead of mean since it is not influenced by a few relatively very low scores.
  • In a negatively skewed distribution the mean is less than the median, since the mean is pulled down by the few relatively very low scores.
Important Points:
  • In a positively skewed distribution, the mean is greater than the median: most of the data lies towards the lower side, and the mean, being the average of all the values, is pulled up by the few large values, whereas the median is simply the middle value of the data.
Additional Information:
  • A positively skewed distribution is the distribution that is composed of mostly small observations and a few relatively or extremely large observations. For example, test scores on a very difficult exam probably follow a positively skewed distribution.
  • This is because most students will probably score quite low and only a few will score high. To this day I remember my biochemistry mid-term at the University of Calcutta: out of over 120 students, 115 scored 20-50 out of a hundred, and fewer than 5 scored 70-80. For this skewed distribution, the median is the one to use, since it is not influenced by a few relatively very high scores.
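The mean-median relationship described above can be checked with Python's `statistics` module (the test scores below are invented to illustrate a negatively skewed distribution: mostly high values with a few very low ones):

```python
import statistics

# Invented negatively skewed test scores: a long left tail of low values
scores = [12, 18, 70, 72, 75, 78, 80, 82, 85, 88]

mean = statistics.mean(scores)
median = statistics.median(scores)

print(mean, median)
# In a negatively skewed distribution the mean falls below the median,
# because the few very low scores drag the mean down
assert mean < median
```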

Which of the following is a data visualization method?
  • a)
    Line
  • b)
    Circle and Triangle
  • c)
    Pie chart and Bar chart
  • d)
    Pentagon
Correct answer is option 'C'. Can you explain this answer?

Pie charts and bar charts are both data visualization methods commonly used to represent data visually.

Pie Chart:
A pie chart is a circular graph divided into sectors, where each sector represents a proportion or percentage of a whole. It is most useful when comparing the parts of a whole, showing how each part contributes to the total. The size of each sector corresponds to the relative magnitude of the data it represents. The sectors are typically labeled with categories or values, allowing viewers to easily interpret the information presented. Pie charts are especially effective for showing simple proportions and percentages.

Bar Chart:
A bar chart, also known as a bar graph, uses rectangular bars of varying heights or lengths to represent data. The length or height of each bar corresponds to the quantity or value it represents. Bar charts are commonly used to compare different categories or groups, showing the relationship or distribution between them. They are effective in displaying categorical data and can be used to show trends over time or comparisons between different variables. The bars can be vertical or horizontal, depending on the orientation of the chart.

Comparison:
Both pie charts and bar charts are effective in visually representing data, but they have different strengths and use cases.

- Pie charts are best suited for showing proportions or percentages, as they provide a clear visual representation of how each part contributes to the whole. They are particularly useful when comparing a few categories or parts.
- Bar charts, on the other hand, are suitable for comparing multiple categories or groups. They are versatile and can be used to display a wide range of data, including quantitative, categorical, or time-based information. Bar charts are often used to show trends, patterns, or comparisons between different variables.

In conclusion, both pie charts and bar charts are effective data visualization methods, but they serve different purposes. Pie charts are best for showing proportions, while bar charts are more versatile and can be used for various types of data comparisons.
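A minimal sketch of both ideas in Python, using invented category counts. (Real charts would normally be drawn with a plotting library such as matplotlib; to stay self-contained, this sketch only computes the pie-chart shares and prints text-mode bars.)

```python
# Invented category counts for illustration
data = {"Red": 30, "Blue": 45, "Green": 15, "Yellow": 10}
total = sum(data.values())

# Pie-chart view: each category as a share (percentage) of the whole
shares = {k: v / total * 100 for k, v in data.items()}
for category, pct in shares.items():
    print(f"{category}: {pct:.0f}% of the whole")

# Bar-chart view: each category as a bar whose length encodes its count
for category, count in data.items():
    print(f"{category:7s} {'#' * count}")
```

The shares always sum to 100%, which is exactly the "parts of a whole" property that makes pie charts appropriate for proportions.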

If data has been recorded using technical media, which among the following is a necessary step on the way to its interpretation?
  • a)
    Transcription
  • b)
    Structural Equation Modelling
  • c)
    Sequential Analysis
  • d)
    Sampling
Correct answer is option 'A'. Can you explain this answer?

Rajeev Kumar answered
Transcription:
  • It is the process of converting recorded audio into text. If data has been recorded using technical media, transcription is needed before the data can be interpreted.
  • The most common type of transcription converts a recorded computer file into text, or into some other form suitable for printing.
  • Transcription used to be a laborious job, because secretaries had to write down speech using techniques such as shorthand. The work has become easier as new possibilities have emerged with the advancement of technology.
  • With the birth of modern technology, transcription has become much easier, for example through speech recognition (which we can now see in our mobile phones as well).
Note:
  • Structural equation modelling (SEM) techniques are considered today to be a major component of applied multivariate statistical analysis and are used by education researchers, economists, marketing researchers, medical researchers, and a variety of other social and behavioural scientists.
  • Sequential analysis is a statistical analysis in which the sample size is not fixed in advance. Instead, data are evaluated as they are collected; there are no predetermined sampling rules, and everything is done as and when samples are collected and evaluated.
  • Sampling has been an age-old practice in everyday life. Whenever we want to buy a huge quantity of a commodity, we judge the total lot by simply examining a small fraction of it. It has been established that a sample survey, if planned properly, can give very precise results.

Hence, if data has been recorded using technical media, transcription is a necessary step on the way to its interpretation.

A statistical measure that indicates the extent to which changes in one factor are accompanied by changes in another 
  • a)
    Standard deviation
  • b)
    Correlation coefficient
  • c)
    Quartile deviation
  • d)
    Range
Correct answer is option 'B'. Can you explain this answer?

answered
Correlation coefficient: If a change in one variable appears to be accompanied by a change in the other variable, the two variables are said to be correlated, and this inter-variation is called correlation.
  • For example, if you want to study the relationship between height and weight - whether a change in one will bring a change in the other or not - or the relationship between hours of study and achievement, sex and enrolment, etc., you can do so by finding the correlation between them.
  • The degree of association or the degree of relationship between two variables is measured quantitatively in the form of an index, which is termed the coefficient of correlation.
  • The coefficient of correlation is a single number that tells us to what extent the two variables are related and to what extent the variations in one variable change with the variations in the other.
Hence, we conclude that the above statement is about the Correlation coefficient.

Chapter doubts & questions for Data Analysis - The Complete SAT Course 2026 is part of Class 10 exam preparation. The chapters have been prepared according to the Class 10 exam syllabus. The Chapter doubts & questions, notes, tests & MCQs are made for Class 10 2026 Exam. Find important definitions, questions, notes, meanings, examples, exercises, MCQs and online tests here.