Part C-Analysis of Data
Statistics is a numerical representation of information. Whenever we quantify or apply numbers to data in order to organise, summarise or better understand the information, we are using statistical methods.
Data analysis involves inspecting, cleaning, transforming and presenting data with the goal of discovering useful information.
The credibility of findings and conclusions depends significantly on the quality of the research design, data collection, data management and data analysis. Data analysis is the process of systematically applying statistical and/or logical techniques to describe and illustrate, condense and recap, and evaluate data. According to Koul (1997), "the analysis of the qualitative data means studying the organised material, in order to discover inherent facts. These data are to be studied from as many angles as possible, either to explore new facts or to interpret already existing known facts".
There are many different data analysis methods, depending on the type of research. The data, however, may be classified into two broad categories:
• quantitative
• qualitative
In quantitative data analysis, one is expected to turn raw numbers into meaningful data through the application of rational and critical thinking. It may include the calculation of frequencies of variables and of differences between variables. A quantitative approach is usually associated with finding evidence to either support or reject a hypothesis that the researchers formulated at an earlier stage of the research process.
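The calculation of frequencies mentioned above can be sketched in a few lines. This is only an illustration: the variable `books_read` and its values are hypothetical and are not taken from any study.

```python
from collections import Counter

# Hypothetical raw data: number of books read last month by ten respondents.
books_read = [2, 0, 3, 2, 1, 2, 0, 3, 2, 1]

# Absolute frequency of each value of the variable.
frequencies = Counter(books_read)

# Relative frequency (share of respondents) for each value, in sorted order.
total = len(books_read)
relative = {value: count / total for value, count in sorted(frequencies.items())}

for value, share in relative.items():
    print(f"{value} books: {frequencies[value]} respondents ({share:.0%})")
```

The frequency table turns ten raw numbers into a summary from which patterns, such as the most common value, can be read directly.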
Qualitative data analysis is the range of processes and procedures, whereby we move from the qualitative data that have been collected into some form of explanation, understanding or interpretation of the people and situations we are investigating.
The level of measurement refers to the relationship among the values that are assigned to the attributes of a variable. It is important to understand the level of measurement for two reasons. First, it helps you to decide how to interpret the data from the variable concerned. Second, it helps you to decide which statistical techniques of data analysis are appropriate for the numerical values that were assigned to the variables. The scale of measurement refers to the ways in which variables or numbers are defined and categorised. Each scale of measurement has certain properties, which in turn determine its appropriateness for use in certain statistical analyses.
Before beginning the analysis, a researcher must identify the level of measurement associated with the quantitative data. The level of measurement can influence the type of analysis one can use.
There are four levels of measurement scale: nominal, ordinal, interval and ratio.
Put simply, a nominal scale is a system of assigning number symbols to events in order to label them. A familiar example is the assignment of numbers to basketball players in order to identify them. Such numbers cannot be considered to be associated with an ordered scale, for their order is of no consequence; the numbers are just convenient labels for the particular class of events and, as such, have no quantitative value. Nominal scales provide convenient ways of keeping track of people, objects and events, but one cannot do much with the numbers involved. For example, one cannot usefully average the numbers on the backs of a group of football players and come up with a meaningful value.
Neither can one usefully compare the numbers assigned to one group with the numbers assigned to another. Counting the members in each group is the only possible arithmetic operation when a nominal scale is employed. Accordingly, we are restricted to using the mode as the measure of central tendency. There is no generally used measure of dispersion for nominal scales. The chi-square test is the most common test of statistical significance that can be utilised, and for measures of correlation, the contingency coefficient can be worked out.
The nominal scale is the least powerful level of measurement. It indicates no order or distance relationship and has no arithmetic origin. It simply describes differences between things by assigning them to categories. The scale wastes any information that we may have about varying degrees of attitude, skills, understanding, etc. In spite of all this, nominal scales are still very useful and are widely used in surveys and other ex-post facto research, when data are being classified by major sub-groups of the population. Thus, nominal data are counted data.
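The operations that the text allows for a nominal scale (counting, the mode, the chi-square statistic and the contingency coefficient) can be sketched as follows. The jersey numbers and the 2x2 table are hypothetical, and the coefficient is computed with Pearson's formula C = sqrt(chi² / (chi² + N)), which is assumed here to be the contingency coefficient the text has in mind.

```python
from collections import Counter
from math import sqrt
from statistics import mode

# Hypothetical nominal data: jersey numbers serve only as labels.
jerseys = [7, 10, 7, 23, 10, 7]
counts = Counter(jerseys)       # counting is the only meaningful arithmetic
most_common = mode(jerseys)     # the mode is the permitted central tendency

# Hypothetical 2x2 contingency table: rows = group A/B, columns = yes/no.
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (obs - expected) ** 2 / expected

# Pearson's contingency coefficient (assumed formula, see lead-in).
contingency_coefficient = sqrt(chi_square / (chi_square + n))

print(most_common, round(chi_square, 3), round(contingency_coefficient, 3))
```

Note that no step averages or orders the labels themselves; every calculation works on category counts.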
The lowest level of the ordered scale that is commonly used is the ordinal scale. It places events in order, but there is no attempt to make the intervals of the scale equal in terms of some rule. Rank orders represent ordinal scales and are frequently used in research relating to qualitative phenomena. A student's rank in the graduating class involves the use of an ordinal scale. One has to be very careful in making statements about scores based on an ordinal scale. For instance, if Ram's position in his class is 10 and Mohan's position is 40, it cannot be said that Ram's position is four times as good as that of Mohan. The statement would make no sense at all. Ordinal scales only permit the ranking of items from highest to lowest. Ordinal measures have no absolute values, and the real differences between adjacent ranks may not be equal. All that can be said is that one person is higher or lower on the scale than another; more precise comparisons cannot be made.
Thus, the use of an ordinal scale implies a statement of greater than or less than (an equality statement is also acceptable) without our being able to state how much greater or less. The real difference between ranks 1 and 2 may be more or less than the difference between ranks 5 and 6. Since the numbers of this scale have only a rank meaning, the appropriate measure of central tendency is the median. A percentile or quartile measure can be used for measuring dispersion. Correlations are restricted to various rank order methods, whereas measures of statistical significance are restricted to the non-parametric methods.
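The measures named for ordinal data (the median, quartiles and a rank order correlation) can be sketched as below. All values are hypothetical. Spearman's rank correlation is used as one example of a rank order method, computed with the shortcut formula 1 - 6Σd² / (n(n² - 1)), which is valid only when there are no tied ranks.

```python
from statistics import median, quantiles

# Hypothetical ordinal data: satisfaction ratings on a 1-5 scale.
ratings = [1, 2, 2, 3, 3, 4, 4, 5, 5]
middle_rating = median(ratings)           # central tendency for ordinal data
q1, q2, q3 = quantiles(ratings, n=4)      # quartile cut points for dispersion

# Hypothetical class ranks of eight students in two subjects (1 = best, no ties).
rank_maths = [1, 2, 3, 4, 5, 6, 7, 8]
rank_science = [2, 1, 4, 3, 6, 5, 8, 7]

# Spearman's rank correlation via the no-ties shortcut formula.
n = len(rank_maths)
d_squared = sum((a - b) ** 2 for a, b in zip(rank_maths, rank_science))
spearman_rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))

print(middle_rating, round(spearman_rho, 3))
```

Only the order of the values enters these calculations; the sizes of the gaps between adjacent ranks are never assumed to be equal.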
In an interval scale, the intervals are adjusted in terms of some rule that has been established as a basis for making the units equal. The units are equal only in so far as one accepts the assumptions on which the rule is based. Interval scales can have an arbitrary zero, but it is not possible to determine for them what may be called an absolute zero or unique origin.
The primary limitation of the interval scale is the lack of a true zero; it does not have the capacity to measure the complete absence of a trait or characteristic.
The Fahrenheit scale is an example of an interval scale and shows what one can and cannot do with it. One can say that an increase in temperature from 30° to 40° involves the same increase in temperature as an increase from 60° to 70°, but one cannot say that a temperature of 60° is twice as warm as a temperature of 30°, because both numbers depend on the fact that the zero on the scale is set arbitrarily and does not represent a complete absence of heat. The ratio of the two temperatures, 30° and 60°, means nothing because zero is an arbitrary point.
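One way to see why ratios are meaningless on an interval scale is to convert the same temperatures to Celsius, a second scale with a different arbitrary zero. The sketch below uses the standard conversion C = (F - 32) × 5/9; the helper name `fahrenheit_to_celsius` is chosen here for illustration.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9

# The ratio of two readings changes when the arbitrary zero changes.
f_low, f_high = 30.0, 60.0
c_low = fahrenheit_to_celsius(f_low)
c_high = fahrenheit_to_celsius(f_high)

ratio_f = f_high / f_low    # looks like "twice as warm" in Fahrenheit
ratio_c = c_high / c_low    # a completely different value in Celsius

# Comparisons of intervals, by contrast, survive the change of scale:
# two 10-degree Fahrenheit intervals remain equal to each other in Celsius.
interval_ratio_f = (40.0 - 30.0) / (70.0 - 60.0)
interval_ratio_c = (
    (fahrenheit_to_celsius(40.0) - fahrenheit_to_celsius(30.0))
    / (fahrenheit_to_celsius(70.0) - fahrenheit_to_celsius(60.0))
)

print(ratio_f, round(ratio_c, 2), interval_ratio_f, round(interval_ratio_c, 2))
```

Since 30 °F lies below the Celsius zero, the Celsius ratio even comes out negative, while the equality of the two intervals is unchanged.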
Interval scales provide more powerful measurement than ordinal scales, for the interval scale also incorporates the concept of equality of interval. As such, more powerful statistical measures can be used with interval scales. The mean is the appropriate measure of central tendency, while the standard deviation is the most widely used measure of dispersion. Product moment correlation techniques are appropriate, and the generally used tests for statistical significance are the t-test and the F-test.
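The mean, the standard deviation and the product moment (Pearson) correlation can be sketched for interval data as follows. The two temperature series are hypothetical, and the correlation is computed by hand from the sample covariance rather than with a library helper.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical interval data: temperatures (°C) at two sites on five days.
site_a = [10.0, 12.0, 14.0, 16.0, 18.0]
site_b = [11.0, 12.0, 15.0, 15.0, 19.0]

# Mean as central tendency, sample standard deviation as dispersion.
mean_a, mean_b = mean(site_a), mean(site_b)
sd_a, sd_b = stdev(site_a), stdev(site_b)

# Sample covariance (divisor n - 1, matching stdev above).
n = len(site_a)
covariance = sum((a - mean_a) * (b - mean_b) for a, b in zip(site_a, site_b)) / (n - 1)

# Pearson product moment correlation coefficient.
pearson_r = covariance / (sd_a * sd_b)

print(mean_a, round(sd_a, 3), round(pearson_r, 3))
```

These statistics rely on differences between values being comparable, which is exactly the property that interval scales add over ordinal scales.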
Ratio scales have an absolute or true zero of measurement. The term absolute zero is not as precise as it was once believed to be. We can conceive of an absolute zero of length, and similarly we can conceive of an absolute zero of time. For example, the zero point on a centimetre scale indicates the complete absence of length or height. But an absolute zero of temperature is theoretically unobtainable, and it remains a concept existing only in the scientist's mind.
The number of minor traffic rule violations and the number of incorrect letters in a page of typescript represent scores on ratio scales. Both these scales have absolute zeros and, as such, all minor traffic violations and all typing errors can be assumed to be equal in significance. With ratio scales, one can make a statement like "Jyoti's typing performance was twice as good as that of Reetu". The ratio involved does have significance and facilitates a kind of comparison that is not possible in the case of an interval scale.
A ratio scale represents the actual amounts of variables. Examples are measures of physical dimensions, such as weight, height, distance, etc. Generally, all statistical techniques are usable with ratio scales, and all manipulations that one can carry out with real numbers can also be carried out with ratio scale values. Multiplication and division can be used with this scale, but not with the other scales mentioned above. Geometric and harmonic means can be used as measures of central tendency, and coefficients of variation may also be calculated.
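The additional measures available for ratio data can be sketched with Python's standard `statistics` module. The weights are hypothetical, and the kilogram-to-pound factor is rounded; it is included only to show that ratios survive a change of unit when the zero point is a true zero.

```python
from statistics import geometric_mean, harmonic_mean, mean, stdev

# Hypothetical ratio data: body weights in kilograms.
weights_kg = [50.0, 60.0, 80.0, 100.0]

# Ratios are meaningful because zero kilograms means no weight at all.
ratio_kg = weights_kg[3] / weights_kg[0]

# The same comparison in pounds (approximate factor) gives the same ratio.
KG_TO_LB = 2.20462
weights_lb = [w * KG_TO_LB for w in weights_kg]
ratio_lb = weights_lb[3] / weights_lb[0]

# Means that require a true zero, plus the ordinary arithmetic mean.
gm = geometric_mean(weights_kg)
hm = harmonic_mean(weights_kg)
am = mean(weights_kg)

# Coefficient of variation: standard deviation relative to the mean.
coefficient_of_variation = stdev(weights_kg) / am

print(ratio_kg, round(gm, 2), round(hm, 2), round(coefficient_of_variation, 3))
```

For positive data the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean, so the three values give a quick consistency check.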
Thus, proceeding from the nominal scale (the least precise type of scale) to the ratio scale (the most precise), increasingly relevant information is obtained. If the nature of the variables permits, the researcher should use the scale that provides the most precise description. Researchers in the physical sciences have the advantage of being able to describe variables in ratio scale form, but the behavioural sciences are generally limited to describing variables in interval scale form, a less precise type of measurement.
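The progression from nominal to ratio scale can be summarised as a small lookup of the central tendency measures that the text names for each level. The function below is a hypothetical helper, and it assumes that each level also permits the measures of the levels beneath it, which is the usual convention but is not spelled out in the text.

```python
# Levels of measurement, ordered from least to most precise.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Measures of central tendency named in the text for each level.
CENTRAL_TENDENCY = {
    "nominal": ["mode"],
    "ordinal": ["median"],
    "interval": ["mean"],
    "ratio": ["geometric mean", "harmonic mean"],
}


def permitted_central_tendency(level):
    """Return the measures for `level` and every lower level (assumed cumulative)."""
    position = LEVELS.index(level)  # raises ValueError for an unknown level
    measures = []
    for lower in LEVELS[: position + 1]:
        measures.extend(CENTRAL_TENDENCY[lower])
    return measures


print(permitted_central_tendency("ordinal"))
print(permitted_central_tendency("ratio"))
```

A lookup like this makes the practical point of the section concrete: identifying the level of measurement first determines which summaries are defensible.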