How many calories did each of us eat for breakfast? How far from home did everyone travel today? How big is the place that we call home? How many other people call it home? To make sense of all of this information, certain tools and ways of thinking are necessary. The mathematical science called statistics is what helps us to deal with this information overload.
Statistics is the study of numerical information, called data.
Statisticians acquire, organize, and analyze data. Each part of this process is also scrutinized. The techniques of statistics are applied to a multitude of other areas of knowledge. Below is an introduction to some of the main topics throughout statistics.
One of the recurring themes of statistics is that we are able to say something about a large group based on the study of a relatively small portion of that group. The group as a whole is known as the population. The portion of the group that we study is the sample.
As an example of this, suppose we wanted to know the average height of people living in the United States. We could try to measure over 300 million people, but this would be infeasible. It would be a logistical nightmare to conduct the measurements in such a way that no one was missed and no one was counted twice.
Because measuring everyone in the United States is impractical, we could instead use statistics.
Rather than finding the heights of everyone in the population, we take a statistical sample of a few thousand. If we have sampled the population correctly, then the average height of the sample will be very close to the average height of the population.
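To make the idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the population is simulated rather than measured, and the mean of 170 cm, the spread of 10 cm, and the population and sample sizes are made-up values.

```python
import random
import statistics


def sample_mean_demo(population_size=200_000, sample_size=3_000, seed=0):
    """Compare the mean of a simulated population with the mean of a random sample."""
    rng = random.Random(seed)
    # Hypothetical population of heights in cm (made-up mean and spread).
    population = [rng.gauss(170, 10) for _ in range(population_size)]
    # Simple random sample: every member is equally likely to be chosen.
    sample = rng.sample(population, sample_size)
    return statistics.fmean(population), statistics.fmean(sample)


population_mean, sample_mean = sample_mean_demo()
print(f"population mean: {population_mean:.2f} cm")
print(f"sample mean:     {sample_mean:.2f} cm")
```

With a sample of a few thousand, the two printed values should differ by only a fraction of a centimeter, even though the sample is a small part of the simulated population.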
To draw good conclusions, we need good data to work with.
The way that we sample a population to obtain this data should always be scrutinized. Which kind of sample we use depends on what question we’re asking about the population. The most commonly used samples are:
It’s equally important to know how the measurement of the sample is conducted. To go back to the above example, how do we acquire the heights of those in our sample?
Each of these ways of obtaining the data has its advantages and drawbacks. Anyone using the data from this study would want to know how it was obtained.
Sometimes there is a multitude of data, and we can literally get lost in all of the details. It’s hard to see the forest for the trees. That’s why it’s important to keep our data well organized. Careful organization and graphical displays of the data help us to spot patterns and trends before we actually do any calculations.
The way that we graphically present our data depends upon a variety of factors.
Common graphs are:
In addition to these well-known graphs, there are others that are used in specialized situations.
One way to analyze data is called descriptive statistics. Here the goal is to calculate quantities that describe our data. Numbers called the mean, median and mode are all used to indicate the average or center of the data. The range and standard deviation are used to say how spread out the data is. More complicated techniques, such as correlation and regression, describe data that is paired.
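The quantities named above can be computed with Python's standard library. This is only a sketch: the heights and weights are made-up values, and `pearson_r` is a hypothetical helper written here to show one common measure of correlation for paired data.

```python
import math
import statistics

# Made-up sample of heights in cm, for illustration only.
heights_cm = [160, 165, 165, 170, 172, 175, 181]

mean = statistics.mean(heights_cm)           # arithmetic average
median = statistics.median(heights_cm)       # middle value of the sorted data
mode = statistics.mode(heights_cm)           # most frequent value
data_range = max(heights_cm) - min(heights_cm)
std_dev = statistics.stdev(heights_cm)       # sample standard deviation


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data (hypothetical helper)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariance_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance_sum / (spread_x * spread_y)


# Made-up weights in kg, paired one-to-one with the heights above.
weights_kg = [55, 62, 60, 68, 70, 74, 83]

print(f"mean={mean:.1f} median={median} mode={mode}")
print(f"range={data_range} standard deviation={std_dev:.2f}")
print(f"correlation={pearson_r(heights_cm, weights_kg):.3f}")
```

A correlation near 1 means the paired values tend to rise together, a value near -1 means one tends to fall as the other rises, and a value near 0 indicates little linear relationship.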
When we begin with a sample and then try to infer something about the population, we are using inferential statistics. In working with this area of statistics, the topic of hypothesis testing arises.
Here we see the scientific nature of the subject of statistics: we state a hypothesis, then use statistical tools with our sample to decide whether the evidence is strong enough to reject that hypothesis. This explanation just scratches the surface of this very useful part of statistics.
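As one possible illustration, the sketch below runs a two-sided one-sample z-test, which is only one of many hypothesis tests and is chosen here for simplicity. The hypothesized mean of 170 cm, the simulated sample, and the 0.05 significance level are all assumptions made for the example, and `one_sample_z_test` is a hypothetical helper.

```python
import math
import random
import statistics
from statistics import NormalDist


def one_sample_z_test(sample, hypothesized_mean):
    """Return the z statistic and two-sided p-value for a large-sample z-test."""
    n = len(sample)
    sample_mean = statistics.fmean(sample)
    # Estimated standard error of the sample mean.
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    z = (sample_mean - hypothesized_mean) / standard_error
    # Probability of a result at least this extreme if the hypothesis were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Simulated sample of 400 heights in cm (made-up parameters).
rng = random.Random(1)
sample = [rng.gauss(172, 10) for _ in range(400)]

z, p_value = one_sample_z_test(sample, hypothesized_mean=170)
alpha = 0.05  # assumed significance level
decision = "reject" if p_value < alpha else "fail to reject"
print(f"z = {z:.2f}, p-value = {p_value:.4f}: {decision} the hypothesis")
```

A small p-value means that a sample like this one would be unusual if the hypothesis were true, which counts as evidence against it. It is not the probability that the hypothesis itself is true.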
It is no exaggeration to say that the tools of statistics are used by nearly every field of scientific research. Here are a few areas that rely heavily on statistics:
Although some think of statistics as a branch of mathematics, it is better to think of it as a discipline that is founded upon mathematics. Specifically, statistics is built up from the field of mathematics known as probability. Probability gives us a way to determine how likely an event is to occur. It also gives us a way to talk about randomness. This is key to statistics because the typical sample needs to be randomly selected from the population.
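A small worked example may help show what "how likely an event is" means in practice. The sketch below uses an event chosen purely for illustration, rolling a total of 7 with two fair dice. It computes the exact probability by counting outcomes and then estimates the same probability by random simulation; the number of trials and the seed are arbitrary choices.

```python
import random
from fractions import Fraction
from itertools import product

# Exact probability: count the favorable outcomes among all 36 equally likely rolls.
outcomes = list(product(range(1, 7), repeat=2))
favorable = [roll for roll in outcomes if sum(roll) == 7]
exact = Fraction(len(favorable), len(outcomes))


def estimate_probability(trials=100_000, seed=0):
    """Estimate the same probability by simulating random rolls of two dice."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(trials)
        if rng.randint(1, 6) + rng.randint(1, 6) == 7
    )
    return hits / trials


estimate = estimate_probability()
print(f"exact probability:   {exact} (about {float(exact):.4f})")
print(f"simulated estimate:  {estimate:.4f}")
```

The simulated estimate lands close to the exact value because the rolls are random and numerous, which is the same reason a randomly selected sample can stand in for a population.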
Probability was first studied in the 1600s by mathematicians such as Pascal and Fermat. The 1700s marked the beginning of statistics. Statistics continued to grow from its probability roots and really took off in the 1800s. Today its theoretical scope continues to be enlarged in what is known as mathematical statistics.