Introduction to Confidence Interval Estimates
© 2015 Thomas G. Groleau
(revised March 2018)

Introduction

Here is one of the most important concepts in statistics: Interval estimate = point estimate ± margin of error.

What's a point estimate? A "statistic" is something calculated from sample data such as a sample mean or sample proportion. If we use that value to estimate a characteristic of a population (also called a parameter), then it becomes a point estimate.

For example, suppose I want to know the mean annual income of all CPAs in the state of Ohio. If I gather a representative sample of 75 CPAs and find their mean annual income is \$65,730, then I shouldn't expect the mean annual income of all CPAs to be exactly \$65,730. Of all the numbers in the universe, this is just the point on the number line that I would use as my estimate. I would hope that this estimate is 'close' to the parameter for all CPAs but I need a margin of error to know how close. I also need something called a confidence level, but we'll worry about that later.

Estimating Proportions

Read this article. It makes many data-based claims that could be discussed but we'll focus on the headline and the first two paragraphs.

The headline says "Almost Half of Americans Aren't Saving Nearly Enough". That sounds like a lot of people, but what do they mean by "almost half"? What is meant by "Americans"? We get the first answer in the first paragraph: "A reported 46 percent of the country's consumers store less than 5 percent of their annual incomes into longer-term savings." That's more specific. It's not just "almost half", it's 46%.

But how did they get that number? They didn't ask me. Did they ask you? If they didn't ask all "Americans" then how can they claim to know anything about all of them?

We get some answers in the second paragraph: "The study, which surveyed 1,000 adults living in the continental U.S." This answers a lot of questions. "Americans" in the headline is a poorly defined term. Anyone living in North, Central or South America could legitimately claim to be an "American". This study restricts the term to "adults" (another undefined term) living in the continental U.S. (sorry Alaskans and Hawaiians).

The headline isn't necessarily wrong, it just isn't very specific. Now that we know a bit more about the study we can make a much clearer statement: Of 1,000 adults surveyed, 46% claim to store less than 5% of their annual incomes in longer-term savings. Do you think it's reasonable to survey 1,000 U.S. adults and apply the results to ALL U.S. adults?

Read the following articles (Marijuana Use, Race Relations, Gluten-Free) and answer the following questions for each. You might need to dig into a link or two from the original article.

1. What is the question? In other words, to get counted in the "yes" percent, what had to be true about the survey respondent?
• If there seem to be competing "yes" groups just select the one you are most interested in.
2. How large was the sample (i.e. how many people were surveyed)?
3. What population are they trying to apply the results to?
4. Does the article say anything about a margin of error? If so, what is it?
5. Do you think there are still ill-defined terms? If so, which ones?

Here are the answers for the savings article.

1. The respondent had to store less than 5% of their annual income in longer-term savings.
2. One thousand people.
3. All adults in the United States.
4. No mention of margin of error.
5. In this case, no definition is provided for "adult" or "longer-term savings".

See if you can answer these for the other articles before reading further.

Before we dig back into the first article or the other examples, let's look at a much simpler example. Suppose I have 150 poker chips. That's all I have. The entire universe (or population) is 150 poker chips. Of these, 69 are blue. That's 46%, almost half, of the entire population.

Now suppose that you don't know how many were blue. You just have the bag of chips and you don't have time to count them all. You're going to randomly select 20 of them. What percent do you think will be blue? (Hint: it will NOT be 46%.)

If you have time, you could get some poker chips and try it a few times. For now, we'll let a computer simulate what might happen. You can download the spreadsheet file here and run a few simulations. The file has VBA macros and it will not work until you enable the macros. Watch this video before reading any further or trying the simulation yourself.
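If you can't open the macro-enabled spreadsheet, the same draw-20-of-150 experiment can be sketched in a few lines of Python. The chip counts come from the example above; the seed is an arbitrary choice so the run is repeatable.

```python
import random

# A bag of 150 chips, 69 of them blue (46%), as in the example above.
random.seed(1)  # fixed seed so the run is reproducible
bag = ["blue"] * 69 + ["other"] * 81

# Draw 20 chips at random (without replacement) several times and see
# how the sample proportion of blue chips bounces around.
for trial in range(5):
    sample = random.sample(bag, 20)
    p_hat = sample.count("blue") / 20  # sample proportion of blue chips
    print(f"trial {trial + 1}: p-hat = {p_hat:.2f}")
```

Notice that the proportion of blue chips in the bag never changes, yet each draw of 20 gives a different sample proportion.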

Note two things. First, the number of blue chips in the bag never changed. Therefore the 46% was constant. This is called the population proportion (notated p or π depending on the textbook). Second, our answers varied. These were sample statistics, specifically sample proportions (notated p̂). These p̂'s are part of a sampling distribution. The sampling distribution of sample proportions is the distribution of ALL possible p̂'s when randomly selecting 20 out of 150 of these chips.

There are many places to read about sampling distribution theory but we want to get to the main point as quickly as possible. The p̂'s follow an approximate normal distribution, which means that about 95% of them will be within 2 standard deviations of the mean. If we know the mean and standard deviation of that distribution, then we can predict a range of values (an interval) where 95% of the p̂'s will occur.

In this case, the mean of all p̂'s is 0.46 and the standard deviation is 0.111 (rounded). This isn't magic. Something called the Central Limit Theorem tells us that, under certain circumstances, sampling distributions for p̂'s will always follow an approximate normal distribution with mean π and standard deviation √(π(1−π)/n). This standard deviation has a special name. It's called a "standard error". The sample statistic is used to estimate a population parameter. It's rare that this estimate will be exactly correct. In other words, there's usually an "error" in the estimate. This standard deviation measures the variability of that error. Thus the name "standard error".

Using the mean and standard deviation for this example, 95% of the p̂'s should be between 0.242 and 0.678. Before we figure out where we got those numbers, let's look back at the simulation spreadsheet. Look at yours and see if this is close to what happened. Here's a screen shot from mine showing that 96.6% of the p̂'s were between those two values. That's pretty close to the predicted 95%.

Now let's see where those numbers came from. The empirical rule tells us that "about" 95% of values in a normal distribution land within 2 standard deviations of the mean. If we want to be more specific, it's really 1.96 standard deviations. Feel free to use 2 for a quick calculation but we'll use 1.96 when a computer does the number crunching for us. This means our interval is calculated by:

π ± 1.96 × √(π(1−π)/n)

The title of this document mentions "confidence interval estimates". We need to cover three points before we finally compute a confidence interval estimate.
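The endpoints above can be checked with a short computation (Python here, though the original uses a spreadsheet), using the chip example's population proportion and sample size:

```python
import math

n, pi = 20, 0.46                           # draws per sample, population proportion
se = math.sqrt(pi * (1 - pi) / n)          # standard error of the sample proportion
low, high = pi - 1.96 * se, pi + 1.96 * se
print(f"standard error = {se:.3f}")                       # 0.111
print(f"95% of p-hats between {low:.3f} and {high:.3f}")  # 0.242 and 0.678
```

This reproduces the 0.242-to-0.678 interval quoted above.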

First Point: the formulas above use π. In the real world we won't know this value. Think about the articles you read. If they already KNEW the proportion of a population then they wouldn't do a survey to estimate it. Therefore, we're going to swap p̂ for π in the formulas. That seems kind of like cheating but it works under the right conditions.

Second Point: the conditions. 1) Your data needs to be a random sample. 2) The size of your sample should be less than 10% of the entire population. 3) There should be at least 5 respondents in your "yes" group and at least 5 in your "no" group.

Third Point: confidence. Since 95% of all p̂'s should land in the interval above, we say that we're "95% confident" that π (which is unknown) is in the interval we computed.

Finally, here it is: a 95% confidence interval estimate for a population parameter π is

p̂ ± 1.96 × √(p̂(1−p̂)/n)

A 95% level of confidence is the most common level used. However, if you want a different level of confidence, you would use a different value for 1.96 (which, by the way, is called a critical value). A computer will usually compute these intervals for us but here's a short table showing common confidence levels and the appropriate critical values.

Confidence    Critical Value
90%           1.645
95%           1.960
99%           2.576
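If you're curious where these critical values come from, a computer can produce them from the standard normal distribution. A sketch in Python, using the standard library:

```python
from statistics import NormalDist

# A critical value cuts off (1 - C)/2 in each tail of the standard
# normal distribution, where C is the confidence level.
for conf in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    print(f"{conf:.0%} confidence -> critical value {z:.3f}")
# prints 1.645, 1.960, 2.576
```

In Excel, the NORM.S.INV function plays the same role.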

Now let's go back to the original article. Of 1,000 people, 46% met the researchers' "yes". They put less than 5% of their annual income into longer-term savings. Forty-six percent is a sample statistic, p̂, not a population parameter. What does this tell us about ALL U.S. adults? Based on this survey, we don't KNOW anything. However, based on this survey we are 95% confident that somewhere between 42.9% and 49.1% of U.S. adults put less than 5% of their annual income in longer-term savings.
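That interval comes straight from the formula above, with p̂ substituted for π. A quick check in Python:

```python
import math

p_hat, n = 0.46, 1000                    # survey results from the article
se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error, p-hat in place of pi
margin = 1.96 * se
print(f"95% CI: {p_hat - margin:.3f} to {p_hat + margin:.3f}")  # 0.429 to 0.491
```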

Before we look at the other article examples, let's examine a spreadsheet that will do these calculations for us. Here's what it looks like. Go back to the marijuana, race, and gluten examples. Before reading any further use the spreadsheet template to compute 95% confidence intervals for each one.

More on Margin of Error

Let's look back at the confidence interval formula. The statistic, p̂, is a point estimate. It's the center of our interval. What we add to and subtract from p̂ is called the margin of error. This fits the generic formula for a confidence interval estimate that we started with: point estimate ± margin of error. Without the margin of error (and confidence level) you have no idea how good your estimate is.

Let's stick with 95% confidence for an example. Suppose you're trying to figure out whether or not Candidate Jones will win an election. It initially sounds pretty good if a poll returns a point estimate of 56% voting for Jones. However, if the margin of error is 12% then we don't know much. All we can say is that we're 95% confident that somewhere between 44% and 68% will vote for Jones and that interval is too wide to tell us much. In contrast, if the margin of error was 3% then we'd be 95% confident that Jones will get between 53% and 59% of the vote. That's a pretty good estimate.

Anytime someone gives you a point estimate, you should ask for the margin of error. If you're reading an article about proportions that doesn't provide a margin of error, there's a formula for a quick approximation of the 95% margin of error: one divided by the square root of the sample size.

Let's check it for our savings article example. The properly calculated ME was .0309 and one divided by the square root of 1000 is .0316. The rough approximation is pretty close since both computations round to a 3% margin of error. Now try it for the other three examples.
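The comparison in the paragraph above is easy to reproduce:

```python
import math

p_hat, n = 0.46, 1000
exact = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)  # full margin-of-error formula
quick = 1 / math.sqrt(n)                           # quick 1/sqrt(n) approximation
print(f"exact = {exact:.4f}, quick = {quick:.4f}") # 0.0309 vs 0.0316
```

The approximation is at its best when p̂ is near 0.5, since √(0.5 × 0.5) × 1.96 ≈ 1.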

We need to deal with the relationship between Margin of Error, Confidence Level, and Sample Size but first, let's see how things change when the data is numeric instead of categorical (nominal).

Estimating Means

Rather than the headlines, let's look at some recent history. We'll look at the Car Allowance Rebate System (a.k.a. "cash for clunkers") from 2009. Since it was a government-funded program, the data on each transaction is considered public data and was made available in Fall 2009. The initial release contained over 700,000 transactions. Later a revised set, along with a set of "cancelled" transactions, was released with over 677,000 transactions.

I've taken two random samples of around 500 transactions from Illinois, one sample from July 2009 sales and one from August 2009 sales. You can download the samples here.

Our goal is to estimate the mean MPG of all new cars purchased through CARS in Illinois each month. Like any interval estimate, we have the form point estimate ± margin of error. Since we're trying to estimate the population mean of ALL new cars, μ, it would make sense to use the sample mean, x̄, for the point estimate. That gets us to x̄ ± margin of error.

Now we need to figure out the margin of error. As with sample proportions, x̄'s are statistics and vary from sample to sample. Therefore they also have a sampling distribution and, under the right circumstances, they follow an approximate normal distribution with mean μ and standard deviation σ/√n. If we continue to follow the logic from proportions we would get the following 95% confidence interval formula:

x̄ ± 1.96 × σ/√n

If we wanted other confidence levels, we would change the 1.96 critical value just as with proportions. However, we have a small problem. This formula uses sigma, the population standard deviation. We aren't likely to know this value in real life and we'd use the sample standard deviation, s, as a substitute. With samples this large, we could make the substitution of s for sigma and go on our merry way. If we were doing statistics with calculators and pencils that's exactly what we would do and we'd make an adjustment only with "small" samples. However, we're not using pencils, we're using computers so we'll make the adjustment now.

Instead of following a normal distribution, the adjustment requires us to use something called "Student's t-distribution". I can't give you a small table of critical values because they depend on both the confidence level and the sample size. We'll skip the details other than to show you what the formula looks like:

x̄ ± t(α/2, df) × s/√n

Yes, that's kind of ugly. Fortunately you won't have to know it very well. The "t" stands for t-distribution. The "df" stands for "degrees of freedom" and is always one less than your sample size. The critical t is computed from the confidence level. We use 1−α for confidence. Thus 95% confidence means α = 5% and α/2 is 2.5%. If we were looking up these critical values in a table, then we would need to use all of that information. Instead, the computer will find them for us.
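As a sketch of the t-based calculation: the sample statistics below are hypothetical placeholders, not values from the CARS data. Substitute the mean and standard deviation you compute from the July MPG column; the critical t of 1.964 for df = 499 comes from the text.

```python
import math

# Hypothetical sample statistics -- replace x_bar and s with the values
# you compute from the July MPG column of the sample spreadsheet.
n = 500
x_bar = 25.0       # hypothetical sample mean MPG
s = 5.0            # hypothetical sample standard deviation
t_crit = 1.964     # 95% critical t for df = n - 1 = 499 (from the text)

margin = t_crit * s / math.sqrt(n)
print(f"95% CI: {x_bar - margin:.2f} to {x_bar + margin:.2f}")
```

With real data, a computer would also supply the critical t itself (for example, Excel's T.INV.2T function).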

Before we get to the computer, let's partially try it ourselves. From the spreadsheet find the sample mean and sample standard deviation for new vehicle MPG for July (this variable is the last column in the Excel file). For 95% confidence and a sample this large, the t value is 1.964 (as I said, it's close to the same values you'd get for a normal distribution). Now try to compute the 95% confidence interval. 