What is the Chi-square goodness of fit test?

The Chi-square goodness of fit test is a statistical hypothesis test used to determine whether a variable is likely to come from a specified distribution or not. It is often used to evaluate whether sample data is representative of the full population.


When can I use the test?

You can use the test when you have counts of values for a categorical variable.

Is this test the same as Pearson’s Chi-square test?

Yes. The Chi-square goodness of fit test is also known as Pearson’s Chi-square test.


Using the Chi-square goodness of fit test

The Chi-square goodness of fit test checks whether your sample data is likely to be from a specific theoretical distribution. We have a set of data values, and an idea about how the data values are distributed. The test gives us a way to decide if the data values have a “good enough” fit to our idea, or if our idea is questionable.

What do we need?

For the goodness of fit test, we need one variable. We also need an idea, or hypothesis, about how that variable is distributed. Here are a couple of examples:

We have bags of candy with five flavors in each bag. The bags should contain an equal number of pieces of each flavor. The idea we'd like to test is that the proportions of the five flavors in each bag are the same.

For a group of children’s sports teams, we want children with a lot of experience, some experience and no experience shared evenly across the teams. Suppose we know that 20 percent of the players in the league have a lot of experience, 65 percent have some experience and 15 percent are new players with no experience. The idea we'd like to test is that each team has the same proportion of children with a lot, some or no experience as the league as a whole.

To apply the goodness of fit test to a data set we need:

Data values that are a simple random sample from the full population.

Categorical or nominal data. The Chi-square goodness of fit test is not appropriate for continuous data.

A data set that is large enough so that at least five values are expected in each of the observed data categories.

Chi-square goodness of fit test example

Let’s use the bags of candy as an example. We collect a random sample of ten bags. Each bag has 100 pieces of candy and five flavors. Our hypothesis is that the proportions of the five flavors in each bag are the same.

Let’s start by answering: Is the Chi-square goodness of fit test an appropriate method to evaluate the distribution of flavors in bags of candy?

We have a simple random sample of 10 bags of candy. We meet this requirement.

Our categorical variable is the flavors of candy. We have the count of each flavor in 10 bags of candy. We meet this requirement.

Each bag has 100 pieces of candy. Each bag has five flavors of candy. We expect to have equal numbers for each flavor. This means we expect 100 / 5 = 20 pieces of candy in each flavor from each bag. For 10 bags in our sample, we expect 10 x 20 = 200 pieces of candy in each flavor. This is more than the requirement of five expected values in each category.
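The expected-count arithmetic above can be written out as a short Python sketch. The variable names are chosen here for illustration; the numbers are the ones given in the example.

```python
# Expected counts under the hypothesis of equal flavor proportions.
# Variable names are illustrative; the values come from the candy example.
n_bags = 10
pieces_per_bag = 100
n_flavors = 5

expected_per_bag = pieces_per_bag / n_flavors    # 100 / 5 = 20 pieces per flavor
expected_per_flavor = n_bags * expected_per_bag  # 10 x 20 = 200 pieces per flavor

# Requirement from the text: at least five expected values in each category.
meets_requirement = expected_per_flavor >= 5

print(expected_per_bag, expected_per_flavor, meets_requirement)
```

Because the hypothesis assigns the same proportion to every flavor, one expected count covers all five categories. With unequal hypothesized proportions, each category would need its own check.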

Based on the answers above, yes, the Chi-square goodness of fit test is an appropriate method to evaluate the distribution of the flavors in bags of candy.


Another simple bar chart shows the expected counts of 200 per flavor. This is what our chart would look like if the bags of candy had an equal number of pieces of each flavor.


The statistical test is a way to quantify the difference. Is the actual data from our sample “close enough” to what is expected to conclude that the flavor proportions in the full population of bags are equal? Or not? From the candy data above, most people would say the data is not “close enough” even without a statistical test.

What if your data looked like the example in Figure 5 below instead? The purple bars show the observed counts and the orange bars show the expected counts. Some people would say the data is “close enough” but others would say it is not. The statistical test gives a common way to make the decision, so that everyone makes the same decision on a set of data values.


Statistical details

Let’s look at the candy data and the Chi-square test for goodness of fit using statistical terms. This test is also known as Pearson’s Chi-square test.

Our null hypothesis is that the proportion of flavors in each bag is the same. We have five flavors. The null hypothesis is written as:

$ H_0: p_1 = p_2 = p_3 = p_4 = p_5 $

The formula above uses p for the proportion of each flavor. If each 100-piece bag contains equal numbers of pieces of candy for each of the five flavors, then the bag contains 20 pieces of each flavor. The proportion of each flavor is 20 / 100 = 0.2.

The alternative hypothesis is that at least one of the proportions is different from the others. This is written as:

$ H_a: \text{at least one } p_i \text{ is not the same} $

In some cases, we are not testing for equal proportions. Look again at the example of children's sports teams near the top of this page. Using that as an example, our null and alternative hypotheses are:

$ H_0: p_1 = 0.2, p_2 = 0.65, p_3 = 0.15 $

$ H_a: \text{at least one } p_i \text{ is not equal to its expected value} $

Unlike other hypotheses that involve a single population parameter, we cannot use just a formula. We need to use words and symbols to describe our hypotheses.
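When the hypothesized proportions are not equal, each category gets its own expected count: the hypothesized proportion times the sample size. The sketch below illustrates this for the sports team proportions. The function name and the sample size of 100 players are assumptions made for illustration and do not come from the example.

```python
# Hypothesized league-wide proportions from the sports team example.
hypothesized = {"a lot": 0.20, "some": 0.65, "none": 0.15}


def expected_counts(proportions, n_players):
    """Expected count per category = hypothesized proportion x sample size."""
    return {level: p * n_players for level, p in proportions.items()}


# n_players = 100 is a made-up sample size, used only for illustration.
counts = expected_counts(hypothesized, 100)

# Each expected count should be at least five for the test to be appropriate.
all_large_enough = all(count >= 5 for count in counts.values())

print(counts, all_large_enough)
```

With a much smaller sample, such as a single team of 20 players, the expected count for new players would be 0.15 x 20 = 3, which falls below the requirement of five.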

We calculate the test statistic using the formula below:

$ \sum^n_{i=1} \frac{(O_i-E_i)^2}{E_i} $

In the formula above, we have n groups. The $ \sum $ symbol means to add up the calculations for each group. For each group, we do the same steps as in the candy example. The formula shows $ O_i $ as the observed value and $ E_i $ as the expected value for a group.
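The sum can be sketched in a few lines of Python. The function name is chosen here for illustration, and the observed counts are hypothetical: they are not listed in this article, and were picked so that the result matches the test statistic of 52.75 quoted for the candy data.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all groups."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts for the five flavors (an assumption,
# not data taken from this article).
observed = [180, 250, 120, 225, 225]

# Expected counts from the candy example: 200 pieces per flavor.
expected = [200] * 5

statistic = chi_square_statistic(observed, expected)
print(statistic)  # 52.75
```

Each term divides by the expected count, so a gap of 50 pieces matters more when 200 pieces are expected than it would if 2,000 were expected.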

We then compare the test statistic to a Chi-square value with our chosen significance level (also called the alpha level) and the degrees of freedom for our data. Using the candy data as an example, we set α = 0.05 and have four degrees of freedom. For the candy data, the Chi-square value is written as:

$ \chi^2_{0.05,4} $
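This Chi-square value is usually read from a table or from statistical software. As a rough sketch of where it comes from, the code below finds it numerically using only the Python standard library. It relies on the closed-form upper-tail probability that holds when the degrees of freedom are even, and on a simple bisection search. Both function names are made up for this illustration.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a Chi-square variable with an even number of degrees of freedom.

    For even df the tail has the closed form
    exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive, even df")
    if x <= 0:
        return 1.0
    half = x / 2
    term = 1.0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def critical_value(alpha, df, lo=0.0, hi=200.0):
    """Bisection search for the x whose upper-tail probability equals alpha."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_upper_tail_even_df(mid, df) > alpha:
            lo = mid  # tail still too large: move right
        else:
            hi = mid  # tail too small: move left
    return (lo + hi) / 2


print(round(critical_value(0.05, 4), 3))  # 9.488
```

In practice, a statistics library would handle any number of degrees of freedom. The even-only restriction here just keeps the sketch free of external dependencies.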

There are two possible results from our comparison:

The test statistic is lower than the Chi-square value. You fail to reject the hypothesis of equal proportions. You conclude that the bags of candy across the entire population have the same number of pieces of each flavor in them. The fit of equal proportions is “good enough.”

The test statistic is higher than the Chi-square value. You reject the hypothesis of equal proportions. You cannot conclude that the bags of candy have the same number of pieces of each flavor. The fit of equal proportions is “not good enough.”

Let’s use a graph of the Chi-square distribution to better understand the test results. You are checking to see if your test statistic is a more extreme value in the distribution than the critical value. The distribution below shows a Chi-square distribution with four degrees of freedom. It shows how the critical value of 9.488 “cuts off” 95% of the data. Only 5% of the data is greater than 9.488.


The next distribution plot includes our results. You can see how far out “in the tail” our test statistic is, represented by the dotted line at 52.75. In fact, with this scale, it looks like the curve is at zero where it intersects with the dotted line. It isn’t, but it is very, very close to zero. We conclude that it is very unlikely for this situation to happen by chance. If the true population of bags of candy had equal flavor counts, we would be extremely unlikely to see the results that we collected from our random sample of 10 bags.
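To put a number on how far out in the tail 52.75 sits, the sketch below computes the area under the curve beyond a given value. It assumes four degrees of freedom, for which the upper-tail area has the simple closed form exp(-x/2) · (1 + x/2). The function name is chosen here for illustration.

```python
import math


def upper_tail_df4(x):
    """Area to the right of x under a Chi-square curve with 4 degrees of freedom.

    Uses the closed form exp(-x/2) * (1 + x/2), valid only for df = 4.
    """
    if x <= 0:
        return 1.0
    return math.exp(-x / 2) * (1 + x / 2)


critical = 9.488    # critical value for alpha = 0.05, from the text
statistic = 52.75   # test statistic for the candy data, from the text

tail_at_critical = upper_tail_df4(critical)    # about 0.05
tail_at_statistic = upper_tail_df4(statistic)  # on the order of 1e-10

# The statistic is far beyond the critical value, so the hypothesis of
# equal proportions is rejected.
reject_null = statistic > critical

print(tail_at_critical, tail_at_statistic, reject_null)
```

The tail area at the critical value comes out at about 0.05, which matches the 5% cut-off described above. The tail area at 52.75 is many orders of magnitude smaller.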

Most statistical software shows the p-value for a test. This is the likelihood of finding a more extreme value for the test statistic in a similar sample, assuming that the null hypothesis is correct. It’s difficult to calculate the p-value by hand. For the figure above, if the test statistic is exactly 9.488, then the p-value will be p=0.05. With the test statistic of 52.75, the p-value is very, very small. In this example, most statistical software will report the p-value as “p