Hypothesis Testing – Learning the roots

This article will try to:
  • Cover the basics of hypothesis testing.
  • Explain its dependence on the Central Limit Theorem.

What you are already supposed to know:

If you are a student of Statistics, Business Research, Data Analytics or Business Analytics, you have probably heard the term ‘Hypothesis Testing’ many times. Chances are that you are already applying it without a clear picture of what is going on. You may be unsure what the null hypothesis is, how it is defined, how and when it is rejected, or what its rejection actually means. If any of these doubts sound familiar, you are in the right place. This article will try to clear the cloud around hypothesis testing.
Before getting technical, let us start with a simple question to spark some curiosity. Someone hands you a big candy jar holding tens of thousands of candies and claims that, on average, each candy weighs 10 grams. You have to verify this claim. One way would be to weigh the entire contents of the jar, count the candies, and calculate the average. But this seems impractical, as counting tens of thousands of candies would take days. So you come up with another idea: you take just one hundred candies out of the jar, weigh them, and calculate their average. The average weight of those hundred candies turns out to be just 8 grams. The question in front of you now is whether to accept or reject the claim. Before you conclude anything, keep in mind that the claim was about the whole candy jar, while you have checked only 100 candies.
To answer such questions there is a statistical technique called Hypothesis Testing that comes to your rescue. The idea goes like this:

Let’s suppose the jar is actually filled with candies having an average weight of 10 grams, and for the sake of understanding let’s further suppose that you took 1000 samples of 100 candies each from the jar, just like the one you took in the first place, and got the results depicted in the table below:

The above table says that out of 1000 samples, 80 had an average weight of 7 grams, 120 had an average weight of 8 grams, 300 had an average weight of 9 grams, and so on. We have simply grouped the samples together on the basis of common average weight.
Now, since your particular sample gave an average weight of 8 grams, you can say that only 120 out of a thousand randomly selected samples look like yours. In other words, the probability of getting a sample like yours is 120/1000, i.e. 12%. If 12% is large enough for you, you have no reason to doubt that the jar really has an average weight of 10 grams. Note that this reasoning works only because we supposed at the outset that the jar is exactly as claimed, and the table of average weights was obtained from that jar.
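
The 12% figure is simply a relative frequency. Here is a minimal sketch of that arithmetic in Python, using only the three counts quoted in the text (the remaining rows of the table are not reproduced, so the dictionary is deliberately incomplete). The names `quoted_counts` and `relative_frequency` are made up for this illustration.

```python
# Counts quoted in the text: average weight (grams) -> number of samples.
# Only the rows mentioned in the article are listed; the rest are omitted.
quoted_counts = {7: 80, 8: 120, 9: 300}
total_samples = 1000  # total number of samples in the thought experiment


def relative_frequency(avg_weight, counts, total):
    """Share of all samples whose average weight equals avg_weight."""
    return counts[avg_weight] / total


prob_8_grams = relative_frequency(8, quoted_counts, total_samples)
print(f"P(sample average = 8 g) = {prob_8_grams:.0%}")
```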

By now you might have some idea of where we are heading. The above assumptions have cleared a bit of the cloud around the topic, but there are still a lot of questions to be answered, like:
  • Why only 1000 samples? There are infinitely many random samples of size 100 possible from the jar.
  • How are the assumption that the jar has an average weight of 10 grams and the table of samples connected?

I will try to connect the dots, but before doing that, it is time to understand the Central Limit Theorem. Ideally you already know the theorem, but let’s discuss it briefly here too.

Central Limit Theorem (CLT)
The Central Limit Theorem states that if you have a population with mean μ and standard deviation σ and you take sufficiently large random samples from it with replacement, then the sample means will be approximately normally distributed. This holds regardless of whether the source population is normal or not. Further, the mean of this sampling distribution equals the mean of the population, and its variance equals the variance of the population divided by the sample size. The larger the sample size, the closer the sampling distribution comes to a normal curve and the more tightly it clusters around the population mean.
I will try to fit the above statements into the current scenario to make them easier to understand. Suppose you begin drawing random samples of size 100 from the candy jar. Every time you draw a sample, you calculate its average weight, put the candies back, draw another sample and repeat the process. You will get all the possible values of the average weight, and when you group them on the basis of common average weights, you will get a table similar to the one shown above. There will be one small change, though: the averages you get won’t necessarily be integers; they can be decimals like 8.3 grams. Listing every distinct decimal value won’t be feasible, hence the frequencies are recorded against class intervals of average weight.
For example, the above table would be represented as:

This frequency table tells us that 80 samples yielded an average weight between 7 and 8 grams, 120 samples yielded an average weight between 8 and 9 grams, and so on. Each average-weight class has a frequency against it, i.e. the number of samples that fell into it. To draw a distribution graph, the frequencies are converted to relative frequencies (probability values) by dividing each one by the total number of samples. If you add up all the relative frequencies, you get 1.
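
To see how such a table comes about, the repeated sampling can be simulated. The sketch below is only an illustration under stated assumptions: the jar is a made-up population of 20,000 candies whose weights are spread around 10 grams with a standard deviation of 2 grams, and the class width of 0.2 grams is an arbitrary choice. None of these numbers come from the tables in this article.

```python
import math
import random
from collections import Counter

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical jar: 20,000 candy weights centred on 10 g (sd of 2 g is assumed).
jar = [random.gauss(10, 2) for _ in range(20_000)]


def sample_mean(population, size):
    """Average weight of `size` candies drawn with replacement."""
    return sum(random.choices(population, k=size)) / size


# Draw 1000 samples of 100 candies each and record every sample average.
means = [sample_mean(jar, 100) for _ in range(1000)]

# Group the averages into class intervals of width 0.2 g.
width = 0.2
counts = Counter(math.floor(m / width) for m in means)

# Convert frequencies to relative frequencies (probability values).
relative = {k: c / len(means) for k, c in counts.items()}

for k in sorted(counts):
    low, high = k * width, (k + 1) * width
    print(f"{low:.1f} to {high:.1f} g: {counts[k]:4d} samples, "
          f"relative frequency {relative[k]:.3f}")

print("sum of relative frequencies:", round(sum(relative.values()), 6))
```

Each run with a different seed gives slightly different counts, but the relative frequencies always add up to 1.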

If you make the class intervals very small and repeat the sampling an infinite number of times, the Central Limit Theorem suggests that the plot of average weight classes against the probability values will be as shown below:

Highlights of the above plot:
  • The nature of the plot is normal.
  • The mean of the plot is 10 grams (= mean of population as per assumption).
  • The variance of the plot is (variance of the population)/100, since the sample size is 100.
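
These three highlights can be checked with a small simulation. The sketch below is not the jar from this article: it uses a deliberately skewed, made-up population (exponentially distributed weights with a mean of about 10 grams) to show that the source population need not be normal. The 5,000 repetitions and the "within one standard deviation" check are choices made only for this illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical skewed population with a mean of about 10 g (an assumption).
population = [random.expovariate(1 / 10) for _ in range(50_000)]
pop_mean = statistics.fmean(population)
pop_var = statistics.pvariance(population)

n = 100  # sample size, as in the article
sample_means = [
    statistics.fmean(random.choices(population, k=n)) for _ in range(5_000)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.pvariance(sample_means)
sd_of_means = var_of_means ** 0.5

# For a normal curve, roughly 68% of values lie within one sd of the mean.
within_one_sd = sum(
    abs(m - mean_of_means) <= sd_of_means for m in sample_means
) / len(sample_means)

print(f"population mean     : {pop_mean:.3f}")
print(f"mean of sample means: {mean_of_means:.3f}")
print(f"population var / n  : {pop_var / n:.3f}")
print(f"var of sample means : {var_of_means:.3f}")
print(f"share within one sd : {within_one_sd:.3f}")
```

The mean of the sample means lands close to the population mean, and their variance lands close to the population variance divided by 100, even though the individual weights are far from normally distributed.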

The distribution plot that we get after such repeated sampling is called the sampling distribution.
So now, the CLT has given us a tool to inspect all of the infinitely many random samples that are possible from the candy jar, and not just 1000.

Back to the testing
Let’s now recall the original problem, where we took a single sample of 100 candies and got an average weight of 8 grams. Since our sample average falls below the claimed 10 grams, we will calculate the probability of getting an average weight of 8 grams or less when a random sample is drawn from a jar whose overall average weight is 10 grams. If this probability is high enough, we can conclude that the sample is consistent with the assumption; in other words, the claim that the average weight is 10 grams cannot be nullified. If the probability is too small, the chance of getting this sample from a population with an average weight of 10 grams is very low, and since we still got this sample, the claim that the population average is 10 grams is doubtful. By now, the concept of validating a sample through the sampling distribution should be clear. We will now proceed to the mathematical details and hypothesis formulation.

Hypothesis Testing

Let us recall the previous assumptions. The average weight of candies in the jar is 10 grams and, as per the CLT, the sampling distribution for the jar is normal with its peak at 10 grams, as shown below (I know I keep repeating this :p). As you are already aware, for a normal probability distribution the probability between two points is given by the area under the curve between those two points, as shown in the graph below (probability between 8 and 10):

If we want to calculate the probability value analytically, we have to use the Gaussian (normal) density function:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Applying this to the present context, we need to find the probability of getting an average weight less than or equal to 8 grams, which would be given by:

$$P(\bar{x} \le 8) = \int_{-\infty}^{8} \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx$$

Let us inspect the various parameters in the above equation:
  𝜇 = mean of the sampling distribution
    = mean of the population
    = 10 grams
  𝜎 = standard deviation of the sampling distribution
    = standard deviation of the population/√n
    = standard deviation of the population/√100
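
The "area under the curve" idea can be made concrete with a short numerical sketch. It assumes a population standard deviation of 2 grams, which is the illustrative value this article settles on later; the helper names `gaussian_pdf` and `area_under_curve` are made up for the example. It estimates the probability between 8 and 10 grams by summing thin trapezoids under the Gaussian density and compares the result with the cumulative distribution function from Python's standard library.

```python
import math
from statistics import NormalDist


def gaussian_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (
        sigma * math.sqrt(2 * math.pi)
    )


def area_under_curve(a, b, mu, sigma, steps=10_000):
    """Trapezoidal estimate of P(a <= X <= b) for a normal variable X."""
    h = (b - a) / steps
    total = 0.5 * (gaussian_pdf(a, mu, sigma) + gaussian_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += gaussian_pdf(a + i * h, mu, sigma)
    return total * h


mu = 10.0            # mean of the sampling distribution (claimed mean)
population_sd = 2.0  # ASSUMED value, used later in the article's example
n = 100              # sample size
sigma = population_sd / math.sqrt(n)  # sd of the sampling distribution

numeric = area_under_curve(8, 10, mu, sigma)
dist = NormalDist(mu, sigma)
from_cdf = dist.cdf(10) - dist.cdf(8)

print(f"sigma of sampling distribution: {sigma}")
print(f"area by trapezoids: {numeric:.6f}")
print(f"area from the CDF : {from_cdf:.6f}")
```

With these numbers, 8 grams lies ten standard deviations below the mean, so practically the whole left half of the curve falls between 8 and 10.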

Changing Normal to Standard Normal
You might be aware that we can convert any normal distribution integral into a standard normal integral by setting (x − mean)/(standard deviation) = Z, where Z is the new variable in the standard normal equation.
You can read more about normal distribution and standard normal distribution here 
Standard Normal Distribution is the one having mean 0 and standard deviation 1.
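Now, in the above case, the mean is the population mean 𝜇 and the standard deviation is 𝜎/√n, where 𝜎 here denotes the population standard deviation and x̄ the sample mean, so we have:

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$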
The above quantity is called the Z-statistic (Zee statistic) and is directly linked to hypothesis testing.
The advantage of calculating the Z-statistic from the mean and standard deviation is that we can use a ready-made Z-table to look up the probability value. For example, the one available here.
Also, in most cases, as in the present one, we don’t know the standard deviation of the population. In that scenario, we calculate the standard deviation of the sample and use it as an estimate of the population standard deviation, which is a reasonable approximation when the sample is large.
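
In code, the Z-table lookup is usually replaced by a call to the standard normal cumulative distribution function. Here is a minimal sketch using Python's standard library. The function name `z_statistic` and the numbers (a sample mean of 9.7 grams, a standard deviation of 2 grams) are hypothetical and chosen only for illustration. It also shows that standardizing first gives the same probability as working with the original normal distribution.

```python
import math
from statistics import NormalDist


def z_statistic(sample_mean, population_mean, population_sd, n):
    """Distance of the sample mean from the claimed mean, in standard errors."""
    return (sample_mean - population_mean) / (population_sd / math.sqrt(n))


# Hypothetical numbers, for illustration only.
z = z_statistic(sample_mean=9.7, population_mean=10, population_sd=2, n=100)

# Probability of a sample mean of 9.7 g or less, computed two ways:
direct = NormalDist(10, 2 / math.sqrt(100)).cdf(9.7)  # original normal curve
via_z = NormalDist().cdf(z)                           # standard normal curve

print(f"Z = {z:.2f}")
print(f"probability (direct)    = {direct:.4f}")
print(f"probability (via Z)     = {via_z:.4f}")
```

Both routes give about 0.067, which is the value a Z-table lists for Z = −1.5.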

Steps in Hypothesis Testing
  • Assume the claim about the population to be true (this is the Null Hypothesis).
  • Take a sample and calculate its mean and standard deviation.
  • Calculate the Z-statistic: Z = (sample mean − claimed mean)/(standard deviation/√n).
  • Use the Z-statistic to look up the probability value (called the p-value) in the Z-table.
  • Reject or fail to reject the null hypothesis based on the p-value.
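
The steps above can be sketched as a single function. This is one possible arrangement, not a definitive implementation: the name `one_sample_z_test` is made up, the test is left-tailed (it asks how likely a sample mean this low or lower is, matching the candy example), the sample standard deviation stands in for the unknown population value, and the sample itself is randomly generated, hypothetical data.

```python
import math
import random
import statistics
from statistics import NormalDist


def one_sample_z_test(sample, claimed_mean, alpha=0.05):
    """Left-tailed z-test of the claim that the population mean is claimed_mean.

    Returns (z, p_value, reject), where reject is True when p_value < alpha.
    """
    n = len(sample)
    sample_mean = statistics.fmean(sample)
    sample_sd = statistics.stdev(sample)  # estimate of the population sd
    z = (sample_mean - claimed_mean) / (sample_sd / math.sqrt(n))
    p_value = NormalDist().cdf(z)  # P(Z <= z) under the null hypothesis
    return z, p_value, p_value < alpha


random.seed(2)  # fixed seed so the run is repeatable
# Hypothetical sample: 100 candy weights generated around 9.5 g.
sample = [random.gauss(9.5, 2) for _ in range(100)]

z, p_value, reject = one_sample_z_test(sample, claimed_mean=10)
print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
print("reject the null hypothesis" if reject else "do not reject the null hypothesis")
```

The threshold `alpha` is the significance level; 5% is used here only because it is the common default.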

Regarding p-value
You may ask: below what probability value do we reject the null hypothesis? The answer is that it varies from case to case and is chosen by whoever is testing the hypothesis. The probability value used as the threshold is called the significance level. A significance level of 5% is the most common choice.

Completing the case
Coming back to the candy jar, the things that we know so far are:
Sample mean = 8 grams
Population mean (as claimed) = 10 grams
Population standard deviation = 2 grams (let’s assume some value for it here; in an actual scenario you can calculate it from the sample you got)
Sample size (n) = 100
Significance level = 5% (let’s settle on 5%)
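Plugging these values into the Z-statistic formula, Z = (sample mean − population mean)/(standard deviation/√n), gives Z = (8 − 10)/(2/√100) = −2/0.2 = −10. This Z-statistic can now be used to calculate the p-value (probability) and find out how unlikely our sample is under the claim. We will use the Z-table available here. A Z of −10 lies far beyond the range of a typical Z-table, which means the p-value is practically zero (much lower than our significance level). Hence, we can safely reject the null hypothesis that the average weight of candies in the jar is 10 grams.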
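
For readers who prefer to check the numbers in code, here is a small sketch of the candy jar calculation using only Python's standard library. The standard deviation of 2 grams is the assumed value from the list above. The tail probability is computed with `math.erfc` rather than a plain CDF call, because so far out in the tail a CDF computed as 1 plus a negative number rounds to exactly zero.

```python
import math

sample_mean = 8.0     # grams, from our 100 candies
claimed_mean = 10.0   # grams, the claim under test (null hypothesis)
population_sd = 2.0   # grams, ASSUMED for this example
n = 100               # sample size
alpha = 0.05          # significance level

standard_error = population_sd / math.sqrt(n)
z = (sample_mean - claimed_mean) / standard_error

# Left-tail probability P(Z <= z) of the standard normal distribution,
# written with erfc so that very small values are not rounded to zero.
p_value = 0.5 * math.erfc(-z / math.sqrt(2))

print(f"Z = {z:.1f}")
print(f"p-value = {p_value:.2e}")
if p_value < alpha:
    print("Reject the null hypothesis: the 10 gram claim is doubtful.")
else:
    print("Do not reject the null hypothesis.")
```

The p-value comes out many orders of magnitude below 5%, which agrees with the conclusion above.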

Further Reading
If you find the above concepts interesting, you can further read the following topics to know more about this field of statistics.

Thanks for reading this
Have a good time 😊