Testing Hypothesis

Contents: Introduction, Symbols, Hypothesis Test, Errors, Procedure, Test Methods (Chi, T-test and F-tests), Examples


A hypothesis (Greek: assumption) is a proposed explanation for a phenomenon: a supposition advanced as a basis for argument, which is then tested using statistical methods, generally on experimental samples.

A statistical hypothesis is an assumption about the distribution of a random variable.   As an example, suppose a collection of machine parts has a critical diameter of 25,4mm.   A statistical test of a hypothesis is a procedure which uses a sample to decide whether the hypothesis that the mean of the population is 25,4mm can be accepted (believed to be true) ***, or whether it should be rejected (believed to be false).

The hypothesis may arise in any of the following ways.

1) The hypothesis is derived from a quality requirement
2) There is a simple need to confirm existing information
3) The hypothesis results from a need to verify a theory
4) The hypothesis is a simple guess resulting from anecdotal observations

*** Note: Statistical purists never "accept" a null hypothesis or an alternative hypothesis.
They always use the term "not reject" in place of "accept".

The notes below are sufficient to enable a mechanical engineer to understand the principles involved.   For more detailed information please use the linked sites or quality reference documents.

α = Significance (1%, 5% etc) = probability of a type I error
β = Probability of not rejecting the null hypothesis when it is false = probability of a type II error
γ = Confidence (95%, 99% etc)
n = sample size
f(x) = probability function (probability density function for a continuous variable)
F(x) = cumulative probability distribution function (values between 0 and 1)
ν = number of degrees of freedom
H0 = Null hypothesis
H1 = Alternative hypothesis
c = critical value
Φ(x) = standardised normal probability distribution function
μ = population / random variable mean
σ² = population / random variable variance
σ = population / random variable standard deviation
xm = arithmetic mean of sample
sx² = variance of sample
sx = standard deviation of sample

Hypothesis Testing

Generally the question of interest is simplified into two competing claims (hypotheses): the null hypothesis, denoted H0, and the alternative hypothesis, denoted H1.   These two hypotheses are not, however, treated on an equal basis; special consideration is given to the null hypothesis.

The null hypothesis H0 represents a theory that has been put forward, either because it is considered to be true or because it is to be used as a basis for argument, but which has not been proved.

The conclusion of testing a hypothesis is always given in terms of the null hypothesis H0.
Either H0 is not rejected, or it is rejected in favour of the alternative hypothesis H1.

A simple hypothesis is one which defines a distribution completely e.g. H0 : X ~ N(5,20).   Here the null hypothesis is that the variable X is normally distributed with a mean μ = 5 and a variance σ² = 20.

The significance (α ) of a statistical hypothesis test is the probability of wrongly rejecting the null hypothesis H0, if it is in fact true.

Consider a population which is normally distributed with an assumed mean μo.
A sample is taken from the population and has a sample mean xm.   To test the null hypothesis H0 that the population mean μ = μo, a critical value c is determined such that, if H0 is true, the probability (α) of xm lying beyond c, on the side away from μo, is very small.
(α is normally selected at 1% or 5%.)   This is the probability that the hypothesis is rejected when it is actually true.

If c is greater than μo the test is a right-sided test.
If c is less than μo the test is a left-sided test.
If critical values are set on both sides of μo the test is two-sided.
These options are shown in the figure below:

The alternative to the null hypothesis H0: μ = μo can therefore take three forms

μ > μo
μ < μo
μ ≠ μo

The first two options are one-sided alternatives and the third is a two-sided alternative.
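
The three forms of test can be illustrated with a short calculation.   The Python sketch below is an illustration only: it assumes a normally distributed sample mean with a known population standard deviation σ, and the function name critical_values and the numerical values used are hypothetical, chosen for this example.   It returns the critical value c for a one-sided test, or the lower and upper critical values for a two-sided test.

```python
from math import sqrt
from statistics import NormalDist


def critical_values(mu0, sigma, n, alpha, alternative):
    """Return the critical value(s) for the sample mean under H0: mu = mu0.

    alternative is "greater" (right-sided), "less" (left-sided) or
    "two-sided"; sigma is the known population standard deviation.
    """
    se = sigma / sqrt(n)   # standard deviation of the sample mean
    z = NormalDist()       # standardised normal distribution
    if alternative == "greater":    # reject H0 if xm > c
        return (mu0 + z.inv_cdf(1 - alpha) * se,)
    if alternative == "less":       # reject H0 if xm < c
        return (mu0 - z.inv_cdf(1 - alpha) * se,)
    if alternative == "two-sided":  # reject H0 if xm is outside (low, high)
        half_width = z.inv_cdf(1 - alpha / 2) * se
        return (mu0 - half_width, mu0 + half_width)
    raise ValueError("unknown alternative: " + alternative)


# Hypothetical values: mu0 = 50, sigma = 4, n = 16, alpha = 5%
print(critical_values(50, 4, 16, 0.05, "greater"))
print(critical_values(50, 4, 16, 0.05, "less"))
print(critical_values(50, 4, 16, 0.05, "two-sided"))
```

Note that the two-sided test splits α equally between the two tails, so its critical values lie further from μo than the one-sided value for the same α.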

Error Types

In a hypothesis test two types of error can occur: a type I error and a type II error.   A type I error occurs when the null hypothesis is rejected when it is in fact true; that is, H0 is wrongly rejected.   A type II error occurs when the null hypothesis H0 is not rejected when it is in fact false.

A type I error is an alarm without a fire; a type II error is a fire without an alarm.
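
The meaning of a type I error can be checked by simulation.   The sketch below (Python, standard library only) repeatedly draws samples from a population for which H0 is true and counts how often a right-sided test raises the alarm anyway.   The function name, the population values (μo = 100, σ = 15, n = 25) and the number of trials are assumptions made for the illustration; the observed rate should be close to α, but it will vary with the random seed.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(mu0, sigma, n, alpha, trials, seed=1):
    """Estimate how often a right-sided test rejects H0 when H0 is true."""
    rng = random.Random(seed)
    # Critical value for the sample mean (right-sided test)
    c = mu0 + NormalDist().inv_cdf(1 - alpha) * sigma / sqrt(n)
    false_alarms = 0
    for _ in range(trials):
        # The sample comes from the H0 population, so any rejection is a type I error
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        if fmean(sample) > c:
            false_alarms += 1
    return false_alarms / trials


rate = type_i_error_rate(mu0=100, sigma=15, n=25, alpha=0.05, trials=4000)
print(f"observed type I error rate = {rate:.3f} (nominal alpha = 0.05)")
```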

The table below shows the error types.   For illustration purposes the null hypothesis is H0: μ = μo and the alternative hypothesis is H1: μ = μ1.

                          Reject H0               Don't reject H0
                          (say μ = μ1)            (say μ = μo)
Truth - H0 (μ = μo)       Type I error, P = α     Correct, P = 1 - α
Truth - H1 (μ = μ1)       Correct, P = 1 - β      Type II error, P = β

Consider the hypothesis that the mean of a population is μo against an alternative hypothesis that the mean is a single value μ1.   This is clearly a simplification of reality.   μ1 is greater than μo and the problem is therefore a right-sided test.   The probability density curves of the variable under consideration for the null hypothesis and the alternative hypothesis are shown below.   The type I error, with a probability (significance) of α, is shown, as is the type II error, which has a probability of β.   The figure below also shows why it is better to use the term "not reject" as opposed to "accept": it is possible to have a high confidence (1 - α) that μ = μo together with a significant risk β that μ = μ1.

Confidence = 1 - α

The power of a statistical hypothesis test measures the test's ability to reject the null hypothesis when it is actually false.  In other words, the power of a hypothesis test is the probability of not committing a type II error.    It is calculated as 1- probability of a type II error.   Clearly the higher the power the better.

Power = 1 - P(type II error) = 1 - β

It is generally not easy to calculate the probability β, and hence the power, as there may be an infinite number of alternative hypothesis values.   [It is very easy to calculate β if there is only one alternative hypothesis value e.g. mean = μ1.]   The primary method of reducing β, and the consequent risk of a type II error, is to increase the size of the sample (n).   Increasing n reduces the standard deviation of the sample mean (σ/√n), resulting in thinner bell-shaped curves and moving c towards μo.
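
For the simplified case of a single alternative value μ1 the calculation of β is short enough to sketch.   The Python example below assumes a right-sided test with a known population standard deviation; the function name type_ii_error and the values μo = 100, μ1 = 110, σ = 15 are illustrative assumptions, not results from a real study.   It shows β falling, and the power rising, as n increases.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu1, sigma, n, alpha):
    """Return beta for a right-sided test of H0: mu = mu0 against H1: mu = mu1 (mu1 > mu0)."""
    se = sigma / sqrt(n)                             # standard deviation of the sample mean
    c = mu0 + NormalDist().inv_cdf(1 - alpha) * se   # critical value for the sample mean
    # beta = P(xm <= c) when the true mean is mu1, i.e. H0 is wrongly not rejected
    return NormalDist(mu1, se).cdf(c)


for n in (5, 10, 20, 40):
    beta = type_ii_error(mu0=100, mu1=110, sigma=15, n=n, alpha=0.05)
    print(f"n = {n:2d}   beta = {beta:.3f}   power = {1 - beta:.3f}")
```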


A typical procedure to be followed in hypothesis testing is shown below.

1) Specify the null hypothesis H0 and the alternative hypothesis H1
The null hypothesis is generally that a population parameter is equal to a stated value and the alternative is that it is not equal to (or is greater than, or is less than) that value

2) Set the significance value (α).   This is generally either 0,01 or 0,05

3) Decide which probability distribution is applicable

4) Using the significance value α in conjunction with distribution tables, establish a critical value c

5) Using the sample information, calculate a sample statistic which is compared with the critical value c

6) Arrive at a conclusion: H0 is either rejected or not rejected.
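
As an illustration, the six steps can be sketched in Python for one particular case: a two-sided test of a mean with a known population standard deviation, using the normal distribution.   The names Decision and two_sided_z_test, the measured diameters and the value σ = 0,02mm are hypothetical; only the nominal diameter of 25,4mm is taken from the introduction to these notes.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist, fmean


@dataclass
class Decision:
    statistic: float   # standardised sample statistic z
    critical: float    # critical value of z
    reject_h0: bool


def two_sided_z_test(sample, mu0, sigma, alpha=0.05):
    # Step 1: H0: mu = mu0 against H1: mu != mu0
    # Step 2: alpha is the chosen significance
    # Step 3: with sigma known, the standardised sample mean is normally distributed
    dist = NormalDist()
    # Step 4: critical value from the distribution (alpha split between two tails)
    critical = dist.inv_cdf(1 - alpha / 2)
    # Step 5: sample statistic
    z = (fmean(sample) - mu0) / (sigma / sqrt(len(sample)))
    # Step 6: conclusion, expressed in terms of H0
    return Decision(z, critical, abs(z) > critical)


# Hypothetical measured diameters (mm) with an assumed sigma of 0.02 mm
diameters = [25.42, 25.39, 25.44, 25.41, 25.43, 25.40, 25.45, 25.42]
result = two_sided_z_test(diameters, mu0=25.4, sigma=0.02, alpha=0.05)
print(result)
```

The comparison here is made on the standardised statistic z and not on xm directly; the two forms are equivalent because c = μo ± z.σ/√n.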

F-test, T-tests, Chi-test

A number of special statistical distributions are available for testing hypotheses and relevant notes are provided on the pages linked below.

1) F-tests
When comparing two samples it is often necessary to test whether the samples are from the same distribution.   The F ratio test, which compares the sample variances, is used for this purpose.   See: F-test

2) Chi-squared test
The Chi-squared test is a tool which enables determination of how much a sample distribution can deviate from a population distribution if the hypothesis of equivalence is true.   See: Chi-Test

3) T-tests
When the sample is small and the population standard deviation is unknown, the t-distribution is used in place of the normal distribution to test a hypothesis about a mean (see Example 3).   See: T-test

Example 1:

It is assumed that for the birth of a single child there is a 50% probability that the child will be a girl and a 50% probability that it will be a boy.   This hypothesis is tested by taking a sample of 4000 births in one year.   The number of boys in this sample is 2100.   The sample seems to indicate that the hypothesis is wrong.

The hypothesis to be tested is the null hypothesis H0: p = 50% = 0,5 (the probability of a boy is 50%).

If the hypothesis is true then the expected number of boys in the sample of 4000 births is 2000.   For the sample the proportion of boys appears to be greater than 50% and so the relevant test is a right-sided test.   A value of c is chosen such that, if the hypothesis is true, the probability of there being more than c boys in the sample is very small (α = 1%).   This significance is such that there is a 1% chance of incorrectly rejecting the hypothesis.

n = the sample size,   X = number of boys in 4000 births,   p = the probability of a boy.   Assuming that the hypothesis is true the critical value c is chosen from the equation

P(X > c | p = 0,5) = α = 0,01

Reference pages
Normal distribution
Discrete distributions-Binomial distribution
Normal distribution tables
Confidence limits

X has a binomial distribution with a mean μ = n.p = 4000 x 0,5 = 2000 and a variance σ² = n.p.(1-p) = 4000 x 0,5 x 0,5 = 1000.
As n is large the distribution can be approximated by a normal distribution.
The following equation applies

P(X > c) = 1 - P(X ≤ c) ≈ 1 - Φ((c - 2000)/√1000) = 0,01

From the normal distribution table (see ref page) Φ(2,33) ≈ 99%.   Therefore 2,33 = (c - 2000)/√1000 and therefore c = 2074.
For this test X = 2100 > c = 2074 and therefore the null hypothesis is rejected.
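
The calculation for Example 1 can be reproduced as a check.   This is a minimal sketch using the Python standard library; it evaluates the standardised normal value for 99% directly instead of reading it from a table, so c may differ from the hand calculation in the last digit.   The variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

n, p0, alpha = 4000, 0.5, 0.01        # sample size, probability of a boy under H0, significance
boys = 2100                           # observed number of boys

mean = n * p0                         # binomial mean n.p
sd = sqrt(n * p0 * (1 - p0))          # binomial standard deviation sqrt(n.p.(1-p))
z = NormalDist().inv_cdf(1 - alpha)   # standardised value with Phi(z) = 0.99
c = mean + z * sd                     # critical number of boys (right-sided test)
reject_h0 = boys > c

print(f"z = {z:.3f}, c = {c:.1f}, reject H0: {reject_h0}")
```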

Example 2:

The hypothesis to be tested is that the pupils in a particular school have above average IQs.   It is known that IQ scores are normally distributed with a mean μ = 100 and standard deviation σ = 15.   A random sample of 11 children (n = 11) from the school shows a mean (xm) of 112,8.

The standard deviation of the sample mean = σ/√n = 15/√11 = 4,52

For this example the null hypothesis is that the pupils have an average or below average IQ (μ ≤ μo).   The alternative hypothesis is that the pupils have an above average IQ (μ > μo).

The significance level α is selected at 5%.   The confidence level is therefore 1 - 5% = 95%.

The population is assumed to be normal

Referring to the standardised normal distribution table (Normal tables)
Φ(1,65) = 95%.   Therefore 1,65 = (c - μo)/(σ/√n) = (c - 100)/4,52 and therefore c = 107,46

The sample statistic xm = 112,8.   This is greater than the critical value c and therefore the null hypothesis is rejected in favour of the alternative hypothesis.

The conclusion is that it is likely that the pupils in the school do have a higher than average IQ
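
Example 2 can be checked in the same way.   The sketch below is an illustration using the Python standard library, with variable names chosen for this example; it computes the 95% point of the normal distribution directly (about 1,645, where the table value is rounded to 1,65), so c differs slightly from the hand calculation in the second decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 100, 15   # population mean under H0 and known standard deviation
n, xm = 11, 112.8      # sample size and sample mean
alpha = 0.05           # significance

se = sigma / sqrt(n)                             # standard deviation of the sample mean
c = mu0 + NormalDist().inv_cdf(1 - alpha) * se   # right-sided critical value
z = (xm - mu0) / se                              # standardised sample statistic
reject_h0 = xm > c

print(f"se = {se:.2f}, c = {c:.2f}, z = {z:.2f}, reject H0: {reject_h0}")
```

The standardised statistic z is the same comparison expressed in units of the standard deviation of the sample mean: H0 is rejected when z exceeds the tabulated value.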

Example 3:

Consider a sample (n = 18) from a population in a diet clinic as shown in the table below.   The ideal weight is 100 and weights are expressed relative to this ideal weight.   The standard deviation of the population is not known.   The hypothesis is that the population mean weight is 100.

108 118 97 113 110 106 89 110 96
124 116 102 119 153 98 123 112 118

The sample mean is xm = 111,8 and the sample standard deviation sx = 14,24.
These values are calculated as shown on webpage Sample variables

The first requirement is to establish the null hypothesis.
For this example the null hypothesis is that the population mean μ = 100 (the ideal value).
The alternative hypothesis is that the mean μ is different from the ideal value.
This requires a two-sided test.   The two-sided hypotheses are

H0: μ = 100
H1: μ ≠ 100

The significance level α is set at 5% (this is the total area under the two tails) and therefore the confidence level = (1 - α) = 95% (this is the total area under the central "Accept H0" region).

As the sample is small and the population standard deviation is unknown the t-statistic is selected.

The t-statistic is given by

t = (xm - μo) / (sx/√n)

The number of degrees of freedom ν = n - 1 = 17

The positive t value relating to a cumulative probability of 0,5 + 0,95/2 = 0,975 is obtained by referring to the t-tables (T-Tables): t = 2,11

The resulting critical values are c = 100 ± 2,11 x (14,24/√18) = 92,92 to 107,08
Therefore, if the null hypothesis is true, there is a 95% probability that xm lies within the range 92,92 to 107,08, and the null hypothesis is not rejected if xm falls within this range.
If xm is outside this range the null hypothesis is rejected at the 5% significance level.

In fact xm = 111,8 and this is outside the "Accept H0" range and the null hypothesis is therefore rejected.

The actual t-statistic for xm = 111,8 is (111,8 - 100)/(14,24/√18) ≈ 3,5.   This exceeds the two-sided 1% value of t for ν = 17 (2,90), so the result is significant at below 1%.
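
The figures for Example 3 can be checked with a few lines of Python.   The sketch takes the tabulated value t = 2,11 from the text (the standard library has no t-distribution) and assumes that the standard deviation of the sample mean is sx/√n with n = 18; treat it as an illustration under those assumptions.   The variable names are chosen for this example.

```python
from math import sqrt
from statistics import mean, stdev

weights = [108, 118, 97, 113, 110, 106, 89, 110, 96,
           124, 116, 102, 119, 153, 98, 123, 112, 118]
mu0 = 100       # population mean under H0 (the ideal weight)
t_crit = 2.11   # tabulated two-sided 5% value of t for 17 degrees of freedom

n = len(weights)
xm = mean(weights)    # sample mean
sx = stdev(weights)   # sample standard deviation (n - 1 divisor)
se = sx / sqrt(n)     # estimated standard deviation of the sample mean
t = (xm - mu0) / se   # t-statistic
low, high = mu0 - t_crit * se, mu0 + t_crit * se
reject_h0 = abs(t) > t_crit

print(f"xm = {xm:.1f}, sx = {sx:.2f}, t = {t:.2f}")
print(f"range for not rejecting H0: {low:.2f} to {high:.2f}, reject H0: {reject_h0}")
```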

Useful Related Links
  1. Acastat -software: Hypothesis Testing.... Very clear and relevant notes
  2. Statistics Glossary - Hypothesis Testing .... Very accessible notes with some detail.