Toss a coin 20 times in a sample and count
the number of heads. Repeat the same process for seven samples, each
consisting of 20 tosses. Calculate the sample proportion of heads for each
sample and test whether each sample proportion is a significant estimate of the
population proportion. Discuss what makes the sample proportion a statistically
significant estimate of the population proportion.
Key Words: Population Proportion, Sample Proportion, Standard Error, z-score,
Statistical Significance, Sample Size
Introduction
Not all sample proportions are statistically significant estimates of the
population proportion. The questions then arise: which sample proportions are
significant, and what makes them statistically significant? Tossing a coin,
a binary categorical random variable, is a convenient example for explaining
the statistical significance of sample proportions. Refer to my statistical
note 36 to know more about population and sample proportions and the
normality of the sampling distribution of sample proportions.
Observed Data
I tossed a coin 20 times in a
sample (S) and repeated the same process for seven samples (S1 to S7). Can
one guess how many heads there will be in each sample? Table 1 presents the
outcome of 20 tosses of a coin in each of the seven samples.
Every sample of 20 tosses in
Table 1 is drawn from a population consisting of a very large number of
possible tosses. The section below discusses whether the sample proportions
are statistically significant estimates of the population proportion.
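The experiment described above can also be imitated on a computer. The following is a minimal sketch in Python, not the data of Table 1: the function name simulate_samples, the fixed seed and the assumption of a fair coin (probability of heads 0.5) are illustrative choices of mine, and the simulated counts will differ from the tosses I actually made.

```python
import random

def simulate_samples(num_samples=7, tosses=20, p_heads=0.5, seed=0):
    """Return the number of heads in each simulated sample of coin tosses."""
    # A fixed seed (an assumption for illustration) makes the run repeatable.
    rng = random.Random(seed)
    return [
        sum(rng.random() < p_heads for _ in range(tosses))
        for _ in range(num_samples)
    ]

heads = simulate_samples()
# Sample proportion of heads in each sample of 20 tosses.
proportions = [h / 20 for h in heads]

for i, (h, p_hat) in enumerate(zip(heads, proportions), start=1):
    print(f"S{i}: heads = {h:2d}, sample proportion = {p_hat:.2f}")
```

Changing the seed gives a different set of seven samples, which is a quick way to see how much the sample proportion varies from sample to sample.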
Discussion
The population proportion,
denoted by ‘p’, is 0.5 in coin tossing experiments with a fair coin. The
sample proportion is denoted by ‘p^’ and read as ‘p-hat’. The sample
proportion of heads in 20 tosses of a coin ranged from 0.35 to 0.65 (Table 1).
The mean of the sample proportions ‘p^’, also called the center, is the
population proportion ‘p’. Symbolically, µp^ = p. In coin tossing, 0.5 is
both the mean of the sample proportions and the population proportion.
The standard deviation of the sample proportions
is the square root of the population proportion multiplied by one
minus the population proportion, divided by the sample size. This is referred
to as the spread or Standard Error (SE) of sample proportions and is denoted
by σp^ = √[p x (1-p)/n], where ‘p’ is the population proportion and ‘n’ is
the sample size. In this example, SE = √[0.5 x (1-0.5)/20] = 0.111803.
The sampling distribution of sample proportions with a sample size of 20
tosses of a coin is approximately normal, with mean p = 0.5 and σp^ = 0.111803.
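The SE calculation above can be checked with a few lines of Python. This is only a sketch; the function name standard_error is my own label for the formula √[p x (1-p)/n].

```python
import math

def standard_error(p, n):
    """Standard error of the sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Population proportion 0.5 and sample size 20, as in this example.
se_20 = standard_error(0.5, 20)
print(f"SE for n = 20: {se_20:.6f}")  # prints 0.111803
```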
When the sampling distribution of sample proportions follows the normal
distribution, the z-score, or test statistic, is calculated as the
difference between the sample proportion and the population proportion
divided by the SE of sample proportions. Symbolically,
z = (p^-p)/√[p x (1-p)/n]. In the normal distribution, a z-score of minus or
plus 1.96 is a commonly used cut-off point for deciding whether the sample
proportion is a statistically significant estimate of the population
proportion, because about 95 percent of samples drawn from the population
give a z-score between minus 1.96 and plus 1.96. Equivalently, for about 95
percent of samples the population proportion lies within 1.96 SE of the
sample proportion, so one can be 95 percent confident that it does. The range
of z-scores between the negative and positive cut-off points corresponds to
this 95 percent confidence level. A z-score less than minus 1.96 or greater
than plus 1.96 indicates that the sample proportion is statistically
significant, that is, it lies farther from the population proportion than
chance alone usually allows and may be an estimate from a different
population. One point to note here is that, for a given difference between
p^ and p, the z-score is proportional to the square root of the sample size
‘n’, so the magnitude of the z-score increases as ‘n’ increases.
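The z-score and the 1.96 cut-off rule can be sketched in Python as follows. The function names z_score and is_significant are illustrative, and the sample proportion 0.55 is a made-up value used only to show how the z-score grows with the sample size.

```python
import math

def z_score(p_hat, p, n):
    """z = (p_hat - p) / sqrt(p * (1 - p) / n)."""
    se = math.sqrt(p * (1 - p) / n)
    return (p_hat - p) / se

def is_significant(p_hat, p, n, cutoff=1.96):
    """True when the z-score lies outside the range -cutoff to +cutoff."""
    return abs(z_score(p_hat, p, n)) > cutoff

# Same difference between p_hat and p, but four times the sample size:
# the z-score doubles, because it scales with the square root of n.
z_small = z_score(0.55, 0.5, 20)
z_large = z_score(0.55, 0.5, 80)
print(f"n = 20: z = {z_small:.4f}")
print(f"n = 80: z = {z_large:.4f}")
print(f"ratio  = {z_large / z_small:.1f}")
```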
I calculated the z-score for all seven sample proportions in this example
(Table 2). For example, for the sample proportion p^ = 0.35, z = (p^-p)
/ σp^, where σp^ = √[p x (1-p)/n] = √[0.50
x 0.50/20] = 0.111803, so that z =
(0.35-0.50) / 0.111803 = -1.3416. Likewise, the z-score was calculated
for each sample proportion and tabulated. Using the cut-off point of z-score
equal to minus or plus 1.96, none of the sample
proportions in this example was found to be a statistically significant
estimate of the population proportion of 0.50, given the sample size of 20
tosses of a coin.
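The finding that no sample in the observed range is significant can be checked for every possible head count in that range. The sketch below does not reproduce the seven samples of Table 1. It assumes only that the sample proportions lay between 0.35 and 0.65, which for 20 tosses means 7 to 13 heads, and computes the z-score for each of those counts.

```python
import math

P = 0.5        # population proportion for a fair coin
N = 20         # tosses per sample
CUTOFF = 1.96  # z-score cut-off at the 95 percent level

se = math.sqrt(P * (1 - P) / N)

# One row per possible head count in the observed range (7 to 13 heads).
rows = []
for heads in range(7, 14):
    p_hat = heads / N
    z = (p_hat - P) / se
    rows.append((heads, p_hat, z, abs(z) > CUTOFF))

for heads, p_hat, z, significant in rows:
    print(f"{heads:2d} heads  p^ = {p_hat:.2f}  z = {z:+.4f}  "
          f"significant: {significant}")
```

Every z-score in this range lies between minus 1.3416 and plus 1.3416, well inside the cut-off of 1.96.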
Table 2: Sample proportions and
their significance as estimates of the population proportion, for samples of
20 tosses of a coin
Sample proportions as small as 0.2808 or less, or as large as 0.7192 or more,
are statistically significant estimates of the population proportion of 0.50,
given the sample size of 20 tosses of a coin.
If the sample size is increased, some sample proportions larger than 0.2808
and smaller than 0.7192 also become statistically significant estimates of
the population proportion of 0.50. For example, if the sample size is
increased to 50 tosses, sample proportions as small as 0.3614 or as large as
0.6386 are statistically significant estimates of the population proportion
of 0.50. Likewise, if the sample size is increased to 100 tosses, sample
proportions as small as 0.4020 or as large as 0.5980 are statistically
significant estimates of the population proportion of 0.50.
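The cut-off proportions for different sample sizes follow from p plus or minus 1.96 x SE. A minimal sketch, with the function name significance_bounds chosen by me for illustration:

```python
import math

def significance_bounds(p, n, cutoff=1.96):
    """Return (lower, upper) = p -/+ cutoff * sqrt(p * (1 - p) / n)."""
    margin = cutoff * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Sample proportions beyond these bounds have a z-score outside -1.96 to +1.96.
bounds = {n: significance_bounds(0.5, n) for n in (20, 50, 100)}
for n, (lower, upper) in bounds.items():
    print(f"n = {n:3d}: lower = {lower:.4f}, upper = {upper:.4f}")
```

Rounded to four decimal places, the bounds for 20 tosses come out as 0.2809 and 0.7191. The values 0.2808 and 0.7192 quoted above sit just outside them, which is consistent with rounding outward so that the quoted proportions themselves are significant.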
I am curious whether John Kerrich’s observed sample proportion of heads, 0.5067 in 10,000 tosses of a coin, is a statistically significant estimate of the population proportion of 0.5. The z-score for this sample is 1.34, which is below the cut-off point of 1.96. The sample therefore falls among the 95 percent of samples of 10,000 tosses whose z-scores lie between minus 1.96 and plus 1.96. Thus, at the 95 percent confidence level, this sample proportion is not a statistically significant estimate of the population proportion, 0.50.
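The z-score for Kerrich’s tosses can be reproduced as follows. The sketch takes the head count to be 5,067, which is what a proportion of 0.5067 in 10,000 tosses implies, and a population proportion of 0.5.

```python
import math

heads = 5067      # implied by a proportion of 0.5067 in 10,000 tosses
tosses = 10_000
p = 0.5           # population proportion for a fair coin

p_hat = heads / tosses
se = math.sqrt(p * (1 - p) / tosses)
z = (p_hat - p) / se

print(f"p^ = {p_hat:.4f}, SE = {se:.4f}, z = {z:.2f}")  # z = 1.34
print("significant" if abs(z) > 1.96 else "not significant")
```

With 10,000 tosses the SE is only 0.005, so a sample proportion would need to fall below 0.4902 or above 0.5098 to cross the 1.96 cut-off.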
Conclusion
Not all sample
proportions are statistically significant estimates of the population
proportion. The sample size is pivotal for identifying whether the
sample proportion is a statistically significant estimate of the population
proportion or not.