Family Tree

About Me

Kathmandu, Bagmati Zone, Nepal
I am Basan Shrestha from Kathmandu, Nepal. I use the term 'BASAN' as 'Balancing Actions for Sustainable Agriculture and Natural Resources'. I am a Design, Monitoring & Evaluation professional. I hold 1) an MSc in Regional and Rural Development Planning, Asian Institute of Technology, Thailand, 2002; 2) an MSc in Statistics, Tribhuvan University (TU), Kathmandu, Nepal, 1995; and 3) an MA in Sociology, TU, 1997. I have more than 10 years of professional experience in socio-economic research, monitoring and documentation on agricultural and natural resource management. I worked at Lumle Agricultural Research Centre, western Nepal, from Nov. 1997 to Dec. 2000; CARE Nepal, mid-western Nepal, from Mar. 2003 to June 2006; WTLCP in far-western Nepal from June 2006 to Jan. 2011; Training Institute for Technical Instruction (TITI) from July to Sep 2011; UN Women Nepal from Sep to Dec 2011; Mercy Corps Nepal from 24 Jan 2012 to 14 August 2016; and CAMRIS International in Nepal commencing 1 February 2017. I have published articles to my credit.

Saturday, September 8, 2018

Two-category Discrete Probability Distribution of Sampling With Replacement, Observation and Theory, Statistical Note 34

Toss a coin five times and count the number of heads. Repeat this process seven times, so that there are seven sets of five tosses each. Calculate the observed and theoretical discrete probability distributions of the number of heads.

The observed probability distribution is based on the data actually collected. The theoretical probability distribution is based on an ideal situation. Working with observed data helps in understanding the theory. The main objective of this note is to develop an understanding of complex probability and probability distribution concepts using a simple experiment.

Tossing a coin is an example of the two-category discrete probability distribution of sampling with replacement. Refer also to my earlier Statistical Notes for clarity on calculating the two-category discrete probability using a tree diagram, a formula and an Excel function.

In this note, I first present the observed data and then explain the probability and two-category discrete probability concepts using the trial data. The aim is to clarify the concept of the two-category discrete probability distribution starting from the observed data. Former notes first clarified the theory and then discussed the observation. This note goes the other way round: it first discusses the observed data using the tree diagram and then clarifies the theoretical distribution. The theory is then used to analyze and interpret the observed data meaningfully.

Observed Data

I tossed a coin five times in a set and repeated the same process for seven sets. Table 1 presents the outcome of the five tosses in each of the seven sets. Head and tail are coded H and T respectively. In addition, a cell with a tail is shaded sky blue and a cell with a head is shaded brown. I will discuss this table further in the following sections.


Table 1: Outcomes in five tosses of a coin in each of seven sets (head=H and tail=T)







Queries
Several questions may arise from the outcome data in Table 1. For example:
· Why is every outcome in the table different from the others?
· Is there any pattern of outcomes in five tosses of a coin?
· Why were there outcomes with only one to three heads in five tosses? Why not fewer or more heads than that?
· How many unique outcomes will there be in five tosses?
· Can unique outcomes be grouped?
· How many different groups of outcomes will there be, from five heads to no heads, in five tosses of a coin?
· What is the probability of the event in the first set S1 (T,T,T,H and H) in Table 1, with tails and heads in exactly this order?
· What is the probability of two heads out of five tosses when the order does not matter, that is, regardless of which tosses give a head or a tail?
· Looking at Table 1, what will be the observed discrete probability distribution of the number of heads?
· What will be the theoretical probability distribution of the number of heads in five tosses of a coin?
· How different will the observed discrete probability distribution of the number of heads in five tosses of a coin be from the theoretical one?

Response

These questions can be answered using different tools: the Tree Diagram, the Binomial Expansion and the Binomial Distribution function. Refer to the specific Statistical Notes for more detail. Below are responses to the queries:

Questions: Why is every outcome in the table different from the others? Is there any pattern of outcomes in five tosses of a coin? Why were there outcomes with only one to three heads in five tosses? Why not fewer or more heads than that?

An unbiased coin is one for which head and tail are equally likely to turn up, each with a probability of one half. Every toss of a coin is a random experiment in which either side of the coin may turn up. Thus, every outcome in the table happens to differ from the others. However, the same sequence of outcomes could well appear more than once over several sets. There are some patterns of outcomes in five tosses. Four sets (S3 to S5 and S7), each with five tosses, have three heads, two sets (S1 and S2) have two heads and one set (S6) has one head. Other sets could have any other number of heads, ranging from zero to all five heads in five tosses. Thus, there is no guarantee that a specified number of heads will turn up in any number of tosses.
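The randomness described above can be tried out with a small simulation. This is only a sketch: the function name `toss_sets` and the seed value are my own illustrative choices, not part of the original experiment, and the simulated sets will generally not match Table 1.

```python
import random

def toss_sets(num_sets=7, tosses_per_set=5, seed=None):
    """Simulate sets of fair-coin tosses; each set is a string such as 'TTTHH'."""
    rng = random.Random(seed)
    return ["".join(rng.choice("HT") for _ in range(tosses_per_set))
            for _ in range(num_sets)]

# The seed is arbitrary; it is fixed only so that a rerun gives the same sets.
sets = toss_sets(seed=2018)
head_counts = [s.count("H") for s in sets]
for i, (s, h) in enumerate(zip(sets, head_counts), start=1):
    print(f"S{i}: {s} -> {h} head(s)")
```

Changing or removing the seed gives different sets each time, which illustrates why no particular number of heads is guaranteed.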

Question: How many unique outcomes will there be in five tosses?

Refer to my Statistical Notes 9 and 15 on the total number of possible outcomes of sampling with replacement. Each unique outcome differs from the others in which side of the coin turns up in a certain toss. The total number of possible outcomes is calculated using the formula ‘k to the power r’ or ‘k to the rth power’, written k^r, where ‘k’ is the number of possible outcomes in an experiment or trial and ‘r’ is the number of times an experiment or a trial is conducted. In this example, ‘k’ is two and ‘r’ is five, so the total number of possible outcomes is calculated by multiplying the two possibilities (head or tail) in each of the five tosses. This is 2 x 2 x 2 x 2 x 2 = 32, that is, ‘two to the power five’ or ‘two to the fifth power’, denoted by 2^5. This is clearly seen in tree diagram 1 as well. The outcomes with the number of heads ranging from five heads, denoted by ‘5H’, to zero heads, denoted by ‘0H’, are indicated by different colors in the third block from the right in tree diagram 1. In addition, the seven outcomes of five tosses listed in Table 1 are among these 32 outcomes and are shown with their respective outcome numbers S1 to S7 in different colors at the rightmost part of tree diagram 1.
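The count of 32 can also be checked by listing every branch of the tree. The sketch below uses Python's standard `itertools.product`; the variable names are illustrative only.

```python
from itertools import product

k = 2  # possible results per toss: head (H) or tail (T)
r = 5  # number of tosses

# Every ordered sequence of H and T of length r, i.e. every path in the tree
outcomes = ["".join(p) for p in product("HT", repeat=r)]

print(len(outcomes), k ** r)   # 32 32
print(outcomes[:4])            # ['HHHHH', 'HHHHT', 'HHHTH', 'HHHTT']
print("TTTHH" in outcomes)     # True: S1 is one of the 32 outcomes
```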
























Diagram 1: Tree Diagram Showing Outcomes in Five Toss of a Coin

Questions: Can unique outcomes be grouped? How many groups of outcomes will there be having five heads to no heads in five tosses of a coin?

Refer to my Statistical Note 15 on the grouping of unique outcomes of sampling with replacement in which the order does not matter. The possible number of outcome groups is based on the number of heads or tails in the tosses, irrespective of the order in which the heads turn up. This is calculated using the formula C(k+r-1, r) = (k+r-1)! / ((k-1)! r!), where ‘C’ refers to the combination, ‘k’ is the number of possible outcomes in an experiment or trial and ‘r’ is the number of times an experiment or a trial is conducted. In this example, ‘k’ is two and ‘r’ is five, so the number of groups of possible outcomes is C(6, 5) = 6! / (1! x 5!), equal to 6.

The grouping of outcomes by the number of heads is shown in tree diagram 1. There are six groups of outcomes, G1 to G6. G1 has only one outcome with five heads in five tosses, G2 has five outcomes with four heads, G3 has 10 outcomes with three heads, G4 has 10 outcomes with two heads, G5 has five outcomes with one head and G6 has one outcome with no head, that is, all tails in five tosses.
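The six groups and their sizes can be reproduced by counting heads in each of the 32 outcomes. This is a minimal sketch using only the Python standard library; the variable names are my own.

```python
from collections import Counter
from itertools import product
from math import comb

k, r = 2, 5

# Number of groups when order does not matter: C(k+r-1, r) = C(6, 5)
num_groups = comb(k + r - 1, r)
print(num_groups)  # 6

# Size of each group: how many of the 32 ordered outcomes have a given number of heads
group_sizes = Counter("".join(p).count("H") for p in product("HT", repeat=r))
for heads in range(r, -1, -1):
    print(f"{heads}H: {group_sizes[heads]} outcome(s)")
```

The printed group sizes, from five heads down to no heads, are 1, 5, 10, 10, 5 and 1, matching G1 to G6.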

This grouping can be shown using the Binomial Expansion formula. Let ‘H’ be the head and ‘T’ be the tail. Since a coin is tossed five times, the fifth power of the sum of ‘H’ and ‘T’ is used for the Binomial expansion. The expansion of (H+T)^5 is expressed as:
(H+T)^5 = H^5 + 5H^4T + 10H^3T^2 + 10H^2T^3 + 5HT^4 + T^5

‘H^5’ means there is one outcome (G1) having five heads in five tosses of a coin. In the same way, ‘5H^4T’ means there are five outcomes having four heads and one tail, ‘10H^3T^2’ means 10 outcomes with three heads and two tails, ‘10H^2T^3’ means 10 outcomes with two heads and three tails, ‘5HT^4’ means five outcomes with one head and four tails, and ‘T^5’ means one outcome consisting of five tails in five tosses of a coin.

The same expansion is used for the probability of a head, ‘h’, and the probability of a tail, ‘t’. The expansion is: (h+t)^5 = h^5 + 5h^4t + 10h^3t^2 + 10h^2t^3 + 5ht^4 + t^5

The number of outcomes in a certain group of outcomes or the Binomial coefficient can be identified using Pascal’s Triangle as discussed in my Statistical Note 16, as shown in Diagram 2.








Diagram 2: Number of toss of a coin and Binomial Coefficient using Pascal’s Triangle
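The rows of Pascal's Triangle can be built by adding neighbouring entries of the previous row. The following is a sketch only; the helper name `pascal_rows` is hypothetical and not taken from any of my earlier notes.

```python
def pascal_rows(n):
    """Return rows 0..n of Pascal's Triangle as lists of integers."""
    rows = [[1]]
    for _ in range(n):
        prev = rows[-1]
        # Each inner entry is the sum of the two entries above it
        inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows

for tosses, row in enumerate(pascal_rows(5)):
    print(tosses, row)
```

The row for five tosses is 1, 5, 10, 10, 5, 1, the same coefficients as in the expansion of (H+T)^5.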

With an increase in the number of tosses, say 20 tosses of a coin, it becomes difficult to draw the tree diagram as well as to write the Binomial expansion of (H+T)^20, particularly the coefficients of each group of outcomes. In such a case, the Binomial Distribution function is used to identify the coefficients. The formula is C(n,x) H^x T^(n-x), where ‘n’ is the number of trials, ‘x’ is the number of heads, ‘H’ is the head and ‘T’ is the tail. For example, I use the Binomial distribution formula to calculate the number of outcomes in the group consisting of three heads and two tails in five tosses. It is C(5,3) H^3 T^2, equivalent to 10H^3T^2, which is the same as the third term in the above Binomial expansion.
   
Using the Binomial distribution function, as discussed in my Statistical Note 16, the expansion looks like this:
(H+T)^n = C(n,0)H^n + C(n,1)H^(n-1)T + C(n,2)H^(n-2)T^2 + C(n,3)H^(n-3)T^3 + ... + C(n,x)H^(n-x)T^x + ... + C(n,n)T^n
In the general term of this expansion, ‘x’ counts the tails; since C(n,x) = C(n,n-x), the coefficients are the same whichever side is counted. This expression can be used to calculate the number of outcomes in a certain group of heads and ultimately the total number of outcomes for the given number of trials or experiments with replacement.
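For the 20-toss case mentioned above, the coefficients can be computed directly rather than drawn or expanded by hand. This sketch uses `math.comb` from the Python standard library (available from Python 3.8); the variable names are illustrative.

```python
from math import comb

n = 20  # number of tosses

# Coefficient C(n, x) for each x from 0 to n
coefficients = [comb(n, x) for x in range(n + 1)]

print(coefficients[:4])             # [1, 20, 190, 1140]
print(comb(n, 10))                  # 184756, the largest coefficient
print(sum(coefficients) == 2 ** n)  # True: the groups together cover all 2^20 outcomes
print(comb(5, 3))                   # 10, the five-toss example in the text
```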

Question: What is the probability of an event in the first set S1 (T,T,T,H and H) in Table 1 that has tails and head in exactly this order?

Probability is calculated by dividing the number of favorable outcomes by the total number of possible outcomes. There is only one outcome that has three tails followed by two heads in exactly this order in five tosses of a coin. As already discussed at length above, there are 32 outcomes in five tosses. Thus, the probability of the event (T,T,T,H and H) is 1 divided by 32, equal to 0.03125.

Question: What is the probability of two heads out of five tosses when the order does not matter, that is, regardless of which tosses give a head or a tail?

Looking at the tree diagram, the Binomial expansion and the Binomial distribution function as discussed above, there are 10 outcomes having two heads and three tails in five tosses of a coin. Thus, the probability of two heads is 10 divided by 32, equal to 0.3125.
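Both probabilities can be verified with exact fractions. This is a small sketch using the standard `fractions` and `math` modules; the variable names are my own.

```python
from fractions import Fraction
from math import comb

total_outcomes = 2 ** 5  # 32 ordered outcomes in five tosses

# One specific ordered outcome, such as T,T,T,H,H
p_exact_order = Fraction(1, total_outcomes)

# Any outcome with exactly two heads, order ignored: C(5, 2) = 10 outcomes
p_two_heads = Fraction(comb(5, 2), total_outcomes)

print(p_exact_order, float(p_exact_order))  # 1/32 0.03125
print(p_two_heads, float(p_two_heads))      # 5/16 0.3125
```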

Question: Looking at Table 1, what will be the observed discrete probability distribution of number of heads?

To summarize, the number of heads out of five tosses in the seven sets ranged from one to three (Table 1). Two heads turned up in two of the seven sets (S1 and S2), three heads turned up in four sets (S3 to S5 and S7), and one head turned up in one set (S6). Thus, three heads was the most frequent result, occurring in four out of seven sets, with observed probability P(X=3) = 4/7, approximately 0.5714, highlighted yellow in Table 2.
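The observed distribution can be recomputed from the head counts stated above. This sketch enters only the number of heads per set as given in the text, not the full toss sequences of Table 1; the dictionary name `heads_per_set` is illustrative.

```python
from collections import Counter
from fractions import Fraction

# Heads per set as stated in the text: S1, S2 -> 2; S3, S4, S5, S7 -> 3; S6 -> 1
heads_per_set = {"S1": 2, "S2": 2, "S3": 3, "S4": 3, "S5": 3, "S6": 1, "S7": 3}

counts = Counter(heads_per_set.values())
num_sets = len(heads_per_set)

# Observed probability of each possible number of heads, 0 to 5
observed = {x: Fraction(counts[x], num_sets) for x in range(6)}
for x, p in observed.items():
    print(f"P(X={x}) = {p} = {float(p):.4f}")
```

Numbers of heads that never occurred in the seven sets (zero, four and five) get an observed probability of zero.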

Table 2: Number of Heads Out of Five Tosses of a Coin in Each of Seven Sets and Observed Probability Distribution of Number of Heads






Question: What will be the theoretical probability distribution of number of heads in five tosses of a coin?

I also discussed the calculation of the Theoretical Two-Category Discrete Probability Distribution in my former Statistical Note 29. As discussed above, the number of outcomes under different groups of heads can be calculated using the tree diagram, the Binomial expansion and the Binomial distribution function. Here, I present only the table of the number of heads in five tosses and the respective probabilities (Table 3). The Binomial distribution function in Excel can also be used to calculate the two-category theoretical probability distribution with replacement.
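The theoretical probabilities can be computed from the Binomial formula C(n,x) p^x (1-p)^(n-x) with n = 5 and p = 0.5. The sketch below is an illustration in Python, not the Excel calculation itself; the helper name `binomial_pmf` is hypothetical. In Excel, the corresponding call would presumably be BINOM.DIST(x, 5, 0.5, FALSE).

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 5, 0.5  # five tosses of an unbiased coin
theoretical = {x: binomial_pmf(x, n, p) for x in range(n + 1)}
for x, prob in theoretical.items():
    print(f"{x} head(s): {comb(n, x):2d} outcome(s), P = {prob}")
```

The probabilities for zero to five heads come out as 0.03125, 0.15625, 0.3125, 0.3125, 0.15625 and 0.03125, which add up to one.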

Table 3: Number of Heads, Number of Outcome Groups and Theoretical Probability Distribution of Number of Heads








Two or three heads in five tosses have the highest probability (0.3125 each) and are thus the most likely to occur. These are highlighted yellow. The likelihood decreases on both sides of two and three heads. The two extreme numbers of heads, zero and five, have the least chance of occurrence.

Question: How different will the observed discrete probability distribution of the number of heads in five tosses of a coin be from the theoretical one?

Chart 1 compares the observed and theoretical two-category discrete probability distributions of the number of heads in five tosses of a coin. It shows the symmetric, bell-shaped line of the theoretical probability distribution and how much the observed distribution differs from it.
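The comparison can also be put in numbers. This sketch enters the observed probabilities from the seven sets (1/7, 2/7 and 4/7 for one, two and three heads) and sets them against the theoretical values C(5,x)/32; the layout and variable names are my own.

```python
from fractions import Fraction
from math import comb

# Observed probabilities from the seven sets; unobserved head counts are zero
observed = {0: Fraction(0), 1: Fraction(1, 7), 2: Fraction(2, 7),
            3: Fraction(4, 7), 4: Fraction(0), 5: Fraction(0)}
# Theoretical probabilities for an unbiased coin
theoretical = {x: Fraction(comb(5, x), 32) for x in range(6)}

print("heads  observed  theoretical  difference")
for x in range(6):
    diff = observed[x] - theoretical[x]
    print(f"{x:>5}  {float(observed[x]):8.4f}  {float(theoretical[x]):11.4f}  {float(diff):+10.4f}")

# Number of heads at which the two distributions are furthest apart
largest_gap = max(range(6), key=lambda x: abs(observed[x] - theoretical[x]))
print("largest gap at", largest_gap, "heads")
```

The largest gap is at three heads, where the observed 4/7 is well above the theoretical 10/32.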














Conclusion

The tree diagram, the Binomial expansion and the Binomial distribution function are important tools for calculating the number of outcomes and the probability. The observed two-category probability distribution differs from the theoretical distribution. The observed data can differ from one set to another because each toss is random and only seven sets were observed, and possibly also because of non-uniformity in the conditions under which the coin is tossed repeatedly.
