Family Tree

About Me

Kathmandu, Bagmati Zone, Nepal
I am Basan Shrestha from Kathmandu, Nepal. I use the term 'BASAN' as 'Balancing Actions for Sustainable Agriculture and Natural Resources'. I am a Design, Monitoring & Evaluation professional. I hold 1) an MSc in Regional and Rural Development Planning, Asian Institute of Technology, Thailand, 2002; 2) an MSc in Statistics, Tribhuvan University (TU), Kathmandu, Nepal, 1995; and 3) an MA in Sociology, TU, 1997. I have more than 10 years of professional experience in socio-economic research, monitoring and documentation on agricultural and natural resource management. I worked at Lumle Agricultural Research Centre, western Nepal, from Nov. 1997 to Dec. 2000; CARE Nepal, mid-western Nepal, from Mar. 2003 to June 2006; WTLCP in far-western Nepal from June 2006 to Jan. 2011; Training Institute for Technical Instruction (TITI) from July to Sep 2011; UN Women Nepal from Sep to Dec 2011; Mercy Corps Nepal from 24 Jan 2012 to 14 August 2016; and CAMRIS International in Nepal, commencing 1 February 2017. I have published articles to my credit.

Friday, June 29, 2018

Discrete Probability Distribution of Sampling Without Replacement, Tree Diagram and Formula, Statistical Note 18

Among 40 participants in a training, 18 were vegetarians and 22 were non-vegetarians. Two participants are selected at random, one after another, without replacing the name of the first selected participant. Calculate the probability distribution of the number of vegetarians selected, where the order does not matter, that is, regardless of whether a vegetarian is selected in the first or the second draw.

Calculating the probability of a favorable event when objects are sampled without replacement from a finite binary population requires counting both the total number of outcomes and the number of favorable outcomes. A tree diagram is one way to visualize and count the outcomes and calculate the probabilities. A formula is another way to count the outcomes and calculate the discrete probability distribution. I have taken an example from my statistical notes 3 and 6 to show how to visualize and calculate the discrete probability distribution without replacement using both the tree diagram and the formula.

Tree Diagram

In this example, two participants are selected consecutively, and the first is not returned to the pool before the second is drawn; this is referred to as sampling without replacement. The probability of each outcome is presented in Diagram 1. For a detailed discussion of how these probabilities are derived, please refer to my statistical note 6.










Diagram 1: First and second steps showing marginal and conditional probabilities (without replacement of the first selected participant)

There are two possibilities at each draw: the selected participant is either a vegetarian or a non-vegetarian. The joint probability that both the first and second selected participants are vegetarians, denoted by P(V1∩V2), is the product of P(V1) and P(V2|V1), that is, (18/40) × (17/39) = 0.196. Following the same process, the other joint probabilities are P(V1∩NV2) = (18/40) × (22/39) = 0.254, P(NV1∩V2) = (22/40) × (18/39) = 0.254, and P(NV1∩NV2) = (22/40) × (21/39) = 0.296.
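The branch probabilities of the tree can be checked with a short Python sketch. It is only an illustration under stated assumptions: the variable names and the path labels are hypothetical, and exact fractions are used so that rounding happens only when printing.

```python
from fractions import Fraction

N = 40   # participants in the training (from the example)
M = 18   # vegetarians among them (from the example)

# First draw: marginal probabilities.
p_v1 = Fraction(M, N)
p_nv1 = Fraction(N - M, N)

# Second draw: conditional probabilities, with one participant already removed.
p_v2_given_v1 = Fraction(M - 1, N - 1)
p_nv2_given_v1 = Fraction(N - M, N - 1)
p_v2_given_nv1 = Fraction(M, N - 1)
p_nv2_given_nv1 = Fraction(N - M - 1, N - 1)

# Joint probability of each path = marginal x conditional.
# The keys are illustrative labels for the four paths of the tree.
joint = {
    "V1,V2": p_v1 * p_v2_given_v1,
    "V1,NV2": p_v1 * p_nv2_given_v1,
    "NV1,V2": p_nv1 * p_v2_given_nv1,
    "NV1,NV2": p_nv1 * p_nv2_given_nv1,
}

for path, p in joint.items():
    print(f"P({path}) = {p} = {float(p):.3f}")

# The four paths cover every possible outcome, so they must sum to 1.
assert sum(joint.values()) == 1
```

In exact terms the four paths come to 51/260, 33/130, 33/130 and 77/260, which sum to 1.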
  
Table 1: Discrete probability distribution of vegetarians sampled without replacement
The probability distribution of vegetarians asked for in the question is shown in Table 1. Let X be a random variable giving the number of vegetarians in two consecutive selections of participants. X takes the value 2 for the joint event V1∩V2, in which a vegetarian is selected both times. X takes the value 1 for both joint events V1∩NV2 and NV1∩V2: in each, exactly one of the two selected participants is a vegetarian, and since the position of the vegetarian does not matter, the two probabilities are added. X takes the value 0 for the joint event NV1∩NV2, in which neither of the two selected participants is a vegetarian.

The probability distribution shows a 29.6 percent chance (462/1560) that no vegetarian is selected, that is, both participants are non-vegetarians; a 50.8 percent chance (792/1560) that exactly one of the two selected participants is a vegetarian; and a 19.6 percent chance (306/1560) that both are vegetarians, that is, neither is a non-vegetarian. Adding the first two probabilities gives an 80.4 percent chance that at most one vegetarian is selected. There is a 100 percent chance of getting two or fewer vegetarians in a draw of two participants.
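The same distribution can be obtained by brute-force counting of outcomes. The following Python sketch is an illustration only, not the method of the original note: it lists every ordered pair of distinct participants, counts the vegetarians in each pair, and builds the probability and cumulative distributions. The names `population`, `pmf` and `cdf` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, permutations

# 18 vegetarians ("V") and 22 non-vegetarians ("NV"), as in the example.
population = ["V"] * 18 + ["NV"] * 22

# Every ordered pair of distinct participants: 40 x 39 = 1560 outcomes.
counts = Counter(pair.count("V") for pair in permutations(population, 2))
total = sum(counts.values())

# Probability distribution of X, the number of vegetarians in the pair.
pmf = {x: Fraction(counts[x], total) for x in sorted(counts)}

# Cumulative distribution: P(X <= x).
cdf = dict(zip(pmf, accumulate(pmf.values())))

for x in pmf:
    print(f"X={x}: P(X=x)={float(pmf[x]):.3f}  P(X<=x)={float(cdf[x]):.3f}")
```

Counting ordered pairs mirrors the tree diagram, where the two paths with exactly one vegetarian are counted separately and then added.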

Formula

This example has three characteristic features. First, the example has a finite population of 40 participants, denoted by ‘N’. Second, each participant can be characterized as a success or a failure. Since the question asks for the probability of vegetarians, the selection of a vegetarian is considered a success; the number of successes in the population is denoted by ‘M’, and here there are 18. Third, a sample of two participants, whose size is denoted by ‘n’, is selected without replacement in such a way that each sample of 2 participants is equally likely to be selected.

Let X be the random variable of interest: the number of vegetarians in the sample of two participants drawn without replacement. X takes a value ‘x’ equal to 0, 1 or 2. The probability distribution of X depends on the parameters ‘n’, ‘M’ and ‘N’, and is given by the expression
P(X=x) = h(x;n,M,N) = (number of outcomes having X=x) / (total number of outcomes)
P(X=x) = h(x;n,M,N) = [C(M,x) × C(N-M,n-x)] / C(N,n)

Here C(a,b) denotes the number of ways of choosing b objects from a. This distribution is referred to as the hypergeometric distribution.
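The formula translates directly into a few lines of Python. This is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function name `hypergeom_pmf` is hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import comb  # available in Python 3.8 and later


def hypergeom_pmf(x, n, M, N):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N that contains M successes."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))


# Values from the example: n=2, M=18, N=40.
for x in range(3):
    p = hypergeom_pmf(x, n=2, M=18, N=40)
    print(f"P(X={x}) = {p} = {float(p):.3f}")
```

Returning an exact `Fraction` keeps the three probabilities summing to exactly 1; converting to a float is left to the printing step.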

In this example, n=2, M=18, N=40 and ‘x’ takes the values 0, 1 and 2. Putting these values in the above formula, one gets
  
P(X=0) = [C(18,0) × C(22,2)] / C(40,2) = (22 × 21) / (40 × 39) = 0.296
P(X=1) = [C(18,1) × C(22,1)] / C(40,2) = (18 × 22 × 2) / (40 × 39) = 0.508
P(X=2) = [C(18,2) × C(22,0)] / C(40,2) = (18 × 17) / (40 × 39) = 0.196

These values agree with the ones presented in Table 1 above. This indicates that the tree diagram and the formula produce the same values, and that both are useful for calculating the discrete probability distribution of sampling without replacement.
