Among 40 participants in a training, 18 were vegetarians
and 22 were non-vegetarians. Two participants are selected at random, one after
another, without replacing the name of the first selected participant. Calculate
the probability distribution of the number of vegetarians selected, where it does
not matter whether a vegetarian is selected in the first or the second draw.
Counting the total and favorable numbers
of outcomes when a specified number of objects is sampled without
replacement from a finite binary population is essential for
calculating the probability of favorable events. A tree diagram is one important means
of visualizing and counting the outcomes and calculating their probabilities.
A formula is another means of counting the outcomes and calculating the discrete probability
distribution. I have
taken an example from my statistical notes 3 and 6 to show how to visualize and
calculate the discrete probability distribution without replacement using both the
tree diagram and the formula.
Tree Diagram
In this example, two participants are
selected consecutively, and the first is not returned to the pool before the
second is drawn; this is referred to as sampling without
replacement. The probability of each outcome is
presented in Diagram 1. For a detailed discussion of how these probabilities are
derived, please refer to my statistical note 6.
Diagram 1: First and second steps showing marginal and conditional probabilities
(without replacement of the first selected participant)
At each step there are two possibilities: either
the selected participant is a vegetarian (V) or a non-vegetarian (NV). The joint probability that both the first
and second selected participants are vegetarians, denoted by P(V1∩V2), is
the product of P(V1) and P(V2|V1), that is (18/40) × (17/39), which is approximately 0.196. Following the same
process, the other joint probabilities are P(V1∩NV2) = (18/40) × (22/39) ≈ 0.254,
P(NV1∩V2) = (22/40) × (18/39) ≈ 0.254, and P(NV1∩NV2) = (22/40) × (21/39) ≈ 0.296.
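The branch-by-branch multiplication described above can be sketched in Python. This is only an illustrative sketch, not part of the original note: the function name `joint_probabilities` and its arguments `n_veg` and `n_nonveg` are assumptions made here, and exact fractions are used so that rounding happens only when the values are displayed.

```python
from fractions import Fraction


def joint_probabilities(n_veg, n_nonveg):
    """Joint probabilities of two draws without replacement.

    Keys are (first draw, second draw), with "V" for a vegetarian
    and "NV" for a non-vegetarian. Names are illustrative only.
    """
    total = n_veg + n_nonveg
    counts = {"V": n_veg, "NV": n_nonveg}
    joint = {}
    for first in counts:
        # Marginal probability of the first draw.
        p_first = Fraction(counts[first], total)
        for second in counts:
            # One participant of type `first` has already been removed.
            remaining = counts[second] - (1 if second == first else 0)
            # Conditional probability of the second draw given the first.
            p_second = Fraction(remaining, total - 1)
            joint[(first, second)] = p_first * p_second
    return joint


joint = joint_probabilities(18, 22)
for outcome, p in joint.items():
    print(outcome, p, round(float(p), 3))
```

Each printed line shows one branch of the tree, its exact probability and the value rounded to three decimals; the four probabilities sum to 1.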
The probability distribution of vegetarians asked for in the
question is summarized in Table 1. Let X be a discrete random variable
denoting the number of vegetarians in the two consecutive selections of
participants. X takes the value 2 for the outcome V1∩V2, in which vegetarians
are selected both times. X takes the value 1 for both outcomes V1∩NV2 and NV1∩V2:
in each, exactly one of the two selected participants is a vegetarian, and the
two outcomes are equivalent when the
position of the vegetarian does not matter. Thus, their probabilities are
added. X takes the value 0 for the outcome NV1∩NV2, in which neither of
the two selected participants is a vegetarian.
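The step of adding the probabilities of outcomes that share the same number of vegetarians can be sketched as follows. This is a hedged illustration: the dictionary names `joint` and `distribution` are assumptions of this sketch, and the four joint probabilities are written out directly from the counts in the question (18 vegetarians, 22 non-vegetarians).

```python
from collections import defaultdict
from fractions import Fraction

# Joint probabilities of the four ordered outcomes of the tree.
joint = {
    ("V", "V"): Fraction(18, 40) * Fraction(17, 39),
    ("V", "NV"): Fraction(18, 40) * Fraction(22, 39),
    ("NV", "V"): Fraction(22, 40) * Fraction(18, 39),
    ("NV", "NV"): Fraction(22, 40) * Fraction(21, 39),
}

# Group outcomes by X, the number of vegetarians, ignoring draw order.
distribution = defaultdict(Fraction)
for outcome, p in joint.items():
    x = outcome.count("V")
    distribution[x] += p

for x in sorted(distribution):
    print(x, distribution[x], round(float(distribution[x]), 3))
```

Because the order is ignored, the two branches with exactly one vegetarian fall under the same value of X and their probabilities are summed.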
The probability distribution of vegetarians shows that
there is a 29.6 percent chance that no vegetarian is
selected (both participants are non-vegetarians), a 50.8 percent chance that
exactly one of the two selected participants
is a vegetarian, and a 19.6 percent chance that both participants
are vegetarians. Adding the first two
probabilities, there is an 80.4 percent chance that at most one vegetarian
is selected. There is a 100 percent chance of getting two or fewer
vegetarians in the draw of two participants.
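The cumulative reading of the distribution (the chance of at most one, or at most two, vegetarians) can be sketched with a running sum. The names `pmf` and `cdf` are assumptions of this sketch; the numerators are the ordered-pair counts 22 × 21, 2 × 18 × 22 and 18 × 17 out of 40 × 39 ordered pairs.

```python
from fractions import Fraction
from itertools import accumulate

total_pairs = 40 * 39  # ordered pairs of distinct participants

# Probability of 0, 1 and 2 vegetarians among the two selected.
pmf = {
    0: Fraction(22 * 21, total_pairs),
    1: Fraction(2 * 18 * 22, total_pairs),
    2: Fraction(18 * 17, total_pairs),
}

# Running totals give P(X <= x) for each x.
values = sorted(pmf)
cdf = dict(zip(values, accumulate(pmf[x] for x in values)))

for x in values:
    print(f"P(X <= {x}) = {float(cdf[x]):.3f}")
```

The last cumulative value is exactly 1, which is the statement that two or fewer vegetarians are certain in a draw of two.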
Formula
This example has three characteristic
features. First, the example has a finite population of 40 participants,
denoted by ‘N’. Second, each participant can be characterized as a success or a
failure. Since the question asks for the probability of vegetarians, the selection
of a vegetarian is considered a success, and the number of successes in the
population, denoted by ‘M’, is 18. Third, a sample of two participants, with the
sample size denoted by
‘n’, is selected without replacement in such a way that each sample of 2
participants is equally likely to be selected.
Let X be the random variable of interest,
the number of vegetarians in the sample
of two participants drawn without replacement; it takes a value ‘x’ equal to 0, 1 or 2. The
probability distribution of X depends on the parameters ‘n’, ‘M’ and ‘N’, and
is given by the expression
P(X=x) = h(x;n,M,N) = (number of
outcomes having X=x) / (total number of outcomes)
P(X=x) = h(x;n,M,N) = [C(M,x) × C(N-M,n-x)] / C(N,n)
where C(a,b) denotes the number of ways of choosing b objects from a objects.
This distribution is referred to as the hypergeometric
distribution.
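The formula h(x;n,M,N) translates almost directly into code. The following is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function name `hypergeom_pmf` is an assumption of this sketch and not a name used in the note.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, n, M, N):
    """h(x; n, M, N): probability of x successes in a sample of size n
    drawn without replacement from N objects, M of which are successes."""
    # Values of x that cannot occur have probability zero.
    if x < 0 or x > n or x > M or n - x > N - M:
        return Fraction(0)
    # [C(M, x) * C(N - M, n - x)] / C(N, n), kept as an exact fraction.
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))


# Usage with the numbers of the example: n=2, M=18, N=40.
for x in range(3):
    p = hypergeom_pmf(x, n=2, M=18, N=40)
    print(x, p, round(float(p), 3))
```

Returning a `Fraction` rather than a float is a design choice made here so that the three probabilities can be checked to sum to exactly 1.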
In this example, n=2, M=18, N=40 and
‘x’ takes the values 0, 1 and 2. Putting these values in the above formula, one gets
P(X=0) = [C(18,0) × C(22,2)] / C(40,2) =
(22 × 21) / (40 × 39) = 0.296
P(X=1) = [C(18,1) × C(22,1)] / C(40,2) =
(18 × 22 × 2) / (40 × 39) = 0.508
P(X=2) = [C(18,2) × C(22,0)] / C(40,2) =
(18 × 17) / (40 × 39) = 0.196
These values are equal to the ones presented in Table
1 above. This indicates that the tree diagram and the formula produce the same
values, and both are useful for calculating the discrete probability distribution without
replacement.
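The agreement between the two approaches can also be checked by brute force. The sketch below, which is an illustration added here and not the author's method, lists every ordered pair of distinct participants (the leaves of the tree), counts the vegetarians in each pair, and compares the resulting proportions with the formula. The names `population`, `enumerated` and `formula` are assumptions of this sketch.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import comb

# 40 labelled participants: 18 vegetarians and 22 non-vegetarians.
population = ["V"] * 18 + ["NV"] * 22

# Every ordered pair of distinct participants is equally likely.
counts = Counter(pair.count("V") for pair in permutations(population, 2))
total = sum(counts.values())
enumerated = {x: Fraction(c, total) for x, c in counts.items()}

# The same probabilities from [C(M,x) * C(N-M,n-x)] / C(N,n).
formula = {
    x: Fraction(comb(18, x) * comb(22, 2 - x), comb(40, 2))
    for x in range(3)
}

print(enumerated == formula)
```

Enumeration is feasible here only because the population is small; for larger populations the formula is the practical route.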