Family Tree


About Me

Kathmandu, Bagmati Zone, Nepal
I am Basan Shrestha from Kathmandu, Nepal. I use the term 'BASAN' as 'Balancing Actions for Sustainable Agriculture and Natural Resources'. I am a Design, Monitoring & Evaluation professional. I hold 1) an MSc in Regional and Rural Development Planning, Asian Institute of Technology, Thailand, 2002; 2) an MSc in Statistics, Tribhuvan University (TU), Kathmandu, Nepal, 1995; and 3) an MA in Sociology, TU, 1997. I have more than 10 years of professional experience in socio-economic research, monitoring and documentation on agricultural and natural resource management. I worked at Lumle Agricultural Research Centre, western Nepal, from Nov. 1997 to Dec. 2000; CARE Nepal, mid-western Nepal, from Mar. 2003 to June 2006; WTLCP in far-western Nepal from June 2006 to Jan. 2011; Training Institute for Technical Instruction (TITI) from July to Sep 2011; UN Women Nepal from Sep to Dec 2011; Mercy Corps Nepal from 24 Jan 2012 to 14 August 2016; and CAMRIS International in Nepal commencing 1 February 2017. I have published articles to my credit.

Thursday, May 31, 2018

Distribution of Probability With Replacement and Tree Diagram: An Example, Statistical Note 7

Among 40 participants in a training, 18 were vegetarians and 22 were non-vegetarians. Two participants are selected at random, one after another, and the name of the first selected participant is replaced before the second selection. Calculate the probability distribution of the number of vegetarians selected.

I have taken this example from the total number of participants by food habit in my statistical note 3 to show how to visualize and calculate the probability distribution using a probability tree. See my note 6 for a comparison with sampling without replacement and to identify the difference.

There will be two consecutive selections, one participant at a time. The first participant is selected from among the total of 40 participants. The name of the first selected participant is then returned to the sample space, or population, of 40 participants, and the second participant is selected from the same 40 participants. This process is referred to as sampling with replacement. Because the first selection does not change the probabilities of the second, the two selections are independent; the probability in this case is called probability with replacement or independent probability.

At the first stage, there are two possible outcomes of randomly selecting the first participant: a vegetarian or a non-vegetarian (Diagram 1). Let V1 be the simple event that the first selected participant is a vegetarian. The marginal probability that a randomly selected participant is a vegetarian, denoted by P(V1), is 18 divided by 40, or 0.45 (blue block). Similarly, the probability that a randomly selected participant is a non-vegetarian, denoted by P(NV1), is 22 divided by 40, or 0.55 (green block). It can also be calculated as one minus P(V1), which is equal to 0.55.

Diagram 1: First and second steps showing marginal probabilities (with replacement of the first selected participant)

At the second stage, there are again 40 participants, and the tree has four possible branches (V1 then V2, V1 then NV2, NV1 then V2, and NV1 then NV2). The process is the same as at the first stage, so P(V2) is the same as P(V1) and P(NV2) is the same as P(NV1).

Now, let me discuss the joint probabilities of selecting both the first and the second participants. Let (V1 intersection V2), or (V1∩V2), be the joint event that both the first and second selected participants are vegetarians. The joint probability of (V1∩V2), denoted by P(V1∩V2), is the product of P(V1) and P(V2), that is, 0.45 multiplied by 0.45, about 0.202. Following the same process, the other joint probabilities are calculated: P(V1∩NV2) equal to 0.248, P(NV1∩V2) equal to 0.248, and P(NV1∩NV2) equal to 0.303.
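The branch products described above can be reproduced with a short Python sketch. It is only an illustration, not part of the original note: the labels "V" and "NV" and the variable names are my own choices, and exact fractions are used so that rounding does not enter the calculation.

```python
from fractions import Fraction
from itertools import product

# Counts from the example: 18 vegetarians (V) and 22 non-vegetarians (NV).
counts = {"V": 18, "NV": 22}
total = sum(counts.values())  # 40 participants

# With replacement, both draws use the same marginal probabilities.
marginal = {kind: Fraction(n, total) for kind, n in counts.items()}

# Each branch of the two-stage tree is a (first, second) pair; its joint
# probability is the product of the two marginal probabilities.
joint = {
    (first, second): marginal[first] * marginal[second]
    for first, second in product(counts, repeat=2)
}

for branch, p in joint.items():
    print(branch, p, float(p))
```

The exact values are 0.2025, 0.2475, 0.2475 and 0.3025, which agree with the three-decimal figures quoted in the text up to rounding.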

Table 1: Discrete probability distribution of vegetarians sampled with replacement





The probability distribution of vegetarians asked for in the question is shown in Table 1. Let X be a discrete random variable giving the number of vegetarians in the two consecutive selections. X takes the value 2 for the joint event (V1∩V2), in which vegetarians are selected both times. X takes the value 1 for both joint events (V1∩NV2) and (NV1∩V2): in each, exactly one of the two selected participants is a vegetarian, and the position of the vegetarian does not matter, so their probabilities are added. X takes the value 0 for the joint event (NV1∩NV2), in which non-vegetarians are selected both times.

The probability distribution of vegetarians shows that there is a 30.3 percent chance that no vegetarian is selected (both are non-vegetarians), a 49.6 percent chance that exactly one of the two selected participants is a vegetarian, and a 20.2 percent chance that both participants are vegetarians (neither is a non-vegetarian). Adding the first two probabilities, there is a 79.9 percent chance that at most one vegetarian is selected. There is a 100 percent chance of getting two or fewer vegetarians in the draw of two participants.
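As a cross-check on the distribution above, the following sketch builds the distribution of X and its running (cumulative) totals from exact fractions. The names `distribution` and `cumulative` are illustrative assumptions, not taken from the note.

```python
from fractions import Fraction
from itertools import product

# Marginal probabilities, identical at both draws because of replacement.
p_v = Fraction(18, 40)
p = {"V": p_v, "NV": 1 - p_v}

# X = number of vegetarians among the two selected participants.
distribution = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
for first, second in product(p, repeat=2):
    x = (first == "V") + (second == "V")  # booleans add up as 0 or 1
    distribution[x] += p[first] * p[second]

# Cumulative probabilities P(X <= x).
cumulative = {}
running = Fraction(0)
for x in sorted(distribution):
    running += distribution[x]
    cumulative[x] = running

for x in sorted(distribution):
    print(x, float(distribution[x]), float(cumulative[x]))
```

With exact fractions the probabilities are 0.3025, 0.4950 and 0.2025, and the chance of at most one vegetarian is 0.7975. The percentages in the text differ from these in the last digit because they are sums of values already rounded to three decimals.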

Monday, May 28, 2018

Distribution of Probability Without Replacement and Tree Diagram: An Example, Statistical Note 6

Among 40 participants in a training, 18 were vegetarians and 22 were non-vegetarians. Two participants are selected at random, one after another, without replacing the name of the first selected participant. Calculate the probability distribution of the number of vegetarians selected.

I have taken this example from the total number of participants by food habits of my statistical note 3 to show how to visualize and calculate the probability distribution using the probability tree.

There will be two consecutive selections of two participants. The first participant is selected from among the total of 40 participants and then the second participant will be selected from the remaining 39 participants without putting back the first participant in the list. This process is referred to as sampling without replacement. The probability in this case is called probability without replacement or dependent probability.

At the first stage, there are two possible outcomes of randomly selecting the first participant: a vegetarian or a non-vegetarian (Diagram 1). Let V1 be the simple event that the first selected participant is a vegetarian. The marginal probability that a randomly selected participant is a vegetarian, denoted by P(V1), is 18 divided by 40, or 0.45 (blue block). Similarly, the probability that a randomly selected participant is a non-vegetarian, denoted by P(NV1), is 22 divided by 40, or 0.55 (green block). It can also be calculated as one minus P(V1), which is equal to 0.55.










Diagram 1: First and second steps showing marginal and conditional probabilities (without replacement of the first selected participant)

At the second stage, 39 participants are left with four possibilities of randomly selecting the second participant. The first two possibilities are discussed and remaining two possibilities will follow the same process.

Let V2/V1 be the event that the second selected participant is also a vegetarian given that the first participant is a vegetarian. The conditional probability of V2/V1, denoted by P(V2/V1), is the 17 vegetarians left divided by the total of 39 participants left, about 0.43 (grey block; 0.436 to three decimals). Now, the second possibility is discussed. Let NV2/V1 be the event that the second selected participant is a non-vegetarian given that the first participant is a vegetarian. The conditional probability of NV2/V1, denoted by P(NV2/V1), is the 22 non-vegetarians divided by the total of 39 participants left, about 0.56 (yellow block). Following the same process, P(V2/NV1) is 18 divided by 39, about 0.46 (red block), and P(NV2/NV1) is 21 divided by 39, about 0.54 (purple block).
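The four conditional probabilities at the second stage can be computed with a small helper. This is a sketch under my own naming (the function `conditional` is hypothetical, not from the note); it simply removes the first selected participant from the counts before dividing by the 39 participants left.

```python
from fractions import Fraction

# Counts from the example: 18 vegetarians (V) and 22 non-vegetarians (NV).
counts = {"V": 18, "NV": 22}
total = sum(counts.values())  # 40 participants


def conditional(second, first):
    """P(second / first) when the first participant is not put back."""
    remaining = dict(counts)
    remaining[first] -= 1  # the first selected participant leaves the pool
    return Fraction(remaining[second], total - 1)


for first in counts:
    for second in counts:
        pr = conditional(second, first)
        print(f"P({second}2/{first}1) = {pr} = {float(pr):.4f}")
```

For each first-stage outcome, the two second-stage conditional probabilities add up to one, which is a quick way to check a tree diagram.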

Now, let me discuss the joint probabilities of selecting both the first and the second participants. Let (V1 intersection V2), or (V1∩V2), be the joint event that both the first and second selected participants are vegetarians. The joint probability of (V1∩V2), denoted by P(V1∩V2), is the product of P(V1) and P(V2/V1), that is, 0.45 multiplied by 0.43, equal to 0.194. Following the same process, the other joint probabilities are calculated: P(V1∩NV2) equal to 0.252, P(NV1∩V2) equal to 0.253, and P(NV1∩NV2) equal to 0.297.
  
Table 1: Discrete probability distribution of sampled vegetarians
The probability distribution of vegetarians asked for in the question is shown in Table 1. Let X be a discrete random variable giving the number of vegetarians in the two consecutive selections. X takes the value 2 for the joint event (V1∩V2), in which vegetarians are selected both times. X takes the value 1 for both joint events (V1∩NV2) and (NV1∩V2): in each, exactly one of the two selected participants is a vegetarian, and the position of the vegetarian does not matter, so their probabilities are added. X takes the value 0 for the joint event (NV1∩NV2), in which neither of the two selected participants is a vegetarian.

The probability distribution of vegetarians shows that there is a 29.7 percent chance that no vegetarian is selected (both are non-vegetarians), a 50.5 percent chance that exactly one of the two selected participants is a vegetarian, and a 19.4 percent chance that both participants are vegetarians (neither is a non-vegetarian). Adding the first two probabilities, there is an 80.2 percent chance that at most one vegetarian is selected. There is a 100 percent chance of getting two or fewer vegetarians in the draw of two participants.
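The distribution without replacement can also be checked with exact fractions. The sketch below is an illustration with names of my own choosing; it multiplies each first-stage marginal probability by the matching second-stage conditional probability and groups the branches by the number of vegetarians.

```python
from fractions import Fraction

# Counts from the example: 18 vegetarians (V) and 22 non-vegetarians (NV).
counts = {"V": 18, "NV": 22}
total = sum(counts.values())  # 40 participants

# X = number of vegetarians among the two selected participants.
distribution = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
for first in counts:
    p_first = Fraction(counts[first], total)
    for second in counts:
        # One participant of the first kind has already left the pool.
        left = counts[second] - (1 if second == first else 0)
        p_second_given_first = Fraction(left, total - 1)
        x = (first == "V") + (second == "V")
        distribution[x] += p_first * p_second_given_first

up_to_one = distribution[0] + distribution[1]

for x in sorted(distribution):
    print(x, distribution[x], round(float(distribution[x]), 4))
print("P(X <= 1) =", round(float(up_to_one), 4))
```

The exact probabilities are about 0.296, 0.508 and 0.196, with about 0.804 for at most one vegetarian. They differ slightly from the figures in the text because those are products of conditional probabilities already rounded to two decimals.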

Saturday, May 26, 2018

Joint Probability and Venn Diagram: An Example, Statistical Note 5

A Venn Diagram is an important tool to visualize the joint probability. I take an example from my statistical note 3 to apply the Venn Diagram. Table 1 presents data on the number of training participants by sex of participants and food habit.

Table 1: Food habit of training participants by sex





In the crosstab above, consider the joint probability that a randomly selected participant is a woman who is also a vegetarian, denoted by P(W intersection V) or P(W∩V). It is the product of P(W) and P(V/W). P(W) is calculated as 16 divided by 40, or 0.40. P(V/W) is 12 divided by 16, or 0.75. P(W∩V) is the product of 0.40 and 0.75, equal to 0.30. This probability is equal to the first cell value (12), at the intersection of the women column and the vegetarian row, divided by the grand total (40). Another way of calculating P(W∩V) is as the product of P(V) and P(W/V).
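The two routes to P(W∩V) mentioned above, and the direct cell-over-grand-total route, can be compared in a few lines. This is a minimal sketch using only the counts quoted in the notes (40 participants, 16 women, 18 vegetarians, 12 vegetarian women); the variable names are my own.

```python
from fractions import Fraction

n_total = 40      # all participants
n_women = 16      # women
n_veg = 18        # vegetarians
n_women_veg = 12  # women who are vegetarians

p_w = Fraction(n_women, n_total)              # P(W)
p_v = Fraction(n_veg, n_total)                # P(V)
p_v_given_w = Fraction(n_women_veg, n_women)  # P(V/W)
p_w_given_v = Fraction(n_women_veg, n_veg)    # P(W/V)

joint_via_w = p_w * p_v_given_w               # P(W) x P(V/W)
joint_via_v = p_v * p_w_given_v               # P(V) x P(W/V)
joint_direct = Fraction(n_women_veg, n_total) # cell value / grand total

print(float(joint_via_w), float(joint_via_v), float(joint_direct))
```

All three calculations give the same joint probability, 0.30.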

The same events and calculations are also shown in diagram 1. The set, or event, W that a participant in the training is a woman, with the corresponding probability P(W), is shown by the blue circle with the probability value. Likewise, the set, or event, V that a participant in the training is a vegetarian, with the corresponding probability P(V), is shown by the yellow circle with the probability value. The area of overlap, or intersection, between the two circles is the event (W∩V) that a randomly selected participant is a woman who is also a vegetarian; it is indicated by the blue line. The calculation of P(W∩V) is explained in the green box linked to that blue line.

Diagram 1: Joint event and probability

Wednesday, May 23, 2018

Probability Rule of Independence, Statistical Note 4

Two events are said to be independent if the occurrence of one event does not affect the probability of the other. In that case, the probability of the second event given the first event is equal to the probability of the second event.

I take an example from my statistical note 3 to show whether the sex of participants is independent of the food habit of the participants. Table 1 presents data on the number of training participants by sex of participants and food habit.

Table 1: Food habit of training participants variable by sex





Here, I will take the case of the joint probability of the first cell, P(V∩W). The sex of the participant is independent of the food habit if P(V/W) is equal to P(V) or, equivalently, if P(W/V) is equal to P(W). P(V/W) is equal to 12 divided by 16, or 0.75. P(V) is equal to 18 divided by 40, or 0.45. P(V/W) is not equal to P(V). In the other case, P(W/V) is 12 divided by 18, or 0.67. P(W) is equal to 16 divided by 40, or 0.40. Here also, P(W/V) is not equal to P(W). These show that the food habit of the participant is dependent on the sex of the participant.

Now, I will change the cell values to show a case in which the food habit is independent of the sex of the participants. The numbers of vegetarians and non-vegetarians were made equal within each sex (Table 2), so the share of vegetarians is the same among women and men; that is, food habit does not change with the sex of the participant. For the food habit to be independent of the sex of the participant, P(V/W) needs to be equal to P(V), or P(W/V) needs to be equal to P(W).

Table 2: Food habit of training participants invariable by sex
P(V/W) is 8 divided by 16, that is 0.50, and P(V) is 20 divided by 40, which is also 0.50. It shows that P(V/W) is equal to P(V). Likewise, P(W/V) is 8 divided by 20, equal to 0.40, and P(W) is 16 divided by 40, which is also 0.40. This also shows that P(W/V) is equal to P(W). These show that the probability from the multiplication rule of probability without replacement (dependent events), P(V∩W) equal to P(V/W) multiplied by P(W), is equal to the probability from the multiplication rule of probability with replacement (independent events), P(V∩W) equal to P(V) multiplied by P(W); both give 0.20. Thus, food habit is independent of the sex of the participant.
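The independence check for both tables can be written as one small function. This is a sketch under stated assumptions: the tables themselves are not reproduced in the text, so the cell counts below are inferred from the totals and probabilities quoted in the notes (for example, 12 of 16 women and 18 of 40 participants being vegetarians in Table 1), and the function name `is_independent` is hypothetical.

```python
from fractions import Fraction


def is_independent(table):
    """Return True if P(row and column) == P(row) * P(column) in every cell."""
    total = sum(sum(cells.values()) for cells in table.values())
    row_totals = {r: sum(cells.values()) for r, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for c, n in cells.items():
            col_totals[c] = col_totals.get(c, 0) + n
    return all(
        Fraction(n, total)
        == Fraction(row_totals[r], total) * Fraction(col_totals[c], total)
        for r, cells in table.items()
        for c, n in cells.items()
    )


# Rows are food habits, columns are sexes (counts inferred from the notes).
table_1 = {"V": {"W": 12, "M": 6}, "NV": {"W": 4, "M": 18}}
table_2 = {"V": {"W": 8, "M": 12}, "NV": {"W": 8, "M": 12}}

print(is_independent(table_1))
print(is_independent(table_2))
```

Comparing the joint probability with the product of the marginals in every cell is equivalent to comparing P(V/W) with P(V), and it avoids dividing by a column total.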

Tuesday, May 22, 2018

Probability and Contingency Table: An Example, Statistical Note 3

I have taken an example from my statistical notes 1 and 2 to show the process of calculating the marginal, conditional and joint probabilities using the data presented in a contingency table, also known as the cross tabulation or crosstab. The cell values also are added to the crosstab as shown in table 1 below:

Table 1: Food habit of training participants by sex





This example has two discrete random variables or categorical variables each with two mutually exclusive categories of response. One categorical variable is the sex of training participants which has two categories of response: women (W) or men (M). Another categorical variable is the food habit which also has two mutually exclusive categories: vegetarian (V) and non-vegetarian (NV).

In the column total row, the simple or marginal probability of the event that a participant is a woman, denoted by P(W), is 0.40; that is, 40 percent of the total training participants are women. This is calculated by dividing the column total, or marginal total, of 16 women participants in the contingency table by the grand total of 40 participants. Likewise, the simple or marginal probability of the event that a participant is a man, denoted by P(M), is calculated as 0.60. Similar processes are followed in the row total column to calculate the simple or marginal probabilities of vegetarians, denoted by P(V) and equal to 0.45, and of non-vegetarians, denoted by P(NV) and equal to 0.55 (table 2).

Table 2: Calculation of Marginal, Conditional and Joint Probabilities












The conditional probability that a participant is a vegetarian given that the participant is a woman, denoted by P(V/W), is 12 divided by 16, which is equal to 0.75; that is, 75 percent of the women are vegetarians. Likewise, the conditional probabilities P(NV/W), P(V/M) and P(NV/M) can be calculated following the same process. The conditional probabilities are also shown in table 2.

The joint probability that a randomly selected participant is a woman who is also a vegetarian, denoted by P(W intersection V) or P(W∩V), is the product of P(W) and P(V/W), that is, the product of 0.40 and 0.75, equal to 0.30. This probability is equal to the cell value at the intersection of the women column and the vegetarian row divided by the grand total in the contingency table, as shown in tables 2 and 3. The joint probabilities P(W∩NV), P(M∩V) and P(M∩NV) can be calculated by using the same process.
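The marginal, conditional and joint probabilities described above can all be derived from the four cell counts. The sketch below is illustrative only: the cell counts are inferred from the totals quoted in the notes (16 women, 18 vegetarians, 12 vegetarian women, 40 participants), since the table itself is not reproduced in the text, and the dictionary names are my own.

```python
from fractions import Fraction

# Rows are food habits, columns are sexes (counts inferred from the notes).
table = {"V": {"W": 12, "M": 6}, "NV": {"W": 4, "M": 18}}
total = sum(n for row in table.values() for n in row.values())

# Column totals give the marginal totals by sex.
col_totals = {"W": 0, "M": 0}
for row in table.values():
    for sex, n in row.items():
        col_totals[sex] += n

# Marginal probabilities: marginal total / grand total.
marginal_sex = {sex: Fraction(n, total) for sex, n in col_totals.items()}
marginal_food = {food: Fraction(sum(row.values()), total)
                 for food, row in table.items()}

# Conditional probabilities P(food / sex): cell value / column total.
conditional = {(food, sex): Fraction(table[food][sex], col_totals[sex])
               for food in table for sex in col_totals}

# Joint probabilities P(sex and food): cell value / grand total.
joint = {(food, sex): Fraction(table[food][sex], total)
         for food in table for sex in col_totals}

for key in joint:
    print(key, float(conditional[key]), float(joint[key]))
```

Each joint probability equals the marginal probability of the sex multiplied by the conditional probability of the food habit given that sex, which is the multiplication rule used in the text.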

Table 3: Cell values, Joint probabilities and cell values as percentage of grand total

Saturday, May 19, 2018

Conditional Probability Tree Diagram: Example, Statistical Note 2

A Tree Diagram is an important tool to visualize the events and their respective probabilities. I have taken an example from my statistical note 1 (Calculating the probability that a randomly selected person is a woman who is a vegetarian also) to show the process of drawing the probability tree diagram.

At the first step, there are two possible mutually exclusive outcomes in the sample space of total training participants: women or men. The outcomes are mutually exclusive because a selected participant cannot be both a woman and a man. Let W be the simple event that a selected participant is a woman. The simple or marginal probability of the event W, denoted by P(W), is 0.40. It means that 40 percent of the participants in the training are women. The other possible simple event is that the selected participant is a man, denoted by M, and the simple or marginal probability of the event M, denoted by P(M), is 0.60; that is, 60 percent of the participants are men. These marginal probabilities at the first step are shown in the diagram, also referred to as tree diagram 1.


Diagram 1: First step showing marginal probabilities
Once a woman is selected at the first step, there are two possible mutually exclusive outcomes at the second step, each dependent on the first: vegetarian or non-vegetarian. Let V/W be the event that a participant selected from among the women is a vegetarian (V). The conditional probability of V/W, denoted by P(V/W), is 0.75; that is, 75 percent of the women participants are vegetarians. Likewise, let NV/W be the event that a participant selected from among the women is a non-vegetarian (NV). The conditional probability of NV/W, denoted by P(NV/W), is 0.25; that is, 25 percent of the women participants are non-vegetarians. The conditional probabilities that a participant selected from among the women is a vegetarian or a non-vegetarian are shown in tree diagram 2.
 


Diagram 2: First and second steps showing marginal and conditional probabilities
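A probability tree of this kind can be represented as a nested dictionary, with the marginal probability on each first-step branch and the conditional probabilities on the second-step branches. The sketch below is an illustration, not the method used to draw the diagrams. The branch for men is an assumption on my part: its probabilities are derived from the counts quoted in note 3 (24 men, of whom 6 are vegetarians) rather than stated in this note.

```python
from fractions import Fraction

# First step: sex (marginal probability).
# Second step: food habit given sex (conditional probability).
tree = {
    "W": (Fraction(16, 40), {"V": Fraction(12, 16), "NV": Fraction(4, 16)}),
    "M": (Fraction(24, 40), {"V": Fraction(6, 24), "NV": Fraction(18, 24)}),
}


def leaf_probabilities(tree):
    """Multiply along each path from the root to a leaf of the tree."""
    leaves = {}
    for sex, (p_sex, branches) in tree.items():
        for food, p_food_given_sex in branches.items():
            leaves[(sex, food)] = p_sex * p_food_given_sex
    return leaves


leaves = leaf_probabilities(tree)
for path, p in leaves.items():
    print(" -> ".join(path), float(p))
```

The probabilities at the leaves are the joint probabilities of each path, and they add up to one across the whole tree.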