Family Tree

Family Tree

About Me

My photo
Kathmandu, Bagmati Zone, Nepal
I am Basan Shrestha from Kathmandu, Nepal. I use the term 'BASAN' as 'Balancing Actions for Sustainable Agriculture and Natural Resources'. I am a Design, Monitoring & Evaluation professional. I hold 1) MSc in Regional and Rural Development Planning, Asian Institute of Technology, Thailand, 2002; 2) MSc in Statistics, Tribhuvan University (TU), Kathmandu, Nepal, 1995; and 3) MA in Sociology, TU, 1997. I have more than 10 years of professional experience in socio-economic research, monitoring and documentation on agricultural and natural resource management. I had worked in Lumle Agricultural Research Centre, western Nepal from Nov. 1997 to Dec. 2000; CARE Nepal, mid-western Nepal from Mar. 2003 to June 2006 and WTLCP in far-western Nepal from June 2006 to Jan. 2011, Training Institute for Technical Instruction (TITI) from July to Sep 2011, UN Women Nepal from Sep to Dec 2011 and Mercy Corps Nepal from 24 Jan 2012 to 14 August 2016 and CAMRIS International in Nepal commencing 1 February 2017. I have published articles to my credit.

Wednesday, July 11, 2018

Two Category Discrete Probability Distribution of Sampling With Replacement and EXCEL Function, Statistical Note 22


Among 40 participants in a training, 18 were vegetarians and 22 were non-vegetarians. 2 participants are selected at random one after another with replacement of the name of the first selected participant. Calculate the probability distribution of vegetarians in which the order does not matter whether the vegetarians are sampled or not in the first or second draw.

Counting the total and favorable numbers of outcomes constituting the specified number of objects sampled with replacement from the finite population is important to calculate the probability of favorable events. Excel software is commonly available in the desktop or the laptop and is an important means to calculate the discrete probability distribution. I take an example from my Statistical Note 17, and will show how calculate the discrete probability distribution with replacement using Excel function. Refer to that note for conceptual clarity on using the tree diagram and formula to calculate the Binomial Distribution probability.

EXCEL Function

Excel software has a ‘Formulas’ tab, which has ‘Insert Function’ command under ‘Function Library’ group. Select ‘BINOM.DIST’ function from the list of all function. The ‘BINOM.DIST’ formula has four fields that look like - BINOMDIST(number_s,trials,probability_s,cumulative) as shown in Diagram 1.


















Diagram 1: Selection of ‘BINOM.DIST’ function in Excel Data Worksheet

Diagram 2 shows the BINOM.DIST function arguments using the dataset. The field ‘Number_s’ takes the number of successes in trials. In this example, this field takes the value from zero to two vegetarians. Because, in the sample of two participants there could be no vegetarian, one vegetarian or two vegetarians selected. In this calculation, one vegetarian in two independent draws of participants has been used as shown in the cell B3 of the table as well as the function argument box and the dataset 1 in Diagram 2.
Diagram 2: ‘BINOM.DIST’ Function Arguments Using Dataset in Excel Worksheet and using ‘FALSE’ logical value in the field ‘Cumulative’
                                                                                                                               
The field ‘Trials’ is the number of independent trials. In this example, two participants were drawn independently and shown in Diagram 2 in the field of the function argument box.

The field ‘Probability’ is the probability of success on in any individual trial. A probability value lies between 0 and 1. In this example, 18 out of 40 participants were vegetarians so that the probability of a vegetarian is 0.45. It is shown both in the field with the cell number N52 of the function argument box as well as the dataset in Diagram 2.

The field ‘Cumulative’ is a logical value that determines the form of the function. If ‘Cumulative’ is ‘FALSE’, ‘BINOM.DIST’ calculates the probability mass function (PMF), which gives the probability associated with the value assigned to the field ‘Number_s’ as the number of successes. It is shown in both the function argument box and the dataset with the cell value N53 in Diagram 2.

Fixing all four fields in the function arguments, ‘BINOM.DIST’ function calculated the PMF equal to 0,495. It means that there is 49.5 percent chance that one of two participants sampled with replacement will be a vegetarian. This is equal to the value manually calculated in my Statistical Note 17.

The cumulative distribution function (CDF) is another important probability calculation. If the field ‘Cumulative’ takes the value TRUE, the ‘BINOM.DIST’ results in the CDF, which gives the probability up to the number of successes mentioned in the field ‘Number_s’. In this example, the probability upto 1 vegetarian selected is the sum of probability of no vegetarian or non-vegetarian selected in both samples and the probability of one vegetarian selected between two participants sampled. The logical value ‘TRUE’ is used an shown in both the function argument box and the dataset with the cell value ‘N53’ that gives the CDF equal to 0.7975 as shown in Diagram 3.
Diagram 3: ‘BINOM.DIST’ Function Arguments Using Dataset in Excel Worksheet and using ‘TRUE’ logical value in the field ‘Cumulative’

It means that there is 79.8 percent chance that up to one vegetarian will be selected in two participants sampled with replacement. This is equal to the one manually calculated in the Statistical Note 17.


PMFs and PDFs can be calculated for all number of successes, that is the number of two non-vegetarians (no vegetarian), one vegetarian and two vegetarians out of two participants sampled without replacement as shown in tables and argument functions in both Diagrams 1 and 2.

Discussion in this note and my former Statistical Note 17 indicates that both manual and auto calculation produce the same values and are useful to calculate the discrete probability distribution with replacement. Conceptual understanding is a backbone and automatization is efficient. Thus, both are important knowledge and skill sets.

No comments:

Post a Comment