 # Distributions


## I. Hypergeometric distribution

### Physical setup

Suppose we have a collection of $N$ objects which can be classified into two distinct categories. Denote the categories:

1. Success ($S$)
2. Failure ($F$)

Suppose that within these $N$ objects there exist $r$ of type $S$ and therefore $N-r$ of type $F$. We choose $n$ objects without replacement, that is, we remove items in succession from the original set of $N$. Let $X$ be a random variable representing the number of successes obtained. Then $X$ has a hypergeometric distribution.

### Probability function

We require three values. Refer to the primer on combinations for a discussion on counting rules and notation.

First, we require the total number of ways that $n$ distinct items can be chosen from $N$ without replacement, where order does not matter. This is ‘$N$ choose $n$’:

$$\binom{N}{n}$$

Second, we wish to know the number of ways that $x$ successes can be drawn from a total of $r$:

$$\binom{r}{x}$$

Third, we are left to select $n-x$ failures from a total of $N-r$:

$$\binom{N-r}{n-x}$$

Then the probability of selecting $x$ successes is

$$f(x) = P(X=x) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$$
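The three counts combine into a single ratio of binomial coefficients, which can be checked numerically. The following is a minimal Python sketch, not a reference implementation; the function name `hypergeom_pmf` and the illustrative numbers (5 cards drawn from a 52-card deck containing 4 aces) are assumptions chosen for this example.

```python
from math import comb

def hypergeom_pmf(x: int, N: int, r: int, n: int) -> float:
    """P(X = x): x successes in n draws without replacement
    from N objects, of which r are successes."""
    # Outside the support the probability is zero.
    if x < max(0, n - (N - r)) or x > min(n, r):
        return 0.0
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Illustrative numbers (assumed): 5 cards from a 52-card deck with 4 aces.
ace_probs = [hypergeom_pmf(x, N=52, r=4, n=5) for x in range(5)]
total = sum(ace_probs)  # the probabilities over the support should sum to 1

for x, prob in enumerate(ace_probs):
    print(f"P(X = {x}) = {prob:.6f}")
```

Summing the probabilities over the support is a quick sanity check that the counting argument is consistent.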

### Examples

#### A Deck of cards

A deck of playing cards contains 4 suits (hearts, spades, diamonds, clubs), each with 13 ranks {Ace, 2, 3, …, 10, Jack, Queen, King}. Deal 5 cards from the deck. The number of Aces dealt, $X$, which can range from 0 to 4, follows a hypergeometric distribution with $N=52$, $r=4$ and $n=5$.

## II. Binomial distribution

### Physical setup

Consider an experiment in which we have two distinct types of outcomes that can be categorized as

1. Success ($S$)
2. Failure ($F$)

Suppose the probability of success is $p$ and the probability of failure is $(1-p)$. Repeat the experiment $n$ independent times. If $X$ is the number of successes then it has a binomial distribution, denoted $X\sim Bi(n, p)$.

### Probability function

We require two values. Refer to the primer on combinations for a discussion on counting rules and notation.

First, we require the total number of ways that $x$ successes can be arranged within $n$ experiments, which is

$$\binom{n}{x}$$

Then the probability of each arrangement is $p$ multiplied $x$ times for the successes and $(1-p)$ multiplied $n-x$ times for the failures:

$$p^x(1-p)^{n-x}$$

Therefore,

$$f(x) = P(X=x) = \binom{n}{x}p^x(1-p)^{n-x}, \quad x = 0, 1, \ldots, n$$

### Moments

The mean of a binomial distribution of sample size $n$ and probability $p$ is

$$E(X) = np$$

The variance is

$$Var(X) = np(1-p)$$
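The standard binomial moments, $np$ and $np(1-p)$, can be verified by summing directly over the probability function. This is a small sketch under assumed illustrative values ($n=10$, $p=1/6$); the helper name `binom_pmf` is hypothetical.

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Bi(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Illustrative values (assumed): 10 trials with success probability 1/6.
n, p = 10, 1 / 6
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

# Moments computed from the definition, for comparison with np and np(1-p).
mean = sum(x * f for x, f in enumerate(pmf))
var = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))

print(f"mean = {mean:.6f}, np = {n * p:.6f}")
print(f"variance = {var:.6f}, np(1-p) = {n * p * (1 - p):.6f}")
```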

### Comparison of hypergeometric and binomial

The key difference between the hypergeometric and the binomial distribution is that the hypergeometric involves the probability of an event when selection is made without replacement. In other words, the hypergeometric setup assumes some dependence amongst the selection of successes and failures. For example, choosing an ace from a deck and removing it reduces the probability of selecting a remaining ace. In contrast, the binomial distribution assumes independence and can be viewed as appropriate when event selection is made with replacement.

There are limiting cases where the hypergeometric can be approximated by the binomial. Consider the hypergeometric case where the total number of possible successes and failures $N$ is large compared to the number of selections $n$. Then the probability of success $p=r/N$ does not change appreciably upon selection without replacement. That is, for $N \gg n$,

$$\frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}} \approx \binom{n}{x}p^x(1-p)^{n-x}, \qquad p = \frac{r}{N}$$
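To see how quickly the approximation improves, one can compare the two probability functions for a small and a large population while holding $p=r/N$ and $n$ fixed. The sketch below is illustrative only; the helper names and the population sizes are assumptions chosen for the example.

```python
from math import comb

def hypergeom_pmf(x: int, N: int, r: int, n: int) -> float:
    """Hypergeometric P(X = x); zero outside the support."""
    if x < max(0, n - (N - r)) or x > min(n, r):
        return 0.0
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

def binom_pmf(x: int, n: int, p: float) -> float:
    """Binomial P(X = x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def max_abs_diff(N: int, r: int, n: int) -> float:
    """Largest pointwise gap between the hypergeometric and Bi(n, r/N)."""
    p = r / N
    return max(
        abs(hypergeom_pmf(x, N, r, n) - binom_pmf(x, n, p))
        for x in range(n + 1)
    )

# Same p = 0.2 and n = 5, but very different population sizes (assumed values).
small = max_abs_diff(N=50, r=10, n=5)
large = max_abs_diff(N=50_000, r=10_000, n=5)
print(f"N = 50:     max difference = {small:.2e}")
print(f"N = 50,000: max difference = {large:.2e}")
```

The gap shrinks as $N$ grows relative to $n$, which is the sense in which removing an item barely changes the success probability.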

### Examples

#### A fair die

Toss a fair die 10 times and let $X$ be the number of sixes; then $X\sim Bi(10,1/6)$.

## III. Poisson distribution

### Physical setup

Consider a limiting case of the binomial distribution as $n\rightarrow\infty$ and $p\rightarrow 0$ while $np=\mu$ stays fixed. This means that the event of interest is relatively rare. Then $X$ has a Poisson distribution, $X \sim Poisson(\mu)$.

### Probability function

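Since $np=\mu$ then $p=\frac{\mu}{n}$ and

$$f(x) = \binom{n}{x}p^x(1-p)^{n-x} = \frac{\mu^x}{x!}\cdot\frac{n(n-1)\cdots(n-x+1)}{n^x}\left(1-\frac{\mu}{n}\right)^{n}\left(1-\frac{\mu}{n}\right)^{-x}$$

As $n\rightarrow\infty$ with $x$ fixed, the second and fourth factors tend to 1 and $\left(1-\frac{\mu}{n}\right)^{n}\rightarrow e^{-\mu}$, so that

$$f(x) = \frac{\mu^x e^{-\mu}}{x!}, \quad x = 0, 1, 2, \ldots$$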

### Moments

The mean of a Poisson distribution, viewed as the limit of a binomial with sample size $n$ and probability $p$, is

$$E(X) = \mu = np$$

The variance is

$$Var(X) = \mu$$

### Examples

#### The birthday game

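Suppose that 200 people are at a party. What is the probability that exactly 2 of them were born on December 25th? In this case $n=200$ and, assuming birthdays are independent and equally likely across 365 days, $p=1/365$ and the mean is $\mu=np=200/365\approx 0.548$. Then

$$P(X=2) = \frac{\mu^2 e^{-\mu}}{2!} \approx 0.087$$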
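The birthday calculation can be reproduced in a few lines, alongside the exact binomial value it approximates. This is a sketch that assumes the question means exactly 2 people and that every day of a 365-day year is equally likely.

```python
from math import comb, exp, factorial

# Setup from the example: 200 guests, one target date out of 365.
n, p = 200, 1 / 365
mu = n * p

# Poisson approximation to P(X = 2).
poisson = mu**2 * exp(-mu) / factorial(2)

# Exact binomial probability for comparison.
binomial = comb(n, 2) * p**2 * (1 - p) ** (n - 2)

print(f"mu = {mu:.4f}")
print(f"Poisson  P(X = 2) = {poisson:.5f}")
print(f"Binomial P(X = 2) = {binomial:.5f}")
```

The two values agree closely because $n$ is large and $p$ is small, which is the regime in which the Poisson limit applies.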

## IV. Gamma distribution

### Gamma function

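Definition: The gamma function of $\alpha$, for $\alpha>0$, is

$$\Gamma(\alpha) = \int_0^\infty y^{\alpha-1}e^{-y}\,dy$$

There are two nice properties of the gamma function that we will use.

1. $\Gamma(\alpha) = (\alpha-1)\,\Gamma(\alpha-1)$ for $\alpha>1$
2. $\Gamma(\alpha) = (\alpha-1)!$ for positive integer $\alpha$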
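Two standard properties of the gamma function, the recurrence $\Gamma(\alpha)=(\alpha-1)\Gamma(\alpha-1)$ and the factorial identity $\Gamma(k)=(k-1)!$ for positive integers, can be checked numerically with Python's standard library. The value $\alpha=4.5$ below is an arbitrary illustrative choice.

```python
from math import factorial, gamma, isclose

# Recurrence: Gamma(alpha) = (alpha - 1) * Gamma(alpha - 1), here for a non-integer alpha.
alpha = 4.5
recurrence_ok = isclose(gamma(alpha), (alpha - 1) * gamma(alpha - 1))

# Factorial identity: Gamma(k) = (k - 1)! for positive integers k.
factorial_ok = all(isclose(gamma(k), factorial(k - 1)) for k in range(1, 11))

print("recurrence holds:", recurrence_ok)
print("factorial identity holds:", factorial_ok)
```

The factorial identity is what allows gamma functions to stand in for binomial coefficients when a parameter is not an integer.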

### Gamma distribution

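Let $X$ be a non-negative continuous random variable. Then if the probability density function is of the form

$$f(x;\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}x^{\alpha-1}e^{-\beta x}, \quad x>0,\ \alpha>0,\ \beta>0$$

then $X$ has a gamma distribution $X\sim Gamma(x; \alpha,\beta)$. Typically, $\alpha$ is called the ‘shape’ parameter. In the form written here $\beta$ acts as the ‘rate’, which is the inverse of the ‘scale’.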
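As a sanity check on the density, one can integrate it numerically. The sketch below assumes the rate parameterization, $f(x)=\beta^{\alpha}x^{\alpha-1}e^{-\beta x}/\Gamma(\alpha)$, under which the mean is $\alpha/\beta$; the helper names, the parameter values and the simple midpoint rule are all illustrative choices rather than part of the original text.

```python
from math import exp, lgamma, log

def gamma_pdf(x: float, alpha: float, beta: float) -> float:
    """Gamma density with shape alpha and rate beta (assumed parameterization)."""
    if x <= 0:
        return 0.0
    # Work on the log scale to avoid overflow for large alpha.
    return exp(alpha * log(beta) + (alpha - 1) * log(x) - beta * x - lgamma(alpha))

def midpoint_integral(f, lo: float, hi: float, steps: int = 100_000) -> float:
    """Plain midpoint-rule quadrature of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

# Illustrative parameters (assumed).
alpha, beta = 3.0, 2.0

# The upper limit 40 is far into the tail, where the density is negligible.
area = midpoint_integral(lambda x: gamma_pdf(x, alpha, beta), 0.0, 40.0)
mean = midpoint_integral(lambda x: x * gamma_pdf(x, alpha, beta), 0.0, 40.0)

print(f"area under density = {area:.6f}")
print(f"numerical mean = {mean:.6f}, alpha / beta = {alpha / beta:.6f}")
```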

## V. Negative binomial distribution

### Physical setup

The setup is very similar to the binomial. Consider an experiment in which we have two distinct types of outcomes that can be categorized as

1. Success ($S$)
2. Failure ($F$)

Suppose the probability of success is $p$ and failure is $(1-p)$. Repeat an experiment until a pre-specified number of failures $r$ have been obtained. Let $X$ be the number of successes before the $r^{th}$ failure. Then $X$ has a negative binomial distribution denoted $X\sim NB(r, p)$.

### Probability function

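There will be $x+r$ total trials, but the last event is a failure, so we really care about the first $x+r-1$ trials. These contain $x$ successes and $(r-1)$ failures in any order, which can be arranged in $\binom{x+r-1}{x}$ ways. Each arrangement, together with the final failure, has probability $p^x(1-p)^r$, just as for a sequence of binomial trials. Therefore,

$$f(x) = P(X=x) = \binom{x+r-1}{x}p^x(1-p)^r, \quad x = 0, 1, 2, \ldots$$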

### Moments

The mean of a negative binomial distribution is

$$E(X) = \frac{rp}{1-p}$$

The variance is

$$Var(X) = \frac{rp}{(1-p)^2}$$
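For the setup used here, where $X$ counts successes before the $r^{th}$ failure, the standard moments are $rp/(1-p)$ and $rp/(1-p)^2$. The following sketch checks them by summing the probability function over a truncated support; the name `negbin_pmf`, the values $r=3$, $p=0.4$ and the truncation point are assumptions for illustration.

```python
from math import comb

def negbin_pmf(x: int, r: int, p: float) -> float:
    """P(X = x): x successes (probability p each) before the r-th failure."""
    return comb(x + r - 1, x) * p**x * (1 - p) ** r

# Illustrative parameters (assumed).
r, p = 3, 0.4

# The support is infinite; the tail beyond 200 is negligible for these values.
xs = range(200)
total = sum(negbin_pmf(x, r, p) for x in xs)
mean = sum(x * negbin_pmf(x, r, p) for x in xs)
var = sum((x - mean) ** 2 * negbin_pmf(x, r, p) for x in xs)

print(f"total probability = {total:.6f}")
print(f"mean = {mean:.6f}, rp/(1-p) = {r * p / (1 - p):.6f}")
print(f"variance = {var:.6f}, rp/(1-p)^2 = {r * p / (1 - p) ** 2:.6f}")
```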

### The Poisson-gamma mixture

Count data, such as mapped reads from RNA sequencing, are often modeled with a Poisson distribution, where the mean and variance are both equal to $\mu$. However, there are cases where the observed variance exceeds the mean. To account for such ‘overdispersed’ data, the negative binomial distribution can be used. As we will show below, the negative binomial arises as a Poisson distribution where the Poisson parameter is itself a random variable distributed according to a gamma distribution.

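Let us state this in a more precise fashion. Suppose that we have a distribution of counts $X$ that follows a Poisson distribution indexed by the parameter $\Theta$. Now suppose that $\Theta$ is itself a function of another random variable, $\Theta = \mu\epsilon$, where $\epsilon \sim Gamma(\epsilon;\alpha,\beta)$. Then the conditional distribution of the random variable of counts is

$$f(x \mid \epsilon) = \frac{(\mu\epsilon)^x e^{-\mu\epsilon}}{x!}$$

Let $\epsilon$ follow a gamma distribution with shape $\alpha$ and rate $\beta$.

$$g(\epsilon) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\epsilon^{\alpha-1}e^{-\beta\epsilon}$$

The joint density of $X$ and $\epsilon$ is

$$f(x, \epsilon) = f(x \mid \epsilon)\,g(\epsilon) = \frac{(\mu\epsilon)^x e^{-\mu\epsilon}}{x!}\cdot\frac{\beta^{\alpha}}{\Gamma(\alpha)}\epsilon^{\alpha-1}e^{-\beta\epsilon}$$

Derive the marginal distribution of $X$ by integrating over the values of $\epsilon$.

$$f(x) = \int_0^\infty f(x,\epsilon)\,d\epsilon = \frac{\mu^x\beta^{\alpha}}{x!\,\Gamma(\alpha)}\int_0^\infty \epsilon^{x+\alpha-1}e^{-(\beta+\mu)\epsilon}\,d\epsilon$$

The key here is to transform the integrand into a gamma density with shape parameter $x+\alpha$ and rate $\beta+\mu$, noting that its integral over all values is unity.

$$f(x) = \frac{\mu^x\beta^{\alpha}}{x!\,\Gamma(\alpha)}\cdot\frac{\Gamma(x+\alpha)}{(\beta+\mu)^{x+\alpha}}\int_0^\infty \frac{(\beta+\mu)^{x+\alpha}}{\Gamma(x+\alpha)}\epsilon^{x+\alpha-1}e^{-(\beta+\mu)\epsilon}\,d\epsilon = \frac{\Gamma(x+\alpha)}{x!\,\Gamma(\alpha)}\left(\frac{\mu}{\beta+\mu}\right)^{x}\left(\frac{\beta}{\beta+\mu}\right)^{\alpha}$$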

It is simple to see that this result is the negative binomial with $r=\alpha$ and $p=\mu/(\beta+\mu)$. In this case the moments can be stated using these new variables.
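The identification with the negative binomial can also be checked numerically: integrate the Poisson probability against a gamma density for $\epsilon$ and compare with the negative binomial probability at $r=\alpha$, $p=\mu/(\beta+\mu)$. The sketch below assumes the gamma density is written with $\beta$ as a rate, uses a plain midpoint rule, and picks arbitrary illustrative parameter values; none of the helper names come from the original text.

```python
from math import exp, lgamma, log

def poisson_pmf(x: int, lam: float) -> float:
    """Poisson P(X = x) with mean lam > 0, computed on the log scale."""
    return exp(x * log(lam) - lam - lgamma(x + 1))

def gamma_pdf(e: float, alpha: float, beta: float) -> float:
    """Gamma density with shape alpha and rate beta (assumed), for e > 0."""
    return exp(alpha * log(beta) + (alpha - 1) * log(e) - beta * e - lgamma(alpha))

def mixture_pmf(x: int, mu: float, alpha: float, beta: float,
                upper: float = 30.0, steps: int = 30_000) -> float:
    """Marginal P(X = x): midpoint-rule integral of Poisson(x | mu*e) * gamma(e)."""
    h = upper / steps
    return h * sum(
        poisson_pmf(x, mu * e) * gamma_pdf(e, alpha, beta)
        for e in ((i + 0.5) * h for i in range(steps))
    )

def negbin_pmf(x: int, r: float, p: float) -> float:
    """Negative binomial P(X = x); gamma functions replace the binomial
    coefficient so that r does not have to be an integer."""
    log_coef = lgamma(x + r) - lgamma(r) - lgamma(x + 1)
    return exp(log_coef + x * log(p) + r * log(1 - p))

# Illustrative parameters (assumed).
mu, alpha, beta = 4.0, 2.5, 1.5
p = mu / (beta + mu)

max_gap = max(
    abs(mixture_pmf(x, mu, alpha, beta) - negbin_pmf(x, alpha, p))
    for x in range(10)
)
print(f"largest gap over x = 0..9: {max_gap:.2e}")
```

A gap at the level of the quadrature error supports the algebraic result without replacing the derivation.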

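From the moments of the negative binomial, with $r=\alpha$ and $p=\mu/(\beta+\mu)$ so that $1-p=\beta/(\beta+\mu)$, the mean is

$$E(X) = \frac{rp}{1-p} = \frac{\alpha\mu}{\beta}$$

The variance is

$$Var(X) = \frac{rp}{(1-p)^2} = \frac{\alpha\mu(\beta+\mu)}{\beta^2}$$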

#### Alternative Poisson-gamma notation

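This mixture model will be important in our discussion of differential expression testing for RNA sequencing data. In this case the notation is altered so that $r=\alpha=\beta=\phi^{-1}$, where $\phi$ is the ‘dispersion’ parameter for the counts $Y$ of an RNA species. Also, gamma functions are used in place of the binomial coefficient.

$$P(Y=y) = \frac{\Gamma(y+\phi^{-1})}{\Gamma(\phi^{-1})\,\Gamma(y+1)}\left(\frac{\mu\phi}{1+\mu\phi}\right)^{y}\left(\frac{1}{1+\mu\phi}\right)^{\phi^{-1}}$$

From the above discussion, we can restate the mean and variance.

$$E(Y) = \mu, \qquad Var(Y) = \mu + \phi\mu^2$$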

Note that as the dispersion parameter $\phi$ approaches zero the negative binomial variance approaches the mean. Thus the dispersion parameter accounts for the extra variability over and above that expected with a Poisson.
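The effect of the dispersion parameter can be illustrated by comparing the negative binomial in the $\phi$ notation, with mean $\mu$ and variance $\mu+\phi\mu^2$, against a Poisson with the same mean. The following sketch uses hypothetical helper names and an arbitrary mean of 5; it is meant only to show the trend as $\phi$ shrinks.

```python
from math import exp, lgamma, log

def nb_pmf_dispersion(y: int, mu: float, phi: float) -> float:
    """Negative binomial P(Y = y) with mean mu and dispersion phi > 0."""
    inv = 1.0 / phi
    log_coef = lgamma(y + inv) - lgamma(inv) - lgamma(y + 1)
    return exp(
        log_coef
        + y * log(mu * phi / (1 + mu * phi))
        - inv * log(1 + mu * phi)
    )

def nb_variance(mu: float, phi: float) -> float:
    """Variance in the dispersion notation: mu + phi * mu^2."""
    return mu + phi * mu**2

def poisson_pmf(y: int, mu: float) -> float:
    """Poisson P(Y = y) with mean mu."""
    return exp(y * log(mu) - mu - lgamma(y + 1))

# Illustrative mean (assumed) and a decreasing sequence of dispersions.
mu = 5.0
gaps = {
    phi: max(
        abs(nb_pmf_dispersion(y, mu, phi) - poisson_pmf(y, mu))
        for y in range(30)
    )
    for phi in (1.0, 0.1, 0.001)
}

for phi, gap in gaps.items():
    print(f"phi = {phi}: variance = {nb_variance(mu, phi):.3f}, "
          f"max gap to Poisson = {gap:.2e}")
```

As $\phi$ decreases, both the excess variance $\phi\mu^2$ and the pointwise gap to the Poisson probabilities shrink.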