## 4. The Poisson Distribution

#### The Probability Density Function

We have shown that the $k$th arrival time $T_k$ in the Poisson process has the gamma probability density function with shape parameter $k$ and rate parameter $r$:

$f_k(t) = r^k \frac{t^{k-1}}{(k-1)!} e^{-r t}, \quad t \ge 0$

Recall also that at least $k$ arrivals come in the interval $(0, t]$ if and only if the $k$th arrival occurs by time $t$:

$N_t \ge k \iff T_k \le t$

Use integration by parts to show that

$\mathbb{P}(N_t \ge k) = \int_0^t f_k(s) \, ds = 1 - \sum_{j=0}^{k-1} e^{-r t} \frac{(r t)^j}{j!}, \quad k \in \{1, 2, \ldots\}$

Use the result of Exercise 1 to show that the probability density function of the number of arrivals in the interval $(0, t]$ is

$\mathbb{P}(N_t = k) = e^{-r t} \frac{(r t)^k}{k!}, \quad k \in \{0, 1, 2, \ldots\}$

The corresponding distribution is called the Poisson distribution with parameter $r t$; the distribution is named after Simeon Poisson.
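
The link between the gamma arrival times and the Poisson counts can be checked numerically. The sketch below is a sanity check rather than a proof: it integrates the gamma density with a simple midpoint rule and compares the result with one minus the Poisson partial sum. The function names, the step count, and the sample values $r = 2$, $t = 3$, $k = 4$ are illustrative assumptions.

```python
import math

def gamma_pdf(s, k, r):
    """Density of the k-th arrival time: r^k s^(k-1) e^(-r s) / (k-1)!."""
    return r**k * s**(k - 1) * math.exp(-r * s) / math.factorial(k - 1)

def poisson_pmf(k, c):
    """P(N = k) = e^(-c) c^k / k! for a Poisson variable with parameter c."""
    return math.exp(-c) * c**k / math.factorial(k)

def prob_at_least(k, r, t, steps=20_000):
    """P(T_k <= t), approximated by the midpoint rule on the gamma density."""
    h = t / steps
    return h * sum(gamma_pdf((i + 0.5) * h, k, r) for i in range(steps))

r, t, k = 2.0, 3.0, 4  # illustrative values, not taken from the text
lhs = prob_at_least(k, r, t)
rhs = 1.0 - sum(poisson_pmf(j, r * t) for j in range(k))
print(lhs, rhs)  # the two values should agree to several decimal places
```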

In the Poisson experiment, vary $r$ and $t$ with the scroll bars and note the shape of the density function. Now with $r = 2$ and $t = 3$, run the experiment 1000 times with an update frequency of 10 and watch the apparent convergence of the relative frequency function to the density function.

The Poisson distribution is one of the most important in probability. In general, a discrete random variable $N$ in an experiment is said to have the Poisson distribution with parameter $c > 0$ if it has the probability density function

$g(k) = e^{-c} \frac{c^k}{k!}, \quad k \in \{0, 1, 2, \ldots\}$

Show directly that $g$ is a valid probability density function.

Show that

1. $g(n - 1) < g(n)$ if and only if $n < c$.
2. $g$ at first increases and then decreases, and thus the distribution is unimodal.
3. If $c$ is not an integer, there is a single mode at $\lfloor c \rfloor$. If $c$ is an integer, there are two modes at $c - 1$ and $c$.
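
The claims about the mode can be explored numerically before proving them. The following sketch lists every point where the density attains its maximum over a finite range; the helper name `modes`, the cutoff `k_max`, and the two sample parameters are assumptions made for illustration, and the search is only meaningful when $c$ is well below `k_max`.

```python
import math

def poisson_pmf(k, c):
    """P(N = k) = e^(-c) c^k / k! for a Poisson variable with parameter c."""
    return math.exp(-c) * c**k / math.factorial(k)

def modes(c, k_max=100):
    """Return every k in 0..k_max where the Poisson(c) density is maximal."""
    values = [poisson_pmf(k, c) for k in range(k_max + 1)]
    top = max(values)
    # A small relative tolerance treats floating-point ties as equal.
    return [k for k, v in enumerate(values) if math.isclose(v, top, rel_tol=1e-12)]

print(modes(3.7))  # non-integer c: a single mode at floor(c)
print(modes(4.0))  # integer c: two modes, at c - 1 and c
```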

Suppose that requests to a web server follow the Poisson model with rate $r = 5$ per minute. Find the probability that there will be at least 8 requests in a 2 minute period.

Defects in a certain type of wire follow the Poisson model with rate 1.5 per meter. Find the probability that there will be no more than 4 defects in a 2 meter piece of the wire.
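
Both of the preceding computations reduce to finite sums of Poisson probabilities, since the number of arrivals in an interval of length $t$ has parameter $r t$. A minimal sketch, with helper names chosen here for illustration:

```python
import math

def poisson_pmf(k, c):
    """P(N = k) = e^(-c) c^k / k! for a Poisson variable with parameter c."""
    return math.exp(-c) * c**k / math.factorial(k)

def poisson_cdf(k, c):
    """P(N <= k) for a Poisson variable with parameter c."""
    return sum(poisson_pmf(j, c) for j in range(k + 1))

# Web server: rate 5 per minute over 2 minutes gives parameter 10.
p_at_least_8 = 1.0 - poisson_cdf(7, 5 * 2)

# Wire: rate 1.5 per meter over 2 meters gives parameter 3.
p_at_most_4 = poisson_cdf(4, 1.5 * 2)

print(round(p_at_least_8, 4), round(p_at_most_4, 4))  # about 0.7798 and 0.8153
```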

#### Moments

Suppose that $N$ has the Poisson distribution with parameter $c$. The following exercises give the mean, variance, and probability generating function of $N$.

Show that $\mathbb{E}(N) = c$.

Show that $\operatorname{var}(N) = c$.

Show that $\mathbb{E}(u^N) = e^{c (u - 1)}$ for $u \in \mathbb{R}$.
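
Before working the three exercises above analytically, the stated mean, variance, and generating function can be checked against truncated series. This is a numerical sketch only; the truncation point `k_max` and the sample values of $c$ and $u$ are assumptions for illustration.

```python
import math

def poisson_pmf(k, c):
    """P(N = k) = e^(-c) c^k / k! for a Poisson variable with parameter c."""
    return math.exp(-c) * c**k / math.factorial(k)

def moments(c, u, k_max=100):
    """Truncated-series mean, variance, and generating function value at u."""
    ks = range(k_max + 1)
    mean = sum(k * poisson_pmf(k, c) for k in ks)
    second = sum(k * k * poisson_pmf(k, c) for k in ks)
    pgf = sum(u**k * poisson_pmf(k, c) for k in ks)
    return mean, second - mean**2, pgf

c, u = 2.5, 0.7  # illustrative values
mean, var, pgf = moments(c, u)
print(mean, var)                   # both should be close to c
print(pgf, math.exp(c * (u - 1)))  # these two should agree
```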

Returning to the Poisson process $\{N_t : t \ge 0\}$ with rate parameter $r$, it follows that $\mathbb{E}(N_t) = r t$ and $\operatorname{var}(N_t) = r t$ for $t \ge 0$. Once again, we see that $r$ can be interpreted as the average arrival rate. In an interval of length $t$, we expect about $r t$ arrivals.

In the Poisson experiment, vary $r$ and $t$ with the scroll bars and note the location and size of the mean/standard deviation bar. Now with $r = 3$ and $t = 4$, run the experiment 1000 times with an update frequency of 10 and watch the apparent convergence of the sample mean and standard deviation to the distribution mean and standard deviation, respectively.

Suppose that customers arrive at a service station according to the Poisson model, at a rate of $r = 4$ per hour. Find the mean and standard deviation of the number of customers in an 8 hour period.

#### Stationary, Independent Increments

Let us see what the basic regenerative assumption of the Poisson process means in terms of the counting variables $\{N_t : t \ge 0\}$.

Show that if $s < t$, then $N_t - N_s$ is the number of arrivals in the interval $(s, t]$.

Recall that our basic assumption is that the process essentially starts over at time $s$ and the behavior after time $s$ is independent of the behavior before time $s$.

Argue that:

1. $N_t - N_s$ has the same distribution as $N_{t - s}$, namely Poisson with parameter $r (t - s)$.
2. $N_t - N_s$ and $N_s$ are independent.

Suppose that $N$ and $M$ are independent random variables, and that $N$ has the Poisson distribution with parameter $c$ and $M$ has the Poisson distribution with parameter $d$. Show that $N + M$ has the Poisson distribution with parameter $c + d$.

1. Give a probabilistic proof, based on the Poisson process.
2. Give an analytic proof using probability density functions.
3. Give an analytic proof using probability generating functions.
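
The density-function argument rests on a convolution sum, which is easy to evaluate numerically. The sketch below compares that sum with the Poisson density with parameter $c + d$ for a few values; it illustrates the identity but does not replace any of the proofs. The parameter values are arbitrary choices for the example.

```python
import math

def poisson_pmf(k, c):
    """P(N = k) = e^(-c) c^k / k! for a Poisson variable with parameter c."""
    return math.exp(-c) * c**k / math.factorial(k)

def sum_pmf(n, c, d):
    """P(N + M = n) by convolving the Poisson(c) and Poisson(d) densities."""
    return sum(poisson_pmf(j, c) * poisson_pmf(n - j, d) for j in range(n + 1))

c, d = 1.5, 2.0  # illustrative parameters
for n in range(6):
    # The last two columns should match up to rounding error.
    print(n, sum_pmf(n, c, d), poisson_pmf(n, c + d))
```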

In the Poisson experiment, select $r = 1$ and $t = 3$. Run the experiment 1000 times, updating after each run. By computing the appropriate relative frequency functions, investigate empirically the independence of the random variables $N_1$ and $N_3 - N_1$.
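
For readers without access to the applet, a rough stand-in for this experiment can be written in a few lines. The sketch below is an assumption about how one might simulate the process (arrival times built from independent exponential gaps with rate $r$); it is not the applet's implementation. It tabulates relative frequencies of the pair $(N_1, N_3 - N_1)$ and compares a few joint frequencies with the product of the marginal frequencies.

```python
import random
from collections import Counter

def arrival_times(r, t, rng):
    """Arrival times in (0, t], built from independent exponential(r) gaps."""
    times, clock = [], rng.expovariate(r)
    while clock <= t:
        times.append(clock)
        clock += rng.expovariate(r)
    return times

def increment_table(r=1.0, runs=1000, seed=0):
    """Relative frequencies of the pair (N_1, N_3 - N_1) over repeated runs."""
    rng = random.Random(seed)
    pairs = Counter()
    for _ in range(runs):
        times = arrival_times(r, 3.0, rng)
        n1 = sum(1 for s in times if s <= 1.0)
        pairs[(n1, len(times) - n1)] += 1
    return {pair: count / runs for pair, count in pairs.items()}

joint = increment_table()
first, second = Counter(), Counter()
for (a, b), p in joint.items():
    first[a] += p   # marginal relative frequency of N_1 = a
    second[b] += p  # marginal relative frequency of N_3 - N_1 = b

# Under independence, each joint frequency is close to the product of marginals.
for a, b in sorted(joint)[:5]:
    print((a, b), round(joint[(a, b)], 3), round(first[a] * second[b], 3))
```

With only 1000 runs the agreement is approximate; increasing `runs` tightens it.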

#### Normal Approximation

Now note that for $k \in \{1, 2, \ldots\}$,

$N_k = N_1 + (N_2 - N_1) + \cdots + (N_k - N_{k-1})$

The random variables in the sum on the right are independent and each has the Poisson distribution with parameter $r$.

Use the central limit theorem to show that the distribution of the standardized variable below converges to the standard normal distribution as $k \to \infty$.

$Z_k = \frac{N_k - k r}{\sqrt{k r}}$

A bit more generally, the same result is true with the integer $k$ replaced by the positive real number $c$. Thus, if $N$ has the Poisson distribution with parameter $c$, and $c$ is large, then the distribution of $N$ is approximately normal with mean $c$ and standard deviation $\sqrt{c}$. When using the normal approximation, we should remember to use the continuity correction, since the Poisson is a discrete distribution.

In the Poisson experiment, set $r = 1$ and $t = 1$. Increase $r$ and $t$ and note how the graph of the probability density function becomes more bell-shaped.

In the Poisson experiment, set $r = 5$ and $t = 4$. Run the experiment 1000 times with an update frequency of 100. Compute and compare the following:

1. $\mathbb{P}(15 \le N_4 \le 22)$
2. The relative frequency of the event $\{15 \le N_4 \le 22\}$.
3. The normal approximation to $\mathbb{P}(15 \le N_4 \le 22)$.
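
The exact probability and its normal approximation in this comparison can be computed directly. The sketch below uses the continuity correction, widening the interval by one half on each side; the function names are illustrative, and the standard normal distribution function is written in terms of `math.erf`.

```python
import math

def poisson_pmf(k, c):
    """P(N = k) = e^(-c) c^k / k! for a Poisson variable with parameter c."""
    return math.exp(-c) * c**k / math.factorial(k)

def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def exact_prob(a, b, c):
    """P(a <= N <= b) for a Poisson variable with parameter c."""
    return sum(poisson_pmf(k, c) for k in range(a, b + 1))

def normal_approx(a, b, c):
    """Normal approximation to P(a <= N <= b), with continuity correction."""
    sd = math.sqrt(c)
    return normal_cdf((b + 0.5 - c) / sd) - normal_cdf((a - 0.5 - c) / sd)

c = 5 * 4  # r = 5 and t = 4 give parameter 20
print(exact_prob(15, 22, c), normal_approx(15, 22, c))
```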

Suppose that requests to a web server follow the Poisson model with rate $r = 5$ per minute. Compute the normal approximation to the probability that there will be at least 280 requests in a 1 hour period.
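
A sketch of this computation follows, under the assumption that the rate of 5 is per minute, as in the earlier web server exercise, so that one hour corresponds to a Poisson parameter of 300. The event "at least 280" becomes "greater than 279.5" after the continuity correction. The function names are illustrative.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def approx_at_least(k, c):
    """Normal approximation to P(N >= k) for Poisson(c), continuity-corrected."""
    return 1.0 - normal_cdf((k - 0.5 - c) / math.sqrt(c))

# Assumption: 5 requests per minute, so 60 minutes give parameter 300.
print(round(approx_at_least(280, 5 * 60), 3))  # about 0.882
```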

#### Conditional Distributions

Consider again the Poisson model with arrival time sequence $(T_1, T_2, \ldots)$ and counting process $\{N_t : t \ge 0\}$.

Let $t > 0$. Show that the conditional distribution of $T_1$ given $N_t = 1$ is uniform on the interval $(0, t]$. Interpret the result.

More generally, show that given $N_t = n$, the conditional distribution of $(T_1, \ldots, T_n)$ is the same as the distribution of the order statistics of a random sample of size $n$ from the uniform distribution on the interval $(0, t]$.

Note that the conditional distribution in the last exercise is independent of the rate $r$. This result means that, in a sense, the Poisson model gives the most random distribution of points in time.

Suppose that requests to a web server follow the Poisson model, and that 1 request comes in a five minute period. Find the probability that the request came during the first 3 minutes of the period.
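
The uniform conditional distribution gives $3/5$ for this probability, whatever the rate. A simulation can illustrate both the value and the lack of dependence on $r$. The sketch below keeps only the simulated runs with exactly one arrival in a 5 minute period and reports the share of those arrivals that fall in the first 3 minutes; the two rates, the run count, and the function name are hypothetical choices, since the exercise does not specify a rate.

```python
import random

def first_arrival_given_one(r, t, runs, seed=0):
    """Collect T_1 from simulated runs with exactly one arrival in (0, t]."""
    rng = random.Random(seed)
    kept = []
    for _ in range(runs):
        first = rng.expovariate(r)
        second = first + rng.expovariate(r)
        if first <= t < second:  # exactly one arrival by time t
            kept.append(first)
    return kept

for r in (0.2, 1.0):  # hypothetical rates per minute
    sample = first_arrival_given_one(r, t=5.0, runs=100_000)
    share = sum(1 for x in sample if x <= 3.0) / len(sample)
    print(r, len(sample), round(share, 3))  # share should be near 3 / 5
```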

In the Poisson experiment, set $r = 1$ and $t = 2$. Run the experiment 1000 times, updating after each run. Compute the appropriate relative frequency functions and investigate empirically the theoretical result in Exercise 23.

Suppose that $0 < s < t$ and that $n$ is a positive integer. Show that the conditional distribution of $N_s$ given $N_t = n$ is binomial with trial parameter $n$ and success parameter $p = s / t$. Note that the conditional distribution is independent of the rate $r$. Interpret the result.

Suppose that requests to a web server follow the Poisson model, and that 10 requests come during a 5 minute period. Find the probability that at least 4 requests came during the first 3 minutes of the period.
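
The binomial result can be checked numerically from the independent-increments property, and then applied to this exercise. In the sketch below, the conditional density is computed from Poisson probabilities for two hypothetical rates and compared with the binomial density with parameters $n = 10$ and $p = 3/5$; the function names are illustrative.

```python
import math

def poisson_pmf(k, c):
    """P(N = k) = e^(-c) c^k / k! for a Poisson variable with parameter c."""
    return math.exp(-c) * c**k / math.factorial(k)

def conditional_pmf(k, n, s, t, r):
    """P(N_s = k | N_t = n), using independent Poisson increments."""
    joint = poisson_pmf(k, r * s) * poisson_pmf(n - k, r * (t - s))
    return joint / poisson_pmf(n, r * t)

def binomial_pmf(k, n, p):
    """Binomial density with trial parameter n and success parameter p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, s, t = 10, 3.0, 5.0
for r in (0.5, 2.0):  # hypothetical rates; the result should not depend on r
    gap = max(abs(conditional_pmf(k, n, s, t, r) - binomial_pmf(k, n, s / t))
              for k in range(n + 1))
    print(r, gap)  # gap should be at the level of rounding error

# At least 4 of the 10 requests in the first 3 minutes.
answer = 1.0 - sum(binomial_pmf(k, n, s / t) for k in range(4))
print(round(answer, 4))  # about 0.9452
```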

#### Estimating the Rate

In many practical situations, the rate $r$ of the process is unknown and must be estimated based on observing the number of arrivals in an interval.

Show that $\mathbb{E}(N_t / t) = r$ and hence $N_t / t$ is an unbiased estimator of $r$.

Since the estimator is unbiased, the variance measures the mean square error of the estimator.

Show that $\operatorname{var}(N_t / t) = r / t$ and hence $\operatorname{var}(N_t / t) \to 0$ as $t \to \infty$. This means that $N_t / t$ is a consistent estimator of $r$.

In the Poisson experiment, set $r = 3$ and $t = 5$. Run the experiment 100 times, updating after each run.

1. For each run, compute the estimate of $r$ based on $N_t$.
2. Over the 100 runs, compute the average of the squares of the errors.
3. Compare the result in part 2 with the variance in Exercise 28.
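
The same comparison can be sketched in code for readers working without the applet. The simulation below is an assumed stand-in (exponential gaps with rate $r$), not the applet's own procedure. It averages the squared errors of the estimates $N_t / t$ and prints the result next to the theoretical variance $r / t$.

```python
import random

def count_arrivals(r, t, rng):
    """Number of arrivals in (0, t] when gaps are independent exponential(r)."""
    count, clock = 0, rng.expovariate(r)
    while clock <= t:
        count += 1
        clock += rng.expovariate(r)
    return count

def mean_square_error(r, t, runs, seed=0):
    """Average squared error of the estimate N_t / t over simulated runs."""
    rng = random.Random(seed)
    errors = [(count_arrivals(r, t, rng) / t - r) ** 2 for _ in range(runs)]
    return sum(errors) / runs

r, t = 3.0, 5.0
# Empirical mean square error next to the theoretical variance r / t.
print(mean_square_error(r, t, runs=100), r / t)
```

With 100 runs the empirical value fluctuates noticeably around $r / t$; a larger `runs` value gives closer agreement.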

Suppose that requests to a web server follow the Poisson model with unknown rate $r$ per minute. In a one hour period, the server receives 342 requests. Estimate $r$.