A Markov process is a random process in which the future is independent of the past, given the present. Markov processes, named for Andrei Markov, are among the most important of all random processes. In a sense, they are the stochastic analogs of differential equations and recurrence relations, which, of course, are among the most important deterministic processes.
Suppose that $\boldsymbol{X} = \{X_t : t \in T\}$ is a random process on a probability space $(\Omega, \mathscr{F}, \mathbb{P})$, where $X_t$ is a random variable taking values in $S$ for each $t \in T$. We think of $X_t$ as the state of a system at time $t$. The state space $S$ is usually either a countable set or a nice region of $\mathbb{R}^k$ for some $k \in \mathbb{N}_+$. The time space $T$ is either $\mathbb{N}$ or $[0, \infty)$.
For $t \in T$, let $\mathscr{F}_t$ denote the $\sigma$-algebra of events generated by $\{X_s : s \in T, \, s \le t\}$. Intuitively, $\mathscr{F}_t$ contains the events that can be defined in terms of $X_s$ for $s \le t$. In other words, if we are allowed to observe the random variables $X_s$ for $s \le t$, then we can tell whether or not a given event in $\mathscr{F}_t$ has occurred.
The random process $\boldsymbol{X}$ is a Markov process if the following property (known as the Markov property) holds: For every $s, t \in T$ with $s \le t$, and for every $H \in \mathscr{F}_s$ and $x \in S$, the conditional distribution of $X_t$ given $X_s = x$ and $H$ is the same as the conditional distribution of $X_t$ just given $X_s = x$:
$$\mathbb{P}(X_t \in A \mid X_s = x, H) = \mathbb{P}(X_t \in A \mid X_s = x), \quad A \subseteq S$$
In the statement of the Markov property, think of $s$ as the present time, so that $t$ is a time in the future. Thus, $x$ is the present state and $H$ is an event that has occurred in the past. If we know the present state, then any additional knowledge of events in the past is irrelevant in terms of predicting the future.
The complexity of Markov processes depends greatly on whether the time space and the state space are discrete or continuous. In this chapter, we assume both are discrete, that is, we assume that $T = \mathbb{N}$ and that $S$ is countable (and hence the state variables have discrete distributions). In this setting, Markov processes are known as Markov chains. The theory of Markov chains is very beautiful and very complete.
The Markov property for a Markov chain can be stated as follows: for any sequence of states $(x_0, x_1, \ldots, x_{n-1}, x, y)$,
$$\mathbb{P}(X_{n+1} = y \mid X_0 = x_0, X_1 = x_1, \ldots, X_{n-1} = x_{n-1}, X_n = x) = \mathbb{P}(X_{n+1} = y \mid X_n = x)$$
Suppose that $\boldsymbol{X} = (X_0, X_1, X_2, \ldots)$ is a Markov chain with state space $S$. As before, let $\mathscr{F}_n$ be the $\sigma$-algebra generated by $(X_0, X_1, \ldots, X_n)$ for each $n \in \mathbb{N}$. A random variable $T$ taking values in $\mathbb{N} \cup \{\infty\}$ is a stopping time or a Markov time for $\boldsymbol{X}$ if $\{T = n\} \in \mathscr{F}_n$ for each $n \in \mathbb{N}$. Intuitively, we can tell whether or not $T = n$ by observing the chain up to time $n$. In some sense, a stopping time is a random time that does not require that we see into the future.
Suppose that $T$ is a stopping time for $\boldsymbol{X}$. Show that for $n \in \mathbb{N}$,
$$\{T \le n\} \in \mathscr{F}_n, \quad \{T > n\} \in \mathscr{F}_n$$
The quintessential example of a stopping time is the hitting time to a nonempty set of states $A \subseteq S$:
$$T_A = \min\{n \in \mathbb{N}_+ : X_n \in A\}$$
where as usual, we define $\min(\emptyset) = \infty$. This random variable gives the first positive time that the chain is in $A$.
Show that $T_A$ is a stopping time for $\boldsymbol{X}$:
$$\{T_A = n\} = \{X_1 \notin A, X_2 \notin A, \ldots, X_{n-1} \notin A, X_n \in A\} \in \mathscr{F}_n, \quad n \in \mathbb{N}_+$$
The strong Markov property states that the future is independent of the past, given the present, when the present time is a stopping time. For a Markov chain, the ordinary Markov property implies the strong Markov property.
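As a small computational sketch (not part of the text), a hitting time can be computed by running the chain forward and stopping at the first positive time the chain enters $A$. The three-state transition matrix below is invented purely for illustration, and a finite step cap stands in for $\min(\emptyset) = \infty$.

```python
import random

random.seed(42)  # for reproducibility of this sketch

# Invented chain on S = {0, 1, 2}; each row lists (state, probability) pairs.
P = {0: [(0, 0.5), (1, 0.3), (2, 0.2)],
     1: [(0, 0.1), (1, 0.6), (2, 0.3)],
     2: [(0, 0.4), (1, 0.4), (2, 0.2)]}

def step(x):
    # Sample the next state from the density P(x, .)
    states, probs = zip(*P[x])
    return random.choices(states, weights=probs)[0]

def hitting_time(x0, A, max_steps=10_000):
    # T_A = min{n >= 1 : X_n in A}; float("inf") stands in for "never hit".
    x = x0
    for n in range(1, max_steps + 1):
        x = step(x)
        if x in A:
            return n, x
    return float("inf"), None

T, x = hitting_time(0, A={2})
# T is a stopping time: its value is determined by observing X_1, ..., X_T alone.
```

Note that the loop never peeks ahead: the decision "stop at time $n$" uses only the states observed up to time $n$, which is exactly the stopping-time property.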
The study of Markov chains is simplified by the use of operator notation that is analogous to operations on vectors and matrices. Suppose that $P : S \times S \to \mathbb{R}$ and $Q : S \times S \to \mathbb{R}$, and that $f : S \to \mathbb{R}$ and $g : S \to \mathbb{R}$. Define $P Q : S \times S \to \mathbb{R}$ by
$$P Q(x, z) = \sum_{y \in S} P(x, y) Q(y, z), \quad (x, z) \in S \times S$$
Define $P f : S \to \mathbb{R}$ by
$$P f(x) = \sum_{y \in S} P(x, y) f(y), \quad x \in S$$
and define $f P : S \to \mathbb{R}$ by
$$f P(y) = \sum_{x \in S} f(x) P(x, y), \quad y \in S$$
Finally, we will sometimes let $f g$ denote the real number
$$f g = \sum_{x \in S} f(x) g(x)$$
In all of the definitions, we assume that if $S$ is infinite then either the terms in the sums are nonnegative or the sums converge absolutely, so that the order of the terms in the sums does not matter. We will often refer to a function $P : S \times S \to \mathbb{R}$ as a matrix on $S$. Indeed, if $S$ is finite, then $P$ really is a matrix in the usual sense, with rows and columns labeled by the elements of $S$. The product $P Q$ is ordinary matrix multiplication; the product $P f$ is the product of the matrix $P$ and the column vector $f$; the product $f P$ is the product of the row vector $f$ and the matrix $P$; and the product $f g$ is the product of $f$ as a row vector and $g$ as a column vector, or equivalently, the inner product of the vectors $f$ and $g$.
The sum of two matrices on or two functions on is defined in the usual way, as is a real multiple of a matrix or function.
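The four products above can be coded directly for a finite state space. The sketch below (our own notation, not from the text) stores a matrix as a dict keyed by pairs of states and a function as a dict keyed by states; the function names are invented.

```python
# The four products of the operator notation, for a finite state space S.
# A "matrix" is a dict {(x, y): value}; a "function" is a dict {x: value}.

S = ["a", "b"]

def mat_mat(P, Q):
    # PQ(x, z) = sum_y P(x, y) Q(y, z): ordinary matrix multiplication
    return {(x, z): sum(P[x, y] * Q[y, z] for y in S) for x in S for z in S}

def mat_fun(P, f):
    # Pf(x) = sum_y P(x, y) f(y): matrix times column vector
    return {x: sum(P[x, y] * f[y] for y in S) for x in S}

def fun_mat(f, P):
    # fP(y) = sum_x f(x) P(x, y): row vector times matrix
    return {y: sum(f[x] * P[x, y] for x in S) for y in S}

def fun_fun(f, g):
    # fg = sum_x f(x) g(x): inner product of f and g
    return sum(f[x] * g[x] for x in S)
```

With these in hand, identities such as the associativity $(PQ)f = P(Qf)$ in the exercises below can be checked numerically on small examples.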
In the following exercises, suppose that $P$, $Q$, and $R$ are matrices on $S$ and that $f$ and $g$ are functions on $S$. Assume also that the sums exist.
Show that the associative property holds whenever the operations make sense. In particular,
$$(P Q) R = P (Q R), \quad (P Q) f = P (Q f), \quad (f P) Q = f (P Q), \quad (f P) g = f (P g)$$
Show that the distributive property holds whenever the operations make sense. In particular,
$$P (Q + R) = P Q + P R, \quad (P + Q) R = P R + Q R, \quad P (f + g) = P f + P g, \quad (f + g) P = f P + g P$$
The commutative property does not hold in general. Give examples where
$$P Q \ne Q P$$
If $P$ is a matrix on $S$, we denote by $P^n$ the $n$-fold (matrix) power of $P$ for $n \in \mathbb{N}_+$. By convention, $P^0 = I$, the identity matrix on $S$, defined by
$$I(x, y) = \begin{cases} 1, & x = y \\ 0, & x \ne y \end{cases}$$
If $P$ is a matrix on $S$ and $A \subseteq S$, we will let $P_A$ denote the restriction of $P$ to $A \times A$. Similarly, if $f$ is a function on $S$, then $f_A$ denotes the restriction of $f$ to $A$.
If $f : S \to [0, \infty)$, then $f$ is a discrete probability density function (or probability mass function) on $S$ if
$$\sum_{x \in S} f(x) = 1$$
We have studied these in detail! On the other hand, if $P : S \times S \to [0, \infty)$, then $P$ is a transition probability matrix or stochastic matrix on $S$ if
$$\sum_{y \in S} P(x, y) = 1, \quad x \in S$$
Thus, if $P$ is a transition matrix, then $P(x, \cdot)$ is a probability density function on $S$ for each $x \in S$. In matrix terminology, the row sums are 1.
Show that if $P$ is a transition probability matrix on $S$ then $P \mathbf{1} = \mathbf{1}$, where $\mathbf{1}$ is the constant function 1 on $S$. Thus, in the language of linear algebra, $\mathbf{1}$ is a right eigenvector of $P$, corresponding to the eigenvalue 1.
Suppose that $P$ and $Q$ are transition probability matrices on $S$ and that $f$ is a probability density function on $S$. Show that $P Q$ is a transition probability matrix, that $P^n$ is a transition probability matrix for each $n \in \mathbb{N}$, and that $f P$ is a probability density function.
Suppose that $f$ is the probability density function of a random variable $X$ taking values in $S$ and that $g : S \to \mathbb{R}$. Show that $f g = \mathbb{E}[g(X)]$ (assuming that the expected value exists).
A function $f$ on $S$ is said to be left-invariant for a transition probability matrix $P$ if $f P = f$. In the language of linear algebra, $f$ is a left eigenvector of $P$ corresponding to the eigenvalue 1.
Show that if $f$ is left-invariant for $P$, then $f P^n = f$ for each $n \in \mathbb{N}$.
Suppose again that $\boldsymbol{X} = (X_0, X_1, X_2, \ldots)$ is a Markov chain with state space $S$. For $m, n \in \mathbb{N}$ with $m \le n$, let
$$P_{m,n}(x, y) = \mathbb{P}(X_n = y \mid X_m = x), \quad (x, y) \in S \times S$$
The matrix $P_{m,n}$ is the transition probability matrix from time $m$ to time $n$. The result in the next exercise is known as the Chapman-Kolmogorov equation, named for Sydney Chapman and Andrei Kolmogorov. It gives the basic relationship between the transition matrices.
Suppose that $k$, $m$, and $n$ are nonnegative integers with $k \le m \le n$. Use basic properties of conditional probability and the Markov property to show that
$$P_{k,m} P_{m,n} = P_{k,n}$$
It follows immediately that all of the transition probability matrices for $\boldsymbol{X}$ can be obtained from the one-step transition probability matrices: if $m$ and $n$ are nonnegative integers with $m \le n$, then
$$P_{m,n} = P_{m,m+1} P_{m+1,m+2} \cdots P_{n-1,n}$$
Suppose that $m$ and $n$ are nonnegative integers with $m \le n$. Show that if $X_m$ has probability density function $f_m$, then $X_n$ has probability density function $f_n = f_m P_{m,n}$.
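The content of the Chapman-Kolmogorov equation can be checked numerically. In the sketch below (all matrices invented for illustration), a two-step inhomogeneous chain on $S = \{0, 1\}$ is built from its joint density, the conditional probability $\mathbb{P}(X_2 = z \mid X_0 = x)$ is computed directly from that joint density, and the result is compared with the matrix product of the two one-step matrices.

```python
from itertools import product

# Invented one-step matrices for a two-step chain on S = {0, 1}
S = [0, 1]
A = [[0.9, 0.1], [0.4, 0.6]]   # transition matrix from time 0 to time 1
B = [[0.5, 0.5], [0.2, 0.8]]   # transition matrix from time 1 to time 2
f0 = [0.3, 0.7]                # density of X_0

# Joint density of (X_0, X_1, X_2) built from the chain structure
joint = {(x0, x1, x2): f0[x0] * A[x0][x1] * B[x1][x2]
         for x0, x1, x2 in product(S, repeat=3)}

def P02(x, z):
    # P(X_2 = z | X_0 = x) computed directly from the joint density
    num = sum(joint[x, x1, z] for x1 in S)
    den = sum(joint[x, x1, x2] for x1, x2 in product(S, repeat=2))
    return num / den

# The matrix product A B, i.e. P_{0,1} P_{1,2}, should equal P_{0,2}
AB = [[sum(A[x][y] * B[y][z] for y in S) for z in S] for x in S]
```

Summing out the intermediate state $x_1$ in the joint density is exactly the sum over $y$ in the matrix product, which is why the two computations agree.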
Combining the last two results, it follows that the distribution of $X_0$ (the initial distribution) and the one-step transition matrices determine the distribution of $X_n$ for each $n$. Actually, these basic quantities determine the joint distributions of the process, a much stronger result.
Suppose that $X_0$ has probability density function $f_0$. Show that for any sequence of states $(x_0, x_1, \ldots, x_n)$,
$$\mathbb{P}(X_0 = x_0, X_1 = x_1, \ldots, X_n = x_n) = f_0(x_0) P_{0,1}(x_0, x_1) P_{1,2}(x_1, x_2) \cdots P_{n-1,n}(x_{n-1}, x_n)$$
Computations of this sort are the reason for the term chain in Markov chain.
A Markov chain $\boldsymbol{X}$ is said to be time homogeneous if the transition matrix from time $m$ to time $n$ depends only on the difference $n - m$, for any nonnegative integers $m$ and $n$ with $m \le n$. That is,
$$P_{m,n} = P_{0,n-m}$$
It follows that there is a single one-step transition probability matrix $P$, given by
$$P(x, y) = \mathbb{P}(X_{n+1} = y \mid X_n = x), \quad (x, y) \in S \times S$$
and all other transition matrices can be expressed as powers of $P$. Indeed, if $m \le n$ then $P_{m,n} = P^{n-m}$, and the Chapman-Kolmogorov equation is simply the law of exponents for matrix powers. From Exercise 11, if $X_m$ has probability density function $f_m$, then $f_m P^n$ is the probability density function of $X_{m+n}$. In particular, if $X_0$ has probability density function $f_0$, then $f_0 P^n$ is the probability density function of $X_n$. The joint distribution in Exercise 12 above also simplifies: if $X_0$ has probability density function $f_0$, then for any sequence of states $(x_0, x_1, \ldots, x_n)$,
$$\mathbb{P}(X_0 = x_0, X_1 = x_1, \ldots, X_n = x_n) = f_0(x_0) P(x_0, x_1) P(x_1, x_2) \cdots P(x_{n-1}, x_n)$$
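These two formulas fit together: summing the joint density over all paths ending in a given state must reproduce $f_0 P^n$. The sketch below (chain invented for illustration) checks this by brute force for a three-state chain started deterministically in state 0.

```python
from itertools import product

# Invented homogeneous chain on S = {0, 1, 2}
S = [0, 1, 2]
P = [[0.2, 0.5, 0.3],
     [0.6, 0.1, 0.3],
     [0.3, 0.3, 0.4]]
f0 = [1.0, 0.0, 0.0]   # X_0 = 0 with probability 1
n = 4

# Density of X_n as f_0 P^n, by repeated row-vector-times-matrix products
f = f0
for _ in range(n):
    f = [sum(f[x] * P[x][y] for x in S) for y in S]

# Brute force: sum f_0(x_0) P(x_0, x_1) ... P(x_{n-1}, x_n) over all paths
g = [0.0] * len(S)
for path in product(S, repeat=n):
    prob = f0[0]
    prev = 0
    for x in path:
        prob *= P[prev][x]
        prev = x
    g[path[-1]] += prob
```

The path sum grows exponentially in $n$, of course; the point of the matrix formulation is that $f_0 P^n$ collapses that sum into $n$ cheap vector-matrix products.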
Suppose that $X_0$ has probability density function $f$ and that $f$ is left-invariant for $P$. Show that $X_n$ has probability density function $f$ for each $n \in \mathbb{N}$, and hence the sequence of random variables $(X_0, X_1, X_2, \ldots)$ is identically distributed (although certainly not independent in general).
In the context of the previous exercise, the probability distribution on $S$ associated with $f$ is said to be invariant or stationary for $P$, or for the Markov chain $\boldsymbol{X}$. Stationary distributions turn out to be of fundamental importance in the study of the limiting behavior of Markov chains.
The assumption of time homogeneity is not as restrictive as might first appear. The following exercise shows that any Markov chain can be turned into a homogeneous chain by enlarging the state space with a time component.
Suppose that $\boldsymbol{X} = (X_0, X_1, X_2, \ldots)$ is an inhomogeneous Markov chain with state space $S$, and let
$$Q_n(x, y) = \mathbb{P}(X_{n+1} = y \mid X_n = x)$$
denote the one-step transition probability at time $n$, for $n \in \mathbb{N}$ and $(x, y) \in S \times S$. Suppose that $N$ is a random variable taking values in $\mathbb{N}$, independent of $\boldsymbol{X}$. Let $Y_0 = (X_N, N)$ and let $Y_n = (X_{N+n}, N+n)$ for $n \in \mathbb{N}_+$. Show that $\boldsymbol{Y} = (Y_0, Y_1, Y_2, \ldots)$ is a homogeneous Markov chain on $S \times \mathbb{N}$ with transition probability matrix $R$ given by
$$R[(x, m), (y, n)] = \begin{cases} Q_m(x, y), & n = m + 1 \\ 0, & \text{otherwise} \end{cases}$$
From now on, unless otherwise noted, the term Markov chain will mean homogeneous Markov chain.
Suppose that $\boldsymbol{X} = (X_0, X_1, X_2, \ldots)$ is a Markov chain with state space $S$ and transition probability matrix $P$. For fixed $k \in \mathbb{N}_+$, show that $(X_0, X_k, X_{2k}, \ldots)$ is a Markov chain on $S$ with transition probability matrix $P^k$.
The following exercise also uses the basic trick of enlarging the state space to turn a random process into a Markov chain.
Suppose that $\boldsymbol{X} = (X_0, X_1, X_2, \ldots)$ is a random process on $S$ in which the future depends stochastically on the last two states. Specifically, suppose that for any sequence of states $(x_0, x_1, \ldots, x_{n-1}, x, y, z)$,
$$\mathbb{P}(X_{n+2} = z \mid X_0 = x_0, \ldots, X_{n-1} = x_{n-1}, X_n = x, X_{n+1} = y) = \mathbb{P}(X_{n+2} = z \mid X_n = x, X_{n+1} = y)$$
We also assume that this probability is independent of $n$, and we denote it by $Q(x, y, z)$. Let $Y_n = (X_n, X_{n+1})$ for $n \in \mathbb{N}$. Show that $\boldsymbol{Y} = (Y_0, Y_1, Y_2, \ldots)$ is a Markov chain on $S \times S$ with transition probability matrix $R$ given by
$$R[(x, y), (w, z)] = \begin{cases} Q(x, y, z), & w = y \\ 0, & \text{otherwise} \end{cases}$$
The result in the last exercise generalizes in a completely straightforward way to the case where the future of the random process depends stochastically on the last $k$ states, for some fixed $k \in \mathbb{N}_+$.
Suppose again that $\boldsymbol{X} = (X_0, X_1, X_2, \ldots)$ is a Markov chain with state space $S$ and transition probability matrix $P$. The directed graph associated with $\boldsymbol{X}$ has vertex set $S$ and edge set $E = \{(x, y) \in S \times S : P(x, y) > 0\}$. That is, there is a directed edge from $x$ to $y$ if and only if state $x$ leads to state $y$ in one step. Note that the graph may well have loops, since a state can certainly lead back to itself in one step.
Suppose that $A$ is a nonempty subset of $S$. Show that
$$P_A^n(x, y) = \mathbb{P}(X_1 \in A, X_2 \in A, \ldots, X_{n-1} \in A, X_n = y \mid X_0 = x), \quad (x, y) \in A \times A$$
That is, $P_A^n(x, y)$ is the probability of going from state $x$ to state $y$ in $n$ steps, remaining in $A$ all the while. In graphical terms, it is the sum of products of probabilities along paths of length $n$ from $x$ to $y$ that stay inside $A$. Note that in general, $P_A^n \ne (P^n)_A$.
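The difference between $(P_A)^n$ and $(P^n)_A$ is easy to see numerically. In the sketch below (matrix invented for illustration), $S = \{0, 1, 2\}$ and $A = \{0, 1\}$: the restricted power counts only paths that stay in $A$, while the restriction of the power also counts excursions through state 2.

```python
# Invented chain on S = {0, 1, 2}, with A = {0, 1}
S = [0, 1, 2]
A = [0, 1]
P = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.1, 0.1, 0.8]]

def mul(M, N, m):
    # Product of two m-by-m matrices stored as lists of lists
    return [[sum(M[i][k] * N[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

PA = [[P[x][y] for y in A] for x in A]       # the restriction P_A
PA2 = mul(PA, PA, len(A))                    # (P_A)^2: stay inside A throughout
P2 = mul(P, P, len(S))
P2A = [[P2[x][y] for y in A] for x in A]     # (P^2)_A: may pass through state 2

# (P_A)^2(0,0) = P(0,0)P(0,0) + P(0,1)P(1,0) = 0.25 + 0.06 = 0.31
# (P^2)_A(0,0) adds the excursion P(0,2)P(2,0) = 0.02, giving 0.33
```

So the restricted power is always less than or equal to the restriction of the power, entry by entry, with strict inequality whenever a path can leave $A$ and return.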
Perhaps the simplest, non-trivial Markov chain has two states, say $S = \{0, 1\}$, and the transition probability matrix given below, where $p$ and $q$ are parameters with $0 < p < 1$ and $0 < q < 1$:
$$P = \begin{pmatrix} 1 - p & p \\ q & 1 - q \end{pmatrix}$$
Show that the eigenvalues of $P$ are 1 and $1 - p - q$.
Show that $P = B D B^{-1}$ where
$$B = \begin{pmatrix} 1 & -p \\ 1 & q \end{pmatrix}, \quad D = \begin{pmatrix} 1 & 0 \\ 0 & 1 - p - q \end{pmatrix}$$
Show that
$$P^n = \frac{1}{p + q} \begin{pmatrix} q + p(1 - p - q)^n & p - p(1 - p - q)^n \\ q - q(1 - p - q)^n & p + q(1 - p - q)^n \end{pmatrix}, \quad n \in \mathbb{N}$$
Show that
$$P^n \to \frac{1}{p + q} \begin{pmatrix} q & p \\ q & p \end{pmatrix} \text{ as } n \to \infty$$
Show that the only invariant probability density function for the chain is
$$f = \left( \frac{q}{p + q}, \frac{p}{p + q} \right)$$
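The closed form for $P^n$, its limit, and the invariant density can all be checked numerically. The sketch below uses invented parameter values $p = 0.3$, $q = 0.6$ and pure Python matrix arithmetic.

```python
# Two-state chain with invented parameters p = 0.3, q = 0.6
p, q = 0.3, 0.6
P = [[1 - p, p], [q, 1 - q]]

def mul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power(M, n):
    # n-fold matrix power, with the identity as the 0th power
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        R = mul(R, M)
    return R

def closed_form(n):
    # The claimed formula for P^n in terms of the eigenvalue r = 1 - p - q
    r = (1 - p - q) ** n
    return [[(q + p * r) / (p + q), (p - p * r) / (p + q)],
            [(q - q * r) / (p + q), (p + q * r) / (p + q)]]

f = [q / (p + q), p / (p + q)]   # the invariant density
fP = [sum(f[x] * P[x][y] for x in range(2)) for y in range(2)]
```

Since $|1 - p - q| < 1$, the term $(1 - p - q)^n$ dies out geometrically, which is exactly why the rows of $P^n$ converge to the invariant density.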
In spite of its simplicity, the two state chain illustrates some of the basic limiting behavior and the connection with invariant distributions that we will study in general in a later section.
Suppose that $\boldsymbol{X} = (X_0, X_1, X_2, \ldots)$ is a sequence of independent random variables taking values in a countable set $S$, and that $(X_1, X_2, \ldots)$ are identically distributed with (discrete) probability density function $f$.
Show that $\boldsymbol{X}$ is a Markov chain on $S$ with transition probability matrix $P$ given by $P(x, y) = f(y)$ for $(x, y) \in S \times S$. Show also that $f$ is invariant for $P$.
As a Markov chain, the process $\boldsymbol{X}$ is not very interesting, although it is very interesting in other ways. Suppose now that $S = \mathbb{Z}$, the set of integers, and consider the partial sum process (or random walk) associated with $\boldsymbol{X}$:
$$Y_n = \sum_{i=0}^n X_i, \quad n \in \mathbb{N}$$
Show that $\boldsymbol{Y} = (Y_0, Y_1, Y_2, \ldots)$ is a Markov chain on $\mathbb{Z}$ with transition probability matrix $Q$ given by $Q(x, y) = f(y - x)$ for $(x, y) \in \mathbb{Z} \times \mathbb{Z}$.
Consider the special case where $f(1) = p$ and $f(-1) = 1 - p$, where $0 < p < 1$. Show that the transition probability matrix is given by
$$Q(x, x + 1) = p, \quad Q(x, x - 1) = 1 - p, \quad x \in \mathbb{Z}$$
This special case is the simple random walk on $\mathbb{Z}$. When $p = \frac{1}{2}$ we have the simple, symmetric random walk. Simple random walks are studied in more detail in the chapter on Bernoulli Trials.
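Because $Q(x, y) = f(y - x)$, applying the transition matrix to a density is just a convolution with the step density. The sketch below (our own code, for the symmetric case $p = \frac{1}{2}$) computes the exact distribution of the walk after four steps, started at 0.

```python
# Exact n-step distribution of the simple symmetric random walk started at 0,
# computed by repeated convolution with the step density f(1) = p, f(-1) = 1 - p.
p = 0.5
step = {1: p, -1: 1 - p}

def convolve(d):
    # One transition: new(y) = sum_x d(x) Q(x, y), with Q(x, y) = f(y - x)
    out = {}
    for x, px in d.items():
        for s, ps in step.items():
            out[x + s] = out.get(x + s, 0.0) + px * ps
    return out

dist = {0: 1.0}          # walk started deterministically at 0
for _ in range(4):
    dist = convolve(dist)
# After 4 symmetric steps, P(position 0) = C(4,2) / 2^4 = 6/16
```

The resulting probabilities are the familiar binomial ones: the position after $n$ steps is (number of up-steps) minus (number of down-steps).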
A matrix $P$ on $S$ is doubly stochastic if it is nonnegative and if the row and column sums are 1:
$$\sum_{u \in S} P(x, u) = 1, \quad \sum_{u \in S} P(u, y) = 1, \quad x, y \in S$$
Suppose that $\boldsymbol{X}$ is a Markov chain on a finite state space $S$ with doubly stochastic transition probability matrix $P$. Show that the uniform distribution on $S$ is invariant.
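This is quick to verify on an example: for a doubly stochastic $P$, each entry of $fP$ with $f$ uniform is $\frac{1}{|S|}$ times a column sum, which is $\frac{1}{|S|}$ again. The matrix below is invented for illustration.

```python
# Invented doubly stochastic matrix on a 3-state space: all row sums and all
# column sums equal 1.
S = [0, 1, 2]
P = [[0.1, 0.6, 0.3],
     [0.5, 0.2, 0.3],
     [0.4, 0.2, 0.4]]

f = [1 / len(S)] * len(S)   # the uniform density on S
fP = [sum(f[x] * P[x][y] for x in S) for y in S]
# Each entry of fP is (1/3) * (column sum) = 1/3, so fP = f
```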
A matrix $P$ on $S$ is symmetric if $P(x, y) = P(y, x)$ for all $(x, y) \in S \times S$. Note that a symmetric transition probability matrix is in particular doubly stochastic.
The Markov chains in the following exercises model important processes that are studied in separate sections. We will refer to these chains frequently.
Read the introduction to the Ehrenfest chains and work the exercises.
Read the introduction to the Bernoulli-Laplace chain and work the exercises.
Read the introduction to the reliability chains and work the exercises.
Read the introduction to the branching chain and work the exercises.
Read the introduction to the queuing chains and work the exercises.
Read the introduction to random walks on graphs and work the exercises.
Read the introduction to birth-death chains and work the exercises.