Lectures on Lévy Processes, Stochastic
Calculus and Financial Applications,
Ovronnaz September 2005
David Applebaum
Probability and Statistics Department,
University of Sheffield,
Hicks Building, Hounsfield Road,
Sheffield, England, S3 7RH
e-mail: D.Applebaum@sheffield.ac.uk
Introduction
A Lévy process is essentially a stochastic process with stationary and independent increments. The basic theory was developed, principally by Paul Lévy, in the 1930s. In the past 15 years there has been a renaissance of interest and a plethora of books, articles and conferences. Why?
There are both theoretical and practical reasons.
Theoretical
• There are many interesting examples - Brownian motion, simple and
compound Poisson processes, α-stable processes, subordinated processes,
financial processes, relativistic process, Riemann zeta process . . .
• Lévy processes are the simplest generic class of processes which have (a.s.) continuous paths interspersed with random jumps of arbitrary size occurring at random times.
• Lévy processes comprise a natural subclass of semimartingales and of Markov processes of Feller type.
• Noise. Lévy processes are a good model of “noise” in random dynamical systems.
Input + Noise = Output
Attempts to describe this differentially lead to stochastic calculus. A large class of Markov processes can be built as solutions of stochastic differential equations driven by Lévy noise.
Lévy driven stochastic partial differential equations are beginning to be studied with some intensity.
• Robust structure. Most applications utilise Lévy processes taking values in Euclidean space but this can be replaced by a Hilbert space, a Banach space (these are important for SPDEs), a locally compact group, or a manifold. Quantised versions are non-commutative Lévy processes on quantum groups.
Applications
These include:
• Turbulence via Burgers' equation (Bertoin)
• New examples of quantum field theories (Albeverio, Gottschalk, Wu)
• Viscoelasticity (Bouleau)
• Time series - Lévy driven CARMA models (Brockwell)
• Finance (a cast of thousands)
The biggest explosion of activity has been in mathematical finance. Two
major areas of activity are:
• option pricing in incomplete markets.
• interest rate modelling.
1 Lecture 1: Infinite Divisibility and Lévy Processes
1.1 Some Basic Ideas of Probability
Notation. Our state space is Euclidean space $\mathbb{R}^d$. The inner product between two vectors $x = (x_1, \ldots, x_d)$ and $y = (y_1, \ldots, y_d)$ is
$$(x, y) = \sum_{i=1}^{d} x_i y_i.$$
The associated norm (length of a vector) is
$$|x| = (x, x)^{1/2} = \left(\sum_{i=1}^{d} x_i^2\right)^{1/2}.$$
Let (Ω, F, P ) be a probability space, so that Ω is a set, F is a σ-algebra
of subsets of Ω and P is a probability measure defined on (Ω, F) .
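Random variables are measurable functions $X : \Omega \to \mathbb{R}^d$. The law of $X$ is $p_X$, where for each $A \in \mathcal{B}(\mathbb{R}^d)$ (the Borel σ-algebra of $\mathbb{R}^d$), $p_X(A) = P(X \in A)$.
$(X_n, n \in \mathbb{N})$ are independent if for all $i_1, i_2, \ldots, i_r \in \mathbb{N}$ and $A_{i_1}, A_{i_2}, \ldots, A_{i_r} \in \mathcal{B}(\mathbb{R}^d)$,
$$P(X_{i_1} \in A_{i_1}, X_{i_2} \in A_{i_2}, \ldots, X_{i_r} \in A_{i_r}) = P(X_{i_1} \in A_{i_1})\, P(X_{i_2} \in A_{i_2}) \cdots P(X_{i_r} \in A_{i_r}).$$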
If X and Y are independent, the law of X + Y is given by convolution of measures
$$p_{X+Y} = p_X * p_Y, \quad \text{where } (p_X * p_Y)(A) = \int_{\mathbb{R}^d} p_X(A - y)\, p_Y(dy).$$
Equivalently
$$\int_{\mathbb{R}^d} g(y)\,(p_X * p_Y)(dy) = \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} g(x + y)\, p_X(dx)\, p_Y(dy),$$
for all $g \in B_b(\mathbb{R}^d)$ (the bounded Borel measurable functions on $\mathbb{R}^d$).
If X and Y are independent with densities $f_X$ and $f_Y$, respectively, then X + Y has density $f_{X+Y}$ given by convolution of functions:
$$f_{X+Y} = f_X * f_Y, \quad \text{where } (f_X * f_Y)(x) = \int_{\mathbb{R}^d} f_X(x - y)\, f_Y(y)\, dy.$$
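As a concrete illustration of convolution of measures, the following minimal Python sketch computes the law of X + Y for two independent discrete random variables. The fair-die example and the helper name `convolve` are illustrative assumptions and are not part of the lectures.

```python
from fractions import Fraction

def convolve(p, q):
    """Law of X + Y for independent X ~ p and Y ~ q.

    p and q map each atom (an integer) to its probability; this is the
    discrete analogue of (p_X * p_Y)(A) = integral of p_X(A - y) p_Y(dy).
    """
    out = {}
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] = out.get(x + y, Fraction(0)) + px * qy
    return out

# Hypothetical example: the law of a fair six-sided die.
die = {k: Fraction(1, 6) for k in range(1, 7)}
two_dice = convolve(die, die)

print(two_dice[7])             # 1/6
print(sum(two_dice.values()))  # 1
```

Exact rational arithmetic is used so that the total mass of the convolved law is exactly 1.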
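The characteristic function of X is $\phi_X : \mathbb{R}^d \to \mathbb{C}$, where
$$\phi_X(u) = \int_{\mathbb{R}^d} e^{i(u,x)}\, p_X(dx).$$
Theorem 1.1 (Kac’s theorem) $X_1, \ldots, X_n$ are independent if and only if
$$E\left(\exp\left(i \sum_{j=1}^{n} (u_j, X_j)\right)\right) = \phi_{X_1}(u_1) \cdots \phi_{X_n}(u_n)$$
for all $u_1, \ldots, u_n \in \mathbb{R}^d$.
More generally, the characteristic function of a probability measure µ on $\mathbb{R}^d$ is
$$\phi_\mu(u) = \int_{\mathbb{R}^d} e^{i(u,x)}\, \mu(dx).$$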
Important properties are:
1. $\phi_\mu(0) = 1$.
2. $\phi_\mu$ is positive definite, i.e. $\sum_{i,j=1}^{n} c_i \bar{c}_j \phi_\mu(u_i - u_j) \geq 0$, for all $c_i \in \mathbb{C}$, $u_i \in \mathbb{R}^d$, $1 \leq i, j \leq n$, $n \in \mathbb{N}$.
3. $\phi_\mu$ is uniformly continuous. (Hint: look at $|\phi_\mu(u + h) - \phi_\mu(u)|$ and use dominated convergence.)
Conversely Bochner’s theorem states that if φ : Rd → C satisfies (1), (2)
and is continuous at u = 0, then it is the characteristic function of some
probability measure µ on Rd.
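Properties (1) and (2) can be checked numerically for a specific law. The sketch below evaluates the quadratic form in (2) for randomly chosen points and coefficients. It assumes, purely for illustration, the characteristic function $\exp(c(e^{iu} - 1))$ of a Poisson law with mean c; the function names are hypothetical.

```python
import cmath
import random

def phi_poisson(u, c=2.0):
    """Characteristic function exp(c (e^{iu} - 1)) of a Poisson law with mean c."""
    return cmath.exp(c * (cmath.exp(1j * u) - 1))

def quadratic_form(phi, cs, us):
    """Sum over i, j of c_i * conj(c_j) * phi(u_i - u_j)."""
    return sum(ci * cj.conjugate() * phi(ui - uj)
               for ci, ui in zip(cs, us)
               for cj, uj in zip(cs, us))

rng = random.Random(0)
n = 6
us = [rng.uniform(-5, 5) for _ in range(n)]
cs = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

q = quadratic_form(phi_poisson, cs, us)
# Positive definiteness: the form is real and non-negative (up to rounding).
print(q.real >= -1e-12, abs(q.imag) < 1e-9)
```

A single random draw is of course only a sanity check, not a proof of positive definiteness.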
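$\psi : \mathbb{R}^d \to \mathbb{C}$ is conditionally positive definite if for all $n \in \mathbb{N}$ and $c_1, \ldots, c_n \in \mathbb{C}$ for which $\sum_{j=1}^{n} c_j = 0$ we have
$$\sum_{j,k=1}^{n} c_j \bar{c}_k \psi(u_j - u_k) \geq 0,$$
for all $u_1, \ldots, u_n \in \mathbb{R}^d$. $\psi : \mathbb{R}^d \to \mathbb{C}$ will be said to be hermitian if $\overline{\psi(u)} = \psi(-u)$, for all $u \in \mathbb{R}^d$.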
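Theorem 1.2 (Schoenberg correspondence) $\psi : \mathbb{R}^d \to \mathbb{C}$ is hermitian and conditionally positive definite if and only if $e^{t\psi}$ is positive definite for each t > 0.
Proof. We only give the easy part here.
Suppose that $e^{t\psi}$ is positive definite for all t > 0. Fix $n \in \mathbb{N}$ and choose $c_1, \ldots, c_n$ and $u_1, \ldots, u_n$ as above. Since $\sum_{j=1}^{n} c_j = 0$, we have $\sum_{j,k=1}^{n} c_j \bar{c}_k = \left|\sum_{j=1}^{n} c_j\right|^2 = 0$, so subtracting 1 inside the sum does not change its value. We then find that for each t > 0,
$$\frac{1}{t} \sum_{j,k=1}^{n} c_j \bar{c}_k \left(e^{t\psi(u_j - u_k)} - 1\right) \geq 0,$$
and so
$$\sum_{j,k=1}^{n} c_j \bar{c}_k \psi(u_j - u_k) = \lim_{t \to 0} \frac{1}{t} \sum_{j,k=1}^{n} c_j \bar{c}_k \left(e^{t\psi(u_j - u_k)} - 1\right) \geq 0.$$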
1.2 Infinite Divisibility
We study this first because a Lévy process is infinite divisibility in motion, i.e. infinite divisibility is the underlying probabilistic idea which a Lévy process embodies dynamically.
Let µ be a probability measure on $\mathbb{R}^d$. Define $\mu^{*n} = \mu * \cdots * \mu$ (n times). We say that µ has a convolution nth root if there exists a probability measure $\mu^{1/n}$ for which $(\mu^{1/n})^{*n} = \mu$.
µ is infinitely divisible if it has a convolution nth root for all $n \in \mathbb{N}$. In this case $\mu^{1/n}$ is unique.
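Theorem 1.3 µ is infinitely divisible iff for all $n \in \mathbb{N}$, there exists a probability measure $\mu_n$ with characteristic function $\phi_n$ such that
$$\phi_\mu(u) = (\phi_n(u))^n,$$
for all $u \in \mathbb{R}^d$. Moreover $\mu_n = \mu^{1/n}$.
Proof. If µ is infinitely divisible, take $\phi_n = \phi_{\mu^{1/n}}$. Conversely, for each $n \in \mathbb{N}$, by Fubini’s theorem,
$$\phi_\mu(u) = \int_{\mathbb{R}^d} \cdots \int_{\mathbb{R}^d} e^{i(u, y_1 + \cdots + y_n)}\, \mu_n(dy_1) \cdots \mu_n(dy_n) = \int_{\mathbb{R}^d} e^{i(u,y)}\, \mu_n^{*n}(dy).$$
But $\phi_\mu(u) = \int_{\mathbb{R}^d} e^{i(u,y)}\, \mu(dy)$ and φ determines µ uniquely. Hence $\mu = \mu_n^{*n}$.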
- If µ and ν are each infinitely divisible, then so is µ ∗ ν.
- If $(\mu_n, n \in \mathbb{N})$ are infinitely divisible and $\mu_n \xrightarrow{w} \mu$, then µ is infinitely divisible.
[Note: Weak convergence. $\mu_n \xrightarrow{w} \mu$ means
$$\lim_{n \to \infty} \int_{\mathbb{R}^d} f(x)\, \mu_n(dx) = \int_{\mathbb{R}^d} f(x)\, \mu(dx),$$
for each $f \in C_b(\mathbb{R}^d)$.]
A random variable X is infinitely divisible if its law $p_X$ is infinitely divisible, e.g. $X \stackrel{d}{=} Y_1^{(n)} + \cdots + Y_n^{(n)}$, where $Y_1^{(n)}, \ldots, Y_n^{(n)}$ are i.i.d., for each $n \in \mathbb{N}$.
1.2.1 Examples of Infinite Divisibility
In the following, we will demonstrate infinite divisibility of a random variable X by finding i.i.d. $Y_1^{(n)}, \ldots, Y_n^{(n)}$ such that $X \stackrel{d}{=} Y_1^{(n)} + \cdots + Y_n^{(n)}$, for each $n \in \mathbb{N}$.
Example 1 - Gaussian Random Variables
Let $X = (X_1, \ldots, X_d)$ be a random vector.
We say that it is (non-degenerate) Gaussian if there exists a vector $m \in \mathbb{R}^d$ and a strictly positive-definite symmetric d × d matrix A such that X has a pdf (probability density function) of the form
$$f(x) = \frac{1}{(2\pi)^{d/2} \sqrt{\det(A)}} \exp\left(-\frac{1}{2}\left(x - m, A^{-1}(x - m)\right)\right), \tag{1.1}$$
for all $x \in \mathbb{R}^d$.
In this case we will write X ∼ N(m, A). The vector m is the mean of X, so m = E(X), and A is the covariance matrix, so that $A = E((X - m)(X - m)^T)$.
A standard calculation yields
$$\phi_X(u) = \exp\left\{i(m, u) - \frac{1}{2}(u, Au)\right\}, \tag{1.2}$$
and hence
$$(\phi_X(u))^{1/n} = \exp\left\{i\left(\frac{m}{n}, u\right) - \frac{1}{2}\left(u, \frac{1}{n}Au\right)\right\},$$
so we see that X is infinitely divisible with each $Y_j^{(n)} \sim N\left(\frac{m}{n}, \frac{1}{n}A\right)$ for each 1 ≤ j ≤ n.
We say that X is a standard normal whenever $X \sim N(0, \sigma^2 I)$ for some σ > 0.
We say that X is degenerate Gaussian if (1.2) holds with det(A) = 0, and these random variables are also infinitely divisible.
Example 2 - Poisson Random Variables
In this case, we take d = 1 and consider a random variable X taking values in the set $\mathbb{N} \cup \{0\}$. We say that X is Poisson if there exists c > 0 for which
$$P(X = n) = \frac{c^n}{n!} e^{-c}.$$
In this case we will write X ∼ π(c). We have E(X) = Var(X) = c. It is easy to verify that
$$\phi_X(u) = \exp[c(e^{iu} - 1)],$$
from which we deduce that X is infinitely divisible with each $Y_j^{(n)} \sim \pi\left(\frac{c}{n}\right)$, for 1 ≤ j ≤ n, $n \in \mathbb{N}$.
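The convolution nth root of a Poisson law can be verified directly on probability mass functions. The following sketch convolves π(c/n) with itself n times and compares the result with π(c). The values c = 3, n = 4 and the truncation level `n_max` are arbitrary illustrative choices, and the helper names are assumptions.

```python
import math

def poisson_pmf(c, n_max):
    """[P(X = 0), ..., P(X = n_max)] for X ~ pi(c)."""
    return [math.exp(-c) * c**k / math.factorial(k) for k in range(n_max + 1)]

def convolve(p, q, n_max):
    """Law of the sum of independent variables on {0, 1, ...}, kept up to n_max."""
    out = [0.0] * (n_max + 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j <= n_max:
                out[i + j] += pi * qj
    return out

c, n, n_max = 3.0, 4, 30
root = poisson_pmf(c / n, n_max)       # candidate convolution nth root pi(c/n)
law = [1.0] + [0.0] * n_max            # delta_0, the identity for convolution
for _ in range(n):
    law = convolve(law, root, n_max)   # after the loop: n-fold convolution of root

target = poisson_pmf(c, n_max)
max_err = max(abs(a - b) for a, b in zip(law, target))
print(max_err)  # difference between the two laws on {0, ..., n_max}
```

Truncating at `n_max` loses nothing for the compared entries, since the mass at k only depends on the masses of the factors at points up to k.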
Example 3 - Compound Poisson Random Variables
Let $(Z(n), n \in \mathbb{N})$ be a sequence of i.i.d. random variables taking values in $\mathbb{R}^d$ with common law $\mu_Z$ and let N ∼ π(c) be a Poisson random variable which is independent of all the Z(n)’s. The compound Poisson random variable X is defined as follows:
$$X = Z(1) + \cdots + Z(N).$$
Proposition 1.1 For each $u \in \mathbb{R}^d$,
$$\phi_X(u) = \exp\left[\int_{\mathbb{R}^d} (e^{i(u,y)} - 1)\, c\mu_Z(dy)\right].$$
Proof. Let $\phi_Z$ be the common characteristic function of the Z(n)’s. By conditioning and using independence we find,
$$\begin{aligned}
\phi_X(u) &= \sum_{n=0}^{\infty} E\left(e^{i(u, Z(1) + \cdots + Z(N))} \,\middle|\, N = n\right) P(N = n) \\
&= \sum_{n=0}^{\infty} E\left(e^{i(u, Z(1) + \cdots + Z(n))}\right) e^{-c} \frac{c^n}{n!} \\
&= e^{-c} \sum_{n=0}^{\infty} \frac{[c\phi_Z(u)]^n}{n!} \\
&= \exp[c(\phi_Z(u) - 1)],
\end{aligned}$$
and the result follows on writing $\phi_Z(u) = \int_{\mathbb{R}^d} e^{i(u,y)}\, \mu_Z(dy)$.
If X is compound Poisson as above, we write $X \sim \pi(c, \mu_Z)$. It is clearly infinitely divisible with each $Y_j^{(n)} \sim \pi\left(\frac{c}{n}, \mu_Z\right)$, for 1 ≤ j ≤ n.
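Proposition 1.1 lends itself to a quick Monte Carlo check. The sketch below samples a compound Poisson random variable in d = 1 and compares its empirical characteristic function with the closed form. The choice of standard normal jumps (so that $\phi_Z(u) = e^{-u^2/2}$), the parameter values and the function names are assumptions made for illustration only.

```python
import math
import random

def sample_poisson(c, rng):
    """Sample N ~ pi(c) by counting rate-1 exponential arrivals in [0, c]."""
    n, t = 0, rng.expovariate(1.0)
    while t < c:
        n += 1
        t += rng.expovariate(1.0)
    return n

def sample_compound_poisson(c, rng):
    """Sample X = Z(1) + ... + Z(N) with N ~ pi(c) and Z(k) ~ N(0, 1) (assumed jump law)."""
    return sum(rng.gauss(0.0, 1.0) for _ in range(sample_poisson(c, rng)))

rng = random.Random(12345)
c, u, n_samples = 2.0, 1.0, 20000
xs = [sample_compound_poisson(c, rng) for _ in range(n_samples)]

# Empirical characteristic function; by symmetry of Z its imaginary part is near 0,
# so only the real part (the mean of cos(u X)) is computed.
empirical = sum(math.cos(u * x) for x in xs) / n_samples
# Proposition 1.1 with phi_Z(u) = exp(-u^2 / 2) for standard normal jumps.
exact = math.exp(c * (math.exp(-u * u / 2) - 1))
print(empirical, exact)
```

The two printed numbers should agree up to a Monte Carlo error of order $n^{-1/2}$ in the number of samples.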
1.2.2 The Lévy-Khintchine Formula
de Finetti (1920’s) suggested that the most general infinitely divisible random variable could be written X = Y + W, where Y and W are independent, Y ∼ N(m, A), $W \sim \pi(c, \mu_Z)$. Then $\phi_X(u) = e^{\eta(u)}$, where
$$\eta(u) = i(m, u) - \frac{1}{2}(u, Au) + \int_{\mathbb{R}^d} (e^{i(u,y)} - 1)\, c\mu_Z(dy). \tag{1.3}$$
This is WRONG! $\nu(\cdot) = c\mu_Z(\cdot)$ is a finite measure here. Lévy and Khintchine showed that ν can be σ-finite, provided it is what is now called a Lévy measure on $\mathbb{R}^d - \{0\} = \{x \in \mathbb{R}^d, x \neq 0\}$, i.e.
$$\int_{\mathbb{R}^d - \{0\}} (|y|^2 \wedge 1)\, \nu(dy) < \infty, \tag{1.4}$$
(where a ∧ b := min{a, b}, for a, b ∈ R). Since $|y|^2 \wedge \epsilon \leq |y|^2 \wedge 1$ whenever 0 < ε ≤ 1, it follows from (1.4) that
$$\nu((-\epsilon, \epsilon)^c) < \infty \text{ for all } \epsilon > 0.$$
Here is the fundamental result of this lecture:
Theorem 1.4 (Lévy-Khintchine) A Borel probability measure µ on $\mathbb{R}^d$ is infinitely divisible if there exists a vector $b \in \mathbb{R}^d$, a non-negative symmetric d × d matrix A and a Lévy measure ν on $\mathbb{R}^d - \{0\}$ such that for all $u \in \mathbb{R}^d$,
$$\phi_\mu(u) = \exp\left[i(b, u) - \frac{1}{2}(u, Au) + \int_{\mathbb{R}^d - \{0\}} \left(e^{i(u,y)} - 1 - i(u, y)\chi_{\hat{B}}(y)\right) \nu(dy)\right], \tag{1.5}$$
where $\hat{B} = B_1(0) = \{y \in \mathbb{R}^d; |y| < 1\}$.
Conversely, any mapping of the form (1.5) is the characteristic function of an infinitely divisible probability measure on $\mathbb{R}^d$.
The triple (b, A, ν) is called the characteristics of the infinitely divisible random variable X. Define $\eta = \log \phi_\mu$, where we take the principal part of the logarithm. η is called the Lévy symbol or the characteristic exponent.
We’re not going to prove this result here. To understand it, it is instructive to let $(U_n, n \in \mathbb{N})$ be a sequence of Borel sets in $B_1(0)$ with $U_n \downarrow \{0\}$. Observe that $\eta(u) = \lim_{n \to \infty} \eta_n(u)$, where each
$$\eta_n(u) = i\left(b - \int_{U_n^c \cap \hat{B}} y\, \nu(dy),\ u\right) - \frac{1}{2}(u, Au) + \int_{U_n^c} (e^{i(u,y)} - 1)\, \nu(dy),$$
so η is in some sense (to be made more precise later) the limit of a sequence of sums of Gaussians and independent compound Poissons. Interesting phenomena appear in the limit as we’ll see below. First, we classify Lévy symbols analytically:
Theorem 1.5 η is a Lévy symbol if and only if it is a continuous, hermitian conditionally positive definite function for which η(0) = 0.
1.2.3 Stable Laws
This is one of the most important subclasses of infinitely divisible laws. We consider the general central limit problem in dimension d = 1, so let $(Y_n, n \in \mathbb{N})$ be a sequence of real valued i.i.d. random variables and consider the rescaled partial sums
$$S_n = \frac{Y_1 + Y_2 + \cdots + Y_n - b_n}{\sigma_n},$$
where $(b_n, n \in \mathbb{N})$ is an arbitrary sequence of real numbers and $(\sigma_n, n \in \mathbb{N})$ an arbitrary sequence of positive numbers. We are interested in the case where there exists a random variable X for which
$$\lim_{n \to \infty} P(S_n \leq x) = P(X \leq x), \tag{1.6}$$
for all x ∈ R, i.e. $(S_n, n \in \mathbb{N})$ converges in distribution to X. If the $Y_n$ have mean m and variance $\sigma^2$, and each $b_n = nm$ and $\sigma_n = \sqrt{n}\sigma$, then X ∼ N(0, 1) by the usual Laplace - de Moivre central limit theorem.
More generally a random variable is said to be stable if it arises as a limit as in (1.6). It is not difficult to show that (1.6) is equivalent to the following:
There exist real valued sequences $(c_n, n \in \mathbb{N})$ and $(d_n, n \in \mathbb{N})$ with each $c_n > 0$ such that
$$X_1 + X_2 + \cdots + X_n \stackrel{d}{=} c_n X + d_n \tag{1.7}$$
where $X_1, \ldots, X_n$ are independent copies of X. X is said to be strictly stable if each $d_n = 0$.
To see that (1.7) ⇒ (1.6) take each $Y_j = X_j$, $b_n = d_n$ and $\sigma_n = c_n$. In fact it can be shown that the only possible choice of $c_n$ in (1.7) is $c_n = \sigma n^{1/\alpha}$, where 0 < α ≤ 2 and σ > 0. The parameter α plays a key role in the investigation of stable random variables and is called the index of stability.
Note that (1.7) can also be expressed in the equivalent form
$$\phi_X(u)^n = e^{iud_n} \phi_X(c_n u),$$
for each u ∈ R.
It follows immediately from (1.7) that all stable random variables are infinitely divisible and the characteristics in the Lévy-Khintchine formula are given by the following result.
Theorem 1.6 If X is a stable real-valued random variable, then its characteristics must take one of the two following forms.
1. When α = 2, ν = 0 (so X ∼ N(b, A)).
2. When α ≠ 2, A = 0 and
$$\nu(dx) = \frac{c_1}{x^{1+\alpha}} \chi_{(0,\infty)}(x)\, dx + \frac{c_2}{|x|^{1+\alpha}} \chi_{(-\infty,0)}(x)\, dx,$$
where $c_1 \geq 0$, $c_2 \geq 0$ and $c_1 + c_2 > 0$.
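To see why the measure in case 2 is a Lévy measure for 0 < α < 2, one can evaluate the integral in (1.4) explicitly: $\int (|x|^2 \wedge 1)\, \nu(dx) = (c_1 + c_2)\left(\frac{1}{2 - \alpha} + \frac{1}{\alpha}\right)$, which is finite exactly when 0 < α < 2. The sketch below compares this closed form with a simple midpoint quadrature. The parameter values, the truncation limits and the function name are illustrative assumptions.

```python
import math

def truncated_second_moment(alpha, c1, c2, lo=1e-12, hi=1e8, steps=20000):
    """Approximate the integral of min(|x|^2, 1) nu(dx) for
    nu(dx) = c1 x^{-1-alpha} dx on (0, inf) + c2 |x|^{-1-alpha} dx on (-inf, 0),
    using the midpoint rule in the variable s = log|x| on [lo, hi]."""
    def piece(a, b):
        h = (math.log(b) - math.log(a)) / steps
        total = 0.0
        for k in range(steps):
            x = math.exp(math.log(a) + (k + 0.5) * h)
            total += min(x * x, 1.0) * x ** (-1.0 - alpha) * x * h  # dx = x ds
        return total
    # Both half-lines contribute the same integral, weighted by c1 and c2.
    return (c1 + c2) * (piece(lo, 1.0) + piece(1.0, hi))

alpha, c1, c2 = 1.5, 1.0, 0.5
numeric = truncated_second_moment(alpha, c1, c2)
closed_form = (c1 + c2) * (1.0 / (2.0 - alpha) + 1.0 / alpha)
print(numeric, closed_form)
```

The term 1/(2 − α) comes from the region |x| < 1, where $|x|^2$ tames the singularity at the origin, and the term 1/α comes from the tails |x| ≥ 1.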
A careful transformation of the integrals in the Lévy-Khintchine formula gives a different form for the characteristic function which is often more convenient.
Theorem 1.7 A real-valued random variable X is stable if and only if there exist σ > 0, −1 ≤ β ≤ 1 and µ ∈ R such that for all u ∈ R,
1. $\phi_X(u) = \exp\left[i\mu u - \frac{1}{2}\sigma^2 u^2\right]$ when α = 2.
2. $\phi_X(u) = \exp\left[i\mu u - \sigma^\alpha |u|^\alpha \left(1 - i\beta\, \mathrm{sgn}(u) \tan\left(\frac{\pi\alpha}{2}\right)\right)\right]$ when α ≠ 1, 2.
3. $\phi_X(u) = \exp\left[i\mu u - \sigma |u| \left(1 + i\beta \frac{2}{\pi}\, \mathrm{sgn}(u) \log(|u|)\right)\right]$ when α = 1.
It can be shown that $E(X^2) < \infty$ if and only if α = 2 (i.e. X is Gaussian) and E(|X|) < ∞ if and only if 1 < α ≤ 2.
All stable random variables have densities $f_X$, which can in general be expressed in series form. In three important cases, there are closed forms.
1. The Normal Distribution: α = 2, $X \sim N(\mu, \sigma^2)$.
2. The Cauchy Distribution: α = 1, β = 0,
$$f_X(x) = \frac{\sigma}{\pi[(x - \mu)^2 + \sigma^2]}.$$
3. The Lévy Distribution: α = 1/2, β = 1,
$$f_X(x) = \left(\frac{\sigma}{2\pi}\right)^{1/2} \frac{1}{(x - \mu)^{3/2}} \exp\left(-\frac{\sigma}{2(x - \mu)}\right), \text{ for } x > \mu.$$
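The stability relation (1.7) can be observed by simulation in the Cauchy case. For a standard Cauchy variable (µ = 0, σ = 1) one has $c_n = n$ and $d_n = 0$, so the average of n independent copies is again standard Cauchy. The sketch below checks this through the median of the absolute value, which equals 1 for a standard Cauchy law. The sample sizes, seed and helper names are illustrative assumptions.

```python
import math
import random

def sample_cauchy(rng):
    """Standard Cauchy sample (mu = 0, sigma = 1) by inverting the distribution function."""
    return math.tan(math.pi * (rng.random() - 0.5))

def median(values):
    """Sample median of a list of numbers."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    return ordered[mid] if len(ordered) % 2 else 0.5 * (ordered[mid - 1] + ordered[mid])

rng = random.Random(2005)
n, n_samples = 5, 20000

# (X_1 + ... + X_n) / n should again be standard Cauchy, since c_n = n when alpha = 1.
rescaled_sums = [sum(sample_cauchy(rng) for _ in range(n)) / n for _ in range(n_samples)]

# The median of |X| equals 1 for a standard Cauchy X, because P(|X| <= 1) = 1/2.
med_abs = median([abs(s) for s in rescaled_sums])
print(med_abs)
```

Medians are used instead of means or variances because a Cauchy random variable has no finite first moment. For a law with finite variance, averaging n copies would instead shrink the spread by a factor $\sqrt{n}$.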
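In general the series representations are given in terms of a real valued parameter λ.
For x > 0 and 0 < α < 1:
$$f_X(x, \lambda) = \frac{1}{\pi x} \sum_{k=1}^{\infty} \frac{\Gamma(k\alpha + 1)}{k!} (-x^{-\alpha})^k \sin\left(\frac{k\pi}{2}(\lambda - \alpha)\right).$$
For x > 0 and 1 < α < 2,
$$f_X(x, \lambda) = \frac{1}{\pi x} \sum_{k=1}^{\infty} \frac{\Gamma(k\alpha^{-1} + 1)}{k!} (-x)^k \sin\left(\frac{k\pi}{2\alpha}(\lambda - \alpha)\right).$$
In each case the formula for negative x is obtained by using
$$f_X(-x, \lambda) = f_X(x, -\lambda),$$
for x > 0.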
Note that if a stable random variable is symmetric then Theorem 1.7 yields
$$\phi_X(u) = \exp(-\rho^\alpha |u|^\alpha) \text{ for all } 0 < \alpha \leq 2, \tag{1.8}$$
where ρ = σ (0 < α < 2) and $\rho = \frac{\sigma}{\sqrt{2}}$ when α = 2, and we will write X ∼ SαS in this case.
One of the reasons why stable laws are so important in applications is the nice decay properties of the tails. The case α = 2 is special in that we have exponential decay, indeed for a standard normal X there is the elementary estimate
$$P(X > y) \sim \frac{e^{-\frac{1}{2}y^2}}{\sqrt{2\pi}\, y} \text{ as } y \to \infty.$$
When α ≠ 2 we have the slower polynomial decay as expressed in the following,
$$\lim_{y \to \infty} y^\alpha P(X > y) = C_\alpha \frac{1 + \beta}{2} \sigma^\alpha,$$
$$\lim_{y \to \infty} y^\alpha P(X < -y) = C_\alpha \frac{1 - \beta}{2} \sigma^\alpha,$$
where $C_\alpha > 1$. The relatively slow decay of the tails for non-Gaussian stable laws (“heavy tails”) makes them ideally suited for modelling a wide range of interesting phenomena, some of which exhibit “long-range dependence”. Deeper mathematical investigations of heavy tails require the mathematical technique of regular variation.
The generalisation of stability to random vectors is straightforward - just replace $X_1, \ldots, X_n$, X and each $d_n$ in (1.7) by vectors and the formula in Theorem 1.6 extends directly. Note however that when α ≠ 2 in the random vector version of Theorem 1.6, the Lévy measure takes the form
$$\nu(dx) = \frac{c}{|x|^{d+\alpha}}\, dx$$
where c > 0.
We can generalise the definition of stable random variables if we weaken the conditions on the random variables $(Y_n, n \in \mathbb{N})$ in the general central limit problem by requiring these to be independent, but no longer necessarily identically distributed. In this case (subject to a technical growth restriction) the limiting random variables are called self-decomposable (or of class L) and they are also infinitely divisible. Alternatively a random variable X is self-decomposable if and only if for each 0 < a < 1, there exists a random variable $Y_a$ which is independent of X such that
$$X \stackrel{d}{=} aX + Y_a \iff \phi_X(u) = \phi_X(au)\, \phi_{Y_a}(u),$$
for all $u \in \mathbb{R}^d$. An infinitely divisible law is self-decomposable if and only if the Lévy measure is of the form:
$$\nu(dx) = \frac{k(x)}{|x|}\, dx,$$
where k is decreasing on (0, ∞) and increasing on (−∞, 0). There has recently been increasing interest in these distributions both from a theoretical and applied perspective. Examples include gamma, Pareto, Student-t, F and log-normal distributions.
2 Lévy Processes
Let X = (X(t), t ≥ 0) be a stochastic process defined on a probability space (Ω, F, P). We say that it has independent increments if for each $n \in \mathbb{N}$ and each $0 \leq t_1 < t_2 < \cdots < t_{n+1} < \infty$, the random variables $(X(t_{j+1}) - X(t_j), 1 \leq j \leq n)$ are independent, and it has stationary increments if each $X(t_{j+1}) - X(t_j) \stackrel{d}{=} X(t_{j+1} - t_j) - X(0)$.
We say that X is a Lévy process if
(L1) X(0) = 0 (a.s.),
(L2) X has independent and stationary increments,
(L3) X is stochastically continuous, i.e. for all a > 0 and for all s ≥ 0,
$$\lim_{t \to s} P(|X(t) - X(s)| > a) = 0.$$
Note that in the presence of (L1) and (L2), (L3) is equivalent to the condition
$$\lim_{t \downarrow 0} P(|X(t)| > a) = 0.$$
The sample paths of a process are the maps $t \mapsto X(t)(\omega)$ from $\mathbb{R}^+$ to $\mathbb{R}^d$, for each ω ∈ Ω.
We are now going to explore the relationship between Lévy processes and infinite divisibility.
Proposition 2.1 If X is a Lévy process, then X(t) is infinitely divisible for each t ≥ 0.
Proof. For each $n \in \mathbb{N}$, we can write
$$X(t) = Y_1^{(n)}(t) + \cdots + Y_n^{(n)}(t)$$
where each $Y_k^{(n)}(t) = X\left(\frac{kt}{n}\right) - X\left(\frac{(k-1)t}{n}\right)$. The $Y_k^{(n)}(t)$’s are i.i.d. by (L2).
By Proposition 2.1, we can write $\phi_{X(t)}(u) = e^{\eta(t,u)}$ for each t ≥ 0, $u \in \mathbb{R}^d$, where each η(t, ·) is a Lévy symbol.
Theorem 2.1 If X is a Lévy process, then
$$\phi_{X(t)}(u) = e^{t\eta(u)},$$
for each $u \in \mathbb{R}^d$, t ≥ 0, where η is the Lévy symbol of X(1).
Proof. Suppose that X is a Lévy process and for each $u \in \mathbb{R}^d$, t ≥ 0, define $\phi_u(t) = \phi_{X(t)}(u)$; then by (L2) we have for all s ≥ 0,
$$\begin{aligned}
\phi_u(t + s) &= E(e^{i(u, X(t+s))}) \\
&= E(e^{i(u, X(t+s) - X(s))} e^{i(u, X(s))}) \\
&= E(e^{i(u, X(t+s) - X(s))})\, E(e^{i(u, X(s))}) \\
&= \phi_u(t)\, \phi_u(s) \quad \ldots \text{(i)}
\end{aligned}$$
Now $\phi_u(0) = 1$ . . . (ii) by (L1), and the map $t \mapsto \phi_u(t)$ is continuous by (L3). However the unique continuous solution to (i) and (ii) is given by $\phi_u(t) = e^{t\alpha(u)}$, where $\alpha : \mathbb{R}^d \to \mathbb{C}$. Now by Proposition 2.1, X(1) is infinitely divisible, hence α is a Lévy symbol and the result follows.
We now have the Lévy-Khintchine formula for a Lévy process X = (X(t), t ≥ 0):
$$E(e^{i(u, X(t))}) = \exp\left(t\left[i(b, u) - \frac{1}{2}(u, Au) + \int_{\mathbb{R}^d - \{0\}} \left(e^{i(u,y)} - 1 - i(u, y)\chi_{\hat{B}}(y)\right) \nu(dy)\right]\right), \tag{2.9}$$
for each t ≥ 0, $u \in \mathbb{R}^d$, where (b, A, ν) are the characteristics of X(1).
We will define the Lévy symbol and the characteristics of a Lévy process X to be those of the random variable X(1). We will sometimes write the former as $\eta_X$ when we want to emphasise that it belongs to the process X.
Let $p_t$ be the law of X(t), for each t ≥ 0. By (L2), we have for all s, t ≥ 0 that:
$$p_{t+s} = p_t * p_s.$$
By (L3), we have $p_t \xrightarrow{w} \delta_0$ as t → 0, i.e. $\lim_{t \to 0} \int f(x)\, p_t(dx) = f(0)$.
$(p_t, t \geq 0)$ is a weakly continuous convolution semigroup of probability measures on $\mathbb{R}^d$. Conversely, given any such semigroup, we can always construct a Lévy process on path space via Kolmogorov’s construction.
Informally, we have the following asymptotic relationship between the law of a Lévy process and its Lévy measure:
$$\nu = \lim_{t \downarrow 0} \frac{p_t}{t}.$$
More precisely
$$\lim_{t \downarrow 0} \frac{1}{t} \int_{\mathbb{R}^d} f(x)\, p_t(dx) = \int_{\mathbb{R}^d} f(x)\, \nu(dx), \tag{2.10}$$
for bounded, continuous functions f which vanish in some neighborhood of the origin.
2.1 Examples of Lévy Processes
Example 1 - Brownian Motion and Gaussian Processes
A (standard) Brownian motion in $\mathbb{R}^d$ is a Lévy process B = (B(t), t ≥ 0) for which
(B1) B(t) ∼ N(0, tI) for each t ≥ 0,
(B2) B has continuous sample paths.
It follows immediately from (B1) that if B is a standard Brownian motion, then its characteristic function is given by
$$\phi_{B(t)}(u) = \exp\left\{-\frac{1}{2} t|u|^2\right\},$$
for each $u \in \mathbb{R}^d$, t ≥ 0.
We introduce the marginal processes Bi = (Bi(t), t ≥ 0) where each Bi(t)
is the ith component of B(t), then it is not difficult to verify that the Bi’s
are mutually independent Brownian motions in R. We will call these one-
dimensional Brownian motions in the sequel.
Brownian motion has been the most intensively studied Lévy process.
In the early years of the twentieth century, it was introduced as a model for
the physical phenomenon of Brownian motion by Einstein and Smoluchowski
and as a description of the dynamical evolution of stock prices by Bachelier.
The theory was placed on a rigorous mathematical basis by Norbert Wiener
in the 1920’s.
We could try to use the Kolmogorov existence theorem to construct one-dimensional Brownian motion from the following prescription on cylinder sets of the form $I^H_{t_1, \ldots, t_n} = \{\omega \in \Omega; \omega(t_1) \in [a_1, b_1], \ldots, \omega(t_n) \in [a_n, b_n]\}$ where $H = [a_1, b_1] \times \cdots \times [a_n, b_n]$ and we have taken Ω to be the set of all mappings from $\mathbb{R}^+$ to R:
$$P(I^H_{t_1, \ldots, t_n}) = \int_H \frac{1}{(2\pi)^{n/2} \sqrt{t_1 (t_2 - t_1) \cdots (t_n - t_{n-1})}} \exp\left(-\frac{1}{2}\left(\frac{x_1^2}{t_1} + \frac{(x_2 - x_1)^2}{t_2 - t_1} + \cdots + \frac{(x_n - x_{n-1})^2}{t_n - t_{n-1}}\right)\right) dx_1 \cdots dx_n.$$
However there is then no guarantee that the paths are continuous. The literature contains a number of ingenious methods for constructing Brownian motion. One of the most delightful of these (originally due to Paley and Wiener) obtains this, in the case d = 1, as a random Fourier series
$$B(t) = \frac{\sqrt{2}}{\pi} \sum_{n=0}^{\infty} \frac{\sin\left(\pi t\left(n + \frac{1}{2}\right)\right)}{n + \frac{1}{2}}\, \xi(n),$$
for each t ≥ 0, where $(\xi(n), n \in \mathbb{N} \cup \{0\})$ is a sequence of i.i.d. N(0, 1) random variables.
We list a number of useful properties of Brownian motion in the case d = 1.
• Brownian motion is locally Hölder continuous with exponent α for every $0 < \alpha < \frac{1}{2}$, i.e. for every T > 0, ω ∈ Ω there exists K = K(T, ω) such that
$$|B(t)(\omega) - B(s)(\omega)| \leq K|t - s|^\alpha,$$
for all 0 ≤ s < t ≤ T.
• The sample paths $t \mapsto B(t)(\omega)$ are almost surely nowhere differentiable.
• For any sequence $(t_n, n \in \mathbb{N})$ in $\mathbb{R}^+$ with $t_n \uparrow \infty$,
$$\liminf_{n \to \infty} B(t_n) = -\infty \text{ a.s.}, \qquad \limsup_{n \to \infty} B(t_n) = \infty \text{ a.s.}$$
• The law of the iterated logarithm:
$$P\left(\limsup_{t \downarrow 0} \frac{B(t)}{\left(2t \log\left(\log\left(\frac{1}{t}\right)\right)\right)^{1/2}} = 1\right) = 1.$$
Figure 1. Simulation of standard Brownian motion.
Let A be a non-negative symmetric d × d matrix and let σ be a square root of A so that σ is a d × m matrix for which $\sigma\sigma^T = A$. Now let $b \in \mathbb{R}^d$ and let B be a Brownian motion in $\mathbb{R}^m$. We construct a process C = (C(t), t ≥ 0) in $\mathbb{R}^d$ by
$$C(t) = bt + \sigma B(t), \tag{2.11}$$
then C is a Lévy process with each C(t) ∼ N(tb, tA). It is not difficult to verify that C is also a Gaussian process, i.e. all its finite dimensional distributions are Gaussian. It is sometimes called Brownian motion with drift. The Lévy symbol of C is
$$\eta_C(u) = i(b, u) - \frac{1}{2}(u, Au).$$
In fact a Lévy process has continuous sample paths if and only if it is of the form (2.11).
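A path of Brownian motion with drift can be approximated on a time grid by summing independent Gaussian increments, which is exactly what (L2) and C(t) ∼ N(tb, tA) suggest. The following sketch does this for d = m = 1 and checks the mean and variance of C(1) empirically. The parameter values, grid size and function name are illustrative assumptions.

```python
import math
import random

def simulate_bm_with_drift(b, sigma, t_end, n_steps, rng):
    """Return C(0), C(h), ..., C(t_end) for C(t) = b t + sigma B(t), d = m = 1,
    built from independent N(b h, sigma^2 h) increments with h = t_end / n_steps."""
    h = t_end / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + b * h + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0))
    return path

rng = random.Random(7)
b, sigma, t_end = 0.5, 2.0, 1.0
endpoints = [simulate_bm_with_drift(b, sigma, t_end, 50, rng)[-1] for _ in range(4000)]

mean = sum(endpoints) / len(endpoints)
var = sum((x - mean) ** 2 for x in endpoints) / (len(endpoints) - 1)
print(mean, var)  # should be close to t b = 0.5 and t A = sigma^2 = 4.0
```

Because the increments are exactly Gaussian, the grid values have the correct joint law; only the behaviour between grid points is left unspecified.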
Example 2 - The Poisson Process
The Poisson process of intensity λ > 0 is a Lévy process N taking values in $\mathbb{N} \cup \{0\}$ wherein each N(t) ∼ π(λt) so we have
$$P(N(t) = n) = \frac{(\lambda t)^n}{n!} e^{-\lambda t},$$
for each n = 0, 1, 2, . . .. The Poisson process is widely used in applications and there is a wealth of literature concerning it and its generalisations. We define non-negative random variables $(T_n, n \in \mathbb{N} \cup \{0\})$ (usually called waiting times) by $T_0 = 0$ and for $n \in \mathbb{N}$,
$$T_n = \inf\{t \geq 0; N(t) = n\},$$
then it is well known that the $T_n$’s are gamma distributed. Moreover, the inter-arrival times $T_n - T_{n-1}$ for $n \in \mathbb{N}$ are i.i.d. and each has exponential distribution with mean $\frac{1}{\lambda}$. The sample paths of N are clearly piecewise constant with “jump” discontinuities of size 1 at each of the random times $(T_n, n \in \mathbb{N})$.
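The description in terms of inter-arrival times gives a direct way to simulate the process: add up i.i.d. exponential variables with mean 1/λ to obtain the jump times. The sketch below does this and checks that N(t) has mean and variance close to λt. The seed, sample size and function names are illustrative assumptions.

```python
import random

def poisson_jump_times(lam, t_end, rng):
    """Jump times T_1 < T_2 < ... <= t_end, built from i.i.d. exponential
    inter-arrival times with mean 1 / lam."""
    times, t = [], rng.expovariate(lam)
    while t <= t_end:
        times.append(t)
        t += rng.expovariate(lam)
    return times

def count_at(times, t):
    """N(t): the number of jumps that have occurred by time t."""
    return sum(1 for s in times if s <= t)

rng = random.Random(42)
lam, t_end, n_paths = 0.5, 20.0, 5000
counts = [count_at(poisson_jump_times(lam, t_end, rng), t_end) for _ in range(n_paths)]

mean = sum(counts) / n_paths
var = sum((k - mean) ** 2 for k in counts) / (n_paths - 1)
print(mean, var)  # both should be close to lam * t_end = 10
```

The same list of jump times determines the whole path, since N is constant between jumps.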
Figure 2. Simulation of a Poisson process (λ = 0.5). Horizontal axis: Time; vertical axis: Number.
For later work it is useful to introduce the compensated Poisson process $\tilde{N} = (\tilde{N}(t), t \geq 0)$ where each $\tilde{N}(t) = N(t) - \lambda t$. Note that $E(\tilde{N}(t)) = 0$ and $E(\tilde{N}(t)^2) = \lambda t$ for each t ≥ 0.
Example 3 - The Compound Poisson Process
Let $(Z(n), n \in \mathbb{N})$ be a sequence of i.i.d. random variables taking values in $\mathbb{R}^d$ with common law $\mu_Z$ and let N be a Poisson process of intensity λ which is independent of all the Z(n)’s. The compound Poisson process Y is defined as follows:
$$Y(t) = Z(1) + \cdots + Z(N(t)), \tag{2.12}$$
for each t ≥ 0, so each $Y(t) \sim \pi(\lambda t, \mu_Z)$.
By Proposition 1.1 we see that Y has Lévy symbol
$$\eta_Y(u) = \int_{\mathbb{R}^d} (e^{i(u,y)} - 1)\, \lambda\mu_Z(dy).$$
Again the sample paths of Y are piecewise constant with “jump discontinuities” at the random times $(T(n), n \in \mathbb{N})$, however this time the size of the jumps is itself random, and the jump at T(n) can be any value in the range of the random variable Z(n).
Figure 3. Simulation of a compound Poisson process with N(0, 1) summands (λ = 1). Horizontal axis: Time; vertical axis: Path.
Example 4 - Interlacing Processes
Let C be a Gaussian Lévy process as in Example 1 and Y be a compound Poisson process as in Example 3, which is independent of C. Define a new process X by
$$X(t) = C(t) + Y(t),$$
for all t ≥ 0, then it is not difficult to verify that X is a Lévy process with Lévy symbol of the form (1.3). Using the notation of Examples 2 and 3, we see that the paths of X have jumps of random size occurring at random times. In fact we have,
$$X(t) = \begin{cases}
C(t) & \text{for } 0 \leq t < T_1, \\
C(T_1) + Z_1 & \text{when } t = T_1, \\
X(T_1) + C(t) - C(T_1) & \text{for } T_1 < t < T_2, \\
X(T_2-) + Z_2 & \text{when } t = T_2,
\end{cases}$$
and so on recursively. We call this procedure an interlacing as a continuous path process is “interlaced” with random jumps. From the remarks after Theorem 1.4, it seems reasonable that the most general Lévy process might arise as the limit of a sequence of such interlacings, and this can be established rigorously.
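The recursive description of an interlacing translates into a short simulation: evolve the continuous part on a grid and add a jump whenever a jump time of the Poisson process falls in the current grid cell. The sketch below works in d = 1 and assumes N(0, 1) jumps purely for illustration; the function name, parameters and grid are hypothetical choices.

```python
import math
import random

def simulate_interlacing(b, sigma, lam, t_end, n_steps, rng):
    """Grid approximation of X(t) = C(t) + Y(t) in d = 1, where
    C(t) = b t + sigma B(t) and Y is compound Poisson with intensity lam
    and (assumed) N(0, 1) jumps. Jump times are sampled exactly and each
    jump is attributed to the grid cell that contains it."""
    h = t_end / n_steps
    jump_times, t = [], rng.expovariate(lam)
    while t <= t_end:
        jump_times.append(t)
        t += rng.expovariate(lam)
    path, x, k = [0.0], 0.0, 0
    for step in range(1, n_steps + 1):
        x += b * h + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)  # continuous part
        while k < len(jump_times) and jump_times[k] <= step * h:
            x += rng.gauss(0.0, 1.0)                              # jump Z_k at T_k
            k += 1
        path.append(x)
    return path, jump_times

rng = random.Random(3)
path, jump_times = simulate_interlacing(b=0.2, sigma=1.0, lam=1.0, t_end=10.0,
                                        n_steps=1000, rng=rng)
print(len(path), len(jump_times))
```

Refining the grid moves each jump closer to its true time, while the continuous part is simulated exactly at the grid points.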
Example 5 - Stable Lévy Processes
A stable Lévy process is a Lévy process X in which the Lévy symbol is given by Theorem 1.6. So, in particular, each X(t) is a stable random variable. Of particular interest is the rotationally invariant case whose Lévy symbol is given by
$$\eta(u) = -\sigma^\alpha |u|^\alpha,$$
where α is the index of stability (0 < α ≤ 2). One of the reasons why these are important in applications is that they display self-similarity. In general, a stochastic process Y = (Y(t), t ≥ 0) is self-similar with Hurst index H > 0 if the two processes (Y(at), t ≥ 0) and $(a^H Y(t), t \geq 0)$ have the same finite-dimensional distributions for all a ≥ 0. By examining characteristic functions, it is easily verified that a rotationally invariant stable Lévy process is self-similar with Hurst index $H = \frac{1}{\alpha}$, so that e.g. Brownian motion is self-similar with $H = \frac{1}{2}$. A Lévy process X is self-similar if and only if each X(t) is strictly stable.
Figure 4. Simulation of the Cauchy process.
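2.2 Densities of Lévy Processes
Question: When does a Lévy process have a density $f_t$ for all t > 0 so that for all Borel sets B:
$$P(X(t) \in B) = p_t(B) = \int_B f_t(x)\, dx?$$
In general, a random variable has a continuous density if its characteristic function is integrable and in this case, the density is the Fourier transform of the characteristic function. So for Lévy processes, if for all t > 0,
$$\int_{\mathbb{R}^d} |e^{t\eta(u)}|\, du = \int_{\mathbb{R}^d} e^{t\Re(\eta(u))}\, du < \infty$$
we have
$$f_t(x) = (2\pi)^{-d} \int_{\mathbb{R}^d} e^{t\eta(u) - i(u,x)}\, du.$$
For example (d = 1) if X is α-stable, it has a density since for all 1 ≤ α ≤ 2:
$$\int_{\mathbb{R}} e^{-t|u|^\alpha}\, du \leq 2 + \int_{|u| > 1} e^{-t|u|}\, du < \infty,$$
(using $e^{-t|u|^\alpha} \leq 1$ for |u| ≤ 1 and $|u|^\alpha \geq |u|$ for |u| > 1), and for 0 < α < 1:
$$\int_{\mathbb{R}} e^{-t|u|^\alpha}\, du = \frac{2}{\alpha} \int_0^\infty e^{-ty} y^{\frac{1}{\alpha} - 1}\, dy < \infty.$$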
In d = 1 the following result giving a condition to have a density in terms of the Lévy measure is due to Orey:
Theorem 2.2 A Lévy process X has a smooth density $f_t$ for all t > 0 if
$$\liminf_{r \downarrow 0} \frac{1}{r^{2-\beta}} \int_{-r}^{r} x^2\, \nu(dx) > 0,$$
for some 0 < β < 2.
A Lévy process has a Lévy density $g_\nu$ if its Lévy measure ν is absolutely continuous with respect to Lebesgue measure, then $g_\nu$ is defined to be the Radon-Nikodym derivative $\frac{d\nu}{dx}$.
Example. Let X be a compound Poisson process with each $X(t) = Y_1 + Y_2 + \cdots + Y_{N(t)}$ wherein each $Y_j$ has a density $f_Y$, then $g_\nu = \lambda f_Y$ is the Lévy density.
We have $p_t(A) = e^{-\lambda t}\delta_0(A) + \int_A f_t^{ac}(x)\, dx$, where for x ≠ 0
$$f_t^{ac}(x) = e^{-\lambda t} \sum_{n=1}^{\infty} \frac{(\lambda t)^n}{n!} f_Y^{*n}(x).$$
$f_t^{ac}(x)$ is the density of X conditioned on the fact that it jumps at least once between 0 and t. In this case, (2.10) takes the precise form (for x ≠ 0)
$$g_\nu(x) = \lim_{t \downarrow 0} \frac{f_t^{ac}(x)}{t}.$$
2.3 Subordinators
A subordinator is a one-dimensional Lévy process which is increasing a.s. Such processes can be thought of as a random model of time evolution, since if T = (T(t), t ≥ 0) is a subordinator we have
T(t) ≥ 0 for each t > 0 a.s.
and $T(t_1) \leq T(t_2)$ whenever $t_1 \leq t_2$ a.s.
Now since for X(t) ∼ N(0, At) we have $P(X(t) \geq 0) = P(X(t) \leq 0) = \frac{1}{2}$, it is clear that such a process cannot be a subordinator. More generally we have
Theorem 2.3 If T is a subordinator then its Lévy symbol takes the form
$$\eta(u) = ibu + \int_{(0,\infty)} (e^{iuy} - 1)\, \lambda(dy), \tag{2.13}$$
where b ≥ 0, and the Lévy measure λ satisfies the additional requirements
$$\lambda(-\infty, 0) = 0 \quad \text{and} \quad \int_{(0,\infty)} (y \wedge 1)\, \lambda(dy) < \infty.$$
Conversely, any mapping from R → C of the form (2.13) is the Lévy symbol of a subordinator.
We call the pair (b, λ) the characteristics of the subordinator T.
Now for each t ≥ 0, the map $u \mapsto E(e^{iuT(t)})$ can be analytically continued to the region {iu, u > 0} and we then obtain the following expression for the Laplace transform of the distribution
$$E(e^{-uT(t)}) = e^{-t\psi(u)},$$
$$\text{where } \psi(u) = -\eta(iu) = bu + \int_{(0,\infty)} (1 - e^{-uy})\, \lambda(dy) \tag{2.14}$$
for each t, u ≥ 0. We note that this is much more useful for both theoretical and practical application than the characteristic function.
The function ψ is usually called the Laplace exponent of the subordinator.
Examples
(1) The Poisson Case
Poisson processes are clearly subordinators. More generally a compound Poisson process will be a subordinator if and only if the Z(n)’s in equation (2.12) are all $\mathbb{R}^+$ valued.
(2) α-Stable Subordinators
Using straightforward calculus, we find that for 0 < α < 1, u ≥ 0,
$$u^\alpha = \frac{\alpha}{\Gamma(1 - \alpha)} \int_0^\infty (1 - e^{-ux}) \frac{dx}{x^{1+\alpha}}.$$
Hence by (2.14), Theorem 2.3 and Theorem 1.6, we see that for each 0 < α < 1 there exists an α-stable subordinator T with Laplace exponent
$$\psi(u) = u^\alpha,$$
and the characteristics of T are (0, λ) where $\lambda(dx) = \frac{\alpha}{\Gamma(1 - \alpha)} \frac{dx}{x^{1+\alpha}}$.
Note that when we analytically continue this to obtain the Lévy symbol we obtain the form given in Theorem 1.7(2) with µ = 0, β = 1 and $\sigma^\alpha = \cos\left(\frac{\alpha\pi}{2}\right)$.
(3) The Lévy Subordinator
The $\frac{1}{2}$-stable subordinator has a density given by the Lévy distribution (with µ = 0 and $\sigma = \frac{t^2}{2}$)
$$f_{T(t)}(s) = \frac{t}{2\sqrt{\pi}}\, s^{-3/2} e^{-\frac{t^2}{4s}},$$
for s ≥ 0. The Lévy subordinator has a nice probabilistic interpretation as a first hitting time for one-dimensional standard Brownian motion (B(t), t ≥ 0), more precisely
$$T(t) = \inf\left\{s > 0; B(s) = \frac{t}{\sqrt{2}}\right\}. \tag{2.15}$$
To show directly that for each t ≥ 0,
$$E(e^{-uT(t)}) = \int_0^\infty e^{-us} f_{T(t)}(s)\, ds = e^{-tu^{1/2}},$$
write $g_t(u) = E(e^{-uT(t)})$. Differentiate with respect to u and make the substitution $x = \frac{t^2}{4us}$ to obtain the differential equation $g_t'(u) = -\frac{t}{2\sqrt{u}}\, g_t(u)$. Via the substitution $y = \frac{t}{2\sqrt{s}}$ we see that $g_t(0) = 1$ and the result follows.
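The Laplace transform identity for the Lévy subordinator can also be confirmed numerically. The sketch below integrates $e^{-us} f_{T(t)}(s)$ with a midpoint rule in the variable x = log s and compares the result with $e^{-t\sqrt{u}}$. The values of t and u, the integration limits and the function names are illustrative assumptions.

```python
import math

def levy_subordinator_density(s, t):
    """Density of T(t) for the Levy subordinator: t / (2 sqrt(pi)) s^{-3/2} exp(-t^2 / (4 s))."""
    return t / (2.0 * math.sqrt(math.pi)) * s ** -1.5 * math.exp(-t * t / (4.0 * s))

def laplace_transform(t, u, lo=-12.0, hi=8.0, steps=20000):
    """Midpoint rule for the integral of e^{-u s} f_{T(t)}(s) ds, in the variable x = log s."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        s = math.exp(lo + (k + 0.5) * h)
        total += math.exp(-u * s) * levy_subordinator_density(s, t) * s * h  # ds = s dx
    return total

t, u = 1.5, 2.0
numeric = laplace_transform(t, u)
exact = math.exp(-t * math.sqrt(u))
print(numeric, exact)
```

The logarithmic change of variable is a convenience: the integrand decays extremely fast at both ends of the range, so a plain midpoint rule is already accurate.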
(4) Inverse Gaussian Subordinators
We generalise the Lévy subordinator by replacing Brownian motion by the Gaussian process C = (C(t), t ≥ 0) where each C(t) = B(t) + µt and µ ∈ R. The inverse Gaussian subordinator is defined by
$$T(t) = \inf\{s > 0; C(s) = \delta t\}$$
where δ > 0 and is so-called since $t \mapsto T(t)$ is the generalised inverse of a Gaussian process.
Using martingale methods, we can show that for each t, u > 0,
$$E(e^{-uT(t)}) = e^{-t\delta\left(\sqrt{2u + \mu^2} - \mu\right)}. \tag{2.16}$$
In fact each T(t) has a density:
$$f_{T(t)}(s) = \frac{\delta t}{\sqrt{2\pi}}\, e^{\delta t\mu} s^{-3/2} \exp\left\{-\frac{1}{2}\left(t^2\delta^2 s^{-1} + \mu^2 s\right)\right\}, \tag{2.17}$$
for each s, t ≥ 0.
In general any random variable with density $f_{T(1)}$ is called an inverse Gaussian and denoted as IG(δ, µ).
(5) Gamma Subordinators
Let (T (t), t ≥ 0) be a gamma process with parameters a, b > 0 so that
each T (t) has density
bat
fT(t)(x) =
xat−1e−bx,
Γ(at)
for x ≥ 0; then it is easy to verify that for each u ≥ 0,
∞
u −at
u
e−uxfT(t)(x)dx = 1 +
= exp −ta log 1 +
.
0
b
b
From here it is a straightforward exercise in calculus to show that
$$\int_0^\infty e^{-ux} f_{T(t)}(x)\,dx = \exp\left\{-t \int_0^\infty (1 - e^{-ux})\, a x^{-1} e^{-bx}\,dx\right\}.$$
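From this we see that (T(t), t ≥ 0) is a subordinator with drift b = 0 (the first of the characteristics (b, λ), not the gamma parameter b) and Lévy measure $\lambda(dx) = a x^{-1} e^{-bx}\,dx$. Moreover $\psi(u) = a \log\left(1 + \frac{u}{b}\right)$ is the associated Bernstein function (see below).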
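One possible route through the calculus exercise for the gamma subordinator is to differentiate under the integral sign. The name I(u) is introduced only for this sketch.

```latex
\begin{aligned}
I(u) &:= \int_0^\infty (1 - e^{-ux})\, a x^{-1} e^{-bx}\, dx, \qquad I(0) = 0, \\
I'(u) &= a \int_0^\infty e^{-(u+b)x}\, dx = \frac{a}{u+b}, \\
I(u) &= \int_0^u \frac{a}{v+b}\, dv = a \log\left(1 + \frac{u}{b}\right),
\end{aligned}
```

so that $\exp\{-t I(u)\} = \left(1 + \frac{u}{b}\right)^{-at}$, in agreement with the Laplace transform computed directly from the gamma density.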
Figure 5. Simulation of a gamma subordinator.
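A path like the one in Figure 5 can be produced on a time grid by summing independent gamma increments, since the increment of the gamma process over an interval of length dt has a gamma law with shape a·dt and rate b. The following is a minimal sketch under that assumption; the function name, grid and parameters are illustrative, and it is not necessarily the method used to draw the original figure.

```python
import random

def simulate_gamma_subordinator(a, b, horizon, steps, rng):
    """Return grid times and values of a gamma process with parameters a, b.

    Increments over a step of length dt are independent Gamma(shape=a*dt, rate=b).
    """
    dt = horizon / steps
    times = [0.0]
    path = [0.0]
    for k in range(1, steps + 1):
        # random.gammavariate takes (shape, scale); the scale is 1 / rate.
        increment = rng.gammavariate(a * dt, 1.0 / b)
        times.append(k * dt)
        path.append(path[-1] + increment)
    return times, path
```

For example, `simulate_gamma_subordinator(1.0, 1.0, 5.0, 50, random.Random(1))` returns a non-decreasing path on [0, 5]; averaging the terminal values over many paths should give roughly a·horizon/b.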
Before we go further into the probabilistic properties of subordinators
we’ll make a quick diversion into analysis.
Let $f \in C^{\infty}((0, \infty))$. We say it is completely monotone if $(-1)^n f^{(n)} \geq 0$ for all n ∈ N, and a Bernstein function if f ≥ 0 and $(-1)^n f^{(n)} \leq 0$ for all n ∈ N. We then have the following:
Theorem 2.4
1. f is a Bernstein function if and only if the mapping $x \mapsto e^{-tf(x)}$ is completely monotone for all t ≥ 0.
2. f is a Bernstein function if and only if it has the representation
$$f(x) = a + bx + \int_0^\infty (1 - e^{-yx})\,\lambda(dy),$$
for all x > 0 where a, b ≥ 0 and $\int_0^\infty (y \wedge 1)\,\lambda(dy) < \infty$.
3. g is completely monotone if and only if there exists a measure µ on [0, ∞) for which
$$g(x) = \int_0^\infty e^{-xy}\,\mu(dy).$$
To interpret this theorem, first consider the case a = 0. In this case, if
we compare the statement of Theorem 2.4 with equation (2.14), we see that
there is a one to one correspondence between Bernstein functions for which
$\lim_{x \to 0} f(x) = 0$ and Laplace exponents of subordinators. The Laplace transforms of the laws of subordinators are always completely monotone functions and a subclass of all possible measures µ appearing in Theorem 2.4 (3) is given by all possible laws $p_{T(t)}$ associated to subordinators. A general Bernstein
function with a > 0 can be given a probabilistic interpretation by means of
“killing”.
One of the most important probabilistic applications of subordinators is
to “time change”. Let X be an arbitrary Lévy process and let T be a
subordinator defined on the same probability space as X such that X and
T are independent. We define a new stochastic process Z = (Z(t), t ≥ 0) by
the prescription
Z(t) = X(T (t)),
for each t ≥ 0 so that for each ω ∈ Ω, Z(t)(ω) = X(T (t)(ω))(ω). The key
result is then the following.
Theorem 2.5 Z is a Lévy process.
We compute the Lévy symbol of the subordinated process Z.
Proposition 2.2
$$\eta_Z = -\psi_T \circ (-\eta_X).$$
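Proof. For each $u \in \mathbb{R}^d$, t ≥ 0, using the independence of X and T,
$$\begin{aligned}
e^{t\eta_Z(u)} = E(e^{i(u,Z(t))}) &= E(e^{i(u,X(T(t)))}) \\
&= \int E(e^{i(u,X(s))})\,p_{T(t)}(ds) \\
&= \int e^{s\eta_X(u)}\,p_{T(t)}(ds) \\
&= E(e^{-(-\eta_X(u))T(t)}) \\
&= e^{-t\psi_T(-\eta_X(u))}.
\end{aligned}$$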
Example: From Brownian Motion to 2α-stable Processes
Let T be an α-stable subordinator (with 0 < α < 1) and X be a d-dimensional Brownian motion with covariance A = 2I, which is independent of T. Then for each s ≥ 0, $u \in \mathbb{R}^d$, $\psi_T(s) = s^{\alpha}$ and $\eta_X(u) = -|u|^2$, and hence $\eta_Z(u) = -|u|^{2\alpha}$, i.e. Z is a rotationally invariant 2α-stable process.
In particular, if d = 1 and T is the Lévy subordinator, then Z is the Cauchy process, so each Z(t) has a symmetric Cauchy distribution with parameters µ = 0 and σ = t (in particular σ = 1 when t = 1). It is interesting to observe from (2.15) that Z is constructed from two independent standard Brownian motions.
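The construction of the Cauchy process from two independent Brownian motions can be illustrated by simulation. With α = 1/2 the example gives η_Z(u) = −|u|, so E(cos(uZ(t))) should equal e^{−t|u|}. The sketch below assumes, as a standard fact not derived in these notes, that the hitting time of level a by standard Brownian motion has the law of a²/Z² with Z standard normal; all function names and the sample size are illustrative.

```python
import math
import random

def sample_subordinated_bm(t, rng):
    """Draw Z(t) = X(T(t)) for d = 1, with X of variance 2s and T from (2.15)."""
    z1 = rng.gauss(0.0, 1.0)
    while z1 * z1 == 0.0:  # avoid division by zero (a probability-zero event)
        z1 = rng.gauss(0.0, 1.0)
    # Assumed law of the hitting time of level t / sqrt(2): (t^2 / 2) / z1^2.
    hitting_time = t * t / (2.0 * z1 * z1)
    # Given T(t) = s, independence gives X(s) ~ N(0, 2s).
    z2 = rng.gauss(0.0, 1.0)
    return math.sqrt(2.0 * hitting_time) * z2

def empirical_char_fn(t, u, n=200_000, seed=0):
    """Monte Carlo estimate of E(cos(u Z(t))), the (real) characteristic function."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += math.cos(u * sample_subordinated_bm(t, rng))
    return total / n
```

For instance, `empirical_char_fn(1.0, 1.0)` should be close to `math.exp(-1.0)`, up to Monte Carlo error.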
Examples of subordinated processes have recently found useful applica-
tions in mathematical finance. We briefly mention two interesting cases:-
(i) The Variance Gamma Process
In this case Z(t) = B(T (t)), for each t ≥ 0, where B is a standard
Brownian motion and T is an independent gamma subordinator. The name
derives from the fact that, in a formal sense, each Z(t) arises by replacing the
variance of a normal random variable by a gamma random variable. Using
Proposition 2.2, a simple calculation yields
$$\Phi_{Z(t)}(u) = \left(1 + \frac{u^2}{2b}\right)^{-at},$$
for each t ≥ 0, u ∈ R, where a and b are the usual parameters which deter-
mine the gamma process. It is an easy exercise in manipulating characteristic
functions to compute the alternative representation:
Z(t) = G(t) − L(t),
where G and L are independent gamma subordinators each with parameters $\sqrt{2b}$ and a. This yields a nice financial representation of Z as a difference of independent “gains” and “losses”. From this representation, we can compute that Z has a Lévy density
$$g_\nu(x) = \frac{a}{|x|}\left(e^{\sqrt{2b}\,x}\chi_{(-\infty,0)}(x) + e^{-\sqrt{2b}\,x}\chi_{(0,\infty)}(x)\right).$$
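The two descriptions of the variance gamma process, as a time-changed Brownian motion and as a difference of two gamma subordinators, can be compared by simulation. The sketch below estimates E(cos(uZ(t))) under each description and compares it with the closed form (1 + u²/(2b))^{−at}. The function names are hypothetical, and the gamma laws are parametrised by shape and rate, which is an assumption about how "parameters" should be read here.

```python
import math
import random

def sample_vg_subordinated(t, a, b, rng):
    """Z(t) = B(T(t)) with T(t) ~ Gamma(shape=a*t, rate=b)."""
    clock = rng.gammavariate(a * t, 1.0 / b)  # gammavariate takes (shape, scale)
    return math.sqrt(clock) * rng.gauss(0.0, 1.0)

def sample_vg_difference(t, a, b, rng):
    """Z(t) = G(t) - L(t) with G(t), L(t) ~ Gamma(shape=a*t, rate=sqrt(2b))."""
    scale = 1.0 / math.sqrt(2.0 * b)
    gains = rng.gammavariate(a * t, scale)
    losses = rng.gammavariate(a * t, scale)
    return gains - losses

def mean_cos(sampler, u, t, a, b, n=100_000, seed=0):
    """Monte Carlo estimate of E(cos(u Z(t))) for the given sampler."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += math.cos(u * sampler(t, a, b, rng))
    return total / n

def vg_char_fn(u, t, a, b):
    """Closed form (1 + u^2 / (2b))^(-a t) from the text."""
    return (1.0 + u * u / (2.0 * b)) ** (-a * t)
```

Both `mean_cos(sample_vg_subordinated, ...)` and `mean_cos(sample_vg_difference, ...)` should match `vg_char_fn` up to Monte Carlo error, reflecting the factorisation $1 + \frac{u^2}{2b} = \left(1 - \frac{iu}{\sqrt{2b}}\right)\left(1 + \frac{iu}{\sqrt{2b}}\right)$.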
The CGMY processes are a generalisation of the variance-gamma processes due to Carr, Geman, Madan and Yor. They are characterised by their Lévy density:
$$g_\nu(x) = \frac{a}{|x|^{1+\alpha}}\left(e^{b_1 x}\chi_{(-\infty,0)}(x) + e^{-b_2 x}\chi_{(0,\infty)}(x)\right),$$
where a > 0, 0 ≤ α < 2 and $b_1, b_2 \geq 0$. We obtain stable Lévy processes when $b_1 = b_2 = 0$. In fact, the CGMY processes are a subclass of the tempered stable processes. Note how the exponential dampens the effects of large jumps.
(ii) The Normal Inverse Gaussian Process
In this case Z(t) = C(T(t)) + µt for each t ≥ 0 where each C(t) = B(t) + βt, with β ∈ R. Here T is an inverse Gaussian subordinator, which is independent of B, and in which we write the parameter $\gamma = \sqrt{\alpha^2 - \beta^2}$, where α ∈ R with $\alpha^2 \geq \beta^2$. Z depends on four parameters and has characteristic function
$$\Phi_{Z(t)}(\alpha, \beta, \delta, \mu)(u) = \exp\left\{\delta t\left(\sqrt{\alpha^2 - \beta^2} - \sqrt{\alpha^2 - (\beta + iu)^2}\right) + i\mu t u\right\}$$
for each u ∈ R, t ≥ 0. Here δ > 0 is as in (2.16).
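The characteristic function of the normal inverse Gaussian process can be recovered from Proposition 2.2. The following is a sketch of that calculation, assuming that the Laplace exponent in (2.16), with γ in place of µ, may be continued analytically to the complex argument $-\eta_C(u)$.

```latex
\begin{aligned}
\eta_C(u) &= i\beta u - \tfrac{1}{2}u^2, \qquad
\psi_T(s) = \delta\left(\sqrt{2s + \gamma^2} - \gamma\right), \\
-\psi_T(-\eta_C(u)) &= \delta\left(\gamma - \sqrt{u^2 - 2i\beta u + \gamma^2}\right) \\
&= \delta\left(\sqrt{\alpha^2 - \beta^2} - \sqrt{\alpha^2 - (\beta + iu)^2}\right),
\end{aligned}
```

where the last step uses $\gamma^2 = \alpha^2 - \beta^2$ and $(\beta + iu)^2 = \beta^2 + 2i\beta u - u^2$. Adding the drift contribution $i\mu u$ and multiplying by t gives the exponent in the characteristic function of Z(t).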
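Each Z(t) has a density given by
$$f_{Z(t)}(x) = C(\alpha, \beta, \delta, \mu; t)\, q\left(\frac{x - \mu t}{\delta t}\right)^{-1} K_1\left(\delta t \alpha\, q\left(\frac{x - \mu t}{\delta t}\right)\right) e^{\beta x},$$
for each x ∈ R, where $q(x) = \sqrt{1 + x^2}$, $C(\alpha, \beta, \delta, \mu; t) = \pi^{-1} \alpha\, e^{\delta t\sqrt{\alpha^2 - \beta^2} - \beta\mu t}$ and $K_1$ is a Bessel function of the third kind.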
2.4 Filtrations, Markov Processes and Martingales
We recall the probability space $(\Omega, \mathcal{F}, P)$ which underlies our investigations. $\mathcal{F}$ contains all possible events in Ω. When we introduce the arrow of time, it is convenient to be able to consider only those events which can occur up to and including time t. We denote by $\mathcal{F}_t$ this sub-σ-algebra of $\mathcal{F}$. To be able to consider all time instants on an equal footing, we define a filtration to be an increasing family $(\mathcal{F}_t, t \geq 0)$ of sub-σ-algebras of $\mathcal{F}$, i.e.
$$0 \leq s \leq t < \infty \Rightarrow \mathcal{F}_s \subseteq \mathcal{F}_t.$$
A stochastic process X = (X(t), t ≥ 0) is adapted to the given filtration if each X(t) is $\mathcal{F}_t$-measurable. For example, any process is adapted to its natural filtration,
$$\mathcal{F}_t^X = \sigma\{X(s);\ 0 \leq s \leq t\}.$$
An adapted process X = (X(t), t ≥ 0) is a Markov process if for all $f \in B_b(\mathbb{R}^d)$, 0 ≤ s ≤ t < ∞,
$$E(f(X(t))|\mathcal{F}_s) = E(f(X(t))|X(s)) \quad \text{(a.s.)} \tag{2.18}$$
(i.e. “past” and “future” are independent, given the present).
The transition probabilities of a Markov process are $p_{s,t}(x, A) = P(X(t) \in A \,|\, X(s) = x)$, i.e. the probability that the process is in the Borel set A at time t given that it is at the point x at the earlier time s.
Theorem 2.6 If X is an adapted Lévy process wherein each X(t) has law $q_t$, then it is a Markov process with transition probabilities $p_{s,t}(x, A) = q_{t-s}(A - x)$.
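Proof. Since X(t) − X(s) is independent of $\mathcal{F}_s$ and has law $q_{t-s}$, this essentially follows from
$$\begin{aligned}
E(f(X(t))|\mathcal{F}_s) &= E(f(X(s) + X(t) - X(s))|\mathcal{F}_s) \\
&= \int_{\mathbb{R}^d} f(X(s) + y)\,q_{t-s}(dy).
\end{aligned}$$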
Now let X be an adapted process defined on a filtered probability space which also satisfies the integrability requirement E(|X(t)|) < ∞ for all t ≥ 0. We say that it is a martingale if for all 0 ≤ s < t < ∞,
$$E(X(t)|\mathcal{F}_s) = X(s) \quad \text{a.s.}$$
Note that if X is a martingale, then the map t → E(X(t)) is constant.
An adapted Lévy process with zero mean is a martingale (with respect to its natural filtration) since in this case, for 0 ≤ s ≤ t < ∞ and using the convenient notation $E_s(\cdot) = E(\cdot|\mathcal{F}_s)$:
$$\begin{aligned}
E_s(X(t)) &= E_s(X(s) + X(t) - X(s)) \\
&= X(s) + E(X(t) - X(s)) = X(s).
\end{aligned}$$
Although there is no good reason why a generic Lévy process should be a martingale (or even have finite mean), there are some important examples, e.g. the processes whose values at time t are
• σB(t) where B(t) is a standard Brownian motion, and σ is an r × d matrix.
• $\tilde{N}(t)$ where $\tilde{N}$ is a compensated Poisson process with intensity λ.
Some important martingales associated to Lévy processes include:
• $\exp\{i(u, X(t)) - t\eta(u)\}$, where $u \in \mathbb{R}^d$ is fixed.
• $|\sigma B(t)|^2 - \mathrm{tr}(A)t$ where $A = \sigma^T \sigma$.
• $\tilde{N}(t)^2 - \lambda t$.
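Since a martingale has constant mean, the two Poisson examples above imply that the compensated process Ñ(t) = N(t) − λt and the process Ñ(t)² − λt both have mean zero at every time t. The sketch below checks this necessary condition by simulation; it does not verify the full martingale property. The Poisson count is generated from exponential interarrival times, and the function names, sample size and seed are illustrative assumptions.

```python
import random

def poisson_count(lam, t, rng):
    """Number of arrivals in [0, t] when interarrival times are i.i.d. Exp(lam)."""
    count = 0
    clock = rng.expovariate(lam)  # expovariate takes the rate, so the mean is 1 / lam
    while clock <= t:
        count += 1
        clock += rng.expovariate(lam)
    return count

def martingale_means(lam, t, n=100_000, seed=0):
    """Monte Carlo estimates of E(N~(t)) and E(N~(t)^2 - lam * t)."""
    rng = random.Random(seed)
    sum_first = 0.0
    sum_second = 0.0
    for _ in range(n):
        compensated = poisson_count(lam, t, rng) - lam * t
        sum_first += compensated
        sum_second += compensated * compensated - lam * t
    return sum_first / n, sum_second / n
```

Calling `martingale_means(2.0, 1.5)` should return two numbers close to zero, with deviations of the order of the Monte Carlo error.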