Steven Shreve: Stochastic Calculus And Finance
PRASAD CHALASANI, Carnegie Mellon University, chal@cs.cmu.edu
SOMESH JHA, Carnegie Mellon University, sjha@cs.cmu.edu
THIS IS A DRAFT: PLEASE DO NOT DISTRIBUTE
© Copyright; Steven E. Shreve, 1996
July 25, 1997
Contents
1 Introduction to Probability Theory
1.1 The Binomial Asset Pricing Model
1.2 Finite Probability Spaces
1.3 Lebesgue Measure and the Lebesgue Integral
1.4 General Probability Spaces
1.5 Independence
1.5.1 Independence of sets
1.5.2 Independence of $\sigma$-algebras
1.5.3 Independence of random variables
1.5.4 Correlation and independence
1.5.5 Independence and conditional expectation
1.5.6 Law of Large Numbers
1.5.7 Central Limit Theorem
2 Conditional Expectation
2.1 A Binomial Model for Stock Price Dynamics
2.2 Information
2.3 Conditional Expectation
2.3.1 An example
2.3.2 Definition of Conditional Expectation
2.3.3 Further discussion of Partial Averaging
2.3.4 Properties of Conditional Expectation
2.3.5 Examples from the Binomial Model
2.4 Martingales
3 Arbitrage Pricing
3.1 Binomial Pricing
3.2 General one-step APT
3.3 Risk-Neutral Probability Measure
3.3.1 Portfolio Process
3.3.2 Self-financing Value of a Portfolio Process
3.4 Simple European Derivative Securities
3.5 The Binomial Model is Complete
4 The Markov Property
4.1 Binomial Model Pricing and Hedging
4.2 Computational Issues
4.3 Markov Processes
4.3.1 Different ways to write the Markov property
4.4 Showing that a process is Markov
4.5 Application to Exotic Options
5 Stopping Times and American Options
5.1 American Pricing
5.2 Value of Portfolio Hedging an American Option
5.3 Information up to a Stopping Time
6 Properties of American Derivative Securities
6.1 The properties
6.2 Proofs of the Properties
6.3 Compound European Derivative Securities
6.4 Optimal Exercise of American Derivative Security
7 Jensen’s Inequality
7.1 Jensen’s Inequality for Conditional Expectations
7.2 Optimal Exercise of an American Call
7.3 Stopped Martingales
8 Random Walks
8.1 First Passage Time
8.2 $\tau$ is almost surely finite
8.3 The moment generating function for $\tau$
8.4 Expectation of $\tau$
8.5 The Strong Markov Property
8.6 General First Passage Times
8.7 Example: Perpetual American Put
8.8 Difference Equation
8.9 Distribution of First Passage Times
8.10 The Reflection Principle
9 Pricing in terms of Market Probabilities: The Radon-Nikodym Theorem
9.1 Radon-Nikodym Theorem
9.2 Radon-Nikodym Martingales
9.3 The State Price Density Process
9.4 Stochastic Volatility Binomial Model
9.5 Another Application of the Radon-Nikodym Theorem
10 Capital Asset Pricing
10.1 An Optimization Problem
11 General Random Variables
11.1 Law of a Random Variable
11.2 Density of a Random Variable
11.3 Expectation
11.4 Two random variables
11.5 Marginal Density
11.6 Conditional Expectation
11.7 Conditional Density
11.8 Multivariate Normal Distribution
11.9 Bivariate normal distribution
11.10 MGF of jointly normal random variables
12 Semi-Continuous Models
12.1 Discrete-time Brownian Motion
12.2 The Stock Price Process
12.3 Remainder of the Market
12.4 Risk-Neutral Measure
12.5 Risk-Neutral Pricing
12.6 Arbitrage
12.7 Stalking the Risk-Neutral Measure
12.8 Pricing a European Call
13 Brownian Motion
13.1 Symmetric Random Walk
13.2 The Law of Large Numbers
13.3 Central Limit Theorem
13.4 Brownian Motion as a Limit of Random Walks
13.5 Brownian Motion
13.6 Covariance of Brownian Motion
13.7 Finite-Dimensional Distributions of Brownian Motion
13.8 Filtration generated by a Brownian Motion
13.9 Martingale Property
13.10 The Limit of a Binomial Model
13.11 Starting at Points Other Than 0
13.12 Markov Property for Brownian Motion
13.13 Transition Density
13.14 First Passage Time
14 The Itô Integral
14.1 Brownian Motion
14.2 First Variation
14.3 Quadratic Variation
14.4 Quadratic Variation as Absolute Volatility
14.5 Construction of the Itô Integral
14.6 Itô integral of an elementary integrand
14.7 Properties of the Itô integral of an elementary process
14.8 Itô integral of a general integrand
14.9 Properties of the (general) Itô integral
14.10 Quadratic variation of an Itô integral
15 Itô’s Formula
15.1 Itô’s formula for one Brownian motion
15.2 Derivation of Itô’s formula
15.3 Geometric Brownian motion
15.4 Quadratic variation of geometric Brownian motion
15.5 Volatility of Geometric Brownian motion
15.6 First derivation of the Black-Scholes formula
15.7 Mean and variance of the Cox-Ingersoll-Ross process
15.8 Multidimensional Brownian Motion
15.9 Cross-variations of Brownian motions
15.10 Multi-dimensional Itô formula
16 Markov processes and the Kolmogorov equations
16.1 Stochastic Differential Equations
16.2 Markov Property
16.3 Transition density
16.4 The Kolmogorov Backward Equation
16.5 Connection between stochastic calculus and KBE
16.6 Black-Scholes
16.7 Black-Scholes with price-dependent volatility
17 Girsanov’s theorem and the risk-neutral measure
17.1 Conditional expectations under $\widetilde{\mathbb{P}}$
17.2 Risk-neutral measure
18 Martingale Representation Theorem
18.1 Martingale Representation Theorem
18.2 A hedging application
18.3 $d$-dimensional Girsanov Theorem
18.4 $d$-dimensional Martingale Representation Theorem
18.5 Multi-dimensional market model
19 A two-dimensional market model
19.1 Hedging when $-1 < \rho < 1$
19.2 Hedging when $\rho = 1$
20 Pricing Exotic Options
20.1 Reflection principle for Brownian motion
20.2 Up and out European call
20.3 A practical issue
21 Asian Options
21.1 Feynman-Kac Theorem
21.2 Constructing the hedge
21.3 Partial average payoff Asian option
22 Summary of Arbitrage Pricing Theory
22.1 Binomial model, Hedging Portfolio
22.2 Setting up the continuous model
22.3 Risk-neutral pricing and hedging
22.4 Implementation of risk-neutral pricing and hedging
23 Recognizing a Brownian Motion
23.1 Identifying volatility and correlation
23.2 Reversing the process
24 An outside barrier option
24.1 Computing the option value
24.2 The PDE for the outside barrier option
24.3 The hedge
25 American Options
25.1 Preview of perpetual American put
25.2 First passage times for Brownian motion: first method
25.3 Drift adjustment
25.4 Drift-adjusted Laplace transform
25.5 First passage times: Second method
25.6 Perpetual American put
25.7 Value of the perpetual American put
25.8 Hedging the put
25.9 Perpetual American contingent claim
25.10 Perpetual American call
25.11 Put with expiration
25.12 American contingent claim with expiration
26 Options on dividend-paying stocks
26.1 American option with convex payoff function
26.2 Dividend paying stock
26.3 Hedging at time $t_1$
27 Bonds, forward contracts and futures
27.1 Forward contracts
27.2 Hedging a forward contract
27.3 Future contracts
27.4 Cash flow from a future contract
27.5 Forward-future spread
27.6 Backwardation and contango
28 Term-structure models
28.1 Computing arbitrage-free bond prices: first method
28.2 Some interest-rate dependent assets
28.3 Terminology
28.4 Forward rate agreement
28.5 Recovering the interest $r(t)$ from the forward rate
28.6 Computing arbitrage-free bond prices: Heath-Jarrow-Morton method
28.7 Checking for absence of arbitrage
28.8 Implementation of the Heath-Jarrow-Morton model
29 Gaussian processes
29.1 An example: Brownian Motion
30 Hull and White model
30.1 Fiddling with the formulas
30.2 Dynamics of the bond price
30.3 Calibration of the Hull & White model
30.4 Option on a bond
31 Cox-Ingersoll-Ross model
31.1 Equilibrium distribution of $r(t)$
31.2 Kolmogorov forward equation
31.3 Cox-Ingersoll-Ross equilibrium density
31.4 Bond prices in the CIR model
31.5 Option on a bond
31.6 Deterministic time change of CIR model
31.7 Calibration
31.8 Tracking down $\varphi'(0)$ in the time change of the CIR model
32 A two-factor model (Duffie & Kan)
32.1 Non-negativity of $Y$
32.2 Zero-coupon bond prices
32.3 Calibration
33 Change of numéraire
33.1 Bond price as numéraire
33.2 Stock price as numéraire
33.3 Merton option pricing formula
34 Brace-Gatarek-Musiela model
34.1 Review of HJM under risk-neutral $\mathbb{P}$
34.2 Brace-Gatarek-Musiela model
34.3 LIBOR
34.4 Forward LIBOR
34.5 The dynamics of $L(t, \delta)$
34.6 Implementation of BGM
34.7 Bond prices
34.8 Forward LIBOR under more forward measure
34.9 Pricing an interest rate caplet
34.10 Pricing an interest rate cap
34.11 Calibration of BGM
34.12 Long rates
34.13 Pricing a swap
Chapter 1
Introduction to Probability Theory
1.1 The Binomial Asset Pricing Model
The binomial asset pricing model provides a powerful tool to understand arbitrage pricing theory
and probability theory. In this course, we shall use it for both these purposes.
In the binomial asset pricing model, we model stock prices in discrete time, assuming that at each
step, the stock price will change to one of two possible values. Let us begin with an initial positive
stock price $S_0$. There are two positive numbers, $d$ and $u$, with
$$0 < d < u, \tag{1.1}$$
such that at the next period, the stock price will be either $dS_0$ or $uS_0$. Typically, we take $d$ and $u$
to satisfy $0 < d < 1 < u$, so change of the stock price from $S_0$ to $dS_0$ represents a downward
movement, and change of the stock price from $S_0$ to $uS_0$ represents an upward movement. It is
common to also have $d = 1/u$, and this will be the case in many of our examples. However, strictly
speaking, for what we are about to do we need to assume only (1.1) and (1.2) below.
Of course, stock price movements are much more complicated than indicated by the binomial asset
pricing model. We consider this simple model for three reasons. First of all, within this model the
concept of arbitrage pricing and its relation to risk-neutral pricing is clearly illuminated. Secondly,
the model is used in practice because with a sufficient number of steps, it provides a good,
computationally tractable approximation to continuous-time models. Thirdly, within the binomial model
we can develop the theory of conditional expectations and martingales which lies at the heart of
continuous-time models.
With this third motivation in mind, we develop notation for the binomial model which is a bit
different from that normally found in practice. Let us imagine that we are tossing a coin, and when
we get a “Head,” the stock price moves up, but when we get a “Tail,” the price moves down. We
denote the price at time 1 by $S_1(H) = uS_0$ if the toss results in head (H), and by $S_1(T) = dS_0$ if it
results in tail (T). After the second toss, the price will be one of:
$$S_2(HH) = uS_1(H) = u^2 S_0, \quad S_2(HT) = dS_1(H) = duS_0,$$
$$S_2(TH) = uS_1(T) = udS_0, \quad S_2(TT) = dS_1(T) = d^2 S_0.$$
After three tosses, there are eight possible coin sequences, although not all of them result in different
stock prices at time 3.
[Figure 1.1: Binomial tree of stock prices with $S_0 = 4$, $u = 1/d = 2$. Nodes: $S_0 = 4$; $S_1(H) = 8$, $S_1(T) = 2$; $S_2(HH) = 16$, $S_2(HT) = S_2(TH) = 4$, $S_2(TT) = 1$.]
For the moment, let us assume that the third toss is the last one and denote by
$$\Omega = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}$$
the set of all possible outcomes of the three tosses. The set $\Omega$ of all possible outcomes of a
random experiment is called the sample space for the experiment, and the elements $\omega$ of $\Omega$ are called
sample points. In this case, each sample point $\omega$ is a sequence of length three. We denote the $k$-th
component of $\omega$ by $\omega_k$. For example, when $\omega = HTH$, we have $\omega_1 = H$, $\omega_2 = T$ and $\omega_3 = H$.
The stock price $S_k$ at time $k$ depends on the coin tosses. To emphasize this, we often write $S_k(\omega)$.
Actually, this notation does not quite tell the whole story, for while $S_3$ depends on all of $\omega$, $S_2$
depends on only the first two components of $\omega$, $S_1$ depends on only the first component of $\omega$, and
$S_0$ does not depend on $\omega$ at all. Sometimes we will use notation such as $S_2(\omega_1, \omega_2)$ just to record more
explicitly how $S_2$ depends on $\omega = (\omega_1, \omega_2, \omega_3)$.
Example 1.1 Set $S_0 = 4$, $u = 2$ and $d = 1/2$. We have then the binomial “tree” of possible stock
prices shown in Fig. 1.1. Each sample point $\omega = (\omega_1, \omega_2, \omega_3)$ represents a path through the tree.
Thus, we can think of the sample space $\Omega$ as either the set of all possible outcomes from three coin
tosses or as the set of all possible paths through the tree.
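The correspondence between sample points and paths through the tree can be made concrete with a short sketch. It uses the values of Example 1.1; the helper name `stock_price` and the use of exact fractions are choices made here for illustration, not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Values from Example 1.1.
S0, u, d = Fraction(4), Fraction(2), Fraction(1, 2)

def stock_price(omega: str, k: int) -> Fraction:
    """Return S_k(omega), which depends only on the first k tosses of omega."""
    price = S0
    for toss in omega[:k]:
        price *= u if toss == "H" else d
    return price

# Sample space of three coin tosses: HHH, HHT, HTH, HTT, THH, THT, TTH, TTT.
sample_space = ["".join(tosses) for tosses in product("HT", repeat=3)]

# Each sample point is a path (S_0, S_1, S_2, S_3) through the tree.
for omega in sample_space:
    path = [str(stock_price(omega, k)) for k in range(4)]
    print(omega, path)

# Eight sample points, but fewer distinct prices at time 3.
distinct_time3_prices = {stock_price(omega, 3) for omega in sample_space}
print(len(sample_space), len(distinct_time3_prices))
```

The last line illustrates the earlier remark that not all eight coin sequences lead to different stock prices at time 3.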
To complete our binomial asset pricing model, we introduce a money market with interest rate $r$;
one dollar invested in the money market becomes $1 + r$ dollars in the next period. We take $r$ to be the interest
rate for both borrowing and lending. (This is not as ridiculous as it first seems, because in many
applications of the model, an agent is either borrowing or lending (not both) and knows in advance
which she will be doing; in such an application, she should take $r$ to be the rate of interest for her
activity.) We assume that
$$d < 1 + r < u. \tag{1.2}$$
The model would not make sense if we did not have this condition. For example, if $1 + r \geq u$, then
the rate of return on the money market is always at least as great as and sometimes greater than the
return on the stock, and no one would invest in the stock. The inequality $d \geq 1 + r$ cannot happen
unless either $r$ is negative (which never happens, except maybe once upon a time in Switzerland) or
$d \geq 1$. In the latter case, the stock does not really go “down” if we get a tail; it just goes up less
than if we had gotten a head. One should borrow money at interest rate $r$ and invest in the stock,
since even in the worst case, the stock price rises at least as fast as the debt used to buy it.
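A small numerical sketch of the borrow-and-buy argument may help. The parameter values below are hypothetical and chosen only so that $d \geq 1 + r$, which violates (1.2); the function names are likewise made up for this illustration.

```python
from fractions import Fraction

def satisfies_condition_1_2(d, r, u):
    """Check condition (1.2): d < 1 + r < u."""
    return d < 1 + r < u

def borrow_and_buy(S0, d, r, u):
    """Time-1 wealth in the (tail, head) cases after borrowing S0 at rate r
    and using it to buy one share of stock at time zero."""
    debt = (1 + r) * S0
    return d * S0 - debt, u * S0 - debt

# Hypothetical parameters (not from the notes) with d >= 1 + r.
S0, d, r, u = Fraction(4), Fraction(11, 10), Fraction(1, 20), Fraction(2)

print(satisfies_condition_1_2(d, r, u))  # False
tail_wealth, head_wealth = borrow_and_buy(S0, d, r, u)
print(tail_wealth, head_wealth)  # both positive, starting from zero capital
```

Starting from zero capital and ending with positive wealth in both cases is exactly the kind of arbitrage that condition (1.2) rules out.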
With the stock as the underlying asset, let us consider a European call option with strike price
$K > 0$ and expiration time 1. This option confers the right to buy the stock at time 1 for $K$ dollars,
and so is worth $S_1 - K$ at time 1 if $S_1 - K$ is positive and is otherwise worth zero. We denote by
$$V_1(\omega) = (S_1(\omega) - K)^+ = \max\{S_1(\omega) - K, 0\}$$
the value (payoff) of this option at expiration. Of course, $V_1(\omega)$ actually depends only on $\omega_1$, and
we can and do sometimes write $V_1(\omega_1)$ rather than $V_1(\omega)$. Our first task is to compute the arbitrage
price of this option at time zero.
Suppose at time zero you sell the call for $V_0$ dollars, where $V_0$ is still to be determined. You now
have an obligation to pay off $(uS_0 - K)^+$ if $\omega_1 = H$ and to pay off $(dS_0 - K)^+$ if $\omega_1 = T$. At
the time you sell the option, you don’t yet know which value $\omega_1$ will take. You hedge your short
position in the option by buying $\Delta_0$ shares of stock, where $\Delta_0$ is still to be determined. You can use
the proceeds $V_0$ of the sale of the option for this purpose, and then borrow if necessary at interest
rate $r$ to complete the purchase. If $V_0$ is more than necessary to buy the $\Delta_0$ shares of stock, you
invest the residual money at interest rate $r$. In either case, you will have $V_0 - \Delta_0 S_0$ dollars invested
in the money market, where this quantity might be negative. You will also own $\Delta_0$ shares of stock.
If the stock goes up, the value of your portfolio (excluding the short position in the option) is
$$\Delta_0 S_1(H) + (1 + r)(V_0 - \Delta_0 S_0),$$
and you need to have $V_1(H)$. Thus, you want to choose $V_0$ and $\Delta_0$ so that
$$V_1(H) = \Delta_0 S_1(H) + (1 + r)(V_0 - \Delta_0 S_0). \tag{1.3}$$
If the stock goes down, the value of your portfolio is
$$\Delta_0 S_1(T) + (1 + r)(V_0 - \Delta_0 S_0),$$
and you need to have $V_1(T)$. Thus, you want to choose $V_0$ and $\Delta_0$ to also have
$$V_1(T) = \Delta_0 S_1(T) + (1 + r)(V_0 - \Delta_0 S_0). \tag{1.4}$$
These are two equations in two unknowns, and we solve them below.
Subtracting (1.4) from (1.3), we obtain
$$V_1(H) - V_1(T) = \Delta_0 (S_1(H) - S_1(T)), \tag{1.5}$$
so that
$$\Delta_0 = \frac{V_1(H) - V_1(T)}{S_1(H) - S_1(T)}. \tag{1.6}$$
This is a discrete-time version of the famous “delta-hedging” formula for derivative securities,
according to which the number of shares of an underlying asset a hedge should hold is the derivative
(in the sense of calculus) of the value of the derivative security with respect to the price of the
underlying asset. This formula is so pervasive that when a practitioner says “delta”, she means the
derivative (in the sense of calculus) just described. Note, however, that my definition of $\Delta_0$ is the
number of shares of stock one holds at time zero, and (1.6) is a consequence of this definition, not
the definition of $\Delta_0$ itself. Depending on how uncertainty enters the model, there can be cases
in which the number of shares of stock a hedge should hold is not the (calculus) derivative of the
derivative security with respect to the price of the underlying asset.
To complete the solution of (1.3) and (1.4), we substitute (1.6) into either (1.3) or (1.4) and solve
for $V_0$. After some simplification, this leads to the formula
$$V_0 = \frac{1}{1 + r} \left[ \frac{1 + r - d}{u - d} V_1(H) + \frac{u - (1 + r)}{u - d} V_1(T) \right]. \tag{1.7}$$
This is the arbitrage price for the European call option with payoff $V_1$ at time 1. To simplify this
formula, we define
$$\tilde{p} = \frac{1 + r - d}{u - d}, \quad \tilde{q} = \frac{u - (1 + r)}{u - d} = 1 - \tilde{p}, \tag{1.8}$$
so that (1.7) becomes
$$V_0 = \frac{1}{1 + r} \left[ \tilde{p} V_1(H) + \tilde{q} V_1(T) \right]. \tag{1.9}$$
Because we have taken $d < u$, both $\tilde{p}$ and $\tilde{q}$ are defined, i.e., the denominator in (1.8) is not zero.
Because of (1.2), both $\tilde{p}$ and $\tilde{q}$ are in the interval $(0, 1)$, and because they sum to 1, we can regard
them as probabilities of $H$ and $T$, respectively. They are the risk-neutral probabilities. They
appeared when we solved the two equations (1.3) and (1.4), and have nothing to do with the actual
probabilities of getting $H$ or $T$ on the coin tosses. In fact, at this point, they are nothing more than
a convenient tool for writing (1.7) as (1.9).
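As a minimal sketch of formulas (1.6), (1.8) and (1.9), the following computes the hedge and the arbitrage price of a one-period call and then checks the replication equations (1.3) and (1.4). The stock parameters are those of Example 1.1; the interest rate $r = 1/4$ and strike $K = 5$ are assumptions made here for illustration, as is the function name.

```python
from fractions import Fraction

def one_period_call(S0, u, d, r, K):
    """Return (V0, delta0, p_tilde, q_tilde) for a call expiring at time 1."""
    V1_H = max(u * S0 - K, Fraction(0))           # payoff after a head
    V1_T = max(d * S0 - K, Fraction(0))           # payoff after a tail
    delta0 = (V1_H - V1_T) / (u * S0 - d * S0)    # (1.6)
    p_tilde = (1 + r - d) / (u - d)               # (1.8)
    q_tilde = (u - (1 + r)) / (u - d)             # (1.8)
    V0 = (p_tilde * V1_H + q_tilde * V1_T) / (1 + r)  # (1.9)
    return V0, delta0, p_tilde, q_tilde

# S0, u, d from Example 1.1; r and K are assumed values, not from the notes.
S0, u, d = Fraction(4), Fraction(2), Fraction(1, 2)
r, K = Fraction(1, 4), Fraction(5)

V0, delta0, p_tilde, q_tilde = one_period_call(S0, u, d, r, K)
print(V0, delta0, p_tilde, q_tilde)

# Replication check: right-hand sides of (1.3) and (1.4).
wealth_H = delta0 * u * S0 + (1 + r) * (V0 - delta0 * S0)
wealth_T = delta0 * d * S0 + (1 + r) * (V0 - delta0 * S0)
print(wealth_H, wealth_T)
```

With these assumed values the hedge borrows money (here $V_0 - \Delta_0 S_0$ is negative), and the portfolio is worth exactly the option payoff after either toss.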
We now consider a European call with strike price $K$ dollars and expiration time 2. At expiration, the payoff of
this option is $V_2 = (S_2 - K)^+$, where $V_2$ and $S_2$ depend on $\omega_1$ and $\omega_2$, the first and second coin
tosses. We want to determine the arbitrage price for this option at time zero. Suppose an agent sells
the option at time zero for $V_0$ dollars, where $V_0$ is still to be determined. She then buys $\Delta_0$ shares
of stock, investing $V_0 - \Delta_0 S_0$ dollars in the money market to finance this. At time 1, the agent has
a portfolio (excluding the short position in the option) valued at
$$X_1 = \Delta_0 S_1 + (1 + r)(V_0 - \Delta_0 S_0). \tag{1.10}$$
Although we do not indicate it in the notation, $S_1$ and therefore $X_1$ depend on $\omega_1$, the outcome of
the first coin toss. Thus, there are really two equations implicit in (1.10):
$$X_1(H) = \Delta_0 S_1(H) + (1 + r)(V_0 - \Delta_0 S_0),$$
$$X_1(T) = \Delta_0 S_1(T) + (1 + r)(V_0 - \Delta_0 S_0).$$
After the first coin toss, the agent has $X_1$ dollars and can readjust her hedge. Suppose she decides to
now hold $\Delta_1$ shares of stock, where $\Delta_1$ is allowed to depend on $\omega_1$ because the agent knows what
value $\omega_1$ has taken. She invests the remainder of her wealth, $X_1 - \Delta_1 S_1$, in the money market. In
the next period, her wealth will be given by the right-hand side of the following equation, and she
wants it to be $V_2$. Therefore, she wants to have
$$V_2 = \Delta_1 S_2 + (1 + r)(X_1 - \Delta_1 S_1). \tag{1.11}$$
Although we do not indicate it in the notation, $S_2$ and $V_2$ depend on $\omega_1$ and $\omega_2$, the outcomes of the
first two coin tosses. Considering all four possible outcomes, we can write (1.11) as four equations:
$$V_2(HH) = \Delta_1(H) S_2(HH) + (1 + r)(X_1(H) - \Delta_1(H) S_1(H)),$$
$$V_2(HT) = \Delta_1(H) S_2(HT) + (1 + r)(X_1(H) - \Delta_1(H) S_1(H)),$$
$$V_2(TH) = \Delta_1(T) S_2(TH) + (1 + r)(X_1(T) - \Delta_1(T) S_1(T)),$$
$$V_2(TT) = \Delta_1(T) S_2(TT) + (1 + r)(X_1(T) - \Delta_1(T) S_1(T)).$$
We now have six equations, the two represented by (1.10) and the four represented by (1.11), in the
six unknowns $V_0$, $\Delta_0$, $\Delta_1(H)$, $\Delta_1(T)$, $X_1(H)$, and $X_1(T)$.
To solve these equations, and thereby determine the arbitrage price $V_0$ at time zero of the option and
the hedging portfolio $\Delta_0$, $\Delta_1(H)$ and $\Delta_1(T)$, we begin with the last two:
$$V_2(TH) = \Delta_1(T) S_2(TH) + (1 + r)(X_1(T) - \Delta_1(T) S_1(T)),$$
$$V_2(TT) = \Delta_1(T) S_2(TT) + (1 + r)(X_1(T) - \Delta_1(T) S_1(T)).$$
Subtracting one of these from the other and solving for $\Delta_1(T)$, we obtain the “delta-hedging
formula”
$$\Delta_1(T) = \frac{V_2(TH) - V_2(TT)}{S_2(TH) - S_2(TT)}, \tag{1.12}$$
and substituting this into either equation, we can solve for
$$X_1(T) = \frac{1}{1 + r} \left[ \tilde{p} V_2(TH) + \tilde{q} V_2(TT) \right]. \tag{1.13}$$
Equation (1.13) gives the value the hedging portfolio should have at time 1 if the stock goes down
between times 0 and 1. We define this quantity to be the arbitrage value of the option at time 1 if
$\omega_1 = T$, and we denote it by $V_1(T)$. We have just shown that
$$V_1(T) = \frac{1}{1 + r} \left[ \tilde{p} V_2(TH) + \tilde{q} V_2(TT) \right]. \tag{1.14}$$
The hedger should choose her portfolio so that her wealth $X_1(T)$ if $\omega_1 = T$ agrees with $V_1(T)$
defined by (1.14). This formula is analogous to formula (1.9), but postponed by one step. The first
two equations implicit in (1.11) lead in a similar way to the formulas
$$\Delta_1(H) = \frac{V_2(HH) - V_2(HT)}{S_2(HH) - S_2(HT)} \tag{1.15}$$
and $X_1(H) = V_1(H)$, where $V_1(H)$ is the value of the option at time 1 if $\omega_1 = H$, defined by
$$V_1(H) = \frac{1}{1 + r} \left[ \tilde{p} V_2(HH) + \tilde{q} V_2(HT) \right]. \tag{1.16}$$
This is again analogous to formula (1.9), postponed by one step. Finally, we plug the values $X_1(H) =
V_1(H)$ and $X_1(T) = V_1(T)$ into the two equations implicit in (1.10). The solution of these
equations for $\Delta_0$ and $V_0$ is the same as the solution of (1.3) and (1.4), and results again in (1.6) and
(1.9).
The pattern emerging here persists, regardless of the number of periods. If $V_k$ denotes the value at time $k$ of a derivative security, and this depends on the first $k$ coin tosses $\omega_1, \ldots, \omega_k$, then at time $k-1$, after the first $k-1$ tosses $\omega_1, \ldots, \omega_{k-1}$ are known, the portfolio to hedge a short position should hold $\Delta_{k-1}(\omega_1, \ldots, \omega_{k-1})$ shares of stock, where
$$\Delta_{k-1}(\omega_1, \ldots, \omega_{k-1}) = \frac{V_k(\omega_1, \ldots, \omega_{k-1}, H) - V_k(\omega_1, \ldots, \omega_{k-1}, T)}{S_k(\omega_1, \ldots, \omega_{k-1}, H) - S_k(\omega_1, \ldots, \omega_{k-1}, T)}, \tag{1.17}$$
and the value at time $k-1$ of the derivative security, when the first $k-1$ coin tosses result in the outcomes $\omega_1, \ldots, \omega_{k-1}$, is given by
$$V_{k-1}(\omega_1, \ldots, \omega_{k-1}) = \frac{1}{1+r}\left[\tilde{p}\, V_k(\omega_1, \ldots, \omega_{k-1}, H) + \tilde{q}\, V_k(\omega_1, \ldots, \omega_{k-1}, T)\right]. \tag{1.18}$$
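The recursion (1.17) and (1.18) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `hedge_tree` is ours, the payoff is assumed to depend on the final stock price alone, the risk-neutral probabilities are taken to be $\tilde{p} = \frac{1+r-d}{u-d}$ and $\tilde{q} = \frac{u-(1+r)}{u-d}$, and the numbers $r = \frac14$ and strike 5 are chosen for the example.

```python
from fractions import Fraction
from itertools import product

def hedge_tree(payoff, S0, u, d, r, n):
    """Return option values V and hedge ratios delta, keyed by toss history."""
    p = (1 + r - d) / (u - d)      # assumed risk-neutral probability p~ of H
    q = (u - (1 + r)) / (u - d)    # assumed risk-neutral probability q~ of T

    def stock(w):
        # Stock price after the tosses recorded in the string w.
        return S0 * u ** w.count("H") * d ** w.count("T")

    # Terminal condition: V_n is the payoff (assumed to depend on S_n only).
    V = {"".join(w): payoff(stock("".join(w))) for w in product("HT", repeat=n)}
    delta = {}
    for k in range(n, 0, -1):  # step back from time k to time k-1
        for w in ("".join(t) for t in product("HT", repeat=k - 1)):
            vh, vt = V[w + "H"], V[w + "T"]
            delta[w] = (vh - vt) / (stock(w + "H") - stock(w + "T"))  # (1.17)
            V[w] = (p * vh + q * vt) / (1 + r)                        # (1.18)
    return V, delta

# Illustrative two-period call with strike 5 (hypothetical parameters).
V, delta = hedge_tree(lambda s: max(s - 5, 0),
                      S0=4, u=2, d=Fraction(1, 2), r=Fraction(1, 4), n=2)
print(V[""], delta[""], delta["H"], delta["T"])
```

With these inputs the empty history `""` plays the role of time zero, so `V[""]` is $V_0$ and `delta[""]` is $\Delta_0$. Exact fractions are used so that the replication identities can be checked without rounding error.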
1.2
Finite Probability Spaces
Let $\Omega$ be a set with finitely many elements. An example to keep in mind is
$$\Omega = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\} \tag{2.1}$$
of all possible outcomes of three coin tosses. Let $\mathcal{F}$ be the set of all subsets of $\Omega$. Some sets in $\mathcal{F}$ are $\emptyset$, $\{HHH, HHT, HTH, HTT\}$, $\{TTT\}$, and $\Omega$ itself. How many sets are there in $\mathcal{F}$?
CHAPTER 1. Introduction to Probability Theory
17
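Definition 1.1 A probability measure $\mathbb{P}$ is a function mapping $\mathcal{F}$ into $[0,1]$ with the following properties:
(i) $\mathbb{P}(\Omega) = 1$,
(ii) If $A_1, A_2, \ldots$ is a sequence of disjoint sets in $\mathcal{F}$, then
$$\mathbb{P}\left(\bigcup_{k=1}^{\infty} A_k\right) = \sum_{k=1}^{\infty} \mathbb{P}(A_k).$$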
Probability measures have the following interpretation. Let $A$ be a set in $\mathcal{F}$. Imagine that $\Omega$ is the set of all possible outcomes of some random experiment. There is a certain probability, between 0 and 1, that when that experiment is performed, the outcome will lie in the set $A$. We think of $\mathbb{P}(A)$ as this probability.
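Example 1.2 Suppose a coin has probability $\frac13$ for $H$ and $\frac23$ for $T$. For the individual elements of $\Omega$ in (2.1), define
$$\begin{aligned}
\mathbb{P}\{HHH\} &= \left(\tfrac13\right)^3, & \mathbb{P}\{HHT\} &= \left(\tfrac13\right)^2 \left(\tfrac23\right), \\
\mathbb{P}\{HTH\} &= \left(\tfrac13\right)^2 \left(\tfrac23\right), & \mathbb{P}\{HTT\} &= \left(\tfrac13\right) \left(\tfrac23\right)^2, \\
\mathbb{P}\{THH\} &= \left(\tfrac13\right)^2 \left(\tfrac23\right), & \mathbb{P}\{THT\} &= \left(\tfrac13\right) \left(\tfrac23\right)^2, \\
\mathbb{P}\{TTH\} &= \left(\tfrac13\right) \left(\tfrac23\right)^2, & \mathbb{P}\{TTT\} &= \left(\tfrac23\right)^3.
\end{aligned}$$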
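For $A \in \mathcal{F}$, we define
$$\mathbb{P}(A) = \sum_{\omega \in A} \mathbb{P}\{\omega\}. \tag{2.2}$$
For example,
$$\mathbb{P}\{HHH, HHT, HTH, HTT\} = \left(\tfrac13\right)^3 + 2\left(\tfrac13\right)^2\left(\tfrac23\right) + \left(\tfrac13\right)\left(\tfrac23\right)^2 = \tfrac13,$$
which is another way of saying that the probability of $H$ on the first toss is $\frac13$.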
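The construction in Example 1.2 and formula (2.2) can be checked by direct enumeration. The sketch below is illustrative only; the names `OMEGA`, `prob_outcome` and `prob` are ours, and outcomes are represented as three-letter strings.

```python
from fractions import Fraction
from itertools import product

P_H = Fraction(1, 3)          # probability of H on a single toss
P_T = 1 - P_H                 # probability of T on a single toss
OMEGA = ["".join(w) for w in product("HT", repeat=3)]  # the set (2.1)

def prob_outcome(w):
    # Probability of the single outcome w, as in Example 1.2.
    return P_H ** w.count("H") * P_T ** w.count("T")

def prob(A):
    # Formula (2.2): add the probabilities of the elements of A.
    return sum((prob_outcome(w) for w in A), Fraction(0))

A_H = {w for w in OMEGA if w[0] == "H"}   # H on the first toss
print(prob(A_H))    # 1/3
print(prob(OMEGA))  # 1
```

Because the eight outcome probabilities sum to one, property (i) of Definition 1.1 holds, and additivity over disjoint sets follows from the way `prob` is defined.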
As in the above example, it is generally the case that we specify a probability measure on only some of the subsets of $\Omega$ and then use property (ii) of Definition 1.1 to determine $\mathbb{P}(A)$ for the remaining sets $A \in \mathcal{F}$. In the above example, we specified the probability measure only for the sets containing a single element, and then used Definition 1.1(ii) in the form (2.2) (see Problem 1.4(ii)) to determine $\mathbb{P}$ for all the other sets in $\mathcal{F}$.
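Definition 1.2 Let $\Omega$ be a nonempty set. A $\sigma$-algebra is a collection $\mathcal{G}$ of subsets of $\Omega$ with the following three properties:
(i) $\emptyset \in \mathcal{G}$,
(ii) If $A \in \mathcal{G}$, then its complement $A^c \in \mathcal{G}$,
(iii) If $A_1, A_2, A_3, \ldots$ is a sequence of sets in $\mathcal{G}$, then $\bigcup_{k=1}^{\infty} A_k$ is also in $\mathcal{G}$.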
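Here are some important $\sigma$-algebras of subsets of the set $\Omega$ in Example 1.2:
$$\mathcal{F}_0 = \{\emptyset, \Omega\},$$
$$\mathcal{F}_1 = \{\emptyset, \Omega, \{HHH, HHT, HTH, HTT\}, \{THH, THT, TTH, TTT\}\},$$
$$\mathcal{F}_2 = \{\emptyset, \Omega, \{HHH, HHT\}, \{HTH, HTT\}, \{THH, THT\}, \{TTH, TTT\}, \text{ and all sets which can be built by taking unions of these}\},$$
$$\mathcal{F}_3 = \mathcal{F} = \text{the set of all subsets of } \Omega.$$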
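To simplify notation a bit, let us define
$$A_H = \{HHH, HHT, HTH, HTT\} = \{H \text{ on the first toss}\},$$
$$A_T = \{THH, THT, TTH, TTT\} = \{T \text{ on the first toss}\},$$
so that
$$\mathcal{F}_1 = \{\emptyset, \Omega, A_H, A_T\},$$
and let us define
$$A_{HH} = \{HHH, HHT\} = \{HH \text{ on the first two tosses}\},$$
$$A_{HT} = \{HTH, HTT\} = \{HT \text{ on the first two tosses}\},$$
$$A_{TH} = \{THH, THT\} = \{TH \text{ on the first two tosses}\},$$
$$A_{TT} = \{TTH, TTT\} = \{TT \text{ on the first two tosses}\},$$
so that
$$\begin{aligned}
\mathcal{F}_2 = \{&\emptyset, \Omega, A_{HH}, A_{HT}, A_{TH}, A_{TT}, \\
&A_H, A_T, A_{HH} \cup A_{TH}, A_{HH} \cup A_{TT}, A_{HT} \cup A_{TH}, A_{HT} \cup A_{TT}, \\
&A_{HH}^c, A_{HT}^c, A_{TH}^c, A_{TT}^c\}.
\end{aligned}$$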
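The four $\sigma$-algebras above can be generated mechanically: each one consists of all unions of its "basic" sets, where the basic sets of $\mathcal{F}_n$ group the outcomes by their first $n$ tosses. The following sketch assumes this description; the helper names `sigma_algebra` and `atoms_by_prefix` are ours.

```python
from itertools import combinations, product

OMEGA = frozenset("".join(w) for w in product("HT", repeat=3))

def sigma_algebra(atoms):
    """All unions of the given disjoint atoms, including the empty union."""
    atoms = [frozenset(a) for a in atoms]
    sets = set()
    for r in range(len(atoms) + 1):
        for combo in combinations(atoms, r):
            sets.add(frozenset().union(*combo))
    return sets

def atoms_by_prefix(n):
    # Group outcomes by their first n tosses.
    prefixes = sorted({w[:n] for w in OMEGA})
    return [{w for w in OMEGA if w.startswith(p)} for p in prefixes]

F = [sigma_algebra(atoms_by_prefix(n)) for n in range(4)]
print([len(Fn) for Fn in F])  # [2, 4, 16, 256]
```

The count 16 matches the list of sets in $\mathcal{F}_2$ written out above, and 256 is the number of subsets of the eight-element set $\Omega$. Each collection in the list contains the one before it.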
We interpret $\sigma$-algebras as a record of information. Suppose the coin is tossed three times, and you are not told the outcome, but you are told, for every set in $\mathcal{F}_1$, whether or not the outcome is in that set. For example, you would be told that the outcome is not in $\emptyset$ and is in $\Omega$. Moreover, you might be told that the outcome is not in $A_H$ but is in $A_T$. In effect, you have been told that the first toss was a $T$, and nothing more. The $\sigma$-algebra $\mathcal{F}_1$ is said to contain the “information of the first toss”, which is usually called the “information up to time 1”. Similarly, $\mathcal{F}_2$ contains the “information of the first two tosses,” which is the “information up to time 2.” The $\sigma$-algebra $\mathcal{F}_3 = \mathcal{F}$ contains “full information” about the outcome of all three tosses. The so-called “trivial” $\sigma$-algebra $\mathcal{F}_0$ contains no information. Knowing whether the outcome $\omega$ of the three tosses is in $\emptyset$ (it is not) and whether it is in $\Omega$ (it is) tells you nothing about $\omega$.
Definition 1.3 Let $\Omega$ be a nonempty finite set. A filtration is a sequence of $\sigma$-algebras $\mathcal{F}_0, \mathcal{F}_1, \mathcal{F}_2, \ldots, \mathcal{F}_n$ such that each $\sigma$-algebra in the sequence contains all the sets contained by the previous $\sigma$-algebra.
Definition 1.4 Let $\Omega$ be a nonempty finite set and let $\mathcal{F}$ be the $\sigma$-algebra of all subsets of $\Omega$. A random variable is a function mapping $\Omega$ into $\mathbb{R}$.
Example 1.3 Let $\Omega$ be given by (2.1) and consider the binomial asset pricing Example 1.1, where $S_0 = 4$, $u = 2$ and $d = \frac12$. Then $S_0$, $S_1$, $S_2$ and $S_3$ are all random variables. For example, $S_2(HHT) = u^2 S_0 = 16$. The “random variable” $S_0$ is really not random, since $S_0(\omega) = 4$ for all $\omega \in \Omega$. Nonetheless, it is a function mapping $\Omega$ into $\mathbb{R}$, and thus technically a random variable, albeit a degenerate one.
A random variable maps $\Omega$ into $\mathbb{R}$, and we can look at the preimage under the random variable of sets in $\mathbb{R}$. Consider, for example, the random variable $S_2$ of Example 1.1. We have
$$S_2(HHH) = S_2(HHT) = 16,$$
$$S_2(HTH) = S_2(HTT) = S_2(THH) = S_2(THT) = 4,$$
$$S_2(TTH) = S_2(TTT) = 1.$$
Let us consider the interval $[4, 27]$. The preimage under $S_2$ of this interval is defined to be
$$\{\omega \in \Omega;\ S_2(\omega) \in [4, 27]\} = \{\omega \in \Omega;\ 4 \le S_2 \le 27\} = A_{TT}^c.$$
The complete list of subsets of $\Omega$ we can get as preimages of sets in $\mathbb{R}$ is:
$$\emptyset,\ \Omega,\ A_{HH},\ A_{HT} \cup A_{TH},\ A_{TT},$$
and sets which can be built by taking unions of these. This collection of sets is a $\sigma$-algebra, called the $\sigma$-algebra generated by the random variable $S_2$, and is denoted by $\sigma(S_2)$. The information content of this $\sigma$-algebra is exactly the information learned by observing $S_2$. More specifically, suppose the coin is tossed three times and you do not know the outcome $\omega$, but someone is willing to tell you, for each set in $\sigma(S_2)$, whether $\omega$ is in the set. You might be told, for example, that $\omega$ is not in $A_{HH}$, is in $A_{HT} \cup A_{TH}$, and is not in $A_{TT}$. Then you know that in the first two tosses, there was a head and a tail, and you know nothing more. This information is the same you would have gotten by being told that the value of $S_2(\omega)$ is 4.
Note that $\mathcal{F}_2$ defined earlier contains all the sets which are in $\sigma(S_2)$, and even more. This means that the information in the first two tosses is greater than the information in $S_2$. In particular, if you see the first two tosses, you can distinguish $A_{HT}$ from $A_{TH}$, but you cannot make this distinction from knowing the value of $S_2$ alone.
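The comparison between $\sigma(S_2)$ and $\mathcal{F}_2$ can be verified by enumeration. The sketch below builds both collections as unions of their basic sets: the level sets of $S_2$ in one case, and the sets determined by the first two tosses in the other. The helper name `unions` is ours, and the values $S_0 = 4$, $u = 2$, $d = \frac12$ are those of Example 1.3.

```python
from fractions import Fraction
from itertools import combinations, product

OMEGA = ["".join(w) for w in product("HT", repeat=3)]

def S2(w):
    # Stock price at time 2; depends only on the first two tosses.
    heads = w[:2].count("H")
    return 4 * Fraction(2) ** heads * Fraction(1, 2) ** (2 - heads)

def unions(atoms):
    """All unions of the given disjoint sets, including the empty union."""
    atoms = list(atoms)
    return {frozenset().union(*c)
            for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)}

# Basic sets of sigma(S2): outcomes grouped by the value of S2.
level_sets = {}
for w in OMEGA:
    level_sets.setdefault(S2(w), set()).add(w)
sigma_S2 = unions(frozenset(s) for s in level_sets.values())

# Basic sets of F_2: outcomes grouped by the first two tosses.
prefix_sets = {}
for w in OMEGA:
    prefix_sets.setdefault(w[:2], set()).add(w)
F2 = unions(frozenset(s) for s in prefix_sets.values())

A_HT = frozenset({"HTH", "HTT"})
A_TH = frozenset({"THH", "THT"})
print(len(sigma_S2), len(F2))                  # 8 16
print(sigma_S2 <= F2, A_HT in sigma_S2)        # True False
```

The set $A_{HT}$ belongs to $\mathcal{F}_2$ but not to $\sigma(S_2)$, while $A_{HT} \cup A_{TH}$ belongs to both.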
Definition 1.5 Let $\Omega$ be a nonempty finite set and let $\mathcal{F}$ be the $\sigma$-algebra of all subsets of $\Omega$. Let $X$ be a random variable on $(\Omega, \mathcal{F})$. The $\sigma$-algebra $\sigma(X)$ generated by $X$ is defined to be the collection of all sets of the form $\{\omega \in \Omega;\ X(\omega) \in A\}$, where $A$ is a subset of $\mathbb{R}$. Let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. We say that $X$ is $\mathcal{G}$-measurable if every set in $\sigma(X)$ is also in $\mathcal{G}$.
Note: We normally write simply $\{X \in A\}$ rather than $\{\omega \in \Omega;\ X(\omega) \in A\}$.
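Definition 1.6 Let $\Omega$ be a nonempty, finite set, let $\mathcal{F}$ be the $\sigma$-algebra of all subsets of $\Omega$, let $\mathbb{P}$ be a probability measure on $(\Omega, \mathcal{F})$, and let $X$ be a random variable on $\Omega$. Given any set $A \subset \mathbb{R}$, we define the induced measure of $A$ to be
$$\mathcal{L}_X(A) = \mathbb{P}\{X \in A\}.$$
In other words, the induced measure of a set $A$ tells us the probability that $X$ takes a value in $A$. In the case of $S_2$ above with the probability measure of Example 1.2, some sets in $\mathbb{R}$ and their induced measures are:
$$\mathcal{L}_{S_2}(\emptyset) = \mathbb{P}(\emptyset) = 0,$$
$$\mathcal{L}_{S_2}(\mathbb{R}) = \mathbb{P}(\Omega) = 1,$$
$$\mathcal{L}_{S_2}[0, \infty) = \mathbb{P}(\Omega) = 1,$$
$$\mathcal{L}_{S_2}[0, 3] = \mathbb{P}\{S_2 = 1\} = \mathbb{P}(A_{TT}) = \left(\tfrac23\right)^2.$$
In fact, the induced measure of $S_2$ places a mass of size $\left(\frac13\right)^2 = \frac19$ at the number 16, a mass of size $\frac49$ at the number 4, and a mass of size $\left(\frac23\right)^2 = \frac49$ at the number 1. A common way to record this information is to give the cumulative distribution function $F_{S_2}(x)$ of $S_2$, defined by
$$F_{S_2}(x) = \mathbb{P}(S_2 \le x) = \begin{cases} 0, & \text{if } x < 1, \\ \frac49, & \text{if } 1 \le x < 4, \\ \frac89, & \text{if } 4 \le x < 16, \\ 1, & \text{if } 16 \le x. \end{cases} \tag{2.3}$$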
By the distribution of a random variable $X$, we mean any of the several ways of characterizing $\mathcal{L}_X$. If $X$ is discrete, as in the case of $S_2$ above, we can either tell where the masses are and how large they are, or tell what the cumulative distribution function is. (Later we will consider random variables $X$ which have densities, in which case the induced measure of a set $A \subset \mathbb{R}$ is the integral of the density over the set $A$.)
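Both descriptions of the distribution of $S_2$, the point masses and the cumulative distribution function, can be computed from the definition $\mathcal{L}_X(A) = \mathbb{P}\{X \in A\}$. The sketch below is illustrative; the function names `law_of_S2` and `cdf` are ours, and the probability of $H$ is passed in as a parameter.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def law_of_S2(p_heads):
    """Point masses of the induced measure of S2 on the three-toss space."""
    masses = defaultdict(Fraction)
    for w in product("HT", repeat=3):
        heads = w[:2].count("H")
        value = 4 * Fraction(2) ** heads * Fraction(1, 2) ** (2 - heads)
        prob = p_heads ** w.count("H") * (1 - p_heads) ** w.count("T")
        masses[value] += prob      # add P{w} to the mass at S2(w)
    return dict(masses)

def cdf(masses, x):
    # F(x) = P(S2 <= x): add the masses located at or below x.
    return sum((m for v, m in masses.items() if v <= x), Fraction(0))

law = law_of_S2(Fraction(1, 3))
print(law[16], law[4], law[1])          # 1/9 4/9 4/9
print(cdf(law, 3), cdf(law, 4))         # 4/9 8/9
```

The printed values agree with the masses listed above and with the second and third branches of (2.3).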
Important Note. In order to work through the concept of a risk-neutral measure, we set up the
definitions to make a clear distinction between random variables and their distributions.
A random variable is a mapping from $\Omega$ to $\mathbb{R}$, nothing more. It has an existence quite apart from discussion of probabilities. For example, in the discussion above, $S_2(TTH) = S_2(TTT) = 1$, regardless of whether the probability for $H$ is $\frac13$ or $\frac12$.
The distribution of a random variable is a measure $\mathcal{L}_X$ on $\mathbb{R}$, i.e., a way of assigning probabilities to sets in $\mathbb{R}$. It depends on the random variable $X$ and the probability measure $\mathbb{P}$ we use in $\Omega$. If we set the probability of $H$ to be $\frac13$, then $\mathcal{L}_{S_2}$ assigns mass $\frac19$ to the number 16. If we set the probability of $H$ to be $\frac12$, then $\mathcal{L}_{S_2}$ assigns mass $\frac14$ to the number 16. The distribution of $S_2$ has changed, but the random variable has not. It is still defined by
$$S_2(HHH) = S_2(HHT) = 16,$$
$$S_2(HTH) = S_2(HTT) = S_2(THH) = S_2(THT) = 4,$$
$$S_2(TTH) = S_2(TTT) = 1.$$
Thus, a random variable can have more than one distribution (a “market” or “objective” distribution,
and a “risk-neutral” distribution).
In a similar vein, two different random variables can have the same distribution. Suppose in the binomial model of Example 1.1, the probability of $H$ and the probability of $T$ is $\frac12$. Consider a European call with strike price 14 expiring at time 2. The payoff of the call at time 2 is the random variable $(S_2 - 14)^+$, which takes the value 2 if $\omega = HHH$ or $\omega = HHT$, and takes the value 0 in every other case. The probability the payoff is 2 is $\frac14$, and the probability it is zero is $\frac34$. Consider also a European put with strike price 3 expiring at time 2. The payoff of the put at time 2 is $(3 - S_2)^+$, which takes the value 2 if $\omega = TTH$ or $\omega = TTT$. Like the payoff of the call, the payoff of the put is 2 with probability $\frac14$ and 0 with probability $\frac34$. The payoffs of the call and the put are different random variables having the same distribution.
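The call and put example can be confirmed numerically. In the sketch below the two payoffs are stored as functions on the three-toss sample space, and their distributions are computed under the measure with probability $\frac12$ for each toss; the names `call`, `put` and `law` are ours.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

OMEGA = ["".join(w) for w in product("HT", repeat=3)]
P = {w: Fraction(1, 8) for w in OMEGA}   # fair coin: each outcome has mass 1/8

def S2(w):
    heads = w[:2].count("H")
    return 4 * Fraction(2) ** heads * Fraction(1, 2) ** (2 - heads)

def call(w):
    return max(S2(w) - 14, 0)    # (S2 - 14)^+

def put(w):
    return max(3 - S2(w), 0)     # (3 - S2)^+

def law(X):
    # Point masses of the distribution of X under P.
    masses = Counter()
    for w in OMEGA:
        masses[X(w)] += P[w]
    return dict(masses)

print(law(call) == law(put))          # True: same distribution
print(call("HHH"), put("HHH"))        # 2 0: different random variables
```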
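Definition 1.7 Let $\Omega$ be a nonempty, finite set, let $\mathcal{F}$ be the $\sigma$-algebra of all subsets of $\Omega$, let $\mathbb{P}$ be a probability measure on $(\Omega, \mathcal{F})$, and let $X$ be a random variable on $\Omega$. The expected value of $X$ is defined to be
$$\mathbb{E}X = \sum_{\omega \in \Omega} X(\omega)\, \mathbb{P}\{\omega\}. \tag{2.4}$$
Notice that the expected value in (2.4) is defined to be a sum over the sample space $\Omega$. Since $\Omega$ is a finite set, $X$ can take only finitely many values, which we label $x_1, \ldots, x_n$. We can partition $\Omega$ into the subsets $\{X = x_1\}, \ldots, \{X = x_n\}$, and then rewrite (2.4) as
$$\begin{aligned}
\mathbb{E}X &= \sum_{\omega \in \Omega} X(\omega)\, \mathbb{P}\{\omega\} \\
&= \sum_{k=1}^{n} \sum_{\omega \in \{X = x_k\}} X(\omega)\, \mathbb{P}\{\omega\} \\
&= \sum_{k=1}^{n} x_k \sum_{\omega \in \{X = x_k\}} \mathbb{P}\{\omega\} \\
&= \sum_{k=1}^{n} x_k\, \mathbb{P}\{X = x_k\} \\
&= \sum_{k=1}^{n} x_k\, \mathcal{L}_X\{x_k\}.
\end{aligned}$$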
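Thus, although the expected value is defined as a sum over the sample space $\Omega$, we can also write it as a sum over $\mathbb{R}$.
To make the above set of equations absolutely clear, we consider $S_2$ with the distribution given by (2.3). The definition of $\mathbb{E}S_2$ is
$$\begin{aligned}
\mathbb{E}S_2 &= S_2(HHH)\mathbb{P}\{HHH\} + S_2(HHT)\mathbb{P}\{HHT\} \\
&\quad + S_2(HTH)\mathbb{P}\{HTH\} + S_2(HTT)\mathbb{P}\{HTT\} \\
&\quad + S_2(THH)\mathbb{P}\{THH\} + S_2(THT)\mathbb{P}\{THT\} \\
&\quad + S_2(TTH)\mathbb{P}\{TTH\} + S_2(TTT)\mathbb{P}\{TTT\} \\
&= 16 \cdot \mathbb{P}(A_{HH}) + 4 \cdot \mathbb{P}(A_{HT} \cup A_{TH}) + 1 \cdot \mathbb{P}(A_{TT}) \\
&= 16 \cdot \mathbb{P}\{S_2 = 16\} + 4 \cdot \mathbb{P}\{S_2 = 4\} + 1 \cdot \mathbb{P}\{S_2 = 1\} \\
&= 16 \cdot \mathcal{L}_{S_2}\{16\} + 4 \cdot \mathcal{L}_{S_2}\{4\} + 1 \cdot \mathcal{L}_{S_2}\{1\} \\
&= 16 \cdot \tfrac19 + 4 \cdot \tfrac49 + 1 \cdot \tfrac49 \\
&= \tfrac{36}{9} = 4.
\end{aligned}$$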
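Definition 1.8 Let $\Omega$ be a nonempty, finite set, let $\mathcal{F}$ be the $\sigma$-algebra of all subsets of $\Omega$, let $\mathbb{P}$ be a probability measure on $(\Omega, \mathcal{F})$, and let $X$ be a random variable on $\Omega$. The variance of $X$ is defined to be the expected value of $(X - \mathbb{E}X)^2$, i.e.,
$$\mathrm{Var}(X) = \sum_{\omega \in \Omega} (X(\omega) - \mathbb{E}X)^2\, \mathbb{P}\{\omega\}. \tag{2.5}$$
Once again, we can rewrite (2.5) as a sum over $\mathbb{R}$ rather than over $\Omega$. Indeed, if $X$ takes the values $x_1, \ldots, x_n$, then
$$\mathrm{Var}(X) = \sum_{k=1}^{n} (x_k - \mathbb{E}X)^2\, \mathbb{P}\{X = x_k\} = \sum_{k=1}^{n} (x_k - \mathbb{E}X)^2\, \mathcal{L}_X\{x_k\}.$$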
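The two ways of writing the expected value and the variance, as a sum over $\Omega$ and as a sum over the values of $X$, can be compared directly for $S_2$ under the measure of Example 1.2. This is a small sketch with names of our choosing (`prob`, `law`, `mean_omega`, and so on).

```python
from fractions import Fraction
from itertools import product

P_H = Fraction(1, 3)
OMEGA = ["".join(w) for w in product("HT", repeat=3)]

def prob(w):
    return P_H ** w.count("H") * (1 - P_H) ** w.count("T")

def S2(w):
    heads = w[:2].count("H")
    return 4 * Fraction(2) ** heads * Fraction(1, 2) ** (2 - heads)

# Sums over the sample space, as in (2.4) and (2.5).
mean_omega = sum(S2(w) * prob(w) for w in OMEGA)
var_omega = sum((S2(w) - mean_omega) ** 2 * prob(w) for w in OMEGA)

# Sums over the values of S2, weighted by the induced measure.
law = {}
for w in OMEGA:
    law[S2(w)] = law.get(S2(w), 0) + prob(w)
mean_law = sum(x * m for x, m in law.items())
var_law = sum((x - mean_law) ** 2 * m for x, m in law.items())

print(mean_omega, mean_law)  # 4 4
print(var_omega, var_law)    # 20 20
```

Both routes give $\mathbb{E}S_2 = 4$ and $\mathrm{Var}(S_2) = 20$, since $16 \cdot \frac19 + 4 \cdot \frac49 + 1 \cdot \frac49 = 4$ and $144 \cdot \frac19 + 0 \cdot \frac49 + 9 \cdot \frac49 = 20$.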
1.3
Lebesgue Measure and the Lebesgue Integral
In this section, we consider the set of real numbers IR, which is uncountably infinite. We define the
Lebesgue measure of intervals in IR to be their length. This definition and the properties of measure
determine the Lebesgue measure of many, but not all, subsets of IR. The collection of subsets of
IR we consider, and for which Lebesgue measure is defined, is the collection of Borel sets defined
below.
We use Lebesgue measure to construct the Lebesgue integral, a generalization of the Riemann
integral. We need this integral because, unlike the Riemann integral, it can be defined on abstract
spaces, such as the space of infinite sequences of coin tosses or the space of paths of Brownian
motion. This section concerns the Lebesgue integral on the space IR only; the generalization to
other spaces will be given later.
Definition 1.9 The Borel $\sigma$-algebra, denoted $\mathcal{B}(\mathbb{R})$, is the smallest $\sigma$-algebra containing all open intervals in $\mathbb{R}$. The sets in $\mathcal{B}(\mathbb{R})$ are called Borel sets.
Every set which can be written down and just about every set imaginable is in $\mathcal{B}(\mathbb{R})$. The following discussion of this fact uses the $\sigma$-algebra properties developed in Problem 1.3.
By definition, every open interval $(a, b)$ is in $\mathcal{B}(\mathbb{R})$, where $a$ and $b$ are real numbers. Since $\mathcal{B}(\mathbb{R})$ is a $\sigma$-algebra, every union of open intervals is also in $\mathcal{B}(\mathbb{R})$. For example, for every real number $a$, the open half-line
$$(a, \infty) = \bigcup_{n=1}^{\infty} (a, a + n)$$
is a Borel set, as is
$$(-\infty, a) = \bigcup_{n=1}^{\infty} (a - n, a).$$
For real numbers $a$ and $b$, the union
$$(-\infty, a) \cup (b, \infty)$$
is Borel. Since $\mathcal{B}(\mathbb{R})$ is a $\sigma$-algebra, every complement of a Borel set is Borel, so $\mathcal{B}(\mathbb{R})$ contains
$$[a, b] = \left((-\infty, a) \cup (b, \infty)\right)^c.$$
This shows that every closed interval is Borel. In addition, the closed half-lines
$$[a, \infty) = \bigcup_{n=1}^{\infty} [a, a + n]$$
and
$$(-\infty, a] = \bigcup_{n=1}^{\infty} [a - n, a]$$
are Borel. Half-open and half-closed intervals are also Borel, since they can be written as intersections of open half-lines and closed half-lines. For example,
$$(a, b] = (-\infty, b] \cap (a, \infty).$$
Every set which contains only one real number is Borel. Indeed, if $a$ is a real number, then
$$\{a\} = \bigcap_{n=1}^{\infty} \left(a - \tfrac1n, a + \tfrac1n\right).$$
This means that every set containing finitely many real numbers is Borel; if $A = \{a_1, a_2, \ldots, a_n\}$, then
$$A = \bigcup_{k=1}^{n} \{a_k\}.$$
In fact, every set containing countably infinitely many numbers is Borel; if $A = \{a_1, a_2, \ldots\}$, then
$$A = \bigcup_{k=1}^{\infty} \{a_k\}.$$
This means that the set of rational numbers is Borel, as is its complement, the set of irrational
numbers.
There are, however, sets which are not Borel. We have just seen that any non-Borel set must have
uncountably many points.
Example 1.4 (The Cantor set.) This example gives a hint of how complicated a Borel set can be. We use it later when we discuss the sample space for an infinite sequence of coin tosses.
Consider the unit interval $[0, 1]$, and remove the middle half, i.e., remove the open interval
$$A_1 = \left(\tfrac14, \tfrac34\right).$$
The remaining set
$$C_1 = \left[0, \tfrac14\right] \cup \left[\tfrac34, 1\right]$$
has two pieces. From each of these pieces, remove the middle half, i.e., remove the open set
$$A_2 = \left(\tfrac{1}{16}, \tfrac{3}{16}\right) \cup \left(\tfrac{13}{16}, \tfrac{15}{16}\right).$$
The remaining set
$$C_2 = \left[0, \tfrac{1}{16}\right] \cup \left[\tfrac{3}{16}, \tfrac14\right] \cup \left[\tfrac34, \tfrac{13}{16}\right] \cup \left[\tfrac{15}{16}, 1\right]$$
has four pieces. Continue this process, so at stage $k$, the set $C_k$ has $2^k$ pieces, and each piece has length $\frac{1}{4^k}$. The Cantor set
$$C = \bigcap_{k=1}^{\infty} C_k$$
is defined to be the set of points not removed at any stage of this nonterminating process.
Note that the length of $A_1$, the first set removed, is $\frac12$. The “length” of $A_2$, the second set removed, is $\frac18 + \frac18 = \frac14$. The “length” of the next set removed is $4 \cdot \frac{1}{32} = \frac18$, and in general, the length of the $k$-th set removed is $2^{-k}$. Thus, the total length removed is
$$\sum_{k=1}^{\infty} \frac{1}{2^k} = 1,$$
and so the Cantor set, the set of points not removed, has zero “length.”
Despite the fact that the Cantor set has no “length,” there are lots of points in this set. In particular, none of the endpoints of the pieces of the sets $C_1, C_2, \ldots$ is ever removed. Thus, the points
$$0,\ \tfrac14,\ \tfrac34,\ 1,\ \tfrac{1}{16},\ \tfrac{3}{16},\ \tfrac{13}{16},\ \tfrac{15}{16},\ \tfrac{1}{64},\ \ldots$$
are all in $C$. This is a countably infinite set of points. We shall see eventually that the Cantor set has uncountably many points.
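The first few stages of this construction are easy to reproduce with exact arithmetic. In the sketch below each stage $C_k$ is stored as a list of closed intervals, and the step from $C_k$ to $C_{k+1}$ removes the open middle half of every piece. The names `next_stage`, `stages` and `remaining` are ours.

```python
from fractions import Fraction

def next_stage(intervals):
    """Remove the open middle half of each closed interval [a, b]."""
    out = []
    for a, b in intervals:
        quarter = (b - a) / 4
        out.append((a, a + quarter))   # left quarter survives
        out.append((b - quarter, b))   # right quarter survives
    return out

stages = [[(Fraction(0), Fraction(1))]]   # stage 0 is the unit interval
for _ in range(3):
    stages.append(next_stage(stages[-1]))

# Total length remaining after each stage, and length removed at each stage.
remaining = [sum(b - a for a, b in stage) for stage in stages]
removed = [remaining[k - 1] - remaining[k] for k in range(1, len(stages))]

print(stages[1])   # the two pieces of C_1
print(remaining)   # lengths 1, 1/2, 1/4, 1/8
print(removed)     # lengths 1/2, 1/4, 1/8
```

The removed lengths $\frac12, \frac14, \frac18, \ldots$ are the terms of the geometric series above, and the remaining length $2^{-k}$ tends to zero.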
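Definition 1.10 Let $\mathcal{B}(\mathbb{R})$ be the $\sigma$-algebra of Borel subsets of $\mathbb{R}$. A measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ is a function $\mu$ mapping $\mathcal{B}(\mathbb{R})$ into $[0, \infty]$ with the following properties:
(i) $\mu(\emptyset) = 0$,
(ii) If $A_1, A_2, \ldots$ is a sequence of disjoint sets in $\mathcal{B}(\mathbb{R})$, then
$$\mu\left(\bigcup_{k=1}^{\infty} A_k\right) = \sum_{k=1}^{\infty} \mu(A_k).$$
Lebesgue measure is defined to be the measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ which assigns the measure of each interval to be its length. Following Williams’s book, we denote Lebesgue measure by $\lambda_0$.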
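A measure has all the properties of a probability measure given in Problem 1.4, except that the total measure of the space is not necessarily 1 (in fact, $\lambda_0(\mathbb{R}) = \infty$), one no longer has the equation
$$\mu(A^c) = 1 - \mu(A)$$
in Problem 1.4(iii), and property (v) in Problem 1.4 needs to be modified to say:
(v) If $A_1, A_2, \ldots$ is a sequence of sets in $\mathcal{B}(\mathbb{R})$ with $A_1 \supset A_2 \supset \cdots$ and $\mu(A_1) < \infty$, then
$$\mu\left(\bigcap_{k=1}^{\infty} A_k\right) = \lim_{n \to \infty} \mu(A_n).$$
To see that the additional requirement $\mu(A_1) < \infty$ is needed in (v), consider
$$A_1 = [1, \infty),\quad A_2 = [2, \infty),\quad A_3 = [3, \infty),\ \ldots.$$
Then $\bigcap_{k=1}^{\infty} A_k = \emptyset$, so $\lambda_0\left(\bigcap_{k=1}^{\infty} A_k\right) = 0$, but $\lim_{n \to \infty} \lambda_0(A_n) = \infty$.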
We specify that the Lebesgue measure of each interval is its length, and that determines the Lebesgue
measure of all other Borel sets. For example, the Lebesgue measure of the Cantor set in Example
1.4 must be zero, because of the “length” computation given at the end of that example.
The Lebesgue measure of a set containing only one point must be zero. In fact, since
$$\{a\} \subset \left(a - \tfrac1n, a + \tfrac1n\right)$$
for every positive integer $n$, we must have
$$0 \le \lambda_0\{a\} \le \lambda_0\left(a - \tfrac1n, a + \tfrac1n\right) = \frac2n.$$
Letting $n \to \infty$, we obtain
$$\lambda_0\{a\} = 0.$$
The Lebesgue measure of a set containing countably many points must also be zero. Indeed, if $A = \{a_1, a_2, \ldots\}$, then
$$\lambda_0(A) = \sum_{k=1}^{\infty} \lambda_0\{a_k\} = \sum_{k=1}^{\infty} 0 = 0.$$
The Lebesgue measure of a set containing uncountably many points can be either zero, positive and
finite, or infinite. We may not compute the Lebesgue measure of an uncountable set by adding up
the Lebesgue measure of its individual members, because there is no way to add up uncountably
many numbers. The integral was invented to get around this problem.
In order to think about Lebesgue integrals, we must first consider the functions to be integrated.
Definition 1.11 Let $f$ be a function from $\mathbb{R}$ to $\mathbb{R}$. We say that $f$ is Borel-measurable if the set $\{x \in \mathbb{R};\ f(x) \in A\}$ is in $\mathcal{B}(\mathbb{R})$ whenever $A \in \mathcal{B}(\mathbb{R})$. In the language of Section 2, we want the $\sigma$-algebra generated by $f$ to be contained in $\mathcal{B}(\mathbb{R})$.
This definition is purely technical and has nothing to do with keeping track of information. It is difficult to conceive of a function which is not Borel-measurable, and we shall pretend such functions don’t exist. Henceforth, “function mapping $\mathbb{R}$ to $\mathbb{R}$” will mean “Borel-measurable function mapping $\mathbb{R}$ to $\mathbb{R}$” and “subset of $\mathbb{R}$” will mean “Borel subset of $\mathbb{R}$”.
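Definition 1.12 An indicator function $g$ from $\mathbb{R}$ to $\mathbb{R}$ is a function which takes only the values 0 and 1. We call
$$A = \{x \in \mathbb{R};\ g(x) = 1\}$$
the set indicated by $g$. We define the Lebesgue integral of $g$ to be
$$\int_{\mathbb{R}} g \, d\lambda_0 = \lambda_0(A).$$
A simple function $h$ from $\mathbb{R}$ to $\mathbb{R}$ is a linear combination of indicators, i.e., a function of the form
$$h(x) = \sum_{k=1}^{n} c_k g_k(x),$$
where each $g_k$ is of the form
$$g_k(x) = \begin{cases} 1, & \text{if } x \in A_k, \\ 0, & \text{if } x \notin A_k, \end{cases}$$
and each $c_k$ is a real number. We define the Lebesgue integral of $h$ to be
$$\int_{\mathbb{R}} h \, d\lambda_0 = \sum_{k=1}^{n} c_k \int_{\mathbb{R}} g_k \, d\lambda_0 = \sum_{k=1}^{n} c_k \lambda_0(A_k).$$
Let $f$ be a nonnegative function defined on $\mathbb{R}$, possibly taking the value $\infty$ at some points. We define the Lebesgue integral of $f$ to be
$$\int_{\mathbb{R}} f \, d\lambda_0 = \sup\left\{\int_{\mathbb{R}} h \, d\lambda_0;\ h \text{ is simple and } h(x) \le f(x) \text{ for every } x \in \mathbb{R}\right\}.$$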
It is possible that this integral is infinite. If it is finite, we say that f is integrable.
Finally, let $f$ be a function defined on $\mathbb{R}$, possibly taking the value $\infty$ at some points and the value $-\infty$ at other points. We define the positive and negative parts of $f$ to be
$$f^+(x) = \max\{f(x), 0\}, \qquad f^-(x) = \max\{-f(x), 0\},$$
respectively, and we define the Lebesgue integral of $f$ to be
$$\int_{\mathbb{R}} f \, d\lambda_0 = \int_{\mathbb{R}} f^+ \, d\lambda_0 - \int_{\mathbb{R}} f^- \, d\lambda_0,$$
provided the right-hand side is not of the form $\infty - \infty$. If both $\int_{\mathbb{R}} f^+ \, d\lambda_0$ and $\int_{\mathbb{R}} f^- \, d\lambda_0$ are finite (or equivalently, $\int_{\mathbb{R}} |f| \, d\lambda_0 < \infty$, since $|f| = f^+ + f^-$), we say that $f$ is integrable.
Let $f$ be a function defined on $\mathbb{R}$, possibly taking the value $\infty$ at some points and the value $-\infty$ at other points. Let $A$ be a subset of $\mathbb{R}$. We define
$$\int_A f \, d\lambda_0 = \int_{\mathbb{R}} \mathbb{1}_A f \, d\lambda_0,$$
where
$$\mathbb{1}_A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A, \end{cases}$$
is the indicator function of $A$.
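For simple functions the definitions above reduce to finite arithmetic. The sketch below makes the additional assumption, not required by Definition 1.12, that each indicated set $A_k$ is a single bounded interval, so that $\lambda_0(A_k)$ is just its length; the function names `length` and `integrate_simple` are ours.

```python
def length(interval):
    # Lebesgue measure of an interval with endpoints a <= b (0 if empty).
    a, b = interval
    return max(b - a, 0.0)

def integrate_simple(terms, over=None):
    """Integral of h = sum c_k * 1_{A_k}, each A_k an interval (a_k, b_k).

    If `over` is an interval A, integrate 1_A * h instead, which amounts
    to replacing each A_k by its intersection with A.
    """
    total = 0.0
    for c, (a, b) in terms:
        if over is not None:
            a, b = max(a, over[0]), min(b, over[1])
        total += c * length((a, b))
    return total

# h = 2 * 1_[0,1] + 3 * 1_[0.5,2]  (an illustrative simple function)
h = [(2.0, (0.0, 1.0)), (3.0, (0.5, 2.0))]
print(integrate_simple(h))                    # 6.5
print(integrate_simple(h, over=(0.0, 1.0)))   # 3.5
```

The first value is $2 \cdot 1 + 3 \cdot 1.5$, and the second is $2 \cdot 1 + 3 \cdot 0.5$, since only the part of $[0.5, 2]$ inside $[0, 1]$ contributes.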
The Lebesgue integral just defined is related to the Riemann integral in one very important way: if the Riemann integral $\int_a^b f(x)\,dx$ is defined, then the Lebesgue integral $\int_{[a,b]} f \, d\lambda_0$ agrees with the Riemann integral. The Lebesgue integral has two important advantages over the Riemann integral. The first is that the Lebesgue integral is defined for more functions, as we show in the following examples.
Example 1.5 Let $Q$ be the set of rational numbers in $[0, 1]$, and consider $f = \mathbb{1}_Q$. Being a countable set, $Q$ has Lebesgue measure zero, and so the Lebesgue integral of $f$ over $[0, 1]$ is
$$\int_{[0,1]} f \, d\lambda_0 = 0.$$
To compute the Riemann integral $\int_0^1 f(x)\,dx$, we choose partition points $0 = x_0 < x_1 < \cdots < x_n = 1$ and divide the interval $[0, 1]$ into subintervals $[x_0, x_1], [x_1, x_2], \ldots, [x_{n-1}, x_n]$. In each subinterval $[x_{k-1}, x_k]$ there is a rational point $q_k$, where $f(q_k) = 1$, and there is also an irrational point $r_k$, where $f(r_k) = 0$. We approximate the Riemann integral from above by the upper sum
$$\sum_{k=1}^{n} f(q_k)(x_k - x_{k-1}) = \sum_{k=1}^{n} 1 \cdot (x_k - x_{k-1}) = 1,$$
and we also approximate it from below by the lower sum
$$\sum_{k=1}^{n} f(r_k)(x_k - x_{k-1}) = \sum_{k=1}^{n} 0 \cdot (x_k - x_{k-1}) = 0.$$
No matter how fine we take the partition of $[0, 1]$, the upper sum is always 1 and the lower sum is always 0. Since these two do not converge to a common value as the partition becomes finer, the Riemann integral is not defined.
Example 1.6 Consider the function
$$f(x) = \begin{cases} \infty, & \text{if } x = 0, \\ 0, & \text{if } x \ne 0. \end{cases}$$
This is not a simple function because a simple function cannot take the value $\infty$. Every simple function which lies between 0 and $f$ is of the form
$$h(x) = \begin{cases} y, & \text{if } x = 0, \\ 0, & \text{if } x \ne 0, \end{cases}$$
for some $y \in [0, \infty)$, and thus has Lebesgue integral
$$\int_{\mathbb{R}} h \, d\lambda_0 = y \lambda_0\{0\} = 0.$$
It follows that
$$\int_{\mathbb{R}} f \, d\lambda_0 = \sup\left\{\int_{\mathbb{R}} h \, d\lambda_0;\ h \text{ is simple and } h(x) \le f(x) \text{ for every } x \in \mathbb{R}\right\} = 0.$$
Now consider the Riemann integral $\int_{-\infty}^{\infty} f(x)\,dx$, which for this function $f$ is the same as the Riemann integral $\int_{-1}^{1} f(x)\,dx$. When we partition $[-1, 1]$ into subintervals, one of these will contain the point 0, and when we compute the upper approximating sum for $\int_{-1}^{1} f(x)\,dx$, this point will contribute $\infty$ times the length of the subinterval containing it. Thus the upper approximating sum is $\infty$. On the other hand, the lower approximating sum is 0, and again the Riemann integral does not exist.
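The Lebesgue integral has all linearity and comparison properties one would expect of an integral. In particular, for any two functions $f$ and $g$ and any real constant $c$,
$$\int_{\mathbb{R}} (f + g) \, d\lambda_0 = \int_{\mathbb{R}} f \, d\lambda_0 + \int_{\mathbb{R}} g \, d\lambda_0,$$
$$\int_{\mathbb{R}} c f \, d\lambda_0 = c \int_{\mathbb{R}} f \, d\lambda_0,$$
and whenever $f(x) \le g(x)$ for all $x \in \mathbb{R}$, we have
$$\int_{\mathbb{R}} f \, d\lambda_0 \le \int_{\mathbb{R}} g \, d\lambda_0.$$
Finally, if $A$ and $B$ are disjoint sets, then
$$\int_{A \cup B} f \, d\lambda_0 = \int_A f \, d\lambda_0 + \int_B f \, d\lambda_0.$$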
There are three convergence theorems satisfied by the Lebesgue integral. In each of these the situation is that there is a sequence of functions $f_n$, $n = 1, 2, \ldots$ converging pointwise to a limiting function $f$. Pointwise convergence just means that
$$\lim_{n \to \infty} f_n(x) = f(x) \text{ for every } x \in \mathbb{R}.$$
There are no such theorems for the Riemann integral, because the Riemann integral of the limiting function $f$ is too often not defined. Before we state the theorems, we give two examples of pointwise convergence which arise in probability theory.
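Example 1.7 Consider a sequence of normal densities, each with variance 1 and the $n$-th having mean $n$:
$$f_n(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{(x-n)^2}{2}}.$$
These converge pointwise to the function
$$f(x) = 0 \text{ for every } x \in \mathbb{R}.$$
We have $\int_{\mathbb{R}} f_n \, d\lambda_0 = 1$ for every $n$, so $\lim_{n \to \infty} \int_{\mathbb{R}} f_n \, d\lambda_0 = 1$, but $\int_{\mathbb{R}} f \, d\lambda_0 = 0$.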
Example 1.8 Consider a sequence of normal densities, each with mean 0 and the $n$-th having variance $\frac1n$:
$$f_n(x) = \sqrt{\frac{n}{2\pi}}\, e^{-\frac{n x^2}{2}}.$$
These converge pointwise to the function
$$f(x) = \begin{cases} \infty, & \text{if } x = 0, \\ 0, & \text{if } x \ne 0. \end{cases}$$
We have again $\int_{\mathbb{R}} f_n \, d\lambda_0 = 1$ for every $n$, so $\lim_{n \to \infty} \int_{\mathbb{R}} f_n \, d\lambda_0 = 1$, but $\int_{\mathbb{R}} f \, d\lambda_0 = 0$. The function $f$ is not the Dirac delta; the Lebesgue integral of this function was already seen in Example 1.6 to be zero.
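The behaviour in Examples 1.7 and 1.8 can be observed numerically. The sketch below approximates the integrals by a midpoint rule on a wide finite window, which is an approximation we introduce for illustration (the tails outside the window are negligible for the parameters used); the helper names are ours.

```python
import math

def normal_density(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def midpoint_integral(f, a, b, steps=20000):
    # Midpoint-rule approximation of the integral of f over [a, b].
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

results = []
for n in (1, 5, 25):
    shifted = lambda x: normal_density(x, n, 1.0)        # Example 1.7
    narrow = lambda x: normal_density(x, 0.0, 1.0 / n)   # Example 1.8
    results.append({
        "n": n,
        "integral_shifted": midpoint_integral(shifted, n - 10, n + 10),
        "integral_narrow": midpoint_integral(narrow, -10, 10),
        "shifted_at_1": shifted(1.0),   # tends to 0 as n grows
        "narrow_at_1": narrow(1.0),     # tends to 0 as n grows
        "narrow_at_0": narrow(0.0),     # grows without bound
    })

for row in results:
    print(row)
```

In both families the integrals stay near 1 while the values at any fixed $x \ne 0$ shrink toward 0, so the integral of the pointwise limit is strictly smaller than the limit of the integrals.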
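Theorem 3.1 (Fatou’s Lemma) Let $f_n$, $n = 1, 2, \ldots$ be a sequence of nonnegative functions converging pointwise to a function $f$. Then
$$\int_{\mathbb{R}} f \, d\lambda_0 \le \liminf_{n \to \infty} \int_{\mathbb{R}} f_n \, d\lambda_0.$$
If $\lim_{n \to \infty} \int_{\mathbb{R}} f_n \, d\lambda_0$ is defined, then Fatou’s Lemma has the simpler conclusion
$$\int_{\mathbb{R}} f \, d\lambda_0 \le \lim_{n \to \infty} \int_{\mathbb{R}} f_n \, d\lambda_0.$$
This is the case in Examples 1.7 and 1.8, where
$$\lim_{n \to \infty} \int_{\mathbb{R}} f_n \, d\lambda_0 = 1,$$
while $\int_{\mathbb{R}} f \, d\lambda_0 = 0$. We could modify either Example 1.7 or 1.8 by setting $g_n = f_n$ if $n$ is even, but $g_n = 2 f_n$ if $n$ is odd. Now $\int_{\mathbb{R}} g_n \, d\lambda_0 = 1$ if $n$ is even, but $\int_{\mathbb{R}} g_n \, d\lambda_0 = 2$ if $n$ is odd. The sequence $\left\{\int_{\mathbb{R}} g_n \, d\lambda_0\right\}_{n=1}^{\infty}$ has two cluster points, 1 and 2. By definition, the smaller one, 1, is $\liminf_{n \to \infty} \int_{\mathbb{R}} g_n \, d\lambda_0$ and the larger one, 2, is $\limsup_{n \to \infty} \int_{\mathbb{R}} g_n \, d\lambda_0$. Fatou’s Lemma guarantees that even the smaller cluster point will be greater than or equal to the integral of the limiting function.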
The key assumption in Fatou’s Lemma is that all the functions take only nonnegative values. Fatou’s Lemma does not assume much, but it is not very satisfying because it does not conclude that
$$\int_{\mathbb{R}} f \, d\lambda_0 = \lim_{n \to \infty} \int_{\mathbb{R}} f_n \, d\lambda_0.$$
There are two sets of assumptions which permit this stronger conclusion.
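Theorem 3.2 (Monotone Convergence Theorem) Let $f_n$, $n = 1, 2, \ldots$ be a sequence of functions converging pointwise to a function $f$. Assume that
$$0 \le f_1(x) \le f_2(x) \le f_3(x) \le \cdots \text{ for every } x \in \mathbb{R}.$$
Then
$$\int_{\mathbb{R}} f \, d\lambda_0 = \lim_{n \to \infty} \int_{\mathbb{R}} f_n \, d\lambda_0,$$
where both sides are allowed to be $\infty$.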
Theorem 3.3 (Dominated Convergence Theorem) Let $f_n$, $n = 1, 2, \ldots$ be a sequence of functions, which may take either positive or negative values, converging pointwise to a function $f$. Assume that there is a nonnegative integrable function $g$ (i.e., $\int_{\mathbb{R}} g \, d\lambda_0 < \infty$) such that
$$|f_n(x)| \le g(x) \text{ for every } x \in \mathbb{R} \text{ for every } n.$$
Then
$$\int_{\mathbb{R}} f \, d\lambda_0 = \lim_{n \to \infty} \int_{\mathbb{R}} f_n \, d\lambda_0,$$
and both sides will be finite.
1.4
General Probability Spaces
Definition 1.13 A probability space $(\Omega, \mathcal{F}, \mathbb{P})$ consists of three objects:
(i) $\Omega$, a nonempty set, called the sample space, which contains all possible outcomes of some random experiment;
(ii) $\mathcal{F}$, a $\sigma$-algebra of subsets of $\Omega$;
(iii) $\mathbb{P}$, a probability measure on $(\Omega, \mathcal{F})$, i.e., a function which assigns to each set $A \in \mathcal{F}$ a number $\mathbb{P}(A) \in [0, 1]$, which represents the probability that the outcome of the random experiment lies in the set $A$.
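Remark 1.1 We recall from Homework Problem 1.4 that a probability measure $\mathbb{P}$ has the following properties:
(a) $\mathbb{P}(\emptyset) = 0$.
(b) (Countable additivity) If $A_1, A_2, \ldots$ is a sequence of disjoint sets in $\mathcal{F}$, then
$$\mathbb{P}\left(\bigcup_{k=1}^{\infty} A_k\right) = \sum_{k=1}^{\infty} \mathbb{P}(A_k).$$
(c) (Finite additivity) If $n$ is a positive integer and $A_1, \ldots, A_n$ are disjoint sets in $\mathcal{F}$, then
$$\mathbb{P}(A_1 \cup \cdots \cup A_n) = \mathbb{P}(A_1) + \cdots + \mathbb{P}(A_n).$$
(d) If $A$ and $B$ are sets in $\mathcal{F}$ and $A \subset B$, then
$$\mathbb{P}(B) = \mathbb{P}(A) + \mathbb{P}(B \setminus A).$$
In particular,
$$\mathbb{P}(B) \ge \mathbb{P}(A).$$
(e) (Continuity from below.) If $A_1, A_2, \ldots$ is a sequence of sets in $\mathcal{F}$ with $A_1 \subset A_2 \subset \cdots$, then
$$\mathbb{P}\left(\bigcup_{k=1}^{\infty} A_k\right) = \lim_{n \to \infty} \mathbb{P}(A_n).$$
(f) (Continuity from above.) If $A_1, A_2, \ldots$ is a sequence of sets in $\mathcal{F}$ with $A_1 \supset A_2 \supset \cdots$, then
$$\mathbb{P}\left(\bigcap_{k=1}^{\infty} A_k\right) = \lim_{n \to \infty} \mathbb{P}(A_n).$$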
We have already seen some examples of finite probability spaces. We repeat these and give some
examples of infinite probability spaces as well.
Example 1.9 Finite coin toss space.
Toss a coin $n$ times, so that $\Omega$ is the set of all sequences of $H$ and $T$ which have $n$ components. We will use this space quite a bit, and so give it a name: $\Omega_n$. Let $\mathcal{F}$ be the collection of all subsets of $\Omega_n$. Suppose the probability of $H$ on each toss is $p$, a number between zero and one. Then the probability of $T$ is $q = 1 - p$. For each $\omega = (\omega_1, \omega_2, \ldots, \omega_n)$ in $\Omega_n$, we define
$$\mathbb{P}\{\omega\} = p^{\text{Number of } H \text{ in } \omega} \cdot q^{\text{Number of } T \text{ in } \omega}.$$
For each $A \in \mathcal{F}$, we define
$$\mathbb{P}(A) = \sum_{\omega \in A} \mathbb{P}\{\omega\}. \tag{4.1}$$
We can define $\mathbb{P}(A)$ this way because $A$ has only finitely many elements, and so only finitely many terms appear in the sum on the right-hand side of (4.1).
Example 1.10 Infinite coin toss space.
Toss a coin repeatedly without stopping, so that $\Omega$ is the set of all nonterminating sequences of $H$ and $T$. We call this space $\Omega_\infty$. This is an uncountably infinite space, and we need to exercise some care in the construction of the $\sigma$-algebra we will use here.
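For each positive integer $n$, we define $\mathcal{F}_n$ to be the $\sigma$-algebra determined by the first $n$ tosses. For example, $\mathcal{F}_2$ contains four basic sets,
$$\begin{aligned}
A_{HH} &= \{\omega = (\omega_1, \omega_2, \omega_3, \ldots);\ \omega_1 = H, \omega_2 = H\} \\
&= \text{The set of all sequences which begin with } HH, \\
A_{HT} &= \{\omega = (\omega_1, \omega_2, \omega_3, \ldots);\ \omega_1 = H, \omega_2 = T\} \\
&= \text{The set of all sequences which begin with } HT, \\
A_{TH} &= \{\omega = (\omega_1, \omega_2, \omega_3, \ldots);\ \omega_1 = T, \omega_2 = H\} \\
&= \text{The set of all sequences which begin with } TH, \\
A_{TT} &= \{\omega = (\omega_1, \omega_2, \omega_3, \ldots);\ \omega_1 = T, \omega_2 = T\} \\
&= \text{The set of all sequences which begin with } TT.
\end{aligned}$$
Because $\mathcal{F}_2$ is a $\sigma$-algebra, we must also put into it the sets $\emptyset$, $\Omega$, and all unions of the four basic sets.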
In the $\sigma$-algebra $\mathcal{F}$, we put every set in every $\sigma$-algebra $\mathcal{F}_n$, where $n$ ranges over the positive integers. We also put in every other set which is required to make $\mathcal{F}$ be a $\sigma$-algebra. For example, the set containing the single sequence
$$\{HHHHH\cdots\} = \{H \text{ on every toss}\}$$
is not in any of the $\mathcal{F}_n$ $\sigma$-algebras, because it depends on all the components of the sequence and not just the first $n$ components. However, for each positive integer $n$, the set
$$\{H \text{ on the first } n \text{ tosses}\}$$
is in $\mathcal{F}_n$ and hence in $\mathcal{F}$. Therefore,
$$\{H \text{ on every toss}\} = \bigcap_{n=1}^{\infty} \{H \text{ on the first } n \text{ tosses}\}$$
is also in $\mathcal{F}$.
We next construct the probability measure $\mathbb{P}$ on $(\Omega_\infty, \mathcal{F})$ which corresponds to probability $p \in [0, 1]$ for $H$ and probability $q = 1 - p$ for $T$. Let $A \in \mathcal{F}$ be given. If there is a positive integer $n$ such that $A \in \mathcal{F}_n$, then the description of $A$ depends on only the first $n$ tosses, and it is clear how to define $\mathbb{P}(A)$. For example, suppose $A = A_{HH} \cup A_{TH}$, where these sets were defined earlier. Then $A$ is in $\mathcal{F}_2$. We set $\mathbb{P}(A_{HH}) = p^2$ and $\mathbb{P}(A_{TH}) = qp$, and then we have
$$\mathbb{P}(A) = \mathbb{P}(A_{HH} \cup A_{TH}) = p^2 + qp = (p + q)p = p.$$
In other words, the probability of an $H$ on the second toss is $p$.
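Since the event "$H$ on the second toss" depends on only finitely many tosses, its probability can be checked on a finite coin toss space. The sketch below uses five tosses and $p = \frac27$, both chosen arbitrarily for illustration; the names `coin_space` and `prob` are ours.

```python
from fractions import Fraction
from itertools import product

def coin_space(n, p):
    """Map each sequence of n tosses to p^(#H) * q^(#T), with q = 1 - p."""
    q = 1 - p
    return {"".join(w): p ** w.count("H") * q ** w.count("T")
            for w in product("HT", repeat=n)}

def prob(space, event):
    # Add the probabilities of the outcomes for which event(w) is true.
    return sum((pr for w, pr in space.items() if event(w)), Fraction(0))

p = Fraction(2, 7)
space = coin_space(5, p)
second_is_H = prob(space, lambda w: w[1] == "H")
first_two_HH = prob(space, lambda w: w[:2] == "HH")
print(second_is_H, first_two_HH)  # 2/7 4/49
```

The later tosses sum out because $p + q = 1$, which is the same cancellation as in the computation $p^2 + qp = (p+q)p = p$.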
Let us now consider a set $A \in \mathcal{F}$ for which there is no positive integer $n$ such that $A \in \mathcal{F}_n$. Such is the case for the set $\{H \text{ on every toss}\}$. To determine the probability of these sets, we write them in terms of sets which are in $\mathcal{F}_n$ for positive integers $n$, and then use the properties of probability measures listed in Remark 1.1. For example,
$$\{H \text{ on the first toss}\} \supset \{H \text{ on the first two tosses}\} \supset \{H \text{ on the first three tosses}\} \supset \cdots,$$
and
$$\bigcap_{n=1}^{\infty} \{H \text{ on the first } n \text{ tosses}\} = \{H \text{ on every toss}\}.$$
According to Remark 1.1(d) (continuity from above),
$$\mathbb{P}\{H \text{ on every toss}\} = \lim_{n \to \infty} \mathbb{P}\{H \text{ on the first } n \text{ tosses}\} = \lim_{n \to \infty} p^n.$$
If $p = 1$, then $\mathbb{P}\{H \text{ on every toss}\} = 1$; otherwise, $\mathbb{P}\{H \text{ on every toss}\} = 0$.
A similar argument shows that if $0 < p < 1$, so that $0 < q < 1$, then every set in $\Omega_\infty$ which contains only one element (a nonterminating sequence of $H$ and $T$) has probability zero, and hence every set which contains countably many elements also has probability zero. We are in a case very similar to Lebesgue measure: every point has measure zero, but sets can have positive measure. Of course, the only sets which can have positive probability in $\Omega_\infty$ are those which contain uncountably many elements.
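In the infinite coin toss space, we define a sequence of random variables $Y_1, Y_2, \ldots$ by
$$Y_k(\omega) = \begin{cases} 1 & \text{if } \omega_k = H, \\ 0 & \text{if } \omega_k = T, \end{cases}$$
and we also define the random variable
$$X(\omega) = \sum_{k=1}^{\infty} \frac{Y_k(\omega)}{2^k}.$$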
Since each $Y_k$ is either zero or one, $X$ takes values in the interval $[0,1]$. Indeed, $X(TTTT\cdots) = 0$, $X(HHHH\cdots) = 1$ and the other values of $X$ lie in between. We define a "dyadic rational number" to be a number of the form $\frac{m}{2^k}$, where $k$ and $m$ are integers. For example, $\frac{3}{4}$ is a dyadic rational. Every dyadic rational in $(0,1)$ corresponds to two sequences $\omega \in \Omega_\infty$. For example,
$$X(HHTTTTT\cdots) = X(HTHHHHH\cdots) = \tfrac{3}{4}.$$
The numbers in $(0,1)$ which are not dyadic rationals correspond to a single $\omega \in \Omega_\infty$; these numbers have a unique binary expansion.
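The two expansions of $\frac{3}{4}$ can be checked with exact rational arithmetic. The following is only a sketch: the helper name `x_partial` is an illustrative assumption, and it evaluates the partial sum of $X$ over a finite prefix of tosses rather than the full infinite series.

```python
from fractions import Fraction

def x_partial(tosses):
    """Partial sum of X over a finite prefix of tosses, given as a string of 'H'/'T'."""
    return sum(Fraction(1, 2 ** k) for k, t in enumerate(tosses, start=1) if t == "H")

# HHTTTT... : the tail of T's contributes nothing, so X equals 3/4 exactly.
print(x_partial("HH"))

# HTHHHH... : the partial sums approach 3/4 from below; the gap after n tosses is 1/2**n.
for n in (4, 8, 16):
    prefix = "HT" + "H" * (n - 2)
    print(n, Fraction(3, 4) - x_partial(prefix))
```

The gap shrinks geometrically, which is why the sequence with an infinite tail of $H$'s and the sequence with an infinite tail of $T$'s are mapped to the same dyadic rational.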
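Whenever we place a probability measure $\mathbb{P}$ on $(\Omega, \mathcal{F})$, we have a corresponding induced measure $\mathcal{L}_X$ on $[0,1]$. For example, if we set $p = q = \frac{1}{2}$ in the construction of this example, then we have
$$\mathcal{L}_X\left[0, \tfrac{1}{2}\right] = \mathbb{P}\{\text{First toss is } T\} = \tfrac{1}{2},$$
$$\mathcal{L}_X\left[\tfrac{1}{2}, 1\right] = \mathbb{P}\{\text{First toss is } H\} = \tfrac{1}{2},$$
$$\mathcal{L}_X\left[0, \tfrac{1}{4}\right] = \mathbb{P}\{\text{First two tosses are } TT\} = \tfrac{1}{4},$$
$$\mathcal{L}_X\left[\tfrac{1}{4}, \tfrac{1}{2}\right] = \mathbb{P}\{\text{First two tosses are } TH\} = \tfrac{1}{4},$$
$$\mathcal{L}_X\left[\tfrac{1}{2}, \tfrac{3}{4}\right] = \mathbb{P}\{\text{First two tosses are } HT\} = \tfrac{1}{4},$$
$$\mathcal{L}_X\left[\tfrac{3}{4}, 1\right] = \mathbb{P}\{\text{First two tosses are } HH\} = \tfrac{1}{4}.$$
Continuing this process, we can verify that for any positive integers $k$ and $m$ satisfying
$$0 \le \frac{m-1}{2^k} < \frac{m}{2^k} \le 1,$$
we have
$$\mathcal{L}_X\left[\frac{m-1}{2^k}, \frac{m}{2^k}\right] = \frac{1}{2^k}.$$
In other words, the $\mathcal{L}_X$-measure of all intervals in $[0,1]$ whose endpoints are dyadic rationals is the same as the Lebesgue measure of these intervals. The only way this can be is for $\mathcal{L}_X$ to be Lebesgue measure.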
It is interesting to consider what $\mathcal{L}_X$ would look like if we take a value of $p$ other than $\frac{1}{2}$ when we construct the probability measure $\mathbb{P}$ on $\Omega$.
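One way to explore the induced measure is to enumerate the first $k$ tosses and record the mass assigned to each dyadic interval of length $2^{-k}$. The sketch below does this with exact fractions; the function name `dyadic_masses` and the choice $p = \frac{1}{3}$ for the biased case are illustrative assumptions. Endpoints shared by neighbouring intervals are ignored, which is harmless when $0 < p < 1$ because single points then have probability zero.

```python
from fractions import Fraction
from itertools import product

def dyadic_masses(k, p):
    """Map m -> mass of the interval [(m-1)/2^k, m/2^k], m = 1..2^k, under the induced measure."""
    q = 1 - p
    masses = {}
    for tosses in product("TH", repeat=k):
        # The first k tosses fix the first k binary digits of X.
        left = sum(Fraction(1, 2 ** j) for j, t in enumerate(tosses, start=1) if t == "H")
        m = int(left * 2 ** k) + 1
        heads = tosses.count("H")
        masses[m] = p ** heads * q ** (k - heads)
    return masses

fair = dyadic_masses(3, Fraction(1, 2))
print(all(v == Fraction(1, 8) for v in fair.values()))  # every interval gets its length

biased = dyadic_masses(2, Fraction(1, 3))
print(biased)  # masses are no longer proportional to interval length
```

With $p = \frac{1}{2}$ every dyadic interval receives its Lebesgue measure; with a biased coin the four quarter-intervals receive unequal masses, so the induced measure is no longer Lebesgue measure.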
We conclude this example with another look at the Cantor set of Example 3.2. Let $\Omega_{pairs}$ be the subset of $\Omega$ in which every even-numbered toss is the same as the odd-numbered toss immediately preceding it. For example, $HHTTTTHH$ is the beginning of a sequence in $\Omega_{pairs}$, but $HT$ is not. Consider now the set of real numbers
$$C' = \{X(\omega);\ \omega \in \Omega_{pairs}\}.$$
The numbers between $\left(\frac{1}{4}, \frac{1}{2}\right)$ can be written as $X(\omega)$, but the sequence $\omega$ must begin with either $TH$ or $HT$. Therefore, none of these numbers is in $C'$. Similarly, the numbers between $\left(\frac{1}{16}, \frac{3}{16}\right)$ can be written as $X(\omega)$, but the sequence $\omega$ must begin with $TTTH$ or $TTHT$, so none of these numbers is in $C'$. Continuing this process, we see that $C'$ will not contain any of the numbers which were removed in the construction of the Cantor set $C$ in Example 3.2. In other words, $C' \subset C$. With a bit more work, one can convince oneself that in fact $C' = C$, i.e., by requiring consecutive coin tosses to be paired, we are removing exactly those points in $[0,1]$ which were removed in the Cantor set construction of Example 3.2.
In addition to tossing a coin, another common random experiment is to pick a number, perhaps
using a random number generator. Here are some probability spaces which correspond to different
ways of picking a number at random.
Example 1.11
Suppose we choose a number from $\mathbb{R}$ in such a way that we are sure to get either 1, 4 or 16. Furthermore, we construct the experiment so that the probability of getting 1 is $\frac{4}{9}$, the probability of getting 4 is $\frac{4}{9}$ and the probability of getting 16 is $\frac{1}{9}$. We describe this random experiment by taking $\Omega$ to be $\mathbb{R}$, $\mathcal{F}$ to be $\mathcal{B}(\mathbb{R})$, and setting up the probability measure so that
$$\mathbb{P}\{1\} = \tfrac{4}{9}, \quad \mathbb{P}\{4\} = \tfrac{4}{9}, \quad \mathbb{P}\{16\} = \tfrac{1}{9}.$$
This determines $\mathbb{P}(A)$ for every set $A \in \mathcal{B}(\mathbb{R})$. For example, the probability of the interval $(0, 5]$ is $\frac{8}{9}$, because this interval contains the numbers 1 and 4, but not the number 16.
The probability measure described in this example is $\mathcal{L}_{S_2}$, the measure induced by the stock price $S_2$, when the initial stock price $S_0 = 4$ and the probability of $H$ is $\frac{1}{3}$. This distribution was discussed immediately following Definition 2.8.
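The numbers in this example can be reproduced by enumerating two coin tosses. This is a sketch under stated assumptions: the up and down factors $u = 2$ and $d = \frac{1}{2}$ are not given in the example itself and are chosen here because they yield the values 1, 4 and 16; the names `law` and `prob_of_interval` are illustrative.

```python
from fractions import Fraction
from itertools import product

S0, u, d = 4, 2, Fraction(1, 2)   # u and d are assumed values, not stated in the example
p = Fraction(1, 3)                # probability of H, as in the example
q = 1 - p

# Build the law of S_2 by summing the probabilities of the two-toss outcomes.
law = {}
for w in product("HT", repeat=2):
    heads, tails = w.count("H"), w.count("T")
    s2 = S0 * u ** heads * d ** tails
    law[s2] = law.get(s2, 0) + p ** heads * q ** tails

def prob_of_interval(lo, hi):
    """Mass that the induced measure puts on the half-open interval (lo, hi]."""
    return sum(pr for x, pr in law.items() if lo < x <= hi)

print(law)
print(prob_of_interval(0, 5))
```

The interval probability is obtained simply by adding the point masses that fall inside the interval, which is all that "this determines the probability of every Borel set" requires for a discrete distribution.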
Example 1.12 Uniform distribution on $[0,1]$.
Let $\Omega = [0,1]$ and let $\mathcal{F} = \mathcal{B}([0,1])$, the collection of all Borel subsets contained in $[0,1]$. For each Borel set $A \subset [0,1]$, we define $\mathbb{P}(A) = \mu_0(A)$ to be the Lebesgue measure of the set. Because $\mu_0[0,1] = 1$, this gives us a probability measure.
This probability space corresponds to the random experiment of choosing a number from $[0,1]$ so that every number is "equally likely" to be chosen. Since there are infinitely many numbers in $[0,1]$, this requires that every number have probability zero of being chosen. Nonetheless, we can speak of the probability that the number chosen lies in a particular set, and if the set has uncountably many points, then this probability can be positive.
I know of no way to design a physical experiment which corresponds to choosing a number at random from $[0,1]$ so that each number is equally likely to be chosen, just as I know of no way to toss a coin infinitely many times. Nonetheless, both Examples 1.10 and 1.12 provide probability spaces which are often useful approximations to reality.
Example 1.13 Standard normal distribution.
Define the standard normal density
$$\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}.$$
Let $\Omega = \mathbb{R}$, $\mathcal{F} = \mathcal{B}(\mathbb{R})$ and for every Borel set $A \subset \mathbb{R}$, define
$$\mathbb{P}(A) = \int_A \varphi \, d\mu_0. \qquad (4.2)$$
If $A$ in (4.2) is an interval $[a,b]$, then we can write (4.2) as the less mysterious Riemann integral:
$$\mathbb{P}[a,b] = \int_a^b \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}} \, dx.$$
This corresponds to choosing a point at random on the real line, and every single point has probability zero of being chosen, but if a set $A$ is given, then the probability the point is in that set is given by (4.2).
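For an interval, the probability in (4.2) can be evaluated numerically. The sketch below compares a midpoint Riemann sum of the density with the closed form based on the error function from Python's standard `math` module; the function names are illustrative assumptions.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def normal_prob(a, b):
    """P[a, b] via the normal cumulative distribution function, written with math.erf."""
    cdf = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))
    return cdf(b) - cdf(a)

def riemann(a, b, n=100_000):
    """Midpoint Riemann sum of the density over [a, b]."""
    h = (b - a) / n
    return h * sum(phi(a + (i + 0.5) * h) for i in range(n))

print(normal_prob(-1, 1))
print(riemann(-1, 1))
```

Both numbers agree to many digits, illustrating that on intervals the abstract integral with respect to Lebesgue measure reduces to an ordinary Riemann integral.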
The construction of the integral in a general probability space follows the same steps as the construction of the Lebesgue integral. We repeat this construction below.
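Definition 1.14 Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, and let $X$ be a random variable on this space, i.e., a mapping from $\Omega$ to $\mathbb{R}$, possibly also taking the values $\pm\infty$.
If $X$ is an indicator, i.e.,
$$X(\omega) = \mathbb{I}_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A, \\ 0 & \text{if } \omega \in A^c, \end{cases}$$
for some set $A \in \mathcal{F}$, we define
$$\int_\Omega X \, d\mathbb{P} = \mathbb{P}(A).$$
If $X$ is a simple function, i.e.,
$$X(\omega) = \sum_{k=1}^{n} c_k \mathbb{I}_{A_k}(\omega),$$
where each $c_k$ is a real number and each $A_k$ is a set in $\mathcal{F}$, we define
$$\int_\Omega X \, d\mathbb{P} = \sum_{k=1}^{n} c_k \int_\Omega \mathbb{I}_{A_k} \, d\mathbb{P} = \sum_{k=1}^{n} c_k \mathbb{P}(A_k).$$
If $X$ is nonnegative but otherwise general, we define
$$\int_\Omega X \, d\mathbb{P} = \sup\left\{ \int_\Omega Y \, d\mathbb{P};\ Y \text{ is simple and } Y(\omega) \le X(\omega) \text{ for every } \omega \in \Omega \right\}.$$
In fact, we can always construct a sequence of simple functions $Y_n$, $n = 1, 2, \ldots$ such that
$$0 \le Y_1(\omega) \le Y_2(\omega) \le Y_3(\omega) \le \cdots \text{ for every } \omega \in \Omega,$$
and $X(\omega) = \lim_{n \to \infty} Y_n(\omega)$ for every $\omega \in \Omega$. With this sequence, we can define
$$\int_\Omega X \, d\mathbb{P} = \lim_{n \to \infty} \int_\Omega Y_n \, d\mathbb{P}.$$
If $X$ is integrable, i.e.,
$$\int_\Omega X^+ \, d\mathbb{P} < \infty, \quad \int_\Omega X^- \, d\mathbb{P} < \infty,$$
where
$$X^+(\omega) = \max\{X(\omega), 0\}, \quad X^-(\omega) = \max\{-X(\omega), 0\},$$
then we define
$$\int_\Omega X \, d\mathbb{P} = \int_\Omega X^+ \, d\mathbb{P} - \int_\Omega X^- \, d\mathbb{P}.$$
If $A$ is a set in $\mathcal{F}$ and $X$ is a random variable, we define
$$\int_A X \, d\mathbb{P} = \int_\Omega \mathbb{I}_A \cdot X \, d\mathbb{P}.$$
The expectation of a random variable $X$ is defined to be
$$\mathbb{E}X = \int_\Omega X \, d\mathbb{P}.$$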
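The above integral has all the linearity and comparison properties one would expect. In particular, if $X$ and $Y$ are random variables and $c$ is a real constant, then
$$\int_\Omega (X + Y) \, d\mathbb{P} = \int_\Omega X \, d\mathbb{P} + \int_\Omega Y \, d\mathbb{P},$$
$$\int_\Omega cX \, d\mathbb{P} = c \int_\Omega X \, d\mathbb{P}.$$
If $X(\omega) \le Y(\omega)$ for every $\omega \in \Omega$, then
$$\int_\Omega X \, d\mathbb{P} \le \int_\Omega Y \, d\mathbb{P}.$$
In fact, we don't need to have $X(\omega) \le Y(\omega)$ for every $\omega \in \Omega$ in order to reach this conclusion; it is enough if the set of $\omega$ for which $X(\omega) \le Y(\omega)$ has probability one. When a condition holds with probability one, we say it holds almost surely. Finally, if $A$ and $B$ are disjoint subsets of $\Omega$ and $X$ is a random variable, then
$$\int_{A \cup B} X \, d\mathbb{P} = \int_A X \, d\mathbb{P} + \int_B X \, d\mathbb{P}.$$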
We restate the Lebesgue integral convergence theorems in this more general context. We acknowledge in these statements that conditions don't need to hold for every $\omega$; almost surely is enough.
Theorem 4.4 (Fatou's Lemma) Let $X_n$, $n = 1, 2, \ldots$ be a sequence of almost surely nonnegative random variables converging almost surely to a random variable $X$. Then
$$\int_\Omega X \, d\mathbb{P} \le \liminf_{n \to \infty} \int_\Omega X_n \, d\mathbb{P},$$
or equivalently,
$$\mathbb{E}X \le \liminf_{n \to \infty} \mathbb{E}X_n.$$
Theorem 4.5 (Monotone Convergence Theorem) Let $X_n$, $n = 1, 2, \ldots$ be a sequence of random variables converging almost surely to a random variable $X$. Assume that
$$0 \le X_1 \le X_2 \le X_3 \le \cdots \text{ almost surely.}$$
Then
$$\int_\Omega X \, d\mathbb{P} = \lim_{n \to \infty} \int_\Omega X_n \, d\mathbb{P},$$
or equivalently,
$$\mathbb{E}X = \lim_{n \to \infty} \mathbb{E}X_n.$$
Theorem 4.6 (Dominated Convergence Theorem) Let $X_n$, $n = 1, 2, \ldots$ be a sequence of random variables, converging almost surely to a random variable $X$. Assume that there exists a random variable $Y$ with $\mathbb{E}Y < \infty$ such that
$$|X_n| \le Y \text{ almost surely for every } n.$$
Then
$$\int_\Omega X \, d\mathbb{P} = \lim_{n \to \infty} \int_\Omega X_n \, d\mathbb{P},$$
or equivalently,
$$\mathbb{E}X = \lim_{n \to \infty} \mathbb{E}X_n.$$
In Example 1.13, we constructed a probability measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ by integrating the standard normal density. In fact, whenever $\varphi$ is a nonnegative function defined on $\mathbb{R}$ satisfying $\int_{\mathbb{R}} \varphi \, d\mu_0 = 1$, we call $\varphi$ a density and we can define an associated probability measure by
$$\mathbb{P}(A) = \int_A \varphi \, d\mu_0 \text{ for every } A \in \mathcal{B}(\mathbb{R}). \qquad (4.3)$$
We shall often have a situation in which two measures are related by an equation like (4.3). In fact, the market measure and the risk-neutral measures in financial markets are related this way. We say that $\varphi$ in (4.3) is the Radon-Nikodym derivative of $d\mathbb{P}$ with respect to $d\mu_0$, and we write
$$\varphi = \frac{d\mathbb{P}}{d\mu_0}. \qquad (4.4)$$
The probability measure $\mathbb{P}$ weights different parts of the real line according to the density $\varphi$. Now suppose $f$ is a function on $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \mathbb{P})$. Definition 1.14 gives us a value for the abstract integral
$$\int_{\mathbb{R}} f \, d\mathbb{P}.$$
We can also evaluate
$$\int_{\mathbb{R}} f \varphi \, d\mu_0,$$
which is an integral with respect to Lebesgue measure over the real line. We want to show that
$$\int_{\mathbb{R}} f \, d\mathbb{P} = \int_{\mathbb{R}} f \varphi \, d\mu_0, \qquad (4.5)$$
an equation which is suggested by the notation introduced in (4.4) (substitute $\frac{d\mathbb{P}}{d\mu_0}$ for $\varphi$ in (4.5) and "cancel" the $d\mu_0$). We include a proof of this because it allows us to illustrate the concept of the standard machine explained in Williams's book in Section 5.12, page 5.
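The standard machine argument proceeds in four steps.
Step 1. Assume that $f$ is an indicator function, i.e., $f(x) = \mathbb{I}_A(x)$ for some Borel set $A \subset \mathbb{R}$. In that case, (4.5) becomes
$$\mathbb{P}(A) = \int_A \varphi \, d\mu_0.$$
This is true because it is the definition of $\mathbb{P}(A)$.
Step 2. Now that we know that (4.5) holds when $f$ is an indicator function, assume that $f$ is a simple function, i.e., a linear combination of indicator functions. In other words,
$$f(x) = \sum_{k=1}^{n} c_k h_k(x),$$
where each $c_k$ is a real number and each $h_k$ is an indicator function. Then
$$\int_{\mathbb{R}} f \, d\mathbb{P} = \int_{\mathbb{R}} \left[ \sum_{k=1}^{n} c_k h_k \right] d\mathbb{P} = \sum_{k=1}^{n} c_k \int_{\mathbb{R}} h_k \, d\mathbb{P} = \sum_{k=1}^{n} c_k \int_{\mathbb{R}} h_k \varphi \, d\mu_0 = \int_{\mathbb{R}} \left[ \sum_{k=1}^{n} c_k h_k \right] \varphi \, d\mu_0 = \int_{\mathbb{R}} f \varphi \, d\mu_0.$$
Step 3. Now that we know that (4.5) holds when $f$ is a simple function, we consider a general nonnegative function $f$. We can always construct a sequence of nonnegative simple functions $f_n$, $n = 1, 2, \ldots$ such that
$$0 \le f_1(x) \le f_2(x) \le f_3(x) \le \cdots \text{ for every } x \in \mathbb{R},$$
and $f(x) = \lim_{n \to \infty} f_n(x)$ for every $x \in \mathbb{R}$. We have already proved that
$$\int_{\mathbb{R}} f_n \, d\mathbb{P} = \int_{\mathbb{R}} f_n \varphi \, d\mu_0 \text{ for every } n.$$
We let $n \to \infty$ and use the Monotone Convergence Theorem on both sides of this equality to get
$$\int_{\mathbb{R}} f \, d\mathbb{P} = \int_{\mathbb{R}} f \varphi \, d\mu_0.$$
Step 4. In the last step, we consider an integrable function $f$, which can take both positive and negative values. By integrable, we mean that
$$\int_{\mathbb{R}} f^+ \, d\mathbb{P} < \infty, \quad \int_{\mathbb{R}} f^- \, d\mathbb{P} < \infty.$$
From Step 3, we have
$$\int_{\mathbb{R}} f^+ \, d\mathbb{P} = \int_{\mathbb{R}} f^+ \varphi \, d\mu_0,$$
$$\int_{\mathbb{R}} f^- \, d\mathbb{P} = \int_{\mathbb{R}} f^- \varphi \, d\mu_0.$$
Subtracting these two equations, we obtain the desired result:
$$\int_{\mathbb{R}} f \, d\mathbb{P} = \int_{\mathbb{R}} f^+ \, d\mathbb{P} - \int_{\mathbb{R}} f^- \, d\mathbb{P} = \int_{\mathbb{R}} f^+ \varphi \, d\mu_0 - \int_{\mathbb{R}} f^- \varphi \, d\mu_0 = \int_{\mathbb{R}} f \varphi \, d\mu_0.$$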
1.5 Independence
In this section, we define and discuss the notion of independence in a general probability space $(\Omega, \mathcal{F}, \mathbb{P})$, although most of the examples we give will be for coin toss space.
1.5.1 Independence of sets
Definition 1.15 We say that two sets $A \in \mathcal{F}$ and $B \in \mathcal{F}$ are independent if
$$\mathbb{P}(A \cap B) = \mathbb{P}(A)\mathbb{P}(B).$$
Suppose a random experiment is conducted, and $\omega$ is the outcome. The probability that $\omega \in A$ is $\mathbb{P}(A)$. Suppose you are not told $\omega$, but you are told that $\omega \in B$. Conditional on this information, the probability that $\omega \in A$ is
$$\mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}.$$
The sets $A$ and $B$ are independent if and only if this conditional probability is the unconditional probability $\mathbb{P}(A)$, i.e., knowing that $\omega \in B$ does not change the probability you assign to $A$. This discussion is symmetric with respect to $A$ and $B$; if $A$ and $B$ are independent and you know that $\omega \in A$, the conditional probability you assign to $B$ is still the unconditional probability $\mathbb{P}(B)$.
Whether two sets are independent depends on the probability measure $\mathbb{P}$. For example, suppose we toss a coin twice, with probability $p$ for $H$ and probability $q = 1 - p$ for $T$ on each toss. To avoid trivialities, we assume that $0 < p < 1$. Then
$$\mathbb{P}\{HH\} = p^2, \quad \mathbb{P}\{HT\} = \mathbb{P}\{TH\} = pq, \quad \mathbb{P}\{TT\} = q^2. \qquad (5.1)$$
Let $A = \{HH, HT\}$ and $B = \{HT, TH\}$. In words, $A$ is the set "$H$ on the first toss" and $B$ is the set "one $H$ and one $T$." Then $A \cap B = \{HT\}$. We compute
$$\mathbb{P}(A) = p^2 + pq = p,$$
$$\mathbb{P}(B) = 2pq,$$
$$\mathbb{P}(A)\mathbb{P}(B) = 2p^2 q,$$
$$\mathbb{P}(A \cap B) = pq.$$
These sets are independent if and only if $2p^2 q = pq$, which is the case if and only if $p = \frac{1}{2}$.
If $p = \frac{1}{2}$, then $\mathbb{P}(B)$, the probability of one head and one tail, is $\frac{1}{2}$. If you are told that the coin tosses resulted in a head on the first toss, the probability of $B$, which is now the probability of a $T$ on the second toss, is still $\frac{1}{2}$.
Suppose however that $p = 0.01$. By far the most likely outcome of the two coin tosses is $TT$, and the probability of one head and one tail is quite small; in fact, $\mathbb{P}(B) = 0.0198$. However, if you are told that the first toss resulted in $H$, it becomes very likely that the two tosses result in one head and one tail. In fact, conditioned on getting an $H$ on the first toss, the probability of one $H$ and one $T$ is the probability of a $T$ on the second toss, which is $0.99$.
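The dependence of independence on the measure can be verified directly. The following sketch uses exact fractions; the helper names `two_toss_measure`, `prob` and `independent` are illustrative assumptions.

```python
from fractions import Fraction

def two_toss_measure(p):
    """Probabilities (5.1) of the four outcomes of two independent tosses."""
    q = 1 - p
    return {"HH": p * p, "HT": p * q, "TH": q * p, "TT": q * q}

def prob(event, measure):
    return sum(measure[w] for w in event)

def independent(A, B, measure):
    return prob(A & B, measure) == prob(A, measure) * prob(B, measure)

A = {"HH", "HT"}   # H on the first toss
B = {"HT", "TH"}   # one H and one T

fair = two_toss_measure(Fraction(1, 2))
rare = two_toss_measure(Fraction(1, 100))

print(independent(A, B, fair))               # independent when p = 1/2
print(independent(A, B, rare))               # not independent when p = 0.01
print(prob(B, rare))                         # unconditional probability of B
print(prob(A & B, rare) / prob(A, rare))     # conditional probability of B given A
```

For $p = 0.01$ the last two lines give $\frac{99}{5000} = 0.0198$ and $\frac{99}{100} = 0.99$, matching the values quoted in the text.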
1.5.2 Independence of $\sigma$-algebras
Definition 1.16 Let $\mathcal{G}$ and $\mathcal{H}$ be sub-$\sigma$-algebras of $\mathcal{F}$. We say that $\mathcal{G}$ and $\mathcal{H}$ are independent if every set in $\mathcal{G}$ is independent of every set in $\mathcal{H}$, i.e.,
$$\mathbb{P}(A \cap B) = \mathbb{P}(A)\mathbb{P}(B) \text{ for every } A \in \mathcal{H},\ B \in \mathcal{G}.$$
Example 1.14 Toss a coin twice, and let $\mathbb{P}$ be given by (5.1). Let $\mathcal{G} = \mathcal{F}_1$ be the $\sigma$-algebra determined by the first toss: $\mathcal{G}$ contains the sets
$$\emptyset,\ \Omega,\ \{HH, HT\},\ \{TH, TT\}.$$
Let $\mathcal{H}$ be the $\sigma$-algebra determined by the second toss: $\mathcal{H}$ contains the sets
$$\emptyset,\ \Omega,\ \{HH, TH\},\ \{HT, TT\}.$$
These two $\sigma$-algebras are independent. For example, if we choose the set $\{HH, HT\}$ from $\mathcal{G}$ and the set $\{HH, TH\}$ from $\mathcal{H}$, then we have
$$\mathbb{P}\{HH, HT\}\,\mathbb{P}\{HH, TH\} = (p^2 + pq)(p^2 + pq) = p^2,$$
$$\mathbb{P}\left(\{HH, HT\} \cap \{HH, TH\}\right) = \mathbb{P}\{HH\} = p^2.$$
No matter which set we choose in $\mathcal{G}$ and which set we choose in $\mathcal{H}$, we will find that the product of the probabilities is the probability of the intersection.
Example 1.14 illustrates the general principle that when the probability for a sequence of tosses is
defined to be the product of the probabilities for the individual tosses of the sequence, then every
set depending on a particular toss will be independent of every set depending on a different toss.
We say that the different tosses are independent when we construct probabilities this way. It is also
possible to construct probabilities such that the different tosses are not independent, as shown by
the following example.
Example 1.15 Define $\mathbb{P}$ for the individual elements of $\Omega = \{HH, HT, TH, TT\}$ to be
$$\mathbb{P}\{HH\} = \tfrac{1}{9}, \quad \mathbb{P}\{HT\} = \tfrac{2}{9}, \quad \mathbb{P}\{TH\} = \tfrac{1}{3}, \quad \mathbb{P}\{TT\} = \tfrac{1}{3},$$
and for every set $A \subset \Omega$, define $\mathbb{P}(A)$ to be the sum of the probabilities of the elements in $A$. Then $\mathbb{P}(\Omega) = 1$, so $\mathbb{P}$ is a probability measure. Note that the sets $\{H \text{ on first toss}\} = \{HH, HT\}$ and $\{H \text{ on second toss}\} = \{HH, TH\}$ have probabilities $\mathbb{P}\{HH, HT\} = \frac{1}{3}$ and $\mathbb{P}\{HH, TH\} = \frac{4}{9}$, so the product of the probabilities is $\frac{4}{27}$. On the other hand, the intersection of $\{HH, HT\}$ and $\{HH, TH\}$ contains the single element $\{HH\}$, which has probability $\frac{1}{9}$. These sets are not independent.
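The arithmetic of this example is easy to confirm by summing the point masses. This is a minimal sketch; the variable names are illustrative assumptions.

```python
from fractions import Fraction

# Point masses from Example 1.15.
P = {"HH": Fraction(1, 9), "HT": Fraction(2, 9), "TH": Fraction(1, 3), "TT": Fraction(1, 3)}

def prob(event):
    return sum(P[w] for w in event)

first_H = {"HH", "HT"}    # H on first toss
second_H = {"HH", "TH"}   # H on second toss

print(prob(P.keys()))                      # total mass
print(prob(first_H) * prob(second_H))      # product of the probabilities
print(prob(first_H & second_H))            # probability of the intersection
```

The product and the probability of the intersection differ, so under this measure the two tosses are dependent.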
1.5.3 Independence of random variables
Definition 1.17 We say that two random variables $X$ and $Y$ are independent if the $\sigma$-algebras they generate, $\sigma(X)$ and $\sigma(Y)$, are independent.
In the probability space of three independent coin tosses, the price $S_2$ of the stock at time 2 is independent of $\frac{S_3}{S_2}$. This is because $S_2$ depends on only the first two coin tosses, whereas $\frac{S_3}{S_2}$ is either $u$ or $d$, depending on whether the third coin toss is $H$ or $T$.
Definition 1.17 says that for independent random variables $X$ and $Y$, every set defined in terms of $X$ is independent of every set defined in terms of $Y$. In the case of $S_2$ and $\frac{S_3}{S_2}$ just considered, for example, the sets $\{S_2 = udS_0\} = \{HTH, HTT, THH, THT\}$ and $\left\{\frac{S_3}{S_2} = u\right\} = \{HHH, HTH, THH, TTH\}$ are independent sets.
Suppose $X$ and $Y$ are independent random variables. We defined earlier the measure induced by $X$ on $\mathbb{R}$ to be
$$\mathcal{L}_X(A) = \mathbb{P}\{X \in A\}, \quad A \subset \mathbb{R}.$$
Similarly, the measure induced by $Y$ is
$$\mathcal{L}_Y(B) = \mathbb{P}\{Y \in B\}, \quad B \subset \mathbb{R}.$$
Now the pair $(X, Y)$ takes values in the plane $\mathbb{R}^2$, and we can define the measure induced by the pair
$$\mathcal{L}_{X,Y}(C) = \mathbb{P}\{(X, Y) \in C\}, \quad C \subset \mathbb{R}^2.$$
The set $C$ in this last equation is a subset of the plane $\mathbb{R}^2$. In particular, $C$ could be a "rectangle", i.e., a set of the form $A \times B$, where $A \subset \mathbb{R}$ and $B \subset \mathbb{R}$. In this case,
$$\{(X, Y) \in A \times B\} = \{X \in A\} \cap \{Y \in B\},$$
and $X$ and $Y$ are independent if and only if
$$\mathcal{L}_{X,Y}(A \times B) = \mathbb{P}\left(\{X \in A\} \cap \{Y \in B\}\right) = \mathbb{P}\{X \in A\}\,\mathbb{P}\{Y \in B\} = \mathcal{L}_X(A)\,\mathcal{L}_Y(B). \qquad (5.2)$$
In other words, for independent random variables $X$ and $Y$, the joint distribution represented by the measure $\mathcal{L}_{X,Y}$ factors into the product of the marginal distributions represented by the measures $\mathcal{L}_X$ and $\mathcal{L}_Y$.
A joint density for $(X, Y)$ is a nonnegative function $f_{X,Y}(x, y)$ such that
$$\mathcal{L}_{X,Y}(A \times B) = \int_A \int_B f_{X,Y}(x, y) \, dy \, dx.$$
Not every pair of random variables $(X, Y)$ has a joint density, but if a pair does, then the random variables $X$ and $Y$ have marginal densities defined by
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, \eta) \, d\eta, \quad f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(\xi, y) \, d\xi.$$
These have the properties
$$\mathcal{L}_X(A) = \int_A f_X(x) \, dx, \quad A \subset \mathbb{R},$$
$$\mathcal{L}_Y(B) = \int_B f_Y(y) \, dy, \quad B \subset \mathbb{R}.$$
Suppose $X$ and $Y$ have a joint density. Then $X$ and $Y$ are independent random variables if and only if the joint density is the product of the marginal densities. This follows from the fact that (5.2) is equivalent to independence of $X$ and $Y$. Take $A = (-\infty, x]$ and $B = (-\infty, y]$, write (5.2) in terms of densities, and differentiate with respect to both $x$ and $y$.
Theorem 5.7 Suppose $X$ and $Y$ are independent random variables. Let $g$ and $h$ be functions from $\mathbb{R}$ to $\mathbb{R}$. Then $g(X)$ and $h(Y)$ are also independent random variables.
PROOF: Let us denote $W = g(X)$ and $Z = h(Y)$. We must consider sets in $\sigma(W)$ and $\sigma(Z)$. But a typical set in $\sigma(W)$ is of the form
$$\{\omega;\ W(\omega) \in A\} = \{\omega;\ g(X(\omega)) \in A\},$$
which is defined in terms of the random variable $X$. Therefore, this set is in $\sigma(X)$. (In general, we have that every set in $\sigma(W)$ is also in $\sigma(X)$, which means that $X$ contains at least as much information as $W$. In fact, $X$ can contain strictly more information than $W$, which means that $\sigma(X)$ will contain all the sets in $\sigma(W)$ and others besides; this is the case, for example, if $W = X^2$.)
In the same way that we just argued that every set in $\sigma(W)$ is also in $\sigma(X)$, we can show that every set in $\sigma(Z)$ is also in $\sigma(Y)$. Since every set in $\sigma(X)$ is independent of every set in $\sigma(Y)$, we conclude that every set in $\sigma(W)$ is independent of every set in $\sigma(Z)$.
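Definition 1.18 Let $X_1, X_2, \ldots$ be a sequence of random variables. We say that these random variables are independent if for every sequence of sets $A_1 \in \sigma(X_1), A_2 \in \sigma(X_2), \ldots$ and for every positive integer $n$,
$$\mathbb{P}(A_1 \cap A_2 \cap \cdots \cap A_n) = \mathbb{P}(A_1)\,\mathbb{P}(A_2) \cdots \mathbb{P}(A_n).$$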
1.5.4 Correlation and independence
Theorem 5.8 If two random variables $X$ and $Y$ are independent, and if $g$ and $h$ are functions from $\mathbb{R}$ to $\mathbb{R}$, then
$$\mathbb{E}[g(X)h(Y)] = \mathbb{E}g(X) \cdot \mathbb{E}h(Y),$$
provided all the expectations are defined.
PROOF: Let $g(x) = \mathbb{I}_A(x)$ and $h(y) = \mathbb{I}_B(y)$ be indicator functions. Then the equation we are trying to prove becomes
$$\mathbb{P}\left(\{X \in A\} \cap \{Y \in B\}\right) = \mathbb{P}\{X \in A\}\,\mathbb{P}\{Y \in B\},$$
which is true because $X$ and $Y$ are independent. Now use the standard machine to get the result for general functions $g$ and $h$.
The variance of a random variable $X$ is defined to be
$$\mathrm{Var}(X) = \mathbb{E}\left[(X - \mathbb{E}X)^2\right].$$
The covariance of two random variables $X$ and $Y$ is defined to be
$$\mathrm{Cov}(X, Y) = \mathbb{E}\left[(X - \mathbb{E}X)(Y - \mathbb{E}Y)\right] = \mathbb{E}[XY] - \mathbb{E}X \cdot \mathbb{E}Y.$$
According to Theorem 5.8, for independent random variables, the covariance is zero. If $X$ and $Y$ both have positive variances, we define their correlation coefficient
$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}.$$
For independent random variables, the correlation coefficient is zero.
Unfortunately, two random variables can have zero correlation and still not be independent. Consider the following example.
Example 1.16 Let $X$ be a standard normal random variable, let $Z$ be independent of $X$ and have the distribution $\mathbb{P}\{Z = 1\} = \mathbb{P}\{Z = -1\} = \frac{1}{2}$. Define $Y = XZ$. We show that $Y$ is also a standard normal random variable, $X$ and $Y$ are uncorrelated, but $X$ and $Y$ are not independent.
The last claim is easy to see. If $X$ and $Y$ were independent, so would be $X^2$ and $Y^2$, but in fact, $X^2 = Y^2$ almost surely.
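We next check that $Y$ is standard normal. For $y \in \mathbb{R}$, we have
$$\begin{aligned}
\mathbb{P}\{Y \le y\} &= \mathbb{P}\{Y \le y \text{ and } Z = 1\} + \mathbb{P}\{Y \le y \text{ and } Z = -1\} \\
&= \mathbb{P}\{X \le y \text{ and } Z = 1\} + \mathbb{P}\{-X \le y \text{ and } Z = -1\} \\
&= \mathbb{P}\{X \le y\}\,\mathbb{P}\{Z = 1\} + \mathbb{P}\{-X \le y\}\,\mathbb{P}\{Z = -1\} \\
&= \tfrac{1}{2}\mathbb{P}\{X \le y\} + \tfrac{1}{2}\mathbb{P}\{-X \le y\}.
\end{aligned}$$
Since $X$ is standard normal, $\mathbb{P}\{X \le y\} = \mathbb{P}\{-X \le y\}$, and we have $\mathbb{P}\{Y \le y\} = \mathbb{P}\{X \le y\}$, which shows that $Y$ is also standard normal.
Being standard normal, both $X$ and $Y$ have expected value zero. Therefore,
$$\mathrm{Cov}(X, Y) = \mathbb{E}[XY] = \mathbb{E}[X^2 Z] = \mathbb{E}X^2 \cdot \mathbb{E}Z = 1 \cdot 0 = 0.$$
Where in $\mathbb{R}^2$ does the measure $\mathcal{L}_{X,Y}$ put its mass, i.e., what is the distribution of $(X, Y)$?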
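A small simulation makes the three claims of Example 1.16 tangible. This is only a sketch using Python's standard `random` module; the sample size and seed are arbitrary assumptions, and sample statistics only approximate the exact values.

```python
import random

random.seed(0)                      # fixed seed so the run is reproducible
n = 200_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]      # samples of X
zs = [random.choice((-1, 1)) for _ in range(n)]      # independent signs Z
ys = [x * z for x, z in zip(xs, zs)]                 # Y = XZ

def mean(values):
    return sum(values) / len(values)

cov_xy = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)
var_y = mean([y * y for y in ys]) - mean(ys) ** 2
same_squares = all(x * x == y * y for x, y in zip(xs, ys))

print(cov_xy)        # close to 0: X and Y are uncorrelated
print(var_y)         # close to 1: consistent with Y being standard normal
print(same_squares)  # X^2 and Y^2 coincide sample by sample, so X and Y are dependent
```

Every sampled pair satisfies $Y = X$ or $Y = -X$, which hints at the answer to the closing question: the joint distribution puts all of its mass on the two diagonal lines $y = x$ and $y = -x$.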
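We conclude this section with the observation that for independent random variables, the variance of their sum is the sum of their variances. Indeed, if $X$ and $Y$ are independent and $Z = X + Y$, then
$$\begin{aligned}
\mathrm{Var}(Z) &= \mathbb{E}\left[(Z - \mathbb{E}Z)^2\right] \\
&= \mathbb{E}\left[(X + Y - \mathbb{E}X - \mathbb{E}Y)^2\right] \\
&= \mathbb{E}\left[(X - \mathbb{E}X)^2 + 2(X - \mathbb{E}X)(Y - \mathbb{E}Y) + (Y - \mathbb{E}Y)^2\right] \\
&= \mathrm{Var}(X) + 2\,\mathbb{E}[X - \mathbb{E}X]\,\mathbb{E}[Y - \mathbb{E}Y] + \mathrm{Var}(Y) \\
&= \mathrm{Var}(X) + \mathrm{Var}(Y).
\end{aligned}$$
This argument extends to any finite number of random variables. If we are given independent random variables $X_1, X_2, \ldots, X_n$, then
$$\mathrm{Var}(X_1 + X_2 + \cdots + X_n) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n). \qquad (5.3)$$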
1.5.5 Independence and conditional expectation.
We now return to property (k) for conditional expectations, presented in the lecture dated October 19, 1995. The property as stated there is taken from Williams's book, page 88; we shall need only the second assertion of the property:
(k) If a random variable $X$ is independent of a $\sigma$-algebra $\mathcal{H}$, then
$$\mathbb{E}[X \mid \mathcal{H}] = \mathbb{E}X.$$
The point of this statement is that if $X$ is independent of $\mathcal{H}$, then the best estimate of $X$ based on the information in $\mathcal{H}$ is $\mathbb{E}X$, the same as the best estimate of $X$ based on no information.
To show this equality, we observe first that $\mathbb{E}X$ is $\mathcal{H}$-measurable, since it is not random. We must also check the partial averaging property
$$\int_A \mathbb{E}X \, d\mathbb{P} = \int_A X \, d\mathbb{P} \text{ for every } A \in \mathcal{H}.$$
If $X$ is an indicator of some set $B$, which by assumption must be independent of $\mathcal{H}$, then the partial averaging equation we must check is
$$\int_A \mathbb{P}(B) \, d\mathbb{P} = \int_A \mathbb{I}_B \, d\mathbb{P}.$$
The left-hand side of this equation is $\mathbb{P}(A)\mathbb{P}(B)$, and the right-hand side is
$$\int_\Omega \mathbb{I}_A \mathbb{I}_B \, d\mathbb{P} = \int_\Omega \mathbb{I}_{A \cap B} \, d\mathbb{P} = \mathbb{P}(A \cap B).$$
The partial averaging equation holds because $A$ and $B$ are independent. The partial averaging equation for general $X$ independent of $\mathcal{H}$ follows by the standard machine.
1.5.6 Law of Large Numbers
There are two fundamental theorems about sequences of independent random variables. Here is the first one.
Theorem 5.9 (Law of Large Numbers) Let $X_1, X_2, \ldots$ be a sequence of independent, identically distributed random variables, each with expected value $\mu$ and variance $\sigma^2$. Define the sequence of averages
$$Y_n = \frac{X_1 + X_2 + \cdots + X_n}{n}, \quad n = 1, 2, \ldots.$$
Then $Y_n$ converges to $\mu$ almost surely as $n \to \infty$.
We are not going to give the proof of this theorem, but here is an argument which makes it plausible. We will use this argument later when developing stochastic calculus. The argument proceeds in two steps. We first check that $\mathbb{E}Y_n = \mu$ for every $n$. We next check that $\mathrm{Var}(Y_n) \to 0$ as $n \to \infty$. In other words, the random variables $Y_n$ are increasingly tightly distributed around $\mu$ as $n \to \infty$.
For the first step, we simply compute
$$\mathbb{E}Y_n = \frac{1}{n}\left[\mathbb{E}X_1 + \mathbb{E}X_2 + \cdots + \mathbb{E}X_n\right] = \frac{1}{n}\Big[\underbrace{\mu + \mu + \cdots + \mu}_{n \text{ times}}\Big] = \mu.$$
For the second step, we first recall from (5.3) that the variance of the sum of independent random variables is the sum of their variances. Therefore,
$$\mathrm{Var}(Y_n) = \sum_{k=1}^{n} \mathrm{Var}\left(\frac{X_k}{n}\right) = \sum_{k=1}^{n} \frac{\sigma^2}{n^2} = \frac{\sigma^2}{n}.$$
As $n \to \infty$, we have $\mathrm{Var}(Y_n) \to 0$.
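The two steps of the plausibility argument can be checked exactly in a simple case. The sketch below assumes, purely for illustration, that each $X_k$ takes the value 1 with probability $p$ and 0 otherwise, so that $\mu = p$ and $\sigma^2 = p(1-p)$; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def average_law(n, p):
    """Exact law of Y_n = (X_1 + ... + X_n)/n for i.i.d. X_k in {0, 1} with P{X_k = 1} = p."""
    q = 1 - p
    return {Fraction(j, n): comb(n, j) * p ** j * q ** (n - j) for j in range(n + 1)}

def mean_and_variance(law):
    m = sum(y * pr for y, pr in law.items())
    v = sum((y - m) ** 2 * pr for y, pr in law.items())
    return m, v

p = Fraction(1, 3)
for n in (1, 10, 100):
    m, v = mean_and_variance(average_law(n, p))
    print(n, m, v)   # the mean stays at p while the variance equals p(1-p)/n
```

The mean is the same for every $n$, while the variance shrinks like $\frac{1}{n}$, which is the tightening around $\mu$ described above.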
1.5.7 Central Limit Theorem
The Law of Large Numbers is a bit boring because the limit is nonrandom. This is because the denominator in the definition of $Y_n$ is so large that the variance of $Y_n$ converges to zero. If we want to prevent this, we should divide by $\sqrt{n}$ rather than $n$. In particular, if we again have a sequence of independent, identically distributed random variables, each with expected value $\mu$ and variance $\sigma^2$, but now we set
$$Z_n = \frac{(X_1 - \mu) + (X_2 - \mu) + \cdots + (X_n - \mu)}{\sqrt{n}},$$
then each $Z_n$ has expected value zero and
$$\mathrm{Var}(Z_n) = \sum_{k=1}^{n} \mathrm{Var}\left(\frac{X_k - \mu}{\sqrt{n}}\right) = \sum_{k=1}^{n} \frac{\sigma^2}{n} = \sigma^2.$$
As $n \to \infty$, the distributions of all the random variables $Z_n$ have the same degree of tightness, as measured by their variance, around their expected value 0. The Central Limit Theorem asserts that as $n \to \infty$, the distribution of $Z_n$ approaches that of a normal random variable with mean (expected value) zero and variance $\sigma^2$. In other words, for every set $A \subset \mathbb{R}$,
$$\lim_{n \to \infty} \mathbb{P}\{Z_n \in A\} = \frac{1}{\sigma\sqrt{2\pi}} \int_A e^{-\frac{x^2}{2\sigma^2}} \, dx.$$
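The convergence can be observed numerically for a half-line $A = (-\infty, z]$. The sketch below again assumes coin-flip summands ($X_k = 1$ with probability $p$, else 0), for which the law of the sum is binomial; the function name `clt_gap` and the chosen values of $n$, $p$ and $z$ are illustrative assumptions, and floating-point arithmetic limits it to moderate $n$.

```python
import math

def clt_gap(n, p, z):
    """|P{Z_n <= z} - normal(0, sigma^2) cdf at z| for i.i.d. summands in {0, 1} with P{1} = p."""
    mu, sigma = p, math.sqrt(p * (1 - p))
    # Z_n <= z holds exactly when the number of ones is at most n*mu + z*sqrt(n).
    cutoff = math.floor(n * mu + z * math.sqrt(n))
    exact = sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(0, cutoff + 1))
    normal = 0.5 * (1 + math.erf(z / (sigma * math.sqrt(2))))
    return abs(exact - normal)

for n in (10, 100, 1000):
    print(n, clt_gap(n, 0.5, 0.3))
```

The gap between the exact probability and its normal approximation is small for large $n$; for a discrete sum it decays roughly like $\frac{1}{\sqrt{n}}$ rather than monotonically.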
Chapter 2
Conditional Expectation
Please see Hull’s book (Section 9.6.)
2.1 A Binomial Model for Stock Price Dynamics
Stock prices are assumed to follow this simple binomial model: The initial stock price during the period under study is denoted $S_0$. At each time step, the stock price either goes up by a factor of $u$ or down by a factor of $d$. It will be useful to visualize tossing a coin at each time step, and say that
• the stock price moves up by a factor of $u$ if the coin comes out heads ($H$), and
• down by a factor of $d$ if it comes out tails ($T$).
Note that we are not specifying the probability of heads here.
Consider a sequence of 3 tosses of the coin (see Fig. 2.1). The collection of all possible outcomes (i.e., sequences of tosses of length 3) is
$$\Omega = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}.$$
A typical sequence of $\Omega$ will be denoted $\omega$, and $\omega_k$ will denote the $k$th element in the sequence $\omega$. We write $S_k(\omega)$ to denote the stock price at "time" $k$ (i.e., after $k$ tosses) under the outcome $\omega$. Note that $S_k(\omega)$ depends only on $\omega_1, \omega_2, \ldots, \omega_k$. Thus in the 3-coin-toss example we write for instance,
$$S_1(\omega) \triangleq S_1(\omega_1, \omega_2, \omega_3) \triangleq S_1(\omega_1),$$
$$S_2(\omega) \triangleq S_2(\omega_1, \omega_2, \omega_3) \triangleq S_2(\omega_1, \omega_2).$$
Each $S_k$ is a random variable defined on the set $\Omega$. More precisely, let $\mathcal{F} = \mathcal{P}(\Omega)$. Then $\mathcal{F}$ is a $\sigma$-algebra and $(\Omega, \mathcal{F})$ is a measurable space. Each $S_k$ is an $\mathcal{F}$-measurable function $\Omega \to \mathbb{R}$, that is, $S_k^{-1}$ is a function $\mathcal{B} \to \mathcal{F}$, where $\mathcal{B}$ is the Borel $\sigma$-algebra on $\mathbb{R}$. We will see later that $S_k$ is in fact measurable under a sub-$\sigma$-algebra of $\mathcal{F}$. Recall that the Borel $\sigma$-algebra $\mathcal{B}$ is the $\sigma$-algebra generated by the open intervals of $\mathbb{R}$. In this course we will always deal with subsets of $\mathbb{R}$ that belong to $\mathcal{B}$.
[Figure 2.1: A three coin period binomial model. The tree starts at $S_0$ and branches on $\omega_1, \omega_2, \omega_3 \in \{H, T\}$, with node values $S_1(H) = uS_0$, $S_1(T) = dS_0$; $S_2(HH) = u^2 S_0$, $S_2(HT) = S_2(TH) = udS_0$, $S_2(TT) = d^2 S_0$; $S_3(HHH) = u^3 S_0$, $S_3(HHT) = S_3(HTH) = S_3(THH) = u^2 d S_0$, $S_3(HTT) = S_3(THT) = S_3(TTH) = d^2 u S_0$, $S_3(TTT) = d^3 S_0$.]
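The three-period model is small enough to enumerate. The following sketch builds the sample space and evaluates $S_k(\omega)$; the numerical values $S_0 = 4$, $u = 2$, $d = 0.5$ and the function name `stock_price` are illustrative assumptions, not values fixed by the model.

```python
from itertools import product

def stock_price(omega, k, S0, u, d):
    """S_k(omega): price after the first k tosses of omega, a string of 'H'/'T'."""
    heads = omega[:k].count("H")
    return S0 * u ** heads * d ** (k - heads)

S0, u, d = 4, 2, 0.5   # assumed illustrative values with u > d > 0
Omega = ["".join(w) for w in product("HT", repeat=3)]

print(Omega)
for w in Omega:
    print(w, [stock_price(w, k, S0, u, d) for k in range(4)])

# S_2 depends only on the first two tosses:
print(stock_price("HTH", 2, S0, u, d) == stock_price("HTT", 2, S0, u, d))
```

Because `stock_price` only reads the first $k$ characters of `omega`, it makes explicit that $S_k$ ignores all tosses after the $k$th.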
For any random variable $X$ defined on a sample space $\Omega$ and any $y \in \mathbb{R}$, we will use the notation:
$$\{X \le y\} \triangleq \{\omega \in \Omega;\ X(\omega) \le y\}.$$
The sets $\{X < y\}$, $\{X \ge y\}$, $\{X = y\}$, etc., are defined similarly. Similarly for any subset $B$ of $\mathbb{R}$, we define
$$\{X \in B\} \triangleq \{\omega \in \Omega;\ X(\omega) \in B\}.$$
Assumption 2.1 $u > d > 0$.
Information
Definition 2.1 (Sets determined by the first k tosses.) We say that a set A
is determined by
the first k coin tosses if, knowing only the outcome of the first k tosses, we can decide whether the
outcome of all tosses is in A. In general we denote the collection of sets determined by the first k
tosses by
k. It is easy to check that k is a -algebra.
F
F
Note that the random variable Sk is k-measurable, for each k = 1; 2;::: ;n.
F
Example 2.1 In the 3 coin-toss example, the collection $\mathcal{F}_1$ of sets determined by the first toss consists of:
1. $A_H \triangleq \{HHH, HHT, HTH, HTT\}$,
2. $A_T \triangleq \{THH, THT, TTH, TTT\}$,
3. $\emptyset$,
4. $\Omega$.
The collection $\mathcal{F}_2$ of sets determined by the first two tosses consists of:
1. $A_{HH} \triangleq \{HHH, HHT\}$,
2. $A_{HT} \triangleq \{HTH, HTT\}$,
3. $A_{TH} \triangleq \{THH, THT\}$,
4. $A_{TT} \triangleq \{TTH, TTT\}$,
5. The complements of the above sets,
6. Any union of the above sets (including the complements),
7. $\emptyset$ and $\Omega$.
Definition 2.2 (Information carried by a random variable.) Let $X$ be a random variable $\Omega \to \mathbb{R}$. We say that a set $A \subset \Omega$ is determined by the random variable $X$ if, knowing only the value $X(\omega)$ of the random variable, we can decide whether or not $\omega \in A$. Another way of saying this is that for every $y \in \mathbb{R}$, either $X^{-1}(y) \subset A$ or $X^{-1}(y) \cap A = \emptyset$. The collection of subsets of $\Omega$ determined by $X$ is a $\sigma$-algebra, which we call the $\sigma$-algebra generated by $X$, and denote by $\sigma(X)$.
If the random variable $X$ takes finitely many different values, then $\sigma(X)$ is generated by the collection of sets
$$\{X^{-1}(X(\omega)) \mid \omega \in \Omega\};$$
these sets are called the atoms of the $\sigma$-algebra $\sigma(X)$.
In general, if $X$ is a random variable $\Omega \to \mathbb{R}$, then $\sigma(X)$ is given by
$$\sigma(X) = \{X^{-1}(B);\ B \in \mathcal{B}\}.$$
Example 2.2 (Sets determined by $S_2$) The $\sigma$-algebra generated by $S_2$ consists of the following sets:
1. $A_{HH} = \{HHH, HHT\} = \{\omega \in \Omega;\ S_2(\omega) = u^2 S_0\}$,
2. $A_{TT} = \{TTH, TTT\} = \{S_2 = d^2 S_0\}$,
3. $A_{HT} \cup A_{TH} = \{S_2 = udS_0\}$,
4. Complements of the above sets,
5. Any union of the above sets,
6. $\emptyset = \{S_2(\omega) \in \emptyset\}$,
7. $\Omega = \{S_2(\omega) \in \mathbb{R}\}$.
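On a finite sample space, the $\sigma$-algebra generated by a random variable can be listed by forming all unions of its atoms. The sketch below does this for the first two tosses (which generate $\mathcal{F}_2$) and for $S_2$. The helper names are illustrative assumptions, and $S_2$ is represented by the number of heads in the first two tosses, which identifies the same atoms because $u \neq d$ under Assumption 2.1.

```python
from itertools import combinations, product

Omega = ["".join(w) for w in product("HT", repeat=3)]

def atoms(X):
    """Atoms of sigma(X): the sets X^{-1}(X(omega)) for omega in Omega."""
    groups = {}
    for w in Omega:
        groups.setdefault(X(w), set()).add(w)
    return [frozenset(g) for g in groups.values()]

def generated(atom_list):
    """All unions of atoms, including the empty union, form the generated sigma-algebra."""
    sets = set()
    for r in range(len(atom_list) + 1):
        for combo in combinations(atom_list, r):
            sets.add(frozenset().union(*combo))
    return sets

first_two = lambda w: w[:2]           # knowing the first two tosses: generates F_2
S2 = lambda w: w[:2].count("H")       # stand-in for S_2 (same atoms when u != d)

F2 = generated(atoms(first_two))
sigma_S2 = generated(atoms(S2))

print(len(F2), len(sigma_S2))         # sizes of the two collections
print(sigma_S2 <= F2)                 # sigma(S_2) is contained in F_2
print(frozenset({"HTH", "HTT"}) in sigma_S2)   # A_HT alone is not determined by S_2
```

The output shows that $\mathcal{F}_2$ has 16 sets while $\sigma(S_2)$ has only 8: the value of $S_2$ cannot distinguish $HT$ from $TH$, so $S_2$ carries strictly less information than the first two tosses.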
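2.3 Conditional Expectation
In order to talk about conditional expectation, we need to introduce a probability measure on our coin-toss sample space $\Omega$. Let us define
• $p \in (0,1)$ is the probability of $H$,
• $q \triangleq (1 - p)$ is the probability of $T$,
• the coin tosses are independent, so that, e.g., $\mathbb{P}(HHT) = p^2 q$, etc.
• $\mathbb{P}(A) \triangleq \sum_{\omega \in A} \mathbb{P}(\omega)$, $\forall A \subset \Omega$.
Definition 2.3 (Expectation.)
$$\mathbb{E}X \triangleq \sum_{\omega \in \Omega} X(\omega)\,\mathbb{P}(\omega).$$
If $A \subset \Omega$ then
$$I_A(\omega) \triangleq \begin{cases} 1 & \text{if } \omega \in A, \\ 0 & \text{if } \omega \notin A, \end{cases}$$
and
$$\mathbb{E}(I_A X) = \int_A X \, d\mathbb{P} = \sum_{\omega \in A} X(\omega)\,\mathbb{P}(\omega).$$
We can think of $\mathbb{E}(I_A X)$ as a partial average of $X$ over the set $A$.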
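2.3.1 An example
Let us estimate $S_1$, given $S_2$. Denote the estimate by $\mathbb{E}(S_1 \mid S_2)$. From elementary probability, $\mathbb{E}(S_1 \mid S_2)$ is a random variable $Y$ whose value at $\omega$ is defined by
$$Y(\omega) = \mathbb{E}(S_1 \mid S_2 = y),$$
where $y = S_2(\omega)$. Properties of $\mathbb{E}(S_1 \mid S_2)$:
• $\mathbb{E}(S_1 \mid S_2)$ should depend on $\omega$, i.e., it is a random variable.
• If the value of $S_2$ is known, then the value of $\mathbb{E}(S_1 \mid S_2)$ should also be known. In particular,
– If $\omega = HHH$ or $\omega = HHT$, then $S_2(\omega) = u^2 S_0$. If we know that $S_2(\omega) = u^2 S_0$, then even without knowing $\omega$, we know that $S_1(\omega) = uS_0$. We define
$$\mathbb{E}(S_1 \mid S_2)(HHH) = \mathbb{E}(S_1 \mid S_2)(HHT) = uS_0.$$
– If $\omega = TTT$ or $\omega = TTH$, then $S_2(\omega) = d^2 S_0$. If we know that $S_2(\omega) = d^2 S_0$, then even without knowing $\omega$, we know that $S_1(\omega) = dS_0$. We define
$$\mathbb{E}(S_1 \mid S_2)(TTT) = \mathbb{E}(S_1 \mid S_2)(TTH) = dS_0.$$
– If $\omega \in A = \{HTH, HTT, THH, THT\}$, then $S_2(\omega) = udS_0$. If we know $S_2(\omega) = udS_0$, then we do not know whether $S_1 = uS_0$ or $S_1 = dS_0$. We then take a weighted average:
$$\mathbb{P}(A) = p^2 q + pq^2 + p^2 q + pq^2 = 2pq.$$
Furthermore,
$$\int_A S_1 \, d\mathbb{P} = p^2 q\, uS_0 + pq^2\, uS_0 + p^2 q\, dS_0 + pq^2\, dS_0 = pq(u + d)S_0.$$
For $\omega \in A$ we define
$$\mathbb{E}(S_1 \mid S_2)(\omega) = \frac{\int_A S_1 \, d\mathbb{P}}{\mathbb{P}(A)} = \tfrac{1}{2}(u + d)S_0.$$
Then
$$\int_A \mathbb{E}(S_1 \mid S_2) \, d\mathbb{P} = \int_A S_1 \, d\mathbb{P}.$$
In conclusion, we can write
$$\mathbb{E}(S_1 \mid S_2)(\omega) = g(S_2(\omega)),$$
where
$$g(x) = \begin{cases} uS_0 & \text{if } x = u^2 S_0, \\ \tfrac{1}{2}(u + d)S_0 & \text{if } x = udS_0, \\ dS_0 & \text{if } x = d^2 S_0. \end{cases}$$
In other words, $\mathbb{E}(S_1 \mid S_2)$ is random only through dependence on $S_2$. We also write
$$\mathbb{E}(S_1 \mid S_2 = x) = g(x),$$
where $g$ is the function defined above.
The random variable $\mathbb{E}(S_1 \mid S_2)$ has two fundamental properties:
• $\mathbb{E}(S_1 \mid S_2)$ is $\sigma(S_2)$-measurable.
• For every set $A \in \sigma(S_2)$,
$$\int_A \mathbb{E}(S_1 \mid S_2) \, d\mathbb{P} = \int_A S_1 \, d\mathbb{P}.$$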
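The conditional expectation in this example can be computed mechanically by averaging $S_1$ over each atom of $\sigma(S_2)$. The sketch below uses exact fractions; the numerical values $S_0 = 4$, $u = 2$, $d = \frac{1}{2}$, $p = \frac{1}{3}$ and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

S0, u, d = 4, 2, Fraction(1, 2)   # assumed illustrative values with u > d > 0
p = Fraction(1, 3)                # assumed probability of H
q = 1 - p

Omega = ["".join(w) for w in product("HT", repeat=3)]
P = {w: p ** w.count("H") * q ** w.count("T") for w in Omega}

def S(k, w):
    """Stock price after the first k tosses of w."""
    heads = w[:k].count("H")
    return S0 * u ** heads * d ** (k - heads)

def cond_exp_S1_given_S2(w):
    """Average of S_1 (with respect to P) over the atom of sigma(S_2) containing w."""
    atom = [v for v in Omega if S(2, v) == S(2, w)]
    return sum(S(1, v) * P[v] for v in atom) / sum(P[v] for v in atom)

for w in Omega:
    print(w, S(2, w), cond_exp_S1_given_S2(w))

# Partial averaging over the atom A = {S_2 = u d S0}:
A = [w for w in Omega if S(2, w) == u * d * S0]
lhs = sum(cond_exp_S1_given_S2(w) * P[w] for w in A)
rhs = sum(S(1, w) * P[w] for w in A)
print(lhs == rhs)
```

On the middle atom the result is $\frac{1}{2}(u + d)S_0$ regardless of the value chosen for $p$, in agreement with the formula for $g$ above.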
2.3.2
Definition of Conditional Expectation
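Please see Williams, p. 83.
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, and let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. Let $X$ be a random variable on $(\Omega, \mathcal{F}, \mathbb{P})$. Then $\mathbb{E}(X|\mathcal{G})$ is defined to be any random variable $Y$ that satisfies:
(a) $Y$ is $\mathcal{G}$-measurable,
(b) For every set $A \in \mathcal{G}$, we have the “partial averaging property”
$$\int_A Y\,d\mathbb{P} = \int_A X\,d\mathbb{P}.$$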
Existence. There is always a random variable $Y$ satisfying the above properties (provided that $\mathbb{E}|X| < \infty$), i.e., conditional expectations always exist.
Uniqueness. There can be more than one random variable $Y$ satisfying the above properties, but if $Y'$ is another one, then $Y = Y'$ almost surely, i.e., $\mathbb{P}\{\omega \in \Omega;\ Y(\omega) = Y'(\omega)\} = 1$.
Notation 2.1 For random variables $X, Y$, it is standard notation to write
$$\mathbb{E}(X|Y) \triangleq \mathbb{E}(X|\sigma(Y)).$$
Here are some useful ways to think about $\mathbb{E}(X|\mathcal{G})$:
• A random experiment is performed, i.e., an element $\omega$ of $\Omega$ is selected. The value of $\omega$ is partially but not fully revealed to us, and thus we cannot compute the exact value of $X(\omega)$. Based on what we know about $\omega$, we compute an estimate of $X(\omega)$. Because this estimate depends on the partial information we have about $\omega$, it depends on $\omega$, i.e., $\mathbb{E}[X|Y](\omega)$ is a function of $\omega$, although the dependence on $\omega$ is often not shown explicitly.
• If the $\sigma$-algebra $\mathcal{G}$ contains finitely many sets, there will be a “smallest” set $A$ in $\mathcal{G}$ containing $\omega$, which is the intersection of all sets in $\mathcal{G}$ containing $\omega$. The way $\omega$ is partially revealed to us is that we are told it is in $A$, but not told which element of $A$ it is. We then define $\mathbb{E}[X|Y](\omega)$ to be the average (with respect to $\mathbb{P}$) value of $X$ over this set $A$. Thus, for all $\omega$ in this set $A$, $\mathbb{E}[X|Y](\omega)$ will be the same.
2.3.3
Further discussion of Partial Averaging
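The partial averaging property is
$$\int_A \mathbb{E}(X|\mathcal{G})\,d\mathbb{P} = \int_A X\,d\mathbb{P}, \quad \forall A \in \mathcal{G}. \tag{3.1}$$
We can rewrite this as
$$\mathbb{E}[I_A \cdot \mathbb{E}(X|\mathcal{G})] = \mathbb{E}[I_A \cdot X]. \tag{3.2}$$
Note that $I_A$ is a $\mathcal{G}$-measurable random variable. In fact the following holds:
Lemma 3.10 If $V$ is any $\mathcal{G}$-measurable random variable, then provided $\mathbb{E}|V \cdot \mathbb{E}(X|\mathcal{G})| < \infty$,
$$\mathbb{E}[V \cdot \mathbb{E}(X|\mathcal{G})] = \mathbb{E}[V \cdot X]. \tag{3.3}$$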
Proof: To see this, first use (3.2) and linearity of expectations to prove (3.3) when $V$ is a simple $\mathcal{G}$-measurable random variable, i.e., $V$ is of the form $V = \sum_{k=1}^n c_k I_{A_k}$, where each $A_k$ is in $\mathcal{G}$ and each $c_k$ is constant. Next consider the case that $V$ is a nonnegative $\mathcal{G}$-measurable random variable, but is not necessarily simple. Such a $V$ can be written as the limit of an increasing sequence of simple random variables $V_n$; we write (3.3) for each $V_n$ and then pass to the limit, using the Monotone Convergence Theorem (see Williams), to obtain (3.3) for $V$. Finally, the general $\mathcal{G}$-measurable random variable $V$ can be written as the difference of two nonnegative random variables, $V = V^+ - V^-$, and since (3.3) holds for $V^+$ and $V^-$, it must hold for $V$ as well. Williams calls this argument the “standard machine” (p. 56).
Based on this lemma, we can replace the second condition in the definition of a conditional expectation (Section 2.3.2) by:
(b’) For every $\mathcal{G}$-measurable random variable $V$, we have
$$\mathbb{E}[V \cdot \mathbb{E}(X|\mathcal{G})] = \mathbb{E}[V \cdot X]. \tag{3.4}$$
2.3.4
Properties of Conditional Expectation
Please see Williams, p. 88. Proof sketches of some of the properties are provided below.
(a) $\mathbb{E}(\mathbb{E}(X|\mathcal{G})) = \mathbb{E}X$.
Proof: Just take $A$ in the partial averaging property to be $\Omega$.
The conditional expectation of $X$ is thus an unbiased estimator of the random variable $X$.
(b) If $X$ is $\mathcal{G}$-measurable, then
$$\mathbb{E}(X|\mathcal{G}) = X.$$
Proof: The partial averaging property holds trivially when $Y$ is replaced by $X$. And since $X$ is $\mathcal{G}$-measurable, $X$ satisfies requirement (a) of a conditional expectation as well.
If the information content of $\mathcal{G}$ is sufficient to determine $X$, then the best estimate of $X$ based on $\mathcal{G}$ is $X$ itself.
(c) (Linearity)
$$\mathbb{E}(a_1 X_1 + a_2 X_2|\mathcal{G}) = a_1\mathbb{E}(X_1|\mathcal{G}) + a_2\mathbb{E}(X_2|\mathcal{G}).$$
(d) (Positivity) If $X \ge 0$ almost surely, then
$$\mathbb{E}(X|\mathcal{G}) \ge 0.$$
Proof: Take $A = \{\omega \in \Omega;\ \mathbb{E}(X|\mathcal{G})(\omega) < 0\}$. This set is in $\mathcal{G}$ since $\mathbb{E}(X|\mathcal{G})$ is $\mathcal{G}$-measurable. Partial averaging implies $\int_A \mathbb{E}(X|\mathcal{G})\,d\mathbb{P} = \int_A X\,d\mathbb{P}$. The right-hand side is greater than or equal to zero, and the left-hand side is strictly negative, unless $\mathbb{P}(A) = 0$. Therefore, $\mathbb{P}(A) = 0$.
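(h) (Jensen’s Inequality) If $\varphi : \mathbb{R} \to \mathbb{R}$ is convex and $\mathbb{E}|\varphi(X)| < \infty$, then
$$\mathbb{E}(\varphi(X)|\mathcal{G}) \ge \varphi(\mathbb{E}(X|\mathcal{G})).$$
Recall the usual Jensen’s Inequality: $\mathbb{E}\varphi(X) \ge \varphi(\mathbb{E}X)$.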
(i) (Tower Property) If $\mathcal{H}$ is a sub-$\sigma$-algebra of $\mathcal{G}$, then
$$\mathbb{E}(\mathbb{E}(X|\mathcal{G})|\mathcal{H}) = \mathbb{E}(X|\mathcal{H}).$$
“$\mathcal{H}$ is a sub-$\sigma$-algebra of $\mathcal{G}$” means that $\mathcal{G}$ contains more information than $\mathcal{H}$. If we estimate $X$ based on the information in $\mathcal{G}$, and then estimate the estimator based on the smaller amount of information in $\mathcal{H}$, then we get the same result as if we had estimated $X$ directly based on the information in $\mathcal{H}$.
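(j) (Taking out what is known) If $Z$ is $\mathcal{G}$-measurable, then
$$\mathbb{E}(ZX|\mathcal{G}) = Z \cdot \mathbb{E}(X|\mathcal{G}).$$
When conditioning on $\mathcal{G}$, the $\mathcal{G}$-measurable random variable $Z$ acts like a constant.
Proof: Let $Z$ be a $\mathcal{G}$-measurable random variable. A random variable $Y$ is $\mathbb{E}(ZX|\mathcal{G})$ if and only if
(a) $Y$ is $\mathcal{G}$-measurable;
(b) $\int_A Y\,d\mathbb{P} = \int_A ZX\,d\mathbb{P}, \quad \forall A \in \mathcal{G}$.
Take $Y = Z \cdot \mathbb{E}(X|\mathcal{G})$. Then $Y$ satisfies (a) (a product of $\mathcal{G}$-measurable random variables is $\mathcal{G}$-measurable). $Y$ also satisfies property (b), as we can check below:
$$\begin{aligned}
\int_A Y\,d\mathbb{P} &= \mathbb{E}(I_A \cdot Y) \\
&= \mathbb{E}[I_A Z\,\mathbb{E}(X|\mathcal{G})] \\
&= \mathbb{E}[I_A Z \cdot X] \quad \text{((b’) with } V = I_A Z\text{)} \\
&= \int_A ZX\,d\mathbb{P}.
\end{aligned}$$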
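(k) (Role of Independence) If $\mathcal{H}$ is independent of $\sigma(\sigma(X), \mathcal{G})$, then
$$\mathbb{E}(X|\sigma(\mathcal{G}, \mathcal{H})) = \mathbb{E}(X|\mathcal{G}).$$
In particular, if $X$ is independent of $\mathcal{H}$, then
$$\mathbb{E}(X|\mathcal{H}) = \mathbb{E}X.$$
If $\mathcal{H}$ is independent of $X$ and $\mathcal{G}$, then nothing is gained by including the information content of $\mathcal{H}$ in the estimation of $X$.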
2.3.5
Examples from the Binomial Model
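Recall that $\mathcal{F}_1 = \{\emptyset, A_H, A_T, \Omega\}$. Notice that $\mathbb{E}(S_2|\mathcal{F}_1)$ must be constant on $A_H$ and $A_T$.
Now since $\mathbb{E}(S_2|\mathcal{F}_1)$ must satisfy the partial averaging property,
$$\int_{A_H} \mathbb{E}(S_2|\mathcal{F}_1)\,d\mathbb{P} = \int_{A_H} S_2\,d\mathbb{P}, \qquad \int_{A_T} \mathbb{E}(S_2|\mathcal{F}_1)\,d\mathbb{P} = \int_{A_T} S_2\,d\mathbb{P}.$$
We compute
$$\int_{A_H} \mathbb{E}(S_2|\mathcal{F}_1)\,d\mathbb{P} = \mathbb{P}(A_H) \cdot \mathbb{E}(S_2|\mathcal{F}_1)(\omega) = p\,\mathbb{E}(S_2|\mathcal{F}_1)(\omega), \quad \forall \omega \in A_H.$$
On the other hand,
$$\int_{A_H} S_2\,d\mathbb{P} = p^2 u^2 S_0 + pq\,udS_0.$$
Therefore,
$$\mathbb{E}(S_2|\mathcal{F}_1)(\omega) = pu^2 S_0 + q\,udS_0, \quad \forall \omega \in A_H.$$
We can also write
$$\begin{aligned}
\mathbb{E}(S_2|\mathcal{F}_1)(\omega) &= pu^2 S_0 + q\,udS_0 \\
&= (pu + qd)uS_0 \\
&= (pu + qd)S_1(\omega), \quad \forall \omega \in A_H.
\end{aligned}$$
Similarly,
$$\mathbb{E}(S_2|\mathcal{F}_1)(\omega) = (pu + qd)S_1(\omega), \quad \forall \omega \in A_T.$$
Thus in both cases we have
$$\mathbb{E}(S_2|\mathcal{F}_1)(\omega) = (pu + qd)S_1(\omega), \quad \forall \omega \in \Omega.$$
A similar argument one time step later shows that
$$\mathbb{E}(S_3|\mathcal{F}_2)(\omega) = (pu + qd)S_2(\omega).$$
We leave the verification of this equality as an exercise. We can verify the Tower Property; for instance, from the previous equations we have
$$\begin{aligned}
\mathbb{E}[\mathbb{E}(S_3|\mathcal{F}_2)|\mathcal{F}_1] &= \mathbb{E}[(pu + qd)S_2|\mathcal{F}_1] \\
&= (pu + qd)\,\mathbb{E}(S_2|\mathcal{F}_1) \quad \text{(linearity)} \\
&= (pu + qd)^2 S_1.
\end{aligned}$$
This final expression is $\mathbb{E}(S_3|\mathcal{F}_1)$.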
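The identities above, including the exercise for $\mathbb{E}(S_3|\mathcal{F}_2)$ and the tower property, can be confirmed by brute force on the three-toss space. The sketch below is illustrative only; the parameter values and the helper name `cond_exp_Fk` are assumptions introduced here. It computes $\mathbb{E}(X|\mathcal{F}_k)$ by averaging over all outcomes that share the first k tosses.

```python
from fractions import Fraction
from itertools import product

# Illustrative parameters (assumptions, not taken from the text).
p, u, d, S0 = Fraction(2, 5), Fraction(2), Fraction(1, 2), Fraction(4)
q = 1 - p
omega = ["".join(t) for t in product("HT", repeat=3)]

def prob(w):
    return p ** w.count("H") * q ** w.count("T")

def S(k, w):
    """Stock price after the first k tosses of w."""
    return S0 * u ** w[:k].count("H") * d ** w[:k].count("T")

def cond_exp_Fk(X, k):
    """E[X | F_k] as a dict: average X over outcomes sharing the first k tosses."""
    out = {}
    for w in omega:
        atom = [v for v in omega if v[:k] == w[:k]]
        out[w] = sum(X(v) * prob(v) for v in atom) / sum(prob(v) for v in atom)
    return out

growth = p * u + q * d
e3_given_2 = cond_exp_Fk(lambda w: S(3, w), 2)        # E[S3 | F2]
tower = cond_exp_Fk(lambda w: e3_given_2[w], 1)       # E[E[S3 | F2] | F1]
direct = cond_exp_Fk(lambda w: S(3, w), 1)            # E[S3 | F1]
```

Here `tower` and `direct` agree outcome by outcome, and both equal $(pu+qd)^2 S_1$.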
2.4
Martingales
The ingredients are:
• A probability space $(\Omega, \mathcal{F}, \mathbb{P})$.
• A sequence of $\sigma$-algebras $\mathcal{F}_0, \mathcal{F}_1, \ldots, \mathcal{F}_n$, with the property that $\mathcal{F}_0 \subset \mathcal{F}_1 \subset \cdots \subset \mathcal{F}_n \subset \mathcal{F}$. Such a sequence of $\sigma$-algebras is called a filtration.
• A sequence of random variables $M_0, M_1, \ldots, M_n$. This is called a stochastic process.
Conditions for a martingale:
1. Each $M_k$ is $\mathcal{F}_k$-measurable. If you know the information in $\mathcal{F}_k$, then you know the value of $M_k$. We say that the process $\{M_k\}$ is adapted to the filtration $\{\mathcal{F}_k\}$.
2. For each $k$, $\mathbb{E}(M_{k+1}|\mathcal{F}_k) = M_k$. Martingales tend to go neither up nor down.
A supermartingale tends to go down, i.e., the second condition above is replaced by $\mathbb{E}(M_{k+1}|\mathcal{F}_k) \le M_k$; a submartingale tends to go up, i.e., $\mathbb{E}(M_{k+1}|\mathcal{F}_k) \ge M_k$.
jF
Example 2.3 (Example from the binomial model.) For $k = 1, 2$ we already showed that
$$\mathbb{E}(S_{k+1}|\mathcal{F}_k) = (pu + qd)S_k.$$
For $k = 0$, we set $\mathcal{F}_0 = \{\emptyset, \Omega\}$, the “trivial $\sigma$-algebra”. This $\sigma$-algebra contains no information, and any $\mathcal{F}_0$-measurable random variable must be constant (nonrandom). Therefore, by definition, $\mathbb{E}(S_1|\mathcal{F}_0)$ is that constant which satisfies the averaging property
$$\int_\Omega \mathbb{E}(S_1|\mathcal{F}_0)\,d\mathbb{P} = \int_\Omega S_1\,d\mathbb{P}.$$
The right-hand side is $\mathbb{E}S_1 = (pu + qd)S_0$, and so we have
$$\mathbb{E}(S_1|\mathcal{F}_0) = (pu + qd)S_0.$$
In conclusion,
• If $(pu + qd) = 1$ then $\{S_k, \mathcal{F}_k;\ k = 0, 1, 2, 3\}$ is a martingale.
• If $(pu + qd) \ge 1$ then $\{S_k, \mathcal{F}_k;\ k = 0, 1, 2, 3\}$ is a submartingale.
• If $(pu + qd) \le 1$ then $\{S_k, \mathcal{F}_k;\ k = 0, 1, 2, 3\}$ is a supermartingale.
Chapter 3
Arbitrage Pricing
3.1
Binomial Pricing
Return to the binomial pricing model
Please see:
Cox, Ross and Rubinstein, J. Financial Economics, 7(1979), 229–263, and
Cox and Rubinstein (1985), Options Markets, Prentice-Hall.
Example 3.1 (Pricing a Call Option) Suppose $u = 2$, $d = 0.5$, $r = 0.25$ (interest rate), $S_0 = 50$. (In this and all examples, the interest rate quoted is per unit time, and the stock prices $S_0, S_1, \ldots$ are indexed by the same time periods.) We know that
$$S_1(\omega) = \begin{cases} 100 & \text{if } \omega_1 = H, \\ 25 & \text{if } \omega_1 = T. \end{cases}$$
Find the value at time zero of a call option to buy one share of stock at time 1 for $50 (i.e., the strike price is $50).
The value of the call at time 1 is
$$V_1(\omega) = (S_1(\omega) - 50)^+ = \begin{cases} 50 & \text{if } \omega_1 = H, \\ 0 & \text{if } \omega_1 = T. \end{cases}$$
Suppose the option sells for $20 at time 0. Let us construct a portfolio:
1. Sell 3 options for $20 each. Cash outlay is −$60.
2. Buy 2 shares of stock for $50 each. Cash outlay is $100.
3. Borrow $40. Cash outlay is −$40.
This portfolio thus requires no initial investment. For this portfolio, the cash outlay at time 1 is:

                    ω1 = H     ω1 = T
Pay off option       $150        $0
Sell stock          −$200      −$50
Pay off debt          $50       $50
                    ------     ------
                       $0        $0

The arbitrage pricing theory (APT) value of the option at time 0 is $V_0 = 20$.
Assumptions underlying APT:
Unlimited short selling of stock.
Unlimited borrowing.
No transaction costs.
Agent is a “small investor”, i.e., his/her trading does not move the market.
Important Observation: The APT value of the option does not depend on the probabilities of H
and T .
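The cash flows of the portfolio in Example 3.1 can be tabulated mechanically. The sketch below is an illustration added here, with a hypothetical function name; the default arguments are the numbers of Example 3.1. Positive numbers are outlays and negative numbers are receipts, as in the example.

```python
def portfolio_cashflows(option_price, n_options=3, n_shares=2, loan=40.0,
                        S0=50.0, u=2.0, d=0.5, r=0.25, strike=50.0):
    """Cash outlays of the Example 3.1 portfolio: (time 0, {state: time 1})."""
    # Time 0: sell options (receipt), buy shares (outlay), borrow (receipt).
    outlay0 = -n_options * option_price + n_shares * S0 - loan
    outlay1 = {}
    for state, factor in (("H", u), ("T", d)):
        S1 = factor * S0
        payoff = max(S1 - strike, 0.0)
        # Time 1: pay off options, sell shares (receipt), repay loan with interest.
        outlay1[state] = n_options * payoff - n_shares * S1 + (1 + r) * loan
    return outlay0, outlay1
```

At a price of 20 every entry is zero. At any higher price, say 21, the same portfolio yields a receipt of 3 at time 0 with no outlay at time 1 in either state, which is an arbitrage; this is the sense in which 20 is the APT value. The probabilities of H and T never enter the computation.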
3.2
General one-step APT
Suppose a derivative security pays off the amount $V_1$ at time 1, where $V_1$ is an $\mathcal{F}_1$-measurable random variable. (This measurability condition is important; this is why it does not make sense to use some stock unrelated to the derivative security in valuing it, at least in the straightforward method described below.)
• Sell the security for $V_0$ at time 0. ($V_0$ is to be determined later.)
• Buy $\Delta_0$ shares of stock at time 0. ($\Delta_0$ is also to be determined later.)
• Invest $V_0 - \Delta_0 S_0$ in the money market, at risk-free interest rate $r$. ($V_0 - \Delta_0 S_0$ might be negative.)
• Then wealth at time 1 is
$$\begin{aligned}
X_1 &\triangleq \Delta_0 S_1 + (1 + r)(V_0 - \Delta_0 S_0) \\
&= (1 + r)V_0 + \Delta_0(S_1 - (1 + r)S_0).
\end{aligned}$$
• We want to choose $V_0$ and $\Delta_0$ so that
$$X_1 = V_1$$
regardless of whether the stock goes up or down.
The last condition above can be expressed by two equations (which is fortunate since there are two unknowns):
$$(1 + r)V_0 + \Delta_0(S_1(H) - (1 + r)S_0) = V_1(H) \tag{2.1}$$
$$(1 + r)V_0 + \Delta_0(S_1(T) - (1 + r)S_0) = V_1(T) \tag{2.2}$$
Note that this is where we use the fact that the derivative security value $V_k$ is a function of $S_k$, i.e., when $S_k$ is known for a given $\omega$, $V_k$ is known (and therefore non-random) at that $\omega$ as well. Subtracting the second equation above from the first gives
$$\Delta_0 = \frac{V_1(H) - V_1(T)}{S_1(H) - S_1(T)}. \tag{2.3}$$
Plug the formula (2.3) for $\Delta_0$ into (2.1):
$$\begin{aligned}
(1 + r)V_0 &= V_1(H) - \Delta_0(S_1(H) - (1 + r)S_0) \\
&= V_1(H) - \frac{V_1(H) - V_1(T)}{(u - d)S_0}\,(u - 1 - r)S_0 \\
&= \frac{1}{u - d}\left[(u - d)V_1(H) - (V_1(H) - V_1(T))(u - 1 - r)\right] \\
&= \frac{1 + r - d}{u - d}\,V_1(H) + \frac{u - 1 - r}{u - d}\,V_1(T).
\end{aligned}$$
We have already assumed $u > d > 0$. We now also assume $d \le 1 + r \le u$ (otherwise there would be an arbitrage opportunity). Define
$$\tilde p \triangleq \frac{1 + r - d}{u - d}, \qquad \tilde q \triangleq \frac{u - 1 - r}{u - d}.$$
Then $\tilde p \ge 0$ and $\tilde q \ge 0$. Since $\tilde p + \tilde q = 1$, we have $0 \le \tilde p \le 1$ and $\tilde q = 1 - \tilde p$. Thus, $\tilde p, \tilde q$ are like probabilities. We will return to this later. Thus the price of the call at time 0 is given by
$$V_0 = \frac{1}{1 + r}\left[\tilde p V_1(H) + \tilde q V_1(T)\right]. \tag{2.4}$$
3.3
Risk-Neutral Probability Measure
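Let $\Omega$ be the set of possible outcomes from $n$ coin tosses. Construct a probability measure $\widetilde{\mathbb{P}}$ on $\Omega$ by the formula
$$\widetilde{\mathbb{P}}(\omega_1, \omega_2, \ldots, \omega_n) \triangleq \tilde p^{\,\#\{j;\ \omega_j = H\}}\ \tilde q^{\,\#\{j;\ \omega_j = T\}}.$$
$\widetilde{\mathbb{P}}$ is called the risk-neutral probability measure. We denote by $\widetilde{\mathbb{E}}$ the expectation under $\widetilde{\mathbb{P}}$. Equation (2.4) says
$$V_0 = \widetilde{\mathbb{E}}\left[\frac{1}{1 + r}V_1\right].$$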
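Theorem 3.11 Under $\widetilde{\mathbb{P}}$, the discounted stock price process $\{(1 + r)^{-k}S_k, \mathcal{F}_k\}_{k=0}^n$ is a martingale.
Proof:
$$\begin{aligned}
\widetilde{\mathbb{E}}\left[(1 + r)^{-(k+1)}S_{k+1}\,\middle|\,\mathcal{F}_k\right]
&= (1 + r)^{-(k+1)}(\tilde p u + \tilde q d)S_k \\
&= (1 + r)^{-(k+1)}\left[\frac{u(1 + r - d)}{u - d} + \frac{d(u - 1 - r)}{u - d}\right]S_k \\
&= (1 + r)^{-(k+1)}\,\frac{u + ur - ud + du - d - dr}{u - d}\,S_k \\
&= (1 + r)^{-(k+1)}\,\frac{(u - d)(1 + r)}{u - d}\,S_k \\
&= (1 + r)^{-k}S_k.
\end{aligned}$$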
3.3.1
Portfolio Process
The portfolio process is $\Delta = (\Delta_0, \Delta_1, \ldots, \Delta_{n-1})$, where
• $\Delta_k$ is the number of shares of stock held between times $k$ and $k + 1$.
• Each $\Delta_k$ is $\mathcal{F}_k$-measurable. (No insider trading.)
3.3.2
Self-financing Value of a Portfolio Process
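• Start with nonrandom initial wealth $X_0$, which need not be 0.
• Define recursively
$$X_{k+1} = \Delta_k S_{k+1} + (1 + r)(X_k - \Delta_k S_k) \tag{3.1}$$
$$\phantom{X_{k+1}} = (1 + r)X_k + \Delta_k(S_{k+1} - (1 + r)S_k). \tag{3.2}$$
• Then each $X_k$ is $\mathcal{F}_k$-measurable.
Theorem 3.12 Under $\widetilde{\mathbb{P}}$, the discounted self-financing portfolio process value $\{(1 + r)^{-k}X_k, \mathcal{F}_k\}_{k=0}^n$ is a martingale.
Proof: We have
$$(1 + r)^{-(k+1)}X_{k+1} = (1 + r)^{-k}X_k + \Delta_k\left((1 + r)^{-(k+1)}S_{k+1} - (1 + r)^{-k}S_k\right).$$
Therefore,
$$\begin{aligned}
\widetilde{\mathbb{E}}\left[(1 + r)^{-(k+1)}X_{k+1}\,\middle|\,\mathcal{F}_k\right]
&= \widetilde{\mathbb{E}}\left[(1 + r)^{-k}X_k\,\middle|\,\mathcal{F}_k\right] \\
&\quad + \widetilde{\mathbb{E}}\left[(1 + r)^{-(k+1)}\Delta_k S_{k+1}\,\middle|\,\mathcal{F}_k\right] \\
&\quad - \widetilde{\mathbb{E}}\left[(1 + r)^{-k}\Delta_k S_k\,\middle|\,\mathcal{F}_k\right] \\
&= (1 + r)^{-k}X_k \quad \text{(requirement (b) of conditional exp.)} \\
&\quad + \Delta_k\,\widetilde{\mathbb{E}}\left[(1 + r)^{-(k+1)}S_{k+1}\,\middle|\,\mathcal{F}_k\right] \quad \text{(taking out what is known)} \\
&\quad - (1 + r)^{-k}\Delta_k S_k \quad \text{(property (b))} \\
&= (1 + r)^{-k}X_k \quad \text{(Theorem 3.11)}
\end{aligned}$$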
3.4
Simple European Derivative Securities
Definition 3.1 A simple European derivative security with expiration time $m$ is an $\mathcal{F}_m$-measurable random variable $V_m$. (Here, $m$ is less than or equal to $n$, the number of periods/coin-tosses in the model.)
Definition 3.2 A simple European derivative security $V_m$ is said to be hedgeable if there exists a constant $X_0$ and a portfolio process $\Delta = (\Delta_0, \ldots, \Delta_{m-1})$ such that the self-financing value process $X_0, X_1, \ldots, X_m$ given by (3.2) satisfies
$$X_m(\omega) = V_m(\omega), \quad \forall \omega \in \Omega.$$
In this case, for $k = 0, 1, \ldots, m$, we call $X_k$ the APT value at time $k$ of $V_m$.
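Theorem 4.13 (Corollary to Theorem 3.12) If a simple European security $V_m$ is hedgeable, then for each $k = 0, 1, \ldots, m$, the APT value at time $k$ of $V_m$ is
$$V_k \triangleq (1 + r)^k\,\widetilde{\mathbb{E}}\left[(1 + r)^{-m}V_m\,\middle|\,\mathcal{F}_k\right]. \tag{4.1}$$
Proof: We first observe that if $\{M_k, \mathcal{F}_k;\ k = 0, 1, \ldots, m\}$ is a martingale, i.e., satisfies the martingale property
$$\widetilde{\mathbb{E}}[M_{k+1}|\mathcal{F}_k] = M_k$$
for each $k = 0, 1, \ldots, m - 1$, then we also have
$$\widetilde{\mathbb{E}}[M_m|\mathcal{F}_k] = M_k, \quad k = 0, 1, \ldots, m - 1. \tag{4.2}$$
When $k = m - 1$, the equation (4.2) follows directly from the martingale property. For $k = m - 2$, we use the tower property to write
$$\begin{aligned}
\widetilde{\mathbb{E}}[M_m|\mathcal{F}_{m-2}] &= \widetilde{\mathbb{E}}\left[\widetilde{\mathbb{E}}[M_m|\mathcal{F}_{m-1}]\,\middle|\,\mathcal{F}_{m-2}\right] \\
&= \widetilde{\mathbb{E}}[M_{m-1}|\mathcal{F}_{m-2}] \\
&= M_{m-2}.
\end{aligned}$$
We can continue by induction to obtain (4.2).
If the simple European security $V_m$ is hedgeable, then there is a portfolio process whose self-financing value process $X_0, X_1, \ldots, X_m$ satisfies $X_m = V_m$. By definition, $X_k$ is the APT value at time $k$ of $V_m$. Theorem 3.12 says that
$$X_0,\ (1 + r)^{-1}X_1,\ \ldots,\ (1 + r)^{-m}X_m$$
is a martingale, and so for each $k$,
$$(1 + r)^{-k}X_k = \widetilde{\mathbb{E}}\left[(1 + r)^{-m}X_m\,\middle|\,\mathcal{F}_k\right] = \widetilde{\mathbb{E}}\left[(1 + r)^{-m}V_m\,\middle|\,\mathcal{F}_k\right].$$
Therefore,
$$X_k = (1 + r)^k\,\widetilde{\mathbb{E}}\left[(1 + r)^{-m}V_m\,\middle|\,\mathcal{F}_k\right].$$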
3.5
The Binomial Model is Complete
Can a simple European derivative security always be hedged? It depends on the model. If the answer
is “yes”, the model is said to be complete. If the answer is “no”, the model is called incomplete.
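Theorem 5.14 The binomial model is complete. In particular, let $V_m$ be a simple European derivative security, and set
$$V_k(\omega_1, \ldots, \omega_k) = (1 + r)^k\,\widetilde{\mathbb{E}}\left[(1 + r)^{-m}V_m\,\middle|\,\mathcal{F}_k\right](\omega_1, \ldots, \omega_k), \tag{5.1}$$
$$\Delta_k(\omega_1, \ldots, \omega_k) = \frac{V_{k+1}(\omega_1, \ldots, \omega_k, H) - V_{k+1}(\omega_1, \ldots, \omega_k, T)}{S_{k+1}(\omega_1, \ldots, \omega_k, H) - S_{k+1}(\omega_1, \ldots, \omega_k, T)}. \tag{5.2}$$
Starting with initial wealth $V_0 = \widetilde{\mathbb{E}}\left[(1 + r)^{-m}V_m\right]$, the self-financing value of the portfolio process $\Delta_0, \Delta_1, \ldots, \Delta_{m-1}$ is the process $V_0, V_1, \ldots, V_m$.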
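Proof: Let $V_0, \ldots, V_{m-1}$ and $\Delta_0, \ldots, \Delta_{m-1}$ be defined by (5.1) and (5.2). Set $X_0 = V_0$ and define the self-financing value of the portfolio process $\Delta_0, \ldots, \Delta_{m-1}$ by the recursive formula (3.2):
$$X_{k+1} = \Delta_k S_{k+1} + (1 + r)(X_k - \Delta_k S_k).$$
We need to show that
$$X_k = V_k, \quad \forall k \in \{0, 1, \ldots, m\}. \tag{5.3}$$
We proceed by induction. For $k = 0$, (5.3) holds by definition of $X_0$. Assume that (5.3) holds for some value of $k$, i.e., for each fixed $(\omega_1, \ldots, \omega_k)$, we have
$$X_k(\omega_1, \ldots, \omega_k) = V_k(\omega_1, \ldots, \omega_k).$$
We need to show that
$$X_{k+1}(\omega_1, \ldots, \omega_k, H) = V_{k+1}(\omega_1, \ldots, \omega_k, H),$$
$$X_{k+1}(\omega_1, \ldots, \omega_k, T) = V_{k+1}(\omega_1, \ldots, \omega_k, T).$$
We prove the first equality; the second can be shown similarly. Note first that
$$\begin{aligned}
\widetilde{\mathbb{E}}\left[(1 + r)^{-(k+1)}V_{k+1}\,\middle|\,\mathcal{F}_k\right]
&= \widetilde{\mathbb{E}}\left[\widetilde{\mathbb{E}}\left[(1 + r)^{-m}V_m\,\middle|\,\mathcal{F}_{k+1}\right]\middle|\,\mathcal{F}_k\right] \\
&= \widetilde{\mathbb{E}}\left[(1 + r)^{-m}V_m\,\middle|\,\mathcal{F}_k\right] \\
&= (1 + r)^{-k}V_k.
\end{aligned}$$
In other words, $\{(1 + r)^{-k}V_k\}_{k=0}^n$ is a martingale under $\widetilde{\mathbb{P}}$. In particular,
$$\begin{aligned}
V_k(\omega_1, \ldots, \omega_k) &= \widetilde{\mathbb{E}}\left[(1 + r)^{-1}V_{k+1}\,\middle|\,\mathcal{F}_k\right](\omega_1, \ldots, \omega_k) \\
&= \frac{1}{1 + r}\left[\tilde p V_{k+1}(\omega_1, \ldots, \omega_k, H) + \tilde q V_{k+1}(\omega_1, \ldots, \omega_k, T)\right].
\end{aligned}$$
Since $(\omega_1, \ldots, \omega_k)$ will be fixed for the rest of the proof, we simplify notation by suppressing these symbols. For example, we write the last equation as
$$V_k = \frac{1}{1 + r}\left[\tilde p V_{k+1}(H) + \tilde q V_{k+1}(T)\right].$$
We compute
$$\begin{aligned}
X_{k+1}(H) &= \Delta_k S_{k+1}(H) + (1 + r)(X_k - \Delta_k S_k) \\
&= \Delta_k\left(S_{k+1}(H) - (1 + r)S_k\right) + (1 + r)V_k \\
&= \frac{V_{k+1}(H) - V_{k+1}(T)}{S_{k+1}(H) - S_{k+1}(T)}\left(S_{k+1}(H) - (1 + r)S_k\right) + \tilde p V_{k+1}(H) + \tilde q V_{k+1}(T) \\
&= \frac{V_{k+1}(H) - V_{k+1}(T)}{uS_k - dS_k}\left(uS_k - (1 + r)S_k\right) + \tilde p V_{k+1}(H) + \tilde q V_{k+1}(T) \\
&= \left(V_{k+1}(H) - V_{k+1}(T)\right)\frac{u - 1 - r}{u - d} + \tilde p V_{k+1}(H) + \tilde q V_{k+1}(T) \\
&= \left(V_{k+1}(H) - V_{k+1}(T)\right)\tilde q + \tilde p V_{k+1}(H) + \tilde q V_{k+1}(T) \\
&= V_{k+1}(H).
\end{aligned}$$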
Chapter 4
The Markov Property
4.1
Binomial Model Pricing and Hedging
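Recall that $V_m$ is the given simple European derivative security, and the value and portfolio processes are given by:
$$V_k = (1 + r)^k\,\widetilde{\mathbb{E}}\left[(1 + r)^{-m}V_m\,\middle|\,\mathcal{F}_k\right], \quad k = 0, 1, \ldots, m - 1,$$
$$\Delta_k(\omega_1, \ldots, \omega_k) = \frac{V_{k+1}(\omega_1, \ldots, \omega_k, H) - V_{k+1}(\omega_1, \ldots, \omega_k, T)}{S_{k+1}(\omega_1, \ldots, \omega_k, H) - S_{k+1}(\omega_1, \ldots, \omega_k, T)}, \quad k = 0, 1, \ldots, m - 1.$$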
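Example 4.1 (Lookback Option) $u = 2$, $d = 0.5$, $r = 0.25$, $S_0 = 4$, $\tilde p = \frac{1 + r - d}{u - d} = 0.5$, $\tilde q = 1 - \tilde p = 0.5$.
Consider a simple European derivative security with expiration 2, with payoff given by (see Fig. 4.1):
$$V_2 = \max_{0 \le k \le 2}(S_k - 5)^+.$$
Notice that
$$V_2(HH) = 11, \quad V_2(HT) = 3 \ne V_2(TH) = 0, \quad V_2(TT) = 0.$$
The payoff is thus “path dependent”. Working backward in time, we have:
$$V_1(H) = \frac{1}{1 + r}\left[\tilde p V_2(HH) + \tilde q V_2(HT)\right] = \frac{4}{5}\left[0.5 \times 11 + 0.5 \times 3\right] = 5.60,$$
$$V_1(T) = \frac{4}{5}\left[0.5 \times 0 + 0.5 \times 0\right] = 0,$$
$$V_0 = \frac{4}{5}\left[0.5 \times 5.60 + 0.5 \times 0\right] = 2.24.$$
Using these values, we can now compute:
$$\Delta_0 = \frac{V_1(H) - V_1(T)}{S_1(H) - S_1(T)} = 0.93,$$
$$\Delta_1(H) = \frac{V_2(HH) - V_2(HT)}{S_2(HH) - S_2(HT)} = 0.67,$$
$$\Delta_1(T) = \frac{V_2(TH) - V_2(TT)}{S_2(TH) - S_2(TT)} = 0.$$
Figure 4.1: Stock price underlying the lookback option. (Binomial tree: $S_0 = 4$; $S_1(H) = 8$, $S_1(T) = 2$; $S_2(HH) = 16$, $S_2(HT) = S_2(TH) = 4$, $S_2(TT) = 1$.)
Working forward in time, we can check that
$$X_1(H) = \Delta_0 S_1(H) + (1 + r)(X_0 - \Delta_0 S_0) = 5.59; \quad V_1(H) = 5.60,$$
$$X_1(T) = \Delta_0 S_1(T) + (1 + r)(X_0 - \Delta_0 S_0) = 0.01; \quad V_1(T) = 0,$$
$$X_2(HH) = \Delta_1(H)S_2(HH) + (1 + r)(X_1(H) - \Delta_1(H)S_1(H)) = 11.01; \quad V_2(HH) = 11,$$
etc. (The small discrepancies come from rounding $\Delta_0$ and $\Delta_1(H)$ to two decimal places.)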
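The backward and forward passes of this example can be reproduced by enumerating all $2^m$ paths. The sketch below is illustrative; the function `price_and_hedge` and its dictionary layout (keys are path prefixes such as "", "H", "HT") are assumptions made here, not notation from the notes.

```python
from itertools import product

def price_and_hedge(payoff, S0, u, d, r, m):
    """Backward induction over all 2**m paths.

    payoff takes the list [S_0, ..., S_m] along a path. Returns (V, delta),
    dicts keyed by path prefix, following (5.1) and (5.2).
    """
    p = (1 + r - d) / (u - d)
    q = 1 - p

    def stock(path):
        return S0 * u ** path.count("H") * d ** path.count("T")

    # Terminal values: one per complete path.
    V = {}
    for w in product("HT", repeat=m):
        path = "".join(w)
        V[path] = payoff([stock(path[:k]) for k in range(m + 1)])
    delta = {}
    for k in range(m - 1, -1, -1):
        for w in product("HT", repeat=k):
            path = "".join(w)
            up, down = V[path + "H"], V[path + "T"]
            V[path] = (p * up + q * down) / (1 + r)
            delta[path] = (up - down) / (stock(path + "H") - stock(path + "T"))
    return V, delta

def lookback(prices):
    """Payoff of Example 4.1: max over k of (S_k - 5)^+."""
    return max(max(s - 5, 0) for s in prices)

V, delta = price_and_hedge(lookback, S0=4, u=2, d=0.5, r=0.25, m=2)
```

With unrounded hedge ratios ($\Delta_0 = 5.6/6$, $\Delta_1(H) = 2/3$) the forward recursion (3.2) reproduces the payoff exactly, which is the content of Theorem 5.14. The cost is $2^m$ terminal nodes, so this approach is only practical for small m.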
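Example 4.2 (European Call) Let $u = 2$, $d = \frac{1}{2}$, $r = \frac{1}{4}$, $S_0 = 4$, $\tilde p = \tilde q = \frac{1}{2}$, and consider a European call with expiration time 2 and payoff function
$$V_2 = (S_2 - 5)^+.$$
Note that
$$V_2(HH) = 11, \quad V_2(HT) = V_2(TH) = 0, \quad V_2(TT) = 0,$$
$$V_1(H) = \tfrac{4}{5}\left[\tfrac{1}{2} \cdot 11 + \tfrac{1}{2} \cdot 0\right] = 4.40,$$
$$V_1(T) = \tfrac{4}{5}\left[\tfrac{1}{2} \cdot 0 + \tfrac{1}{2} \cdot 0\right] = 0,$$
$$V_0 = \tfrac{4}{5}\left[\tfrac{1}{2} \cdot 4.40 + \tfrac{1}{2} \cdot 0\right] = 1.76.$$
Define $v_k(x)$ to be the value of the call at time $k$ when $S_k = x$. Then
$$v_2(x) = (x - 5)^+,$$
$$v_1(x) = \tfrac{4}{5}\left[\tfrac{1}{2}v_2(2x) + \tfrac{1}{2}v_2(x/2)\right],$$
$$v_0(x) = \tfrac{4}{5}\left[\tfrac{1}{2}v_1(2x) + \tfrac{1}{2}v_1(x/2)\right].$$
In particular,
$$v_2(16) = 11, \quad v_2(4) = 0, \quad v_2(1) = 0,$$
$$v_1(8) = \tfrac{4}{5}\left[\tfrac{1}{2} \cdot 11 + \tfrac{1}{2} \cdot 0\right] = 4.40,$$
$$v_1(2) = \tfrac{4}{5}\left[\tfrac{1}{2} \cdot 0 + \tfrac{1}{2} \cdot 0\right] = 0,$$
$$v_0(4) = \tfrac{4}{5}\left[\tfrac{1}{2} \cdot 4.40 + \tfrac{1}{2} \cdot 0\right] = 1.76.$$
Let $\delta_k(x)$ be the number of shares in the hedging portfolio at time $k$ when $S_k = x$. Then
$$\delta_k(x) = \frac{v_{k+1}(2x) - v_{k+1}(x/2)}{2x - x/2}, \quad k = 0, 1.$$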
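Because the value in Example 4.2 depends on the path only through the current stock price, the recursion for $v_k(x)$ can be coded as a function of $(k, x)$ and cached. The sketch below is illustrative; `make_value_function` and `hedge_ratio` are hypothetical names, and caching on a floating-point x is an assumption that works here because u and d are powers of two (so $u \cdot d \cdot x$ recombines exactly).

```python
from functools import lru_cache

def make_value_function(g, u, d, r, n):
    """Return v(k, x), the time-k value when S_k = x, for terminal payoff g."""
    p = (1 + r - d) / (u - d)
    q = 1 - p

    @lru_cache(maxsize=None)
    def v(k, x):
        if k == n:
            return g(x)
        return (p * v(k + 1, u * x) + q * v(k + 1, d * x)) / (1 + r)

    return v

def hedge_ratio(v, k, x, u, d):
    """delta_k(x) = (v_{k+1}(ux) - v_{k+1}(dx)) / (ux - dx)."""
    return (v(k + 1, u * x) - v(k + 1, d * x)) / ((u - d) * x)

def call(x):
    return max(x - 5.0, 0.0)

v = make_value_function(call, u=2.0, d=0.5, r=0.25, n=2)
```

Calling `v(0, 4.0)` returns approximately 1.76 and `v(1, 8.0)` approximately 4.40, in agreement with the example. For general u and d, a safer design is to index states by the number of heads rather than by the floating-point price.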
4.2
Computational Issues
For a model with $n$ periods (coin tosses), $\Omega$ has $2^n$ elements. For period $k$, we must solve $2^k$ equations of the form
$$V_k(\omega_1, \ldots, \omega_k) = \frac{1}{1 + r}\left[\tilde p V_{k+1}(\omega_1, \ldots, \omega_k, H) + \tilde q V_{k+1}(\omega_1, \ldots, \omega_k, T)\right].$$
For example, a three-month option has 66 trading days. If each day is taken to be one period, then $n = 66$ and $2^{66} \approx 7 \times 10^{19}$.
There are three possible ways to deal with this problem:
1. Simulation. We have, for example, that
$$V_0 = (1 + r)^{-n}\,\widetilde{\mathbb{E}}V_n,$$
and so we could compute $V_0$ by simulation. More specifically, we could simulate $n$ coin tosses $\omega = (\omega_1, \ldots, \omega_n)$ under the risk-neutral probability measure. We could store the value of $V_n(\omega)$. We could repeat this several times and take the average value of $V_n$ as an approximation to $\widetilde{\mathbb{E}}V_n$.
2. Approximate a many-period model by a continuous-time model. Then we can use calculus and partial differential equations. We’ll get to that.
3. Look for Markov structure. Example 4.2 has this. In period 2, the option in Example 4.2 has three possible values $v_2(16), v_2(4), v_2(1)$, rather than four possible values $V_2(HH), V_2(HT), V_2(TH), V_2(TT)$. If there were 66 periods, then in period 66 there would be 67 possible stock price values (since the final price depends only on the number of up-ticks of the stock price, i.e., heads, so far) and hence only 67 possible option values, rather than $2^{66} \approx 7 \times 10^{19}$.
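Approach 1 can be sketched in a few lines. This is an illustration under stated assumptions: the function name, the use of Python's `random` module with a fixed seed, and the choice of 100,000 paths are all choices made here, and the payoff is assumed to depend only on the final stock price.

```python
import random

def simulate_price(payoff, S0, u, d, r, n, n_paths, seed=0):
    """Monte Carlo estimate of V0 = (1 + r)**(-n) * E~[V_n]."""
    rng = random.Random(seed)
    p = (1 + r - d) / (u - d)          # risk-neutral probability of H
    total = 0.0
    for _ in range(n_paths):
        s = S0
        for _ in range(n):
            s *= u if rng.random() < p else d
        total += payoff(s)
    return total / n_paths / (1 + r) ** n

# The call of Example 4.2, whose exact time-zero value is 1.76.
estimate = simulate_price(lambda s: max(s - 5.0, 0.0), 4.0, 2.0, 0.5, 0.25,
                          n=2, n_paths=100_000)

# Size of the problem for n = 66: paths versus distinct tree nodes.
n_paths_66 = 2 ** 66
n_nodes_66 = sum(k + 1 for k in range(67))
```

The estimate carries sampling error (roughly 0.01 in standard deviation for this payoff and path count), so it approximates 1.76 rather than reproducing it. The last two lines contrast the $2^{66}$ paths with the 2278 nodes of the recombining tree that approach 3 exploits.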
4.3
Markov Processes
Technical condition always present: We consider only functions on $\mathbb{R}$ and subsets of $\mathbb{R}$ which are Borel-measurable, i.e., we only consider subsets $A$ of $\mathbb{R}$ that are in $\mathcal{B}$ and functions $g : \mathbb{R} \to \mathbb{R}$ such that $g^{-1}$ is a function $\mathcal{B} \to \mathcal{B}$.
Definition 4.1 Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. Let $\{\mathcal{F}_k\}_{k=0}^n$ be a filtration under $\mathcal{F}$. Let $\{X_k\}_{k=0}^n$ be a stochastic process on $(\Omega, \mathcal{F}, \mathbb{P})$. This process is said to be Markov if:
• The stochastic process $\{X_k\}$ is adapted to the filtration $\{\mathcal{F}_k\}$, and
• (The Markov Property). For each $k = 0, 1, \ldots, n - 1$, the distribution of $X_{k+1}$ conditioned on $\mathcal{F}_k$ is the same as the distribution of $X_{k+1}$ conditioned on $X_k$.
4.3.1
Different ways to write the Markov property
(a) (Agreement of distributions). For every $A \in \mathcal{B} \triangleq \mathcal{B}(\mathbb{R})$, we have
$$\begin{aligned}
\mathbb{P}(X_{k+1} \in A|\mathcal{F}_k) &= \mathbb{E}[I_A(X_{k+1})|\mathcal{F}_k] \\
&= \mathbb{E}[I_A(X_{k+1})|X_k] \\
&= \mathbb{P}[X_{k+1} \in A|X_k].
\end{aligned}$$
(b) (Agreement of expectations of all functions). For every (Borel-measurable) function $h : \mathbb{R} \to \mathbb{R}$ for which $\mathbb{E}|h(X_{k+1})| < \infty$, we have
$$\mathbb{E}[h(X_{k+1})|\mathcal{F}_k] = \mathbb{E}[h(X_{k+1})|X_k].$$
(c) (Agreement of Laplace transforms.) For every $u \in \mathbb{R}$ for which $\mathbb{E}e^{uX_{k+1}} < \infty$, we have
$$\mathbb{E}\left[e^{uX_{k+1}}\,\middle|\,\mathcal{F}_k\right] = \mathbb{E}\left[e^{uX_{k+1}}\,\middle|\,X_k\right].$$
(If we fix $u$ and define $h(x) = e^{ux}$, then the equations in (b) and (c) are the same. However in (b) we have a condition which holds for every function $h$, and in (c) we assume this condition only for functions $h$ of the form $h(x) = e^{ux}$. A main result in the theory of Laplace transforms is that if the equation holds for every $h$ of this special form, then it holds for every $h$, i.e., (c) implies (b).)
(d) (Agreement of characteristic functions) For every $u \in \mathbb{R}$, we have
$$\mathbb{E}\left[e^{iuX_{k+1}}\,\middle|\,\mathcal{F}_k\right] = \mathbb{E}\left[e^{iuX_{k+1}}\,\middle|\,X_k\right],$$
where $i = \sqrt{-1}$. (Since $|e^{iux}| = |\cos(ux) + i\sin(ux)| = 1$, we don’t need to assume that $\mathbb{E}|e^{iuX_{k+1}}| < \infty$.)
Remark 4.1 In every case of the Markov properties where $\mathbb{E}[\ldots|X_k]$ appears, we could just as well write $g(X_k)$ for some function $g$. For example, form (a) of the Markov property can be restated as:
For every $A \in \mathcal{B}$, we have
$$\mathbb{P}(X_{k+1} \in A|\mathcal{F}_k) = g(X_k),$$
where $g$ is a function that depends on the set $A$.
Conditions (a)-(d) are equivalent. The Markov property as stated in (a)-(d) involves the process at
a “current” time k and one future time k + 1. Conditions (a)-(d) are also equivalent to conditions
involving the process at time k and multiple future times. We write these apparently stronger but
actually equivalent conditions below.
Consequences of the Markov property. Let j be a positive integer.
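(A) For every $A_{k+1} \subset \mathbb{R}, \ldots, A_{k+j} \subset \mathbb{R}$,
$$\mathbb{P}[X_{k+1} \in A_{k+1}, \ldots, X_{k+j} \in A_{k+j}|\mathcal{F}_k] = \mathbb{P}[X_{k+1} \in A_{k+1}, \ldots, X_{k+j} \in A_{k+j}|X_k].$$
(A’) For every $A \subset \mathbb{R}^j$,
$$\mathbb{P}[(X_{k+1}, \ldots, X_{k+j}) \in A|\mathcal{F}_k] = \mathbb{P}[(X_{k+1}, \ldots, X_{k+j}) \in A|X_k].$$
(B) For every function $h : \mathbb{R}^j \to \mathbb{R}$ for which $\mathbb{E}|h(X_{k+1}, \ldots, X_{k+j})| < \infty$, we have
$$\mathbb{E}[h(X_{k+1}, \ldots, X_{k+j})|\mathcal{F}_k] = \mathbb{E}[h(X_{k+1}, \ldots, X_{k+j})|X_k].$$
(C) For every $u = (u_{k+1}, \ldots, u_{k+j}) \in \mathbb{R}^j$ for which $\mathbb{E}\left|e^{u_{k+1}X_{k+1} + \cdots + u_{k+j}X_{k+j}}\right| < \infty$, we have
$$\mathbb{E}\left[e^{u_{k+1}X_{k+1} + \cdots + u_{k+j}X_{k+j}}\,\middle|\,\mathcal{F}_k\right] = \mathbb{E}\left[e^{u_{k+1}X_{k+1} + \cdots + u_{k+j}X_{k+j}}\,\middle|\,X_k\right].$$
(D) For every $u = (u_{k+1}, \ldots, u_{k+j}) \in \mathbb{R}^j$ we have
$$\mathbb{E}\left[e^{i(u_{k+1}X_{k+1} + \cdots + u_{k+j}X_{k+j})}\,\middle|\,\mathcal{F}_k\right] = \mathbb{E}\left[e^{i(u_{k+1}X_{k+1} + \cdots + u_{k+j}X_{k+j})}\,\middle|\,X_k\right].$$
Once again, every expression of the form $\mathbb{E}(\ldots|X_k)$ can also be written as $g(X_k)$, where the function $g$ depends on the random variable represented by $\ldots$ in this expression.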
Remark. All these Markov properties have analogues for vector-valued processes.
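Proof that (b) $\Longrightarrow$ (A). (with $j = 2$ in (A)) Assume (b). Then (a) also holds (take $h = I_A$).
Consider
$$\begin{aligned}
&\mathbb{P}[X_{k+1} \in A_{k+1}, X_{k+2} \in A_{k+2}|\mathcal{F}_k] \\
&\quad = \mathbb{E}\left[I_{A_{k+1}}(X_{k+1})\,I_{A_{k+2}}(X_{k+2})\,\middle|\,\mathcal{F}_k\right] && \text{(Definition of conditional probability)} \\
&\quad = \mathbb{E}\left[\mathbb{E}\left[I_{A_{k+1}}(X_{k+1})\,I_{A_{k+2}}(X_{k+2})\,\middle|\,\mathcal{F}_{k+1}\right]\middle|\,\mathcal{F}_k\right] && \text{(Tower property)} \\
&\quad = \mathbb{E}\left[I_{A_{k+1}}(X_{k+1}) \cdot \mathbb{E}\left[I_{A_{k+2}}(X_{k+2})\,\middle|\,\mathcal{F}_{k+1}\right]\middle|\,\mathcal{F}_k\right] && \text{(Taking out what is known)} \\
&\quad = \mathbb{E}\left[I_{A_{k+1}}(X_{k+1}) \cdot \mathbb{E}\left[I_{A_{k+2}}(X_{k+2})\,\middle|\,X_{k+1}\right]\middle|\,\mathcal{F}_k\right] && \text{(Markov property, form (a))} \\
&\quad = \mathbb{E}\left[I_{A_{k+1}}(X_{k+1}) \cdot g(X_{k+1})\,\middle|\,\mathcal{F}_k\right] && \text{(Remark 4.1)} \\
&\quad = \mathbb{E}\left[I_{A_{k+1}}(X_{k+1}) \cdot g(X_{k+1})\,\middle|\,X_k\right] && \text{(Markov property, form (b))}
\end{aligned}$$
Now take conditional expectation on both sides of the above equation, conditioned on $\sigma(X_k)$, and use the tower property on the left, to obtain
$$\mathbb{P}[X_{k+1} \in A_{k+1}, X_{k+2} \in A_{k+2}|X_k] = \mathbb{E}\left[I_{A_{k+1}}(X_{k+1}) \cdot g(X_{k+1})\,\middle|\,X_k\right]. \tag{3.1}$$
Since both
$$\mathbb{P}[X_{k+1} \in A_{k+1}, X_{k+2} \in A_{k+2}|\mathcal{F}_k]$$
and
$$\mathbb{P}[X_{k+1} \in A_{k+1}, X_{k+2} \in A_{k+2}|X_k]$$
are equal to the RHS of (3.1), they are equal to each other, and this is property (A) with $j = 2$.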
Example 4.3 It is intuitively clear that the stock price process in the binomial model is a Markov process. We will formally prove this later. If we want to estimate the distribution of $S_{k+1}$ based on the information in $\mathcal{F}_k$, the only relevant piece of information is the value of $S_k$. For example,
$$\widetilde{\mathbb{E}}[S_{k+1}|\mathcal{F}_k] = (\tilde p u + \tilde q d)S_k = (1 + r)S_k \tag{3.2}$$
is a function of $S_k$. Note however that form (b) of the Markov property is stronger than (3.2); the Markov property requires that for any function $h$,
$$\widetilde{\mathbb{E}}[h(S_{k+1})|\mathcal{F}_k]$$
is a function of $S_k$. Equation (3.2) is the case of $h(x) = x$.
Consider a model with 66 periods and a simple European derivative security whose payoff at time 66 is
$$V_{66} = \tfrac{1}{3}(S_{64} + S_{65} + S_{66}).$$
The value of this security at time 50 is
$$\begin{aligned}
V_{50} &= (1 + r)^{50}\,\widetilde{\mathbb{E}}\left[(1 + r)^{-66}V_{66}\,\middle|\,\mathcal{F}_{50}\right] \\
&= (1 + r)^{-16}\,\widetilde{\mathbb{E}}[V_{66}|S_{50}],
\end{aligned}$$
because the stock price process is Markov. (We are using form (B) of the Markov property here.) In other words, the $\mathcal{F}_{50}$-measurable random variable $V_{50}$ can be written as
$$V_{50}(\omega_1, \ldots, \omega_{50}) = g(S_{50}(\omega_1, \ldots, \omega_{50}))$$
for some function $g$, which we can determine with a bit of work.
4.4
Showing that a process is Markov
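Definition 4.2 (Independence) Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, and let $\mathcal{G}$ and $\mathcal{H}$ be sub-$\sigma$-algebras of $\mathcal{F}$. We say that $\mathcal{G}$ and $\mathcal{H}$ are independent if for every $A \in \mathcal{G}$ and $B \in \mathcal{H}$, we have
$$\mathbb{P}(A \cap B) = \mathbb{P}(A)\,\mathbb{P}(B).$$
We say that a random variable $X$ is independent of a $\sigma$-algebra $\mathcal{G}$ if $\sigma(X)$, the $\sigma$-algebra generated by $X$, is independent of $\mathcal{G}$.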
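Example 4.4 Consider the two-period binomial model. Recall that $\mathcal{F}_1$ is the $\sigma$-algebra of sets determined by the first toss, i.e., $\mathcal{F}_1$ contains the four sets
$$A_H \triangleq \{HH, HT\}, \quad A_T \triangleq \{TH, TT\}, \quad \emptyset, \quad \Omega.$$
Let $\mathcal{H}$ be the $\sigma$-algebra of sets determined by the second toss, i.e., $\mathcal{H}$ contains the four sets
$$\{HH, TH\}, \quad \{HT, TT\}, \quad \emptyset, \quad \Omega.$$
Then $\mathcal{F}_1$ and $\mathcal{H}$ are independent. For example, if we take $A = \{HH, HT\}$ from $\mathcal{F}_1$ and $B = \{HH, TH\}$ from $\mathcal{H}$, then $\mathbb{P}(A \cap B) = \mathbb{P}(HH) = p^2$ and
$$\mathbb{P}(A)\,\mathbb{P}(B) = (p^2 + pq)(p^2 + pq) = p^2(p + q)^2 = p^2.$$
Note that $\mathcal{F}_1$ and $\sigma(S_2)$ are not independent (unless $p = 1$ or $p = 0$). For example, one of the sets in $\sigma(S_2)$ is $\{\omega;\ S_2(\omega) = u^2 S_0\} = \{HH\}$. If we take $A = \{HH, HT\}$ from $\mathcal{F}_1$ and $B = \{HH\}$ from $\sigma(S_2)$, then $\mathbb{P}(A \cap B) = \mathbb{P}(HH) = p^2$, but
$$\mathbb{P}(A)\,\mathbb{P}(B) = (p^2 + pq)p^2 = p^3(p + q) = p^3.$$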
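For finite $\sigma$-algebras, Definition 4.2 can be tested exhaustively. The sketch below is an illustration; the value p = 1/3 and the helper names are assumptions. It builds each $\sigma$-algebra of Example 4.4 as the set of all unions of blocks of a partition, and checks the product rule for every pair of sets.

```python
from fractions import Fraction
from itertools import combinations

p = Fraction(1, 3)  # illustrative value with 0 < p < 1 (an assumption)
q = 1 - p

def prob(event):
    """P(event) for a set of two-toss outcomes."""
    return sum((p ** w.count("H") * q ** w.count("T") for w in event), Fraction(0))

def sigma_algebra(partition):
    """All unions of blocks of a finite partition of the sample space."""
    blocks = [frozenset(b) for b in partition]
    return [frozenset().union(*c)
            for r in range(len(blocks) + 1)
            for c in combinations(blocks, r)]

def independent(G, H):
    """True if P(A & B) == P(A) * P(B) for every A in G and B in H."""
    return all(prob(A & B) == prob(A) * prob(B) for A in G for B in H)

F1 = sigma_algebra([{"HH", "HT"}, {"TH", "TT"}])          # first toss
H2 = sigma_algebra([{"HH", "TH"}, {"HT", "TT"}])          # second toss
sigma_S2 = sigma_algebra([{"HH"}, {"HT", "TH"}, {"TT"}])  # level sets of S2
```

`independent(F1, H2)` holds, while `independent(F1, sigma_S2)` fails, for instance on the pair $A = \{HH, HT\}$, $B = \{HH\}$ used in the example.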
The following lemma will be very useful in showing that a process is Markov:
Lemma 4.15 (Independence Lemma) Let $X$ and $Y$ be random variables on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. Assume
• $X$ is independent of $\mathcal{G}$;
• $Y$ is $\mathcal{G}$-measurable.
Let $f(x, y)$ be a function of two variables, and define
$$g(y) \triangleq \mathbb{E}f(X, y).$$
Then
$$\mathbb{E}[f(X, Y)|\mathcal{G}] = g(Y).$$
Remark. In this lemma and the following discussion, capital letters denote random variables and lower case letters denote nonrandom variables.
Example 4.5 (Showing the stock price process is Markov) Consider an $n$-period binomial model. Fix a time $k$ and define $X \triangleq \frac{S_{k+1}}{S_k}$ and $\mathcal{G} \triangleq \mathcal{F}_k$. Then $X = u$ if $\omega_{k+1} = H$ and $X = d$ if $\omega_{k+1} = T$. Since $X$ depends only on the $(k + 1)$st toss, $X$ is independent of $\mathcal{G}$. Define $Y \triangleq S_k$, so that $Y$ is $\mathcal{G}$-measurable. Let $h$ be any function and set $f(x, y) \triangleq h(xy)$. Then
$$g(y) \triangleq \mathbb{E}f(X, y) = \mathbb{E}h(Xy) = p\,h(uy) + q\,h(dy).$$
The Independence Lemma asserts that
$$\begin{aligned}
\mathbb{E}[h(S_{k+1})|\mathcal{F}_k] &= \mathbb{E}\left[h\!\left(\frac{S_{k+1}}{S_k} \cdot S_k\right)\middle|\,\mathcal{F}_k\right] \\
&= \mathbb{E}[f(X, Y)|\mathcal{G}] \\
&= g(Y) \\
&= p\,h(uS_k) + q\,h(dS_k).
\end{aligned}$$
This shows the stock price is Markov. Indeed, if we condition both sides of the above equation on $\sigma(S_k)$ and use the tower property on the left and the fact that the right-hand side is $\sigma(S_k)$-measurable, we obtain
$$\mathbb{E}[h(S_{k+1})|S_k] = p\,h(uS_k) + q\,h(dS_k).$$
Thus $\mathbb{E}[h(S_{k+1})|\mathcal{F}_k]$ and $\mathbb{E}[h(S_{k+1})|S_k]$ are equal and form (b) of the Markov property is proved.
Not only have we shown that the stock price process is Markov, but we have also obtained a formula for $\mathbb{E}[h(S_{k+1})|\mathcal{F}_k]$ as a function of $S_k$. This is a special case of Remark 4.1.
4.5
Application to Exotic Options
Consider an $n$-period binomial model. Define the running maximum of the stock price to be
$$M_k \triangleq \max_{1 \le j \le k} S_j.$$
Consider a simple European derivative security with payoff at time $n$ of $v_n(S_n, M_n)$.
Examples:
• $v_n(S_n, M_n) = (M_n - K)^+$ (Lookback option);
• $v_n(S_n, M_n) = I_{\{M_n \ge B\}}(S_n - K)^+$ (Knock-in Barrier option).
Lemma 5.16 The two-dimensional process $\{(S_k, M_k)\}_{k=0}^n$ is Markov. (Here we are working under the risk-neutral measure $\widetilde{\mathbb{P}}$, although that does not matter.)
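Proof: Fix $k$. We have
$$M_{k+1} = M_k \vee S_{k+1},$$
where $\vee$ indicates the maximum of two quantities. Let $Z \triangleq \frac{S_{k+1}}{S_k}$, so
$$\widetilde{\mathbb{P}}\{Z = u\} = \tilde p, \quad \widetilde{\mathbb{P}}\{Z = d\} = \tilde q,$$
and $Z$ is independent of $\mathcal{F}_k$. Let $h(x, y)$ be a function of two variables. We have
$$\begin{aligned}
h(S_{k+1}, M_{k+1}) &= h(S_{k+1}, M_k \vee S_{k+1}) \\
&= h(ZS_k, M_k \vee (ZS_k)).
\end{aligned}$$
Define
$$\begin{aligned}
g(x, y) &\triangleq \widetilde{\mathbb{E}}\,h(Zx, y \vee (Zx)) \\
&= \tilde p\,h(ux, y \vee (ux)) + \tilde q\,h(dx, y \vee (dx)).
\end{aligned}$$
The Independence Lemma implies
$$\widetilde{\mathbb{E}}[h(S_{k+1}, M_{k+1})|\mathcal{F}_k] = g(S_k, M_k) = \tilde p\,h(uS_k, M_k \vee (uS_k)) + \tilde q\,h(dS_k, M_k),$$
the second equality being a consequence of the fact that $M_k \vee (dS_k) = M_k$ (since $d < 1$ and $S_k \le M_k$). Since the RHS is a function of $(S_k, M_k)$, we have proved the Markov property (form (b)) for this two-dimensional process.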
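Continuing with the exotic option of the previous Lemma... Let $V_k$ denote the value of the derivative security at time $k$. Since $(1 + r)^{-k}V_k$ is a martingale under $\widetilde{\mathbb{P}}$, we have
$$V_k = \frac{1}{1 + r}\,\widetilde{\mathbb{E}}[V_{k+1}|\mathcal{F}_k], \quad k = 0, 1, \ldots, n - 1.$$
At the final time, we have
$$V_n = v_n(S_n, M_n).$$
Stepping back one step, we can compute
$$\begin{aligned}
V_{n-1} &= \frac{1}{1 + r}\,\widetilde{\mathbb{E}}[v_n(S_n, M_n)|\mathcal{F}_{n-1}] \\
&= \frac{1}{1 + r}\left[\tilde p\,v_n(uS_{n-1}, uS_{n-1} \vee M_{n-1}) + \tilde q\,v_n(dS_{n-1}, M_{n-1})\right].
\end{aligned}$$
This leads us to define
$$v_{n-1}(x, y) \triangleq \frac{1}{1 + r}\left[\tilde p\,v_n(ux, ux \vee y) + \tilde q\,v_n(dx, y)\right]$$
so that
$$V_{n-1} = v_{n-1}(S_{n-1}, M_{n-1}).$$
The general algorithm is
$$v_k(x, y) = \frac{1}{1 + r}\left[\tilde p\,v_{k+1}(ux, ux \vee y) + \tilde q\,v_{k+1}(dx, y)\right],$$
and the value of the option at time $k$ is $v_k(S_k, M_k)$. Since this is a simple European option, the hedging portfolio is given by the usual formula, which in this case is
$$\Delta_k = \frac{v_{k+1}(uS_k, (uS_k) \vee M_k) - v_{k+1}(dS_k, M_k)}{(u - d)S_k}.$$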
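The general algorithm for $v_k(x, y)$ can be written as a short recursion on the pair (stock price, running maximum). This is a sketch with a hypothetical function name. To compare with Example 4.1, whose maximum runs over $0 \le k \le 2$, the recursion is started at $y = S_0$; that starting convention is an assumption made here and is not part of the definition of $M_k$ above.

```python
def lookback_value(k, x, y, n, u, d, r, payoff):
    """v_k(x, y): value at time k when S_k = x and the running maximum is y."""
    if k == n:
        return payoff(x, y)
    p = (1 + r - d) / (u - d)
    q = 1 - p
    # Up move: the maximum becomes max(ux, y). Down move: it stays at y.
    up = lookback_value(k + 1, u * x, max(u * x, y), n, u, d, r, payoff)
    down = lookback_value(k + 1, d * x, y, n, u, d, r, payoff)
    return (p * up + q * down) / (1 + r)

def lookback_payoff(x, y):
    return max(y - 5.0, 0.0)                    # (M_n - K)^+ with K = 5

def knock_in_payoff(x, y):
    return max(x - 5.0, 0.0) if y >= 8.0 else 0.0   # I{M_n >= B}(S_n - K)^+, B = 8

v0_lookback = lookback_value(0, 4.0, 4.0, 2, 2.0, 0.5, 0.25, lookback_payoff)
v0_knock_in = lookback_value(0, 4.0, 4.0, 2, 2.0, 0.5, 0.25, knock_in_payoff)
```

With the parameters of Example 4.1 the lookback value is approximately 2.24, as computed there path by path. The plain recursion still visits $2^{n-k}$ branches; the saving promised by the Markov property comes from caching on the state $(k, x, y)$, of which there are only polynomially many.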
Chapter 5
Stopping Times and American Options
5.1
American Pricing
Let us first review the European pricing formula in a Markov model. Consider the Binomial model with $n$ periods. Let $V_n = g(S_n)$ be the payoff of a derivative security. Define by backward recursion:
$$v_n(x) = g(x),$$
$$v_k(x) = \frac{1}{1 + r}\left[\tilde p\,v_{k+1}(ux) + \tilde q\,v_{k+1}(dx)\right].$$
Then $v_k(S_k)$ is the value of the option at time $k$, and the hedging portfolio is given by
$$\Delta_k = \frac{v_{k+1}(uS_k) - v_{k+1}(dS_k)}{(u - d)S_k}, \quad k = 0, 1, 2, \ldots, n - 1.$$
Now consider an American option. Again a function $g$ is specified. In any period $k$, the holder of the derivative security can “exercise” and receive payment $g(S_k)$. Thus, the hedging portfolio should create a wealth process which satisfies
$$X_k \ge g(S_k), \quad \forall k, \text{ almost surely.}$$
This is because the value of the derivative security at time $k$ is at least $g(S_k)$, and the wealth process value at that time must equal the value of the derivative security.
American algorithm.
$$v_n(x) = g(x),$$
$$v_k(x) = \max\left\{\frac{1}{1 + r}\left(\tilde p\,v_{k+1}(ux) + \tilde q\,v_{k+1}(dx)\right),\ g(x)\right\}.$$
Then $v_k(S_k)$ is the value of the option at time $k$.
Figure 5.1: Stock price and final value of an American put option with strike price 5. (Binomial tree: $S_0 = 4$; $S_1(H) = 8$, $S_1(T) = 2$; $S_2(HH) = 16$ with $v_2(16) = 0$; $S_2(HT) = S_2(TH) = 4$ with $v_2(4) = 1$; $S_2(TT) = 1$ with $v_2(1) = 4$.)
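Example 5.1 See Fig. 5.1. $S_0 = 4$, $u = 2$, $d = \frac{1}{2}$, $r = \frac{1}{4}$, $\tilde p = \tilde q = \frac{1}{2}$, $n = 2$. Set $v_2(x) = g(x) = (5 - x)^+$.
Then
$$\begin{aligned}
v_1(8) &= \max\left\{\tfrac{4}{5}\left[\tfrac{1}{2} \cdot 0 + \tfrac{1}{2} \cdot 1\right],\ (5 - 8)^+\right\} \\
&= \max\left\{\tfrac{2}{5},\ 0\right\} \\
&= 0.40,
\end{aligned}$$
$$\begin{aligned}
v_1(2) &= \max\left\{\tfrac{4}{5}\left[\tfrac{1}{2} \cdot 1 + \tfrac{1}{2} \cdot 4\right],\ (5 - 2)^+\right\} \\
&= \max\{2,\ 3\} \\
&= 3.00,
\end{aligned}$$
$$\begin{aligned}
v_0(4) &= \max\left\{\tfrac{4}{5}\left[\tfrac{1}{2} \cdot (0.4) + \tfrac{1}{2} \cdot (3.0)\right],\ (5 - 4)^+\right\} \\
&= \max\{1.36,\ 1\} \\
&= 1.36.
\end{aligned}$$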
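The American algorithm differs from the European recursion only by the maximum with the intrinsic value $g(x)$. A minimal sketch, with a hypothetical function name, applied to the put of Example 5.1:

```python
def american_value(k, x, n, u, d, r, g):
    """v_k(x) from the American algorithm: max(continuation value, g(x))."""
    if k == n:
        return g(x)
    p = (1 + r - d) / (u - d)
    q = 1 - p
    continuation = (p * american_value(k + 1, u * x, n, u, d, r, g)
                    + q * american_value(k + 1, d * x, n, u, d, r, g)) / (1 + r)
    return max(continuation, g(x))

def put(x):
    """Intrinsic value of the put with strike 5."""
    return max(5.0 - x, 0.0)

v1_up = american_value(1, 8.0, 2, 2.0, 0.5, 0.25, put)
v1_down = american_value(1, 2.0, 2, 2.0, 0.5, 0.25, put)
v0 = american_value(0, 4.0, 2, 2.0, 0.5, 0.25, put)
```

These reproduce $v_1(8) = 0.40$, $v_1(2) = 3.00$ and $v_0(4) = 1.36$ up to floating-point rounding. At $x = 2$ the maximum is attained by the intrinsic value, which is where early exercise matters.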
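Let us now construct the hedging portfolio for this option. Begin with initial wealth $X_0 = 1.36$. Compute $\Delta_0$ as follows:
$$\begin{aligned}
0.40 &= v_1(S_1(H)) \\
&= S_1(H)\Delta_0 + (1 + r)(X_0 - \Delta_0 S_0) \\
&= 8\Delta_0 + \tfrac{5}{4}(1.36 - 4\Delta_0) \\
&= 3\Delta_0 + 1.70 \implies \Delta_0 = -0.43,
\end{aligned}$$
$$\begin{aligned}
3.00 &= v_1(S_1(T)) \\
&= S_1(T)\Delta_0 + (1 + r)(X_0 - \Delta_0 S_0) \\
&= 2\Delta_0 + \tfrac{5}{4}(1.36 - 4\Delta_0) \\
&= -3\Delta_0 + 1.70 \implies \Delta_0 = -0.43.
\end{aligned}$$
Using $\Delta_0 = -0.43$ results in
$$X_1(H) = v_1(S_1(H)) = 0.40, \quad X_1(T) = v_1(S_1(T)) = 3.00.$$
Now let us compute $\Delta_1$ (recall that $S_1(T) = 2$):
$$\begin{aligned}
1 &= v_2(4) \\
&= S_2(TH)\Delta_1(T) + (1 + r)(X_1(T) - \Delta_1(T)S_1(T)) \\
&= 4\Delta_1(T) + \tfrac{5}{4}(3 - 2\Delta_1(T)) \\
&= 1.5\Delta_1(T) + 3.75 \implies \Delta_1(T) = -1.83,
\end{aligned}$$
$$\begin{aligned}
4 &= v_2(1) \\
&= S_2(TT)\Delta_1(T) + (1 + r)(X_1(T) - \Delta_1(T)S_1(T)) \\
&= \Delta_1(T) + \tfrac{5}{4}(3 - 2\Delta_1(T)) \\
&= -1.5\Delta_1(T) + 3.75 \implies \Delta_1(T) = -0.17.
\end{aligned}$$
We get different answers for $\Delta_1(T)$! If we had $X_1(T) = 2$, the value of the European put, we would have
$$1 = 1.5\Delta_1(T) + 2.5 \implies \Delta_1(T) = -1,$$
$$4 = -1.5\Delta_1(T) + 2.5 \implies \Delta_1(T) = -1.$$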
5.2
Value of Portfolio Hedging an American Option
\[ \begin{aligned}
X_{k+1} &= \Delta_k S_{k+1} + (1+r)(X_k - C_k - \Delta_k S_k)\\
&= (1+r)X_k + \Delta_k\left(S_{k+1} - (1+r)S_k\right) - (1+r)C_k.
\end{aligned} \]
Here, $C_k$ is the amount “consumed” at time $k$.
The discounted value of the portfolio is a supermartingale.
The value satisfies $X_k \ge g(S_k)$, $k = 0, 1, \ldots, n$.
The value process is the smallest process with these properties.
When do you consume? If
\[ \widetilde{\mathbb{E}}\left[(1+r)^{-(k+1)} v_{k+1}(S_{k+1}) \mid \mathcal{F}_k\right] < (1+r)^{-k} v_k(S_k), \]
or, equivalently,
\[ \widetilde{\mathbb{E}}\left[\frac{1}{1+r}\, v_{k+1}(S_{k+1}) \,\Big|\, \mathcal{F}_k\right] < v_k(S_k), \]
and the holder of the American option does not exercise, then the seller of the option can consume
to close the gap. By doing this, he can ensure that $X_k = v_k(S_k)$ for all $k$, where $v_k$ is the value
defined by the American algorithm in Section 5.1.
In the previous example, $v_1(S_1(T)) = 3$, $v_2(S_2(TH)) = 1$ and $v_2(S_2(TT)) = 4$. Therefore,
\[ \widetilde{\mathbb{E}}\left[\frac{1}{1+r}\, v_2(S_2) \,\Big|\, \mathcal{F}_1\right](T) = \frac45\left[\frac12\cdot 1 + \frac12\cdot 4\right] = \frac45\cdot\frac52 = 2, \qquad v_1(S_1(T)) = 3, \]
so there is a gap of size 1. If the owner of the option does not exercise it at time one in the state
$\omega_1 = T$, then the seller can consume 1 at time 1. Thereafter, he uses the usual hedging portfolio
\[ \Delta_k = \frac{v_{k+1}(uS_k) - v_{k+1}(dS_k)}{(u-d)S_k}. \]
In the example, we have $v_1(S_1(T)) = g(S_1(T))$. It is optimal for the owner of the American option
to exercise whenever its value $v_k(S_k)$ agrees with its intrinsic value $g(S_k)$.
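Definition 5.1 (Stopping Time) Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and let $\{\mathcal{F}_k\}_{k=0}^n$ be a filtration.
A stopping time is a random variable $\tau : \Omega \to \{0, 1, 2, \ldots, n\} \cup \{\infty\}$ with the property that:
\[ \{\omega \in \Omega;\ \tau(\omega) = k\} \in \mathcal{F}_k, \quad \forall k = 0, 1, \ldots, n, \infty. \]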
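Example 5.2 Consider the binomial model with $n = 2$, $S_0 = 4$, $u = 2$, $d = \frac12$, $r = \frac14$, so $\tilde p = \tilde q = \frac12$. Let
$v_0, v_1, v_2$ be the value functions defined for the American put with strike price 5. Define
\[ \tau(\omega) = \min\{k;\ v_k(S_k) = (5 - S_k)^+\}. \]
The stopping time $\tau$ corresponds to “stopping the first time the value of the option agrees with its intrinsic
value”. It is an optimal exercise time. We note that
\[ \tau(\omega) = \begin{cases} 1 & \text{if } \omega \in A_T,\\ 2 & \text{if } \omega \in A_H. \end{cases} \]
We verify that $\tau$ is indeed a stopping time:
\[ \begin{aligned}
\{\omega;\ \tau(\omega) = 0\} &= \emptyset \in \mathcal{F}_0,\\
\{\omega;\ \tau(\omega) = 1\} &= A_T \in \mathcal{F}_1,\\
\{\omega;\ \tau(\omega) = 2\} &= A_H \in \mathcal{F}_2.
\end{aligned} \]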
Example 5.3 (A random time which is not a stopping time) In the same binomial model as in the previous
example, define
\[ \rho(\omega) = \min\{k;\ S_k(\omega) = m_2(\omega)\}, \]
where $m_2 \triangleq \min_{0 \le j \le 2} S_j$. In other words, $\rho$ stops when the stock price reaches its minimum value. This
random variable is given by
\[ \rho(\omega) = \begin{cases} 0 & \text{if } \omega \in A_H,\\ 1 & \text{if } \omega = TH,\\ 2 & \text{if } \omega = TT. \end{cases} \]
We verify that $\rho$ is not a stopping time:
\[ \begin{aligned}
\{\omega;\ \rho(\omega) = 0\} &= A_H \notin \mathcal{F}_0,\\
\{\omega;\ \rho(\omega) = 1\} &= \{TH\} \notin \mathcal{F}_1,\\
\{\omega;\ \rho(\omega) = 2\} &= \{TT\} \in \mathcal{F}_2.
\end{aligned} \]
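The verification in the two examples above can be mechanized on the two-toss sample space. The following is a minimal sketch, not part of the original notes: the helper `is_stopping_time` is a hypothetical name, and it uses the fact that in the coin-toss filtration an event lies in $\mathcal{F}_k$ exactly when it is decided by the first $k$ tosses.

```python
from itertools import product

OMEGA = ["".join(w) for w in product("HT", repeat=2)]   # HH, HT, TH, TT

def is_stopping_time(tau, n=2):
    """tau maps each outcome to a time in 0..n.

    It is a stopping time iff, for every k, the event {tau = k} is decided
    by the first k tosses: two outcomes that agree on their first k tosses
    must either both lie in {tau = k} or both lie outside it.
    """
    for k in range(n + 1):
        for w1 in OMEGA:
            for w2 in OMEGA:
                if w1[:k] == w2[:k] and (tau[w1] == k) != (tau[w2] == k):
                    return False
    return True

tau = {"HH": 2, "HT": 2, "TH": 1, "TT": 1}   # first time value equals intrinsic value
rho = {"HH": 0, "HT": 0, "TH": 1, "TT": 2}   # time at which the path minimum is attained

print(is_stopping_time(tau), is_stopping_time(rho))
```

The second random time fails already at $k = 0$: deciding whether the minimum has been attained at time 0 requires knowledge of the later tosses.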
5.3
Information up to a Stopping Time
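Definition 5.2 Let $\tau$ be a stopping time. We say that a set $A \subset \Omega$ is determined by time $\tau$ provided
that
\[ A \cap \{\omega;\ \tau(\omega) = k\} \in \mathcal{F}_k, \quad \forall k. \]
The collection of sets determined by $\tau$ is a $\sigma$-algebra, which we denote by $\mathcal{F}_\tau$.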
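Example 5.4 In the binomial model considered earlier, let
\[ \tau = \min\{k;\ v_k(S_k) = (5 - S_k)^+\}, \]
i.e.,
\[ \tau(\omega) = \begin{cases} 1 & \text{if } \omega \in A_T,\\ 2 & \text{if } \omega \in A_H. \end{cases} \]
The set $\{HT\}$ is determined by time $\tau$, but the set $\{TH\}$ is not. Indeed,
\[ \begin{aligned}
\{HT\} \cap \{\omega;\ \tau(\omega) = 0\} &= \emptyset \in \mathcal{F}_0,\\
\{HT\} \cap \{\omega;\ \tau(\omega) = 1\} &= \emptyset \in \mathcal{F}_1,\\
\{HT\} \cap \{\omega;\ \tau(\omega) = 2\} &= \{HT\} \in \mathcal{F}_2,
\end{aligned} \]
but
\[ \{TH\} \cap \{\omega;\ \tau(\omega) = 1\} = \{TH\} \notin \mathcal{F}_1. \]
The atoms of $\mathcal{F}_\tau$ are
\[ \{HT\}, \quad \{HH\}, \quad A_T = \{TH, TT\}. \]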
Notation 5.1 (Value of Stochastic Process at a Stopping Time) If $(\Omega, \mathcal{F}, \mathbb{P})$ is a probability space,
$\{\mathcal{F}_k\}_{k=0}^n$ is a filtration under $\mathcal{F}$, $\{X_k\}_{k=0}^n$ is a stochastic process adapted to this filtration, and $\tau$ is
a stopping time with respect to the same filtration, then $X_\tau$ is an $\mathcal{F}_\tau$-measurable random variable
whose value at $\omega$ is given by
\[ X_\tau(\omega) \triangleq X_{\tau(\omega)}(\omega). \]
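Theorem 3.17 (Optional Sampling) Suppose that $\{Y_k, \mathcal{F}_k\}_{k=0}^\infty$ (or $\{Y_k, \mathcal{F}_k\}_{k=0}^n$) is a submartingale.
Let $\tau$ and $\rho$ be bounded stopping times, i.e., there is a nonrandom number $n$ such that
\[ \tau \le n, \quad \rho \le n, \quad \text{almost surely.} \]
If $\tau \le \rho$ almost surely, then
\[ Y_\tau \le \mathbb{E}[Y_\rho \mid \mathcal{F}_\tau]. \]
Taking expectations, we obtain $\mathbb{E}Y_\tau \le \mathbb{E}Y_\rho$, and in particular, $Y_0 = \mathbb{E}Y_0 \le \mathbb{E}Y_\rho$.
If $\{Y_k, \mathcal{F}_k\}_{k=0}^\infty$ is a supermartingale, then $\tau \le \rho$ implies $Y_\tau \ge \mathbb{E}[Y_\rho \mid \mathcal{F}_\tau]$.
If $\{Y_k, \mathcal{F}_k\}_{k=0}^\infty$ is a martingale, then $\tau \le \rho$ implies $Y_\tau = \mathbb{E}[Y_\rho \mid \mathcal{F}_\tau]$.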
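Example 5.5 In Example 5.4 considered earlier, we define $\rho(\omega) = 2$ for all $\omega \in \Omega$. Under the risk-neutral
probability measure, the discounted stock price process $\left(\frac54\right)^{-k} S_k$ is a martingale. We compute
\[ \widetilde{\mathbb{E}}\left[\left(\tfrac45\right)^2 S_2 \,\Big|\, \mathcal{F}_\tau\right]. \]
The atoms of $\mathcal{F}_\tau$ are $\{HH\}$, $\{HT\}$, and $A_T$. Therefore,
\[ \begin{aligned}
\widetilde{\mathbb{E}}\left[\left(\tfrac45\right)^2 S_2 \,\Big|\, \mathcal{F}_\tau\right](HH) &= \left(\tfrac45\right)^2 S_2(HH),\\
\widetilde{\mathbb{E}}\left[\left(\tfrac45\right)^2 S_2 \,\Big|\, \mathcal{F}_\tau\right](HT) &= \left(\tfrac45\right)^2 S_2(HT),
\end{aligned} \]
and for $\omega \in A_T$,
\[ \begin{aligned}
\widetilde{\mathbb{E}}\left[\left(\tfrac45\right)^2 S_2 \,\Big|\, \mathcal{F}_\tau\right](\omega) &= \tfrac12\left(\tfrac45\right)^2 S_2(TH) + \tfrac12\left(\tfrac45\right)^2 S_2(TT)\\
&= \tfrac12\cdot 2.56 + \tfrac12\cdot 0.64\\
&= 1.60.
\end{aligned} \]
In every case we have gotten (see Fig. 5.2)
\[ \widetilde{\mathbb{E}}\left[\left(\tfrac45\right)^2 S_2 \,\Big|\, \mathcal{F}_\tau\right](\omega) = \left(\tfrac45\right)^{\tau(\omega)} S_{\tau(\omega)}(\omega). \]
[Figure 5.2: tree of discounted stock prices: $S_0 = 4$; $\frac45 S_1(H) = 6.40$, $\frac45 S_1(T) = 1.60$; $\frac{16}{25} S_2(HH) = 10.24$, $\frac{16}{25} S_2(HT) = \frac{16}{25} S_2(TH) = 2.56$, $\frac{16}{25} S_2(TT) = 0.64$.]
Figure 5.2: Illustrating the optional sampling theorem.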
Chapter 6
Properties of American Derivative
Securities
6.1
The properties
Definition 6.1 An American derivative security is a sequence of non-negative random variables
$\{G_k\}_{k=0}^n$ such that each $G_k$ is $\mathcal{F}_k$-measurable. The owner of an American derivative security can
exercise at any time $k$, and if he does, he receives the payment $G_k$.
(a) The value $V_k$ of the security at time $k$ is
\[ V_k = \max_\tau\, (1+r)^k\, \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} G_\tau \mid \mathcal{F}_k\right], \]
where the maximum is over all stopping times $\tau$ satisfying $\tau \ge k$ almost surely.
(b) The discounted value process $\{(1+r)^{-k} V_k\}_{k=0}^n$ is the smallest supermartingale which satisfies
\[ V_k \ge G_k, \quad \forall k, \text{ almost surely.} \]
(c) Any stopping time $\tau$ which satisfies
\[ V_0 = \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} G_\tau\right] \]
is an optimal exercise time. In particular
\[ \tau \triangleq \min\{k;\ V_k = G_k\} \]
is an optimal exercise time.
(d) The hedging portfolio is given by
\[ \Delta_k(\omega_1, \ldots, \omega_k) = \frac{V_{k+1}(\omega_1, \ldots, \omega_k, H) - V_{k+1}(\omega_1, \ldots, \omega_k, T)}{S_{k+1}(\omega_1, \ldots, \omega_k, H) - S_{k+1}(\omega_1, \ldots, \omega_k, T)}, \quad k = 0, 1, \ldots, n-1. \]
(e) Suppose for some $k$ and $\omega$, we have $V_k(\omega) = G_k(\omega)$. Then the owner of the derivative security
should exercise it. If he does not, then the seller of the security can immediately consume
\[ V_k(\omega) - \frac{1}{1+r}\, \widetilde{\mathbb{E}}\left[V_{k+1} \mid \mathcal{F}_k\right](\omega) \]
and still maintain the hedge.
6.2
Proofs of the Properties
Let $\{G_k\}_{k=0}^n$ be a sequence of non-negative random variables such that each $G_k$ is $\mathcal{F}_k$-measurable.
Define $T_k$ to be the set of all stopping times $\tau$ satisfying $k \le \tau \le n$ almost surely. Define also
\[ V_k \triangleq (1+r)^k \max_{\tau \in T_k} \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} G_\tau \mid \mathcal{F}_k\right]. \]
Lemma 2.18 $V_k \ge G_k$ for every $k$.
Proof: Take $\tau \in T_k$ to be the constant $k$.
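Lemma 2.19 The process $\{(1+r)^{-k} V_k\}_{k=0}^n$ is a supermartingale.
Proof: Let $\tau^*$ attain the maximum in the definition of $V_{k+1}$, i.e.,
\[ (1+r)^{-(k+1)} V_{k+1} = \widetilde{\mathbb{E}}\left[(1+r)^{-\tau^*} G_{\tau^*} \mid \mathcal{F}_{k+1}\right]. \]
Because $\tau^*$ is also in $T_k$, we have
\[ \begin{aligned}
\widetilde{\mathbb{E}}\left[(1+r)^{-(k+1)} V_{k+1} \mid \mathcal{F}_k\right] &= \widetilde{\mathbb{E}}\left[\widetilde{\mathbb{E}}\left[(1+r)^{-\tau^*} G_{\tau^*} \mid \mathcal{F}_{k+1}\right] \mid \mathcal{F}_k\right]\\
&= \widetilde{\mathbb{E}}\left[(1+r)^{-\tau^*} G_{\tau^*} \mid \mathcal{F}_k\right]\\
&\le \max_{\tau \in T_k} \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} G_\tau \mid \mathcal{F}_k\right]\\
&= (1+r)^{-k} V_k.
\end{aligned} \]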
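Lemma 2.20 If $\{Y_k\}_{k=0}^n$ is another process satisfying
\[ Y_k \ge G_k, \quad k = 0, 1, \ldots, n, \text{ a.s.,} \]
and $\{(1+r)^{-k} Y_k\}_{k=0}^n$ is a supermartingale, then
\[ Y_k \ge V_k, \quad k = 0, 1, \ldots, n, \text{ a.s.} \]
Proof: The optional sampling theorem for the supermartingale $\{(1+r)^{-k} Y_k\}_{k=0}^n$ implies
\[ \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} Y_\tau \mid \mathcal{F}_k\right] \le (1+r)^{-k} Y_k, \quad \forall \tau \in T_k. \]
Therefore,
\[ \begin{aligned}
V_k &= (1+r)^k \max_{\tau \in T_k} \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} G_\tau \mid \mathcal{F}_k\right]\\
&\le (1+r)^k \max_{\tau \in T_k} \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} Y_\tau \mid \mathcal{F}_k\right]\\
&\le (1+r)^{k} (1+r)^{-k} Y_k\\
&= Y_k.
\end{aligned} \]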
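Lemma 2.21 Define
\[ \begin{aligned}
C_k &= V_k - \frac{1}{1+r}\, \widetilde{\mathbb{E}}\left[V_{k+1} \mid \mathcal{F}_k\right]\\
&= (1+r)^k \left\{(1+r)^{-k} V_k - \widetilde{\mathbb{E}}\left[(1+r)^{-(k+1)} V_{k+1} \mid \mathcal{F}_k\right]\right\}.
\end{aligned} \]
Since $\{(1+r)^{-k} V_k\}_{k=0}^n$ is a supermartingale, $C_k$ must be non-negative almost surely. Define
\[ \Delta_k(\omega_1, \ldots, \omega_k) = \frac{V_{k+1}(\omega_1, \ldots, \omega_k, H) - V_{k+1}(\omega_1, \ldots, \omega_k, T)}{S_{k+1}(\omega_1, \ldots, \omega_k, H) - S_{k+1}(\omega_1, \ldots, \omega_k, T)}. \]
Set $X_0 = V_0$ and define recursively
\[ X_{k+1} = \Delta_k S_{k+1} + (1+r)(X_k - C_k - \Delta_k S_k). \]
Then
\[ X_k = V_k \quad \forall k. \]
Proof: We proceed by induction on $k$. The induction hypothesis is that $X_k = V_k$ for some
$k \in \{0, 1, \ldots, n-1\}$, i.e., for each fixed $(\omega_1, \ldots, \omega_k)$ we have
\[ X_k(\omega_1, \ldots, \omega_k) = V_k(\omega_1, \ldots, \omega_k). \]
We need to show that
\[ \begin{aligned}
X_{k+1}(\omega_1, \ldots, \omega_k, H) &= V_{k+1}(\omega_1, \ldots, \omega_k, H),\\
X_{k+1}(\omega_1, \ldots, \omega_k, T) &= V_{k+1}(\omega_1, \ldots, \omega_k, T).
\end{aligned} \]
We prove the first equality; the proof of the second is similar. Note first that
\[ \begin{aligned}
V_k(\omega_1, \ldots, \omega_k) - C_k(\omega_1, \ldots, \omega_k) &= \frac{1}{1+r}\, \widetilde{\mathbb{E}}\left[V_{k+1} \mid \mathcal{F}_k\right](\omega_1, \ldots, \omega_k)\\
&= \frac{1}{1+r}\left[\tilde p\, V_{k+1}(\omega_1, \ldots, \omega_k, H) + \tilde q\, V_{k+1}(\omega_1, \ldots, \omega_k, T)\right].
\end{aligned} \]
Since $(\omega_1, \ldots, \omega_k)$ will be fixed for the rest of the proof, we will suppress these symbols. For
example, the last equation can be written simply as
\[ V_k - C_k = \frac{1}{1+r}\left[\tilde p\, V_{k+1}(H) + \tilde q\, V_{k+1}(T)\right]. \]
We compute
\[ \begin{aligned}
X_{k+1}(H) &= \Delta_k S_{k+1}(H) + (1+r)(X_k - C_k - \Delta_k S_k)\\
&= \frac{V_{k+1}(H) - V_{k+1}(T)}{S_{k+1}(H) - S_{k+1}(T)}\left(S_{k+1}(H) - (1+r)S_k\right) + (1+r)(V_k - C_k)\\
&= \frac{V_{k+1}(H) - V_{k+1}(T)}{(u-d)S_k}\left(uS_k - (1+r)S_k\right) + \tilde p\, V_{k+1}(H) + \tilde q\, V_{k+1}(T)\\
&= \left(V_{k+1}(H) - V_{k+1}(T)\right)\tilde q + \tilde p\, V_{k+1}(H) + \tilde q\, V_{k+1}(T)\\
&= V_{k+1}(H).
\end{aligned} \]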
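The lemma can be checked numerically on the two-period put of Example 5.1. The sketch below is illustrative only: the names `v` and `check_path` are hypothetical, the model parameters are those of Example 5.1, and exact rational arithmetic is used so that the comparison $X_k = V_k$ is an exact equality test.

```python
from fractions import Fraction as F
from itertools import product

u, d, r, n, s0, K = F(2), F(1, 2), F(1, 4), 2, F(4), F(5)
p = (1 + r - d) / (u - d)   # risk-neutral up probability
q = 1 - p

def g(x):
    return max(K - x, F(0))

def v(k, x):
    """American value function from the backward recursion."""
    if k == n:
        return g(x)
    cont = (p * v(k + 1, u * x) + q * v(k + 1, d * x)) / (1 + r)
    return max(cont, g(x))

def check_path(path):
    """Follow one coin-toss path; True if X_k == v_k(S_k) at every step."""
    x, s = v(0, s0), s0
    for k, toss in enumerate(path):
        cont = (p * v(k + 1, u * s) + q * v(k + 1, d * s)) / (1 + r)
        c = v(k, s) - cont   # consumption C_k, non-negative
        delta = (v(k + 1, u * s) - v(k + 1, d * s)) / ((u - d) * s)
        s_next = u * s if toss == "H" else d * s
        x = delta * s_next + (1 + r) * (x - c - delta * s)   # wealth recursion
        s = s_next
        if x != v(k + 1, s):
            return False
    return True

all_ok = all(check_path(w) for w in product("HT", repeat=n))
print(all_ok)
```

On the path starting with $T$ the consumption at time 1 is $3 - 2 = 1$, which is the gap identified in Section 5.2.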
6.3
Compound European Derivative Securities
In order to derive the optimal stopping time for an American derivative security, it will be useful to
study compound European derivative securities, which are also interesting in their own right.
A compound European derivative security consists of $n + 1$ different simple European derivative
securities (with the same underlying stock) expiring at times $0, 1, \ldots, n$; the security that expires
at time $j$ has payoff $C_j$. Thus a compound European derivative security is specified by the process
$\{C_j\}_{j=0}^n$, where each $C_j$ is $\mathcal{F}_j$-measurable, i.e., the process $\{C_j\}_{j=0}^n$ is adapted to the filtration
$\{\mathcal{F}_k\}_{k=0}^n$.
Hedging a short position (one payment). Here is how we can hedge a short position in the $j$'th
European derivative security. The value of European derivative security $j$ at time $k$ is given by
\[ V_k^{(j)} = (1+r)^k\, \widetilde{\mathbb{E}}\left[(1+r)^{-j} C_j \mid \mathcal{F}_k\right], \quad k = 0, \ldots, j, \]
and the hedging portfolio for that security is given by
\[ \Delta_k^{(j)}(\omega_1, \ldots, \omega_k) = \frac{V_{k+1}^{(j)}(\omega_1, \ldots, \omega_k, H) - V_{k+1}^{(j)}(\omega_1, \ldots, \omega_k, T)}{S_{k+1}(\omega_1, \ldots, \omega_k, H) - S_{k+1}(\omega_1, \ldots, \omega_k, T)}, \quad k = 0, \ldots, j-1. \]
Thus, starting with wealth $V_0^{(j)}$, and using the portfolio $(\Delta_0^{(j)}, \ldots, \Delta_{j-1}^{(j)})$, we can ensure that at
time $j$ we have wealth $C_j$.
Hedging a short position (all payments). Superpose the hedges for the individual payments. In
other words, start with wealth $V_0 = \sum_{j=0}^n V_0^{(j)}$. At each time $k \in \{0, 1, \ldots, n-1\}$, first make the
payment $C_k$ and then use the portfolio
\[ \Delta_k = \Delta_k^{(k+1)} + \Delta_k^{(k+2)} + \cdots + \Delta_k^{(n)} \]
corresponding to all future payments. At the final time $n$, after making the final payment $C_n$, we
will have exactly zero wealth.
Suppose you own a compound European derivative security $\{C_j\}_{j=0}^n$. Compute
\[ V_0 = \sum_{j=0}^n V_0^{(j)} = \widetilde{\mathbb{E}}\left[\sum_{j=0}^n (1+r)^{-j} C_j\right] \]
and the hedging portfolio is $\{\Delta_k\}_{k=0}^{n-1}$. You can borrow $V_0$ and consume it immediately. This leaves
you with wealth $X_0 = -V_0$. In each period $k$, receive the payment $C_k$ and then use the portfolio
$-\Delta_k$. At the final time $n$, after receiving the last payment $C_n$, your wealth will reach zero, i.e., you
will no longer have a debt.
6.4
Optimal Exercise of American Derivative Security
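In this section we derive the optimal exercise time for the owner of an American derivative security.
Let $\{G_k\}_{k=0}^n$ be an American derivative security. Let $\tau$ be the stopping time the owner plans to
use. (We assume that each $G_k$ is non-negative, so we may assume without loss of generality that the
owner stops at expiration, time $n$, if not before.) Using the stopping time $\tau$, in period $j$ the owner
will receive the payment
\[ C_j = I_{\{\tau = j\}} G_j. \]
In other words, once he chooses a stopping time, the owner has effectively converted the American
derivative security into a compound European derivative security, whose value is
\[ \begin{aligned}
V_0^{(\tau)} &= \widetilde{\mathbb{E}}\left[\sum_{j=0}^n (1+r)^{-j} C_j\right]\\
&= \widetilde{\mathbb{E}}\left[\sum_{j=0}^n (1+r)^{-j} I_{\{\tau = j\}} G_j\right]\\
&= \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} G_\tau\right].
\end{aligned} \]
The owner of the American derivative security can borrow this amount of money immediately, if
he chooses, and invest in the market so as to exactly pay off his debt as the payments $\{C_j\}_{j=0}^n$ are
received. Thus, his optimal behavior is to use a stopping time $\tau$ which maximizes $V_0^{(\tau)}$.
Lemma 4.22 $V_0^{(\tau)}$ is maximized by the stopping time
\[ \tau^* = \min\{k;\ V_k = G_k\}. \]
Proof: Recall the definition
\[ V_0 \triangleq \max_{\tau \in T_0} \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} G_\tau\right] = \max_{\tau \in T_0} V_0^{(\tau)}. \]
Let $\tau'$ be a stopping time which maximizes $V_0^{(\tau)}$, i.e., $V_0 = \widetilde{\mathbb{E}}\left[(1+r)^{-\tau'} G_{\tau'}\right]$. Because $\{(1+r)^{-k} V_k\}_{k=0}^n$
is a supermartingale, we have from the optional sampling theorem and the inequality $V_k \ge G_k$, the
following:
\[ \begin{aligned}
V_0 &\ge \widetilde{\mathbb{E}}\left[(1+r)^{-\tau'} V_{\tau'} \mid \mathcal{F}_0\right]\\
&= \widetilde{\mathbb{E}}\left[(1+r)^{-\tau'} V_{\tau'}\right]\\
&\ge \widetilde{\mathbb{E}}\left[(1+r)^{-\tau'} G_{\tau'}\right]\\
&= V_0.
\end{aligned} \]
Therefore,
\[ V_0 = \widetilde{\mathbb{E}}\left[(1+r)^{-\tau'} V_{\tau'}\right] = \widetilde{\mathbb{E}}\left[(1+r)^{-\tau'} G_{\tau'}\right], \]
and
\[ V_{\tau'} = G_{\tau'}, \text{ a.s.} \]
We have just shown that if $\tau'$ attains the maximum in the formula
\[ V_0 = \max_{\tau \in T_0} \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} G_\tau\right], \tag{4.1} \]
then
\[ V_{\tau'} = G_{\tau'}, \text{ a.s.} \]
But we have defined
\[ \tau^* = \min\{k;\ V_k = G_k\}, \]
and so we must have $\tau^* \le \tau' \le n$ almost surely. The optional sampling theorem implies
\[ \begin{aligned}
(1+r)^{-\tau^*} G_{\tau^*} = (1+r)^{-\tau^*} V_{\tau^*} &\ge \widetilde{\mathbb{E}}\left[(1+r)^{-\tau'} V_{\tau'} \mid \mathcal{F}_{\tau^*}\right]\\
&= \widetilde{\mathbb{E}}\left[(1+r)^{-\tau'} G_{\tau'} \mid \mathcal{F}_{\tau^*}\right].
\end{aligned} \]
Taking expectations on both sides, we obtain
\[ \widetilde{\mathbb{E}}\left[(1+r)^{-\tau^*} G_{\tau^*}\right] \ge \widetilde{\mathbb{E}}\left[(1+r)^{-\tau'} G_{\tau'}\right] = V_0. \]
It follows that $\tau^*$ also attains the maximum in (4.1), and is therefore an optimal exercise time for
the American derivative security.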
Chapter 7
Jensen’s Inequality
7.1
Jensen’s Inequality for Conditional Expectations
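Lemma 1.23 If $\varphi : \mathbb{R} \to \mathbb{R}$ is convex and $\mathbb{E}|\varphi(X)| < \infty$, then
\[ \mathbb{E}[\varphi(X) \mid \mathcal{G}] \ge \varphi(\mathbb{E}[X \mid \mathcal{G}]). \]
For instance, if $\mathcal{G} = \{\emptyset, \Omega\}$, $\varphi(x) = x^2$:
\[ \mathbb{E}X^2 \ge (\mathbb{E}X)^2. \]
Proof: Since $\varphi$ is convex we can express it as follows (See Fig. 7.1):
\[ \varphi(x) = \max_{h \le \varphi,\ h \text{ is linear}} h(x). \]
Now let $h(x) = ax + b$ lie below $\varphi$. Then,
\[ \begin{aligned}
\mathbb{E}[\varphi(X) \mid \mathcal{G}] &\ge \mathbb{E}[aX + b \mid \mathcal{G}]\\
&= a\,\mathbb{E}[X \mid \mathcal{G}] + b\\
&= h(\mathbb{E}[X \mid \mathcal{G}]).
\end{aligned} \]
This implies
\[ \begin{aligned}
\mathbb{E}[\varphi(X) \mid \mathcal{G}] &\ge \max_{h \le \varphi,\ h \text{ is linear}} h(\mathbb{E}[X \mid \mathcal{G}])\\
&= \varphi(\mathbb{E}[X \mid \mathcal{G}]).
\end{aligned} \]
[Figure 7.1: graph of the convex function $\varphi$ with linear functions lying below it.]
Figure 7.1: Expressing a convex function as a max over linear functions.
Theorem 1.24 If $\{Y_k\}_{k=0}^n$ is a martingale and $\varphi$ is convex then $\{\varphi(Y_k)\}_{k=0}^n$ is a submartingale.
Proof:
\[ \begin{aligned}
\mathbb{E}[\varphi(Y_{k+1}) \mid \mathcal{F}_k] &\ge \varphi(\mathbb{E}[Y_{k+1} \mid \mathcal{F}_k])\\
&= \varphi(Y_k).
\end{aligned} \]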
7.2
Optimal Exercise of an American Call
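This follows from Jensen’s inequality.
Corollary 2.25 Let $g : [0, \infty) \to \mathbb{R}$ be a convex function with $g(0) = 0$. For instance, $g(x) =
(x - K)^+$ is the payoff function for an American call. Assume that $r \ge 0$. Consider the American
derivative security with payoff $g(S_k)$ in period $k$. The value of this security is the same as the value
of the simple European derivative security with final payoff $g(S_n)$, i.e.,
\[ \widetilde{\mathbb{E}}\left[(1+r)^{-n} g(S_n)\right] = \max_\tau \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} g(S_\tau)\right], \]
where the LHS is the European value and the RHS is the American value. In particular $\tau = n$ is an
optimal exercise time.
Proof: Because $g$ is convex, for all $\lambda \in [0, 1]$ we have (see Fig. 7.2):
\[ \begin{aligned}
g(\lambda x) &= g(\lambda x + (1 - \lambda)\cdot 0)\\
&\le \lambda g(x) + (1 - \lambda)\cdot g(0)\\
&= \lambda g(x).
\end{aligned} \]
[Figure 7.2: graph of $g$ showing the points $(x, g(x))$, $(\lambda x, \lambda g(x))$ and $(\lambda x, g(\lambda x))$.]
Figure 7.2: Proof of Cor. 2.25
Therefore,
\[ g\left(\frac{1}{1+r} S_{k+1}\right) \le \frac{1}{1+r}\, g(S_{k+1}) \]
and
\[ \begin{aligned}
\widetilde{\mathbb{E}}\left[(1+r)^{-(k+1)} g(S_{k+1}) \mid \mathcal{F}_k\right] &= (1+r)^{-k}\, \widetilde{\mathbb{E}}\left[\frac{1}{1+r}\, g(S_{k+1}) \,\Big|\, \mathcal{F}_k\right]\\
&\ge (1+r)^{-k}\, \widetilde{\mathbb{E}}\left[g\left(\frac{1}{1+r} S_{k+1}\right) \Big|\, \mathcal{F}_k\right]\\
&\ge (1+r)^{-k}\, g\left(\widetilde{\mathbb{E}}\left[\frac{1}{1+r} S_{k+1} \,\Big|\, \mathcal{F}_k\right]\right)\\
&= (1+r)^{-k} g(S_k).
\end{aligned} \]
So $\{(1+r)^{-k} g(S_k)\}_{k=0}^n$ is a submartingale. Let $\tau$ be a stopping time satisfying $0 \le \tau \le n$. The
optional sampling theorem implies
\[ (1+r)^{-\tau} g(S_\tau) \le \widetilde{\mathbb{E}}\left[(1+r)^{-n} g(S_n) \mid \mathcal{F}_\tau\right]. \]
Taking expectations, we obtain
\[ \begin{aligned}
\widetilde{\mathbb{E}}\left[(1+r)^{-\tau} g(S_\tau)\right] &\le \widetilde{\mathbb{E}}\left[\widetilde{\mathbb{E}}\left[(1+r)^{-n} g(S_n) \mid \mathcal{F}_\tau\right]\right]\\
&= \widetilde{\mathbb{E}}\left[(1+r)^{-n} g(S_n)\right].
\end{aligned} \]
Therefore, the value of the American derivative security is
\[ \max_\tau \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} g(S_\tau)\right] \le \widetilde{\mathbb{E}}\left[(1+r)^{-n} g(S_n)\right], \]
and this last expression is the value of the European derivative security. Of course, the LHS cannot
be strictly less than the RHS above, since stopping at time $n$ is always allowed, and we conclude
that
\[ \max_\tau \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} g(S_\tau)\right] = \widetilde{\mathbb{E}}\left[(1+r)^{-n} g(S_n)\right]. \]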
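As a sanity check of the corollary, the following sketch compares the American and European values of a call in a small binomial model. The parameters ($S_0 = 4$, $u = 2$, $d = \frac12$, $r = \frac14$, $n = 2$, strike 5) are borrowed from the earlier put examples as an assumption, and the function name `value` is hypothetical.

```python
from fractions import Fraction as F

u, d, r, n, s0, K = F(2), F(1, 2), F(1, 4), 2, F(4), F(5)
p = (1 + r - d) / (u - d)   # risk-neutral up probability
q = 1 - p

def call(x):
    return max(x - K, F(0))

def value(k, x, american):
    """Value at time k and price x; early exercise allowed only if american."""
    if k == n:
        return call(x)
    cont = (p * value(k + 1, u * x, american)
            + q * value(k + 1, d * x, american)) / (1 + r)
    return max(cont, call(x)) if american else cont

v_european = value(0, s0, american=False)
v_american = value(0, s0, american=True)
print(v_european, v_american)
```

At the node $S_1(H) = 8$ the continuation value $4.40$ exceeds the intrinsic value $3$, so early exercise is never strictly better in this instance.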
[Figure 7.3: binomial tree with $S_0 = 4$; $S_1(H) = 8$, $S_1(T) = 2$; $S_2(HH) = 16$, $S_2(HT) = S_2(TH) = 4$, $S_2(TT) = 1$.]
Figure 7.3: A three period binomial model.
7.3
Stopped Martingales
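Let $\{Y_k\}_{k=0}^n$ be a stochastic process and let $\tau$ be a stopping time. We denote by $\{Y_{k \wedge \tau}\}_{k=0}^n$ the
stopped process
\[ Y_{k \wedge \tau(\omega)}(\omega), \quad k = 0, 1, \ldots, n. \]
Example 7.1 (Stopped Process) Figure 7.3 shows our familiar 3-period binomial example.
Define
\[ \tau(\omega) = \begin{cases} 1 & \text{if } \omega_1 = T,\\ 2 & \text{if } \omega_1 = H. \end{cases} \]
Then
\[ S_{2 \wedge \tau(\omega)}(\omega) = \begin{cases} S_2(HH) = 16 & \text{if } \omega = HH,\\ S_2(HT) = 4 & \text{if } \omega = HT,\\ S_1(T) = 2 & \text{if } \omega = TH,\\ S_1(T) = 2 & \text{if } \omega = TT. \end{cases} \]
Theorem 3.26 A stopped martingale (or submartingale, or supermartingale) is still a martingale
(or submartingale, or supermartingale respectively).
Proof: Let $\{Y_k\}_{k=0}^n$ be a martingale, and $\tau$ be a stopping time. Choose some $k \in \{0, 1, \ldots, n\}$.
The set $\{\tau \le k\}$ is in $\mathcal{F}_k$, so the set $\{\tau \ge k+1\} = \{\tau \le k\}^c$ is also in $\mathcal{F}_k$. We compute
\[ \begin{aligned}
\mathbb{E}\left[Y_{(k+1) \wedge \tau} \mid \mathcal{F}_k\right] &= \mathbb{E}\left[I_{\{\tau \le k\}} Y_\tau + I_{\{\tau \ge k+1\}} Y_{k+1} \mid \mathcal{F}_k\right]\\
&= I_{\{\tau \le k\}} Y_\tau + I_{\{\tau \ge k+1\}}\, \mathbb{E}[Y_{k+1} \mid \mathcal{F}_k]\\
&= I_{\{\tau \le k\}} Y_\tau + I_{\{\tau \ge k+1\}} Y_k\\
&= Y_{k \wedge \tau}.
\end{aligned} \]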
Chapter 8
Random Walks
8.1
First Passage Time
Toss a coin infinitely many times. Then the sample space $\Omega$ is the set of all infinite sequences
$\omega = (\omega_1, \omega_2, \ldots)$ of $H$ and $T$. Assume the tosses are independent, and on each toss, the probability
of $H$ is $\frac12$, as is the probability of $T$. Define
\[ Y_j(\omega) = \begin{cases} 1 & \text{if } \omega_j = H,\\ -1 & \text{if } \omega_j = T, \end{cases} \]
\[ M_0 = 0, \qquad M_k = \sum_{j=1}^k Y_j, \quad k = 1, 2, \ldots \]
The process $\{M_k\}_{k=0}^\infty$ is a symmetric random walk (see Fig. 8.1). Its analogue in continuous time is
Brownian motion.
Define
\[ \tau = \min\{k \ge 0;\ M_k = 1\}. \]
If $M_k$ never gets to 1 (e.g., $\omega = (TTTT\ldots)$), then $\tau = \infty$. The random variable $\tau$ is called the
first passage time to 1. It is the first time the number of heads exceeds by one the number of tails.
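8.2
$\tau$ is almost surely finite
It is shown in a Homework Problem that $\{M_k\}_{k=0}^\infty$ and $\{N_k\}_{k=0}^\infty$ where
\[ \begin{aligned}
N_k &= \exp\left\{\theta M_k - k \log\left(\frac{e^\theta + e^{-\theta}}{2}\right)\right\}\\
&= e^{\theta M_k} \left(\frac{2}{e^\theta + e^{-\theta}}\right)^k
\end{aligned} \]
are martingales. (Take $M_k = -S_k$ in part (i) of the Homework Problem and take $\theta = -\sigma$ in part
(v).) Since $N_0 = 1$ and a stopped martingale is a martingale, we have
\[ 1 = \mathbb{E}N_{k \wedge \tau} = \mathbb{E}\left[e^{\theta M_{k \wedge \tau}} \left(\frac{2}{e^\theta + e^{-\theta}}\right)^{k \wedge \tau}\right] \tag{2.1} \]
[Figure 8.1: a sample path of $M_k$ plotted against $k$.]
Figure 8.1: The random walk process $M_k$
[Figure 8.2: graphs of $\frac{e^\theta + e^{-\theta}}{2}$ and $\frac{2}{e^\theta + e^{-\theta}}$ as functions of $\theta$; both equal 1 at $\theta = 0$.]
Figure 8.2: Illustrating two functions of $\theta$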
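for every fixed $\theta \in \mathbb{R}$ (See Fig. 8.2 for an illustration of the various functions involved). We want
to let $k \to \infty$ in (2.1), but we have to worry a bit that for some sequences $\omega \in \Omega$, $\tau(\omega) = \infty$.
We consider fixed $\theta > 0$, so
\[ \frac{2}{e^\theta + e^{-\theta}} < 1. \]
As $k \to \infty$,
\[ \left(\frac{2}{e^\theta + e^{-\theta}}\right)^{k \wedge \tau} \to \begin{cases} \left(\dfrac{2}{e^\theta + e^{-\theta}}\right)^{\tau} & \text{if } \tau < \infty,\\ 0 & \text{if } \tau = \infty. \end{cases} \]
Furthermore, $M_{k \wedge \tau} \le 1$, because we stop this martingale when it reaches 1, so
\[ 0 \le e^{\theta M_{k \wedge \tau}} \le e^\theta \]
and
\[ 0 \le e^{\theta M_{k \wedge \tau}} \left(\frac{2}{e^\theta + e^{-\theta}}\right)^{k \wedge \tau} \le e^\theta. \]
In addition,
\[ \lim_{k \to \infty} e^{\theta M_{k \wedge \tau}} \left(\frac{2}{e^\theta + e^{-\theta}}\right)^{k \wedge \tau} = \begin{cases} e^\theta \left(\dfrac{2}{e^\theta + e^{-\theta}}\right)^{\tau} & \text{if } \tau < \infty,\\ 0 & \text{if } \tau = \infty. \end{cases} \]
Recall Equation (2.1):
\[ \mathbb{E}\left[e^{\theta M_{k \wedge \tau}} \left(\frac{2}{e^\theta + e^{-\theta}}\right)^{k \wedge \tau}\right] = 1. \]
Letting $k \to \infty$, and using the Bounded Convergence Theorem, we obtain
\[ \mathbb{E}\left[e^\theta \left(\frac{2}{e^\theta + e^{-\theta}}\right)^{\tau} I_{\{\tau < \infty\}}\right] = 1. \tag{2.2} \]
For all $\theta \in (0, 1]$, we have
\[ 0 \le e^\theta \left(\frac{2}{e^\theta + e^{-\theta}}\right)^{\tau} I_{\{\tau < \infty\}} \le e, \]
so we can let $\theta \downarrow 0$ in (2.2), using the Bounded Convergence Theorem again, to conclude
\[ \mathbb{E}\left[I_{\{\tau < \infty\}}\right] = 1, \]
i.e.,
\[ \mathbb{P}\{\tau < \infty\} = 1. \]
We know there are paths of the symmetric random walk $\{M_k\}_{k=0}^\infty$ which never reach level 1. We
have just shown that these paths collectively have no probability. (In our infinite sample space $\Omega$,
each path individually has zero probability). We therefore do not need the indicator $I_{\{\tau < \infty\}}$ in
(2.2), and we rewrite that equation as
\[ \mathbb{E}\left[\left(\frac{2}{e^\theta + e^{-\theta}}\right)^{\tau}\right] = e^{-\theta}. \tag{2.3} \]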
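8.3
The moment generating function for $\tau$
Let $\alpha \in (0, 1)$ be given. We want to find $\theta > 0$ so that
\[ \alpha = \frac{2}{e^\theta + e^{-\theta}}. \]
Solution:
\[ \begin{aligned}
\alpha e^\theta + \alpha e^{-\theta} - 2 &= 0,\\
\alpha (e^{-\theta})^2 - 2e^{-\theta} + \alpha &= 0,\\
e^{-\theta} &= \frac{1 \pm \sqrt{1 - \alpha^2}}{\alpha}.
\end{aligned} \]
We want $\theta > 0$, so we must have $e^{-\theta} < 1$. Now $0 < \alpha < 1$, so
\[ \begin{aligned}
0 < (1 - \alpha)^2 &< 1 - \alpha < 1 - \alpha^2,\\
1 - \alpha &< \sqrt{1 - \alpha^2},\\
1 - \sqrt{1 - \alpha^2} &< \alpha,\\
\frac{1 - \sqrt{1 - \alpha^2}}{\alpha} &< 1.
\end{aligned} \]
We take the negative square root:
\[ e^{-\theta} = \frac{1 - \sqrt{1 - \alpha^2}}{\alpha}. \]
Recall Equation (2.3):
\[ \mathbb{E}\left[\left(\frac{2}{e^\theta + e^{-\theta}}\right)^{\tau}\right] = e^{-\theta}, \quad \theta > 0. \]
With $\alpha \in (0, 1)$ and $\theta > 0$ related by
\[ e^{-\theta} = \frac{1 - \sqrt{1 - \alpha^2}}{\alpha}, \qquad \alpha = \frac{2}{e^\theta + e^{-\theta}}, \]
this becomes
\[ \mathbb{E}\alpha^\tau = \frac{1 - \sqrt{1 - \alpha^2}}{\alpha}, \quad 0 < \alpha < 1. \tag{3.1} \]
We have computed the moment generating function for the first passage time to 1.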
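Formula (3.1) can be checked numerically without simulation by propagating the walk forward and recording the probability of first hitting level 1 at each step. This is a rough sketch, not part of the original notes: the helper name `first_passage_probs`, the truncation at 400 steps, and the choice $\alpha = 0.8$ are all assumptions made for illustration.

```python
import math

def first_passage_probs(n_max):
    """P{tau = k} for k = 1..n_max, tau = first time a symmetric walk hits 1."""
    alive = {0: 1.0}   # distribution of M_k over paths that have not yet hit 1
    probs = []
    for _ in range(n_max):
        nxt, hit = {}, 0.0
        for m, pr in alive.items():
            for step in (1, -1):
                if m + step == 1:
                    hit += 0.5 * pr            # absorbed at level 1 on this step
                else:
                    nxt[m + step] = nxt.get(m + step, 0.0) + 0.5 * pr
        probs.append(hit)
        alive = nxt
    return probs

alpha = 0.8
probs = first_passage_probs(400)
series = sum(alpha ** (k + 1) * pk for k, pk in enumerate(probs))   # truncated E[alpha^tau]
closed_form = (1 - math.sqrt(1 - alpha ** 2)) / alpha
print(series, closed_form)
```

The truncation error is at most $\alpha^{400}$, which is negligible for $\alpha = 0.8$; for $\alpha$ close to 1 many more terms would be needed.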
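8.4
Expectation of $\tau$
Recall that
\[ \mathbb{E}\alpha^\tau = \frac{1 - \sqrt{1 - \alpha^2}}{\alpha}, \quad 0 < \alpha < 1, \]
so
\[ \begin{aligned}
\mathbb{E}\left[\tau \alpha^{\tau - 1}\right] &= \frac{d}{d\alpha}\, \mathbb{E}\alpha^\tau\\
&= \frac{d}{d\alpha}\left(\frac{1 - \sqrt{1 - \alpha^2}}{\alpha}\right)\\
&= \frac{1 - \sqrt{1 - \alpha^2}}{\alpha^2 \sqrt{1 - \alpha^2}}.
\end{aligned} \]
Using the Monotone Convergence Theorem, we can let $\alpha \uparrow 1$ in the equation
\[ \mathbb{E}\left[\tau \alpha^{\tau - 1}\right] = \frac{1 - \sqrt{1 - \alpha^2}}{\alpha^2 \sqrt{1 - \alpha^2}}, \]
to obtain
\[ \mathbb{E}\tau = \infty. \]
Thus in summary:
\[ \begin{aligned}
\tau &\triangleq \min\{k;\ M_k = 1\},\\
\mathbb{P}\{\tau < \infty\} &= 1,\\
\mathbb{E}\tau &= \infty.
\end{aligned} \]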
8.5
The Strong Markov Property
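The random walk process $\{M_k\}_{k=0}^\infty$ is a Markov process, i.e.,
\[ \mathbb{E}\left[\text{random variable depending only on } M_{k+1}, M_{k+2}, \ldots \mid \mathcal{F}_k\right] = \mathbb{E}\left[\text{same random variable} \mid M_k\right]. \]
In discrete time, this Markov property implies the Strong Markov property:
\[ \mathbb{E}\left[\text{random variable depending only on } M_{\tau+1}, M_{\tau+2}, \ldots \mid \mathcal{F}_\tau\right] = \mathbb{E}\left[\text{same random variable} \mid M_\tau\right] \]
for any almost surely finite stopping time $\tau$.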
8.6
General First Passage Times
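Define
\[ \tau_m \triangleq \min\{k \ge 0;\ M_k = m\}, \quad m = 1, 2, \ldots \]
Then $\tau_2 - \tau_1$ is the number of periods between the first arrival at level 1 and the first arrival at level
2. The distribution of $\tau_2 - \tau_1$ is the same as the distribution of $\tau_1$ (see Fig. 8.3), i.e.,
\[ \mathbb{E}\alpha^{\tau_2 - \tau_1} = \frac{1 - \sqrt{1 - \alpha^2}}{\alpha}, \quad \alpha \in (0, 1). \]
[Figure 8.3: a sample path of $M_k$ against $k$ with the times $\tau_1$, $\tau_2$ and the interval $\tau_2 - \tau_1$ marked.]
Figure 8.3: General first passage times.
For $\alpha \in (0, 1)$,
\[ \begin{aligned}
\mathbb{E}\left[\alpha^{\tau_2} \mid \mathcal{F}_{\tau_1}\right] &= \mathbb{E}\left[\alpha^{\tau_1} \alpha^{\tau_2 - \tau_1} \mid \mathcal{F}_{\tau_1}\right]\\
&= \alpha^{\tau_1}\, \mathbb{E}\left[\alpha^{\tau_2 - \tau_1} \mid \mathcal{F}_{\tau_1}\right] && \text{(taking out what is known)}\\
&= \alpha^{\tau_1}\, \mathbb{E}\left[\alpha^{\tau_2 - \tau_1} \mid M_{\tau_1}\right] && \text{(strong Markov property)}\\
&= \alpha^{\tau_1}\, \mathbb{E}\left[\alpha^{\tau_2 - \tau_1}\right] && (M_{\tau_1} = 1, \text{ not random})\\
&= \alpha^{\tau_1} \left(\frac{1 - \sqrt{1 - \alpha^2}}{\alpha}\right).
\end{aligned} \]
Take expectations of both sides to get
\[ \begin{aligned}
\mathbb{E}\alpha^{\tau_2} &= \mathbb{E}\alpha^{\tau_1} \cdot \left(\frac{1 - \sqrt{1 - \alpha^2}}{\alpha}\right)\\
&= \left(\frac{1 - \sqrt{1 - \alpha^2}}{\alpha}\right)^2.
\end{aligned} \]
In general,
\[ \mathbb{E}\alpha^{\tau_m} = \left(\frac{1 - \sqrt{1 - \alpha^2}}{\alpha}\right)^m, \quad \alpha \in (0, 1). \]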
8.7
Example: Perpetual American Put
Consider the binomial model, with $u = 2$, $d = \frac12$, $r = \frac14$, and payoff function $(5 - S_k)^+$. The risk
neutral probabilities are $\tilde p = \frac12$, $\tilde q = \frac12$, and thus
\[ S_k = S_0 u^{M_k}, \]
where $M_k$ is a symmetric random walk under the risk-neutral measure, denoted by $\widetilde{\mathbb{P}}$. Suppose
$S_0 = 4$. Here are some possible exercise rules:
Rule 0: Stop immediately. $\tau_0 = 0$, $V^{(\tau_0)} = 1$.
Rule 1: Stop as soon as stock price falls to 2, i.e., at time
\[ \tau_{-1} \triangleq \min\{k;\ M_k = -1\}. \]
Rule 2: Stop as soon as stock price falls to 1, i.e., at time
\[ \tau_{-2} \triangleq \min\{k;\ M_k = -2\}. \]
Because the random walk is symmetric under $\widetilde{\mathbb{P}}$, $\tau_{-m}$ has the same distribution under $\widetilde{\mathbb{P}}$ as the
stopping time $\tau_m$ in the previous section. This observation leads to the following computations of
value. Value of Rule 1:
\[ \begin{aligned}
V^{(\tau_{-1})} &= \widetilde{\mathbb{E}}\left[(1+r)^{-\tau_{-1}} (5 - S_{\tau_{-1}})^+\right]\\
&= (5 - 2)^+\, \widetilde{\mathbb{E}}\left[\left(\tfrac45\right)^{\tau_{-1}}\right]\\
&= 3 \cdot \frac{1 - \sqrt{1 - \left(\frac45\right)^2}}{\frac45}\\
&= \frac32.
\end{aligned} \]
Value of Rule 2:
\[ \begin{aligned}
V^{(\tau_{-2})} &= (5 - 1)^+\, \widetilde{\mathbb{E}}\left[\left(\tfrac45\right)^{\tau_{-2}}\right]\\
&= 4 \cdot \left(\tfrac12\right)^2\\
&= 1.
\end{aligned} \]
This suggests that the optimal rule is Rule 1, i.e., stop (exercise the put) as soon as the stock price
falls to 2, and the value of the put is $\frac32$ if $S_0 = 4$.
Suppose instead we start with $S_0 = 8$, and stop the first time the price falls to 2. This requires 2
down steps, so the value of this rule with this initial stock price is
\[ (5 - 2)^+\, \widetilde{\mathbb{E}}\left[\left(\tfrac45\right)^{\tau_{-2}}\right] = 3 \cdot \left(\tfrac12\right)^2 = \frac34. \]
In general, if $S_0 = 2^j$ for some $j \ge 1$, and we stop when the stock price falls to 2, then $j - 1$ down
steps will be required and the value of the option is
\[ (5 - 2)^+\, \widetilde{\mathbb{E}}\left[\left(\tfrac45\right)^{\tau_{-(j-1)}}\right] = 3 \cdot \left(\tfrac12\right)^{j-1}. \]
We define
\[ v(2^j) \triangleq 3 \cdot \left(\tfrac12\right)^{j-1}, \quad j = 1, 2, 3, \ldots \]
If $S_0 = 2^j$ for some $j \le 1$, then the initial price is at or below 2. In this case, we exercise
immediately, and the value of the put is
\[ v(2^j) \triangleq 5 - 2^j, \quad j = 1, 0, -1, -2, \ldots \]
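Proposed exercise rule: Exercise the put whenever the stock price is at or below 2. The value of
this rule is given by $v(2^j)$ as we just defined it. Since the put is perpetual, the initial time is no
different from any other time. This leads us to make the following:
Conjecture 1 The value of the perpetual put at time $k$ is $v(S_k)$.
How do we recognize the value of an American derivative security when we see it?
There are three parts to the proof of the conjecture. We must show:
(a) $v(S_k) \ge (5 - S_k)^+ \quad \forall k$,
(b) $\left\{\left(\frac45\right)^k v(S_k)\right\}_{k=0}^\infty$ is a supermartingale,
(c) $\{v(S_k)\}_{k=0}^\infty$ is the smallest process with properties (a) and (b).
Note: To simplify matters, we shall only consider initial stock prices of the form $S_0 = 2^j$, so $S_k$ is
always of the form $2^j$, with a possibly different $j$.
Proof: (a). Just check that
\[ \begin{aligned}
v(2^j) &\triangleq 3 \cdot \left(\tfrac12\right)^{j-1} \ge (5 - 2^j)^+ \quad \text{for } j \ge 1,\\
v(2^j) &\triangleq 5 - 2^j \ge (5 - 2^j)^+ \quad \text{for } j \le 1.
\end{aligned} \]
This is straightforward.
Proof: (b). We must show that
\[ \begin{aligned}
v(S_k) &\ge \widetilde{\mathbb{E}}\left[\tfrac45\, v(S_{k+1}) \mid \mathcal{F}_k\right]\\
&= \tfrac45 \cdot \tfrac12\, v(2S_k) + \tfrac45 \cdot \tfrac12\, v\left(\tfrac12 S_k\right).
\end{aligned} \]
By assumption, $S_k = 2^j$ for some $j$. We must show that
\[ v(2^j) \ge \tfrac25\, v(2^{j+1}) + \tfrac25\, v(2^{j-1}). \]
If $j \ge 2$, then $v(2^j) = 3 \cdot \left(\frac12\right)^{j-1}$ and
\[ \begin{aligned}
\tfrac25\, v(2^{j+1}) + \tfrac25\, v(2^{j-1}) &= \tfrac25 \cdot 3 \cdot \left(\tfrac12\right)^{j} + \tfrac25 \cdot 3 \cdot \left(\tfrac12\right)^{j-2}\\
&= 3 \cdot \tfrac25 \left(\tfrac14 + 1\right) \left(\tfrac12\right)^{j-2}\\
&= 3 \cdot \tfrac12 \cdot \left(\tfrac12\right)^{j-2}\\
&= v(2^j).
\end{aligned} \]
If $j = 1$, then $v(2^j) = v(2) = 3$ and
\[ \begin{aligned}
\tfrac25\, v(2^{j+1}) + \tfrac25\, v(2^{j-1}) &= \tfrac25\, v(4) + \tfrac25\, v(1)\\
&= \tfrac25 \cdot 3 \cdot \tfrac12 + \tfrac25 \cdot 4\\
&= 3/5 + 8/5\\
&= 2\tfrac15 < v(2) = 3.
\end{aligned} \]
There is a gap of size $\frac45$.
If $j \le 0$, then $v(2^j) = 5 - 2^j$ and
\[ \begin{aligned}
\tfrac25\, v(2^{j+1}) + \tfrac25\, v(2^{j-1}) &= \tfrac25 (5 - 2^{j+1}) + \tfrac25 (5 - 2^{j-1})\\
&= 4 - \tfrac25 (4 + 1)\, 2^{j-1}\\
&= 4 - 2^j < v(2^j) = 5 - 2^j.
\end{aligned} \]
There is a gap of size 1. This concludes the proof of (b).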
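The three cases in the proof of (b) can be confirmed with exact arithmetic. The sketch below is only an illustration; the function names `v` and `gap` are hypothetical, with `v(j)` standing for $v(2^j)$ and `gap(j)` for the amount by which the supermartingale inequality is slack at $S_k = 2^j$.

```python
from fractions import Fraction as F

def v(j):
    """Candidate value v(2^j) of the perpetual put with strike 5."""
    return 3 * F(1, 2) ** (j - 1) if j >= 1 else 5 - F(2) ** j

def gap(j):
    """v(2^j) minus the discounted one-step risk-neutral expectation."""
    return v(j) - F(2, 5) * (v(j + 1) + v(j - 1))

gaps = {j: gap(j) for j in range(-5, 8)}
for j in sorted(gaps):
    print(j, gaps[j])
```

The gap is zero where the holder should wait ($j \ge 2$) and strictly positive where the holder should already have exercised ($j \le 1$), matching the interpretation of the gap as the amount the seller may consume.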
Proof: (c). Suppose $\{Y_k\}_{k=0}^\infty$ is some other process satisfying:
(a') $Y_k \ge (5 - S_k)^+ \quad \forall k$,
(b') $\left\{\left(\frac45\right)^k Y_k\right\}_{k=0}^\infty$ is a supermartingale.
We must show that
\[ Y_k \ge v(S_k) \quad \forall k. \tag{7.1} \]
Actually, since the put is perpetual, every time $k$ is like every other time, so it will suffice to show
\[ Y_0 \ge v(S_0), \tag{7.2} \]
provided we let $S_0$ in (7.2) be any number of the form $2^j$. With appropriate (but messy) conditioning
on $\mathcal{F}_k$, the proof we give of (7.2) can be modified to prove (7.1).
For $j \le 1$,
\[ v(2^j) = 5 - 2^j = (5 - 2^j)^+, \]
so if $S_0 = 2^j$ for some $j \le 1$, then (a') implies
\[ Y_0 \ge (5 - 2^j)^+ = v(S_0). \]
Suppose now that $S_0 = 2^j$ for some $j \ge 2$, i.e., $S_0 \ge 4$. Let
\[ \begin{aligned}
\tau &= \min\{k;\ S_k = 2\}\\
&= \min\{k;\ M_k = -(j - 1)\}.
\end{aligned} \]
Then
\[ \begin{aligned}
v(S_0) = v(2^j) &= 3 \cdot \left(\tfrac12\right)^{j-1}\\
&= \widetilde{\mathbb{E}}\left[\left(\tfrac45\right)^{\tau} (5 - S_\tau)^+\right].
\end{aligned} \]
Because $\left\{\left(\frac45\right)^k Y_k\right\}_{k=0}^\infty$ is a supermartingale
\[ Y_0 \ge \widetilde{\mathbb{E}}\left[\left(\tfrac45\right)^{\tau} Y_\tau\right] \ge \widetilde{\mathbb{E}}\left[\left(\tfrac45\right)^{\tau} (5 - S_\tau)^+\right] = v(S_0). \]
Comment on the proof of (c): If the candidate value process is the actual value of a particular
exercise rule, then (c) will be automatically satisfied. In this case, we constructed $v$ so that $v(S_k)$ is
the value of the put at time $k$ if the stock price at time $k$ is $S_k$ and if we exercise the put the first time
($k$, or later) that the stock price is 2 or less. In such a situation, we need only verify properties (a)
and (b).
8.8
Difference Equation
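If we imagine stock prices which can fall at any point in $(0, \infty)$, not just at points of the form $2^j$ for
integers $j$, then we can imagine the function $v(x)$, defined for all $x > 0$, which gives the value of
the perpetual American put when the stock price is $x$. This function should satisfy the conditions:
(a) $v(x) \ge (K - x)^+, \quad \forall x$,
(b) $v(x) \ge \frac{1}{1+r}\left[\tilde p\, v(ux) + \tilde q\, v(dx)\right], \quad \forall x$,
(c) At each $x$, either (a) or (b) holds with equality.
In the example we worked out, we have
\[ \begin{aligned}
\text{For } j \ge 1:&\quad v(2^j) = 3 \cdot \left(\tfrac12\right)^{j-1} = \frac{6}{2^j},\\
\text{For } j \le 1:&\quad v(2^j) = 5 - 2^j.
\end{aligned} \]
This suggests the formula
\[ v(x) = \begin{cases} \dfrac{6}{x}, & x \ge 3,\\ 5 - x, & 0 < x \le 3. \end{cases} \]
We then have (see Fig. 8.4):
(a) $v(x) \ge (5 - x)^+, \quad \forall x$,
(b) $v(x) \ge \frac45\left[\frac12 v(2x) + \frac12 v\left(\frac{x}{2}\right)\right]$ for every $x$ except for $4 < x < 6$. (For instance, at $x = 5$ the right-hand side is $\frac45\left[\frac12 \cdot 0.6 + \frac12 \cdot 2.5\right] = 1.24 > v(5) = 1.2$.)
[Figure 8.4: graph of $v(x)$ against $x$, decreasing linearly from 5 at $x = 0$ to the point $(3, 2)$ and following $6/x$ thereafter.]
Figure 8.4: Graph of $v(x)$.
Check of condition (c):
If $0 < x \le 3$, then (a) holds with equality.
If $x \ge 6$, then (b) holds with equality:
\[ \tfrac45\left[\tfrac12 v(2x) + \tfrac12 v\left(\tfrac{x}{2}\right)\right] = \tfrac45\left[\tfrac12 \cdot \frac{6}{2x} + \tfrac12 \cdot \frac{12}{x}\right] = \frac{6}{x}. \]
(b) also holds with equality at $x = 4$.
If $3 < x < 4$, then both (a) and (b) are strict; if $4 < x < 6$, then (a) is strict and (b) fails. This is an artifact of the
discreteness of the binomial model. This artifact will disappear in the continuous model, in
which an analogue of (a) or (b) holds with equality at every point.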
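Conditions (a) and (b) for the suggested formula $v(x) = 6/x$ for $x \ge 3$, $v(x) = 5 - x$ for $0 < x \le 3$ can be examined on a grid of stock prices. The following is an exploratory sketch, not part of the original notes: the grid spacing of $\frac14$ and the helper names `slack_a` and `slack_b` are arbitrary choices, and exact rational arithmetic is used so that equality cases are detected exactly.

```python
from fractions import Fraction as F

def v(x):
    """Suggested value function for the perpetual put with strike 5."""
    return 6 / x if x >= 3 else 5 - x

def slack_a(x):
    """v(x) - (5 - x)^+ ; non-negative iff condition (a) holds at x."""
    return v(x) - max(5 - x, F(0))

def slack_b(x):
    """v(x) - (4/5)[v(2x)/2 + v(x/2)/2] ; non-negative iff (b) holds at x."""
    return v(x) - F(4, 5) * (v(2 * x) / 2 + v(x / 2) / 2)

grid = [F(k, 4) for k in range(1, 41)]            # x = 0.25, 0.50, ..., 10
violations_b = [x for x in grid if slack_b(x) < 0]
print([float(x) for x in violations_b])           # grid points where (b) fails
```

On this grid, (a) holds everywhere and (b) fails only at points strictly between 4 and 6; at the lattice prices $x = 4$ and $x = 8$ condition (b) holds with equality.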
8.9
Distribution of First Passage Times
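Let $\{M_k\}_{k=0}^\infty$ be a symmetric random walk under a probability measure $\mathbb{P}$, with $M_0 = 0$. Defining
\[ \tau = \min\{k \ge 0;\ M_k = 1\}, \]
we recall that
\[ \mathbb{E}\alpha^\tau = \frac{1 - \sqrt{1 - \alpha^2}}{\alpha}, \quad 0 < \alpha < 1. \]
We will use this moment generating function to obtain the distribution of $\tau$. We first obtain the
Taylor series expansion of $\mathbb{E}\alpha^\tau$ as follows:
\[ \begin{aligned}
f(x) &= 1 - \sqrt{1 - x}, & f(0) &= 0,\\
f'(x) &= \tfrac12 (1 - x)^{-\frac12}, & f'(0) &= \tfrac12,\\
f''(x) &= \tfrac14 (1 - x)^{-\frac32}, & f''(0) &= \tfrac14,\\
f'''(x) &= \tfrac38 (1 - x)^{-\frac52}, & f'''(0) &= \tfrac38,\\
&\ \ \vdots\\
f^{(j)}(x) &= \frac{1 \cdot 3 \cdots (2j-3)}{2^j} (1 - x)^{-\frac{2j-1}{2}},
\end{aligned} \]
\[ \begin{aligned}
f^{(j)}(0) &= \frac{1 \cdot 3 \cdots (2j-3)}{2^j}\\
&= \frac{1 \cdot 3 \cdots (2j-3)}{2^j} \cdot \frac{2 \cdot 4 \cdots (2j-2)}{2^{j-1} (j-1)!}\\
&= \left(\tfrac12\right)^{2j-1} \frac{(2j-2)!}{(j-1)!}.
\end{aligned} \]
The Taylor series expansion of $f(x)$ is given by
\[ \begin{aligned}
f(x) = 1 - \sqrt{1 - x} &= \sum_{j=0}^\infty \frac{1}{j!} f^{(j)}(0)\, x^j\\
&= \sum_{j=1}^\infty \left(\tfrac12\right)^{2j-1} \frac{(2j-2)!}{j!\,(j-1)!}\, x^j\\
&= \frac{x}{2} + \sum_{j=2}^\infty \left(\tfrac12\right)^{2j-1} \frac{1}{j} \binom{2j-2}{j-1} x^j.
\end{aligned} \]
So we have
\[ \begin{aligned}
\mathbb{E}\alpha^\tau = \frac{1 - \sqrt{1 - \alpha^2}}{\alpha} &= \frac{1}{\alpha} f(\alpha^2)\\
&= \frac{\alpha}{2} + \sum_{j=2}^\infty \left(\frac{\alpha}{2}\right)^{2j-1} \frac{1}{j} \binom{2j-2}{j-1}.
\end{aligned} \]
But also,
\[ \mathbb{E}\alpha^\tau = \sum_{j=1}^\infty \alpha^{2j-1}\, \mathbb{P}\{\tau = 2j-1\}. \]
[Figure 8.5: Reflection principle.]
[Figure 8.6: Example with $j = 2$.]
Therefore,
\[ \begin{aligned}
\mathbb{P}\{\tau = 1\} &= \tfrac12,\\
\mathbb{P}\{\tau = 2j-1\} &= \left(\tfrac12\right)^{2j-1} \frac{1}{j} \binom{2j-2}{j-1}, \quad j = 2, 3, \ldots
\end{aligned} \]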
8.10
The Reflection Principle
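To count how many paths reach level 1 by time $2j - 1$, count all those for which $M_{2j-1} = 1$ and
double count all those for which $M_{2j-1} \ge 3$. (See Figures 8.5, 8.6.)
In other words,
\[ \begin{aligned}
\mathbb{P}\{\tau \le 2j-1\} &= \mathbb{P}\{M_{2j-1} = 1\} + 2\,\mathbb{P}\{M_{2j-1} \ge 3\}\\
&= \mathbb{P}\{M_{2j-1} = 1\} + \mathbb{P}\{M_{2j-1} \ge 3\} + \mathbb{P}\{M_{2j-1} \le -3\}\\
&= 1 - \mathbb{P}\{M_{2j-1} = -1\}.
\end{aligned} \]
For $j \ge 2$,
\[ \begin{aligned}
\mathbb{P}\{\tau = 2j-1\} &= \mathbb{P}\{\tau \le 2j-1\} - \mathbb{P}\{\tau \le 2j-3\}\\
&= \left[1 - \mathbb{P}\{M_{2j-1} = -1\}\right] - \left[1 - \mathbb{P}\{M_{2j-3} = -1\}\right]\\
&= \mathbb{P}\{M_{2j-3} = -1\} - \mathbb{P}\{M_{2j-1} = -1\}\\
&= \left(\tfrac12\right)^{2j-3} \frac{(2j-3)!}{(j-1)!\,(j-2)!} - \left(\tfrac12\right)^{2j-1} \frac{(2j-1)!}{j!\,(j-1)!}\\
&= \left(\tfrac12\right)^{2j-1} \frac{(2j-3)!}{j!\,(j-1)!} \left[4j(j-1) - (2j-1)(2j-2)\right]\\
&= \left(\tfrac12\right)^{2j-1} \frac{(2j-3)!}{j!\,(j-1)!} \left[2j(2j-2) - (2j-1)(2j-2)\right]\\
&= \left(\tfrac12\right)^{2j-1} \frac{(2j-2)!}{j!\,(j-1)!}\\
&= \left(\tfrac12\right)^{2j-1} \frac{1}{j} \binom{2j-2}{j-1}.
\end{aligned} \]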
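The first passage distribution obtained from the Taylor series and again from the reflection principle can be compared against a direct computation. The sketch below is an illustration under stated assumptions: the helper names `first_passage_exact` and `formula` are hypothetical, and the horizon of 15 steps is an arbitrary choice. It propagates the walk step by step, absorbing at level 1, with exact rational arithmetic.

```python
from fractions import Fraction as F
from math import comb

def first_passage_exact(n_max):
    """Exact P{tau = k}, k = 1..n_max, by propagating the walk absorbed at level 1."""
    alive = {0: F(1)}   # law of M_k restricted to paths that have not yet hit 1
    out = {}
    for k in range(1, n_max + 1):
        nxt, hit = {}, F(0)
        for m, pr in alive.items():
            for step in (1, -1):
                if m + step == 1:
                    hit += pr / 2
                else:
                    nxt[m + step] = nxt.get(m + step, F(0)) + pr / 2
        out[k] = hit
        alive = nxt
    return out

def formula(j):
    """P{tau = 2j-1} = (1/2)^(2j-1) * (1/j) * C(2j-2, j-1)."""
    return F(1, 2) ** (2 * j - 1) * F(comb(2 * j - 2, j - 1), j)

exact = first_passage_exact(15)
for j in range(1, 5):
    print(2 * j - 1, exact[2 * j - 1], formula(j))
```

The first passage can only occur at odd times, since $M_k$ has the same parity as $k$; the even entries of `exact` are therefore zero.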
Chapter 9
Pricing in terms of Market Probabilities:
The Radon-Nikodym Theorem.
9.1
Radon-Nikodym Theorem
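Theorem 1.27 (Radon-Nikodym) Let $\mathbb{P}$ and $\widetilde{\mathbb{P}}$ be two probability measures on a space $(\Omega, \mathcal{F})$.
Assume that for every $A \in \mathcal{F}$ satisfying $\mathbb{P}(A) = 0$, we also have $\widetilde{\mathbb{P}}(A) = 0$. Then we say that
$\widetilde{\mathbb{P}}$ is absolutely continuous with respect to $\mathbb{P}$. Under this assumption, there is a nonnegative random
variable $Z$ such that
\[ \widetilde{\mathbb{P}}(A) = \int_A Z\, d\mathbb{P}, \quad \forall A \in \mathcal{F}, \tag{1.1} \]
and $Z$ is called the Radon-Nikodym derivative of $\widetilde{\mathbb{P}}$ with respect to $\mathbb{P}$.
Remark 9.1 Equation (1.1) implies the apparently stronger condition
\[ \widetilde{\mathbb{E}}X = \mathbb{E}[XZ] \]
for every random variable $X$ for which $\mathbb{E}|XZ| < \infty$.
Remark 9.2 If $\widetilde{\mathbb{P}}$ is absolutely continuous with respect to $\mathbb{P}$, and $\mathbb{P}$ is absolutely continuous with
respect to $\widetilde{\mathbb{P}}$, we say that $\mathbb{P}$ and $\widetilde{\mathbb{P}}$ are equivalent. $\mathbb{P}$ and $\widetilde{\mathbb{P}}$ are equivalent if and only if
\[ \mathbb{P}(A) = 0 \text{ exactly when } \widetilde{\mathbb{P}}(A) = 0, \quad \forall A \in \mathcal{F}. \]
If $\mathbb{P}$ and $\widetilde{\mathbb{P}}$ are equivalent and $Z$ is the Radon-Nikodym derivative of $\widetilde{\mathbb{P}}$ w.r.t. $\mathbb{P}$, then $\frac{1}{Z}$ is the
Radon-Nikodym derivative of $\mathbb{P}$ w.r.t. $\widetilde{\mathbb{P}}$, i.e.,
\[ \widetilde{\mathbb{E}}X = \mathbb{E}[XZ] \quad \forall X, \tag{1.2} \]
\[ \mathbb{E}Y = \widetilde{\mathbb{E}}\left[Y \cdot \frac{1}{Z}\right] \quad \forall Y. \tag{1.3} \]
(Let $X$ and $Y$ be related by the equation $Y = XZ$ to see that (1.2) and (1.3) are the same.)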
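Example 9.1 (Radon-Nikodym Theorem) Let $\Omega = \{HH, HT, TH, TT\}$, the set of coin toss sequences
of length 2. Let $\mathbb{P}$ correspond to probability $\frac13$ for $H$ and $\frac23$ for $T$, and let $\widetilde{\mathbb{P}}$ correspond to probability $\frac12$ for
$H$ and $\frac12$ for $T$. Then $Z(\omega) = \frac{\widetilde{\mathbb{P}}(\omega)}{\mathbb{P}(\omega)}$, so
\[ Z(HH) = \tfrac94, \quad Z(HT) = \tfrac98, \quad Z(TH) = \tfrac98, \quad Z(TT) = \tfrac{9}{16}. \]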
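The values of $Z$ in this example follow directly from the ratio of path probabilities. A minimal sketch with exact fractions is given below; the dictionary names `p_market` and `p_tilde` and the helper `prob` are illustrative choices, not notation from the notes.

```python
from fractions import Fraction as F
from itertools import product

p_market = {"H": F(1, 3), "T": F(2, 3)}   # market measure P
p_tilde = {"H": F(1, 2), "T": F(1, 2)}    # risk-neutral measure P-tilde

def prob(measure, omega):
    """Probability of a toss sequence under independent tosses."""
    result = F(1)
    for toss in omega:
        result *= measure[toss]
    return result

omegas = ["".join(w) for w in product("HT", repeat=2)]
Z = {w: prob(p_tilde, w) / prob(p_market, w) for w in omegas}
expectation_Z = sum(Z[w] * prob(p_market, w) for w in omegas)   # E[Z] under P

for w in omegas:
    print(w, Z[w])
print(expectation_Z)
```

The last line checks the normalization $\mathbb{E}Z = \widetilde{\mathbb{P}}(\Omega) = 1$.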
9.2
Radon-Nikodym Martingales
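Let $\Omega$ be the set of all sequences of $n$ coin tosses. Let $\mathbb{P}$ be the market probability measure and let
$\widetilde{\mathbb{P}}$ be the risk-neutral probability measure. Assume
\[ \mathbb{P}(\omega) > 0, \quad \widetilde{\mathbb{P}}(\omega) > 0, \quad \forall \omega \in \Omega, \]
so that $\mathbb{P}$ and $\widetilde{\mathbb{P}}$ are equivalent. The Radon-Nikodym derivative of $\widetilde{\mathbb{P}}$ with respect to $\mathbb{P}$ is
\[ Z(\omega) = \frac{\widetilde{\mathbb{P}}(\omega)}{\mathbb{P}(\omega)}. \]
Define the $\mathbb{P}$-martingale
\[ Z_k \triangleq \mathbb{E}[Z \mid \mathcal{F}_k], \quad k = 0, 1, \ldots, n. \]
We can check that $Z_k$ is indeed a martingale:
\[ \begin{aligned}
\mathbb{E}[Z_{k+1} \mid \mathcal{F}_k] &= \mathbb{E}\left[\mathbb{E}[Z \mid \mathcal{F}_{k+1}] \mid \mathcal{F}_k\right]\\
&= \mathbb{E}[Z \mid \mathcal{F}_k]\\
&= Z_k.
\end{aligned} \]
Lemma 2.28 If $X$ is $\mathcal{F}_k$-measurable, then $\widetilde{\mathbb{E}}X = \mathbb{E}[XZ_k]$.
Proof:
\[ \begin{aligned}
\widetilde{\mathbb{E}}X &= \mathbb{E}[XZ]\\
&= \mathbb{E}\left[\mathbb{E}[XZ \mid \mathcal{F}_k]\right]\\
&= \mathbb{E}\left[X \cdot \mathbb{E}[Z \mid \mathcal{F}_k]\right]\\
&= \mathbb{E}[XZ_k].
\end{aligned} \]
Note that Lemma 2.28 implies that if $X$ is $\mathcal{F}_k$-measurable, then for any $A \in \mathcal{F}_k$,
\[ \widetilde{\mathbb{E}}[I_A X] = \mathbb{E}[Z_k I_A X], \]
or equivalently,
\[ \int_A X\, d\widetilde{\mathbb{P}} = \int_A X Z_k\, d\mathbb{P}. \]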
[Figure 9.1: tree with $Z_0 = 1$; $Z_1(H) = 3/2$, $Z_1(T) = 3/4$; $Z_2(HH) = 9/4$, $Z_2(HT) = Z_2(TH) = 9/8$, $Z_2(TT) = 9/16$; each up branch has probability $1/3$ and each down branch probability $2/3$.]
Figure 9.1: Showing the $Z_k$ values in the 2-period binomial model example. The probabilities shown
are for $\mathbb{P}$, not $\widetilde{\mathbb{P}}$.
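Lemma 2.29 If $X$ is $\mathcal{F}_k$-measurable and $0 \le j \le k$, then
\[ \widetilde{\mathbb{E}}[X \mid \mathcal{F}_j] = \frac{1}{Z_j}\, \mathbb{E}[XZ_k \mid \mathcal{F}_j]. \]
Proof: Note first that $\frac{1}{Z_j} \mathbb{E}[XZ_k \mid \mathcal{F}_j]$ is $\mathcal{F}_j$-measurable. So for any $A \in \mathcal{F}_j$, we have
\[ \begin{aligned}
\int_A \frac{1}{Z_j}\, \mathbb{E}[XZ_k \mid \mathcal{F}_j]\, d\widetilde{\mathbb{P}} &= \int_A \mathbb{E}[XZ_k \mid \mathcal{F}_j]\, d\mathbb{P} && \text{(Lemma 2.28)}\\
&= \int_A XZ_k\, d\mathbb{P} && \text{(Partial averaging)}\\
&= \int_A X\, d\widetilde{\mathbb{P}} && \text{(Lemma 2.28)}
\end{aligned} \]
Example 9.2 (Radon-Nikodym Theorem, continued) We show in Fig. 9.1 the values of the martingale $Z_k$.
We always have $Z_0 = 1$, since
\[ Z_0 = \mathbb{E}Z = \int_\Omega Z\, d\mathbb{P} = \widetilde{\mathbb{P}}(\Omega) = 1. \]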
9.3
The State Price Density Process
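In order to express the value of a derivative security in terms of the market probabilities, it will be
useful to introduce the following state price density process:
\[ \zeta_k = (1+r)^{-k} Z_k, \quad k = 0, \ldots, n. \]
We then have the following pricing formulas: For a Simple European derivative security with
payoff $C_k$ at time $k$,
\[ \begin{aligned}
V_0 &= \widetilde{\mathbb{E}}\left[(1+r)^{-k} C_k\right]\\
&= \mathbb{E}\left[(1+r)^{-k} Z_k C_k\right] && \text{(Lemma 2.28)}\\
&= \mathbb{E}[\zeta_k C_k].
\end{aligned} \]
More generally for $0 \le j \le k$,
\[ \begin{aligned}
V_j &= (1+r)^j\, \widetilde{\mathbb{E}}\left[(1+r)^{-k} C_k \mid \mathcal{F}_j\right]\\
&= \frac{(1+r)^j}{Z_j}\, \mathbb{E}\left[(1+r)^{-k} Z_k C_k \mid \mathcal{F}_j\right] && \text{(Lemma 2.29)}\\
&= \frac{1}{\zeta_j}\, \mathbb{E}[\zeta_k C_k \mid \mathcal{F}_j].
\end{aligned} \]
Remark 9.3 $\{\zeta_j V_j\}_{j=0}^k$ is a martingale under $\mathbb{P}$, as we can check below:
\[ \begin{aligned}
\mathbb{E}[\zeta_{j+1} V_{j+1} \mid \mathcal{F}_j] &= \mathbb{E}\left[\mathbb{E}[\zeta_k C_k \mid \mathcal{F}_{j+1}] \mid \mathcal{F}_j\right]\\
&= \mathbb{E}[\zeta_k C_k \mid \mathcal{F}_j]\\
&= \zeta_j V_j.
\end{aligned} \]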
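Now for an American derivative security $\{G_k\}_{k=0}^n$:
\[ \begin{aligned}
V_0 &= \sup_{\tau \in T_0} \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} G_\tau\right]\\
&= \sup_{\tau \in T_0} \mathbb{E}\left[(1+r)^{-\tau} Z_\tau G_\tau\right]\\
&= \sup_{\tau \in T_0} \mathbb{E}[\zeta_\tau G_\tau].
\end{aligned} \]
More generally for $0 \le j \le n$,
\[ \begin{aligned}
V_j &= (1+r)^j \sup_{\tau \in T_j} \widetilde{\mathbb{E}}\left[(1+r)^{-\tau} G_\tau \mid \mathcal{F}_j\right]\\
&= (1+r)^j \sup_{\tau \in T_j} \frac{1}{Z_j}\, \mathbb{E}\left[(1+r)^{-\tau} Z_\tau G_\tau \mid \mathcal{F}_j\right]\\
&= \frac{1}{\zeta_j} \sup_{\tau \in T_j} \mathbb{E}[\zeta_\tau G_\tau \mid \mathcal{F}_j].
\end{aligned} \]
Remark 9.4 Note that
(a) $\{\zeta_j V_j\}_{j=0}^n$ is a supermartingale under $\mathbb{P}$,
(b) $\zeta_j V_j \ge \zeta_j G_j \quad \forall j$,
(c) $\{\zeta_j V_j\}_{j=0}^n$ is the smallest process having properties (a) and (b).
We interpret $\zeta_k$ by observing that $\zeta_k(\omega)\mathbb{P}(\omega)$ is the value at time zero of a contract which pays \$1
at time $k$ if $\omega$ occurs.
[Figure 9.2: tree with $S_0 = 4$, $\zeta_0 = 1.00$; $S_1(H) = 8$, $\zeta_1(H) = 1.20$; $S_1(T) = 2$, $\zeta_1(T) = 0.6$; $S_2(HH) = 16$, $\zeta_2(HH) = 1.44$; $S_2(HT) = S_2(TH) = 4$, $\zeta_2(HT) = \zeta_2(TH) = 0.72$; $S_2(TT) = 1$, $\zeta_2(TT) = 0.36$; each up branch has probability $1/3$ and each down branch probability $2/3$.]
Figure 9.2: Showing the state price values $\zeta_k$. The probabilities shown are for $\mathbb{P}$, not $\widetilde{\mathbb{P}}$.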
Example 9.3 (Radon-Nikodym Theorem, continued) We illustrate the use of the valuation formulas for European and American derivative securities in terms of market probabilities. Recall that \(p = \frac13\), \(q = \frac23\). The state price values \(\zeta_k\) are shown in Fig. 9.2.
For a European call with strike price 5 and expiration time 2, we have
\[ V_2(HH) = 11, \qquad \zeta_2(HH) V_2(HH) = 1.44 \times 11 = 15.84, \]
\[ V_2(HT) = V_2(TH) = V_2(TT) = 0, \]
\[ V_0 = \tfrac13 \cdot \tfrac13 \cdot 15.84 = 1.76, \]
\[ \frac{\zeta_2(HH)}{\zeta_1(H)} V_2(HH) = \frac{1.44}{1.20} \times 11 = 1.20 \times 11 = 13.20, \]
\[ V_1(H) = \tfrac13 \times 13.20 = 4.40. \]
Compare with the risk-neutral pricing formulas:
\[ V_1(H) = \tfrac25 V_2(HH) + \tfrac25 V_2(HT) = \tfrac25 \times 11 = 4.40, \]
\[ V_1(T) = \tfrac25 V_2(TH) + \tfrac25 V_2(TT) = 0, \]
\[ V_0 = \tfrac25 V_1(H) + \tfrac25 V_1(T) = \tfrac25 \times 4.40 = 1.76. \]
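The two routes to the call price in Example 9.3 can be replayed numerically. The following is a minimal sketch, assuming the model parameters implied by Fig. 9.2 and the 2/5 weights (\(S_0 = 4\), \(u = 2\), \(d = 1/2\), \(r = 1/4\)); the function names `stock`, `zeta` and `market_prob` are illustrative, not from the notes.

```python
from fractions import Fraction as F
from itertools import product

# Assumed parameters: u and d are inferred from the stock prices in Fig. 9.2,
# and r = 1/4 is inferred from the 2/5 weights in the risk-neutral formulas.
S0, u, d, r = F(4), F(2), F(1, 2), F(1, 4)
p = {"H": F(1, 3), "T": F(2, 3)}              # market probabilities
p_tilde = (1 + r - d) / (u - d)                # risk-neutral probability of H
pt = {"H": p_tilde, "T": 1 - p_tilde}
K = F(5)

def stock(path):
    """Stock price after the tosses in `path`."""
    s = S0
    for toss in path:
        s *= u if toss == "H" else d
    return s

def zeta(path):
    """State price density zeta_k = (1 + r)^(-k) * Z_k on a path of length k."""
    z = F(1)
    for toss in path:
        z *= pt[toss] / p[toss]
    return z / (1 + r) ** len(path)

def market_prob(path):
    """Market probability of the toss sequence `path`."""
    q = F(1)
    for toss in path:
        q *= p[toss]
    return q

paths = ["".join(w) for w in product("HT", repeat=2)]
payoff = {w: max(stock(w) - K, F(0)) for w in paths}

# V_0 = E[zeta_2 * V_2] under the market measure.
V0_market = sum(market_prob(w) * zeta(w) * payoff[w] for w in paths)

# V_0 = E~[(1 + r)^(-2) * V_2] under the risk-neutral measure.
V0_risk_neutral = sum(pt[w[0]] * pt[w[1]] * payoff[w] for w in paths) / (1 + r) ** 2

print(float(zeta("HH")), float(V0_market), float(V0_risk_neutral))
```

Exact fractions are used so that the agreement between the two formulas is an identity rather than a floating-point coincidence.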
Now consider an American put with strike price 5 and expiration time 2. Fig. 9.3 shows the values of \(\zeta_k (5 - S_k)^+\). We compute the value of the put under various stopping times \(\tau\):
(0) Stop immediately: value is 1.
(1) If \(\tau(HH) = \tau(HT) = 2\), \(\tau(TH) = \tau(TT) = 1\), the value is
\[ \tfrac13 \cdot \tfrac23 \cdot 0.72 + \tfrac23 \cdot 1.80 = 1.36. \]
(2) If we stop at time 2, the value is
\[ \tfrac13 \cdot \tfrac23 \cdot 0.72 + \tfrac23 \cdot \tfrac13 \cdot 0.72 + \tfrac23 \cdot \tfrac23 \cdot 1.44 = 0.96. \]
We see that (1) is the optimal stopping rule.
Figure 9.3: The values \(\zeta_k (5 - S_k)^+\) for an American put. Intrinsic values \((5 - S_k)^+\): 1 at time 0; 0 at H, 3 at T; 0 at HH, 1 at HT, 1 at TH, 4 at TT. Weighted values \(\zeta_k (5 - S_k)^+\): 1 at time 0; 0 at H, 1.80 at T; 0 at HH, 0.72 at HT, 0.72 at TH, 1.44 at TT. The branch probabilities shown (1/3 for H, 2/3 for T) are for \(\mathbb{P}\), not \(\widetilde{\mathbb{P}}\).
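The three stopping-rule values above can be recomputed from the entries of Fig. 9.3. This is a small sketch under the assumption that those entries are exact decimals; the dictionary layout and the helper `value` are illustrative. It compares only the three rules listed in the text and does not search over all stopping times.

```python
from fractions import Fraction as F

p = {"H": F(1, 3), "T": F(2, 3)}   # market probabilities

# zeta_k * (5 - S_k)^+ read off Fig. 9.3, keyed by the tosses seen so far.
discounted_payoff = {
    "": F(1),
    "H": F(0), "T": F(9, 5),
    "HH": F(0), "HT": F(18, 25), "TH": F(18, 25), "TT": F(36, 25),
}

def value(rule):
    """E[zeta_tau * G_tau] for a rule given as {full path: stopping time}."""
    total = F(0)
    for path, tau in rule.items():
        prob = p[path[0]] * p[path[1]]
        # Only the first tau tosses determine the payoff at the stopping time.
        total += prob * discounted_payoff[path[:tau]]
    return total

rules = {
    "(0) stop immediately": {"HH": 0, "HT": 0, "TH": 0, "TT": 0},
    "(1) stop at 1 after T, else at 2": {"HH": 2, "HT": 2, "TH": 1, "TT": 1},
    "(2) stop at time 2": {"HH": 2, "HT": 2, "TH": 2, "TT": 2},
}
values = {name: value(rule) for name, rule in rules.items()}
for name, v in values.items():
    print(name, float(v))
```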
9.4
Stochastic Volatility Binomial Model
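Let \(\Omega\) be the set of sequences of \(n\) tosses, and let \(0 < d_k < 1 + r_k < u_k\), where for each \(k\), \(d_k, u_k, r_k\) are \(\mathcal{F}_k\)-measurable. Also let
\[ \tilde p_k = \frac{1 + r_k - d_k}{u_k - d_k}, \qquad \tilde q_k = \frac{u_k - (1 + r_k)}{u_k - d_k}. \]
Let \(\widetilde{\mathbb{P}}\) be the risk-neutral probability measure:
\[ \widetilde{\mathbb{P}}\{\omega_1 = H\} = \tilde p_0, \qquad \widetilde{\mathbb{P}}\{\omega_1 = T\} = \tilde q_0, \]
and for \(1 \le k \le n - 1\),
\[ \widetilde{\mathbb{P}}\{\omega_{k+1} = H \mid \mathcal{F}_k\} = \tilde p_k, \qquad \widetilde{\mathbb{P}}\{\omega_{k+1} = T \mid \mathcal{F}_k\} = \tilde q_k. \]
Let \(\mathbb{P}\) be the market probability measure, and assume \(\mathbb{P}\{\omega\} > 0 \ \ \forall \omega \in \Omega\). Then \(\mathbb{P}\) and \(\widetilde{\mathbb{P}}\) are equivalent. Define
\[ Z(\omega) = \frac{\widetilde{\mathbb{P}}(\omega)}{\mathbb{P}(\omega)} \quad \forall \omega \in \Omega, \]
\[ Z_k = \mathbb{E}[Z \mid \mathcal{F}_k], \quad k = 0, 1, \dots, n. \]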
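We define the money market price process as follows:
\[ M_0 = 1, \qquad M_k = (1 + r_{k-1}) M_{k-1}, \quad k = 1, \dots, n. \]
Note that \(M_k\) is \(\mathcal{F}_{k-1}\)-measurable.
We then define the state price process to be
\[ \zeta_k = \frac{1}{M_k} Z_k, \quad k = 0, \dots, n. \]
As before the portfolio process is \(\{\Delta_k\}_{k=0}^{n-1}\). The self-financing value process (wealth process) consists of \(X_0\), the non-random initial wealth, and
\[ X_{k+1} = \Delta_k S_{k+1} + (1 + r_k)(X_k - \Delta_k S_k), \quad k = 0, \dots, n - 1. \]
Then the following processes are martingales under \(\widetilde{\mathbb{P}}\):
\[ \left\{ \frac{S_k}{M_k} \right\}_{k=0}^{n} \quad \text{and} \quad \left\{ \frac{X_k}{M_k} \right\}_{k=0}^{n}, \]
and the following processes are martingales under \(\mathbb{P}\):
\[ \{\zeta_k S_k\}_{k=0}^{n} \quad \text{and} \quad \{\zeta_k X_k\}_{k=0}^{n}. \]
We thus have the following pricing formulas:
Simple European derivative security with payoff \(C_k\) at time \(k\):
\[ V_j = M_j\, \widetilde{\mathbb{E}}\left[ \frac{C_k}{M_k} \,\Big|\, \mathcal{F}_j \right] = \frac{1}{\zeta_j}\, \mathbb{E}[\zeta_k C_k \mid \mathcal{F}_j]. \]
American derivative security \(\{G_k\}_{k=0}^{n}\):
\[ V_j = M_j \sup_{\tau \in T_j} \widetilde{\mathbb{E}}\left[ \frac{G_\tau}{M_\tau} \,\Big|\, \mathcal{F}_j \right] = \frac{1}{\zeta_j} \sup_{\tau \in T_j} \mathbb{E}[\zeta_\tau G_\tau \mid \mathcal{F}_j]. \]
The usual hedging portfolio formulas still work.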
9.5
Another Application of the Radon-Nikodym Theorem
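Let \((\Omega, \mathcal{F}, Q)\) be a probability space. Let \(\mathcal{G}\) be a sub-\(\sigma\)-algebra of \(\mathcal{F}\), and let \(X\) be a non-negative random variable with \(\int_\Omega X \, dQ = 1\). We construct the conditional expectation (under \(Q\)) of \(X\) given \(\mathcal{G}\). On \(\mathcal{G}\), define two probability measures
\[ \mathbb{P}(A) = Q(A) \quad \forall A \in \mathcal{G}, \]
\[ \widetilde{\mathbb{P}}(A) = \int_A X \, dQ \quad \forall A \in \mathcal{G}. \]
Whenever \(Y\) is a \(\mathcal{G}\)-measurable random variable, we have
\[ \int_\Omega Y \, d\mathbb{P} = \int_\Omega Y \, dQ; \]
if \(Y = 1_A\) for some \(A \in \mathcal{G}\), this is just the definition of \(\mathbb{P}\), and the rest follows from the "standard machine". If \(A \in \mathcal{G}\) and \(\mathbb{P}(A) = 0\), then \(Q(A) = 0\), so \(\widetilde{\mathbb{P}}(A) = 0\). In other words, the measure \(\widetilde{\mathbb{P}}\) is absolutely continuous with respect to the measure \(\mathbb{P}\). The Radon-Nikodym theorem implies that there exists a \(\mathcal{G}\)-measurable random variable \(Z\) such that
\[ \widetilde{\mathbb{P}}(A) = \int_A Z \, d\mathbb{P} \quad \forall A \in \mathcal{G}, \]
i.e.,
\[ \int_A X \, dQ = \int_A Z \, d\mathbb{P} \quad \forall A \in \mathcal{G}. \]
Since \(\mathbb{P}\) agrees with \(Q\) on \(\mathcal{G}\) and \(Z\) is \(\mathcal{G}\)-measurable, the right-hand side equals \(\int_A Z \, dQ\). This shows that \(Z\) has the "partial averaging" property, and since \(Z\) is \(\mathcal{G}\)-measurable, it is the conditional expectation (under the probability measure \(Q\)) of \(X\) given \(\mathcal{G}\). The existence of conditional expectations is a consequence of the Radon-Nikodym theorem.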
Chapter 10
Capital Asset Pricing
10.1
An Optimization Problem
Consider an agent who has initial wealth \(X_0\) and wants to invest in the stock and money markets so as to maximize
\[ \mathbb{E} \log X_n. \]
Remark 10.1 Regardless of the portfolio used by the agent, \(\{\zeta_k X_k\}_{k=0}^{n}\) is a martingale under \(\mathbb{P}\), so
\[ \mathbb{E}[\zeta_n X_n] = X_0. \qquad \text{(BC)} \]
Here, (BC) stands for "Budget Constraint".
Remark 10.2 If \(\xi\) is any random variable satisfying (BC), i.e.,
\[ \mathbb{E}[\zeta_n \xi] = X_0, \]
then there is a portfolio which starts with initial wealth \(X_0\) and produces \(X_n = \xi\) at time \(n\). To see this, just regard \(\xi\) as a simple European derivative security paying off at time \(n\). Then \(X_0\) is its value at time 0, and starting from this value, there is a hedging portfolio which produces \(X_n = \xi\).
Remarks 10.1 and 10.2 show that the optimal \(X_n\) for the capital asset pricing problem can be obtained by solving the following
Constrained Optimization Problem:
Find a random variable \(\xi\) which solves:
Maximize \(\mathbb{E} \log \xi\)
Subject to \(\mathbb{E}[\zeta_n \xi] = X_0\).
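Equivalently, we wish to
Maximize \(\displaystyle \sum_{\omega \in \Omega} (\log \xi(\omega))\, \mathbb{P}(\omega)\)
Subject to \(\displaystyle \sum_{\omega \in \Omega} \zeta_n(\omega)\, \xi(\omega)\, \mathbb{P}(\omega) - X_0 = 0\).
There are \(2^n\) sequences \(\omega\) in \(\Omega\). Call them \(\omega_1, \omega_2, \dots, \omega_{2^n}\). Adopt the notation
\[ x_1 = \xi(\omega_1), \quad x_2 = \xi(\omega_2), \quad \dots, \quad x_{2^n} = \xi(\omega_{2^n}). \]
We can thus restate the problem as:
Maximize \(\displaystyle \sum_{k=1}^{2^n} (\log x_k)\, \mathbb{P}(\omega_k)\)
Subject to \(\displaystyle \sum_{k=1}^{2^n} \zeta_n(\omega_k)\, x_k\, \mathbb{P}(\omega_k) - X_0 = 0\).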
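In order to solve this problem we use:
Theorem 1.30 (Lagrange Multiplier) If \((x_1^*, \dots, x_m^*)\) solve the problem
Maximize \(f(x_1, \dots, x_m)\)
Subject to \(g(x_1, \dots, x_m) = 0\),
then there is a number \(\lambda\) such that
\[ \frac{\partial}{\partial x_k} f(x_1^*, \dots, x_m^*) = \lambda \frac{\partial}{\partial x_k} g(x_1^*, \dots, x_m^*), \quad k = 1, \dots, m, \qquad (1.1) \]
and
\[ g(x_1^*, \dots, x_m^*) = 0. \qquad (1.2) \]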
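For our problem, (1.1) and (1.2) become
\[ \frac{1}{x_k^*}\, \mathbb{P}(\omega_k) = \lambda\, \zeta_n(\omega_k)\, \mathbb{P}(\omega_k), \quad k = 1, \dots, 2^n, \qquad (1.1') \]
\[ \sum_{k=1}^{2^n} \zeta_n(\omega_k)\, x_k^*\, \mathbb{P}(\omega_k) = X_0. \qquad (1.2') \]
Equation (1.1') implies
\[ x_k^* = \frac{1}{\lambda\, \zeta_n(\omega_k)}. \]
Plugging this into (1.2') we get
\[ \frac{1}{\lambda} \sum_{k=1}^{2^n} \mathbb{P}(\omega_k) = X_0 \implies \frac{1}{\lambda} = X_0. \]
Therefore,
\[ x_k^* = \frac{X_0}{\zeta_n(\omega_k)}, \quad k = 1, \dots, 2^n. \]
Thus we have shown that if \(\xi^*\) solves the problem
Maximize \(\mathbb{E} \log \xi\)
Subject to \(\mathbb{E}[\zeta_n \xi] = X_0\), (1.3)
then
\[ \xi^* = \frac{X_0}{\zeta_n}. \qquad (1.4) \]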
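Theorem 1.31 If \(\xi^*\) is given by (1.4), then \(\xi^*\) solves the problem (1.3).
Proof: Fix \(Z > 0\) and define
\[ f(x) = \log x - xZ. \]
We maximize \(f\) over \(x > 0\):
\[ f'(x) = \frac{1}{x} - Z = 0 \iff x = \frac{1}{Z}, \]
\[ f''(x) = -\frac{1}{x^2} < 0 \quad \forall x > 0. \]
The function \(f\) is maximized at \(x^* = \frac{1}{Z}\), i.e.,
\[ \log x - xZ \le f(x^*) = \log \frac{1}{Z} - 1 \quad \forall x > 0, \ \forall Z > 0. \qquad (1.5) \]
Let \(\xi\) be any random variable satisfying
\[ \mathbb{E}[\zeta_n \xi] = X_0 \]
and let
\[ \xi^* = \frac{X_0}{\zeta_n}. \]
From (1.5), applied with \(x = \xi\) and \(Z = \zeta_n / X_0\), we have
\[ \log \xi - \xi \frac{\zeta_n}{X_0} \le \log \frac{X_0}{\zeta_n} - 1. \]
Taking expectations, we have
\[ \mathbb{E} \log \xi - \frac{1}{X_0} \mathbb{E}[\zeta_n \xi] \le \mathbb{E} \log \xi^* - 1, \]
and so, since \(\mathbb{E}[\zeta_n \xi] = X_0\),
\[ \mathbb{E} \log \xi \le \mathbb{E} \log \xi^*. \]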
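In summary, capital asset pricing works as follows: Consider an agent who has initial wealth \(X_0\) and wants to invest in the stock and money market so as to maximize
\[ \mathbb{E} \log X_n. \]
The optimal \(X_n\) is \(X_n^* = \frac{X_0}{\zeta_n}\), i.e.,
\[ \zeta_n X_n^* = X_0. \]
Since \(\{\zeta_k X_k^*\}_{k=0}^{n}\) is a martingale under \(\mathbb{P}\), we have
\[ \zeta_k X_k^* = \mathbb{E}[\zeta_n X_n^* \mid \mathcal{F}_k] = X_0, \quad k = 0, \dots, n, \]
so
\[ X_k^* = \frac{X_0}{\zeta_k}, \]
and the optimal portfolio is given by
\[ \Delta_k^*(\omega_1, \dots, \omega_k) = \frac{ \dfrac{X_0}{\zeta_{k+1}(\omega_1, \dots, \omega_k, H)} - \dfrac{X_0}{\zeta_{k+1}(\omega_1, \dots, \omega_k, T)} }{ S_{k+1}(\omega_1, \dots, \omega_k, H) - S_{k+1}(\omega_1, \dots, \omega_k, T) }. \]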
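As a numerical illustration of Theorem 1.31, the sketch below checks the budget constraint for \(\xi^* = X_0/\zeta_n\) and compares its expected log against a few other feasible payoffs. It assumes the two-period state prices of Fig. 9.2 and an arbitrary initial wealth `X0 = 10`; the perturbation scheme in `feasible_perturbation` is only one illustrative way to stay on the budget constraint.

```python
import math

# Assumed two-period example: market probabilities and state prices zeta_2
# taken from Fig. 9.2; X0 is an arbitrary illustrative initial wealth.
prob = {"HH": 1 / 9, "HT": 2 / 9, "TH": 2 / 9, "TT": 4 / 9}
zeta = {"HH": 1.44, "HT": 0.72, "TH": 0.72, "TT": 0.36}
X0 = 10.0

def budget(xi):
    """E[zeta_n * xi] under the market measure."""
    return sum(prob[w] * zeta[w] * xi[w] for w in prob)

def expected_log(xi):
    """E[log xi] under the market measure."""
    return sum(prob[w] * math.log(xi[w]) for w in prob)

# Candidate optimum from (1.4).
xi_star = {w: X0 / zeta[w] for w in prob}

def feasible_perturbation(eps):
    """Shift budget from state TT to state HH while keeping E[zeta * xi] = X0."""
    xi = dict(xi_star)
    xi["HH"] += eps / (prob["HH"] * zeta["HH"])
    xi["TT"] -= eps / (prob["TT"] * zeta["TT"])
    return xi

for eps in (-1.0, -0.1, 0.1, 1.0):
    xi = feasible_perturbation(eps)
    print(eps, budget(xi), expected_log(xi), "<=", expected_log(xi_star))
```

Every perturbed payoff spends the same budget but has a strictly smaller expected log, in line with the concavity argument in the proof.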
Chapter 11
General Random Variables
11.1
Law of a Random Variable
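Thus far we have considered only random variables whose domain and range are discrete. We now consider a general random variable \(X : \Omega \to \mathbb{R}\) defined on the probability space \((\Omega, \mathcal{F}, \mathbb{P})\). Recall that:
- \(\mathcal{F}\) is a \(\sigma\)-algebra of subsets of \(\Omega\).
- \(\mathbb{P}\) is a probability measure on \(\mathcal{F}\), i.e., \(\mathbb{P}(A)\) is defined for every \(A \in \mathcal{F}\).
A function \(X : \Omega \to \mathbb{R}\) is a random variable if and only if for every \(B \in \mathcal{B}(\mathbb{R})\) (the \(\sigma\)-algebra of Borel subsets of \(\mathbb{R}\)), the set
\[ \{X \in B\} \triangleq X^{-1}(B) \triangleq \{\omega;\ X(\omega) \in B\} \in \mathcal{F}, \]
i.e., \(X : \Omega \to \mathbb{R}\) is a random variable if and only if \(X^{-1}\) is a function from \(\mathcal{B}(\mathbb{R})\) to \(\mathcal{F}\) (see Fig. 11.1).
Thus any random variable \(X\) induces a measure \(\mu_X\) on the measurable space \((\mathbb{R}, \mathcal{B}(\mathbb{R}))\) defined by
\[ \mu_X(B) = \mathbb{P}\big(X^{-1}(B)\big) \quad \forall B \in \mathcal{B}(\mathbb{R}), \]
where the probability on the right is defined since \(X^{-1}(B) \in \mathcal{F}\). \(\mu_X\) is often called the Law of \(X\); in Williams' book this is denoted by \(\mathcal{L}_X\).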
11.2
Density of a Random Variable
The density of \(X\) (if it exists) is a function \(f_X : \mathbb{R} \to [0, \infty)\) such that
\[ \mu_X(B) = \int_B f_X(x) \, dx \quad \forall B \in \mathcal{B}(\mathbb{R}). \]
Figure 11.1: Illustrating a real-valued random variable \(X\), mapping the set \(\{X \in B\} \subset \Omega\) to \(B \subset \mathbb{R}\).
We then write
\[ d\mu_X(x) = f_X(x) \, dx, \]
where the integral is with respect to the Lebesgue measure on \(\mathbb{R}\). \(f_X\) is the Radon-Nikodym derivative of \(\mu_X\) with respect to the Lebesgue measure. Thus \(X\) has a density if and only if \(\mu_X\) is absolutely continuous with respect to Lebesgue measure, which means that whenever \(B \in \mathcal{B}(\mathbb{R})\) has Lebesgue measure zero, then
\[ \mathbb{P}\{X \in B\} = 0. \]
11.3
Expectation
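Theorem 3.32 (Expectation of a function of \(X\)) Let \(h : \mathbb{R} \to \mathbb{R}\) be given. Then
\[ \mathbb{E}\, h(X) \triangleq \int_\Omega h(X(\omega)) \, d\mathbb{P}(\omega) = \int_{\mathbb{R}} h(x) \, d\mu_X(x) = \int_{\mathbb{R}} h(x) f_X(x) \, dx. \]
Proof: (Sketch). If \(h(x) = 1_B(x)\) for some \(B \subset \mathbb{R}\), then these equations are
\[ \mathbb{E}\, 1_B(X) \triangleq \mathbb{P}\{X \in B\} = \mu_X(B) = \int_B f_X(x) \, dx, \]
which are true by definition. Now use the "standard machine" to get the equations for general \(h\).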
Figure 11.2: Two real-valued random variables \(X, Y\); the pair \((X, Y)\) maps the set \(\{(X, Y) \in C\} \subset \Omega\) to \(C\) in the \((x, y)\)-plane.
11.4
Two random variables
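Let \(X, Y\) be two random variables \(\Omega \to \mathbb{R}\) defined on the space \((\Omega, \mathcal{F}, \mathbb{P})\). Then \(X, Y\) induce a measure on \(\mathcal{B}(\mathbb{R}^2)\) (see Fig. 11.2) called the joint law of \((X, Y)\), defined by
\[ \mu_{X,Y}(C) \triangleq \mathbb{P}\{(X, Y) \in C\} \quad \forall C \in \mathcal{B}(\mathbb{R}^2). \]
The joint density of \((X, Y)\) is a function
\[ f_{X,Y} : \mathbb{R}^2 \to [0, \infty) \]
that satisfies
\[ \mu_{X,Y}(C) = \iint_C f_{X,Y}(x, y) \, dx \, dy \quad \forall C \in \mathcal{B}(\mathbb{R}^2). \]
\(f_{X,Y}\) is the Radon-Nikodym derivative of \(\mu_{X,Y}\) with respect to the Lebesgue measure (area) on \(\mathbb{R}^2\).
We compute the expectation of a function of \(X, Y\) in a manner analogous to the univariate case:
\[ \mathbb{E}\, k(X, Y) \triangleq \int_\Omega k(X(\omega), Y(\omega)) \, d\mathbb{P}(\omega) = \iint_{\mathbb{R}^2} k(x, y) \, d\mu_{X,Y}(x, y) = \iint_{\mathbb{R}^2} k(x, y) f_{X,Y}(x, y) \, dx \, dy. \]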
11.5
Marginal Density
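Suppose \((X, Y)\) has joint density \(f_{X,Y}\). Let \(B \subset \mathbb{R}\) be given. Then
\[ \mu_Y(B) = \mathbb{P}\{Y \in B\} = \mathbb{P}\{(X, Y) \in \mathbb{R} \times B\} = \mu_{X,Y}(\mathbb{R} \times B) = \int_B \int_{\mathbb{R}} f_{X,Y}(x, y) \, dx \, dy = \int_B f_Y(y) \, dy, \]
where
\[ f_Y(y) \triangleq \int_{\mathbb{R}} f_{X,Y}(x, y) \, dx. \]
Therefore, \(f_Y(y)\) is the (marginal) density for \(Y\).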
11.6
Conditional Expectation
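Suppose \((X, Y)\) has joint density \(f_{X,Y}\). Let \(h : \mathbb{R} \to \mathbb{R}\) be given. Recall that \(\mathbb{E}[h(X) \mid Y] \triangleq \mathbb{E}[h(X) \mid \sigma(Y)]\) depends on \(\omega\) through \(Y\), i.e., there is a function \(g(y)\) (\(g\) depending on \(h\)) such that
\[ \mathbb{E}[h(X) \mid Y](\omega) = g(Y(\omega)). \]
How do we determine \(g\)?
We can characterize \(g\) using partial averaging: Recall that \(A \in \sigma(Y) \iff A = \{Y \in B\}\) for some \(B \in \mathcal{B}(\mathbb{R})\). Then the following are equivalent characterizations of \(g\):
\[ \int_A g(Y) \, d\mathbb{P} = \int_A h(X) \, d\mathbb{P} \quad \forall A \in \sigma(Y), \qquad (6.1) \]
\[ \int_\Omega 1_B(Y)\, g(Y) \, d\mathbb{P} = \int_\Omega 1_B(Y)\, h(X) \, d\mathbb{P} \quad \forall B \in \mathcal{B}(\mathbb{R}), \qquad (6.2) \]
\[ \int_{\mathbb{R}} 1_B(y)\, g(y) \, \mu_Y(dy) = \iint_{\mathbb{R}^2} 1_B(y)\, h(x) \, d\mu_{X,Y}(x, y) \quad \forall B \in \mathcal{B}(\mathbb{R}), \qquad (6.3) \]
\[ \int_B g(y) f_Y(y) \, dy = \int_B \int_{\mathbb{R}} h(x) f_{X,Y}(x, y) \, dx \, dy \quad \forall B \in \mathcal{B}(\mathbb{R}). \qquad (6.4) \]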
11.7
Conditional Density
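A function \(f_{X|Y}(x \mid y) : \mathbb{R}^2 \to [0, \infty)\) is called a conditional density for \(X\) given \(Y\) provided that for any function \(h : \mathbb{R} \to \mathbb{R}\):
\[ g(y) = \int_{\mathbb{R}} h(x) f_{X|Y}(x \mid y) \, dx. \qquad (7.1) \]
(Here \(g\) is the function satisfying
\[ \mathbb{E}[h(X) \mid Y] = g(Y), \]
and \(g\) depends on \(h\), but \(f_{X|Y}\) does not.)
Theorem 7.33 If \((X, Y)\) has a joint density \(f_{X,Y}\), then
\[ f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}. \qquad (7.2) \]
Proof: Just verify that \(g\) defined by (7.1) satisfies (6.4): For \(B \in \mathcal{B}(\mathbb{R})\),
\[ \int_B \underbrace{\int_{\mathbb{R}} h(x) f_{X|Y}(x \mid y) \, dx}_{g(y)} \, f_Y(y) \, dy = \int_B \int_{\mathbb{R}} h(x) f_{X,Y}(x, y) \, dx \, dy. \]
Notation 11.1 Let \(g\) be the function satisfying
\[ \mathbb{E}[h(X) \mid Y] = g(Y). \]
The function \(g\) is often written as
\[ g(y) = \mathbb{E}[h(X) \mid Y = y], \]
and (7.1) becomes
\[ \mathbb{E}[h(X) \mid Y = y] = \int_{\mathbb{R}} h(x) f_{X|Y}(x \mid y) \, dx. \]
In conclusion, to determine \(\mathbb{E}[h(X) \mid Y]\) (a function of \(\omega\)), first compute
\[ g(y) = \int_{\mathbb{R}} h(x) f_{X|Y}(x \mid y) \, dx, \]
and then replace the dummy variable \(y\) by the random variable \(Y\):
\[ \mathbb{E}[h(X) \mid Y](\omega) = g(Y(\omega)). \]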
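Example 11.1 (Jointly normal random variables) Given parameters: \(\sigma_1 > 0\), \(\sigma_2 > 0\), \(-1 < \rho < 1\). Let \((X, Y)\) have the joint density
\[ f_{X,Y}(x, y) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \frac{x^2}{\sigma_1^2} - 2\rho \frac{x}{\sigma_1} \frac{y}{\sigma_2} + \frac{y^2}{\sigma_2^2} \right] \right\}. \]
The exponent is
\[ -\frac{1}{2(1 - \rho^2)} \left[ \frac{x^2}{\sigma_1^2} - 2\rho \frac{x}{\sigma_1} \frac{y}{\sigma_2} + \frac{y^2}{\sigma_2^2} \right] \]
\[ = -\frac{1}{2(1 - \rho^2)} \left[ \left( \frac{x}{\sigma_1} - \rho \frac{y}{\sigma_2} \right)^2 + \frac{y^2}{\sigma_2^2} (1 - \rho^2) \right] \]
\[ = -\frac{1}{2(1 - \rho^2)} \frac{1}{\sigma_1^2} \left( x - \frac{\rho \sigma_1}{\sigma_2} y \right)^2 - \frac{1}{2} \frac{y^2}{\sigma_2^2}. \]
We can compute the marginal density of \(Y\) as follows:
\[ f_Y(y) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \int_{-\infty}^{\infty} \exp\left\{ -\frac{1}{2(1 - \rho^2)\sigma_1^2} \left( x - \frac{\rho \sigma_1}{\sigma_2} y \right)^2 \right\} dx \cdot e^{-\frac{y^2}{2\sigma_2^2}} \]
\[ = \frac{1}{2\pi \sigma_2} \int_{-\infty}^{\infty} e^{-\frac{u^2}{2}} \, du \cdot e^{-\frac{y^2}{2\sigma_2^2}} \]
using the substitution \(u = \frac{1}{\sqrt{1 - \rho^2}\, \sigma_1} \left( x - \frac{\rho \sigma_1}{\sigma_2} y \right)\), \(du = \frac{dx}{\sqrt{1 - \rho^2}\, \sigma_1}\),
\[ = \frac{1}{\sqrt{2\pi}\, \sigma_2} e^{-\frac{y^2}{2\sigma_2^2}}. \]
Thus \(Y\) is normal with mean 0 and variance \(\sigma_2^2\).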
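Conditional density. From the expressions
\[ f_{X,Y}(x, y) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)\sigma_1^2} \left( x - \frac{\rho \sigma_1}{\sigma_2} y \right)^2 \right\} e^{-\frac{y^2}{2\sigma_2^2}}, \]
\[ f_Y(y) = \frac{1}{\sqrt{2\pi}\, \sigma_2} e^{-\frac{y^2}{2\sigma_2^2}}, \]
we have
\[ f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} = \frac{1}{\sqrt{2\pi}\, \sigma_1 \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)\sigma_1^2} \left( x - \frac{\rho \sigma_1}{\sigma_2} y \right)^2 \right\}. \]
In the \(x\)-variable, \(f_{X|Y}(x \mid y)\) is a normal density with mean \(\frac{\rho \sigma_1}{\sigma_2} y\) and variance \((1 - \rho^2)\sigma_1^2\). Therefore,
\[ \mathbb{E}[X \mid Y = y] = \int_{-\infty}^{\infty} x f_{X|Y}(x \mid y) \, dx = \frac{\rho \sigma_1}{\sigma_2} y, \]
\[ \mathbb{E}\left[ \left( X - \frac{\rho \sigma_1}{\sigma_2} y \right)^2 \,\Big|\, Y = y \right] = \int_{-\infty}^{\infty} \left( x - \frac{\rho \sigma_1}{\sigma_2} y \right)^2 f_{X|Y}(x \mid y) \, dx = (1 - \rho^2)\sigma_1^2. \]
From the above two formulas we have the formulas
\[ \mathbb{E}[X \mid Y] = \frac{\rho \sigma_1}{\sigma_2} Y, \qquad (7.3) \]
\[ \mathbb{E}\left[ \left( X - \frac{\rho \sigma_1}{\sigma_2} Y \right)^2 \,\Big|\, Y \right] = (1 - \rho^2)\sigma_1^2. \qquad (7.4) \]
Taking expectations in (7.3) and (7.4) yields
\[ \mathbb{E} X = \frac{\rho \sigma_1}{\sigma_2} \mathbb{E} Y = 0, \qquad (7.5) \]
\[ \mathbb{E}\left[ \left( X - \frac{\rho \sigma_1}{\sigma_2} Y \right)^2 \right] = (1 - \rho^2)\sigma_1^2. \qquad (7.6) \]
Based on \(Y\), the best estimator of \(X\) is \(\frac{\rho \sigma_1}{\sigma_2} Y\). This estimator is unbiased (has expected error zero) and the expected square error is \((1 - \rho^2)\sigma_1^2\). No other estimator based on \(Y\) can have a smaller expected square error (Homework problem 2.1).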
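The marginal density, conditional mean and conditional variance derived in Example 11.1 can be checked by direct numerical integration of the joint density. The following is a sketch with illustrative parameter values (`sigma1`, `sigma2`, `rho` and the conditioning point `y` are assumptions); it uses a plain midpoint rule rather than any library quadrature.

```python
import math

sigma1, sigma2, rho = 2.0, 0.5, 0.6   # illustrative parameters (assumptions)

def joint_density(x, y):
    """Bivariate normal density of Example 11.1 (zero means)."""
    q = (x / sigma1) ** 2 - 2 * rho * x * y / (sigma1 * sigma2) + (y / sigma2) ** 2
    norm = 2 * math.pi * sigma1 * sigma2 * math.sqrt(1 - rho ** 2)
    return math.exp(-q / (2 * (1 - rho ** 2))) / norm

def integrate(f, lo=-30.0, hi=30.0, steps=60000):
    """Midpoint rule on [lo, hi]; the integrands below are negligible outside."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

y = 0.7
f_Y = integrate(lambda x: joint_density(x, y))            # marginal density at y
cond_mean = integrate(lambda x: x * joint_density(x, y)) / f_Y
cond_var = integrate(lambda x: (x - cond_mean) ** 2 * joint_density(x, y)) / f_Y

# Each line prints the numerical value next to the closed form from the text.
print(f_Y, math.exp(-y ** 2 / (2 * sigma2 ** 2)) / (math.sqrt(2 * math.pi) * sigma2))
print(cond_mean, rho * sigma1 / sigma2 * y)
print(cond_var, (1 - rho ** 2) * sigma1 ** 2)
```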
11.8
Multivariate Normal Distribution
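Please see Oksendal Appendix A.
Let \(\mathbf{X}\) denote the column vector of random variables \((X_1, X_2, \dots, X_n)^T\), and \(\mathbf{x}\) the corresponding column vector of values \((x_1, x_2, \dots, x_n)^T\). \(\mathbf{X}\) has a multivariate normal distribution if and only if the random variables have the joint density
\[ f_{\mathbf{X}}(\mathbf{x}) = \frac{\sqrt{\det A}}{(2\pi)^{n/2}} \exp\left\{ -\tfrac12 (\mathbf{x} - \mu)^T A\, (\mathbf{x} - \mu) \right\}. \]
Here,
\[ \mu \triangleq (\mu_1, \dots, \mu_n)^T = \mathbb{E}\mathbf{X} \triangleq (\mathbb{E}X_1, \dots, \mathbb{E}X_n)^T, \]
and \(A\) is an \(n \times n\) nonsingular matrix. \(A^{-1}\) is the covariance matrix
\[ A^{-1} = \mathbb{E}\left[ (\mathbf{X} - \mu)(\mathbf{X} - \mu)^T \right], \]
i.e. the \((i, j)\)th element of \(A^{-1}\) is \(\mathbb{E}(X_i - \mu_i)(X_j - \mu_j)\). The random variables in \(\mathbf{X}\) are independent if and only if \(A^{-1}\) is diagonal, i.e.,
\[ A^{-1} = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2), \]
where \(\sigma_j^2 = \mathbb{E}(X_j - \mu_j)^2\) is the variance of \(X_j\).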
11.9
Bivariate normal distribution
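Take \(n = 2\) in the above definitions, and let
\[ \rho \triangleq \frac{\mathbb{E}(X_1 - \mu_1)(X_2 - \mu_2)}{\sigma_1 \sigma_2}. \]
Thus,
\[ A^{-1} = \begin{bmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{bmatrix}, \qquad A = \begin{bmatrix} \dfrac{1}{\sigma_1^2 (1 - \rho^2)} & -\dfrac{\rho}{\sigma_1 \sigma_2 (1 - \rho^2)} \\ -\dfrac{\rho}{\sigma_1 \sigma_2 (1 - \rho^2)} & \dfrac{1}{\sigma_2^2 (1 - \rho^2)} \end{bmatrix}, \]
\[ \sqrt{\det A} = \frac{1}{\sigma_1 \sigma_2 \sqrt{1 - \rho^2}}, \]
and we have the formula from Example 11.1, adjusted to account for the possibly non-zero expectations:
\[ f_{X_1, X_2}(x_1, x_2) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x_1 - \mu_1)^2}{\sigma_1^2} - \frac{2\rho (x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1 \sigma_2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2} \right] \right\}. \]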
11.10
MGF of jointly normal random variables
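Let \(\mathbf{u} = (u_1, u_2, \dots, u_n)^T\) denote a column vector with components in \(\mathbb{R}\), and let \(\mathbf{X}\) have a multivariate normal distribution with covariance matrix \(A^{-1}\) and mean vector \(\mu\). Then the moment generating function is given by
\[ \mathbb{E}\, e^{\mathbf{u}^T \mathbf{X}} = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{\mathbf{u}^T \mathbf{x}} f_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) \, dx_1 \cdots dx_n = \exp\left\{ \tfrac12 \mathbf{u}^T A^{-1} \mathbf{u} + \mathbf{u}^T \mu \right\}. \]
If any \(n\) random variables \(X_1, X_2, \dots, X_n\) have this moment generating function, then they are jointly normal, and we can read out the means and covariances. The random variables are jointly normal and independent if and only if for any real column vector \(\mathbf{u} = (u_1, \dots, u_n)^T\)
\[ \mathbb{E}\, e^{\mathbf{u}^T \mathbf{X}} \triangleq \mathbb{E} \exp\left\{ \sum_{j=1}^{n} u_j X_j \right\} = \exp\left\{ \sum_{j=1}^{n} \left[ \tfrac12 \sigma_j^2 u_j^2 + u_j \mu_j \right] \right\}. \]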
Chapter 12
Semi-Continuous Models
12.1
Discrete-time Brownian Motion
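Let \(\{Y_j\}_{j=1}^{n}\) be a collection of independent, standard normal random variables defined on \((\Omega, \mathcal{F}, \mathbb{P})\), where \(\mathbb{P}\) is the market measure. As before we denote the column vector \((Y_1, \dots, Y_n)^T\) by \(\mathbf{Y}\). We therefore have for any real column vector \(\mathbf{u} = (u_1, \dots, u_n)^T\),
\[ \mathbb{E}\, e^{\mathbf{u}^T \mathbf{Y}} = \mathbb{E} \exp\left\{ \sum_{j=1}^{n} u_j Y_j \right\} = \exp\left\{ \sum_{j=1}^{n} \tfrac12 u_j^2 \right\}. \]
Define the discrete-time Brownian motion (see Fig. 12.1):
\[ B_0 = 0, \qquad B_k = \sum_{j=1}^{k} Y_j, \quad k = 1, \dots, n. \]
If we know \(Y_1, Y_2, \dots, Y_k\), then we know \(B_1, B_2, \dots, B_k\). Conversely, if we know \(B_1, B_2, \dots, B_k\), then we know \(Y_1 = B_1\), \(Y_2 = B_2 - B_1\), ..., \(Y_k = B_k - B_{k-1}\). Define the filtration
\[ \mathcal{F}_0 = \{\emptyset, \Omega\}, \]
\[ \mathcal{F}_k = \sigma(Y_1, Y_2, \dots, Y_k) = \sigma(B_1, B_2, \dots, B_k), \quad k = 1, \dots, n. \]
Theorem 1.34 \(\{B_k\}_{k=0}^{n}\) is a martingale (under \(\mathbb{P}\)).
Proof:
\[ \mathbb{E}[B_{k+1} \mid \mathcal{F}_k] = \mathbb{E}[Y_{k+1} + B_k \mid \mathcal{F}_k] = \mathbb{E} Y_{k+1} + B_k = B_k. \]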
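Figure 12.1: Discrete-time Brownian motion \(B_k\) plotted against \(k = 0, 1, \dots, 4\), with increments \(Y_1, \dots, Y_4\).
Theorem 1.35 \(\{B_k\}_{k=0}^{n}\) is a Markov process.
Proof: Note that
\[ \mathbb{E}[h(B_{k+1}) \mid \mathcal{F}_k] = \mathbb{E}[h(Y_{k+1} + B_k) \mid \mathcal{F}_k]. \]
Use the Independence Lemma. Define
\[ g(b) = \mathbb{E}\, h(Y_{k+1} + b) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} h(y + b)\, e^{-\frac12 y^2} \, dy. \]
Then
\[ \mathbb{E}[h(Y_{k+1} + B_k) \mid \mathcal{F}_k] = g(B_k), \]
which is a function of \(B_k\) alone.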
12.2
The Stock Price Process
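Given parameters:
- \(\mu \in \mathbb{R}\), the mean rate of return.
- \(\sigma > 0\), the volatility.
- \(S_0 > 0\), the initial stock price.
The stock price process is then given by
\[ S_k = S_0 \exp\left\{ \sigma B_k + (\mu - \tfrac12 \sigma^2) k \right\}, \quad k = 0, \dots, n. \]
Note that
\[ S_{k+1} = S_k \exp\left\{ \sigma Y_{k+1} + (\mu - \tfrac12 \sigma^2) \right\}, \]
\[ \mathbb{E}[S_{k+1} \mid \mathcal{F}_k] = S_k\, \mathbb{E}\left[ e^{\sigma Y_{k+1}} \mid \mathcal{F}_k \right] e^{\mu - \frac12 \sigma^2} = S_k\, e^{\frac12 \sigma^2} e^{\mu - \frac12 \sigma^2} = e^{\mu} S_k. \]
Thus
\[ \mu = \log \frac{\mathbb{E}[S_{k+1} \mid \mathcal{F}_k]}{S_k} = \log \mathbb{E}\left[ \frac{S_{k+1}}{S_k} \,\Big|\, \mathcal{F}_k \right], \]
and
\[ \mathrm{var}\left( \log \frac{S_{k+1}}{S_k} \right) = \mathrm{var}\left( \sigma Y_{k+1} + (\mu - \tfrac12 \sigma^2) \right) = \sigma^2. \]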
k+1 + , 2
12.3
Remainder of the Market
The other processes in the market are defined as follows.
Money market process:
Mk = erk; k = 0;1;::: ;n:
Portfolio process:
0;1;::: ;n 1;
,
Each k is k-measurable.
F
Wealth process:
X0 given, nonrandom.
Xk+1 = kSk+1 + erXk kSk
,
= kSk+1 erSk + erXk
,
Each Xk is k-measurable.
F
Discounted wealth process:
Xk+1
S
M
k+1
Sk
k =
+ Xk
+1
k Mk+1 , Mk
Mk:
12.4
Risk-Neutral Measure
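Definition 12.1 Let \(\widetilde{\mathbb{P}}\) be a probability measure on \((\Omega, \mathcal{F})\), equivalent to the market measure \(\mathbb{P}\). If \(\left\{ \frac{S_k}{M_k} \right\}_{k=0}^{n}\) is a martingale under \(\widetilde{\mathbb{P}}\), we say that \(\widetilde{\mathbb{P}}\) is a risk-neutral measure.
Theorem 4.36 If \(\widetilde{\mathbb{P}}\) is a risk-neutral measure, then every discounted wealth process \(\left\{ \frac{X_k}{M_k} \right\}_{k=0}^{n}\) is a martingale under \(\widetilde{\mathbb{P}}\), regardless of the portfolio process used to generate it.
Proof:
\[ \widetilde{\mathbb{E}}\left[ \frac{X_{k+1}}{M_{k+1}} \,\Big|\, \mathcal{F}_k \right] = \widetilde{\mathbb{E}}\left[ \Delta_k \left( \frac{S_{k+1}}{M_{k+1}} - \frac{S_k}{M_k} \right) + \frac{X_k}{M_k} \,\Big|\, \mathcal{F}_k \right] \]
\[ = \Delta_k \left( \widetilde{\mathbb{E}}\left[ \frac{S_{k+1}}{M_{k+1}} \,\Big|\, \mathcal{F}_k \right] - \frac{S_k}{M_k} \right) + \frac{X_k}{M_k} = \frac{X_k}{M_k}. \]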
12.5
Risk-Neutral Pricing
Let \(V_n\) be the payoff at time \(n\), and say it is \(\mathcal{F}_n\)-measurable. Note that \(V_n\) may be path-dependent.
Hedging a short position:
- Sell the simple European derivative security \(V_n\).
- Receive \(X_0\) at time 0.
- Construct a portfolio process \(\Delta_0, \dots, \Delta_{n-1}\) which starts with \(X_0\) and ends with \(X_n = V_n\).
- If there is a risk-neutral measure \(\widetilde{\mathbb{P}}\), then
\[ X_0 = \widetilde{\mathbb{E}}\, \frac{X_n}{M_n} = \widetilde{\mathbb{E}}\, \frac{V_n}{M_n}. \]
Remark 12.1 Hedging in this “semi-continuous” model is usually not possible because there are
not enough trading dates. This difficulty will disappear when we go to the fully continuous model.
12.6
Arbitrage
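Definition 12.2 An arbitrage is a portfolio which starts with \(X_0 = 0\) and ends with \(X_n\) satisfying
\[ \mathbb{P}(X_n \ge 0) = 1, \qquad \mathbb{P}(X_n > 0) > 0. \]
(\(\mathbb{P}\) here is the market measure.)
Theorem 6.37 (Fundamental Theorem of Asset Pricing: Easy part) If there is a risk-neutral measure, then there is no arbitrage.
Proof: Let \(\widetilde{\mathbb{P}}\) be a risk-neutral measure, let \(X_0 = 0\), and let \(X_n\) be the final wealth corresponding to any portfolio process. Since \(\left\{ \frac{X_k}{M_k} \right\}_{k=0}^{n}\) is a martingale under \(\widetilde{\mathbb{P}}\),
\[ \widetilde{\mathbb{E}}\, \frac{X_n}{M_n} = \widetilde{\mathbb{E}}\, \frac{X_0}{M_0} = 0. \qquad (6.1) \]
Suppose \(\mathbb{P}(X_n \ge 0) = 1\). We have
\[ \mathbb{P}(X_n \ge 0) = 1 \implies \mathbb{P}(X_n < 0) = 0 \implies \widetilde{\mathbb{P}}(X_n < 0) = 0 \implies \widetilde{\mathbb{P}}(X_n \ge 0) = 1. \qquad (6.2) \]
(6.1) and (6.2) imply \(\widetilde{\mathbb{P}}(X_n = 0) = 1\). We have
\[ \widetilde{\mathbb{P}}(X_n = 0) = 1 \implies \widetilde{\mathbb{P}}(X_n > 0) = 0 \implies \mathbb{P}(X_n > 0) = 0. \]
This is not an arbitrage.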
12.7
Stalking the Risk-Neutral Measure
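Recall that
- \(Y_1, Y_2, \dots, Y_n\) are independent, standard normal random variables on some probability space \((\Omega, \mathcal{F}, \mathbb{P})\).
- \(S_k = S_0 \exp\left\{ \sigma B_k + (\mu - \tfrac12 \sigma^2) k \right\}\).
- \(S_{k+1} = S_0 \exp\left\{ \sigma (B_k + Y_{k+1}) + (\mu - \tfrac12 \sigma^2)(k + 1) \right\} = S_k \exp\left\{ \sigma Y_{k+1} + (\mu - \tfrac12 \sigma^2) \right\}\).
Therefore,
\[ \frac{S_{k+1}}{M_{k+1}} = \frac{S_k}{M_k} \exp\left\{ \sigma Y_{k+1} + (\mu - r - \tfrac12 \sigma^2) \right\}, \]
\[ \mathbb{E}\left[ \frac{S_{k+1}}{M_{k+1}} \,\Big|\, \mathcal{F}_k \right] = \frac{S_k}{M_k}\, \mathbb{E}\left[ \exp\{\sigma Y_{k+1}\} \mid \mathcal{F}_k \right] \exp\{\mu - r - \tfrac12 \sigma^2\} = \frac{S_k}{M_k} \exp\{\tfrac12 \sigma^2\} \exp\{\mu - r - \tfrac12 \sigma^2\} = e^{\mu - r} \frac{S_k}{M_k}. \]
If \(\mu = r\), the market measure is risk neutral. If \(\mu \ne r\), we must seek further.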
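\[ \frac{S_{k+1}}{M_{k+1}} = \frac{S_k}{M_k} \exp\left\{ \sigma Y_{k+1} + (\mu - r - \tfrac12 \sigma^2) \right\} = \frac{S_k}{M_k} \exp\left\{ \sigma \left( Y_{k+1} + \frac{\mu - r}{\sigma} \right) - \tfrac12 \sigma^2 \right\} = \frac{S_k}{M_k} \exp\left\{ \sigma \tilde Y_{k+1} - \tfrac12 \sigma^2 \right\}, \]
where
\[ \tilde Y_{k+1} = Y_{k+1} + \frac{\mu - r}{\sigma}. \]
The quantity \(\frac{\mu - r}{\sigma}\) is denoted \(\theta\) and is called the market price of risk.
We want a probability measure \(\widetilde{\mathbb{P}}\) under which \(\tilde Y_1, \dots, \tilde Y_n\) are independent, standard normal random variables. Then we would have
\[ \widetilde{\mathbb{E}}\left[ \frac{S_{k+1}}{M_{k+1}} \,\Big|\, \mathcal{F}_k \right] = \frac{S_k}{M_k}\, \widetilde{\mathbb{E}}\left[ \exp\{\sigma \tilde Y_{k+1}\} \mid \mathcal{F}_k \right] \exp\{-\tfrac12 \sigma^2\} = \frac{S_k}{M_k} \exp\{\tfrac12 \sigma^2\} \exp\{-\tfrac12 \sigma^2\} = \frac{S_k}{M_k}. \]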
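Cameron-Martin-Girsanov's Idea: Define the random variable
\[ Z = \exp\left[ \sum_{j=1}^{n} \left( -\theta Y_j - \tfrac12 \theta^2 \right) \right]. \]
Properties of \(Z\):
- \(Z \ge 0\).
- \[ \mathbb{E} Z = \mathbb{E} \exp\left\{ \sum_{j=1}^{n} (-\theta Y_j) \right\} \cdot \exp\left( -\frac{n}{2} \theta^2 \right) = \exp\left( \frac{n}{2} \theta^2 \right) \cdot \exp\left( -\frac{n}{2} \theta^2 \right) = 1. \]
Define
\[ \widetilde{\mathbb{P}}(A) = \int_A Z \, d\mathbb{P} \quad \forall A \in \mathcal{F}. \]
Then \(\widetilde{\mathbb{P}}(A) \ge 0\) for all \(A \in \mathcal{F}\) and
\[ \widetilde{\mathbb{P}}(\Omega) = \mathbb{E} Z = 1. \]
In other words, \(\widetilde{\mathbb{P}}\) is a probability measure.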
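We show that \(\widetilde{\mathbb{P}}\) is a risk-neutral measure. For this, it suffices to show that
\[ \tilde Y_1 = Y_1 + \theta, \ \dots, \ \tilde Y_n = Y_n + \theta \]
are independent, standard normal under \(\widetilde{\mathbb{P}}\).
Verification:
- \(Y_1, Y_2, \dots, Y_n\): independent, standard normal under \(\mathbb{P}\), and
\[ \mathbb{E} \exp\left[ \sum_{j=1}^{n} u_j Y_j \right] = \exp\left[ \sum_{j=1}^{n} \tfrac12 u_j^2 \right]. \]
- \(\tilde Y_1 = Y_1 + \theta, \ \dots, \ \tilde Y_n = Y_n + \theta\).
- \(Z > 0\) almost surely.
- \(Z = \exp\left[ \sum_{j=1}^{n} (-\theta Y_j - \tfrac12 \theta^2) \right]\),
\[ \widetilde{\mathbb{P}}(A) = \int_A Z \, d\mathbb{P} \quad \forall A \in \mathcal{F}, \]
\[ \widetilde{\mathbb{E}} X = \mathbb{E}(XZ) \quad \text{for every random variable } X. \]
- Compute the moment generating function of \((\tilde Y_1, \dots, \tilde Y_n)\) under \(\widetilde{\mathbb{P}}\):
\[ \widetilde{\mathbb{E}} \exp\left[ \sum_{j=1}^{n} u_j \tilde Y_j \right] = \mathbb{E} \exp\left[ \sum_{j=1}^{n} u_j (Y_j + \theta) + \sum_{j=1}^{n} (-\theta Y_j - \tfrac12 \theta^2) \right] \]
\[ = \mathbb{E} \exp\left[ \sum_{j=1}^{n} (u_j - \theta) Y_j \right] \cdot \exp\left[ \sum_{j=1}^{n} (u_j \theta - \tfrac12 \theta^2) \right] \]
\[ = \exp\left[ \sum_{j=1}^{n} \tfrac12 (u_j - \theta)^2 \right] \cdot \exp\left[ \sum_{j=1}^{n} (u_j \theta - \tfrac12 \theta^2) \right] \]
\[ = \exp\left[ \sum_{j=1}^{n} \left( (\tfrac12 u_j^2 - u_j \theta + \tfrac12 \theta^2) + (u_j \theta - \tfrac12 \theta^2) \right) \right] \]
\[ = \exp\left[ \sum_{j=1}^{n} \tfrac12 u_j^2 \right]. \]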
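The change of measure can be sanity-checked numerically in the one-step case \(n = 1\). The sketch below integrates against the standard normal density to confirm that \(\mathbb{E} Z = 1\) and that \(\tilde Y = Y + \theta\) has mean 0, second moment 1 and the standard normal moment generating function under \(\widetilde{\mathbb{P}}\). The value of `theta`, the test point `u` and the helper `expect` are illustrative assumptions.

```python
import math

theta = 0.8   # illustrative market price of risk (assumption)

def phi(y):
    """Standard normal density, the law of Y under the market measure."""
    return math.exp(-y * y / 2) / math.sqrt(2 * math.pi)

def Z(y):
    """Radon-Nikodym derivative for n = 1, as a function of Y = y."""
    return math.exp(-theta * y - theta ** 2 / 2)

def expect(f, lo=-15.0, hi=15.0, steps=30000):
    """E[f(Y)] under the market measure, by the midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * h
        total += f(y) * phi(y)
    return total * h

EZ = expect(Z)
# Expectations under the new measure use E~[X] = E[X Z].
mean_tilde = expect(lambda y: (y + theta) * Z(y))
second_tilde = expect(lambda y: (y + theta) ** 2 * Z(y))
u = 0.5
mgf_tilde = expect(lambda y: math.exp(u * (y + theta)) * Z(y))

print(EZ, mean_tilde, second_tilde, mgf_tilde, math.exp(u * u / 2))
```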
12.8
Pricing a European Call
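Stock price at time \(n\) is
\[ S_n = S_0 \exp\left\{ \sigma B_n + (\mu - \tfrac12 \sigma^2) n \right\} \]
\[ = S_0 \exp\left\{ \sigma \sum_{j=1}^{n} Y_j + (\mu - \tfrac12 \sigma^2) n \right\} \]
\[ = S_0 \exp\left\{ \sigma \sum_{j=1}^{n} \left( Y_j + \frac{\mu - r}{\sigma} \right) - (\mu - r) n + (\mu - \tfrac12 \sigma^2) n \right\} \]
\[ = S_0 \exp\left\{ \sigma \sum_{j=1}^{n} \tilde Y_j + (r - \tfrac12 \sigma^2) n \right\}. \]
Payoff at time \(n\) is \((S_n - K)^+\). Price at time zero is
\[ \widetilde{\mathbb{E}}\, \frac{(S_n - K)^+}{M_n} = \widetilde{\mathbb{E}}\left[ e^{-rn} \left( S_0 \exp\left\{ \sigma \sum_{j=1}^{n} \tilde Y_j + (r - \tfrac12 \sigma^2) n \right\} - K \right)^+ \right] \]
\[ = \int_{-\infty}^{\infty} e^{-rn} \left( S_0 \exp\left\{ \sigma b + (r - \tfrac12 \sigma^2) n \right\} - K \right)^+ \cdot \frac{1}{\sqrt{2\pi n}}\, e^{-\frac{b^2}{2n}} \, db, \]
since \(\sum_{j=1}^{n} \tilde Y_j\) is normal with mean 0 and variance \(n\) under \(\widetilde{\mathbb{P}}\).
This is the Black-Scholes price. It does not depend on \(\mu\).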
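To see that the integral above is indeed the Black-Scholes price, one can evaluate it numerically and compare it with the standard closed-form call formula for maturity \(n\). This is a sketch with illustrative inputs (`S0`, `K`, `r`, `sigma`, `n` are assumptions); the closed form used in `black_scholes_call` is the usual textbook expression, which these notes do not state at this point.

```python
import math

S0, K, r, sigma, n = 100.0, 105.0, 0.01, 0.2, 4   # illustrative values

def price_by_integration(steps=40000, width=12.0):
    """Midpoint rule for the integral over b, truncated at +-width*sqrt(n)."""
    lo, hi = -width * math.sqrt(n), width * math.sqrt(n)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        b = lo + (i + 0.5) * h
        s_n = S0 * math.exp(sigma * b + (r - 0.5 * sigma ** 2) * n)
        density = math.exp(-b * b / (2 * n)) / math.sqrt(2 * math.pi * n)
        total += math.exp(-r * n) * max(s_n - K, 0.0) * density
    return total * h

def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call():
    """Closed-form call price with maturity n, rate r and volatility sigma."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * n) / (sigma * math.sqrt(n))
    d2 = d1 - sigma * math.sqrt(n)
    return S0 * norm_cdf(d1) - K * math.exp(-r * n) * norm_cdf(d2)

print(price_by_integration(), black_scholes_call())
```

Note that \(\mu\) appears nowhere in either function, which mirrors the remark that the price does not depend on the mean rate of return.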
Chapter 13
Brownian Motion
13.1
Symmetric Random Walk
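Toss a fair coin infinitely many times. Define
\[ X_j(\omega) = \begin{cases} 1 & \text{if } \omega_j = H, \\ -1 & \text{if } \omega_j = T. \end{cases} \]
Set
\[ M_0 = 0, \qquad M_k = \sum_{j=1}^{k} X_j, \quad k \ge 1. \]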
13.2
The Law of Large Numbers
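We will use the method of moment generating functions to derive the Law of Large Numbers:
Theorem 2.38 (Law of Large Numbers)
\[ \frac{1}{k} M_k \to 0 \quad \text{almost surely, as } k \to \infty. \]
Proof:
\[ \varphi_k(u) = \mathbb{E} \exp\left\{ \frac{u}{k} M_k \right\} = \mathbb{E} \exp\left\{ \sum_{j=1}^{k} \frac{u}{k} X_j \right\} \quad \text{(Def. of } M_k\text{)} \]
\[ = \prod_{j=1}^{k} \mathbb{E} \exp\left\{ \frac{u}{k} X_j \right\} \quad \text{(Independence of the } X_j\text{'s)} \]
\[ = \left( \tfrac12 e^{u/k} + \tfrac12 e^{-u/k} \right)^k, \]
which implies,
\[ \log \varphi_k(u) = k \log\left( \tfrac12 e^{u/k} + \tfrac12 e^{-u/k} \right). \]
Let \(x = \frac{1}{k}\). Then
\[ \lim_{k \to \infty} \log \varphi_k(u) = \lim_{x \to 0} \frac{\log\left( \tfrac12 e^{ux} + \tfrac12 e^{-ux} \right)}{x} = \lim_{x \to 0} \frac{\tfrac{u}{2} e^{ux} - \tfrac{u}{2} e^{-ux}}{\tfrac12 e^{ux} + \tfrac12 e^{-ux}} \quad \text{(L'Hôpital's Rule)} \quad = 0. \]
Therefore,
\[ \lim_{k \to \infty} \varphi_k(u) = e^0 = 1, \]
which is the m.g.f. for the constant 0.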
13.3
Central Limit Theorem
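We use the method of moment generating functions to prove the Central Limit Theorem.
Theorem 3.39 (Central Limit Theorem)
\[ \frac{1}{\sqrt{k}} M_k \to \text{Standard normal, as } k \to \infty. \]
Proof:
\[ \varphi_k(u) = \mathbb{E} \exp\left\{ \frac{u}{\sqrt{k}} M_k \right\} = \left( \tfrac12 e^{u/\sqrt{k}} + \tfrac12 e^{-u/\sqrt{k}} \right)^k, \]
so that,
\[ \log \varphi_k(u) = k \log\left( \tfrac12 e^{u/\sqrt{k}} + \tfrac12 e^{-u/\sqrt{k}} \right). \]
Let \(x = \frac{1}{\sqrt{k}}\). Then
\[ \lim_{k \to \infty} \log \varphi_k(u) = \lim_{x \to 0} \frac{\log\left( \tfrac12 e^{ux} + \tfrac12 e^{-ux} \right)}{x^2} \]
\[ = \lim_{x \to 0} \frac{\tfrac{u}{2} e^{ux} - \tfrac{u}{2} e^{-ux}}{2x \left( \tfrac12 e^{ux} + \tfrac12 e^{-ux} \right)} \quad \text{(L'Hôpital's Rule)} \]
\[ = \lim_{x \to 0} \frac{1}{\tfrac12 e^{ux} + \tfrac12 e^{-ux}} \cdot \lim_{x \to 0} \frac{\tfrac{u}{2} e^{ux} - \tfrac{u}{2} e^{-ux}}{2x} \]
\[ = \lim_{x \to 0} \frac{\tfrac{u}{2} e^{ux} - \tfrac{u}{2} e^{-ux}}{2x} \]
\[ = \lim_{x \to 0} \frac{\tfrac{u^2}{2} e^{ux} + \tfrac{u^2}{2} e^{-ux}}{2} \quad \text{(L'Hôpital's Rule)} \]
\[ = \tfrac12 u^2. \]
Therefore,
\[ \lim_{k \to \infty} \varphi_k(u) = e^{\frac12 u^2}, \]
which is the m.g.f. for a standard normal random variable.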
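The two limits computed in Sections 13.2 and 13.3 are easy to watch numerically, because \(\tfrac12 e^{a} + \tfrac12 e^{-a} = \cosh a\). The following is a small sketch; the function names and the test value `u = 1.5` are illustrative.

```python
import math

def log_mgf_lln(u, k):
    """log E exp(u * M_k / k) = k * log cosh(u / k)."""
    return k * math.log(math.cosh(u / k))

def log_mgf_clt(u, k):
    """log E exp(u * M_k / sqrt(k)) = k * log cosh(u / sqrt(k))."""
    return k * math.log(math.cosh(u / math.sqrt(k)))

u = 1.5
for k in (10, 1000, 100000):
    # The first column tends to 0, the second to u^2 / 2.
    print(k, log_mgf_lln(u, k), log_mgf_clt(u, k), u * u / 2)
```

With the \(1/k\) scaling the log-m.g.f. collapses to 0, while the \(1/\sqrt{k}\) scaling retains exactly the quadratic term \(u^2/2\).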
13.4
Brownian Motion as a Limit of Random Walks
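Let \(n\) be a positive integer. If \(t \ge 0\) is of the form \(\frac{k}{n}\), then set
\[ B^{(n)}(t) = \frac{1}{\sqrt{n}} M_{tn} = \frac{1}{\sqrt{n}} M_k. \]
If \(t \ge 0\) is not of the form \(\frac{k}{n}\), then define \(B^{(n)}(t)\) by linear interpolation (see Fig. 13.1).
Figure 13.1: Linear interpolation between \(k/n\) and \((k+1)/n\) to define \(B^{(n)}(t)\).
Here are some properties of \(B^{(100)}(t)\):
Properties of \(B^{(100)}(1)\):
\[ B^{(100)}(1) = \frac{1}{10} \sum_{j=1}^{100} X_j \quad \text{(Approximately normal)} \]
\[ \mathbb{E} B^{(100)}(1) = \frac{1}{10} \sum_{j=1}^{100} \mathbb{E} X_j = 0. \]
\[ \mathrm{var}\big(B^{(100)}(1)\big) = \frac{1}{100} \sum_{j=1}^{100} \mathrm{var}(X_j) = 1. \]
Properties of \(B^{(100)}(2)\):
\[ B^{(100)}(2) = \frac{1}{10} \sum_{j=1}^{200} X_j \quad \text{(Approximately normal)} \]
\[ \mathbb{E} B^{(100)}(2) = 0. \]
\[ \mathrm{var}\big(B^{(100)}(2)\big) = 2. \]
Also note that:
- \(B^{(100)}(1)\) and \(B^{(100)}(2) - B^{(100)}(1)\) are independent.
- \(B^{(100)}(t)\) is a continuous function of \(t\).
To get Brownian motion, let \(n \to \infty\) in \(B^{(n)}(t)\), \(t \ge 0\).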
13.5
Brownian Motion
(Please refer to Oksendal, Chapter 2.)
Figure 13.2: Continuous-time Brownian Motion: a path \(t \mapsto B(t) = B(t, \omega)\) for a fixed \(\omega\) in \((\Omega, \mathcal{F}, \mathbb{P})\).
A random variable \(B(t)\) (see Fig. 13.2) is called a Brownian Motion if it satisfies the following properties:
1. \(B(0) = 0\),
2. \(B(t)\) is a continuous function of \(t\);
3. \(B\) has independent, normally distributed increments: If
\[ 0 = t_0 < t_1 < t_2 < \dots < t_n \]
and
\[ Y_1 = B(t_1) - B(t_0), \quad Y_2 = B(t_2) - B(t_1), \quad \dots, \quad Y_n = B(t_n) - B(t_{n-1}), \]
then
- \(Y_1, Y_2, \dots, Y_n\) are independent,
- \(\mathbb{E} Y_j = 0 \ \ \forall j\),
- \(\mathrm{var}(Y_j) = t_j - t_{j-1} \ \ \forall j\).
13.6
Covariance of Brownian Motion
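Let \(0 \le s \le t\) be given. Then \(B(s)\) and \(B(t) - B(s)\) are independent, so \(B(s)\) and \(B(t) = (B(t) - B(s)) + B(s)\) are jointly normal. Moreover,
\[ \mathbb{E} B(s) = 0, \quad \mathrm{var}(B(s)) = s, \]
\[ \mathbb{E} B(t) = 0, \quad \mathrm{var}(B(t)) = t, \]
\[ \mathbb{E} B(s) B(t) = \mathbb{E} B(s)\big[ (B(t) - B(s)) + B(s) \big] = \underbrace{\mathbb{E} B(s)(B(t) - B(s))}_{0} + \underbrace{\mathbb{E} B^2(s)}_{s} = s. \]
Thus for any \(s \ge 0\), \(t \ge 0\) (not necessarily \(s \le t\)), we have
\[ \mathbb{E} B(s) B(t) = s \wedge t. \]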
13.7
Finite-Dimensional Distributions of Brownian Motion
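Let
\[ 0 < t_1 < t_2 < \dots < t_n \]
be given. Then
\[ (B(t_1), B(t_2), \dots, B(t_n)) \]
is jointly normal with covariance matrix
\[ C = \begin{bmatrix} \mathbb{E} B^2(t_1) & \mathbb{E} B(t_1) B(t_2) & \dots & \mathbb{E} B(t_1) B(t_n) \\ \mathbb{E} B(t_2) B(t_1) & \mathbb{E} B^2(t_2) & \dots & \mathbb{E} B(t_2) B(t_n) \\ \vdots & \vdots & \ddots & \vdots \\ \mathbb{E} B(t_n) B(t_1) & \mathbb{E} B(t_n) B(t_2) & \dots & \mathbb{E} B^2(t_n) \end{bmatrix} = \begin{bmatrix} t_1 & t_1 & \dots & t_1 \\ t_1 & t_2 & \dots & t_2 \\ \vdots & \vdots & \ddots & \vdots \\ t_1 & t_2 & \dots & t_n \end{bmatrix}. \]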
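The covariance matrix above has entries \(t_i \wedge t_j\). A short sketch shows that the same matrix arises by summing the variances of the independent increments shared by \(B(t_i)\) and \(B(t_j)\); the function names and sample times are illustrative.

```python
def brownian_covariance(times):
    """Covariance matrix C[i][j] = min(t_i, t_j) of (B(t_1), ..., B(t_n))."""
    return [[min(s, t) for t in times] for s in times]

def covariance_from_increments(times):
    """Same matrix, built from independent increments with variances t_j - t_{j-1}.

    B(t_i) is the sum of the first i increments, so Cov(B(t_i), B(t_j)) is the
    sum of the variances of the increments the two sums have in common.
    """
    variances = [t - s for s, t in zip([0.0] + list(times[:-1]), times)]
    n = len(times)
    return [[sum(variances[: min(i, j) + 1]) for j in range(n)] for i in range(n)]

times = [0.5, 1.0, 2.5]
print(brownian_covariance(times))
print(covariance_from_increments(times))
```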
13.8
Filtration generated by a Brownian Motion
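\(\{\mathcal{F}(t)\}_{t \ge 0}\)
Required properties:
- For each \(t\), \(B(t)\) is \(\mathcal{F}(t)\)-measurable,
- For each \(t\) and for \(t \le t_1 < t_2 < \dots < t_n\), the Brownian motion increments
\[ B(t_1) - B(t), \quad B(t_2) - B(t_1), \quad \dots, \quad B(t_n) - B(t_{n-1}) \]
are independent of \(\mathcal{F}(t)\).
Here is one way to construct \(\mathcal{F}(t)\). First fix \(t\). Let \(s \in [0, t]\) and \(C \in \mathcal{B}(\mathbb{R})\) be given. Put the set
\[ \{B(s) \in C\} = \{\omega : B(s, \omega) \in C\} \]
in \(\mathcal{F}(t)\). Do this for all possible numbers \(s \in [0, t]\) and \(C \in \mathcal{B}(\mathbb{R})\). Then put in every other set required by the \(\sigma\)-algebra properties.
This \(\mathcal{F}(t)\) contains exactly the information learned by observing the Brownian motion up to time \(t\). \(\{\mathcal{F}(t)\}_{t \ge 0}\) is called the filtration generated by the Brownian motion.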
13.9
Martingale Property
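Theorem 9.40 Brownian motion is a martingale.
Proof: Let \(0 \le s \le t\) be given. Then
\[ \mathbb{E}[B(t) \mid \mathcal{F}(s)] = \mathbb{E}[(B(t) - B(s)) + B(s) \mid \mathcal{F}(s)] = \mathbb{E}[B(t) - B(s)] + B(s) = B(s). \]
Theorem 9.41 Let \(\theta \in \mathbb{R}\) be given. Then
\[ Z(t) = \exp\left\{ -\theta B(t) - \tfrac12 \theta^2 t \right\} \]
is a martingale.
Proof: Let \(0 \le s \le t\) be given. Then
\[ \mathbb{E}[Z(t) \mid \mathcal{F}(s)] = \mathbb{E}\left[ \exp\left\{ -\theta (B(t) - B(s) + B(s)) - \tfrac12 \theta^2 ((t - s) + s) \right\} \,\Big|\, \mathcal{F}(s) \right] \]
\[ = \mathbb{E}\left[ Z(s) \exp\left\{ -\theta (B(t) - B(s)) - \tfrac12 \theta^2 (t - s) \right\} \,\Big|\, \mathcal{F}(s) \right] \]
\[ = Z(s)\, \mathbb{E}\left[ \exp\left\{ -\theta (B(t) - B(s)) - \tfrac12 \theta^2 (t - s) \right\} \right] \]
\[ = Z(s) \exp\left\{ \tfrac12 (-\theta)^2\, \mathrm{var}(B(t) - B(s)) - \tfrac12 \theta^2 (t - s) \right\} \]
\[ = Z(s). \]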
13.10
The Limit of a Binomial Model
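Consider the \(n\)'th Binomial model with the following parameters:
- \(u_n = 1 + \frac{\sigma}{\sqrt{n}}\): "Up" factor (\(\sigma > 0\)).
- \(d_n = 1 - \frac{\sigma}{\sqrt{n}}\): "Down" factor.
- \(r = 0\).
- \(\tilde p_n = \frac{1 - d_n}{u_n - d_n} = \frac{\sigma/\sqrt{n}}{2\sigma/\sqrt{n}} = \frac12\).
- \(\tilde q_n = \frac12\).
Let \(\#_k(H)\) denote the number of \(H\) in the first \(k\) tosses, and let \(\#_k(T)\) denote the number of \(T\) in the first \(k\) tosses. Then
\[ \#_k(H) + \#_k(T) = k, \qquad \#_k(H) - \#_k(T) = M_k, \]
which implies,
\[ \#_k(H) = \tfrac12 (k + M_k), \qquad \#_k(T) = \tfrac12 (k - M_k). \]
In the \(n\)'th model, take \(n\) steps per unit time. Set \(S^{(n)}_0 = 1\). Let \(t = \frac{k}{n}\) for some \(k\), and let
\[ S^{(n)}(t) = \left( 1 + \frac{\sigma}{\sqrt{n}} \right)^{\frac12 (nt + M_{nt})} \left( 1 - \frac{\sigma}{\sqrt{n}} \right)^{\frac12 (nt - M_{nt})}. \]
Under \(\widetilde{\mathbb{P}}\), the price process \(S^{(n)}\) is a martingale.
Theorem 10.42 As \(n \to \infty\), the distribution of \(S^{(n)}(t)\) converges to the distribution of
\[ \exp\left\{ \sigma B(t) - \tfrac12 \sigma^2 t \right\}, \]
where \(B\) is a Brownian motion. Note that the correction \(-\tfrac12 \sigma^2 t\) is necessary in order to have a martingale.
Proof: Recall that from the Taylor series we have
\[ \log(1 + x) = x - \tfrac12 x^2 + O(x^3), \]
so
\[ \log S^{(n)}(t) = \tfrac12 (nt + M_{nt}) \log\left( 1 + \frac{\sigma}{\sqrt{n}} \right) + \tfrac12 (nt - M_{nt}) \log\left( 1 - \frac{\sigma}{\sqrt{n}} \right) \]
\[ = nt \left( \tfrac12 \log\left( 1 + \frac{\sigma}{\sqrt{n}} \right) + \tfrac12 \log\left( 1 - \frac{\sigma}{\sqrt{n}} \right) \right) + M_{nt} \left( \tfrac12 \log\left( 1 + \frac{\sigma}{\sqrt{n}} \right) - \tfrac12 \log\left( 1 - \frac{\sigma}{\sqrt{n}} \right) \right) \]
\[ = nt \left( \frac12 \frac{\sigma}{\sqrt{n}} - \frac14 \frac{\sigma^2}{n} - \frac12 \frac{\sigma}{\sqrt{n}} - \frac14 \frac{\sigma^2}{n} + O(n^{-3/2}) \right) + M_{nt} \left( \frac12 \frac{\sigma}{\sqrt{n}} - \frac14 \frac{\sigma^2}{n} + \frac12 \frac{\sigma}{\sqrt{n}} + \frac14 \frac{\sigma^2}{n} + O(n^{-3/2}) \right) \]
\[ = -\tfrac12 \sigma^2 t + O(n^{-1/2}) + \sigma \underbrace{\frac{1}{\sqrt{n}} M_{nt}}_{\to B(t)} + \underbrace{\frac{1}{n} M_{nt}}_{\to 0}\, O(n^{-1/2}). \]
As \(n \to \infty\), the distribution of \(\log S^{(n)}(t)\) approaches the distribution of \(\sigma B(t) - \tfrac12 \sigma^2 t\).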
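The drift and variance appearing in the limit can be checked without simulation, since under the risk-neutral measure \(M_{nt}\) has mean 0 and variance \(nt\). The sketch below computes the exact mean and variance of \(\log S^{(n)}(t)\) for increasing \(n\); the parameter values are illustrative, and matching two moments is of course only a consistency check, not a proof of convergence in distribution.

```python
import math

def log_price_moments(n, t, sigma):
    """Mean and variance of log S^(n)(t) under the risk-neutral measure (p~ = 1/2)."""
    up, down = 1 + sigma / math.sqrt(n), 1 - sigma / math.sqrt(n)
    steps = round(n * t)                              # t is assumed to be a multiple of 1/n
    drift = 0.5 * (math.log(up) + math.log(down))     # coefficient of nt
    scale = 0.5 * (math.log(up) - math.log(down))     # coefficient of M_nt
    # M_nt has mean 0 and variance nt, so:
    return steps * drift, steps * scale ** 2

sigma, t = 0.3, 2.0
for n in (10, 1000, 100000):
    mean, var = log_price_moments(n, t, sigma)
    print(n, mean, var, "limits:", -0.5 * sigma ** 2 * t, sigma ** 2 * t)
```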
Figure 13.3: Continuous-time Brownian Motion, starting at \(x \ne 0\): a path \(t \mapsto B(t) = B(t, \omega)\) for a fixed \(\omega\) in \((\Omega, \mathcal{F}, \mathbb{P}^x)\).
13.11
Starting at Points Other Than 0
(The remaining sections in this chapter were taught Dec 7.)
For a Brownian motion \(B(t)\) that starts at 0, we have:
\[ \mathbb{P}(B(0) = 0) = 1. \]
For a Brownian motion \(B(t)\) that starts at \(x\), denote the corresponding probability measure by \(\mathbb{P}^x\) (see Fig. 13.3), and for such a Brownian motion we have:
\[ \mathbb{P}^x(B(0) = x) = 1. \]
Note that:
- If \(x \ne 0\), then \(\mathbb{P}^x\) puts all its probability on a completely different set from \(\mathbb{P}\).
- The distribution of \(B(t)\) under \(\mathbb{P}^x\) is the same as the distribution of \(x + B(t)\) under \(\mathbb{P}\).
13.12
Markov Property for Brownian Motion
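We prove that
Theorem 12.43 Brownian motion has the Markov property.
Proof:
Let \(s \ge 0\), \(t \ge 0\) be given (see Fig. 13.4).
\[ \mathbb{E}\big[ h(B(s + t)) \mid \mathcal{F}(s) \big] = \mathbb{E}\Big[ h\big( \underbrace{B(s + t) - B(s)}_{\text{independent of } \mathcal{F}(s)} + \underbrace{B(s)}_{\mathcal{F}(s)\text{-measurable}} \big) \,\Big|\, \mathcal{F}(s) \Big]. \]
Figure 13.4: Markov Property of Brownian Motion: the path restarts from \(B(s)\) at time \(s\) and is observed again at time \(s + t\).
Use the Independence Lemma. Define
\[ g(x) = \mathbb{E}\big[ h(B(s + t) - B(s) + x) \big] = \mathbb{E}\big[ h(x + \underbrace{B(t)}_{\text{same distribution as } B(s+t) - B(s)}) \big] = \mathbb{E}^x h(B(t)). \]
Then
\[ \mathbb{E}\big[ h(B(s + t)) \mid \mathcal{F}(s) \big] = g(B(s)) = \mathbb{E}^{B(s)} h(B(t)). \]
In fact Brownian motion has the strong Markov property.
Example 13.1 (Strong Markov Property) See Fig. 13.5. Fix \(x > 0\) and define
\[ \tau = \min\{ t \ge 0;\ B(t) = x \}. \]
Then we have:
\[ \mathbb{E}\big[ h(B(\tau + t)) \mid \mathcal{F}(\tau) \big] = g(B(\tau)) = \mathbb{E}^x h(B(t)). \]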
Figure 13.5: Strong Markov Property of Brownian Motion: the path restarts from level \(x\) at time \(\tau\) and is observed again at time \(\tau + t\).
13.13
Transition Density
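Let \(p(t, x, y)\) be the transition density of Brownian motion, i.e., the density in \(y\) of the probability that the Brownian motion changes value from \(x\) to \(y\) in time \(t\), and let \(\tau\) be defined as in the previous section.
\[ p(t, x, y) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(y - x)^2}{2t}}, \]
\[ g(x) = \mathbb{E}^x h(B(t)) = \int_{-\infty}^{\infty} h(y)\, p(t, x, y) \, dy, \]
\[ \mathbb{E}\big[ h(B(s + t)) \mid \mathcal{F}(s) \big] = g(B(s)) = \int_{-\infty}^{\infty} h(y)\, p(t, B(s), y) \, dy, \]
\[ \mathbb{E}\big[ h(B(\tau + t)) \mid \mathcal{F}(\tau) \big] = \int_{-\infty}^{\infty} h(y)\, p(t, x, y) \, dy. \]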
13.14
First Passage Time
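Fix \(x > 0\). Define
\[ \tau = \min\{ t \ge 0;\ B(t) = x \}. \]
Fix \(\theta > 0\). Then
\[ \exp\left\{ \theta B(t \wedge \tau) - \tfrac12 \theta^2 (t \wedge \tau) \right\} \]
is a martingale, and
\[ \mathbb{E} \exp\left\{ \theta B(t \wedge \tau) - \tfrac12 \theta^2 (t \wedge \tau) \right\} = 1. \]
We have
\[ \lim_{t \to \infty} \exp\left\{ -\tfrac12 \theta^2 (t \wedge \tau) \right\} = \begin{cases} e^{-\frac12 \theta^2 \tau} & \text{if } \tau < \infty, \\ 0 & \text{if } \tau = \infty, \end{cases} \qquad (14.1) \]
\[ 0 \le \exp\left\{ \theta B(t \wedge \tau) - \tfrac12 \theta^2 (t \wedge \tau) \right\} \le e^{\theta x}. \]
Let \(t \to \infty\) in the expectation above, using (14.1) and the Bounded Convergence Theorem, to get
\[ \mathbb{E}\left[ \exp\left\{ \theta x - \tfrac12 \theta^2 \tau \right\} 1_{\{\tau < \infty\}} \right] = 1. \]
Let \(\theta \downarrow 0\) to get \(\mathbb{E}\, 1_{\{\tau < \infty\}} = 1\), so
\[ \mathbb{P}\{\tau < \infty\} = 1, \]
\[ \mathbb{E} \exp\left\{ -\tfrac12 \theta^2 \tau \right\} = e^{-\theta x}. \qquad (14.2) \]
Let \(\alpha = \tfrac12 \theta^2\). We have the m.g.f.:
\[ \mathbb{E}\, e^{-\alpha \tau} = e^{-x \sqrt{2\alpha}}, \quad \alpha > 0. \qquad (14.3) \]
Differentiation of (14.3) w.r.t. \(\alpha\) yields
\[ -\mathbb{E}\left[ \tau e^{-\alpha \tau} \right] = -\frac{x}{\sqrt{2\alpha}}\, e^{-x \sqrt{2\alpha}}. \]
Letting \(\alpha \downarrow 0\), we obtain
\[ \mathbb{E} \tau = \infty. \qquad (14.4) \]
Conclusion. Brownian motion reaches level \(x\) with probability 1. The expected time to reach level \(x\) is infinite.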
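We use the Reflection Principle below (see Fig. 13.6).
\[ \mathbb{P}\{\tau \le t,\ B(t) < x\} = \mathbb{P}\{B(t) > x\}, \]
\[ \mathbb{P}\{\tau \le t\} = \mathbb{P}\{\tau \le t,\ B(t) < x\} + \mathbb{P}\{\tau \le t,\ B(t) > x\} = \mathbb{P}\{B(t) > x\} + \mathbb{P}\{B(t) > x\} = 2\, \mathbb{P}\{B(t) > x\} = \frac{2}{\sqrt{2\pi t}} \int_x^{\infty} e^{-\frac{y^2}{2t}} \, dy. \]
Figure 13.6: Reflection Principle in Brownian Motion: a Brownian path that reaches level \(x\) at time \(\tau\), together with its shadow path reflected about level \(x\) after \(\tau\), both shown up to time \(t\).
Using the substitution \(z = \frac{y}{\sqrt{t}}\), \(dz = \frac{dy}{\sqrt{t}}\) we get
\[ \mathbb{P}\{\tau \le t\} = \frac{2}{\sqrt{2\pi}} \int_{x/\sqrt{t}}^{\infty} e^{-\frac{z^2}{2}} \, dz. \]
Density:
\[ f_\tau(t) = \frac{\partial}{\partial t} \mathbb{P}\{\tau \le t\} = \frac{x}{\sqrt{2\pi t^3}}\, e^{-\frac{x^2}{2t}}, \]
which follows from the fact that if
\[ F(t) = \int_{a(t)}^{b} g(z) \, dz, \]
then
\[ \frac{\partial F}{\partial t} = -\frac{\partial a}{\partial t}\, g(a(t)). \]
Laplace transform formula:
\[ \mathbb{E}\, e^{-\alpha \tau} = \int_0^{\infty} e^{-\alpha t} f_\tau(t) \, dt = e^{-x \sqrt{2\alpha}}. \]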
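The Laplace transform formula for the first passage time ties together the martingale computation (14.3) and the density obtained from the reflection principle, so it is a useful cross-check. The sketch below integrates \(e^{-\alpha t} f_\tau(t)\) numerically with a midpoint rule; the values of `x` and `alpha`, the truncation point `upper` and the step count are illustrative assumptions.

```python
import math

def first_passage_density(t, x):
    """f_tau(t) = x / sqrt(2 pi t^3) * exp(-x^2 / (2 t)) for t > 0."""
    return x / math.sqrt(2 * math.pi * t ** 3) * math.exp(-x * x / (2 * t))

def laplace_transform(alpha, x, upper=40.0, steps=400000):
    """Midpoint-rule approximation of the integral of exp(-alpha t) f_tau(t) on (0, upper)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.exp(-alpha * t) * first_passage_density(t, x)
    return total * h

x, alpha = 1.0, 1.0
print(laplace_transform(alpha, x), math.exp(-x * math.sqrt(2 * alpha)))
```

The truncation at `upper` is harmless here because the factor \(e^{-\alpha t}\) makes the tail negligible; for very small \(\alpha\) the heavy \(t^{-3/2}\) tail of the density would require a much larger cutoff, which reflects the fact that \(\mathbb{E}\tau = \infty\).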
Chapter 14
The Itˆo Integral
The following chapters deal with Stochastic Differential Equations in Finance. References:
1. B. Oksendal, Stochastic Differential Equations, Springer-Verlag, 1995.
2. J. Hull, Options, Futures and other Derivative Securities, Prentice Hall, 1993.
14.1
Brownian Motion
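(See Fig. 13.3.) \((\Omega, \mathcal{F}, \mathbb{P})\) is given, always in the background, even when not explicitly mentioned. Brownian motion, \(B(t, \omega) : [0, \infty) \times \Omega \to \mathbb{R}\), has the following properties:
1. \(B(0) = 0\); technically, \(\mathbb{P}\{\omega;\ B(0, \omega) = 0\} = 1\),
2. \(B(t)\) is a continuous function of \(t\),
3. If \(0 = t_0 \le t_1 \le \dots \le t_n\), then the increments
\[ B(t_1) - B(t_0), \quad \dots, \quad B(t_n) - B(t_{n-1}) \]
are independent, normal, and
\[ \mathbb{E}[B(t_{k+1}) - B(t_k)] = 0, \]
\[ \mathbb{E}[B(t_{k+1}) - B(t_k)]^2 = t_{k+1} - t_k. \]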
14.2
First Variation
Quadratic variation is a measure of volatility. First we will consider first variation, FV f, of a
function ft.
153
154
f(t)
t2
t
t
T
1
Figure 14.1: Example function ft.
For the function pictured in Fig. 14.1, the first variation over the interval 0;T is given by:
FV 0;T f = ft1 f0 ft2 ft1 + fT ft2
,
,
,
,
t
t
T
1
2
= Z f
Z
f
Z
f
0 t dt +
0t dt +
0 t dt:
,
0
t
t
1
2
T
= Z f
dt:
0 t
j
j
0
Thus, first variation measures the total amount of up and down motion of the path.
The general definition of first variation is as follows:
Definition 14.1 (First Variation) Let $\Pi = \{t_0, t_1, \dots, t_n\}$ be a partition of $[0,T]$, i.e.,
$$0 = t_0 \le t_1 \le \dots \le t_n = T.$$
The mesh of the partition is defined to be
$$\|\Pi\| = \max_{k=0,\dots,n-1} (t_{k+1} - t_k).$$
We then define
$$FV_{[0,T]}(f) = \lim_{\|\Pi\| \to 0} \sum_{k=0}^{n-1} |f(t_{k+1}) - f(t_k)|.$$
Suppose $f$ is differentiable. Then the Mean Value Theorem implies that in each subinterval $[t_k, t_{k+1}]$, there is a point $t_k^*$ such that
$$f(t_{k+1}) - f(t_k) = f'(t_k^*)(t_{k+1} - t_k).$$
Then
$$\sum_{k=0}^{n-1} |f(t_{k+1}) - f(t_k)| = \sum_{k=0}^{n-1} |f'(t_k^*)|\,(t_{k+1} - t_k),$$
and
$$\begin{aligned}
FV_{[0,T]}(f) &= \lim_{\|\Pi\| \to 0} \sum_{k=0}^{n-1} |f'(t_k^*)|\,(t_{k+1} - t_k) \\
&= \int_0^T |f'(t)|\,dt.
\end{aligned}$$
14.3
Quadratic Variation
Definition 14.2 (Quadratic Variation) The quadratic variation of a function $f$ on an interval $[0,T]$ is
$$\langle f \rangle(T) = \lim_{\|\Pi\| \to 0} \sum_{k=0}^{n-1} |f(t_{k+1}) - f(t_k)|^2.$$
Remark 14.1 (Quadratic Variation of Differentiable Functions) If $f$ is differentiable, then $\langle f \rangle(T) = 0$, because
$$\begin{aligned}
\sum_{k=0}^{n-1} |f(t_{k+1}) - f(t_k)|^2 &= \sum_{k=0}^{n-1} |f'(t_k^*)|^2 (t_{k+1} - t_k)^2 \\
&\le \|\Pi\| \cdot \sum_{k=0}^{n-1} |f'(t_k^*)|^2 (t_{k+1} - t_k)
\end{aligned}$$
and
$$\begin{aligned}
\langle f \rangle(T) &\le \lim_{\|\Pi\| \to 0} \|\Pi\| \cdot \lim_{\|\Pi\| \to 0} \sum_{k=0}^{n-1} |f'(t_k^*)|^2 (t_{k+1} - t_k) \\
&= \lim_{\|\Pi\| \to 0} \|\Pi\| \int_0^T |f'(t)|^2\,dt \\
&= 0.
\end{aligned}$$
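The contrast between first and quadratic variation of a smooth function can be checked numerically. The sketch below is only an illustration: the helper name `sample_variations`, the choice $f = \sin$ on $[0, 2\pi]$, and the uniform partition are assumptions made here, not part of the notes. For this $f$, $\int_0^{2\pi} |\cos t|\,dt = 4$, while the sum of squared increments is bounded by $\|\Pi\| \int_0^{2\pi} \cos^2 t\,dt$ and so shrinks with the mesh.

```python
import math

def sample_variations(f, T, n):
    """Return (sum of |increments|, sum of squared increments) of f
    over the uniform partition of [0, T] with n subintervals."""
    ts = [T * k / n for k in range(n + 1)]
    incs = [f(ts[k + 1]) - f(ts[k]) for k in range(n)]
    return sum(abs(d) for d in incs), sum(d * d for d in incs)

# Hypothetical example: f = sin on [0, 2*pi].
fv, qv = sample_variations(math.sin, 2 * math.pi, 10_000)
print(fv)  # close to the integral of |cos|, which is 4
print(qv)  # small, of the order of the mesh 2*pi/n
```

Refining the partition leaves the first sum essentially unchanged and drives the second toward zero, as Remark 14.1 predicts.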
Theorem 3.44
$$\langle B \rangle(T) = T,$$
or more precisely,
$$\mathbb{P}\{\omega \in \Omega;\ \langle B(\cdot,\omega) \rangle(T) = T\} = 1.$$
In particular, the paths of Brownian motion are not differentiable.
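Proof: (Outline) Let $\Pi = \{t_0, t_1, \dots, t_n\}$ be a partition of $[0,T]$. To simplify notation, set $D_k = B(t_{k+1}) - B(t_k)$. Define the sample quadratic variation
$$Q_\Pi = \sum_{k=0}^{n-1} D_k^2.$$
Then
$$Q_\Pi - T = \sum_{k=0}^{n-1} \left[D_k^2 - (t_{k+1} - t_k)\right].$$
We want to show that
$$\lim_{\|\Pi\| \to 0} (Q_\Pi - T) = 0.$$
Consider an individual summand
$$D_k^2 - (t_{k+1} - t_k) = \left[B(t_{k+1}) - B(t_k)\right]^2 - (t_{k+1} - t_k).$$
This has expectation 0, so
$$\mathbb{E}(Q_\Pi - T) = \mathbb{E} \sum_{k=0}^{n-1} \left[D_k^2 - (t_{k+1} - t_k)\right] = 0.$$
For $j \ne k$, the terms
$$D_j^2 - (t_{j+1} - t_j) \quad \text{and} \quad D_k^2 - (t_{k+1} - t_k)$$
are independent, so
$$\begin{aligned}
\mathrm{var}(Q_\Pi - T) &= \sum_{k=0}^{n-1} \mathrm{var}\left[D_k^2 - (t_{k+1} - t_k)\right] \\
&= \sum_{k=0}^{n-1} \mathbb{E}\left[D_k^4 - 2(t_{k+1} - t_k) D_k^2 + (t_{k+1} - t_k)^2\right] \\
&= \sum_{k=0}^{n-1} \left[3(t_{k+1} - t_k)^2 - 2(t_{k+1} - t_k)^2 + (t_{k+1} - t_k)^2\right] \\
&\qquad \text{(if $X$ is normal with mean 0 and variance $\sigma^2$, then $\mathbb{E}X^4 = 3\sigma^4$)} \\
&= 2 \sum_{k=0}^{n-1} (t_{k+1} - t_k)^2 \\
&\le 2\|\Pi\| \sum_{k=0}^{n-1} (t_{k+1} - t_k) \\
&= 2\|\Pi\|\, T.
\end{aligned}$$
Thus we have
$$\mathbb{E}(Q_\Pi - T) = 0,$$
$$\mathrm{var}(Q_\Pi - T) \le 2\|\Pi\| \cdot T.$$
As $\|\Pi\| \to 0$, $\mathrm{var}(Q_\Pi - T) \to 0$, so
$$\lim_{\|\Pi\| \to 0} (Q_\Pi - T) = 0.$$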
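The variance bound in the proof suggests a quick simulation. The following sketch assumes a uniform partition and draws independent $N(0, \Delta t)$ increments with Python's standard `random` module; the function name and the seed are illustrative choices, and the coarse and fine samples come from different simulated paths.

```python
import random

def sample_quadratic_variation(T, n, rng):
    """Sum of squared Brownian increments over a uniform partition of
    [0, T] with n steps; each increment is drawn as N(0, T/n)."""
    dt = T / n
    return sum(rng.gauss(0.0, dt ** 0.5) ** 2 for _ in range(n))

rng = random.Random(0)   # fixed seed so the run is repeatable
T = 2.0
q_coarse = sample_quadratic_variation(T, 100, rng)
q_fine = sample_quadratic_variation(T, 100_000, rng)
print(q_coarse, q_fine)  # the fine-mesh value should sit close to T
```

With $n$ equal steps the proof gives $\mathrm{var}(Q_\Pi - T) = 2T^2/n$, so the fine sample typically differs from $T$ by about $0.01$ here.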
Remark 14.2 (Differential Representation) We know that
$$\mathbb{E}\left[(B(t_{k+1}) - B(t_k))^2 - (t_{k+1} - t_k)\right] = 0.$$
We showed above that
$$\mathrm{var}\left[(B(t_{k+1}) - B(t_k))^2 - (t_{k+1} - t_k)\right] = 2(t_{k+1} - t_k)^2.$$
When $(t_{k+1} - t_k)$ is small, $(t_{k+1} - t_k)^2$ is very small, and we have the approximate equation
$$(B(t_{k+1}) - B(t_k))^2 \simeq t_{k+1} - t_k,$$
which we can write informally as
$$dB(t)\, dB(t) = dt.$$
14.4
Quadratic Variation as Absolute Volatility
On any time interval $[T_1, T_2]$, we can sample the Brownian motion at times
$$T_1 = t_0 \le t_1 \le \dots \le t_n = T_2$$
and compute the squared sample absolute volatility
$$\frac{1}{T_2 - T_1} \sum_{k=0}^{n-1} \left(B(t_{k+1}) - B(t_k)\right)^2.$$
This is approximately equal to
$$\frac{1}{T_2 - T_1} \left[\langle B \rangle(T_2) - \langle B \rangle(T_1)\right] = \frac{T_2 - T_1}{T_2 - T_1} = 1.$$
As we increase the number of sample points, this approximation becomes exact. In other words, Brownian motion has absolute volatility 1.
Furthermore, consider the equation
$$\langle B \rangle(T) = T = \int_0^T 1\,dt, \quad \forall T \ge 0.$$
This says that quadratic variation for Brownian motion accumulates at rate 1 at all times along
almost every path.
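14.5
Construction of the Itô Integral
The integrator is Brownian motion $B(t)$, $t \ge 0$, with associated filtration $\mathcal{F}(t)$, $t \ge 0$, and the following properties:
1. $s \le t \implies$ every set in $\mathcal{F}(s)$ is also in $\mathcal{F}(t)$,
2. $B(t)$ is $\mathcal{F}(t)$-measurable, $\forall t$,
3. For $t \le t_1 \le \dots \le t_n$, the increments $B(t_1) - B(t),\ B(t_2) - B(t_1),\ \dots,\ B(t_n) - B(t_{n-1})$ are independent of $\mathcal{F}(t)$.
The integrand is $\delta(t)$, $t \ge 0$, where
1. $\delta(t)$ is $\mathcal{F}(t)$-measurable $\forall t$ (i.e., $\delta$ is adapted),
2. $\delta$ is square-integrable:
$$\mathbb{E} \int_0^T \delta^2(t)\,dt < \infty, \quad \forall T.$$
We want to define the Itô Integral:
$$I(t) = \int_0^t \delta(u)\,dB(u), \quad t \ge 0.$$
Remark 14.3 (Integral w.r.t. a differentiable function) If $f(t)$ is a differentiable function, then we can define
$$\int_0^t \delta(u)\,df(u) = \int_0^t \delta(u) f'(u)\,du.$$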
This won’t work when the integrator is Brownian motion, because the paths of Brownian motion
are not differentiable.
14.6
Itô integral of an elementary integrand
Let $\Pi = \{t_0, t_1, \dots, t_n\}$ be a partition of $[0,T]$, i.e.,
$$0 = t_0 \le t_1 \le \dots \le t_n = T.$$
Assume that $\delta(t)$ is constant on each subinterval $[t_k, t_{k+1}]$ (see Fig. 14.2). We call such a $\delta$ an elementary process.
Figure 14.2: An elementary function $\delta$. (The figure shows $\delta(t) = \delta(t_k)$ on each of the four subintervals, with $0 = t_0, t_1, t_2, t_3, t_4 = T$ on the time axis.)
The functions $B(t)$ and $\delta(t_k)$ can be interpreted as follows:
Think of $B(t)$ as the price per unit share of an asset at time $t$.
Think of $t_0, t_1, \dots, t_n$ as the trading dates for the asset.
Think of $\delta(t_k)$ as the number of shares of the asset acquired at trading date $t_k$ and held until trading date $t_{k+1}$.
Then the Itô integral $I(t)$ can be interpreted as the gain from trading at time $t$; this gain is given by:
$$I(t) = \begin{cases}
\delta(t_0)\,[B(t) - \underbrace{B(t_0)}_{=B(0)=0}], & 0 \le t \le t_1, \\
\delta(t_0)[B(t_1) - B(t_0)] + \delta(t_1)[B(t) - B(t_1)], & t_1 \le t \le t_2, \\
\delta(t_0)[B(t_1) - B(t_0)] + \delta(t_1)[B(t_2) - B(t_1)] + \delta(t_2)[B(t) - B(t_2)], & t_2 \le t \le t_3.
\end{cases}$$
In general, if $t_k \le t \le t_{k+1}$,
$$I(t) = \sum_{j=0}^{k-1} \delta(t_j)[B(t_{j+1}) - B(t_j)] + \delta(t_k)[B(t) - B(t_k)].$$
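The general formula for $I(t)$ translates directly into a few lines of code. This is a minimal sketch under stated assumptions: the function name `elementary_ito_integral`, the argument layout, and the toy path values are all hypothetical, and the path $B$ is supplied as a callable that returns the observed value at each time that is queried.

```python
from bisect import bisect_right

def elementary_ito_integral(t, times, deltas, B):
    """Gain I(t) for an integrand equal to deltas[k] on [times[k], times[k+1]].

    times  : trading dates t_0 < t_1 < ... < t_n
    deltas : positions delta(t_0), ..., delta(t_{n-1})
    B      : callable returning the path value B(s)
    """
    # Index k of the subinterval containing t (the last one if t = t_n).
    k = min(bisect_right(times, t) - 1, len(deltas) - 1)
    # Gains over the completed subintervals [t_j, t_{j+1}], j < k.
    gain = sum(deltas[j] * (B(times[j + 1]) - B(times[j])) for j in range(k))
    # Gain over the current, partially elapsed subinterval.
    return gain + deltas[k] * (B(t) - B(times[k]))

# Hypothetical observed path values and positions.
path = {0: 0.0, 1: 1.0, 1.5: 0.5, 2: 2.0, 3: 1.0}
times = [0, 1, 2, 3]
deltas = [2.0, -1.0, 3.0]
print(elementary_ito_integral(1.5, times, deltas, path.__getitem__))  # 2.5
print(elementary_ito_integral(3, times, deltas, path.__getitem__))    # -2.0
```

The position `deltas[k]` is fixed at the start of each subinterval, before the increment over that subinterval is seen, which mirrors the adaptedness requirement on $\delta$.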
14.7
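Properties of the Itô integral of an elementary process
Adaptedness. For each $t$, $I(t)$ is $\mathcal{F}(t)$-measurable.
Linearity. If
$$I(t) = \int_0^t \delta(u)\,dB(u), \quad J(t) = \int_0^t \gamma(u)\,dB(u),$$
then
$$I(t) \pm J(t) = \int_0^t \left(\delta(u) \pm \gamma(u)\right) dB(u)$$
and
$$cI(t) = \int_0^t c\,\delta(u)\,dB(u).$$
Martingale. $I(t)$ is a martingale.
We prove the martingale property for the elementary process case.
Theorem 7.45 (Martingale Property)
$$I(t) = \sum_{j=0}^{k-1} \delta(t_j)[B(t_{j+1}) - B(t_j)] + \delta(t_k)[B(t) - B(t_k)], \quad t_k \le t \le t_{k+1},$$
is a martingale.
Proof: Let $0 \le s \le t$ be given. We treat the more difficult case that $s$ and $t$ are in different subintervals, i.e., there are partition points $t_\ell$ and $t_k$ such that $s \in [t_\ell, t_{\ell+1}]$ and $t \in [t_k, t_{k+1}]$ (see Fig. 14.3).
Figure 14.3: Showing $s$ and $t$ in different partitions. (Time axis: $t_\ell \le s \le t_{\ell+1} \le \dots \le t_k \le t \le t_{k+1}$.)
Write
$$\begin{aligned}
I(t) &= \sum_{j=0}^{\ell-1} \delta(t_j)[B(t_{j+1}) - B(t_j)] + \delta(t_\ell)[B(t_{\ell+1}) - B(t_\ell)] \\
&\quad + \sum_{j=\ell+1}^{k-1} \delta(t_j)[B(t_{j+1}) - B(t_j)] + \delta(t_k)[B(t) - B(t_k)].
\end{aligned}$$
We compute conditional expectations:
$$\mathbb{E}\left[\sum_{j=0}^{\ell-1} \delta(t_j)\left(B(t_{j+1}) - B(t_j)\right) \,\Big|\, \mathcal{F}(s)\right] = \sum_{j=0}^{\ell-1} \delta(t_j)\left(B(t_{j+1}) - B(t_j)\right),$$
$$\begin{aligned}
\mathbb{E}\left[\delta(t_\ell)\left(B(t_{\ell+1}) - B(t_\ell)\right) \,\big|\, \mathcal{F}(s)\right] &= \delta(t_\ell)\left(\mathbb{E}[B(t_{\ell+1}) \,|\, \mathcal{F}(s)] - B(t_\ell)\right) \\
&= \delta(t_\ell)\,[B(s) - B(t_\ell)].
\end{aligned}$$
These first two terms add up to $I(s)$. We show that the third and fourth terms are zero.
$$\begin{aligned}
\mathbb{E}\left[\sum_{j=\ell+1}^{k-1} \delta(t_j)\left(B(t_{j+1}) - B(t_j)\right) \,\Big|\, \mathcal{F}(s)\right]
&= \sum_{j=\ell+1}^{k-1} \mathbb{E}\left[\mathbb{E}\left[\delta(t_j)\left(B(t_{j+1}) - B(t_j)\right) \,\big|\, \mathcal{F}(t_j)\right] \,\Big|\, \mathcal{F}(s)\right] \\
&= \sum_{j=\ell+1}^{k-1} \mathbb{E}\Big[\delta(t_j) \underbrace{\left(\mathbb{E}[B(t_{j+1}) \,|\, \mathcal{F}(t_j)] - B(t_j)\right)}_{=0} \,\Big|\, \mathcal{F}(s)\Big],
\end{aligned}$$
$$\mathbb{E}\left[\delta(t_k)\left(B(t) - B(t_k)\right) \,\big|\, \mathcal{F}(s)\right] = \mathbb{E}\Big[\delta(t_k) \underbrace{\left(\mathbb{E}[B(t) \,|\, \mathcal{F}(t_k)] - B(t_k)\right)}_{=0} \,\Big|\, \mathcal{F}(s)\Big].$$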
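Theorem 7.46 (Itô Isometry)
$$\mathbb{E} I^2(t) = \mathbb{E} \int_0^t \delta^2(u)\,du.$$
Proof: To simplify notation, assume $t = t_k$, so
$$I(t) = \sum_{j=0}^{k-1} \delta(t_j) \underbrace{[B(t_{j+1}) - B(t_j)]}_{D_j}.$$
Each $D_j$ has expectation 0, and different $D_j$ are independent.
$$\begin{aligned}
I^2(t) &= \left(\sum_{j=0}^{k-1} \delta(t_j) D_j\right)^2 \\
&= \sum_{j=0}^{k-1} \delta^2(t_j) D_j^2 + 2 \sum_{i<j} \delta(t_i)\delta(t_j) D_i D_j.
\end{aligned}$$
Since the cross terms have expectation zero,
$$\begin{aligned}
\mathbb{E} I^2(t) &= \sum_{j=0}^{k-1} \mathbb{E}\left[\delta^2(t_j) D_j^2\right] \\
&= \sum_{j=0}^{k-1} \mathbb{E}\left[\delta^2(t_j)\, \mathbb{E}\left[(B(t_{j+1}) - B(t_j))^2 \,\big|\, \mathcal{F}(t_j)\right]\right] \\
&= \sum_{j=0}^{k-1} \mathbb{E}\, \delta^2(t_j)(t_{j+1} - t_j) \\
&= \mathbb{E} \sum_{j=0}^{k-1} \int_{t_j}^{t_{j+1}} \delta^2(u)\,du \\
&= \mathbb{E} \int_0^t \delta^2(u)\,du.
\end{aligned}$$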
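A Monte Carlo experiment gives a rough sanity check of the isometry. The sketch below makes several assumptions not found in the notes: the elementary integrand is taken to be $\delta(t_j) = B(t_j)$ on a uniform 4-step partition of $[0,1]$, and the function name, path count and seed are arbitrary. For this integrand, $\mathbb{E}\int_0^1 \delta^2(u)\,du = \sum_j t_j\,\Delta t = 0.375$.

```python
import random

def isometry_check(n_steps=4, T=1.0, n_paths=40_000, seed=1):
    """Estimate E[I(T)^2] and E[sum_j delta(t_j)^2 * dt] by simulation,
    using the adapted elementary integrand delta(t_j) = B(t_j)."""
    rng = random.Random(seed)
    dt = T / n_steps
    lhs = rhs = 0.0
    for _ in range(n_paths):
        b = 0.0          # current path value B(t_j)
        integral = 0.0   # running Ito sum I(t)
        energy = 0.0     # running sum of delta(t_j)^2 * dt
        for _ in range(n_steps):
            db = rng.gauss(0.0, dt ** 0.5)
            integral += b * db      # position fixed before the increment
            energy += b * b * dt
            b += db
        lhs += integral ** 2
        rhs += energy
    return lhs / n_paths, rhs / n_paths

lhs, rhs = isometry_check()
print(lhs, rhs)  # both estimates should be near 0.375
```

The two averages agree only up to sampling error, which shrinks like one over the square root of the number of paths.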
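14.8
Itô integral of a general integrand
Fix $T > 0$. Let $\delta$ be a process (not necessarily an elementary process) such that
$\delta(t)$ is $\mathcal{F}(t)$-measurable, $\forall t \in [0,T]$,
$\mathbb{E} \int_0^T \delta^2(t)\,dt < \infty$.
Theorem 8.47 There is a sequence of elementary processes $\{\delta_n\}_{n=1}^{\infty}$ such that
$$\lim_{n \to \infty} \mathbb{E} \int_0^T |\delta_n(t) - \delta(t)|^2\,dt = 0.$$
Proof: Fig. 14.4 shows the main idea.
Figure 14.4: Approximating a general process by an elementary process $\delta_4$, over $[0,T]$. (The figure shows the path of $\delta$ and the path of $\delta_4$, with $0 = t_0, t_1, t_2, t_3, t_4 = T$ on the time axis.)
In the last section we have defined
$$I_n(T) = \int_0^T \delta_n(t)\,dB(t)$$
for every $n$. We now define
$$\int_0^T \delta(t)\,dB(t) = \lim_{n \to \infty} \int_0^T \delta_n(t)\,dB(t).$$
The only difficulty with this approach is that we need to make sure the above limit exists. Suppose $n$ and $m$ are large positive integers. Then
$$\begin{aligned}
\mathrm{var}\left(I_n(T) - I_m(T)\right) &= \mathbb{E}\left(\int_0^T [\delta_n(t) - \delta_m(t)]\,dB(t)\right)^2 \\
\text{(Itô Isometry:)} \quad &= \mathbb{E} \int_0^T [\delta_n(t) - \delta_m(t)]^2\,dt \\
&\le \mathbb{E} \int_0^T \left[\,|\delta_n(t) - \delta(t)| + |\delta(t) - \delta_m(t)|\,\right]^2 dt \\
\left((a+b)^2 \le 2a^2 + 2b^2:\right) \quad &\le 2\,\mathbb{E} \int_0^T |\delta_n(t) - \delta(t)|^2\,dt + 2\,\mathbb{E} \int_0^T |\delta_m(t) - \delta(t)|^2\,dt,
\end{aligned}$$
which is small. This guarantees that the sequence $\{I_n(T)\}_{n=1}^{\infty}$ has a limit.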
14.9
Properties of the (general) Itô integral
$$I(t) = \int_0^t \delta(u)\,dB(u).$$
Here $\delta$ is any adapted, square-integrable process.
Adaptedness. For each $t$, $I(t)$ is $\mathcal{F}(t)$-measurable.
Linearity. If
$$I(t) = \int_0^t \delta(u)\,dB(u), \quad J(t) = \int_0^t \gamma(u)\,dB(u),$$
then
$$I(t) \pm J(t) = \int_0^t \left(\delta(u) \pm \gamma(u)\right) dB(u)$$
and
$$cI(t) = \int_0^t c\,\delta(u)\,dB(u).$$
Martingale. $I(t)$ is a martingale.
Continuity. $I(t)$ is a continuous function of the upper limit of integration $t$.
Itô Isometry. $\mathbb{E} I^2(t) = \mathbb{E} \int_0^t \delta^2(u)\,du$.
Example 14.1 Consider the Itô integral
$$\int_0^T B(u)\,dB(u).$$
We approximate the integrand as shown in Fig. 14.5.
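Figure 14.5: Approximating the integrand $B(u)$ with $\delta_4$, over $[0,T]$. (Time axis marked at $T/4$, $2T/4$, $3T/4$, $T$.)
$$\delta_n(u) = \begin{cases}
B(0) = 0 & \text{if } 0 \le u < T/n, \\
B(T/n) & \text{if } T/n \le u < 2T/n, \\
\quad \vdots \\
B\left(\frac{(n-1)T}{n}\right) & \text{if } \frac{(n-1)T}{n} \le u < T.
\end{cases}$$
By definition,
$$\int_0^T B(u)\,dB(u) = \lim_{n \to \infty} \sum_{k=0}^{n-1} B\left(\frac{kT}{n}\right) \left[B\left(\frac{(k+1)T}{n}\right) - B\left(\frac{kT}{n}\right)\right].$$
To simplify notation, we denote
$$B_k \triangleq B\left(\frac{kT}{n}\right),$$
so
$$\int_0^T B(u)\,dB(u) = \lim_{n \to \infty} \sum_{k=0}^{n-1} B_k (B_{k+1} - B_k).$$
We compute
$$\begin{aligned}
\tfrac{1}{2} \sum_{k=0}^{n-1} (B_{k+1} - B_k)^2 &= \tfrac{1}{2} \sum_{k=0}^{n-1} B_{k+1}^2 - \sum_{k=0}^{n-1} B_k B_{k+1} + \tfrac{1}{2} \sum_{k=0}^{n-1} B_k^2 \\
&= \tfrac{1}{2} B_n^2 + \tfrac{1}{2} \sum_{j=0}^{n-1} B_j^2 - \sum_{k=0}^{n-1} B_k B_{k+1} + \tfrac{1}{2} \sum_{k=0}^{n-1} B_k^2 \\
&= \tfrac{1}{2} B_n^2 + \sum_{k=0}^{n-1} B_k^2 - \sum_{k=0}^{n-1} B_k B_{k+1} \\
&= \tfrac{1}{2} B_n^2 - \sum_{k=0}^{n-1} B_k (B_{k+1} - B_k).
\end{aligned}$$
Therefore,
$$\sum_{k=0}^{n-1} B_k (B_{k+1} - B_k) = \tfrac{1}{2} B_n^2 - \tfrac{1}{2} \sum_{k=0}^{n-1} (B_{k+1} - B_k)^2,$$
or equivalently
$$\sum_{k=0}^{n-1} B\left(\frac{kT}{n}\right) \left[B\left(\frac{(k+1)T}{n}\right) - B\left(\frac{kT}{n}\right)\right] = \tfrac{1}{2} B^2(T) - \tfrac{1}{2} \sum_{k=0}^{n-1} \left[B\left(\frac{(k+1)T}{n}\right) - B\left(\frac{kT}{n}\right)\right]^2.$$
Let $n \to \infty$ and use the definition of quadratic variation to get
$$\int_0^T B(u)\,dB(u) = \tfrac{1}{2} B^2(T) - \tfrac{1}{2} T.$$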
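The algebraic identity behind this example holds for any sampled path that starts at zero, and it can be confirmed on a simulated one. In the sketch below, the function names, the seed and the number of steps are illustrative assumptions; the path is built from independent $N(0, T/n)$ increments.

```python
import random

def sampled_path(T, n, rng):
    """Values B_0 = 0, B_1, ..., B_n of a simulated Brownian path on a
    uniform grid of [0, T]."""
    path, dt = [0.0], T / n
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, dt ** 0.5))
    return path

def discrete_ito_sum(path):
    """Left-endpoint sum: sum over k of B_k * (B_{k+1} - B_k)."""
    return sum(path[k] * (path[k + 1] - path[k]) for k in range(len(path) - 1))

rng = random.Random(42)
T, n = 1.0, 50_000
path = sampled_path(T, n, rng)
lhs = discrete_ito_sum(path)
qv = sum((path[k + 1] - path[k]) ** 2 for k in range(n))
rhs_exact = 0.5 * path[-1] ** 2 - 0.5 * qv   # identity valid for every path
rhs_limit = 0.5 * path[-1] ** 2 - 0.5 * T    # limiting value from the text
print(lhs - rhs_exact, lhs - rhs_limit)
```

The first difference is zero up to floating-point rounding; the second is small but random, since it equals $\tfrac{1}{2}(T - Q_\Pi)$ for the sample quadratic variation $Q_\Pi$.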
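Remark 14.4 (Reason for the $\frac{1}{2}T$ term) If $f$ is differentiable with $f(0) = 0$, then
$$\begin{aligned}
\int_0^T f(u)\,df(u) &= \int_0^T f(u) f'(u)\,du \\
&= \tfrac{1}{2} f^2(u) \Big|_0^T \\
&= \tfrac{1}{2} f^2(T).
\end{aligned}$$
In contrast, for Brownian motion, we have
$$\int_0^T B(u)\,dB(u) = \tfrac{1}{2} B^2(T) - \tfrac{1}{2} T.$$
The extra term $\frac{1}{2}T$ comes from the nonzero quadratic variation of Brownian motion. It has to be there, because
$$\mathbb{E} \int_0^T B(u)\,dB(u) = 0 \quad \text{(Itô integral is a martingale)}$$
but
$$\mathbb{E}\, \tfrac{1}{2} B^2(T) = \tfrac{1}{2} T.$$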
14.10
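Quadratic variation of an Itô integral
Theorem 10.48 (Quadratic variation of Itô integral) Let
$$I(t) = \int_0^t \delta(u)\,dB(u).$$
Then
$$\langle I \rangle(t) = \int_0^t \delta^2(u)\,du.$$
This holds even if $\delta$ is not an elementary process. The quadratic variation formula says that at each time $u$, the instantaneous absolute volatility of $I$ is $\delta^2(u)$. This is the absolute volatility of the Brownian motion scaled by the size of the position (i.e. $\delta(t)$) in the Brownian motion. Informally, we can write the quadratic variation formula in differential form as follows:
$$dI(t)\, dI(t) = \delta^2(t)\,dt.$$
Compare this with
$$dB(t)\, dB(t) = dt.$$
Proof: (For an elementary process $\delta$). Let $\Pi = \{t_0, t_1, \dots, t_n\}$ be the partition for $\delta$, i.e., $\delta(t) = \delta(t_k)$ for $t_k \le t \le t_{k+1}$. To simplify notation, assume $t = t_n$. We have
$$\langle I \rangle(t) = \sum_{k=0}^{n-1} \left[\langle I \rangle(t_{k+1}) - \langle I \rangle(t_k)\right].$$
Let us compute $\langle I \rangle(t_{k+1}) - \langle I \rangle(t_k)$. Let $\Pi = \{s_0, s_1, \dots, s_m\}$ be a partition
$$t_k = s_0 \le s_1 \le \dots \le s_m = t_{k+1}.$$
Then
$$\begin{aligned}
I(s_{j+1}) - I(s_j) &= \int_{s_j}^{s_{j+1}} \delta(t_k)\,dB(u) \\
&= \delta(t_k)\,[B(s_{j+1}) - B(s_j)],
\end{aligned}$$
so $\langle I \rangle(t_{k+1}) - \langle I \rangle(t_k)$ is the limit, as $\|\Pi\| \to 0$, of
$$\begin{aligned}
\sum_{j=0}^{m-1} [I(s_{j+1}) - I(s_j)]^2 &= \delta^2(t_k) \sum_{j=0}^{m-1} [B(s_{j+1}) - B(s_j)]^2 \\
&\xrightarrow[\|\Pi\| \to 0]{} \delta^2(t_k)(t_{k+1} - t_k).
\end{aligned}$$
It follows that
$$\begin{aligned}
\langle I \rangle(t) &= \sum_{k=0}^{n-1} \delta^2(t_k)(t_{k+1} - t_k) \\
&= \sum_{k=0}^{n-1} \int_{t_k}^{t_{k+1}} \delta^2(u)\,du \\
&= \int_0^t \delta^2(u)\,du.
\end{aligned}$$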
Chapter 15
Itô’s Formula
15.1
Itô’s formula for one Brownian motion
We want a rule to “differentiate” expressions of the form $f(B(t))$, where $f(x)$ is a differentiable function. If $B(t)$ were also differentiable, then the ordinary chain rule would give
$$\frac{d}{dt} f(B(t)) = f'(B(t))\, B'(t),$$
which could be written in differential notation as
$$\begin{aligned}
df(B(t)) &= f'(B(t))\, B'(t)\,dt \\
&= f'(B(t))\,dB(t).
\end{aligned}$$
However, $B(t)$ is not differentiable, and in particular has nonzero quadratic variation, so the correct formula has an extra term, namely,
$$df(B(t)) = f'(B(t))\,dB(t) + \tfrac{1}{2} f''(B(t)) \underbrace{dt}_{dB(t)\,dB(t)}.$$
This is Itô’s formula in differential form. Integrating this, we obtain Itô’s formula in integral form:
$$f(B(t)) - \underbrace{f(B(0))}_{f(0)} = \int_0^t f'(B(u))\,dB(u) + \tfrac{1}{2} \int_0^t f''(B(u))\,du.$$
Remark 15.1 (Differential vs. Integral Forms) The mathematically meaningful form of Itô’s formula is Itô’s formula in integral form:
$$f(B(t)) - f(B(0)) = \int_0^t f'(B(u))\,dB(u) + \tfrac{1}{2} \int_0^t f''(B(u))\,du.$$
This is because we have solid definitions for both integrals appearing on the right-hand side. The first,
$$\int_0^t f'(B(u))\,dB(u),$$
is an Itô integral, defined in the previous chapter. The second,
$$\int_0^t f''(B(u))\,du,$$
is a Riemann integral, the type used in freshman calculus.
For paper and pencil computations, the more convenient form of Itô’s rule is Itô’s formula in differential form:
$$df(B(t)) = f'(B(t))\,dB(t) + \tfrac{1}{2} f''(B(t))\,dt.$$
There is an intuitive meaning but no solid definition for the terms $df(B(t))$, $dB(t)$ and $dt$ appearing in this formula. This formula becomes mathematically respectable only after we integrate it.
15.2
Derivation of Itô’s formula
Consider $f(x) = \frac{1}{2}x^2$, so that
$$f'(x) = x, \quad f''(x) = 1.$$
Let $x_k$, $x_{k+1}$ be numbers. Taylor’s formula implies
$$f(x_{k+1}) - f(x_k) = (x_{k+1} - x_k) f'(x_k) + \tfrac{1}{2} (x_{k+1} - x_k)^2 f''(x_k).$$
In this case, Taylor’s formula to second order is exact because $f$ is a quadratic function.
In the general case, the above equation is only approximate, and the error is of the order of $(x_{k+1} - x_k)^3$. The total error will have limit zero in the last step of the following argument.
Fix $T > 0$ and let $\Pi = \{t_0, t_1, \dots, t_n\}$ be a partition of $[0,T]$. Using Taylor’s formula, we write:
$$\begin{aligned}
f(B(T)) - f(B(0)) &= \tfrac{1}{2} B^2(T) - \tfrac{1}{2} B^2(0) \\
&= \sum_{k=0}^{n-1} \left[f(B(t_{k+1})) - f(B(t_k))\right] \\
&= \sum_{k=0}^{n-1} [B(t_{k+1}) - B(t_k)]\, f'(B(t_k)) + \tfrac{1}{2} \sum_{k=0}^{n-1} [B(t_{k+1}) - B(t_k)]^2 f''(B(t_k)) \\
&= \sum_{k=0}^{n-1} B(t_k)\,[B(t_{k+1}) - B(t_k)] + \tfrac{1}{2} \sum_{k=0}^{n-1} [B(t_{k+1}) - B(t_k)]^2.
\end{aligned}$$
We let $\|\Pi\| \to 0$ to obtain
$$\begin{aligned}
f(B(T)) - f(B(0)) &= \int_0^T B(u)\,dB(u) + \tfrac{1}{2} \underbrace{\langle B \rangle(T)}_{T} \\
&= \int_0^T f'(B(u))\,dB(u) + \tfrac{1}{2} \int_0^T \underbrace{f''(B(u))}_{1}\,du.
\end{aligned}$$
This is Itô’s formula in integral form for the special case
$$f(x) = \tfrac{1}{2} x^2.$$
15.3
Geometric Brownian motion
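Definition 15.1 (Geometric Brownian Motion) Geometric Brownian motion is
$$S(t) = S(0) \exp\left\{\sigma B(t) + \left(\mu - \tfrac{1}{2}\sigma^2\right) t\right\},$$
where $\mu$ and $\sigma > 0$ are constant.
Define
$$f(t,x) = S(0) \exp\left\{\sigma x + \left(\mu - \tfrac{1}{2}\sigma^2\right) t\right\},$$
so
$$S(t) = f(t, B(t)).$$
Then
$$f_t = \left(\mu - \tfrac{1}{2}\sigma^2\right) f, \quad f_x = \sigma f, \quad f_{xx} = \sigma^2 f.$$
According to Itô’s formula,
$$\begin{aligned}
dS(t) &= df(t, B(t)) \\
&= f_t\,dt + f_x\,dB + \tfrac{1}{2} f_{xx} \underbrace{dB\,dB}_{dt} \\
&= \left(\mu - \tfrac{1}{2}\sigma^2\right) f\,dt + \sigma f\,dB + \tfrac{1}{2}\sigma^2 f\,dt \\
&= \mu S(t)\,dt + \sigma S(t)\,dB(t).
\end{aligned}$$
Thus, Geometric Brownian motion in differential form is
$$dS(t) = \mu S(t)\,dt + \sigma S(t)\,dB(t),$$
and Geometric Brownian motion in integral form is
$$S(t) = S(0) + \int_0^t \mu S(u)\,du + \int_0^t \sigma S(u)\,dB(u).$$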
15.4
Quadratic variation of geometric Brownian motion
In the integral form of Geometric Brownian motion,
$$S(t) = S(0) + \int_0^t \mu S(u)\,du + \int_0^t \sigma S(u)\,dB(u),$$
the Riemann integral
$$F(t) = \int_0^t \mu S(u)\,du$$
is differentiable with $F'(t) = \mu S(t)$. This term has zero quadratic variation. The Itô integral
$$G(t) = \int_0^t \sigma S(u)\,dB(u)$$
is not differentiable. It has quadratic variation
$$\langle G \rangle(t) = \int_0^t \sigma^2 S^2(u)\,du.$$
Thus the quadratic variation of $S$ is given by the quadratic variation of $G$. In differential notation, we write
$$dS(t)\,dS(t) = \left(\mu S(t)\,dt + \sigma S(t)\,dB(t)\right)^2 = \sigma^2 S^2(t)\,dt.$$
15.5
Volatility of Geometric Brownian motion
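Fix $0 \le T_1 \le T_2$. Let $\Pi = \{t_0, \dots, t_n\}$ be a partition of $[T_1, T_2]$. The squared absolute sample volatility of $S$ on $[T_1, T_2]$ is
$$\begin{aligned}
\frac{1}{T_2 - T_1} \sum_{k=0}^{n-1} [S(t_{k+1}) - S(t_k)]^2 &\simeq \frac{1}{T_2 - T_1} \int_{T_1}^{T_2} \sigma^2 S^2(u)\,du \\
&\simeq \sigma^2 S^2(T_1).
\end{aligned}$$
As $T_2 \downarrow T_1$, the above approximation becomes exact. In other words, the instantaneous relative volatility of $S$ is $\sigma^2$. This is usually called simply the volatility of $S$.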
15.6
First derivation of the Black-Scholes formula
Wealth of an investor. An investor begins with nonrandom initial wealth $X_0$ and at each time $t$, holds $\Delta(t)$ shares of stock. Stock is modelled by a geometric Brownian motion:
$$dS(t) = \mu S(t)\,dt + \sigma S(t)\,dB(t).$$
$\Delta(t)$ can be random, but must be adapted. The investor finances his investing by borrowing or lending at interest rate $r$.
Let $X(t)$ denote the wealth of the investor at time $t$. Then
$$\begin{aligned}
dX(t) &= \Delta(t)\,dS(t) + r\left[X(t) - \Delta(t)S(t)\right] dt \\
&= \Delta(t)\left[\mu S(t)\,dt + \sigma S(t)\,dB(t)\right] + r\left[X(t) - \Delta(t)S(t)\right] dt \\
&= rX(t)\,dt + \Delta(t)S(t) \underbrace{(\mu - r)}_{\text{Risk premium}} dt + \Delta(t)S(t)\sigma\,dB(t).
\end{aligned}$$
Value of an option. Consider a European option which pays $g(S(T))$ at time $T$. Let $v(t,x)$ denote the value of this option at time $t$ if the stock price is $S(t) = x$. In other words, the value of the option at each time $t \in [0,T]$ is
$$v(t, S(t)).$$
The differential of this value is
$$\begin{aligned}
dv(t, S(t)) &= v_t\,dt + v_x\,dS + \tfrac{1}{2} v_{xx}\,dS\,dS \\
&= v_t\,dt + v_x\left[\mu S\,dt + \sigma S\,dB\right] + \tfrac{1}{2} v_{xx} \sigma^2 S^2\,dt \\
&= \left[v_t + \mu S v_x + \tfrac{1}{2}\sigma^2 S^2 v_{xx}\right] dt + \sigma S v_x\,dB.
\end{aligned}$$
A hedging portfolio starts with some initial wealth $X_0$ and invests so that the wealth $X(t)$ at each time tracks $v(t, S(t))$. We saw above that
$$dX(t) = \left[rX + \Delta(\mu - r)S\right] dt + \sigma S \Delta\,dB.$$
To ensure that $X(t) = v(t, S(t))$ for all $t$, we equate coefficients in their differentials. Equating the $dB$ coefficients, we obtain the $\Delta$-hedging rule:
$$\Delta(t) = v_x(t, S(t)).$$
Equating the $dt$ coefficients, we obtain:
$$v_t + \mu S v_x + \tfrac{1}{2}\sigma^2 S^2 v_{xx} = rX + \Delta(\mu - r)S.$$
But we have set $\Delta = v_x$, and we are seeking to cause $X$ to agree with $v$. Making these substitutions, we obtain
$$v_t + \mu S v_x + \tfrac{1}{2}\sigma^2 S^2 v_{xx} = rv + v_x(\mu - r)S,$$
(where $v = v(t, S(t))$ and $S = S(t)$) which simplifies to
$$v_t + rS v_x + \tfrac{1}{2}\sigma^2 S^2 v_{xx} = rv.$$
In conclusion, we should let $v$ be the solution to the Black-Scholes partial differential equation
$$v_t(t,x) + rx\,v_x(t,x) + \tfrac{1}{2}\sigma^2 x^2 v_{xx}(t,x) = rv(t,x)$$
satisfying the terminal condition
$$v(T,x) = g(x).$$
If an investor starts with $X_0 = v(0, S(0))$ and uses the hedge $\Delta(t) = v_x(t, S(t))$, then he will have $X(t) = v(t, S(t))$ for all $t$, and in particular, $X(T) = g(S(T))$.
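15.7
Mean and variance of the Cox-Ingersoll-Ross process
The Cox-Ingersoll-Ross model for interest rates is
$$dr(t) = a\left(b - c\,r(t)\right) dt + \sigma \sqrt{r(t)}\,dB(t),$$
where $a$, $b$, $c$, $\sigma$ and $r(0)$ are positive constants. In integral form, this equation is
$$r(t) = r(0) + a \int_0^t \left(b - c\,r(u)\right) du + \sigma \int_0^t \sqrt{r(u)}\,dB(u).$$
We apply Itô’s formula to compute $dr^2(t)$. This is $df(r(t))$, where $f(x) = x^2$. We obtain
$$\begin{aligned}
dr^2(t) &= df(r(t)) \\
&= f'(r(t))\,dr(t) + \tfrac{1}{2} f''(r(t))\,dr(t)\,dr(t) \\
&= 2r(t)\left[a(b - c\,r(t))\,dt + \sigma\sqrt{r(t)}\,dB(t)\right] + \left[a(b - c\,r(t))\,dt + \sigma\sqrt{r(t)}\,dB(t)\right]^2 \\
&= 2ab\,r(t)\,dt - 2ac\,r^2(t)\,dt + 2\sigma r^{3/2}(t)\,dB(t) + \sigma^2 r(t)\,dt \\
&= (2ab + \sigma^2)\,r(t)\,dt - 2ac\,r^2(t)\,dt + 2\sigma r^{3/2}(t)\,dB(t).
\end{aligned}$$
The mean of $r(t)$. The integral form of the CIR equation is
$$r(t) = r(0) + a \int_0^t \left(b - c\,r(u)\right) du + \sigma \int_0^t \sqrt{r(u)}\,dB(u).$$
Taking expectations and remembering that the expectation of an Itô integral is zero, we obtain
$$\mathbb{E} r(t) = r(0) + a \int_0^t \left(b - c\,\mathbb{E} r(u)\right) du.$$
Differentiation yields
$$\frac{d}{dt} \mathbb{E} r(t) = a\left(b - c\,\mathbb{E} r(t)\right) = ab - ac\,\mathbb{E} r(t),$$
which implies that
$$\frac{d}{dt}\left[e^{act}\, \mathbb{E} r(t)\right] = e^{act}\left[ac\,\mathbb{E} r(t) + \frac{d}{dt} \mathbb{E} r(t)\right] = e^{act}\,ab.$$
Integration yields
$$e^{act}\, \mathbb{E} r(t) - r(0) = ab \int_0^t e^{acu}\,du = \frac{b}{c}\left(e^{act} - 1\right).$$
We solve for $\mathbb{E} r(t)$:
$$\mathbb{E} r(t) = \frac{b}{c} + e^{-act}\left(r(0) - \frac{b}{c}\right).$$
If $r(0) = \frac{b}{c}$, then $\mathbb{E} r(t) = \frac{b}{c}$ for every $t$. If $r(0) \ne \frac{b}{c}$, then $r(t)$ exhibits mean reversion:
$$\lim_{t \to \infty} \mathbb{E} r(t) = \frac{b}{c}.$$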
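Variance of $r(t)$. The integral form of the equation derived earlier for $dr^2(t)$ is
$$r^2(t) = r^2(0) + (2ab + \sigma^2) \int_0^t r(u)\,du - 2ac \int_0^t r^2(u)\,du + 2\sigma \int_0^t r^{3/2}(u)\,dB(u).$$
Taking expectations, we obtain
$$\mathbb{E} r^2(t) = r^2(0) + (2ab + \sigma^2) \int_0^t \mathbb{E} r(u)\,du - 2ac \int_0^t \mathbb{E} r^2(u)\,du.$$
Differentiation yields
$$\frac{d}{dt} \mathbb{E} r^2(t) = (2ab + \sigma^2)\,\mathbb{E} r(t) - 2ac\,\mathbb{E} r^2(t),$$
which implies that
$$\begin{aligned}
\frac{d}{dt}\, e^{2act}\, \mathbb{E} r^2(t) &= e^{2act}\left[2ac\,\mathbb{E} r^2(t) + \frac{d}{dt} \mathbb{E} r^2(t)\right] \\
&= e^{2act}\,(2ab + \sigma^2)\,\mathbb{E} r(t).
\end{aligned}$$
Using the formula already derived for $\mathbb{E} r(t)$ and integrating the last equation, after considerable algebra we obtain
$$\begin{aligned}
\mathbb{E} r^2(t) &= \frac{b\sigma^2}{2ac^2} + \frac{b^2}{c^2} + \left(r(0) - \frac{b}{c}\right)\left(\frac{\sigma^2}{ac} + \frac{2b}{c}\right) e^{-act} \\
&\quad + \left(r(0) - \frac{b}{c}\right)^2 e^{-2act} + \frac{\sigma^2}{ac}\left(\frac{b}{2c} - r(0)\right) e^{-2act}.
\end{aligned}$$
$$\begin{aligned}
\mathrm{var}\, r(t) &= \mathbb{E} r^2(t) - \left(\mathbb{E} r(t)\right)^2 \\
&= \frac{b\sigma^2}{2ac^2} + \left(r(0) - \frac{b}{c}\right) \frac{\sigma^2}{ac}\, e^{-act} + \frac{\sigma^2}{ac}\left(\frac{b}{2c} - r(0)\right) e^{-2act}.
\end{aligned}$$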
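The closed forms for the mean and variance are easy to mistype, so a numerical consistency check is useful. The sketch below encodes both formulas and compares central finite differences against two ODEs: $\frac{d}{dt}\mathbb{E}r = ab - ac\,\mathbb{E}r$ from the text, and $\frac{d}{dt}\mathrm{var}\,r = \sigma^2\,\mathbb{E}r - 2ac\,\mathrm{var}\,r$, which follows by combining the two moment equations above. The function names and the parameter values are illustrative assumptions.

```python
import math

def cir_mean(t, r0, a, b, c):
    """E r(t) for dr = a(b - c r) dt + sigma sqrt(r) dB."""
    return b / c + math.exp(-a * c * t) * (r0 - b / c)

def cir_variance(t, r0, a, b, c, sigma):
    """var r(t), as given by the closed form in the text."""
    s = sigma ** 2 / (a * c)
    return (b * sigma ** 2 / (2 * a * c ** 2)
            + (r0 - b / c) * s * math.exp(-a * c * t)
            + (b / (2 * c) - r0) * s * math.exp(-2 * a * c * t))

# Hypothetical parameter values, chosen only for the check.
r0, a, b, c, sigma = 0.03, 0.5, 0.04, 1.0, 0.1
t, h = 1.3, 1e-5

# Central differences approximate the time derivatives.
dmean = (cir_mean(t + h, r0, a, b, c) - cir_mean(t - h, r0, a, b, c)) / (2 * h)
dvar = (cir_variance(t + h, r0, a, b, c, sigma)
        - cir_variance(t - h, r0, a, b, c, sigma)) / (2 * h)

mean_residual = dmean - (a * b - a * c * cir_mean(t, r0, a, b, c))
var_residual = dvar - (sigma ** 2 * cir_mean(t, r0, a, b, c)
                       - 2 * a * c * cir_variance(t, r0, a, b, c, sigma))
print(mean_residual, var_residual)          # both should be near zero
print(cir_variance(0.0, r0, a, b, c, sigma))  # variance vanishes at t = 0
```

For large $t$ the mean tends to $b/c$ and the variance to $b\sigma^2/(2ac^2)$, the constant terms of the two formulas.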
15.8
Multidimensional Brownian Motion
Definition 15.2 (d-dimensional Brownian Motion) A $d$-dimensional Brownian Motion is a process
$$B(t) = (B_1(t), \dots, B_d(t))$$
with the following properties:
Each $B_k(t)$ is a one-dimensional Brownian motion;
If $i \ne j$, then the processes $B_i(t)$ and $B_j(t)$ are independent.
Associated with a $d$-dimensional Brownian motion, we have a filtration $\{\mathcal{F}(t)\}$ such that
For each $t$, the random vector $B(t)$ is $\mathcal{F}(t)$-measurable;
For each $t \le t_1 \le \dots \le t_n$, the vector increments
$$B(t_1) - B(t),\ \dots,\ B(t_n) - B(t_{n-1})$$
are independent of $\mathcal{F}(t)$.
15.9
Cross-variations of Brownian motions
Because each component $B_i$ is a one-dimensional Brownian motion, we have the informal equation
$$dB_i(t)\,dB_i(t) = dt.$$
However, we have:
Theorem 9.49 If $i \ne j$,
$$dB_i(t)\,dB_j(t) = 0.$$
Proof: Let $\Pi = \{t_0, \dots, t_n\}$ be a partition of $[0,T]$. For $i \ne j$, define the sample cross variation of $B_i$ and $B_j$ on $[0,T]$ to be
$$C_\Pi = \sum_{k=0}^{n-1} [B_i(t_{k+1}) - B_i(t_k)]\,[B_j(t_{k+1}) - B_j(t_k)].$$
The increments appearing on the right-hand side of the above equation are all independent of one another and all have mean zero. Therefore,
$$\mathbb{E} C_\Pi = 0.$$
We compute $\mathrm{var}(C_\Pi)$. First note that
$$\begin{aligned}
C_\Pi^2 &= \sum_{k=0}^{n-1} [B_i(t_{k+1}) - B_i(t_k)]^2\,[B_j(t_{k+1}) - B_j(t_k)]^2 \\
&\quad + 2 \sum_{\ell < k} [B_i(t_{\ell+1}) - B_i(t_\ell)]\,[B_j(t_{\ell+1}) - B_j(t_\ell)] \cdot [B_i(t_{k+1}) - B_i(t_k)]\,[B_j(t_{k+1}) - B_j(t_k)].
\end{aligned}$$
All the increments appearing in the sum of cross terms are independent of one another and have mean zero. Therefore,
$$\begin{aligned}
\mathrm{var}(C_\Pi) &= \mathbb{E} C_\Pi^2 \\
&= \mathbb{E} \sum_{k=0}^{n-1} [B_i(t_{k+1}) - B_i(t_k)]^2\,[B_j(t_{k+1}) - B_j(t_k)]^2.
\end{aligned}$$
But $[B_i(t_{k+1}) - B_i(t_k)]^2$ and $[B_j(t_{k+1}) - B_j(t_k)]^2$ are independent of one another, and each has expectation $(t_{k+1} - t_k)$. It follows that
$$\mathrm{var}(C_\Pi) = \sum_{k=0}^{n-1} (t_{k+1} - t_k)^2 \le \|\Pi\| \sum_{k=0}^{n-1} (t_{k+1} - t_k) = \|\Pi\| \cdot T.$$
As $\|\Pi\| \to 0$, we have $\mathrm{var}(C_\Pi) \to 0$, so $C_\Pi$ converges to the constant $\mathbb{E} C_\Pi = 0$.
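A short simulation shows the sample cross variation collapsing toward zero. This sketch assumes a uniform partition and two independent streams of $N(0, \Delta t)$ increments drawn from Python's `random` module; the function name and seed are arbitrary.

```python
import random

def sample_cross_variation(T, n, rng):
    """Sum over a uniform n-step partition of [0, T] of the products of
    increments of two independent simulated Brownian motions."""
    sd = (T / n) ** 0.5
    return sum(rng.gauss(0.0, sd) * rng.gauss(0.0, sd) for _ in range(n))

rng = random.Random(7)
c_coarse = sample_cross_variation(1.0, 100, rng)
c_fine = sample_cross_variation(1.0, 100_000, rng)
print(c_coarse, c_fine)  # the fine-mesh value should be close to 0
```

For equal steps the proof gives $\mathrm{var}(C_\Pi) = T^2/n$, so the typical size of the fine-mesh value here is about $0.003$.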
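15.10
Multi-dimensional Itô formula
To keep the notation as simple as possible, we write the Itô formula for two processes driven by a two-dimensional Brownian motion. The formula generalizes to any number of processes driven by a Brownian motion of any number (not necessarily the same number) of dimensions.
Let $X$ and $Y$ be processes of the form
$$X(t) = X(0) + \int_0^t \alpha(u)\,du + \int_0^t \sigma_{11}(u)\,dB_1(u) + \int_0^t \sigma_{12}(u)\,dB_2(u),$$
$$Y(t) = Y(0) + \int_0^t \beta(u)\,du + \int_0^t \sigma_{21}(u)\,dB_1(u) + \int_0^t \sigma_{22}(u)\,dB_2(u).$$
Such processes, consisting of a nonrandom initial condition, plus a Riemann integral, plus one or more Itô integrals, are called semimartingales. The integrands $\alpha(u)$, $\beta(u)$, and $\sigma_{ij}(u)$ can be any adapted processes. The adaptedness of the integrands guarantees that $X$ and $Y$ are also adapted. In differential notation, we write
$$dX = \alpha\,dt + \sigma_{11}\,dB_1 + \sigma_{12}\,dB_2,$$
$$dY = \beta\,dt + \sigma_{21}\,dB_1 + \sigma_{22}\,dB_2.$$
Given these two semimartingales $X$ and $Y$, the quadratic and cross variations are:
$$\begin{aligned}
dX\,dX &= (\alpha\,dt + \sigma_{11}\,dB_1 + \sigma_{12}\,dB_2)^2 \\
&= \sigma_{11}^2 \underbrace{dB_1\,dB_1}_{dt} + 2\sigma_{11}\sigma_{12} \underbrace{dB_1\,dB_2}_{0} + \sigma_{12}^2 \underbrace{dB_2\,dB_2}_{dt} \\
&= (\sigma_{11}^2 + \sigma_{12}^2)\,dt,
\end{aligned}$$
$$\begin{aligned}
dY\,dY &= (\beta\,dt + \sigma_{21}\,dB_1 + \sigma_{22}\,dB_2)^2 \\
&= (\sigma_{21}^2 + \sigma_{22}^2)\,dt,
\end{aligned}$$
$$\begin{aligned}
dX\,dY &= (\alpha\,dt + \sigma_{11}\,dB_1 + \sigma_{12}\,dB_2)(\beta\,dt + \sigma_{21}\,dB_1 + \sigma_{22}\,dB_2) \\
&= (\sigma_{11}\sigma_{21} + \sigma_{12}\sigma_{22})\,dt.
\end{aligned}$$
Let $f(t,x,y)$ be a function of three variables, and let $X(t)$ and $Y(t)$ be semimartingales. Then we have the corresponding Itô formula:
$$df(t,X,Y) = f_t\,dt + f_x\,dX + f_y\,dY + \tfrac{1}{2}\left[f_{xx}\,dX\,dX + 2f_{xy}\,dX\,dY + f_{yy}\,dY\,dY\right].$$
In integral form, with $X$ and $Y$ as described earlier and with all the variables filled in, this equation is
$$\begin{aligned}
f(t, X(t), Y(t)) - f(0, X(0), Y(0))
&= \int_0^t \left[f_t + \alpha f_x + \beta f_y + \tfrac{1}{2}(\sigma_{11}^2 + \sigma_{12}^2) f_{xx} + (\sigma_{11}\sigma_{21} + \sigma_{12}\sigma_{22}) f_{xy} + \tfrac{1}{2}(\sigma_{21}^2 + \sigma_{22}^2) f_{yy}\right] du \\
&\quad + \int_0^t \left[\sigma_{11} f_x + \sigma_{21} f_y\right] dB_1 + \int_0^t \left[\sigma_{12} f_x + \sigma_{22} f_y\right] dB_2,
\end{aligned}$$
where $f = f(u, X(u), Y(u))$, and for $i, j \in \{1, 2\}$, $\sigma_{ij} = \sigma_{ij}(u)$ and $B_i = B_i(u)$.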
Chapter 16
Markov processes and the Kolmogorov
equations
16.1
Stochastic Differential Equations
Consider the stochastic differential equation:
$$dX(t) = a(t, X(t))\,dt + \sigma(t, X(t))\,dB(t). \tag{SDE}$$
Here $a(t,x)$ and $\sigma(t,x)$ are given functions, usually assumed to be continuous in $(t,x)$ and Lipschitz continuous in $x$, i.e., there is a constant $L$ such that
$$|a(t,x) - a(t,y)| \le L|x - y|, \quad |\sigma(t,x) - \sigma(t,y)| \le L|x - y|$$
for all $t, x, y$.
Let $(t_0, x)$ be given. A solution to (SDE) with the initial condition $(t_0, x)$ is a process $\{X(t)\}_{t \ge t_0}$ satisfying
$$X(t_0) = x,$$
$$X(t) = X(t_0) + \int_{t_0}^t a(s, X(s))\,ds + \int_{t_0}^t \sigma(s, X(s))\,dB(s), \quad t \ge t_0.$$
The solution process $\{X(t)\}_{t \ge t_0}$ will be adapted to the filtration $\{\mathcal{F}(t)\}_{t \ge 0}$ generated by the Brownian motion. If you know the path of the Brownian motion up to time $t$, then you can evaluate $X(t)$.
Example 16.1 (Drifted Brownian motion) Let $a$ be a constant and $\sigma = 1$, so
$$dX(t) = a\,dt + dB(t).$$
If $(t_0, x)$ is given and we start with the initial condition
$$X(t_0) = x,$$
then
$$X(t) = x + a(t - t_0) + (B(t) - B(t_0)), \quad t \ge t_0.$$
To compute the differential w.r.t. $t$, treat $t_0$ and $B(t_0)$ as constants:
$$dX(t) = a\,dt + dB(t).$$
Example 16.2 (Geometric Brownian motion) Let $r$ and $\sigma$ be constants. Consider
$$dX(t) = rX(t)\,dt + \sigma X(t)\,dB(t).$$
Given the initial condition
$$X(t_0) = x,$$
the solution is
$$X(t) = x \exp\left\{\sigma(B(t) - B(t_0)) + \left(r - \tfrac{1}{2}\sigma^2\right)(t - t_0)\right\}.$$
Again, to compute the differential w.r.t. $t$, treat $t_0$ and $B(t_0)$ as constants:
$$\begin{aligned}
dX(t) &= \left(r - \tfrac{1}{2}\sigma^2\right) X(t)\,dt + \sigma X(t)\,dB(t) + \tfrac{1}{2}\sigma^2 X(t)\,dt \\
&= rX(t)\,dt + \sigma X(t)\,dB(t).
\end{aligned}$$
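The statement that knowing the Brownian path lets you evaluate $X(t)$ can be made concrete for Example 16.2. The sketch below drives a simple Euler discretization of the SDE (a standard scheme, not one described in these notes) and the closed-form solution with the same simulated increments, taking $t_0 = 0$. The function name, parameters and seed are assumptions made for illustration.

```python
import math
import random

def gbm_exact_and_euler(x, r, sigma, T, n, rng):
    """Return (exact, euler) values of X(T) for dX = r X dt + sigma X dB,
    X(0) = x, both computed from the same simulated Brownian increments."""
    dt = T / n
    b = 0.0        # running value of B(t)
    euler = x      # Euler approximation of X(t)
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(dt))
        euler += r * euler * dt + sigma * euler * db
        b += db
    exact = x * math.exp(sigma * b + (r - 0.5 * sigma ** 2) * T)
    return exact, euler

rng = random.Random(11)
exact, euler = gbm_exact_and_euler(100.0, 0.05, 0.2, 1.0, 20_000, rng)
print(exact, euler)  # the two values should nearly coincide
```

The gap between the two numbers shrinks as the time step is refined, which is consistent with the exponential formula solving the SDE path by path.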
16.2
Markov Property
Let $0 \le t_0 \le t_1$ be given and let $h(y)$ be a function. Denote by
$$\mathbb{E}^{t_0,x} h(X(t_1))$$
the expectation of $h(X(t_1))$, given that $X(t_0) = x$. Now let $\xi \in \mathbb{R}$ be given, and start with initial condition
$$X(0) = \xi.$$
We have the Markov property
$$\mathbb{E}^{0,\xi}\left[h(X(t_1)) \,\big|\, \mathcal{F}(t_0)\right] = \mathbb{E}^{t_0, X(t_0)} h(X(t_1)).$$
In other words, if you observe the path of the driving Brownian motion from time 0 to time $t_0$, and based on this information, you want to estimate $h(X(t_1))$, the only relevant information is the value of $X(t_0)$. You imagine starting the SDE at time $t_0$ at value $X(t_0)$, and compute the expected value of $h(X(t_1))$.
16.3
Transition density
Denote by
$$p(t_0, t_1; x, y)$$
the density (in the $y$ variable) of $X(t_1)$, conditioned on $X(t_0) = x$. In other words,
$$\mathbb{E}^{t_0,x} h(X(t_1)) = \int_{\mathbb{R}} h(y)\, p(t_0, t_1; x, y)\,dy.$$
The Markov property says that for $0 \le t_0 \le t_1$ and for every $\xi$,
$$\mathbb{E}^{0,\xi}\left[h(X(t_1)) \,\big|\, \mathcal{F}(t_0)\right] = \int_{\mathbb{R}} h(y)\, p(t_0, t_1; X(t_0), y)\,dy.$$
Example 16.3 (Drifted Brownian motion) Consider the SDE
$$dX(t) = a\,dt + dB(t).$$
Conditioned on $X(t_0) = x$, the random variable $X(t_1)$ is normal with mean $x + a(t_1 - t_0)$ and variance $(t_1 - t_0)$, i.e.,
$$p(t_0, t_1; x, y) = \frac{1}{\sqrt{2\pi(t_1 - t_0)}} \exp\left\{-\frac{\left(y - (x + a(t_1 - t_0))\right)^2}{2(t_1 - t_0)}\right\}.$$
Note that $p$ depends on $t_0$ and $t_1$ only through their difference $t_1 - t_0$. This is always the case when $a(t,x)$ and $\sigma(t,x)$ don’t depend on $t$.
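Example 16.4 (Geometric Brownian motion) Recall that the solution to the SDE
$$dX(t) = rX(t)\,dt + \sigma X(t)\,dB(t),$$
with initial condition $X(t_0) = x$, is Geometric Brownian motion:
$$X(t_1) = x \exp\left\{\sigma(B(t_1) - B(t_0)) + \left(r - \tfrac{1}{2}\sigma^2\right)(t_1 - t_0)\right\}.$$
The random variable $B(t_1) - B(t_0)$ has density
$$\mathbb{P}\{B(t_1) - B(t_0) \in db\} = \frac{1}{\sqrt{2\pi(t_1 - t_0)}} \exp\left\{-\frac{b^2}{2(t_1 - t_0)}\right\} db,$$
and we are making the change of variable
$$y = x \exp\left\{\sigma b + \left(r - \tfrac{1}{2}\sigma^2\right)(t_1 - t_0)\right\}$$
or equivalently,
$$b = \frac{1}{\sigma}\left[\log\frac{y}{x} - \left(r - \tfrac{1}{2}\sigma^2\right)(t_1 - t_0)\right].$$
The derivative is
$$\frac{dy}{db} = \sigma y,$$
or equivalently,
$$db = \frac{dy}{\sigma y}.$$
Therefore,
$$\begin{aligned}
p(t_0, t_1; x, y)\,dy &= \mathbb{P}\{X(t_1) \in dy\} \\
&= \frac{1}{\sigma y \sqrt{2\pi(t_1 - t_0)}} \exp\left\{-\frac{1}{2(t_1 - t_0)\sigma^2}\left[\log\frac{y}{x} - \left(r - \tfrac{1}{2}\sigma^2\right)(t_1 - t_0)\right]^2\right\} dy.
\end{aligned}$$
Using the transition density and a fair amount of calculus, one can compute the expected payoff from a European call:
$$\begin{aligned}
\mathbb{E}^{t,x}(X(T) - K)^+ &= \int_0^\infty (y - K)^+\, p(t, T; x, y)\,dy \\
&= e^{r(T-t)}\, x\, N\left(\frac{1}{\sigma\sqrt{T-t}}\left[\log\frac{x}{K} + r(T-t) + \tfrac{1}{2}\sigma^2(T-t)\right]\right) \\
&\quad - K\, N\left(\frac{1}{\sigma\sqrt{T-t}}\left[\log\frac{x}{K} + r(T-t) - \tfrac{1}{2}\sigma^2(T-t)\right]\right),
\end{aligned}$$
where
$$N(\eta) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\eta} e^{-\frac{1}{2}x^2}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\eta}^{\infty} e^{-\frac{1}{2}x^2}\,dx.$$
Therefore,
$$\begin{aligned}
\mathbb{E}^{0,\xi}\left[e^{-r(T-t)}(X(T) - K)^+ \,\big|\, \mathcal{F}(t)\right] &= e^{-r(T-t)}\, \mathbb{E}^{t, X(t)}(X(T) - K)^+ \\
&= X(t)\, N\left(\frac{1}{\sigma\sqrt{T-t}}\left[\log\frac{X(t)}{K} + r(T-t) + \tfrac{1}{2}\sigma^2(T-t)\right]\right) \\
&\quad - e^{-r(T-t)} K\, N\left(\frac{1}{\sigma\sqrt{T-t}}\left[\log\frac{X(t)}{K} + r(T-t) - \tfrac{1}{2}\sigma^2(T-t)\right]\right).
\end{aligned}$$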
16.4
The Kolmogorov Backward Equation
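Consider
$$dX(t) = a(t, X(t))\,dt + \sigma(t, X(t))\,dB(t),$$
and let $p(t_0, t_1; x, y)$ be the transition density. Then the Kolmogorov Backward Equation is:
$$-\frac{\partial}{\partial t_0}\, p(t_0, t_1; x, y) = a(t_0, x) \frac{\partial}{\partial x}\, p(t_0, t_1; x, y) + \tfrac{1}{2}\sigma^2(t_0, x) \frac{\partial^2}{\partial x^2}\, p(t_0, t_1; x, y). \tag{KBE}$$
The variables $t_0$ and $x$ in (KBE) are called the backward variables.
In the case that $a$ and $\sigma$ are functions of $x$ alone, $p(t_0, t_1; x, y)$ depends on $t_0$ and $t_1$ only through their difference $\tau = t_1 - t_0$. We then write $p(\tau; x, y)$ rather than $p(t_0, t_1; x, y)$, and (KBE) becomes
$$\frac{\partial}{\partial \tau}\, p(\tau; x, y) = a(x) \frac{\partial}{\partial x}\, p(\tau; x, y) + \tfrac{1}{2}\sigma^2(x) \frac{\partial^2}{\partial x^2}\, p(\tau; x, y). \tag{KBE'}$$
Example 16.5 (Drifted Brownian motion)
$$dX(t) = a\,dt + dB(t),$$
$$p(\tau; x, y) = \frac{1}{\sqrt{2\pi\tau}} \exp\left\{-\frac{(y - (x + a\tau))^2}{2\tau}\right\}.$$
$$\begin{aligned}
\frac{\partial}{\partial \tau} p = p_\tau &= \left(\frac{\partial}{\partial \tau} \frac{1}{\sqrt{2\pi\tau}}\right) \exp\left\{-\frac{(y - x - a\tau)^2}{2\tau}\right\} \\
&\quad - \left(\frac{\partial}{\partial \tau} \frac{(y - x - a\tau)^2}{2\tau}\right) \frac{1}{\sqrt{2\pi\tau}} \exp\left\{-\frac{(y - x - a\tau)^2}{2\tau}\right\} \\
&= \left[-\frac{1}{2\tau} + \frac{a(y - x - a\tau)}{\tau} + \frac{(y - x - a\tau)^2}{2\tau^2}\right] p.
\end{aligned}$$
$$\frac{\partial}{\partial x} p = p_x = \frac{y - x - a\tau}{\tau}\, p.$$
$$\begin{aligned}
\frac{\partial^2}{\partial x^2} p = p_{xx} &= \left(\frac{\partial}{\partial x} \frac{y - x - a\tau}{\tau}\right) p + \frac{y - x - a\tau}{\tau}\, p_x \\
&= -\frac{1}{\tau}\, p + \frac{(y - x - a\tau)^2}{\tau^2}\, p.
\end{aligned}$$
Therefore,
$$\begin{aligned}
a p_x + \tfrac{1}{2} p_{xx} &= \left[\frac{a(y - x - a\tau)}{\tau} - \frac{1}{2\tau} + \frac{(y - x - a\tau)^2}{2\tau^2}\right] p \\
&= p_\tau.
\end{aligned}$$
This is the Kolmogorov backward equation.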
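The hand computation in Example 16.5 can be cross-checked with finite differences. In the sketch below, the function names, the step size and the evaluation point are arbitrary illustrative choices; the derivatives $p_\tau$, $p_x$ and $p_{xx}$ are approximated by central differences and inserted into (KBE').

```python
import math

def p(tau, x, y, a):
    """Transition density of dX = a dt + dB over elapsed time tau."""
    return (math.exp(-(y - x - a * tau) ** 2 / (2 * tau))
            / math.sqrt(2 * math.pi * tau))

def kbe_residual(tau, x, y, a, h=1e-4):
    """p_tau - (a p_x + 0.5 p_xx), with central-difference derivatives."""
    p_tau = (p(tau + h, x, y, a) - p(tau - h, x, y, a)) / (2 * h)
    p_x = (p(tau, x + h, y, a) - p(tau, x - h, y, a)) / (2 * h)
    p_xx = (p(tau, x + h, y, a) - 2 * p(tau, x, y, a)
            + p(tau, x - h, y, a)) / h ** 2
    return p_tau - (a * p_x + 0.5 * p_xx)

# Hypothetical evaluation point.
residual = kbe_residual(tau=0.7, x=0.2, y=0.9, a=0.5)
print(residual)  # should be close to zero
```

The residual is not exactly zero because of truncation and rounding in the difference quotients, but it is several orders of magnitude smaller than the individual terms.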
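Example 16.6 (Geometric Brownian motion)
$$dX(t) = rX(t)\,dt + \sigma X(t)\,dB(t).$$
$$p(\tau; x, y) = \frac{1}{\sigma y \sqrt{2\pi\tau}} \exp\left\{-\frac{1}{2\tau\sigma^2}\left[\log\frac{y}{x} - \left(r - \tfrac{1}{2}\sigma^2\right)\tau\right]^2\right\}.$$
It is true but very tedious to verify that $p$ satisfies the KBE
$$p_\tau = rx\,p_x + \tfrac{1}{2}\sigma^2 x^2 p_{xx}.$$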
16.5
Connection between stochastic calculus and KBE
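Consider
$$dX(t) = a(X(t))\,dt + \sigma(X(t))\,dB(t). \tag{5.1}$$
Let $h(y)$ be a function, and define
$$v(t,x) = \mathbb{E}^{t,x} h(X(T)),$$
where $0 \le t \le T$. Then
$$\begin{aligned}
v(t,x) &= \int h(y)\, p(T - t; x, y)\,dy, \\
v_t(t,x) &= -\int h(y)\, p_\tau(T - t; x, y)\,dy, \\
v_x(t,x) &= \int h(y)\, p_x(T - t; x, y)\,dy, \\
v_{xx}(t,x) &= \int h(y)\, p_{xx}(T - t; x, y)\,dy.
\end{aligned}$$
Therefore, the Kolmogorov backward equation implies
$$\begin{aligned}
&v_t(t,x) + a(x) v_x(t,x) + \tfrac{1}{2}\sigma^2(x) v_{xx}(t,x) \\
&\quad = \int h(y)\left[-p_\tau(T - t; x, y) + a(x) p_x(T - t; x, y) + \tfrac{1}{2}\sigma^2(x) p_{xx}(T - t; x, y)\right] dy = 0.
\end{aligned}$$
Let $(0, \xi)$ be an initial condition for the SDE (5.1). We simplify notation by writing $\mathbb{E}$ rather than $\mathbb{E}^{0,\xi}$.
Theorem 5.50 Starting at $X(0) = \xi$, the process $v(t, X(t))$ satisfies the martingale property:
$$\mathbb{E}\left[v(t, X(t)) \,\big|\, \mathcal{F}(s)\right] = v(s, X(s)), \quad 0 \le s \le t \le T.$$
Proof: According to the Markov property,
$$\mathbb{E}\left[h(X(T)) \,\big|\, \mathcal{F}(t)\right] = \mathbb{E}^{t, X(t)} h(X(T)) = v(t, X(t)),$$
so
$$\begin{aligned}
\mathbb{E}\left[v(t, X(t)) \,\big|\, \mathcal{F}(s)\right] &= \mathbb{E}\left[\mathbb{E}\left[h(X(T)) \,\big|\, \mathcal{F}(t)\right] \,\Big|\, \mathcal{F}(s)\right] \\
&= \mathbb{E}\left[h(X(T)) \,\big|\, \mathcal{F}(s)\right] \\
&= \mathbb{E}^{s, X(s)} h(X(T)) \quad \text{(Markov property)} \\
&= v(s, X(s)).
\end{aligned}$$
Itô’s formula implies
$$\begin{aligned}
dv(t, X(t)) &= v_t\,dt + v_x\,dX + \tfrac{1}{2} v_{xx}\,dX\,dX \\
&= v_t\,dt + a v_x\,dt + \sigma v_x\,dB + \tfrac{1}{2}\sigma^2 v_{xx}\,dt.
\end{aligned}$$
In integral form, we have
$$\begin{aligned}
v(t, X(t)) &= v(0, X(0)) \\
&\quad + \int_0^t \left[v_t(u, X(u)) + a(X(u))\, v_x(u, X(u)) + \tfrac{1}{2}\sigma^2(X(u))\, v_{xx}(u, X(u))\right] du \\
&\quad + \int_0^t \sigma(X(u))\, v_x(u, X(u))\,dB(u).
\end{aligned}$$
We know that $v(t, X(t))$ is a martingale, so the integral $\int_0^t \left[v_t + a v_x + \tfrac{1}{2}\sigma^2 v_{xx}\right] du$ must be zero for all $t$. This implies that the integrand is zero; hence
$$v_t + a v_x + \tfrac{1}{2}\sigma^2 v_{xx} = 0.$$
Thus by two different arguments, one based on the Kolmogorov backward equation, and the other based on Itô’s formula, we have come to the same conclusion.
Theorem 5.51 (Feynman-Kac) Define
$$v(t,x) = \mathbb{E}^{t,x} h(X(T)), \quad 0 \le t \le T,$$
where
$$dX(t) = a(X(t))\,dt + \sigma(X(t))\,dB(t).$$
Then
$$v_t(t,x) + a(x) v_x(t,x) + \tfrac{1}{2}\sigma^2(x) v_{xx}(t,x) = 0 \tag{FK}$$
and
$$v(T,x) = h(x).$$
The Black-Scholes equation is a special case of this theorem, as we show in the next section.
Remark 16.1 (Derivation of KBE) We plunked down the Kolmogorov backward equation without any justification. In fact, one can use Itô’s formula to prove the Feynman-Kac Theorem, and use the Feynman-Kac Theorem to derive the Kolmogorov backward equation.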
16.6
Black-Scholes
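Consider the SDE
$$dS(t) = rS(t)\,dt + \sigma S(t)\,dB(t).$$
With initial condition
$$S(t) = x,$$
the solution is
$$S(u) = x \exp\left\{\sigma(B(u) - B(t)) + \left(r - \tfrac{1}{2}\sigma^2\right)(u - t)\right\}, \quad u \ge t.$$
Define
$$\begin{aligned}
v(t,x) &= \mathbb{E}^{t,x} h(S(T)) \\
&= \mathbb{E}\, h\left(x \exp\left\{\sigma(B(T) - B(t)) + \left(r - \tfrac{1}{2}\sigma^2\right)(T - t)\right\}\right),
\end{aligned}$$
where $h$ is a function to be specified later.
Recall the Independence Lemma: If $\mathcal{G}$ is a $\sigma$-field, $X$ is $\mathcal{G}$-measurable, and $Y$ is independent of $\mathcal{G}$, then
$$\mathbb{E}\left[h(X, Y) \,\big|\, \mathcal{G}\right] = \gamma(X),$$
where
$$\gamma(x) = \mathbb{E}\, h(x, Y).$$
With geometric Brownian motion, for $0 \le t \le T$, we have
$$\begin{aligned}
S(t) &= S(0) \exp\left\{\sigma B(t) + \left(r - \tfrac{1}{2}\sigma^2\right) t\right\}, \\
S(T) &= S(0) \exp\left\{\sigma B(T) + \left(r - \tfrac{1}{2}\sigma^2\right) T\right\} \\
&= \underbrace{S(t)}_{\mathcal{F}(t)\text{-measurable}}\ \underbrace{\exp\left\{\sigma(B(T) - B(t)) + \left(r - \tfrac{1}{2}\sigma^2\right)(T - t)\right\}}_{\text{independent of } \mathcal{F}(t)}.
\end{aligned}$$
We thus have
$$S(T) = XY,$$
where
$$X = S(t),$$
$$Y = \exp\left\{\sigma(B(T) - B(t)) + \left(r - \tfrac{1}{2}\sigma^2\right)(T - t)\right\}.$$
Now
$$\mathbb{E}\, h(xY) = v(t,x).$$
The independence lemma implies
$$\begin{aligned}
\mathbb{E}\left[h(S(T)) \,\big|\, \mathcal{F}(t)\right] &= \mathbb{E}\left[h(XY) \,\big|\, \mathcal{F}(t)\right] \\
&= v(t, X) \\
&= v(t, S(t)).
\end{aligned}$$
We have shown that
$$v(t, S(t)) = \mathbb{E}\left[h(S(T)) \,\big|\, \mathcal{F}(t)\right], \quad 0 \le t \le T.$$
Note that the random variable $h(S(T))$ whose conditional expectation is being computed does not depend on $t$. Because of this, the tower property implies that $v(t, S(t))$, $0 \le t \le T$, is a martingale: For $0 \le s \le t \le T$,
$$\begin{aligned}
\mathbb{E}\left[v(t, S(t)) \,\big|\, \mathcal{F}(s)\right] &= \mathbb{E}\left[\mathbb{E}\left[h(S(T)) \,\big|\, \mathcal{F}(t)\right] \,\Big|\, \mathcal{F}(s)\right] \\
&= \mathbb{E}\left[h(S(T)) \,\big|\, \mathcal{F}(s)\right] \\
&= v(s, S(s)).
\end{aligned}$$
This is a special case of Theorem 5.51.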
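Because $v(t, S(t))$ is a martingale, the sum of the $dt$ terms in $dv(t, S(t))$ must be 0. By Itô’s formula,
$$\begin{aligned}
dv(t, S(t)) &= \left[v_t(t, S(t)) + rS(t)\, v_x(t, S(t)) + \tfrac{1}{2}\sigma^2 S^2(t)\, v_{xx}(t, S(t))\right] dt \\
&\quad + \sigma S(t)\, v_x(t, S(t))\,dB(t).
\end{aligned}$$
This leads us to the equation
$$v_t(t,x) + rx\,v_x(t,x) + \tfrac{1}{2}\sigma^2 x^2 v_{xx}(t,x) = 0, \quad 0 \le t < T,\ x \ge 0.$$
This is a special case of Theorem 5.51 (Feynman-Kac).
Along with the above partial differential equation, we have the terminal condition
$$v(T,x) = h(x), \quad x \ge 0.$$
Furthermore, if $S(t) = 0$ for some $t \in [0,T]$, then also $S(T) = 0$. This gives us the boundary condition
$$v(t,0) = h(0), \quad 0 \le t \le T.$$
Finally, we shall eventually see that the value at time $t$ of a contingent claim paying $h(S(T))$ is
$$\begin{aligned}
u(t,x) &= e^{-r(T-t)}\, \mathbb{E}^{t,x} h(S(T)) \\
&= e^{-r(T-t)}\, v(t,x)
\end{aligned}$$
at time $t$ if $S(t) = x$. Therefore,
$$\begin{aligned}
v(t,x) &= e^{r(T-t)}\, u(t,x), \\
v_t(t,x) &= -r e^{r(T-t)}\, u(t,x) + e^{r(T-t)}\, u_t(t,x), \\
v_x(t,x) &= e^{r(T-t)}\, u_x(t,x), \\
v_{xx}(t,x) &= e^{r(T-t)}\, u_{xx}(t,x).
\end{aligned}$$
Plugging these formulas into the partial differential equation for $v$ and cancelling the $e^{r(T-t)}$ appearing in every term, we obtain the Black-Scholes partial differential equation:
$$-ru(t,x) + u_t(t,x) + rx\,u_x(t,x) + \tfrac{1}{2}\sigma^2 x^2 u_{xx}(t,x) = 0, \quad 0 \le t < T,\ x \ge 0. \tag{BS}$$
Compare this with the earlier derivation of the Black-Scholes PDE in Section 15.6.
In terms of the transition density
$$p(t, T; x, y) = \frac{1}{\sigma y \sqrt{2\pi(T-t)}} \exp\left\{-\frac{1}{2(T-t)\sigma^2}\left[\log\frac{y}{x} - \left(r - \tfrac{1}{2}\sigma^2\right)(T - t)\right]^2\right\}$$
for geometric Brownian motion (see Example 16.4), we have the “stochastic representation”
$$\begin{aligned}
u(t,x) &= e^{-r(T-t)}\, \mathbb{E}^{t,x} h(S(T)) \\
&= e^{-r(T-t)} \int_0^\infty h(y)\, p(t, T; x, y)\,dy.
\end{aligned} \tag{SR}$$
In the case of a call,
$$h(y) = (y - K)^+$$
and
$$\begin{aligned}
u(t,x) &= x\, N\left(\frac{1}{\sigma\sqrt{T-t}}\left[\log\frac{x}{K} + r(T-t) + \tfrac{1}{2}\sigma^2(T-t)\right]\right) \\
&\quad - e^{-r(T-t)} K\, N\left(\frac{1}{\sigma\sqrt{T-t}}\left[\log\frac{x}{K} + r(T-t) - \tfrac{1}{2}\sigma^2(T-t)\right]\right).
\end{aligned}$$
Even if $h(y)$ is some other function (e.g., $h(y) = (K - y)^+$, a put), $u(t,x)$ is still given by (SR) and satisfies the Black-Scholes PDE (BS) derived above.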
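As a plausibility check, the call formula can be evaluated and substituted into (BS) numerically. The sketch below is an illustration under stated assumptions: the function names, step sizes and the parameter values ($x = K = 100$, $r = 0.05$, $\sigma = 0.2$, $T = 1$) are chosen here and do not come from the notes. $N$ is computed from the error function in Python's `math` module.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def call_value(t, x, K, r, sigma, T):
    """u(t, x) for the payoff h(y) = (y - K)^+, as given in the text."""
    tau = T - t
    d_plus = ((math.log(x / K) + r * tau + 0.5 * sigma ** 2 * tau)
              / (sigma * math.sqrt(tau)))
    d_minus = d_plus - sigma * math.sqrt(tau)
    return x * norm_cdf(d_plus) - math.exp(-r * tau) * K * norm_cdf(d_minus)

def bs_residual(t, x, K, r, sigma, T, ht=1e-4, hx=1e-2):
    """Left-hand side of (BS), with central-difference derivatives."""
    u = lambda s, y: call_value(s, y, K, r, sigma, T)
    u_t = (u(t + ht, x) - u(t - ht, x)) / (2 * ht)
    u_x = (u(t, x + hx) - u(t, x - hx)) / (2 * hx)
    u_xx = (u(t, x + hx) - 2 * u(t, x) + u(t, x - hx)) / hx ** 2
    return -r * u(t, x) + u_t + r * x * u_x + 0.5 * sigma ** 2 * x ** 2 * u_xx

price = call_value(0.0, 100.0, 100.0, 0.05, 0.2, 1.0)
residual = bs_residual(0.25, 100.0, 100.0, 0.05, 0.2, 1.0)
print(price)     # about 10.45
print(residual)  # close to zero
```

Evaluating `call_value` just before $T$ also shows the terminal condition emerging: for $x$ well above $K$ the value approaches $x - K$, and for $x$ well below $K$ it approaches 0.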
16.7
Black-Scholes with price-dependent volatility
$$dS(t) = rS(t)\,dt + \beta(S(t))\,dB(t),$$
$$v(t,x) = e^{-r(T-t)}\, \mathbb{E}^{t,x}(S(T) - K)^+.$$
The Feynman-Kac Theorem now implies that
$$-rv(t,x) + v_t(t,x) + rx\,v_x(t,x) + \tfrac{1}{2}\beta^2(x)\, v_{xx}(t,x) = 0, \quad 0 \le t < T,\ x > 0.$$
$v$ also satisfies the terminal condition
$$v(T,x) = (x - K)^+, \quad x \ge 0,$$
and the boundary condition
$$v(t,0) = 0, \quad 0 \le t \le T.$$
An example of such a process is the following from J.C. Cox, Notes on options pricing I: Constant elasticity of variance diffusions, Working Paper, Stanford University, 1975:
$$dS(t) = rS(t)\,dt + \sigma S^\delta(t)\,dB(t),$$
where $0 \le \delta < 1$. The “volatility” $\sigma S^{\delta - 1}(t)$ decreases with increasing stock price. The corresponding Black-Scholes equation is
$$-rv + v_t + rx\,v_x + \tfrac{1}{2}\sigma^2 x^{2\delta} v_{xx} = 0, \quad 0 \le t < T,\ x > 0;$$
$$v(t,0) = 0, \quad 0 \le t \le T;$$
$$v(T,x) = (x - K)^+, \quad x \ge 0.$$
Chapter 17
Girsanov’s theorem and the risk-neutral
measure
(Please see Oksendal, 4th ed., pp 145–151.)
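Theorem 0.52 (Girsanov, One-dimensional) Let $B(t)$, $0 \le t \le T$, be a Brownian motion on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Let $\mathcal{F}(t)$, $0 \le t \le T$, be the accompanying filtration, and let $\theta(t)$, $0 \le t \le T$, be a process adapted to this filtration. For $0 \le t \le T$, define
\[
\begin{aligned}
\widetilde{B}(t) &= \int_0^t \theta(u)\,du + B(t),\\
Z(t) &= \exp\left\{ -\int_0^t \theta(u)\,dB(u) - \tfrac{1}{2}\int_0^t \theta^2(u)\,du \right\},
\end{aligned}
\]
and define a new probability measure by
\[
\widetilde{\mathbb{P}}(A) = \int_A Z(T)\,d\mathbb{P}, \qquad \forall A \in \mathcal{F}.
\]
Under $\widetilde{\mathbb{P}}$, the process $\widetilde{B}(t)$, $0 \le t \le T$, is a Brownian motion.
Caveat: This theorem requires a technical condition on the size of $\theta$. If
\[
\mathbb{E} \exp\left\{ \tfrac{1}{2}\int_0^T \theta^2(u)\,du \right\} < \infty,
\]
everything is OK.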
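We make the following remarks:
$Z(t)$ is a martingale. In fact,
\[
\begin{aligned}
dZ(t) &= -\theta(t) Z(t)\,dB(t) + \tfrac{1}{2}\theta^2(t) Z(t)\,dB(t)\,dB(t) - \tfrac{1}{2}\theta^2(t) Z(t)\,dt\\
&= -\theta(t) Z(t)\,dB(t).
\end{aligned}
\]
$\widetilde{\mathbb{P}}$ is a probability measure. Since $Z(0) = 1$, we have $\mathbb{E} Z(t) = 1$ for every $t \ge 0$. In particular
\[
\widetilde{\mathbb{P}}(\Omega) = \int_\Omega Z(T)\,d\mathbb{P} = \mathbb{E} Z(T) = 1,
\]
so $\widetilde{\mathbb{P}}$ is a probability measure.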
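$\widetilde{\mathbb{E}}$ in terms of $\mathbb{E}$. Let $\widetilde{\mathbb{E}}$ denote expectation under $\widetilde{\mathbb{P}}$. If $X$ is a random variable, then
\[
\widetilde{\mathbb{E}} X = \mathbb{E}\left[ Z(T) X \right].
\]
To see this, consider first the case $X = 1_A$, where $A \in \mathcal{F}$. We have
\[
\widetilde{\mathbb{E}} X = \widetilde{\mathbb{P}}(A) = \int_A Z(T)\,d\mathbb{P} = \int_\Omega Z(T) 1_A\,d\mathbb{P} = \mathbb{E}\left[ Z(T) X \right].
\]
Now use Williams’ “standard machine”.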
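$\widetilde{\mathbb{P}}$ and $\mathbb{P}$. The intuition behind the formula
\[
\widetilde{\mathbb{P}}(A) = \int_A Z(T)\,d\mathbb{P} \qquad \forall A \in \mathcal{F}
\]
is that we want to have
\[
\widetilde{\mathbb{P}}(\omega) = Z(T,\omega)\, \mathbb{P}(\omega),
\]
but since $\mathbb{P}(\omega) = 0$ and $\widetilde{\mathbb{P}}(\omega) = 0$, this doesn’t really tell us anything useful about $\widetilde{\mathbb{P}}$. Thus, we consider subsets of $\Omega$, rather than individual elements of $\Omega$.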
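Distribution of $\widetilde{B}(T)$. If $\theta$ is constant, then
\[
\begin{aligned}
Z(T) &= \exp\left\{ -\theta B(T) - \tfrac{1}{2}\theta^2 T \right\},\\
\widetilde{B}(T) &= \theta T + B(T).
\end{aligned}
\]
Under $\mathbb{P}$, $B(T)$ is normal with mean 0 and variance $T$, so $\widetilde{B}(T)$ is normal with mean $\theta T$ and variance $T$:
\[
\mathbb{P}\{\widetilde{B}(T) \in d\tilde{b}\} = \frac{1}{\sqrt{2\pi T}} \exp\left\{ -\frac{(\tilde{b} - \theta T)^2}{2T} \right\} d\tilde{b}.
\]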
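Removal of Drift from $\widetilde{B}(T)$. The change of measure from $\mathbb{P}$ to $\widetilde{\mathbb{P}}$ removes the drift from $\widetilde{B}(T)$. To see this, we compute
\[
\begin{aligned}
\widetilde{\mathbb{E}} \widetilde{B}(T) &= \mathbb{E}\left[ Z(T)\,(\theta T + B(T)) \right]\\
&= \mathbb{E}\left[ \exp\left\{ -\theta B(T) - \tfrac{1}{2}\theta^2 T \right\} (\theta T + B(T)) \right]\\
&= \frac{1}{\sqrt{2\pi T}} \int_{-\infty}^{\infty} (\theta T + b) \exp\left\{ -\theta b - \tfrac{1}{2}\theta^2 T \right\} \exp\left\{ -\frac{b^2}{2T} \right\} db\\
&= \frac{1}{\sqrt{2\pi T}} \int_{-\infty}^{\infty} (\theta T + b) \exp\left\{ -\frac{(b + \theta T)^2}{2T} \right\} db\\
&= \frac{1}{\sqrt{2\pi T}} \int_{-\infty}^{\infty} y \exp\left\{ -\frac{y^2}{2T} \right\} dy \qquad \text{(Substitute } y = \theta T + b\text{)}\\
&= 0.
\end{aligned}
\]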
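We can also see that $\widetilde{\mathbb{E}} \widetilde{B}(T) = 0$ by arguing directly from the density formula
\[
\mathbb{P}\{\widetilde{B}(T) \in d\tilde{b}\} = \frac{1}{\sqrt{2\pi T}} \exp\left\{ -\frac{(\tilde{b} - \theta T)^2}{2T} \right\} d\tilde{b}.
\]
Because
\[
\begin{aligned}
Z(T) &= \exp\left\{ -\theta B(T) - \tfrac{1}{2}\theta^2 T \right\}\\
&= \exp\left\{ -\theta (\widetilde{B}(T) - \theta T) - \tfrac{1}{2}\theta^2 T \right\}\\
&= \exp\left\{ -\theta \widetilde{B}(T) + \tfrac{1}{2}\theta^2 T \right\},
\end{aligned}
\]
we have
\[
\begin{aligned}
\widetilde{\mathbb{P}}\{\widetilde{B}(T) \in d\tilde{b}\} &= \mathbb{P}\{\widetilde{B}(T) \in d\tilde{b}\} \exp\left\{ -\theta \tilde{b} + \tfrac{1}{2}\theta^2 T \right\}\\
&= \frac{1}{\sqrt{2\pi T}} \exp\left\{ -\frac{(\tilde{b} - \theta T)^2}{2T} - \theta \tilde{b} + \tfrac{1}{2}\theta^2 T \right\} d\tilde{b}\\
&= \frac{1}{\sqrt{2\pi T}} \exp\left\{ -\frac{\tilde{b}^2}{2T} \right\} d\tilde{b}.
\end{aligned}
\]
Under $\widetilde{\mathbb{P}}$, $\widetilde{B}(T)$ is normal with mean zero and variance $T$. Under $\mathbb{P}$, $\widetilde{B}(T)$ is normal with mean $\theta T$ and variance $T$.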
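For constant $\theta$ the change of measure can be checked by numerical integration against the $N(0,T)$ density of $B(T)$. This is a rough sketch under stated assumptions: the helper `expect_under_P`, the trapezoidal rule, and the values $\theta = 0.7$, $T = 2$ are illustrative and not from the notes.

```python
import math

def expect_under_P(f, T, n=20000, width=12.0):
    """Trapezoidal estimate of E[f(B(T))], where B(T) ~ N(0, T) under P."""
    s = math.sqrt(T)
    lo, hi = -width * s, width * s
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        b = lo + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * f(b) * math.exp(-b * b / (2 * T)) / math.sqrt(2 * math.pi * T)
    return total * h

theta, T = 0.7, 2.0
Z = lambda b: math.exp(-theta * b - 0.5 * theta**2 * T)  # Z(T) as a function of B(T) = b
B_tilde = lambda b: theta * T + b                         # B~(T) = theta*T + B(T)

total_mass = expect_under_P(Z, T)                             # P~(Omega) = E[Z(T)]
mean_P = expect_under_P(B_tilde, T)                           # mean of B~(T) under P
mean_Pt = expect_under_P(lambda b: Z(b) * B_tilde(b), T)      # mean of B~(T) under P~
second_Pt = expect_under_P(lambda b: Z(b) * B_tilde(b) ** 2, T)  # E~[B~(T)^2]
```

Under these assumptions `total_mass` should be close to 1, `mean_P` close to $\theta T$, `mean_Pt` close to 0, and `second_Pt` close to $T$: the mean changes while the variance does not.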
Means change, variances don’t. When we use the Girsanov Theorem to change the probability measure, means change but variances do not. Martingales may be destroyed or created. Volatilities, quadratic variations and cross variations are unaffected. Check:
\[
d\widetilde{B}\, d\widetilde{B} = (\theta(t)\,dt + dB(t))^2 = dB\, dB = dt.
\]
17.1
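Conditional expectations under $\widetilde{\mathbb{P}}$
Lemma 1.53 Let $0 \le t \le T$. If $X$ is $\mathcal{F}(t)$-measurable, then
\[
\widetilde{\mathbb{E}} X = \mathbb{E}\left[ X Z(t) \right].
\]
Proof:
\[
\begin{aligned}
\widetilde{\mathbb{E}} X = \mathbb{E}\left[ X Z(T) \right] &= \mathbb{E}\left[ \mathbb{E}\left[ X Z(T) \mid \mathcal{F}(t) \right] \right]\\
&= \mathbb{E}\left[ X\, \mathbb{E}\left[ Z(T) \mid \mathcal{F}(t) \right] \right]\\
&= \mathbb{E}\left[ X Z(t) \right]
\end{aligned}
\]
because $Z(t)$, $0 \le t \le T$, is a martingale under $\mathbb{P}$.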
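Lemma 1.54 (Bayes’ Rule) If $X$ is $\mathcal{F}(t)$-measurable and $0 \le s \le t \le T$, then
\[
\widetilde{\mathbb{E}}\left[ X \mid \mathcal{F}(s) \right] = \frac{1}{Z(s)}\, \mathbb{E}\left[ X Z(t) \mid \mathcal{F}(s) \right]. \tag{1.1}
\]
Proof: It is clear that $\frac{1}{Z(s)} \mathbb{E}[X Z(t) \mid \mathcal{F}(s)]$ is $\mathcal{F}(s)$-measurable. We check the partial averaging property. For $A \in \mathcal{F}(s)$, we have
\[
\begin{aligned}
\int_A \frac{1}{Z(s)}\, \mathbb{E}\left[ X Z(t) \mid \mathcal{F}(s) \right] d\widetilde{\mathbb{P}} &= \widetilde{\mathbb{E}}\left[ 1_A \frac{1}{Z(s)}\, \mathbb{E}\left[ X Z(t) \mid \mathcal{F}(s) \right] \right]\\
&= \mathbb{E}\left[ 1_A\, \mathbb{E}\left[ X Z(t) \mid \mathcal{F}(s) \right] \right] && \text{(Lemma 1.53)}\\
&= \mathbb{E}\left[ \mathbb{E}\left[ 1_A X Z(t) \mid \mathcal{F}(s) \right] \right] && \text{(Taking in what is known)}\\
&= \mathbb{E}\left[ 1_A X Z(t) \right]\\
&= \widetilde{\mathbb{E}}\left[ 1_A X \right] && \text{(Lemma 1.53 again)}\\
&= \int_A X\, d\widetilde{\mathbb{P}}.
\end{aligned}
\]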
Although we have proved Lemmas 1.53 and 1.54, we have not proved Girsanov’s Theorem. We
will not prove it completely, but here is the beginning of the proof.
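Lemma 1.55 Using the notation of Girsanov’s Theorem, we have the martingale property
\[
\widetilde{\mathbb{E}}\left[ \widetilde{B}(t) \mid \mathcal{F}(s) \right] = \widetilde{B}(s), \qquad 0 \le s \le t \le T.
\]
Proof: We first check that $\widetilde{B}(t) Z(t)$ is a martingale under $\mathbb{P}$. Recall
\[
\begin{aligned}
d\widetilde{B}(t) &= \theta(t)\,dt + dB(t),\\
dZ(t) &= -\theta(t) Z(t)\,dB(t).
\end{aligned}
\]
Therefore,
\[
\begin{aligned}
d(\widetilde{B} Z) &= \widetilde{B}\,dZ + Z\,d\widetilde{B} + d\widetilde{B}\,dZ\\
&= -\widetilde{B}\theta Z\,dB + Z\theta\,dt + Z\,dB - \theta Z\,dt\\
&= (-\widetilde{B}\theta Z + Z)\,dB.
\end{aligned}
\]
Next we use Bayes’ Rule. For $0 \le s \le t \le T$,
\[
\begin{aligned}
\widetilde{\mathbb{E}}\left[ \widetilde{B}(t) \mid \mathcal{F}(s) \right] &= \frac{1}{Z(s)}\, \mathbb{E}\left[ \widetilde{B}(t) Z(t) \mid \mathcal{F}(s) \right]\\
&= \frac{1}{Z(s)}\, \widetilde{B}(s) Z(s)\\
&= \widetilde{B}(s).
\end{aligned}
\]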
Definition 17.1 (Equivalent measures) Two measures on the same probability space which have
the same measure-zero sets are said to be equivalent.
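The probability measures $\mathbb{P}$ and $\widetilde{\mathbb{P}}$ of the Girsanov Theorem are equivalent. Recall that $\widetilde{\mathbb{P}}$ is defined by
\[
\widetilde{\mathbb{P}}(A) = \int_A Z(T)\,d\mathbb{P}, \qquad A \in \mathcal{F}.
\]
If $\mathbb{P}(A) = 0$, then $\int_A Z(T)\,d\mathbb{P} = 0$, so $\widetilde{\mathbb{P}}(A) = 0$. Because $Z(T) > 0$ for every $\omega$, we can invert the definition of $\widetilde{\mathbb{P}}$ to obtain
\[
\mathbb{P}(A) = \int_A \frac{1}{Z(T)}\,d\widetilde{\mathbb{P}}, \qquad A \in \mathcal{F}.
\]
If $\widetilde{\mathbb{P}}(A) = 0$, then $\int_A \frac{1}{Z(T)}\,d\widetilde{\mathbb{P}} = 0$, so $\mathbb{P}(A) = 0$.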
17.2
Risk-neutral measure
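As usual we are given the Brownian motion $B(t)$, $0 \le t \le T$, with filtration $\mathcal{F}(t)$, $0 \le t \le T$, defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. We can then define the following.
Stock price:
\[
dS(t) = \mu(t) S(t)\,dt + \sigma(t) S(t)\,dB(t).
\]
The processes $\mu(t)$ and $\sigma(t)$ are adapted to the filtration. The stock price model is completely general, subject only to the condition that the paths of the process are continuous.
Interest rate: $r(t)$, $0 \le t \le T$. The process $r(t)$ is adapted.
Wealth of an agent, starting with $X(0) = x$. We can write the wealth process differential in several ways:
\[
\begin{aligned}
dX(t) &= \underbrace{\Delta(t)\,dS(t)}_{\text{Capital gains from Stock}} + \underbrace{r(t)\left[ X(t) - \Delta(t) S(t) \right] dt}_{\text{Interest earnings}}\\
&= r(t) X(t)\,dt + \Delta(t)\left[ dS(t) - r(t) S(t)\,dt \right]\\
&= r(t) X(t)\,dt + \Delta(t) \underbrace{(\mu(t) - r(t))}_{\text{Risk premium}} S(t)\,dt + \Delta(t)\sigma(t) S(t)\,dB(t)\\
&= r(t) X(t)\,dt + \Delta(t)\sigma(t) S(t) \Bigg[ \underbrace{\frac{\mu(t) - r(t)}{\sigma(t)}}_{\text{Market price of risk} \,=\, \theta(t)} dt + dB(t) \Bigg].
\end{aligned}
\]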
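Discounted processes:
\[
\begin{aligned}
d\left( e^{-\int_0^t r(u)\,du} S(t) \right) &= e^{-\int_0^t r(u)\,du} \left[ -r(t) S(t)\,dt + dS(t) \right],\\
d\left( e^{-\int_0^t r(u)\,du} X(t) \right) &= e^{-\int_0^t r(u)\,du} \left[ -r(t) X(t)\,dt + dX(t) \right]\\
&= \Delta(t)\, d\left( e^{-\int_0^t r(u)\,du} S(t) \right).
\end{aligned}
\]
Notation:
\[
\beta(t) = e^{\int_0^t r(u)\,du}, \qquad \frac{1}{\beta(t)} = e^{-\int_0^t r(u)\,du},
\]
\[
d\beta(t) = r(t)\beta(t)\,dt, \qquad d\left( \frac{1}{\beta(t)} \right) = -\frac{r(t)}{\beta(t)}\,dt.
\]
The discounted formulas are
\[
\begin{aligned}
d\left( \frac{S(t)}{\beta(t)} \right) &= \frac{1}{\beta(t)} \left[ -r(t) S(t)\,dt + dS(t) \right]\\
&= \frac{1}{\beta(t)} \left[ (\mu(t) - r(t)) S(t)\,dt + \sigma(t) S(t)\,dB(t) \right]\\
&= \frac{1}{\beta(t)} \sigma(t) S(t) \left[ \theta(t)\,dt + dB(t) \right],\\
d\left( \frac{X(t)}{\beta(t)} \right) &= \Delta(t)\, d\left( \frac{S(t)}{\beta(t)} \right)\\
&= \frac{\Delta(t)}{\beta(t)} \sigma(t) S(t) \left[ \theta(t)\,dt + dB(t) \right].
\end{aligned}
\]
Changing the measure. Define
\[
\widetilde{B}(t) = \int_0^t \theta(u)\,du + B(t).
\]
Then
\[
\begin{aligned}
d\left( \frac{S(t)}{\beta(t)} \right) &= \frac{1}{\beta(t)} \sigma(t) S(t)\,d\widetilde{B}(t),\\
d\left( \frac{X(t)}{\beta(t)} \right) &= \frac{\Delta(t)}{\beta(t)} \sigma(t) S(t)\,d\widetilde{B}(t).
\end{aligned}
\]
Under $\widetilde{\mathbb{P}}$, $\frac{S(t)}{\beta(t)}$ and $\frac{X(t)}{\beta(t)}$ are martingales.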
Definition 17.2 (Risk-neutral measure) A risk-neutral measure (sometimes called a martingale measure) is any probability measure, equivalent to the market measure $\mathbb{P}$, which makes all discounted asset prices martingales.
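For the market model considered here,
\[
\widetilde{\mathbb{P}}(A) = \int_A Z(T)\,d\mathbb{P}, \qquad A \in \mathcal{F},
\]
where
\[
Z(t) = \exp\left\{ -\int_0^t \theta(u)\,dB(u) - \tfrac{1}{2}\int_0^t \theta^2(u)\,du \right\},
\]
is the unique risk-neutral measure. Note that because $\theta(t) = \frac{\mu(t) - r(t)}{\sigma(t)}$, we must assume that $\sigma(t) \ne 0$.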
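The effect of this measure on the discounted stock price can be illustrated in the special case of constant coefficients, which is an assumption made only for this sketch. Here $S(T)$ and $Z(T)$ are both functions of $B(T)$, so expectations reduce to one-dimensional integrals; the helper `gauss_expect` and the parameter values are hypothetical.

```python
import math

def gauss_expect(f, T, n=20000, width=12.0):
    """Trapezoidal estimate of E[f(B(T))] with B(T) ~ N(0, T) under P."""
    s = math.sqrt(T)
    lo, hi = -width * s, width * s
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        b = lo + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * f(b) * math.exp(-b * b / (2 * T)) / math.sqrt(2 * math.pi * T)
    return total * h

# Constant-coefficient market (illustrative numbers).
mu, r, sigma, S0, T = 0.12, 0.03, 0.25, 50.0, 1.5
theta = (mu - r) / sigma                      # market price of risk; needs sigma != 0

S_T = lambda b: S0 * math.exp((mu - 0.5 * sigma**2) * T + sigma * b)
Z_T = lambda b: math.exp(-theta * b - 0.5 * theta**2 * T)
discount = math.exp(-r * T)                   # 1 / beta(T)

disc_mean_P = gauss_expect(lambda b: discount * S_T(b), T)            # E[S(T)/beta(T)]
disc_mean_Pt = gauss_expect(lambda b: Z_T(b) * discount * S_T(b), T)  # E~[S(T)/beta(T)]
```

Under $\mathbb{P}$ the discounted price drifts upward, so `disc_mean_P` should be near $S(0)e^{(\mu - r)T}$, while under $\widetilde{\mathbb{P}}$ it is a martingale and `disc_mean_Pt` should be near $S(0)$.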
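Risk-neutral valuation. Consider a contingent claim paying an $\mathcal{F}(T)$-measurable random variable $V$ at time $T$.
Example 17.1
\[
\begin{aligned}
V &= (S(T) - K)^+, && \text{European call}\\
V &= (K - S(T))^+, && \text{European put}\\
V &= \left( \frac{1}{T}\int_0^T S(u)\,du - K \right)^+, && \text{Asian call}\\
V &= \max_{0 \le t \le T} S(t), && \text{Look back}
\end{aligned}
\]
If there is a hedging portfolio, i.e., a process $\Delta(t)$, $0 \le t \le T$, whose corresponding wealth process satisfies $X(T) = V$, then
\[
X(0) = \widetilde{\mathbb{E}}\left[ \frac{V}{\beta(T)} \right].
\]
This is because $\frac{X(t)}{\beta(t)}$ is a martingale under $\widetilde{\mathbb{P}}$, so
\[
X(0) = \frac{X(0)}{\beta(0)} = \widetilde{\mathbb{E}}\left[ \frac{X(T)}{\beta(T)} \right] = \widetilde{\mathbb{E}}\left[ \frac{V}{\beta(T)} \right].
\]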
Chapter 18
Martingale Representation Theorem
18.1
Martingale Representation Theorem
See Oksendal, 4th ed., Theorem 4.11, p.50.
Theorem 1.56 Let $B(t)$, $0 \le t \le T$, be a Brownian motion on $(\Omega, \mathcal{F}, \mathbb{P})$. Let $\mathcal{F}(t)$, $0 \le t \le T$, be the filtration generated by this Brownian motion. Let $X(t)$, $0 \le t \le T$, be a martingale (under $\mathbb{P}$) relative to this filtration. Then there is an adapted process $\delta(t)$, $0 \le t \le T$, such that
\[
X(t) = X(0) + \int_0^t \delta(u)\,dB(u), \qquad 0 \le t \le T.
\]
In particular, the paths of $X$ are continuous.
Remark 18.1 We already know that if $X(t)$ is a process satisfying
\[
dX(t) = \delta(t)\,dB(t),
\]
then $X(t)$ is a martingale. Now we see that if $X(t)$ is a martingale adapted to the filtration generated by the Brownian motion $B(t)$, i.e., the Brownian motion is the only source of randomness in $X$, then
\[
dX(t) = \delta(t)\,dB(t)
\]
for some $\delta(t)$.
18.2
A hedging application
Homework Problem 4.5. In the context of Girsanov’s Theorem, suppose that $\mathcal{F}(t)$, $0 \le t \le T$, is the filtration generated by the Brownian motion $B$ (under $\mathbb{P}$). Suppose that $Y$ is a $\widetilde{\mathbb{P}}$-martingale. Then there is an adapted process $\gamma(t)$, $0 \le t \le T$, such that
\[
Y(t) = Y(0) + \int_0^t \gamma(u)\,d\widetilde{B}(u), \qquad 0 \le t \le T.
\]
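\[
\begin{aligned}
dS(t) &= \mu(t) S(t)\,dt + \sigma(t) S(t)\,dB(t),\\
\beta(t) &= \exp\left\{ \int_0^t r(u)\,du \right\},\\
\theta(t) &= \frac{\mu(t) - r(t)}{\sigma(t)},\\
\widetilde{B}(t) &= \int_0^t \theta(u)\,du + B(t),\\
Z(t) &= \exp\left\{ -\int_0^t \theta(u)\,dB(u) - \tfrac{1}{2}\int_0^t \theta^2(u)\,du \right\},\\
\widetilde{\mathbb{P}}(A) &= \int_A Z(T)\,d\mathbb{P}, \qquad \forall A \in \mathcal{F}.
\end{aligned}
\]
Then
\[
d\left( \frac{S(t)}{\beta(t)} \right) = \frac{S(t)}{\beta(t)} \sigma(t)\,d\widetilde{B}(t).
\]
Let $\Delta(t)$, $0 \le t \le T$, be a portfolio process. The corresponding wealth process $X(t)$ satisfies
\[
d\left( \frac{X(t)}{\beta(t)} \right) = \Delta(t)\sigma(t) \frac{S(t)}{\beta(t)}\,d\widetilde{B}(t),
\]
i.e.,
\[
\frac{X(t)}{\beta(t)} = X(0) + \int_0^t \Delta(u)\sigma(u) \frac{S(u)}{\beta(u)}\,d\widetilde{B}(u), \qquad 0 \le t \le T.
\]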
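Let $V$ be an $\mathcal{F}(T)$-measurable random variable, representing the payoff of a contingent claim at time $T$. We want to choose $X(0)$ and $\Delta(t)$, $0 \le t \le T$, so that
\[
X(T) = V.
\]
Define the $\widetilde{\mathbb{P}}$-martingale
\[
Y(t) = \widetilde{\mathbb{E}}\left[ \frac{V}{\beta(T)} \,\Big|\, \mathcal{F}(t) \right], \qquad 0 \le t \le T.
\]
According to Homework Problem 4.5, there is an adapted process $\gamma(t)$, $0 \le t \le T$, such that
\[
Y(t) = Y(0) + \int_0^t \gamma(u)\,d\widetilde{B}(u), \qquad 0 \le t \le T.
\]
Set $X(0) = Y(0) = \widetilde{\mathbb{E}}\left[ \frac{V}{\beta(T)} \right]$ and choose $\Delta(u)$ so that
\[
\Delta(u)\sigma(u) \frac{S(u)}{\beta(u)} = \gamma(u).
\]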
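With this choice of $\Delta(u)$, $0 \le u \le T$, we have
\[
\frac{X(t)}{\beta(t)} = Y(t) = \widetilde{\mathbb{E}}\left[ \frac{V}{\beta(T)} \,\Big|\, \mathcal{F}(t) \right], \qquad 0 \le t \le T.
\]
In particular,
\[
\frac{X(T)}{\beta(T)} = \widetilde{\mathbb{E}}\left[ \frac{V}{\beta(T)} \,\Big|\, \mathcal{F}(T) \right] = \frac{V}{\beta(T)},
\]
so
\[
X(T) = V.
\]
The Martingale Representation Theorem guarantees the existence of a hedging portfolio, although it does not tell us how to compute it. It also justifies the risk-neutral pricing formula
\[
\begin{aligned}
X(t) &= \beta(t)\, \widetilde{\mathbb{E}}\left[ \frac{V}{\beta(T)} \,\Big|\, \mathcal{F}(t) \right]\\
&= \frac{\beta(t)}{Z(t)}\, \mathbb{E}\left[ \frac{Z(T)}{\beta(T)} V \,\Big|\, \mathcal{F}(t) \right]\\
&= \frac{1}{\zeta(t)}\, \mathbb{E}\left[ \zeta(T) V \mid \mathcal{F}(t) \right], \qquad 0 \le t \le T,
\end{aligned}
\]
where
\[
\zeta(t) = \frac{Z(t)}{\beta(t)} = \exp\left\{ -\int_0^t \theta(u)\,dB(u) - \int_0^t \left( r(u) + \tfrac{1}{2}\theta^2(u) \right) du \right\}.
\]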
18.3
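d-dimensional Girsanov Theorem
Theorem 3.57 (d-dimensional Girsanov)
$B(t) = (B_1(t), \ldots, B_d(t))$, $0 \le t \le T$, a $d$-dimensional Brownian motion on $(\Omega, \mathcal{F}, \mathbb{P})$;
$\mathcal{F}(t)$, $0 \le t \le T$, the accompanying filtration, perhaps larger than the one generated by $B$;
$\theta(t) = (\theta_1(t), \ldots, \theta_d(t))$, $0 \le t \le T$, a $d$-dimensional adapted process.
For $0 \le t \le T$, define
\[
\begin{aligned}
\widetilde{B}_j(t) &= \int_0^t \theta_j(u)\,du + B_j(t), \qquad j = 1, \ldots, d,\\
Z(t) &= \exp\left\{ -\int_0^t \theta(u) \cdot dB(u) - \tfrac{1}{2}\int_0^t \|\theta(u)\|^2\,du \right\},\\
\widetilde{\mathbb{P}}(A) &= \int_A Z(T)\,d\mathbb{P}.
\end{aligned}
\]
Then, under $\widetilde{\mathbb{P}}$, the process
\[
\widetilde{B}(t) = (\widetilde{B}_1(t), \ldots, \widetilde{B}_d(t)), \qquad 0 \le t \le T,
\]
is a $d$-dimensional Brownian motion.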
18.4
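d-dimensional Martingale Representation Theorem
Theorem 4.58
$B(t) = (B_1(t), \ldots, B_d(t))$, $0 \le t \le T$, a $d$-dimensional Brownian motion on $(\Omega, \mathcal{F}, \mathbb{P})$;
$\mathcal{F}(t)$, $0 \le t \le T$, the filtration generated by the Brownian motion $B$.
If $X(t)$, $0 \le t \le T$, is a martingale (under $\mathbb{P}$) relative to $\mathcal{F}(t)$, $0 \le t \le T$, then there is a $d$-dimensional adapted process $\delta(t) = (\delta_1(t), \ldots, \delta_d(t))$, such that
\[
X(t) = X(0) + \int_0^t \delta(u) \cdot dB(u), \qquad 0 \le t \le T.
\]
Corollary 4.59 If we have a $d$-dimensional adapted process $\theta(t) = (\theta_1(t), \ldots, \theta_d(t))$, then we can define $\widetilde{B}$, $Z$ and $\widetilde{\mathbb{P}}$ as in Girsanov’s Theorem. If $Y(t)$, $0 \le t \le T$, is a martingale under $\widetilde{\mathbb{P}}$ relative to $\mathcal{F}(t)$, $0 \le t \le T$, then there is a $d$-dimensional adapted process $\gamma(t) = (\gamma_1(t), \ldots, \gamma_d(t))$ such that
\[
Y(t) = Y(0) + \int_0^t \gamma(u) \cdot d\widetilde{B}(u), \qquad 0 \le t \le T.
\]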
18.5
Multi-dimensional market model
Let $B(t) = (B_1(t), \ldots, B_d(t))$, $0 \le t \le T$, be a $d$-dimensional Brownian motion on some $(\Omega, \mathcal{F}, \mathbb{P})$, and let $\mathcal{F}(t)$, $0 \le t \le T$, be the filtration generated by $B$. Then we can define the following:
Stocks
\[
dS_i(t) = \mu_i(t) S_i(t)\,dt + S_i(t) \sum_{j=1}^d \sigma_{ij}(t)\,dB_j(t), \qquad i = 1, \ldots, m
\]
Accumulation factor
\[
\beta(t) = \exp\left\{ \int_0^t r(u)\,du \right\}.
\]
Here, $\mu_i(t)$, $\sigma_{ij}(t)$ and $r(t)$ are adapted processes.
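Discounted stock prices
\[
\begin{aligned}
d\left( \frac{S_i(t)}{\beta(t)} \right) &= \underbrace{(\mu_i(t) - r(t))}_{\text{Risk Premium}} \frac{S_i(t)}{\beta(t)}\,dt + \frac{S_i(t)}{\beta(t)} \sum_{j=1}^d \sigma_{ij}(t)\,dB_j(t)\\
&\overset{?}{=} \frac{S_i(t)}{\beta(t)} \sum_{j=1}^d \sigma_{ij}(t) \underbrace{\left[ \theta_j(t)\,dt + dB_j(t) \right]}_{d\widetilde{B}_j(t)}
\end{aligned}
\tag{5.1}
\]
For (5.1) to be satisfied, we need to choose $\theta_1(t), \ldots, \theta_d(t)$, so that
\[
\sum_{j=1}^d \sigma_{ij}(t)\theta_j(t) = \mu_i(t) - r(t), \qquad i = 1, \ldots, m. \tag{MPR}
\]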
Market price of risk. The market price of risk is an adapted process $\theta(t) = (\theta_1(t), \ldots, \theta_d(t))$ satisfying the system of equations (MPR) above. There are three cases to consider:
Case I: (Unique Solution). For Lebesgue-almost every $t$ and $\mathbb{P}$-almost every $\omega$, (MPR) has a unique solution $\theta(t)$. Using $\theta(t)$ in the $d$-dimensional Girsanov Theorem, we define a unique risk-neutral probability measure $\widetilde{\mathbb{P}}$. Under $\widetilde{\mathbb{P}}$, every discounted stock price is a martingale. Consequently, the discounted wealth process corresponding to any portfolio process is a $\widetilde{\mathbb{P}}$-martingale, and this implies that the market admits no arbitrage. Finally, the Martingale Representation Theorem can be used to show that every contingent claim can be hedged; the market is said to be complete.
Case II: (No solution.) If (MPR) has no solution, then there is no risk-neutral probability measure
and the market admits arbitrage.
Case III: (Multiple solutions). If (MPR) has multiple solutions, then there are multiple risk-neutral
probability measures. The market admits no arbitrage, but there are contingent claims which
cannot be hedged; the market is said to be incomplete.
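At a fixed $(t, \omega)$, (MPR) is an ordinary $m \times d$ linear system, so the three cases can be told apart by comparing the rank of $\sigma$ with the rank of the augmented matrix $[\sigma \mid \mu - r]$. The sketch below illustrates this for constant coefficients; the function names `rank` and `classify_mpr` and the tolerance are assumptions made for the example, not part of the notes.

```python
def rank(rows, tol=1e-10):
    """Rank of a matrix (list of rows) by Gaussian elimination with partial pivoting."""
    m = [list(map(float, row)) for row in rows]
    n_cols = len(m[0]) if m else 0
    rk = 0
    for col in range(n_cols):
        if rk == len(m):
            break
        # Pick the row (at or below rk) with the largest entry in this column.
        pivot = max(range(rk, len(m)), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < tol:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def classify_mpr(sigma, mu, r):
    """Classify sum_j sigma[i][j] * theta[j] = mu[i] - r for i = 1..m."""
    d = len(sigma[0])
    augmented = [list(row) + [mu_i - r] for row, mu_i in zip(sigma, mu)]
    rank_sigma, rank_aug = rank(sigma), rank(augmented)
    if rank_sigma < rank_aug:
        return "no solution"            # Case II: arbitrage
    if rank_sigma == d:
        return "unique solution"        # Case I: complete, no arbitrage
    return "multiple solutions"         # Case III: incomplete, no arbitrage
```

For example, two stocks driven by one Brownian motion with different ratios $(\mu_i - r)/\sigma_i$ fall into Case II, while one stock driven by two Brownian motions falls into Case III.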
Theorem 5.60 (Fundamental Theorem of Asset Pricing) Part I. (Harrison and Pliska, Martin-
gales and Stochastic integrals in the theory of continuous trading, Stochastic Proc. and Applications
11 (1981), pp 215-260.):
If a market has a risk-neutral probability measure, then it admits no arbitrage.
Part II. (Harrison and Pliska, A stochastic calculus model of continuous trading: complete markets,
Stochastic Proc. and Applications 15 (1983), pp 313-316):
The risk-neutral measure is unique if and only if every contingent claim can be hedged.
Chapter 19
A two-dimensional market model
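Let $B(t) = (B_1(t), B_2(t))$, $0 \le t \le T$, be a two-dimensional Brownian motion on $(\Omega, \mathcal{F}, \mathbb{P})$. Let $\mathcal{F}(t)$, $0 \le t \le T$, be the filtration generated by $B$.
In what follows, all processes can depend on $t$ and $\omega$, but are adapted to $\mathcal{F}(t)$, $0 \le t \le T$. To simplify notation, we omit the arguments whenever there is no ambiguity.
Stocks:
\[
\begin{aligned}
dS_1 &= S_1\left[ \mu_1\,dt + \sigma_1\,dB_1 \right],\\
dS_2 &= S_2\left[ \mu_2\,dt + \rho\sigma_2\,dB_1 + \sqrt{1 - \rho^2}\,\sigma_2\,dB_2 \right].
\end{aligned}
\]
We assume $\sigma_1 > 0$, $\sigma_2 > 0$, $-1 \le \rho \le 1$. Note that
\[
\begin{aligned}
dS_1\,dS_1 &= S_1^2\sigma_1^2\,dB_1\,dB_1 = \sigma_1^2 S_1^2\,dt,\\
dS_2\,dS_2 &= S_2^2\rho^2\sigma_2^2\,dB_1\,dB_1 + S_2^2(1 - \rho^2)\sigma_2^2\,dB_2\,dB_2 = \sigma_2^2 S_2^2\,dt,\\
dS_1\,dS_2 &= S_1\sigma_1 S_2\rho\sigma_2\,dB_1\,dB_1 = \rho\sigma_1\sigma_2 S_1 S_2\,dt.
\end{aligned}
\]
In other words,
$\frac{dS_1}{S_1}$ has instantaneous variance $\sigma_1^2$,
$\frac{dS_2}{S_2}$ has instantaneous variance $\sigma_2^2$,
$\frac{dS_1}{S_1}$ and $\frac{dS_2}{S_2}$ have instantaneous covariance $\rho\sigma_1\sigma_2$.
Accumulation factor:
\[
\beta(t) = \exp\left\{ \int_0^t r\,du \right\}.
\]
The market price of risk equations are
\[
\begin{aligned}
\sigma_1\theta_1 &= \mu_1 - r,\\
\rho\sigma_2\theta_1 + \sqrt{1 - \rho^2}\,\sigma_2\theta_2 &= \mu_2 - r.
\end{aligned}
\tag{MPR}
\]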
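The solution to these equations is
\[
\begin{aligned}
\theta_1 &= \frac{\mu_1 - r}{\sigma_1},\\
\theta_2 &= \frac{\sigma_1(\mu_2 - r) - \rho\sigma_2(\mu_1 - r)}{\sigma_1\sigma_2\sqrt{1 - \rho^2}},
\end{aligned}
\]
provided $-1 < \rho < 1$.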
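The solution formula can be coded directly and checked by substituting back into (MPR). This is a small sketch with constant coefficients; the function name `market_price_of_risk` and the numerical values in the usage note are illustrative assumptions.

```python
import math

def market_price_of_risk(mu1, mu2, sigma1, sigma2, rho, r):
    """Solve the two market-price-of-risk equations for (theta1, theta2), -1 < rho < 1."""
    if not -1.0 < rho < 1.0:
        raise ValueError("the formula for theta2 requires -1 < rho < 1")
    theta1 = (mu1 - r) / sigma1
    theta2 = (sigma1 * (mu2 - r) - rho * sigma2 * (mu1 - r)) / (
        sigma1 * sigma2 * math.sqrt(1.0 - rho**2)
    )
    return theta1, theta2
```

For instance, with $\mu_1 = 0.10$, $\mu_2 = 0.07$, $\sigma_1 = 0.2$, $\sigma_2 = 0.3$, $\rho = 0.5$, $r = 0.02$, the returned pair should satisfy both equations of (MPR) up to rounding.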
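Suppose $-1 < \rho < 1$. Then (MPR) has a unique solution $(\theta_1, \theta_2)$; we define
\[
\begin{aligned}
Z(t) &= \exp\left\{ -\int_0^t \theta_1\,dB_1 - \int_0^t \theta_2\,dB_2 - \tfrac{1}{2}\int_0^t (\theta_1^2 + \theta_2^2)\,du \right\},\\
\widetilde{\mathbb{P}}(A) &= \int_A Z(T)\,d\mathbb{P}, \qquad \forall A \in \mathcal{F}.
\end{aligned}
\]
$\widetilde{\mathbb{P}}$ is the unique risk-neutral measure. Define
\[
\begin{aligned}
\widetilde{B}_1(t) &= \int_0^t \theta_1\,du + B_1(t),\\
\widetilde{B}_2(t) &= \int_0^t \theta_2\,du + B_2(t).
\end{aligned}
\]
Then
\[
\begin{aligned}
dS_1 &= S_1\left[ r\,dt + \sigma_1\,d\widetilde{B}_1 \right],\\
dS_2 &= S_2\left[ r\,dt + \rho\sigma_2\,d\widetilde{B}_1 + \sqrt{1 - \rho^2}\,\sigma_2\,d\widetilde{B}_2 \right].
\end{aligned}
\]
We have changed the mean rates of return of the stock prices, but not the variances and covariances.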
19.1
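Hedging when $-1 < \rho < 1$
\[
\begin{aligned}
dX &= \Delta_1\,dS_1 + \Delta_2\,dS_2 + r(X - \Delta_1 S_1 - \Delta_2 S_2)\,dt,\\
d\left( \frac{X}{\beta} \right) &= \frac{1}{\beta}(dX - rX\,dt)\\
&= \frac{1}{\beta}\Delta_1(dS_1 - rS_1\,dt) + \frac{1}{\beta}\Delta_2(dS_2 - rS_2\,dt)\\
&= \frac{1}{\beta}\Delta_1 S_1\sigma_1\,d\widetilde{B}_1 + \frac{1}{\beta}\Delta_2 S_2\left[ \rho\sigma_2\,d\widetilde{B}_1 + \sqrt{1 - \rho^2}\,\sigma_2\,d\widetilde{B}_2 \right].
\end{aligned}
\]
Let $V$ be $\mathcal{F}(T)$-measurable. Define the $\widetilde{\mathbb{P}}$-martingale
\[
Y(t) = \widetilde{\mathbb{E}}\left[ \frac{V}{\beta(T)} \,\Big|\, \mathcal{F}(t) \right], \qquad 0 \le t \le T.
\]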
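The Martingale Representation Corollary implies
\[
Y(t) = Y(0) + \int_0^t \gamma_1\,d\widetilde{B}_1 + \int_0^t \gamma_2\,d\widetilde{B}_2.
\]
We have
\[
\begin{aligned}
d\left( \frac{X}{\beta} \right) &= \left( \frac{1}{\beta}\Delta_1 S_1\sigma_1 + \frac{1}{\beta}\Delta_2 S_2\rho\sigma_2 \right) d\widetilde{B}_1 + \frac{1}{\beta}\Delta_2 S_2\sqrt{1 - \rho^2}\,\sigma_2\,d\widetilde{B}_2,\\
dY &= \gamma_1\,d\widetilde{B}_1 + \gamma_2\,d\widetilde{B}_2.
\end{aligned}
\]
We solve the equations
\[
\begin{aligned}
\frac{1}{\beta}\Delta_1 S_1\sigma_1 + \frac{1}{\beta}\Delta_2 S_2\rho\sigma_2 &= \gamma_1,\\
\frac{1}{\beta}\Delta_2 S_2\sqrt{1 - \rho^2}\,\sigma_2 &= \gamma_2
\end{aligned}
\]
for the hedging portfolio $(\Delta_1, \Delta_2)$. With this choice of $(\Delta_1, \Delta_2)$ and setting
\[
X(0) = Y(0) = \widetilde{\mathbb{E}}\left[ \frac{V}{\beta(T)} \right],
\]
we have $\frac{X(t)}{\beta(t)} = Y(t)$, $0 \le t \le T$, and in particular,
\[
X(T) = V.
\]
Every $\mathcal{F}(T)$-measurable random variable can be hedged; the market is complete.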
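Because the second hedging equation involves only $\Delta_2$, the system is triangular and can be solved by back-substitution. The sketch below does this at a single time instant, treating $\gamma_1, \gamma_2$ as given numbers; in practice they come from the martingale representation and are not computed here. The function name `hedge_ratios` is hypothetical.

```python
import math

def hedge_ratios(gamma1, gamma2, S1, S2, sigma1, sigma2, rho, beta):
    """Solve the two hedging equations for (Delta1, Delta2) when -1 < rho < 1."""
    root = math.sqrt(1.0 - rho**2)
    # Second equation: (1/beta) * Delta2 * S2 * sqrt(1 - rho^2) * sigma2 = gamma2
    delta2 = beta * gamma2 / (S2 * root * sigma2)
    # First equation: (1/beta) * (Delta1*S1*sigma1 + Delta2*S2*rho*sigma2) = gamma1
    delta1 = (beta * gamma1 - delta2 * S2 * rho * sigma2) / (S1 * sigma1)
    return delta1, delta2
```

The division by $\sqrt{1 - \rho^2}$ is where the assumption $-1 < \rho < 1$ enters: when $|\rho| = 1$ the second equation cannot be solved unless $\gamma_2 = 0$.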
19.2
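Hedging when $\rho = 1$
The case $\rho = -1$ is analogous. Assume that $\rho = 1$. Then
\[
\begin{aligned}
dS_1 &= S_1\left[ \mu_1\,dt + \sigma_1\,dB_1 \right],\\
dS_2 &= S_2\left[ \mu_2\,dt + \sigma_2\,dB_1 \right].
\end{aligned}
\]
The stocks are perfectly correlated.
The market price of risk equations are
\[
\begin{aligned}
\sigma_1\theta_1 &= \mu_1 - r,\\
\sigma_2\theta_1 &= \mu_2 - r.
\end{aligned}
\tag{MPR}
\]
The process $\theta_2$ is free. There are two cases: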
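Case I: $\frac{\mu_1 - r}{\sigma_1} \ne \frac{\mu_2 - r}{\sigma_2}$. There is no solution to (MPR), and consequently, there is no risk-neutral measure. This market admits arbitrage. Indeed
\[
\begin{aligned}
d\left( \frac{X}{\beta} \right) &= \frac{1}{\beta}\Delta_1(dS_1 - rS_1\,dt) + \frac{1}{\beta}\Delta_2(dS_2 - rS_2\,dt)\\
&= \frac{1}{\beta}\Delta_1 S_1\left[ (\mu_1 - r)\,dt + \sigma_1\,dB_1 \right] + \frac{1}{\beta}\Delta_2 S_2\left[ (\mu_2 - r)\,dt + \sigma_2\,dB_1 \right].
\end{aligned}
\]
Suppose $\frac{\mu_1 - r}{\sigma_1} > \frac{\mu_2 - r}{\sigma_2}$. Set
\[
\Delta_1 = \frac{1}{\sigma_1 S_1}, \qquad \Delta_2 = -\frac{1}{\sigma_2 S_2}.
\]
Then
\[
\begin{aligned}
d\left( \frac{X}{\beta} \right) &= \frac{1}{\beta}\left[ \frac{\mu_1 - r}{\sigma_1}\,dt + dB_1 \right] - \frac{1}{\beta}\left[ \frac{\mu_2 - r}{\sigma_2}\,dt + dB_1 \right]\\
&= \frac{1}{\beta} \underbrace{\left[ \frac{\mu_1 - r}{\sigma_1} - \frac{\mu_2 - r}{\sigma_2} \right]}_{\text{Positive}} dt.
\end{aligned}
\]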
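The arbitrage portfolio can be checked by computing the $dt$ and $dB_1$ coefficients of $d(X/\beta)$ for given positions. This is an illustrative sketch at one time instant with hypothetical numbers; `discounted_wealth_coeffs` is not a name from the notes.

```python
def discounted_wealth_coeffs(delta1, delta2, S1, S2, mu1, mu2, sigma1, sigma2, r, beta):
    """Return (dt coefficient, dB1 coefficient) of d(X/beta) when rho = 1."""
    drift = (delta1 * S1 * (mu1 - r) + delta2 * S2 * (mu2 - r)) / beta
    diffusion = (delta1 * S1 * sigma1 + delta2 * S2 * sigma2) / beta
    return drift, diffusion

# Hypothetical market with (mu1 - r)/sigma1 = 0.4 > (mu2 - r)/sigma2 = 0.1333...
S1, S2, mu1, mu2, sigma1, sigma2, r, beta = 40.0, 25.0, 0.10, 0.06, 0.2, 0.3, 0.02, 1.05
delta1 = 1.0 / (sigma1 * S1)       # long stock 1
delta2 = -1.0 / (sigma2 * S2)      # short stock 2
drift, diffusion = discounted_wealth_coeffs(
    delta1, delta2, S1, S2, mu1, mu2, sigma1, sigma2, r, beta
)
```

With these positions the Brownian terms cancel, so `diffusion` should be zero up to rounding, while `drift` is strictly positive: discounted wealth grows with no risk.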
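Case II: $\frac{\mu_1 - r}{\sigma_1} = \frac{\mu_2 - r}{\sigma_2}$. The market price of risk equations
\[
\begin{aligned}
\sigma_1\theta_1 &= \mu_1 - r,\\
\sigma_2\theta_1 &= \mu_2 - r
\end{aligned}
\]
have the solution
\[
\theta_1 = \frac{\mu_1 - r}{\sigma_1} = \frac{\mu_2 - r}{\sigma_2};
\]
$\theta_2$ is free; there are infinitely many risk-neutral measures. Let $\widetilde{\mathbb{P}}$ be one of them.
Hedging:
\[
\begin{aligned}
d\left( \frac{X}{\beta} \right) &= \frac{1}{\beta}\Delta_1 S_1\left[ (\mu_1 - r)\,dt + \sigma_1\,dB_1 \right] + \frac{1}{\beta}\Delta_2 S_2\left[ (\mu_2 - r)\,dt + \sigma_2\,dB_1 \right]\\
&= \frac{1}{\beta}\Delta_1 S_1\sigma_1\left[ \theta_1\,dt + dB_1 \right] + \frac{1}{\beta}\Delta_2 S_2\sigma_2\left[ \theta_1\,dt + dB_1 \right]\\
&= \left( \frac{1}{\beta}\Delta_1 S_1\sigma_1 + \frac{1}{\beta}\Delta_2 S_2\sigma_2 \right) d\widetilde{B}_1.
\end{aligned}
\]
Notice that $\widetilde{B}_2$ does not appear.
Let $V$ be an $\mathcal{F}(T)$-measurable random variable. If $V$ depends on $B_2$, then it can probably not be hedged. For example, if
\[
V = h(S_1(T), S_2(T)),
\]
and $\sigma_1$ or $\sigma_2$ depend on $B_2$, then there is trouble.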
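More precisely, we define the $\widetilde{\mathbb{P}}$-martingale
\[
Y(t) = \widetilde{\mathbb{E}}\left[ \frac{V}{\beta(T)} \,\Big|\, \mathcal{F}(t) \right], \qquad 0 \le t \le T.
\]
We can write
\[
Y(t) = Y(0) + \int_0^t \gamma_1\,d\widetilde{B}_1 + \int_0^t \gamma_2\,d\widetilde{B}_2,
\]
so
\[
dY = \gamma_1\,d\widetilde{B}_1 + \gamma_2\,d\widetilde{B}_2.
\]
To get $d\left( \frac{X}{\beta} \right)$ to match $dY$, we must have
\[
\gamma_2 = 0.
\]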
Chapter 20
Pricing Exotic Options
20.1
Reflection principle for Brownian motion
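Without drift.
Define
\[
M(T) = \max_{0 \le t \le T} B(t).
\]
Then we have:
\[
\begin{aligned}
\mathbb{P}\{M(T) > m,\; B(T) < b\} &= \mathbb{P}\{B(T) > 2m - b\}\\
&= \frac{1}{\sqrt{2\pi T}} \int_{2m - b}^{\infty} \exp\left\{ -\frac{x^2}{2T} \right\} dx, \qquad m > 0,\; b < m.
\end{aligned}
\]
So the joint density is
\[
\begin{aligned}
\mathbb{P}\{M(T) \in dm,\; B(T) \in db\} &= -\frac{\partial^2}{\partial m\,\partial b} \left( \frac{1}{\sqrt{2\pi T}} \int_{2m - b}^{\infty} \exp\left\{ -\frac{x^2}{2T} \right\} dx \right) dm\,db\\
&= -\frac{\partial}{\partial m} \left( \frac{1}{\sqrt{2\pi T}} \exp\left\{ -\frac{(2m - b)^2}{2T} \right\} \right) dm\,db\\
&= \frac{2(2m - b)}{T\sqrt{2\pi T}} \exp\left\{ -\frac{(2m - b)^2}{2T} \right\} dm\,db, \qquad m > 0,\; b < m.
\end{aligned}
\]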
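One consequence of the joint density is easy to test numerically: integrating out $b$ over $(-\infty, m)$ must give the density of the maximum, $\frac{2}{\sqrt{2\pi T}} e^{-m^2/(2T)}$, which is what the reflection principle $\mathbb{P}\{M(T) > m\} = 2\,\mathbb{P}\{B(T) > m\}$ predicts. The sketch below uses a midpoint rule; the function names and grid sizes are illustrative assumptions.

```python
import math

def joint_density(m, b, T):
    """Joint density of (M(T), B(T)) for driftless Brownian motion; zero unless m > 0, b < m."""
    if m <= 0.0 or b >= m:
        return 0.0
    c = 2.0 * m - b
    return 2.0 * c / (T * math.sqrt(2.0 * math.pi * T)) * math.exp(-c * c / (2.0 * T))

def marginal_of_max(m, T, n=20000, width=12.0):
    """Integrate the joint density over b in (m - width*sqrt(T), m) by the midpoint rule."""
    lo = m - width * math.sqrt(T)
    h = (m - lo) / n
    return h * sum(joint_density(m, lo + (i + 0.5) * h, T) for i in range(n))

def density_of_max(m, T):
    """Density of M(T) implied by P{M(T) > m} = 2 P{B(T) > m}, for m > 0."""
    return 2.0 / math.sqrt(2.0 * math.pi * T) * math.exp(-m * m / (2.0 * T))
```

For any $m > 0$, `marginal_of_max(m, T)` and `density_of_max(m, T)` should agree up to quadrature error.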
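[Figure 20.1: Reflection Principle for Brownian motion without drift. The figure shows a Brownian path, the levels $b$, $m$ and $2m - b$, and the reflected "shadow path".]
[Figure 20.2: Possible values of $B(T)$, $M(T)$. The pair $(B(T), M(T))$ lies in the region $m \ge 0$, $b \le m$.]
With drift. Let
\[
\widetilde{B}(t) = \theta t + B(t),
\]
where $B(t)$, $0 \le t \le T$, is a Brownian motion (without drift) on $(\Omega, \mathcal{F}, \mathbb{P})$. Define
\[
\begin{aligned}
Z(T) &= \exp\left\{ -\theta B(T) - \tfrac{1}{2}\theta^2 T \right\}\\
&= \exp\left\{ -\theta (B(T) + \theta T) + \tfrac{1}{2}\theta^2 T \right\}\\
&= \exp\left\{ -\theta \widetilde{B}(T) + \tfrac{1}{2}\theta^2 T \right\},\\
\widetilde{\mathbb{P}}(A) &= \int_A Z(T)\,d\mathbb{P}, \qquad \forall A \in \mathcal{F}.
\end{aligned}
\]
Set $\widetilde{M}(T) = \max_{0 \le t \le T} \widetilde{B}(t)$.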
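Under $\widetilde{\mathbb{P}}$, $\widetilde{B}$ is a Brownian motion (without drift), so
\[
\widetilde{\mathbb{P}}\{\widetilde{M}(T) \in d\tilde{m},\; \widetilde{B}(T) \in d\tilde{b}\} = \frac{2(2\tilde{m} - \tilde{b})}{T\sqrt{2\pi T}} \exp\left\{ -\frac{(2\tilde{m} - \tilde{b})^2}{2T} \right\} d\tilde{m}\,d\tilde{b}, \qquad \tilde{m} > 0,\; \tilde{b} < \tilde{m}.
\]
Let $h(\tilde{m}, \tilde{b})$ be a function of two variables. Then
\[
\begin{aligned}
\mathbb{E}\, h(\widetilde{M}(T), \widetilde{B}(T)) &= \widetilde{\mathbb{E}}\left[ \frac{h(\widetilde{M}(T), \widetilde{B}(T))}{Z(T)} \right]\\
&= \widetilde{\mathbb{E}}\left[ h(\widetilde{M}(T), \widetilde{B}(T)) \exp\left\{ \theta \widetilde{B}(T) - \tfrac{1}{2}\theta^2 T \right\} \right]\\
&= \int_{\tilde{m} = 0}^{\infty} \int_{\tilde{b} = -\infty}^{\tilde{m}} h(\tilde{m}, \tilde{b}) \exp\left\{ \theta \tilde{b} - \tfrac{1}{2}\theta^2 T \right\} \widetilde{\mathbb{P}}\{\widetilde{M}(T) \in d\tilde{m},\; \widetilde{B}(T) \in d\tilde{b}\}.
\end{aligned}
\]
But also,
\[
\mathbb{E}\, h(\widetilde{M}(T), \widetilde{B}(T)) = \int_{\tilde{m} = 0}^{\infty} \int_{\tilde{b} = -\infty}^{\tilde{m}} h(\tilde{m}, \tilde{b})\, \mathbb{P}\{\widetilde{M}(T) \in d\tilde{m},\; \widetilde{B}(T) \in d\tilde{b}\}.
\]
Since $h$ is arbitrary, we conclude that
\[
\begin{aligned}
\mathbb{P}\{\widetilde{M}(T) \in d\tilde{m},\; \widetilde{B}(T) \in d\tilde{b}\} &= \exp\left\{ \theta \tilde{b} - \tfrac{1}{2}\theta^2 T \right\} \widetilde{\mathbb{P}}\{\widetilde{M}(T) \in d\tilde{m},\; \widetilde{B}(T) \in d\tilde{b}\}\\
&= \frac{2(2\tilde{m} - \tilde{b})}{T\sqrt{2\pi T}} \exp\left\{ -\frac{(2\tilde{m} - \tilde{b})^2}{2T} \right\} \exp\left\{ \theta \tilde{b} - \tfrac{1}{2}\theta^2 T \right\} d\tilde{m}\,d\tilde{b}, \qquad \tilde{m} > 0,\; \tilde{b} < \tilde{m}.
\end{aligned}
\]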
20.2
Up and out European call.
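Let $0 < K < L$ be given. The payoff at time $T$ is
\[
(S(T) - K)^+ 1_{\{S^*(T) < L\}},
\]
where
\[
S^*(T) = \max_{0 \le t \le T} S(t).
\]
To simplify notation, assume that $\mathbb{P}$ is already the risk-neutral measure, so the value at time zero of the option is
\[
v(0, S(0)) = e^{-rT}\, \mathbb{E}\left[ (S(T) - K)^+ 1_{\{S^*(T) < L\}} \right].
\]
Because $\mathbb{P}$ is the risk-neutral measure,
\[
\begin{aligned}
dS(t) &= rS(t)\,dt + \sigma S(t)\,dB(t),\\
S(t) &= S_0 \exp\left\{ \sigma B(t) + \left(r - \tfrac{1}{2}\sigma^2\right) t \right\}\\
&= S_0 \exp\Big\{ \sigma \Big[ B(t) + \underbrace{\left( \frac{r}{\sigma} - \frac{\sigma}{2} \right)}_{\theta} t \Big] \Big\}\\
&= S_0 \exp\{ \sigma \widetilde{B}(t) \},
\end{aligned}
\]
where
\[
\theta = \frac{r}{\sigma} - \frac{\sigma}{2}, \qquad \widetilde{B}(t) = \theta t + B(t).
\]
Consequently,
\[
S^*(t) = S_0 \exp\{ \sigma \widetilde{M}(t) \},
\]
where
\[
\widetilde{M}(t) = \max_{0 \le u \le t} \widetilde{B}(u).
\]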
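We compute,
\[
\begin{aligned}
v(0, S(0)) &= e^{-rT}\, \mathbb{E}\left[ (S(T) - K)^+ 1_{\{S^*(T) < L\}} \right]\\
&= e^{-rT}\, \mathbb{E}\left[ \left( S_0 \exp\{\sigma \widetilde{B}(T)\} - K \right)^+ 1_{\{S_0 \exp\{\sigma \widetilde{M}(T)\} < L\}} \right]\\
&= e^{-rT}\, \mathbb{E}\left[ \left( S_0 \exp\{\sigma \widetilde{B}(T)\} - K \right) 1_{\left\{ \widetilde{B}(T) > \tilde{b},\; \widetilde{M}(T) < \tilde{m} \right\}} \right],
\end{aligned}
\]
where
\[
\tilde{b} = \frac{1}{\sigma} \log\frac{K}{S_0}, \qquad \tilde{m} = \frac{1}{\sigma} \log\frac{L}{S_0}.
\]
[Figure 20.3: Possible values of $\widetilde{B}(T)$, $\widetilde{M}(T)$. With $x$ the value of $\widetilde{B}(T)$ and $y$ the value of $\widetilde{M}(T)$, the pair lies above the line $x = y$, with $\tilde{b}$ marked on the $x$-axis and $\tilde{m}$ on the $y$-axis.]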
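We consider only the case
\[
S_0 \le K < L, \quad \text{so } 0 \le \tilde{b} < \tilde{m}.
\]
The other case, $K < S_0 \le L$, leads to $\tilde{b} < 0 \le \tilde{m}$ and the analysis is similar.
We compute $\int_{\tilde{b}}^{\tilde{m}} \int_x^{\tilde{m}} \ldots\, dy\,dx$:
\[
\begin{aligned}
v(0, S_0) ={}& e^{-rT} \int_{\tilde{b}}^{\tilde{m}} \int_x^{\tilde{m}} \left( S_0 e^{\sigma x} - K \right) \frac{2(2y - x)}{T\sqrt{2\pi T}} \exp\left\{ -\frac{(2y - x)^2}{2T} + \theta x - \tfrac{1}{2}\theta^2 T \right\} dy\,dx\\
={}& -e^{-rT} \int_{\tilde{b}}^{\tilde{m}} \left( S_0 e^{\sigma x} - K \right) \frac{1}{\sqrt{2\pi T}} \exp\left\{ -\frac{(2y - x)^2}{2T} + \theta x - \tfrac{1}{2}\theta^2 T \right\} \Bigg|_{y = x}^{y = \tilde{m}} dx\\
={}& e^{-rT} \int_{\tilde{b}}^{\tilde{m}} \left( S_0 e^{\sigma x} - K \right) \frac{1}{\sqrt{2\pi T}} \left[ \exp\left\{ -\frac{x^2}{2T} + \theta x - \tfrac{1}{2}\theta^2 T \right\} - \exp\left\{ -\frac{(2\tilde{m} - x)^2}{2T} + \theta x - \tfrac{1}{2}\theta^2 T \right\} \right] dx\\
={}& \frac{1}{\sqrt{2\pi T}}\, e^{-rT} S_0 \int_{\tilde{b}}^{\tilde{m}} \exp\left\{ \sigma x - \frac{x^2}{2T} + \theta x - \tfrac{1}{2}\theta^2 T \right\} dx\\
& - \frac{1}{\sqrt{2\pi T}}\, e^{-rT} K \int_{\tilde{b}}^{\tilde{m}} \exp\left\{ -\frac{x^2}{2T} + \theta x - \tfrac{1}{2}\theta^2 T \right\} dx\\
& - \frac{1}{\sqrt{2\pi T}}\, e^{-rT} S_0 \int_{\tilde{b}}^{\tilde{m}} \exp\left\{ \sigma x - \frac{(2\tilde{m} - x)^2}{2T} + \theta x - \tfrac{1}{2}\theta^2 T \right\} dx\\
& + \frac{1}{\sqrt{2\pi T}}\, e^{-rT} K \int_{\tilde{b}}^{\tilde{m}} \exp\left\{ -\frac{(2\tilde{m} - x)^2}{2T} + \theta x - \tfrac{1}{2}\theta^2 T \right\} dx.
\end{aligned}
\]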
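The standard method for all these integrals is to complete the square in the exponent and then recognize a cumulative normal distribution. We carry out the details for the first integral and just give the result for the other three. The exponent in the first integrand is
\[
\begin{aligned}
\sigma x - \frac{x^2}{2T} + \theta x - \tfrac{1}{2}\theta^2 T &= -\frac{1}{2T}(x - \sigma T - \theta T)^2 + \tfrac{1}{2}\sigma^2 T + \sigma\theta T\\
&= -\frac{1}{2T}\left( x - \frac{rT}{\sigma} - \frac{\sigma T}{2} \right)^2 + rT.
\end{aligned}
\]
In the first integral we make the change of variable
\[
y = \left( x - \frac{rT}{\sigma} - \frac{\sigma T}{2} \right) \Big/ \sqrt{T}, \qquad dy = dx / \sqrt{T},
\]
to obtain
\[
\begin{aligned}
\frac{e^{-rT} S_0}{\sqrt{2\pi T}} \int_{\tilde{b}}^{\tilde{m}} \exp\left\{ \sigma x - \frac{x^2}{2T} + \theta x - \tfrac{1}{2}\theta^2 T \right\} dx
&= \frac{1}{\sqrt{2\pi T}}\, S_0 \int_{\tilde{b}}^{\tilde{m}} \exp\left\{ -\frac{1}{2T}\left( x - \frac{rT}{\sigma} - \frac{\sigma T}{2} \right)^2 \right\} dx\\
&= \frac{1}{\sqrt{2\pi}}\, S_0 \int_{\frac{\tilde{b}}{\sqrt{T}} - \frac{r\sqrt{T}}{\sigma} - \frac{\sigma\sqrt{T}}{2}}^{\frac{\tilde{m}}{\sqrt{T}} - \frac{r\sqrt{T}}{\sigma} - \frac{\sigma\sqrt{T}}{2}} \exp\left\{ -\frac{y^2}{2} \right\} dy\\
&= S_0 \left[ N\!\left( \frac{\tilde{m}}{\sqrt{T}} - \frac{r\sqrt{T}}{\sigma} - \frac{\sigma\sqrt{T}}{2} \right) - N\!\left( \frac{\tilde{b}}{\sqrt{T}} - \frac{r\sqrt{T}}{\sigma} - \frac{\sigma\sqrt{T}}{2} \right) \right].
\end{aligned}
\]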
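Putting all four integrals together, we have
\[
\begin{aligned}
v(0, S_0) ={}& S_0 \left[ N\!\left( \frac{\tilde{m}}{\sqrt{T}} - \frac{r\sqrt{T}}{\sigma} - \frac{\sigma\sqrt{T}}{2} \right) - N\!\left( \frac{\tilde{b}}{\sqrt{T}} - \frac{r\sqrt{T}}{\sigma} - \frac{\sigma\sqrt{T}}{2} \right) \right]\\
& - e^{-rT} K \left[ N\!\left( \frac{\tilde{m}}{\sqrt{T}} - \frac{r\sqrt{T}}{\sigma} + \frac{\sigma\sqrt{T}}{2} \right) - N\!\left( \frac{\tilde{b}}{\sqrt{T}} - \frac{r\sqrt{T}}{\sigma} + \frac{\sigma\sqrt{T}}{2} \right) \right]\\
& - S_0 \exp\left\{ 2\tilde{m}\left( \frac{r}{\sigma} + \frac{\sigma}{2} \right) \right\} \left[ N\!\left( \frac{2\tilde{m} - \tilde{b}}{\sqrt{T}} + \frac{r\sqrt{T}}{\sigma} + \frac{\sigma\sqrt{T}}{2} \right) - N\!\left( \frac{\tilde{m}}{\sqrt{T}} + \frac{r\sqrt{T}}{\sigma} + \frac{\sigma\sqrt{T}}{2} \right) \right]\\
& + \exp\left\{ -rT + 2\tilde{m}\left( \frac{r}{\sigma} - \frac{\sigma}{2} \right) \right\} K \left[ N\!\left( \frac{2\tilde{m} - \tilde{b}}{\sqrt{T}} + \frac{r\sqrt{T}}{\sigma} - \frac{\sigma\sqrt{T}}{2} \right) - N\!\left( \frac{\tilde{m}}{\sqrt{T}} + \frac{r\sqrt{T}}{\sigma} - \frac{\sigma\sqrt{T}}{2} \right) \right],
\end{aligned}
\]
where
\[
\tilde{b} = \frac{1}{\sigma} \log\frac{K}{S_0}, \qquad \tilde{m} = \frac{1}{\sigma} \log\frac{L}{S_0}.
\]
The signs of the last two terms follow from the four integrals above: the third integral enters with a minus sign and the fourth with a plus sign, and each bracket is positive because $2\tilde{m} - \tilde{b} > \tilde{m}$.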
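[Figure 20.4: Initial and boundary conditions. On the region $0 \le t \le T$, $0 \le x \le L$: $v(t, L) = 0$, $v(T, x) = (x - K)^+$, $v(t, 0) = 0$.]
If we let $L \to \infty$ we obtain the classical Black-Scholes formula
\[
\begin{aligned}
v(0, S_0) ={}& S_0 \left[ 1 - N\!\left( \frac{\tilde{b}}{\sqrt{T}} - \frac{r\sqrt{T}}{\sigma} - \frac{\sigma\sqrt{T}}{2} \right) \right] - e^{-rT} K \left[ 1 - N\!\left( \frac{\tilde{b}}{\sqrt{T}} - \frac{r\sqrt{T}}{\sigma} + \frac{\sigma\sqrt{T}}{2} \right) \right]\\
={}& S_0\, N\!\left( \frac{1}{\sigma\sqrt{T}} \log\frac{S_0}{K} + \frac{r\sqrt{T}}{\sigma} + \frac{\sigma\sqrt{T}}{2} \right) - e^{-rT} K\, N\!\left( \frac{1}{\sigma\sqrt{T}} \log\frac{S_0}{K} + \frac{r\sqrt{T}}{\sigma} - \frac{\sigma\sqrt{T}}{2} \right).
\end{aligned}
\]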
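The up-and-out value can also be computed without the closed form, by integrating the one-dimensional expression that remains once the $dy$ integral has been carried out. The sketch below does this with a midpoint rule for the case $S_0 \le K < L$ and compares the large-$L$ limit with the Black-Scholes formula. Function names and grid size are assumptions made for illustration, and the quadrature is a numerical check rather than the method of the notes.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bs_call0(S0, K, r, sigma, T):
    """Classical Black-Scholes call value at time zero."""
    d_plus = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    return S0 * norm_cdf(d_plus) - math.exp(-r * T) * K * norm_cdf(d_plus - sigma * math.sqrt(T))

def up_and_out_call(S0, K, L, r, sigma, T, n=20000):
    """Time-zero value for S0 <= K < L, by midpoint quadrature over x in (b~, m~)."""
    theta = r / sigma - sigma / 2.0
    b = math.log(K / S0) / sigma      # b~ = (1/sigma) log(K/S0)
    m = math.log(L / S0) / sigma      # m~ = (1/sigma) log(L/S0)
    h = (m - b) / n
    total = 0.0
    for i in range(n):
        x = b + (i + 0.5) * h
        girsanov = math.exp(theta * x - 0.5 * theta**2 * T)
        # Result of integrating the joint density over y from x to m~.
        kernel = math.exp(-x * x / (2 * T)) - math.exp(-((2 * m - x) ** 2) / (2 * T))
        total += (S0 * math.exp(sigma * x) - K) * girsanov * kernel
    return math.exp(-r * T) * total * h / math.sqrt(2 * math.pi * T)
```

With $S_0 = K = 100$, $r = 0.05$, $\sigma = 0.2$, $T = 1$, a very high barrier such as $L = 1000$ should reproduce the Black-Scholes value, while a barrier at $L = 120$ gives a strictly smaller, positive value.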
If we replace $T$ by $T - t$ and replace $S_0$ by $x$ in the formula for $v(0, S_0)$, we obtain a formula for $v(t, x)$, the value of the option at the time $t$ if $S(t) = x$. We have actually derived the formula under the assumption $x \le K \le L$, but a similar albeit longer formula can also be derived for $K < x \le L$. We consider the function
\[
v(t, x) = \mathbb{E}^{t,x}\left[ e^{-r(T - t)} (S(T) - K)^+ 1_{\{S^*(T) < L\}} \right], \qquad 0 \le t \le T,\; 0 \le x \le L.
\]
This function satisfies the terminal condition
\[
v(T, x) = (x - K)^+, \qquad 0 \le x < L
\]
and the boundary conditions
\[
\begin{aligned}
v(t, 0) &= 0, \qquad 0 \le t \le T,\\
v(t, L) &= 0, \qquad 0 \le t \le T.
\end{aligned}
\]
We show that $v$ satisfies the Black-Scholes equation
\[
-rv + v_t + rx\,v_x + \tfrac{1}{2}\sigma^2 x^2 v_{xx} = 0, \qquad 0 \le t < T,\; 0 \le x \le L.
\]
Let $S_0 > 0$ be given and define the stopping time
\[
\tau = \min\{ t \ge 0;\; S(t) = L \}.
\]
Theorem 2.61 The process
\[
e^{-r(t \wedge \tau)}\, v(t \wedge \tau, S(t \wedge \tau)), \qquad 0 \le t \le T,
\]
is a martingale.
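Proof: First note that
\[
S^*(T) < L \iff \tau > T.
\]
Let $\omega \in \Omega$ be given, and choose $t \in [0, T]$. If $\tau(\omega) \le t$, then
\[
\mathbb{E}\left[ e^{-rT} (S(T) - K)^+ 1_{\{S^*(T) < L\}} \,\Big|\, \mathcal{F}(t) \right](\omega) = 0.
\]
But when $\tau(\omega) \le t$, we have
\[
v(t \wedge \tau(\omega), S(t \wedge \tau(\omega), \omega)) = v(\tau(\omega), L) = 0,
\]
so we may write
\[
\mathbb{E}\left[ e^{-rT} (S(T) - K)^+ 1_{\{S^*(T) < L\}} \,\Big|\, \mathcal{F}(t) \right](\omega) = e^{-r(t \wedge \tau(\omega))}\, v(t \wedge \tau(\omega), S(t \wedge \tau(\omega), \omega)).
\]
On the other hand, if $\tau(\omega) > t$, then the Markov property implies
\[
\begin{aligned}
\mathbb{E}\left[ e^{-rT} (S(T) - K)^+ 1_{\{S^*(T) < L\}} \,\Big|\, \mathcal{F}(t) \right](\omega) &= \mathbb{E}^{t, S(t,\omega)}\left[ e^{-rT} (S(T) - K)^+ 1_{\{S^*(T) < L\}} \right]\\
&= e^{-rt}\, v(t, S(t, \omega))\\
&= e^{-r(t \wedge \tau(\omega))}\, v(t \wedge \tau(\omega), S(t \wedge \tau(\omega), \omega)).
\end{aligned}
\]
In both cases, we have
\[
e^{-r(t \wedge \tau)}\, v(t \wedge \tau, S(t \wedge \tau)) = \mathbb{E}\left[ e^{-rT} (S(T) - K)^+ 1_{\{S^*(T) < L\}} \,\Big|\, \mathcal{F}(t) \right].
\]
Suppose $0 \le u \le t \le T$. Then
\[
\begin{aligned}
\mathbb{E}\left[ e^{-r(t \wedge \tau)}\, v(t \wedge \tau, S(t \wedge \tau)) \,\Big|\, \mathcal{F}(u) \right] &= \mathbb{E}\left[ \mathbb{E}\left[ e^{-rT} (S(T) - K)^+ 1_{\{S^*(T) < L\}} \,\Big|\, \mathcal{F}(t) \right] \Big|\, \mathcal{F}(u) \right]\\
&= \mathbb{E}\left[ e^{-rT} (S(T) - K)^+ 1_{\{S^*(T) < L\}} \,\Big|\, \mathcal{F}(u) \right]\\
&= e^{-r(u \wedge \tau)}\, v(u \wedge \tau, S(u \wedge \tau)).
\end{aligned}
\]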
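For $0 \le t \le T$, we compute the differential
\[
d\left( e^{-rt} v(t, S(t)) \right) = e^{-rt}\left( -rv + v_t + rS v_x + \tfrac{1}{2}\sigma^2 S^2 v_{xx} \right) dt + e^{-rt}\sigma S v_x\,dB.
\]
Integrate from 0 to $t \wedge \tau$:
\[
\begin{aligned}
e^{-r(t \wedge \tau)}\, v(t \wedge \tau, S(t \wedge \tau)) ={}& v(0, S(0)) + \int_0^{t \wedge \tau} e^{-ru}\left( -rv + v_t + rS v_x + \tfrac{1}{2}\sigma^2 S^2 v_{xx} \right) du\\
& + \underbrace{\int_0^{t \wedge \tau} e^{-ru}\sigma S v_x\,dB}_{\text{A stopped martingale is still a martingale}}.
\end{aligned}
\]
Because $e^{-r(t \wedge \tau)}\, v(t \wedge \tau, S(t \wedge \tau))$ is also a martingale, the Riemann integral
\[
\int_0^{t \wedge \tau} e^{-ru}\left( -rv + v_t + rS v_x + \tfrac{1}{2}\sigma^2 S^2 v_{xx} \right) du
\]
is a martingale. Therefore,
\[
-rv(u, S(u)) + v_t(u, S(u)) + rS(u)\,v_x(u, S(u)) + \tfrac{1}{2}\sigma^2 S^2(u)\,v_{xx}(u, S(u)) = 0, \qquad 0 \le u \le t \wedge \tau.
\]
The PDE
\[
-rv + v_t + rx\,v_x + \tfrac{1}{2}\sigma^2 x^2 v_{xx} = 0, \qquad 0 \le t \le T,\; 0 \le x \le L,
\]
then follows.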
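The Hedge
\[
d\left( e^{-rt} v(t, S(t)) \right) = e^{-rt}\sigma S(t)\,v_x(t, S(t))\,dB(t), \qquad 0 \le t \le \tau.
\]
Let $X(t)$ be the wealth process corresponding to some portfolio $\Delta(t)$. Then
\[
d\left( e^{-rt} X(t) \right) = e^{-rt}\Delta(t)\sigma S(t)\,dB(t).
\]
We should take
\[
X(0) = v(0, S(0))
\]
and
\[
\Delta(t) = v_x(t, S(t)), \qquad 0 \le t \le T \wedge \tau.
\]
Then
\[
X(T \wedge \tau) = v(T \wedge \tau, S(T \wedge \tau)) =
\begin{cases}
v(T, S(T)) = (S(T) - K)^+ & \text{if } \tau > T,\\
v(\tau, L) = 0 & \text{if } \tau \le T.
\end{cases}
\]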
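[Figure 20.5: Practical issue. Top panel: $v(T, x)$ plotted against $x$; bottom panel: $v(t, x)$ plotted against $x$. The points 0, $K$ and $L$ are marked on the $x$-axis in both panels.]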
20.3
A practical issue
For $t < T$ but $t$ near $T$, $v(t, x)$ has the form shown in the bottom part of Fig. 20.5.
In particular, the hedging portfolio
\[
\Delta(t) = v_x(t, S(t))
\]
can become very negative near the knockout boundary. The hedger is in an unstable situation. He should take a large short position in the stock. If the stock does not cross the barrier $L$, he covers this short position with funds from the money market, pays off the option, and is left with zero. If the stock moves across the barrier, he is now in a region of $\Delta(t) = v_x(t, S(t))$ near zero. He should cover his short position with the money market. This is more expensive than before, because the stock price has risen, and consequently he is left with no money. However, the option has “knocked out”, so no money is needed to pay it off.
Because a large short position is being taken, a small error in hedging can create a significant effect. Here is a possible resolution.
Rather than using the boundary condition
\[
v(t, L) = 0, \qquad 0 \le t \le T,
\]
solve the PDE with the boundary condition
\[
v(t, L) + \alpha L\,v_x(t, L) = 0, \qquad 0 \le t \le T,
\]
where $\alpha$ is a “tolerance parameter”, say 1%. At the boundary, $L\,v_x(t, L)$ is the dollar size of the short position. The new boundary condition guarantees:
1. $L\,v_x(t, L)$ remains bounded;
2. The value of the portfolio is always sufficient to cover a hedging error of $\alpha$ times the dollar size of the short position.
Chapter 21
Asian Options
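Stock:
\[
dS(t) = rS(t)\,dt + \sigma S(t)\,dB(t).
\]
Payoff:
\[
V = h\left( \int_0^T S(t)\,dt \right).
\]
Value of the payoff at time zero:
\[
X(0) = \mathbb{E}\left[ e^{-rT} h\left( \int_0^T S(t)\,dt \right) \right].
\]
Introduce an auxiliary process $Y(t)$ by specifying
\[
dY(t) = S(t)\,dt.
\]
With the initial conditions
\[
S(t) = x, \qquad Y(t) = y,
\]
we have the solutions
\[
\begin{aligned}
S(T) &= x \exp\left\{ \sigma (B(T) - B(t)) + \left(r - \tfrac{1}{2}\sigma^2\right)(T - t) \right\},\\
Y(T) &= y + \int_t^T S(u)\,du.
\end{aligned}
\]
Define the undiscounted expected payoff
\[
u(t, x, y) = \mathbb{E}^{t,x,y} h(Y(T)), \qquad 0 \le t \le T,\; x \ge 0,\; y \in \mathbb{R}.
\]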
21.1
Feynman-Kac Theorem
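The function $u$ satisfies the PDE
\[
u_t + rx\,u_x + \tfrac{1}{2}\sigma^2 x^2 u_{xx} + x\,u_y = 0, \qquad 0 \le t \le T,\; x \ge 0,\; y \in \mathbb{R},
\]
the terminal condition
\[
u(T, x, y) = h(y), \qquad x \ge 0,\; y \in \mathbb{R},
\]
and the boundary condition
\[
u(t, 0, y) = h(y), \qquad 0 \le t \le T,\; y \in \mathbb{R}.
\]
One can solve this equation. Then
\[
v\left( t, S(t), \int_0^t S(u)\,du \right)
\]
is the option value at time $t$, where
\[
v(t, x, y) = e^{-r(T - t)} u(t, x, y).
\]
The PDE for $v$ is
\[
-rv + v_t + rx\,v_x + \tfrac{1}{2}\sigma^2 x^2 v_{xx} + x\,v_y = 0, \tag{1.1}
\]
\[
v(T, x, y) = h(y), \qquad v(t, 0, y) = e^{-r(T - t)} h(y).
\]
One can solve this equation rather than the equation for $u$.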
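The notes propose solving the PDE; as a cross-check on the probabilistic definition, the time-zero value $e^{-rT}\,\mathbb{E}\,h(Y(T))$ can also be estimated by simulating the pair $(S, Y)$. The following Monte Carlo sketch is an alternative technique, not the PDE method of the text: the exact lognormal step for $S$, the left-endpoint rule for $dY = S\,dt$, and all parameter values are assumptions made for illustration.

```python
import math
import random

def asian_value_mc(h, S0, r, sigma, T, n_paths=2000, n_steps=50, seed=0):
    """Monte Carlo estimate of exp(-rT) * E[h(Y(T))], with dY = S dt and Y(0) = 0."""
    rng = random.Random(seed)
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        s, y = S0, 0.0
        for _ in range(n_steps):
            y += s * dt                                   # dY = S dt (left endpoint)
            z = rng.gauss(0.0, 1.0)
            # Exact step of geometric Brownian motion under the risk-neutral measure.
            s *= math.exp((r - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z)
        total += h(y)
    return math.exp(-r * T) * total / n_paths

def linear_payoff_value(S0, r, T):
    """Exact value for h(y) = y: exp(-rT) * S0 * (exp(rT) - 1) / r."""
    return S0 * (1.0 - math.exp(-r * T)) / r
```

For the linear payoff $h(y) = y$ the expectation is available in closed form, since $\mathbb{E}\,S(u) = S_0 e^{ru}$, so the estimate can be compared with `linear_payoff_value` up to sampling and discretization error.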
21.2
Constructing the hedge
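Start with the stock price $S(0)$. The differential of the value $X(t)$ of a portfolio $\Delta(t)$ is
\[
\begin{aligned}
dX &= \Delta\,dS + r(X - \Delta S)\,dt\\
&= \Delta S(r\,dt + \sigma\,dB) + rX\,dt - r\Delta S\,dt\\
&= \Delta\sigma S\,dB + rX\,dt.
\end{aligned}
\]
We want to have
\[
X(t) = v\left( t, S(t), \int_0^t S(u)\,du \right),
\]
so that
\[
\begin{aligned}
X(T) &= v\left( T, S(T), \int_0^T S(u)\,du \right)\\
&= h\left( \int_0^T S(u)\,du \right).
\end{aligned}
\]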
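The differential of the value of the option is
\[
\begin{aligned}
dv\left( t, S(t), \int_0^t S(u)\,du \right) &= v_t\,dt + v_x\,dS + v_y S\,dt + \tfrac{1}{2} v_{xx}\,dS\,dS\\
&= \left( v_t + rS v_x + S v_y + \tfrac{1}{2}\sigma^2 S^2 v_{xx} \right) dt + \sigma S v_x\,dB\\
&= rv\,dt + v_x\,\sigma S(t)\,dB(t). \qquad \text{(From Eq. 1.1)}
\end{aligned}
\]
Here $v$ and its derivatives are evaluated at $\left( t, S(t), \int_0^t S(u)\,du \right)$. Compare this with
\[
dX(t) = rX(t)\,dt + \Delta(t)\sigma S(t)\,dB(t).
\]
Take $\Delta(t) = v_x\left( t, S(t), \int_0^t S(u)\,du \right)$. If $X(0) = v(0, S(0), 0)$, then
\[
X(t) = v\left( t, S(t), \int_0^t S(u)\,du \right), \qquad 0 \le t \le T,
\]
because both these processes satisfy the same stochastic differential equation, starting from the same initial condition.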
21.3
Partial average payoff Asian option
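Now suppose the payoff is
\[
V = h\left( \int_\tau^T S(t)\,dt \right),
\]
where $0 < \tau < T$. We compute
\[
v(\tau, x, y) = \mathbb{E}^{\tau,x,y}\, e^{-r(T - \tau)} h(Y(T))
\]
just as before. For $0 \le t \le \tau$, we compute next the value of a derivative security which pays off
\[
v(\tau, S(\tau), 0)
\]
at time $\tau$. This value is
\[
w(t, x) = \mathbb{E}^{t,x}\, e^{-r(\tau - t)} v(\tau, S(\tau), 0).
\]
The function $w$ satisfies the Black-Scholes PDE
\[
-rw + w_t + rx\,w_x + \tfrac{1}{2}\sigma^2 x^2 w_{xx} = 0, \qquad 0 \le t \le \tau,\; x \ge 0,
\]
with terminal condition
\[
w(\tau, x) = v(\tau, x, 0), \qquad x \ge 0,
\]
and boundary condition
\[
w(t, 0) = e^{-r(T - t)} h(0), \qquad 0 \le t \le T.
\]
The hedge is given by
\[
\Delta(t) =
\begin{cases}
w_x(t, S(t)), & 0 \le t \le \tau,\\
v_x\left( t, S(t), \int_\tau^t S(u)\,du \right), & \tau < t \le T.
\end{cases}
\]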
Remark 21.1 While no closed-form for the Asian option price is known, the Laplace transform (in the variable $\frac{\sigma^2}{4}(T - t)$) has been computed. See H. Geman and M. Yor, Bessel processes, Asian options, and perpetuities, Math. Finance 3 (1993), 349–375.
Chapter 22
Summary of Arbitrage Pricing Theory
A simple European derivative security makes a random payment at a time fixed in advance. The
value at time t of such a security is the amount of wealth needed at time t in order to replicate the
security by trading in the market. The hedging portfolio is a specification of how to do this trading.
22.1
Binomial model, Hedging Portfolio
Let $\Omega$ be the set of all possible sequences of $n$ coin-tosses. We have no probabilities at this point. Let $r \ge 0$, $u > r + 1$, $d = 1/u$ be given. (See Fig. 2.1)
Evolution of the value of a portfolio:
\[
X_{k+1} = \Delta_k S_{k+1} + (1 + r)(X_k - \Delta_k S_k).
\]
Given a simple European derivative security $V(\omega_1, \omega_2)$, we want to start with a nonrandom $X_0$ and use a portfolio process
\[
\Delta_0, \quad \Delta_1(H), \quad \Delta_1(T)
\]
so that
\[
X_2(\omega_1, \omega_2) = V(\omega_1, \omega_2) \qquad \forall \omega_1, \omega_2. \qquad \text{(four equations)}
\]
There are four unknowns: $X_0$, $\Delta_0$, $\Delta_1(H)$, $\Delta_1(T)$. Solving the equations, we obtain:
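\[
\begin{aligned}
X_1(\omega_1) &= \frac{1}{1 + r} \Bigg[ \frac{1 + r - d}{u - d} \underbrace{X_2(\omega_1, H)}_{V(\omega_1, H)} + \frac{u - (1 + r)}{u - d} \underbrace{X_2(\omega_1, T)}_{V(\omega_1, T)} \Bigg],\\
X_0 &= \frac{1}{1 + r} \left[ \frac{1 + r - d}{u - d} X_1(H) + \frac{u - (1 + r)}{u - d} X_1(T) \right],\\
\Delta_1(\omega_1) &= \frac{X_2(\omega_1, H) - X_2(\omega_1, T)}{S_2(\omega_1, H) - S_2(\omega_1, T)},\\
\Delta_0 &= \frac{X_1(H) - X_1(T)}{S_1(H) - S_1(T)}.
\end{aligned}
\]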
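These four formulas can be turned into a short program and checked path by path. The sketch below assumes a payoff that depends only on the terminal stock price $S_2$, which is a restriction made for the example; the function names `two_period_hedge` and `replicate` are hypothetical.

```python
def two_period_hedge(S0, u, d, r, payoff):
    """Return X0, Delta0 and dicts X1, Delta1 keyed by the first toss ('H' or 'T')."""
    p = (1 + r - d) / (u - d)            # weight on the H branch
    q = (u - (1 + r)) / (u - d)          # weight on the T branch
    S1 = {"H": u * S0, "T": d * S0}
    X1, Delta1 = {}, {}
    for w1, s1 in S1.items():
        vH, vT = payoff(u * s1), payoff(d * s1)
        X1[w1] = (p * vH + q * vT) / (1 + r)
        Delta1[w1] = (vH - vT) / (u * s1 - d * s1)
    X0 = (p * X1["H"] + q * X1["T"]) / (1 + r)
    Delta0 = (X1["H"] - X1["T"]) / (S1["H"] - S1["T"])
    return X0, Delta0, X1, Delta1

def replicate(S0, u, d, r, payoff, path):
    """Run X_{k+1} = Delta_k S_{k+1} + (1+r)(X_k - Delta_k S_k) along a path such as 'HT'."""
    X0, Delta0, _, Delta1 = two_period_hedge(S0, u, d, r, payoff)
    factor = {"H": u, "T": d}
    S1 = factor[path[0]] * S0
    X1 = Delta0 * S1 + (1 + r) * (X0 - Delta0 * S0)
    S2 = factor[path[1]] * S1
    X2 = Delta1[path[0]] * S2 + (1 + r) * (X1 - Delta1[path[0]] * S1)
    return X2, S2
```

For example, with $S_0 = 4$, $u = 2$, $d = 1/2$, $r = 1/4$ and the call payoff $(S_2 - 5)^+$, the initial capital is $X_0 = 1.76$, and `replicate` ends with $X_2 = V$ on each of the four paths without any probabilities being specified.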
The probabilities of the stock price paths are irrelevant, because we have a hedge which works on
every path. From a practical point of view, what matters is that the paths in the model include all
the possibilities. We want to find a description of the paths in the model. They all have the property
\[
\left( \log S_{k+1} - \log S_k \right)^2 = \left( \log\frac{S_{k+1}}{S_k} \right)^2 = (\pm\log u)^2 = (\log u)^2.
\]
Let $\sigma = \log u > 0$. Then
\[
\sum_{k=0}^{n-1} \left( \log S_{k+1} - \log S_k \right)^2 = \sigma^2 n.
\]
The paths of $\log S_k$ accumulate quadratic variation at rate $\sigma^2$ per unit time.
If we change $u$, then we change $\sigma$, and the pricing and hedging formulas above will give different results.
We reiterate that the probabilities are only introduced as an aid to understanding and computation.
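Recall:
\[
X_{k+1} = \Delta_k S_{k+1} + (1 + r)(X_k - \Delta_k S_k).
\]
Define
\[
\beta_k = (1 + r)^k.
\]
Then
\[
\frac{X_{k+1}}{\beta_{k+1}} = \Delta_k \frac{S_{k+1}}{\beta_{k+1}} + \frac{X_k}{\beta_k} - \Delta_k \frac{S_k}{\beta_k},
\]
i.e.,
\[
\frac{X_{k+1}}{\beta_{k+1}} - \frac{X_k}{\beta_k} = \Delta_k \left( \frac{S_{k+1}}{\beta_{k+1}} - \frac{S_k}{\beta_k} \right).
\]
In continuous time, we will have the analogous equation
\[
d\left( \frac{X(t)}{\beta(t)} \right) = \Delta(t)\, d\left( \frac{S(t)}{\beta(t)} \right).
\]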
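If we introduce a probability measure $\widetilde{\mathbb{P}}$ under which $\frac{S_k}{\beta_k}$ is a martingale, then $\frac{X_k}{\beta_k}$ will also be a martingale, regardless of the portfolio used. Indeed,
\[
\begin{aligned}
\widetilde{\mathbb{E}}\left[ \frac{X_{k+1}}{\beta_{k+1}} \,\Big|\, \mathcal{F}_k \right] &= \widetilde{\mathbb{E}}\left[ \frac{X_k}{\beta_k} + \Delta_k \left( \frac{S_{k+1}}{\beta_{k+1}} - \frac{S_k}{\beta_k} \right) \Big|\, \mathcal{F}_k \right]\\
&= \frac{X_k}{\beta_k} + \Delta_k \underbrace{\widetilde{\mathbb{E}}\left[ \frac{S_{k+1}}{\beta_{k+1}} - \frac{S_k}{\beta_k} \,\Big|\, \mathcal{F}_k \right]}_{= 0}.
\end{aligned}
\]
Suppose we want to have $X_2 = V$, where $V$ is some $\mathcal{F}_2$-measurable random variable. Then we must have
\[
\begin{aligned}
\frac{1}{1 + r} X_1 = \frac{X_1}{\beta_1} &= \widetilde{\mathbb{E}}\left[ \frac{X_2}{\beta_2} \,\Big|\, \mathcal{F}_1 \right] = \widetilde{\mathbb{E}}\left[ \frac{V}{\beta_2} \,\Big|\, \mathcal{F}_1 \right],\\
X_0 = \frac{X_0}{\beta_0} &= \widetilde{\mathbb{E}}\left[ \frac{X_1}{\beta_1} \right] = \widetilde{\mathbb{E}}\left[ \frac{V}{\beta_2} \right].
\end{aligned}
\]
To find the risk-neutral probability measure $\widetilde{\mathbb{P}}$ under which $\frac{S_k}{\beta_k}$ is a martingale, we denote $\tilde{p} = \widetilde{\mathbb{P}}\{\omega_k = H\}$, $\tilde{q} = \widetilde{\mathbb{P}}\{\omega_k = T\}$, and compute
\[
\begin{aligned}
\widetilde{\mathbb{E}}\left[ \frac{S_{k+1}}{\beta_{k+1}} \,\Big|\, \mathcal{F}_k \right] &= \tilde{p}\,u \frac{S_k}{\beta_{k+1}} + \tilde{q}\,d \frac{S_k}{\beta_{k+1}}\\
&= \frac{1}{1 + r}\left[ \tilde{p}\,u + \tilde{q}\,d \right] \frac{S_k}{\beta_k}.
\end{aligned}
\]
We need to choose $\tilde{p}$ and $\tilde{q}$ so that
\[
\tilde{p}\,u + \tilde{q}\,d = 1 + r, \qquad \tilde{p} + \tilde{q} = 1.
\]
The solution of these equations is
\[
\tilde{p} = \frac{1 + r - d}{u - d}, \qquad \tilde{q} = \frac{u - (1 + r)}{u - d}.
\]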
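The martingale property of the discounted stock price under $\tilde{p}, \tilde{q}$ can be checked over $n$ periods by summing over the number of heads. A minimal sketch follows, with hypothetical function names and independent tosses with head probability $\tilde{p}$ assumed:

```python
import math

def risk_neutral_probs(u, d, r):
    """Solve p*u + q*d = 1 + r, p + q = 1."""
    p = (1 + r - d) / (u - d)
    q = (u - (1 + r)) / (u - d)
    return p, q

def discounted_mean(S0, u, d, r, n):
    """E~[S_n / beta_n] with beta_n = (1+r)^n, tosses independent with P~(H) = p."""
    p, q = risk_neutral_probs(u, d, r)
    total = 0.0
    for heads in range(n + 1):
        prob = math.comb(n, heads) * p**heads * q ** (n - heads)
        total += prob * S0 * u**heads * d ** (n - heads)
    return total / (1 + r) ** n
```

For any $n$, `discounted_mean` should return $S_0$ up to rounding, since each period multiplies the discounted price by a factor with $\widetilde{\mathbb{P}}$-mean $(\tilde{p}u + \tilde{q}d)/(1 + r) = 1$.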
22.2
Setting up the continuous model
Now the stock price $S(t)$, $0 \le t \le T$, is a continuous function of $t$. We would like to hedge along every possible path of $S(t)$, but that is impossible. Using the binomial model as a guide, we choose $\sigma > 0$ and try to hedge along every path $S(t)$ for which the quadratic variation of $\log S(t)$ accumulates at rate $\sigma^2$ per unit time. These are the paths with volatility $\sigma^2$.
To generate these paths, we use Brownian motion, rather than coin-tossing. To introduce Brownian motion, we need a probability measure. However, the only thing about this probability measure which ultimately matters is the set of paths to which it assigns probability zero.
Let $B(t)$, $0 \le t \le T$, be a Brownian motion defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. For any $\mu \in \mathbb{R}$, the paths of
\[
\mu t + \sigma B(t)
\]
accumulate quadratic variation at rate $\sigma^2$ per unit time. We want to define
\[
S(t) = S(0) \exp\{ \mu t + \sigma B(t) \},
\]
so that the paths of
\[
\log S(t) = \log S(0) + \mu t + \sigma B(t)
\]
accumulate quadratic variation at rate $\sigma^2$ per unit time. Surprisingly, the choice of $\mu$ in this definition is irrelevant. Roughly, the reason for this is the following: Choose $\omega_1 \in \Omega$. Then, for $\mu_1 \in \mathbb{R}$,
\[
\mu_1 t + \sigma B(t, \omega_1), \qquad 0 \le t \le T,
\]
is a continuous function of $t$. If we replace $\mu_1$ by $\mu_2$, then $\mu_2 t + \sigma B(t, \omega_1)$ is a different function. However, there is an $\omega_2 \in \Omega$ such that
\[
\mu_1 t + \sigma B(t, \omega_1) = \mu_2 t + \sigma B(t, \omega_2), \qquad 0 \le t \le T.
\]
In other words, regardless of whether we use $\mu_1$ or $\mu_2$ in the definition of $S(t)$, we will see the same paths. The mathematically precise statement is the following:
If a set of stock price paths has a positive probability when $S(t)$ is defined by
\[
S(t) = S(0) \exp\{ \mu_1 t + \sigma B(t) \},
\]
then this set of paths has positive probability when $S(t)$ is defined by
\[
S(t) = S(0) \exp\{ \mu_2 t + \sigma B(t) \}.
\]
Since we are interested in hedging along every path, except possibly for a set of paths which has probability zero, the choice of $\mu$ is irrelevant.
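The most convenient choice of $\mu$ is
$$\mu = r - \tfrac{1}{2}\sigma^2,$$
so
$$S(t) = S(0)\exp\{rt + \sigma B(t) - \tfrac{1}{2}\sigma^2 t\},$$
and
$$e^{-rt}S(t) = S(0)\exp\{\sigma B(t) - \tfrac{1}{2}\sigma^2 t\}$$
is a martingale under $\mathbb{P}$. With this choice of $\mu$,
$$dS(t) = rS(t)\,dt + \sigma S(t)\,dB(t)$$
and $\mathbb{P}$ is the risk-neutral measure. If a different choice of $\mu$ is made, we have
$$S(t) = S(0)\exp\{\mu t + \sigma B(t)\},$$
$$dS(t) = \left(\mu + \tfrac{1}{2}\sigma^2\right)S(t)\,dt + \sigma S(t)\,dB(t) = rS(t)\,dt + \sigma S(t)\underbrace{\left[\frac{\mu + \tfrac{1}{2}\sigma^2 - r}{\sigma}\,dt + dB(t)\right]}_{d\widetilde{B}(t)}.$$
$\widetilde{B}$ has the same paths as $B$. We can change to the risk-neutral measure $\widetilde{\mathbb{P}}$, under which $\widetilde{B}$ is a Brownian motion, and then proceed as if $\mu$ had been chosen to be equal to $r - \tfrac{1}{2}\sigma^2$.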
22.3
Risk-neutral pricing and hedging
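Let $\widetilde{\mathbb{P}}$ denote the risk-neutral measure. Then
$$dS(t) = rS(t)\,dt + \sigma S(t)\,d\widetilde{B}(t),$$
where $\widetilde{B}$ is a Brownian motion under $\widetilde{\mathbb{P}}$. Set
$$\beta(t) = e^{rt}.$$
Then
$$d\left(\frac{S(t)}{\beta(t)}\right) = \sigma\frac{S(t)}{\beta(t)}\,d\widetilde{B}(t),$$
so $\frac{S(t)}{\beta(t)}$ is a martingale under $\widetilde{\mathbb{P}}$.
Evolution of the value of a portfolio:
$$dX(t) = \Delta(t)\,dS(t) + r\left(X(t) - \Delta(t)S(t)\right)dt, \tag{3.1}$$
which is equivalent to
$$d\left(\frac{X(t)}{\beta(t)}\right) = \Delta(t)\,d\left(\frac{S(t)}{\beta(t)}\right) = \Delta(t)\,\sigma\frac{S(t)}{\beta(t)}\,d\widetilde{B}(t). \tag{3.2}$$
Regardless of the portfolio used, $\frac{X(t)}{\beta(t)}$ is a martingale under $\widetilde{\mathbb{P}}$.
Now suppose $V$ is a given $\mathcal{F}(T)$-measurable random variable, the payoff of a simple European derivative security. We want to find the portfolio process $\Delta(t)$, $0 \le t \le T$, and initial portfolio value $X(0)$ so that $X(T) = V$. Because $\frac{X(t)}{\beta(t)}$ must be a martingale, we must have
$$\frac{X(t)}{\beta(t)} = \widetilde{\mathbb{E}}\left[\frac{V}{\beta(T)} \,\middle|\, \mathcal{F}(t)\right], \quad 0 \le t \le T. \tag{3.3}$$
This is the risk-neutral pricing formula. We have the following sequence: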
1. $V$ is given.
2. Define $X(t)$, $0 \le t \le T$, by (3.3) (not by (3.1) or (3.2), because we do not yet have $\Delta(t)$).
3. Construct $\Delta(t)$ so that (3.2) (or equivalently, (3.1)) is satisfied by the $X(t)$, $0 \le t \le T$, defined in step 2.
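To carry out step 3, we first use the tower property to show that $\frac{X(t)}{\beta(t)}$ defined by (3.3) is a martingale under $\widetilde{\mathbb{P}}$. We next use the corollary to the Martingale Representation Theorem (Homework Problem 4.5) to show that
$$d\left(\frac{X(t)}{\beta(t)}\right) = \gamma(t)\,d\widetilde{B}(t) \tag{3.4}$$
for some process $\gamma$. Comparing (3.4), which we know, and (3.2), which we want, we decide to define
$$\Delta(t) = \frac{\beta(t)\gamma(t)}{\sigma S(t)}. \tag{3.5}$$
Then (3.4) implies (3.2), which implies (3.1), which implies that $X(t)$, $0 \le t \le T$, is the value of the portfolio process $\Delta(t)$, $0 \le t \le T$.
From (3.3), the definition of $X$, we see that the hedging portfolio must begin with value
$$X(0) = \widetilde{\mathbb{E}}\left[\frac{V}{\beta(T)}\right],$$
and it will end with value
$$X(T) = \beta(T)\,\widetilde{\mathbb{E}}\left[\frac{V}{\beta(T)} \,\middle|\, \mathcal{F}(T)\right] = \beta(T)\frac{V}{\beta(T)} = V.$$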
Remark 22.1 Although we have taken $r$ and $\sigma$ to be constant, the risk-neutral pricing formula is still "valid" when $r$ and $\sigma$ are processes adapted to the filtration generated by $B$. If they depend on either $\widetilde{B}$ or on $S$, they are adapted to the filtration generated by $B$. The "validity" of the risk-neutral pricing formula means:
1. If you start with
$$X(0) = \widetilde{\mathbb{E}}\left[\frac{V}{\beta(T)}\right],$$
then there is a hedging portfolio $\Delta(t)$, $0 \le t \le T$, such that $X(T) = V$;
2. At each time $t$, the value $X(t)$ of the hedging portfolio in 1 satisfies
$$\frac{X(t)}{\beta(t)} = \widetilde{\mathbb{E}}\left[\frac{V}{\beta(T)} \,\middle|\, \mathcal{F}(t)\right].$$
Remark 22.2 In general, when there are multiple assets and/or multiple Brownian motions, the risk-neutral pricing formula is valid provided there is a unique risk-neutral measure. A probability measure is said to be risk-neutral provided
1. it has the same probability-zero sets as the original measure;
2. it makes all the discounted asset prices be martingales.
To see if the risk-neutral measure is unique, compute the differential of all discounted asset prices and check if there is more than one way to define $\widetilde{B}$ so that all these differentials have only $d\widetilde{B}$ terms.
22.4
Implementation of risk-neutral pricing and hedging
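To get a computable result from the general risk-neutral pricing formula
$$\frac{X(t)}{\beta(t)} = \widetilde{\mathbb{E}}\left[\frac{V}{\beta(T)} \,\middle|\, \mathcal{F}(t)\right],$$
one uses the Markov property. We need to identify some state variables, the stock price and possibly other variables, so that
$$X(t) = \beta(t)\,\widetilde{\mathbb{E}}\left[\frac{V}{\beta(T)} \,\middle|\, \mathcal{F}(t)\right]$$
is a function of these variables.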
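Example 22.1 Assume $r$ and $\sigma$ are constant, and $V = h(S(T))$. We can take the stock price to be the state variable. Define
$$v(t,x) = \widetilde{\mathbb{E}}^{t,x}\left[e^{-r(T-t)}h(S(T))\right].$$
Then
$$X(t) = e^{rt}\,\widetilde{\mathbb{E}}\left[e^{-rT}h(S(T)) \,\middle|\, \mathcal{F}(t)\right] = v(t, S(t)),$$
and $\frac{X(t)}{\beta(t)} = e^{-rt}v(t,S(t))$ is a martingale under $\widetilde{\mathbb{P}}$.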
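The function v(t, x) of Example 22.1 can be evaluated numerically when r and sigma are constant, because S(T) is then lognormal under the risk-neutral measure. The following sketch is an illustration under stated assumptions: the function names, the trapezoidal rule, the truncation of the normal variable to [-8, 8] and the sample parameters are all choices made here, not part of the notes. For a call payoff the result is compared with the Black-Scholes closed form.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def v_quadrature(t, x, T, r, sigma, h, n=4000, z_max=8.0):
    """Approximate e^{-r(T-t)} E[h(S(T)) | S(t) = x] under the risk-neutral
    measure with the trapezoidal rule over a standard normal variable z."""
    tau = T - t
    dz = 2.0 * z_max / n
    total = 0.0
    for i in range(n + 1):
        z = -z_max + i * dz
        s_T = x * math.exp((r - 0.5 * sigma**2) * tau + sigma * math.sqrt(tau) * z)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * h(s_T) * math.exp(-0.5 * z * z)
    return math.exp(-r * tau) * total * dz / math.sqrt(2.0 * math.pi)


def black_scholes_call(t, x, T, r, sigma, K):
    """Closed-form value of a European call, used only as a cross-check."""
    tau = T - t
    d_plus = (math.log(x / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d_minus = d_plus - sigma * math.sqrt(tau)
    return x * norm_cdf(d_plus) - K * math.exp(-r * tau) * norm_cdf(d_minus)


# Assumed sample data: at-the-money call, one year to expiry.
K = 100.0
numeric = v_quadrature(0.0, 100.0, 1.0, 0.05, 0.2, lambda s: max(s - K, 0.0))
closed = black_scholes_call(0.0, 100.0, 1.0, 0.05, 0.2, K)
```

The same `v_quadrature` routine accepts any payoff function h of the terminal stock price, which is the point of treating the stock price as the only state variable.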
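Example 22.2 Assume $r$ and $\sigma$ are constant, and
$$V = h\left(\int_0^T S(u)\,du\right).$$
Take $S(t)$ and $Y(t) = \int_0^t S(u)\,du$ to be the state variables. Define
$$v(t,x,y) = \widetilde{\mathbb{E}}^{t,x,y}\left[e^{-r(T-t)}h(Y(T))\right],$$
where
$$Y(T) = y + \int_t^T S(u)\,du.$$
Then
$$X(t) = e^{rt}\,\widetilde{\mathbb{E}}\left[e^{-rT}h(Y(T)) \,\middle|\, \mathcal{F}(t)\right] = v(t, S(t), Y(t))$$
and
$$\frac{X(t)}{\beta(t)} = e^{-rt}v(t,S(t),Y(t))$$
is a martingale under $\widetilde{\mathbb{P}}$.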
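Example 22.3 (Homework problem 4.2)
$$dS(t) = r(t,Y(t))\,S(t)\,dt + \sigma(t,Y(t))\,S(t)\,d\widetilde{B}(t),$$
$$dY(t) = \alpha(t,Y(t))\,dt + \gamma(t,Y(t))\,d\widetilde{B}(t),$$
$$V = h(S(T)).$$
Take $S(t)$ and $Y(t)$ to be the state variables. Define
$$v(t,x,y) = \widetilde{\mathbb{E}}^{t,x,y}\Bigg[\underbrace{\exp\left\{-\int_t^T r(u,Y(u))\,du\right\}}_{\beta(t)/\beta(T)} h(S(T))\Bigg].$$
Then
$$X(t) = \beta(t)\,\widetilde{\mathbb{E}}\left[\frac{h(S(T))}{\beta(T)} \,\middle|\, \mathcal{F}(t)\right] = \widetilde{\mathbb{E}}\left[\exp\left\{-\int_t^T r(u,Y(u))\,du\right\}h(S(T)) \,\middle|\, \mathcal{F}(t)\right] = v(t,S(t),Y(t)),$$
and
$$\frac{X(t)}{\beta(t)} = \exp\left\{-\int_0^t r(u,Y(u))\,du\right\}v(t,S(t),Y(t))$$
is a martingale under $\widetilde{\mathbb{P}}$.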
In every case, we obtain an expression involving $v$ that is a martingale. We take the differential and set the $dt$ term to zero. This gives us a partial differential equation for $v$, and this equation must hold wherever the state processes can be. The $d\widetilde{B}$ term in the differential is the differential of a martingale, and since the martingale is
$$\frac{X(t)}{\beta(t)} = X(0) + \int_0^t \Delta(u)\,\sigma\frac{S(u)}{\beta(u)}\,d\widetilde{B}(u),$$
we can solve for $\Delta(t)$. This is the argument which uses (3.4) to obtain (3.5).
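Example 22.4 (Continuation of Example 22.3)
$$\frac{X(t)}{\beta(t)} = \underbrace{\exp\left\{-\int_0^t r(u,Y(u))\,du\right\}}_{1/\beta(t)} v(t,S(t),Y(t))$$
is a martingale under $\widetilde{\mathbb{P}}$. We have
$$\begin{aligned}
d\left(\frac{X(t)}{\beta(t)}\right) &= \frac{1}{\beta(t)}\Big[-r(t,Y(t))\,v(t,S(t),Y(t))\,dt + v_t\,dt + v_x\,dS + v_y\,dY \\
&\qquad\quad + \tfrac{1}{2}v_{xx}\,dS\,dS + v_{xy}\,dS\,dY + \tfrac{1}{2}v_{yy}\,dY\,dY\Big] \\
&= \frac{1}{\beta(t)}\Big[\left(-rv + v_t + rSv_x + \alpha v_y + \tfrac{1}{2}\sigma^2S^2v_{xx} + \sigma\gamma S v_{xy} + \tfrac{1}{2}\gamma^2 v_{yy}\right)dt \\
&\qquad\quad + \left(\sigma S v_x + \gamma v_y\right)d\widetilde{B}\Big].
\end{aligned}$$
The partial differential equation satisfied by $v$ is
$$-rv + v_t + rxv_x + \alpha v_y + \tfrac{1}{2}\sigma^2x^2v_{xx} + \sigma\gamma x v_{xy} + \tfrac{1}{2}\gamma^2 v_{yy} = 0,$$
where it should be noted that $v = v(t,x,y)$, and all other variables are functions of $(t,y)$. We have
$$d\left(\frac{X(t)}{\beta(t)}\right) = \frac{1}{\beta(t)}\left[\sigma S v_x + \gamma v_y\right]d\widetilde{B}(t),$$
where $\sigma = \sigma(t,Y(t))$, $\gamma = \gamma(t,Y(t))$, $v = v(t,S(t),Y(t))$, and $S = S(t)$. We want to choose $\Delta(t)$ so that (see (3.2))
$$d\left(\frac{X(t)}{\beta(t)}\right) = \Delta(t)\,\sigma(t,Y(t))\frac{S(t)}{\beta(t)}\,d\widetilde{B}(t).$$
Therefore, we should take $\Delta(t)$ to be
$$\Delta(t) = v_x(t,S(t),Y(t)) + \frac{\gamma(t,Y(t))}{\sigma(t,Y(t))\,S(t)}\,v_y(t,S(t),Y(t)).$$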
Chapter 23
Recognizing a Brownian Motion
Theorem 0.62 (Lévy) Let $B(t)$, $0 \le t \le T$, be a process on $(\Omega, \mathcal{F}, \mathbb{P})$, adapted to a filtration $\mathcal{F}(t)$, $0 \le t \le T$, such that:
1. the paths of $B(t)$ are continuous,
2. $B$ is a martingale,
3. $\langle B \rangle(t) = t$, $0 \le t \le T$ (i.e., informally $dB(t)\,dB(t) = dt$).
Then $B$ is a Brownian motion.
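Proof: (Idea) Let $0 \le s \le t \le T$ be given. We need to show that $B(t) - B(s)$ is normal, with mean zero and variance $t - s$, and $B(t) - B(s)$ is independent of $\mathcal{F}(s)$. We shall show that the conditional moment generating function of $B(t) - B(s)$ is
$$\mathbb{E}\left[e^{u(B(t)-B(s))} \,\middle|\, \mathcal{F}(s)\right] = e^{\frac{1}{2}u^2(t-s)}.$$
Since the moment generating function characterizes the distribution, this shows that $B(t) - B(s)$ is normal with mean 0 and variance $t - s$, and conditioning on $\mathcal{F}(s)$ does not affect this, i.e., $B(t) - B(s)$ is independent of $\mathcal{F}(s)$.
We compute (this uses the continuity condition (1) of the theorem)
$$de^{uB(t)} = ue^{uB(t)}\,dB(t) + \tfrac{1}{2}u^2e^{uB(t)}\,dB(t)\,dB(t),$$
so
$$e^{uB(t)} = e^{uB(s)} + \int_s^t ue^{uB(v)}\,dB(v) + \underbrace{\tfrac{1}{2}u^2\int_s^t e^{uB(v)}\,dv}_{\text{uses cond. 3}}.$$
Now $\int_0^t ue^{uB(v)}\,dB(v)$ is a martingale (by condition 2), and so
$$\mathbb{E}\left[\int_s^t ue^{uB(v)}\,dB(v) \,\middle|\, \mathcal{F}(s)\right] = -\int_0^s ue^{uB(v)}\,dB(v) + \mathbb{E}\left[\int_0^t ue^{uB(v)}\,dB(v) \,\middle|\, \mathcal{F}(s)\right] = 0.$$
It follows that
$$\mathbb{E}\left[e^{uB(t)} \,\middle|\, \mathcal{F}(s)\right] = e^{uB(s)} + \tfrac{1}{2}u^2\int_s^t \mathbb{E}\left[e^{uB(v)} \,\middle|\, \mathcal{F}(s)\right]dv.$$
We define
$$\varphi(v) = \mathbb{E}\left[e^{uB(v)} \,\middle|\, \mathcal{F}(s)\right],$$
so that
$$\varphi(s) = e^{uB(s)}$$
and
$$\varphi(t) = e^{uB(s)} + \tfrac{1}{2}u^2\int_s^t \varphi(v)\,dv, \qquad \varphi'(t) = \tfrac{1}{2}u^2\varphi(t), \qquad \varphi(t) = ke^{\frac{1}{2}u^2t}.$$
Plugging in $s$, we get
$$e^{uB(s)} = ke^{\frac{1}{2}u^2s} \implies k = e^{uB(s) - \frac{1}{2}u^2s}.$$
Therefore,
$$\mathbb{E}\left[e^{uB(t)} \,\middle|\, \mathcal{F}(s)\right] = \varphi(t) = e^{uB(s) + \frac{1}{2}u^2(t-s)},$$
$$\mathbb{E}\left[e^{u(B(t)-B(s))} \,\middle|\, \mathcal{F}(s)\right] = e^{\frac{1}{2}u^2(t-s)}.$$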
23.1
Identifying volatility and correlation
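Let $B_1$ and $B_2$ be independent Brownian motions and
$$\frac{dS_1}{S_1} = r\,dt + \sigma_{11}\,dB_1 + \sigma_{12}\,dB_2,$$
$$\frac{dS_2}{S_2} = r\,dt + \sigma_{21}\,dB_1 + \sigma_{22}\,dB_2.$$
Define
$$\sigma_1 = \sqrt{\sigma_{11}^2 + \sigma_{12}^2}, \qquad \sigma_2 = \sqrt{\sigma_{21}^2 + \sigma_{22}^2}, \qquad \rho = \frac{\sigma_{11}\sigma_{21} + \sigma_{12}\sigma_{22}}{\sigma_1\sigma_2}.$$
Define processes $W_1$ and $W_2$ by
$$dW_1 = \frac{\sigma_{11}\,dB_1 + \sigma_{12}\,dB_2}{\sigma_1}, \qquad dW_2 = \frac{\sigma_{21}\,dB_1 + \sigma_{22}\,dB_2}{\sigma_2}.$$
Then $W_1$ and $W_2$ have continuous paths, are martingales, and
$$dW_1\,dW_1 = \frac{1}{\sigma_1^2}\left(\sigma_{11}\,dB_1 + \sigma_{12}\,dB_2\right)^2 = \frac{1}{\sigma_1^2}\left(\sigma_{11}^2\,dB_1\,dB_1 + \sigma_{12}^2\,dB_2\,dB_2\right) = dt,$$
and similarly
$$dW_2\,dW_2 = dt.$$
Therefore, $W_1$ and $W_2$ are Brownian motions. The stock prices have the representation
$$\frac{dS_1}{S_1} = r\,dt + \sigma_1\,dW_1, \qquad \frac{dS_2}{S_2} = r\,dt + \sigma_2\,dW_2.$$
The Brownian motions $W_1$ and $W_2$ are correlated. Indeed,
$$dW_1\,dW_2 = \frac{1}{\sigma_1\sigma_2}\left(\sigma_{11}\,dB_1 + \sigma_{12}\,dB_2\right)\left(\sigma_{21}\,dB_1 + \sigma_{22}\,dB_2\right) = \frac{1}{\sigma_1\sigma_2}\left(\sigma_{11}\sigma_{21} + \sigma_{12}\sigma_{22}\right)dt = \rho\,dt.$$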
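The definitions of the two volatilities and the correlation translate directly into a few lines of arithmetic. The sketch below is illustrative only; the function name `vol_and_corr` and the sample loadings are assumptions.

```python
import math


def vol_and_corr(s11, s12, s21, s22):
    """Return (sigma1, sigma2, rho) from the loadings of two stocks on two
    independent Brownian motions."""
    sigma1 = math.hypot(s11, s12)   # sqrt(s11^2 + s12^2)
    sigma2 = math.hypot(s21, s22)   # sqrt(s21^2 + s22^2)
    rho = (s11 * s21 + s12 * s22) / (sigma1 * sigma2)
    return sigma1, sigma2, rho


# Assumed sample loadings.
sigma1, sigma2, rho = vol_and_corr(0.3, 0.4, 0.5, 0.0)
```

By the Cauchy-Schwarz inequality the computed correlation always lies in [-1, 1], whatever loadings are supplied (as long as neither row is zero).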
23.2
Reversing the process
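Suppose we are given that
$$\frac{dS_1}{S_1} = r\,dt + \sigma_1\,dW_1, \qquad \frac{dS_2}{S_2} = r\,dt + \sigma_2\,dW_2,$$
where $W_1$ and $W_2$ are Brownian motions with correlation coefficient $\rho$. We want to find
$$\Sigma = \begin{bmatrix}\sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22}\end{bmatrix}$$
so that
$$\Sigma\Sigma' = \begin{bmatrix}\sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22}\end{bmatrix}\begin{bmatrix}\sigma_{11} & \sigma_{21} \\ \sigma_{12} & \sigma_{22}\end{bmatrix} = \begin{bmatrix}\sigma_{11}^2 + \sigma_{12}^2 & \sigma_{11}\sigma_{21} + \sigma_{12}\sigma_{22} \\ \sigma_{11}\sigma_{21} + \sigma_{12}\sigma_{22} & \sigma_{21}^2 + \sigma_{22}^2\end{bmatrix} = \begin{bmatrix}\sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2\end{bmatrix}.$$
A simple (but not unique) solution is (see Chapter 19)
$$\sigma_{11} = \sigma_1, \quad \sigma_{12} = 0, \quad \sigma_{21} = \rho\sigma_2, \quad \sigma_{22} = \sqrt{1-\rho^2}\,\sigma_2.$$
This corresponds to
$$\sigma_1\,dW_1 = \sigma_1\,dB_1 \implies dB_1 = dW_1,$$
$$\sigma_2\,dW_2 = \rho\sigma_2\,dB_1 + \sqrt{1-\rho^2}\,\sigma_2\,dB_2 \implies dB_2 = \frac{dW_2 - \rho\,dW_1}{\sqrt{1-\rho^2}}, \quad (\rho \ne \pm 1).$$
If $\rho = \pm 1$, then there is no $B_2$ and $dW_2 = \rho\,dB_1 = \rho\,dW_1$.
Continuing in the case $\rho \ne \pm 1$, we have
$$dB_1\,dB_1 = dW_1\,dW_1 = dt,$$
$$dB_2\,dB_2 = \frac{1}{1-\rho^2}\left(dW_2\,dW_2 - 2\rho\,dW_1\,dW_2 + \rho^2\,dW_1\,dW_1\right) = \frac{1}{1-\rho^2}\left(dt - 2\rho^2\,dt + \rho^2\,dt\right) = dt,$$
so both $B_1$ and $B_2$ are Brownian motions. Furthermore,
$$dB_1\,dB_2 = \frac{1}{\sqrt{1-\rho^2}}\left(dW_1\,dW_2 - \rho\,dW_1\,dW_1\right) = \frac{1}{\sqrt{1-\rho^2}}\left(\rho\,dt - \rho\,dt\right) = 0.$$
We can now apply an extension of Lévy's Theorem, which says that Brownian motions with zero cross-variation are independent, to conclude that $B_1$, $B_2$ are independent Brownian motions.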
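The lower-triangular choice of loadings can be verified by multiplying the matrix by its transpose and comparing with the target covariance structure. This is a minimal sketch; the helper names and the sample values of the volatilities and correlation are assumptions made for illustration.

```python
import math


def loading_matrix(sigma1, sigma2, rho):
    """Lower-triangular loadings: sigma11 = sigma1, sigma12 = 0,
    sigma21 = rho*sigma2, sigma22 = sqrt(1 - rho^2)*sigma2."""
    if not -1.0 < rho < 1.0:
        raise ValueError("this sketch assumes -1 < rho < 1")
    return [[sigma1, 0.0],
            [rho * sigma2, math.sqrt(1.0 - rho**2) * sigma2]]


def times_transpose(m):
    """Return the 2x2 product of m with its own transpose."""
    return [[sum(m[i][k] * m[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Assumed sample values.
sigma1, sigma2, rho = 0.2, 0.3, 0.5
cov = times_transpose(loading_matrix(sigma1, sigma2, rho))
```

The diagonal of `cov` recovers the two squared volatilities and the off-diagonal entry recovers rho times the product of the volatilities, which is the matrix identity required in the text.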
Chapter 24
An outside barrier option
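Barrier process:
$$\frac{dY(t)}{Y(t)} = \lambda\,dt + \sigma_1\,dB_1(t).$$
Stock process:
$$\frac{dS(t)}{S(t)} = \mu\,dt + \rho\sigma_2\,dB_1(t) + \sqrt{1-\rho^2}\,\sigma_2\,dB_2(t),$$
where $\sigma_1 > 0$, $\sigma_2 > 0$, $-1 < \rho < 1$, and $B_1$ and $B_2$ are independent Brownian motions on some $(\Omega, \mathcal{F}, \mathbb{P})$. The option pays off:
$$(S(T) - K)^+\,\mathbf{1}_{\{Y^*(T) < L\}}$$
at time $T$, where
$$0 < S(0) < K, \qquad 0 < Y(0) < L, \qquad Y^*(T) = \max_{0 \le t \le T} Y(t).$$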
Remark 24.1 The option payoff depends on both the Y and S processes. In order to hedge it, we
will need the money market and two other assets, which we take to be Y and S. The risk-neutral
measure must make the discounted value of every traded asset be a martingale, which in this case
means the discounted Y and S processes.
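We want to find $\theta_1$ and $\theta_2$ and define
$$d\widetilde{B}_1 = \theta_1\,dt + dB_1, \qquad d\widetilde{B}_2 = \theta_2\,dt + dB_2,$$
so that
$$\frac{dY}{Y} = r\,dt + \sigma_1\,d\widetilde{B}_1 = r\,dt + \sigma_1\theta_1\,dt + \sigma_1\,dB_1,$$
$$\begin{aligned}
\frac{dS}{S} &= r\,dt + \rho\sigma_2\,d\widetilde{B}_1 + \sqrt{1-\rho^2}\,\sigma_2\,d\widetilde{B}_2 \\
&= r\,dt + \rho\sigma_2\theta_1\,dt + \sqrt{1-\rho^2}\,\sigma_2\theta_2\,dt + \rho\sigma_2\,dB_1 + \sqrt{1-\rho^2}\,\sigma_2\,dB_2.
\end{aligned}$$
We must have
$$\lambda = r + \sigma_1\theta_1, \tag{0.1}$$
$$\mu = r + \rho\sigma_2\theta_1 + \sqrt{1-\rho^2}\,\sigma_2\theta_2. \tag{0.2}$$
We solve to get
$$\theta_1 = \frac{\lambda - r}{\sigma_1}, \qquad \theta_2 = \frac{\mu - r - \rho\sigma_2\theta_1}{\sqrt{1-\rho^2}\,\sigma_2}.$$
We shall see that the formulas for $\theta_1$ and $\theta_2$ do not matter. What matters is that (0.1) and (0.2) uniquely determine $\theta_1$ and $\theta_2$. This implies the existence and uniqueness of the risk-neutral measure.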
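Because (0.1) and (0.2) form a triangular linear system, the two market prices of risk can be computed one after the other. The sketch below is illustrative: the function name and the numerical values for the drifts, rate, volatilities and correlation are assumptions. It solves for the two quantities and substitutes them back into both equations.

```python
import math


def market_prices_of_risk(lam, mu, r, sigma1, sigma2, rho):
    """Solve (0.1) for theta1, then (0.2) for theta2; requires -1 < rho < 1."""
    theta1 = (lam - r) / sigma1
    theta2 = (mu - r - rho * sigma2 * theta1) / (math.sqrt(1.0 - rho**2) * sigma2)
    return theta1, theta2


# Assumed sample parameters: barrier drift, stock drift, rate, vols, correlation.
lam, mu, r, sigma1, sigma2, rho = 0.08, 0.10, 0.05, 0.2, 0.3, 0.5
theta1, theta2 = market_prices_of_risk(lam, mu, r, sigma1, sigma2, rho)

# Substitute back into (0.1) and (0.2).
eq_01 = r + sigma1 * theta1
eq_02 = r + rho * sigma2 * theta1 + math.sqrt(1.0 - rho**2) * sigma2 * theta2
```

The division by the square root of one minus rho squared is where the assumption that the correlation is strictly between -1 and 1 is used; at the endpoints the system no longer has a unique solution.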
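We define
$$Z(T) = \exp\left\{-\theta_1B_1(T) - \theta_2B_2(T) - \tfrac{1}{2}\left(\theta_1^2 + \theta_2^2\right)T\right\},$$
$$\widetilde{\mathbb{P}}(A) = \int_A Z(T)\,d\mathbb{P}, \quad \forall A \in \mathcal{F}.$$
Under $\widetilde{\mathbb{P}}$, $\widetilde{B}_1$ and $\widetilde{B}_2$ are independent Brownian motions (Girsanov's Theorem). $\widetilde{\mathbb{P}}$ is the unique risk-neutral measure.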
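Remark 24.2 Under both $\mathbb{P}$ and $\widetilde{\mathbb{P}}$, $Y$ has volatility $\sigma_1$, $S$ has volatility $\sigma_2$ and
$$\frac{dY\,dS}{YS} = \rho\sigma_1\sigma_2\,dt,$$
i.e., the correlation between $\frac{dY}{Y}$ and $\frac{dS}{S}$ is $\rho$.
The value of the option at time zero is
$$v(0,S(0),Y(0)) = \widetilde{\mathbb{E}}\left[e^{-rT}(S(T)-K)^+\,\mathbf{1}_{\{Y^*(T) < L\}}\right].$$
We need to work out a density which permits us to compute the right-hand side.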
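Recall that the barrier process is
$$\frac{dY}{Y} = r\,dt + \sigma_1\,d\widetilde{B}_1,$$
so
$$Y(t) = Y(0)\exp\left\{rt + \sigma_1\widetilde{B}_1(t) - \tfrac{1}{2}\sigma_1^2t\right\}.$$
Set
$$\hat{\theta} = \frac{r}{\sigma_1} - \frac{\sigma_1}{2}, \qquad \hat{B}(t) = \hat{\theta}t + \widetilde{B}_1(t), \qquad \hat{M}(T) = \max_{0 \le t \le T}\hat{B}(t).$$
Then
$$Y(t) = Y(0)\exp\{\sigma_1\hat{B}(t)\}, \qquad Y^*(T) = Y(0)\exp\{\sigma_1\hat{M}(T)\}.$$
The joint density of $\hat{B}(T)$ and $\hat{M}(T)$, appearing in Chapter 20, is
$$\widetilde{\mathbb{P}}\{\hat{B}(T) \in d\hat{b},\ \hat{M}(T) \in d\hat{m}\} = \frac{2(2\hat{m} - \hat{b})}{T\sqrt{2\pi T}}\exp\left\{-\frac{(2\hat{m}-\hat{b})^2}{2T} + \hat{\theta}\hat{b} - \tfrac{1}{2}\hat{\theta}^2T\right\}d\hat{b}\,d\hat{m}, \quad \hat{m} > 0,\ \hat{b} < \hat{m}.$$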
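The stock process.
$$\frac{dS}{S} = r\,dt + \rho\sigma_2\,d\widetilde{B}_1 + \sqrt{1-\rho^2}\,\sigma_2\,d\widetilde{B}_2,$$
so
$$\begin{aligned}
S(T) &= S(0)\exp\left\{rT + \rho\sigma_2\widetilde{B}_1(T) - \tfrac{1}{2}\rho^2\sigma_2^2T + \sqrt{1-\rho^2}\,\sigma_2\widetilde{B}_2(T) - \tfrac{1}{2}(1-\rho^2)\sigma_2^2T\right\} \\
&= S(0)\exp\left\{rT - \tfrac{1}{2}\sigma_2^2T + \rho\sigma_2\widetilde{B}_1(T) + \sqrt{1-\rho^2}\,\sigma_2\widetilde{B}_2(T)\right\}.
\end{aligned}$$
From the above paragraph we have
$$\widetilde{B}_1(T) = -\hat{\theta}T + \hat{B}(T),$$
so
$$S(T) = S(0)\exp\left\{rT + \rho\sigma_2\hat{B}(T) - \tfrac{1}{2}\sigma_2^2T - \rho\sigma_2\hat{\theta}T + \sqrt{1-\rho^2}\,\sigma_2\widetilde{B}_2(T)\right\}.$$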
24.1
Computing the option value
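$$\begin{aligned}
v(0,S(0),Y(0)) &= \widetilde{\mathbb{E}}\left[e^{-rT}(S(T)-K)^+\,\mathbf{1}_{\{Y^*(T) < L\}}\right] \\
&= e^{-rT}\,\widetilde{\mathbb{E}}\Big[\Big(S(0)\exp\Big\{\Big(r - \tfrac{1}{2}\sigma_2^2 - \rho\sigma_2\hat{\theta}\Big)T + \rho\sigma_2\hat{B}(T) + \sqrt{1-\rho^2}\,\sigma_2\widetilde{B}_2(T)\Big\} - K\Big)^+ \\
&\qquad\qquad \cdot\,\mathbf{1}_{\{Y(0)\exp[\sigma_1\hat{M}(T)] < L\}}\Big].
\end{aligned}$$
We know the joint density of $(\hat{B}(T), \hat{M}(T))$. The density of $\widetilde{B}_2(T)$ is
$$\widetilde{\mathbb{P}}\{\widetilde{B}_2(T) \in d\tilde{b}\} = \frac{1}{\sqrt{2\pi T}}\exp\left\{-\frac{\tilde{b}^2}{2T}\right\}d\tilde{b}, \quad \tilde{b} \in \mathbb{R}.$$
Furthermore, the pair of random variables $(\hat{B}(T), \hat{M}(T))$ is independent of $\widetilde{B}_2(T)$ because $\widetilde{B}_1$ and $\widetilde{B}_2$ are independent under $\widetilde{\mathbb{P}}$. Therefore, the joint density of the random vector $(\widetilde{B}_2(T), \hat{B}(T), \hat{M}(T))$ is
$$\widetilde{\mathbb{P}}\{\widetilde{B}_2(T) \in d\tilde{b},\ \hat{B}(T) \in d\hat{b},\ \hat{M}(T) \in d\hat{m}\} = \widetilde{\mathbb{P}}\{\widetilde{B}_2(T) \in d\tilde{b}\}\cdot\widetilde{\mathbb{P}}\{\hat{B}(T) \in d\hat{b},\ \hat{M}(T) \in d\hat{m}\}.$$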
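The option value at time zero is
$$\begin{aligned}
v(0,S(0),Y(0)) = e^{-rT}\int_0^{\frac{1}{\sigma_1}\log\frac{L}{Y(0)}}\int_{-\infty}^{\hat{m}}\int_{-\infty}^{\infty} &\Big(S(0)\exp\Big\{\Big(r - \tfrac{1}{2}\sigma_2^2 - \rho\sigma_2\hat{\theta}\Big)T + \rho\sigma_2\hat{b} + \sqrt{1-\rho^2}\,\sigma_2\tilde{b}\Big\} - K\Big)^+ \\
&\cdot\frac{1}{\sqrt{2\pi T}}\exp\left\{-\frac{\tilde{b}^2}{2T}\right\} \\
&\cdot\frac{2(2\hat{m}-\hat{b})}{T\sqrt{2\pi T}}\exp\left\{-\frac{(2\hat{m}-\hat{b})^2}{2T} + \hat{\theta}\hat{b} - \tfrac{1}{2}\hat{\theta}^2T\right\}\,d\tilde{b}\,d\hat{b}\,d\hat{m}.
\end{aligned}$$
The answer depends on $T$, $S(0)$ and $Y(0)$. It also depends on $\sigma_1$, $\sigma_2$, $\rho$, $r$, $K$ and $L$. It does not depend on $\lambda$, $\mu$, $\theta_1$, nor $\theta_2$. The parameter $\hat{\theta}$ appearing in the answer is $\hat{\theta} = \frac{r}{\sigma_1} - \frac{\sigma_1}{2}$.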
Remark 24.3 If we had not regarded $Y$ as a traded asset, then we would not have tried to set its mean return equal to $r$. We would have had only one equation (see Eqs. (0.1), (0.2))
$$\mu = r + \rho\sigma_2\theta_1 + \sqrt{1-\rho^2}\,\sigma_2\theta_2 \tag{1.1}$$
to determine $\theta_1$ and $\theta_2$. The nonuniqueness of the solution alerts us that some options cannot be hedged. Indeed, any option whose payoff depends on $Y$ cannot be hedged when we are allowed to trade only in the stock.
If we have an option whose payoff depends only on $S$, then $Y$ is superfluous. Returning to the original equation for $S$,
$$\frac{dS}{S} = \mu\,dt + \rho\sigma_2\,dB_1 + \sqrt{1-\rho^2}\,\sigma_2\,dB_2,$$
we should set
$$dW = \rho\,dB_1 + \sqrt{1-\rho^2}\,dB_2,$$
so $W$ is a Brownian motion under $\mathbb{P}$ (Lévy's theorem), and
$$\frac{dS}{S} = \mu\,dt + \sigma_2\,dW.$$
Now that we have only one Brownian motion, there will be only one $\theta$, namely,
$$\theta = \frac{\mu - r}{\sigma_2},$$
so with $d\widetilde{W} = \theta\,dt + dW$, we have
$$\frac{dS}{S} = r\,dt + \sigma_2\,d\widetilde{W},$$
and we are on our way.
24.2
The PDE for the outside barrier option
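Returning to the case of the option with payoff
$$(S(T)-K)^+\,\mathbf{1}_{\{Y^*(T) < L\}},$$
we obtain a formula for
$$v(t,x,y) = e^{-r(T-t)}\,\widetilde{\mathbb{E}}^{t,x,y}\left[(S(T)-K)^+\,\mathbf{1}_{\{\max_{t \le u \le T}Y(u) < L\}}\right]$$
by replacing $T$, $S(0)$ and $Y(0)$ by $T - t$, $x$ and $y$ respectively in the formula for $v(0,S(0),Y(0))$.
Now start at time 0 at $S(0)$ and $Y(0)$. Using the Markov property, we can show that the stochastic process
$$e^{-rt}v(t,S(t),Y(t))$$
is a martingale under $\widetilde{\mathbb{P}}$. We compute
$$\begin{aligned}
d\left[e^{-rt}v(t,S(t),Y(t))\right] = e^{-rt}\Big[&\Big(-rv + v_t + rSv_x + rYv_y + \tfrac{1}{2}\sigma_2^2S^2v_{xx} + \rho\sigma_1\sigma_2SYv_{xy} + \tfrac{1}{2}\sigma_1^2Y^2v_{yy}\Big)dt \\
&+ \rho\sigma_2Sv_x\,d\widetilde{B}_1 + \sqrt{1-\rho^2}\,\sigma_2Sv_x\,d\widetilde{B}_2 + \sigma_1Yv_y\,d\widetilde{B}_1\Big].
\end{aligned}$$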
Figure 24.1: Boundary conditions for barrier option (axes $x$ and $y$; $v(t,x,L) = 0$ for $x \ge 0$ on the line $y = L$, and $v(t,0,0) = 0$ at the origin). Note that $t \in [0,T]$ is fixed.
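Setting the $dt$ term equal to 0, we obtain the PDE
$$-rv + v_t + rxv_x + ryv_y + \tfrac{1}{2}\sigma_2^2x^2v_{xx} + \rho\sigma_1\sigma_2xyv_{xy} + \tfrac{1}{2}\sigma_1^2y^2v_{yy} = 0, \quad 0 \le t < T,\ x \ge 0,\ 0 \le y \le L.$$
The terminal condition is
$$v(T,x,y) = (x-K)^+, \quad x \ge 0,\ 0 \le y < L,$$
and the boundary conditions are
$$v(t,0,0) = 0, \quad 0 \le t \le T,$$
$$v(t,x,L) = 0, \quad 0 \le t \le T,\ x \ge 0.$$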
On the boundary $x = 0$:
$$-rv + v_t + ryv_y + \tfrac{1}{2}\sigma_1^2y^2v_{yy} = 0.$$
This is the usual Black-Scholes equation in $y$. The boundary conditions are
$$v(t,0,L) = 0, \qquad v(t,0,0) = 0;$$
the terminal condition is
$$v(T,0,y) = (0-K)^+ = 0, \quad y \ge 0.$$
On the $x = 0$ boundary, the option value is $v(t,0,y) = 0$, $0 \le y \le L$.

On the boundary $y = 0$:
$$-rv + v_t + rxv_x + \tfrac{1}{2}\sigma_2^2x^2v_{xx} = 0.$$
This is the usual Black-Scholes equation in $x$. The boundary condition is
$$v(t,0,0) = e^{-r(T-t)}(0-K)^+ = 0;$$
the terminal condition is
$$v(T,x,0) = (x-K)^+, \quad x \ge 0.$$
On the $y = 0$ boundary, the barrier is irrelevant, and the option value is given by the usual Black-Scholes formula for a European call.
24.3
The hedge
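After setting the $dt$ term to 0, we have the equation
$$d\left[e^{-rt}v(t,S(t),Y(t))\right] = e^{-rt}\left[\rho\sigma_2Sv_x\,d\widetilde{B}_1 + \sqrt{1-\rho^2}\,\sigma_2Sv_x\,d\widetilde{B}_2 + \sigma_1Yv_y\,d\widetilde{B}_1\right],$$
where $v_x = v_x(t,S(t),Y(t))$, $v_y = v_y(t,S(t),Y(t))$, and $\widetilde{B}_1$, $\widetilde{B}_2$, $S$, $Y$ are functions of $t$. Note that
$$d\left[e^{-rt}S(t)\right] = e^{-rt}\left[-rS(t)\,dt + dS(t)\right] = e^{-rt}\left[\rho\sigma_2S(t)\,d\widetilde{B}_1(t) + \sqrt{1-\rho^2}\,\sigma_2S(t)\,d\widetilde{B}_2(t)\right],$$
$$d\left[e^{-rt}Y(t)\right] = e^{-rt}\left[-rY(t)\,dt + dY(t)\right] = e^{-rt}\sigma_1Y(t)\,d\widetilde{B}_1(t).$$
Therefore,
$$d\left[e^{-rt}v(t,S(t),Y(t))\right] = v_x\,d\left[e^{-rt}S\right] + v_y\,d\left[e^{-rt}Y\right].$$
Let $\Delta_2(t)$ denote the number of shares of stock held at time $t$, and let $\Delta_1(t)$ denote the number of "shares" of the barrier process $Y$. The value $X(t)$ of the portfolio has the differential
$$dX = \Delta_2\,dS + \Delta_1\,dY + r\left[X - \Delta_2S - \Delta_1Y\right]dt.$$
This is equivalent to
$$d\left[e^{-rt}X(t)\right] = \Delta_2(t)\,d\left[e^{-rt}S(t)\right] + \Delta_1(t)\,d\left[e^{-rt}Y(t)\right].$$
To get $X(t) = v(t,S(t),Y(t))$ for all $t$, we must have
$$X(0) = v(0,S(0),Y(0))$$
and
$$\Delta_2(t) = v_x(t,S(t),Y(t)), \qquad \Delta_1(t) = v_y(t,S(t),Y(t)).$$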
Chapter 25
American Options
This and the following chapters form part of the course Stochastic Differential Equations for Finance II.
25.1
Preview of perpetual American put
$$dS = rS\,dt + \sigma S\,dB$$
Intrinsic value at time $t$: $(K - S(t))^+$.
Let $L \in [0,K]$ be given. Suppose we exercise the first time the stock price is $L$ or lower. We define
$$\tau_L = \min\{t \ge 0;\ S(t) \le L\},$$
$$v_L(x) = \mathbb{E}\,e^{-r\tau_L}(K - S(\tau_L))^+ = \begin{cases} K - x & \text{if } x \le L, \\ (K-L)\,\mathbb{E}\,e^{-r\tau_L} & \text{if } x > L. \end{cases}$$
The plan is to compute $v_L(x)$ and then maximize over $L$ to find the optimal exercise price. We need to know the distribution of $\tau_L$.
25.2
First passage times for Brownian motion: first method
(Based on the reflection principle)
Let $B$ be a Brownian motion under $\mathbb{P}$, let $x > 0$ be given, and define
$$\tau = \min\{t \ge 0;\ B(t) = x\}.$$
$\tau$ is called the first passage time to $x$. We compute the distribution of $\tau$.
Figure 25.1: Intrinsic value of perpetual American put (intrinsic value plotted against stock price $x$; both axes marked at $K$).
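Define
$$M(t) = \max_{0 \le u \le t}B(u).$$
From the first section of Chapter 20 we have
$$\mathbb{P}\{M(t) \in dm,\ B(t) \in db\} = \frac{2(2m-b)}{t\sqrt{2\pi t}}\exp\left\{-\frac{(2m-b)^2}{2t}\right\}dm\,db, \quad m > 0,\ b < m.$$
Therefore,
$$\begin{aligned}
\mathbb{P}\{M(t) \ge x\} &= \int_x^\infty\int_{-\infty}^m \frac{2(2m-b)}{t\sqrt{2\pi t}}\exp\left\{-\frac{(2m-b)^2}{2t}\right\}db\,dm \\
&= \int_x^\infty \frac{2}{\sqrt{2\pi t}}\exp\left\{-\frac{(2m-b)^2}{2t}\right\}\Bigg|_{b=-\infty}^{b=m}dm \\
&= \int_x^\infty \frac{2}{\sqrt{2\pi t}}\exp\left\{-\frac{m^2}{2t}\right\}dm.
\end{aligned}$$
We make the change of variable $z = \frac{m}{\sqrt{t}}$ in the integral to get
$$\mathbb{P}\{M(t) \ge x\} = \int_{x/\sqrt{t}}^\infty \frac{2}{\sqrt{2\pi}}\exp\left\{-\frac{z^2}{2}\right\}dz.$$
Now
$$\tau \le t \iff M(t) \ge x,$$
so
$$\begin{aligned}
\mathbb{P}\{\tau \in dt\} &= \frac{\partial}{\partial t}\mathbb{P}\{\tau \le t\}\,dt = \frac{\partial}{\partial t}\mathbb{P}\{M(t) \ge x\}\,dt \\
&= \left[\frac{\partial}{\partial t}\int_{x/\sqrt{t}}^\infty \frac{2}{\sqrt{2\pi}}\exp\left\{-\frac{z^2}{2}\right\}dz\right]dt \\
&= -\frac{2}{\sqrt{2\pi}}\exp\left\{-\frac{x^2}{2t}\right\}\cdot\frac{\partial}{\partial t}\left(\frac{x}{\sqrt{t}}\right)dt \\
&= \frac{x}{t\sqrt{2\pi t}}\exp\left\{-\frac{x^2}{2t}\right\}dt.
\end{aligned}$$
We also have the Laplace transform formula
$$\mathbb{E}\,e^{-\alpha\tau} = \int_0^\infty e^{-\alpha t}\,\mathbb{P}\{\tau \in dt\} = e^{-x\sqrt{2\alpha}}, \quad \alpha > 0. \quad \text{(See Homework)}$$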
Reference: Karatzas and Shreve, Brownian Motion and Stochastic Calculus, pp 95-96.
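The Laplace transform formula can be checked numerically by integrating the first passage density against the exponential weight. The sketch below is an illustration only: the midpoint rule, the truncation of the integral at `t_max`, the step count and the sample values of alpha and x are assumptions chosen here.

```python
import math


def first_passage_density(t, x):
    """Density of the first passage time of a standard Brownian motion to x > 0."""
    return x / (t * math.sqrt(2.0 * math.pi * t)) * math.exp(-x * x / (2.0 * t))


def laplace_transform_numeric(alpha, x, t_max=40.0, n=200_000):
    """Midpoint-rule approximation of the integral over (0, t_max) of
    exp(-alpha * t) times the first passage density."""
    h = t_max / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += math.exp(-alpha * t) * first_passage_density(t, x)
    return total * h


# Assumed sample values.
alpha, x = 1.0, 1.0
numeric = laplace_transform_numeric(alpha, x)
closed_form = math.exp(-x * math.sqrt(2.0 * alpha))
```

The truncation is harmless here because the integrand decays exponentially in t when alpha is positive; for alpha close to zero a much larger `t_max` would be needed, since the density itself only decays like a power of t.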
25.3
Drift adjustment
Reference: Karatzas/Shreve, Brownian motion and Stochastic Calculus, pp 196–197.
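For $0 \le t < \infty$, define
$$\widetilde{B}(t) = \theta t + B(t),$$
$$Z(t) = \exp\left\{-\theta B(t) - \tfrac{1}{2}\theta^2t\right\} = \exp\left\{-\theta\widetilde{B}(t) + \tfrac{1}{2}\theta^2t\right\}.$$
Define
$$\tilde{\tau} = \min\{t \ge 0;\ \widetilde{B}(t) = x\}.$$
We fix a finite time $T$ and change the probability measure "only up to $T$". More specifically, with $T$ fixed, define
$$\widetilde{\mathbb{P}}(A) = \int_A Z(T)\,d\mathbb{P}, \quad A \in \mathcal{F}(T).$$
Under $\widetilde{\mathbb{P}}$, the process $\widetilde{B}(t)$, $0 \le t \le T$, is a (nondrifted) Brownian motion, so
$$\widetilde{\mathbb{P}}\{\tilde{\tau} \in dt\} = \mathbb{P}\{\tau \in dt\} = \frac{x}{t\sqrt{2\pi t}}\exp\left\{-\frac{x^2}{2t}\right\}dt, \quad 0 < t \le T.$$
For $0 < t \le T$ we have
$$\begin{aligned}
\mathbb{P}\{\tilde{\tau} \le t\} &= \mathbb{E}\left[\mathbf{1}_{\{\tilde{\tau} \le t\}}\right] \\
&= \widetilde{\mathbb{E}}\left[\mathbf{1}_{\{\tilde{\tau} \le t\}}\frac{1}{Z(T)}\right] \\
&= \widetilde{\mathbb{E}}\left[\mathbf{1}_{\{\tilde{\tau} \le t\}}\exp\left\{\theta\widetilde{B}(T) - \tfrac{1}{2}\theta^2T\right\}\right] \\
&= \widetilde{\mathbb{E}}\left[\mathbf{1}_{\{\tilde{\tau} \le t\}}\,\widetilde{\mathbb{E}}\left[\exp\left\{\theta\widetilde{B}(T) - \tfrac{1}{2}\theta^2T\right\} \,\middle|\, \mathcal{F}(\tilde{\tau}\wedge t)\right]\right] \\
&= \widetilde{\mathbb{E}}\left[\mathbf{1}_{\{\tilde{\tau} \le t\}}\exp\left\{\theta\widetilde{B}(\tilde{\tau}\wedge t) - \tfrac{1}{2}\theta^2(\tilde{\tau}\wedge t)\right\}\right] \\
&= \widetilde{\mathbb{E}}\left[\mathbf{1}_{\{\tilde{\tau} \le t\}}\exp\left\{\theta x - \tfrac{1}{2}\theta^2\tilde{\tau}\right\}\right] \\
&= \int_0^t \exp\left\{\theta x - \tfrac{1}{2}\theta^2s\right\}\widetilde{\mathbb{P}}\{\tilde{\tau} \in ds\} \\
&= \int_0^t \frac{x}{s\sqrt{2\pi s}}\exp\left\{\theta x - \tfrac{1}{2}\theta^2s - \frac{x^2}{2s}\right\}ds \\
&= \int_0^t \frac{x}{s\sqrt{2\pi s}}\exp\left\{-\frac{(x-\theta s)^2}{2s}\right\}ds.
\end{aligned}$$
Therefore,
$$\mathbb{P}\{\tilde{\tau} \in dt\} = \frac{x}{t\sqrt{2\pi t}}\exp\left\{-\frac{(x-\theta t)^2}{2t}\right\}dt, \quad 0 < t \le T.$$
Since $T$ is arbitrary, this must in fact be the correct formula for all $t > 0$.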
25.4
Drift-adjusted Laplace transform
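Recall the Laplace transform formula for
$$\tau = \min\{t \ge 0;\ B(t) = x\}$$
for nondrifted Brownian motion:
$$\mathbb{E}\,e^{-\alpha\tau} = \int_0^\infty \frac{x}{t\sqrt{2\pi t}}\exp\left\{-\alpha t - \frac{x^2}{2t}\right\}dt = e^{-x\sqrt{2\alpha}}, \quad \alpha > 0,\ x > 0.$$
For
$$\tilde{\tau} = \min\{t \ge 0;\ \theta t + B(t) = x\},$$
the Laplace transform is
$$\begin{aligned}
\mathbb{E}\,e^{-\alpha\tilde{\tau}} &= \int_0^\infty \frac{x}{t\sqrt{2\pi t}}\exp\left\{-\alpha t - \frac{(x-\theta t)^2}{2t}\right\}dt \\
&= \int_0^\infty \frac{x}{t\sqrt{2\pi t}}\exp\left\{-\alpha t - \frac{x^2}{2t} + x\theta - \tfrac{1}{2}\theta^2t\right\}dt \\
&= e^{x\theta}\int_0^\infty \frac{x}{t\sqrt{2\pi t}}\exp\left\{-\left(\alpha + \tfrac{1}{2}\theta^2\right)t - \frac{x^2}{2t}\right\}dt \\
&= e^{x\theta - x\sqrt{2\alpha + \theta^2}}, \quad \alpha > 0,\ x > 0,
\end{aligned}$$
where in the last step we have used the formula for $\mathbb{E}\,e^{-\alpha\tau}$ with $\alpha$ replaced by $\alpha + \tfrac{1}{2}\theta^2$.
If $\tilde{\tau}(\omega) < \infty$, then
$$\lim_{\alpha \downarrow 0}e^{-\alpha\tilde{\tau}(\omega)} = 1;$$
if $\tilde{\tau}(\omega) = \infty$, then $e^{-\alpha\tilde{\tau}(\omega)} = 0$ for every $\alpha > 0$, so
$$\lim_{\alpha \downarrow 0}e^{-\alpha\tilde{\tau}(\omega)} = 0.$$
Therefore,
$$\lim_{\alpha \downarrow 0}e^{-\alpha\tilde{\tau}(\omega)} = \mathbf{1}_{\{\tilde{\tau} < \infty\}}.$$
Letting $\alpha \downarrow 0$ and using the Monotone Convergence Theorem in the Laplace transform formula
$$\mathbb{E}\,e^{-\alpha\tilde{\tau}} = e^{x\theta - x\sqrt{2\alpha + \theta^2}},$$
we obtain
$$\mathbb{P}\{\tilde{\tau} < \infty\} = e^{x\theta - x\sqrt{\theta^2}} = e^{x\theta - x|\theta|}.$$
If $\theta \ge 0$, then
$$\mathbb{P}\{\tilde{\tau} < \infty\} = 1.$$
If $\theta < 0$, then
$$\mathbb{P}\{\tilde{\tau} < \infty\} = e^{2x\theta} < 1.$$
(Recall that $x > 0$.)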
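The probability that drifted Brownian motion ever reaches the level x can be cross-checked by integrating the drift-adjusted first passage density over a long time horizon. This is a sketch under stated assumptions: the midpoint rule, the cutoff `t_max`, the step count and the sample drifts are choices made here, and a nonzero drift is assumed so that the truncated tail is negligible.

```python
import math


def drifted_density(t, x, theta):
    """First passage density to level x > 0 for Brownian motion with drift theta."""
    return x / (t * math.sqrt(2.0 * math.pi * t)) * math.exp(-(x - theta * t) ** 2 / (2.0 * t))


def hitting_probability_numeric(x, theta, t_max=400.0, n=400_000):
    """Midpoint-rule integral of the density over (0, t_max); theta must be nonzero."""
    h = t_max / n
    return h * sum(drifted_density((i + 0.5) * h, x, theta) for i in range(n))


def hitting_probability_closed(x, theta):
    """Closed form exp(x*theta - x*|theta|)."""
    return math.exp(x * theta - x * abs(theta))


# Assumed sample values: level 1, drift away from and toward the level.
p_away = hitting_probability_numeric(1.0, -0.5)
p_toward = hitting_probability_numeric(1.0, 0.5)
```

With drift away from the level the mass of the density is strictly less than one, the missing mass being the probability that the level is never reached; with drift toward the level the density integrates to one.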
25.5
First passage times: Second method
(Based on martingales)
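Let $\sigma > 0$ be given. Then
$$Y(t) = \exp\left\{\sigma B(t) - \tfrac{1}{2}\sigma^2t\right\}$$
is a martingale, so $Y(t \wedge \tau)$ is also a martingale. We have
$$\begin{aligned}
1 = Y(0 \wedge \tau) &= \mathbb{E}\,Y(t \wedge \tau) \\
&= \mathbb{E}\exp\left\{\sigma B(t\wedge\tau) - \tfrac{1}{2}\sigma^2(t\wedge\tau)\right\} \\
&= \lim_{t\to\infty}\mathbb{E}\exp\left\{\sigma B(t\wedge\tau) - \tfrac{1}{2}\sigma^2(t\wedge\tau)\right\}.
\end{aligned}$$
We want to take the limit inside the expectation. Since
$$0 \le \exp\left\{\sigma B(t\wedge\tau) - \tfrac{1}{2}\sigma^2(t\wedge\tau)\right\} \le e^{\sigma x},$$
this is justified by the Bounded Convergence Theorem. Therefore,
$$1 = \mathbb{E}\lim_{t\to\infty}\exp\left\{\sigma B(t\wedge\tau) - \tfrac{1}{2}\sigma^2(t\wedge\tau)\right\}.$$
There are two possibilities. For those $\omega$ for which $\tau(\omega) < \infty$,
$$\lim_{t\to\infty}\exp\left\{\sigma B(t\wedge\tau) - \tfrac{1}{2}\sigma^2(t\wedge\tau)\right\} = e^{\sigma x - \frac{1}{2}\sigma^2\tau}.$$
For those $\omega$ for which $\tau(\omega) = \infty$,
$$\lim_{t\to\infty}\exp\left\{\sigma B(t\wedge\tau) - \tfrac{1}{2}\sigma^2(t\wedge\tau)\right\} \le \lim_{t\to\infty}\exp\left\{\sigma x - \tfrac{1}{2}\sigma^2t\right\} = 0.$$
Therefore,
$$\begin{aligned}
1 &= \mathbb{E}\lim_{t\to\infty}\exp\left\{\sigma B(t\wedge\tau) - \tfrac{1}{2}\sigma^2(t\wedge\tau)\right\} \\
&= \mathbb{E}\left[e^{\sigma x - \frac{1}{2}\sigma^2\tau}\,\mathbf{1}_{\{\tau < \infty\}}\right] \\
&= \mathbb{E}\,e^{\sigma x - \frac{1}{2}\sigma^2\tau},
\end{aligned}$$
where we understand $e^{\sigma x - \frac{1}{2}\sigma^2\tau}$ to be zero if $\tau = \infty$.
Let $\alpha = \tfrac{1}{2}\sigma^2$, so $\sigma = \sqrt{2\alpha}$. We have again derived the Laplace transform formula
$$e^{-x\sqrt{2\alpha}} = \mathbb{E}\,e^{-\alpha\tau}, \quad \alpha > 0,\ x > 0,$$
for the first passage time for nondrifted Brownian motion.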
25.6
Perpetual American put
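$$dS = rS\,dt + \sigma S\,dB, \qquad S(0) = x,$$
$$S(t) = x\exp\left\{\left(r - \tfrac{1}{2}\sigma^2\right)t + \sigma B(t)\right\} = x\exp\Bigg\{\sigma\Bigg[\underbrace{\left(\frac{r}{\sigma} - \frac{\sigma}{2}\right)}_{\theta}t + B(t)\Bigg]\Bigg\}.$$
Intrinsic value of the put at time $t$: $(K - S(t))^+$.
Let $L \in [0,K]$ be given. Define for $x \ge L$,
$$\begin{aligned}
\tau_L &= \min\{t \ge 0;\ S(t) = L\} \\
&= \min\left\{t \ge 0;\ \theta t + B(t) = \tfrac{1}{\sigma}\log\tfrac{L}{x}\right\} \\
&= \min\left\{t \ge 0;\ -\theta t - B(t) = \tfrac{1}{\sigma}\log\tfrac{x}{L}\right\}.
\end{aligned}$$
Define
$$\begin{aligned}
v_L &= (K-L)\,\mathbb{E}\,e^{-r\tau_L} \\
&= (K-L)\exp\left\{-\frac{\theta}{\sigma}\log\frac{x}{L} - \frac{1}{\sigma}\log\frac{x}{L}\sqrt{2r + \theta^2}\right\} \\
&= (K-L)\left(\frac{x}{L}\right)^{-\frac{\theta}{\sigma} - \frac{1}{\sigma}\sqrt{2r+\theta^2}}.
\end{aligned}$$
We compute the exponent
$$\begin{aligned}
-\frac{\theta}{\sigma} - \frac{1}{\sigma}\sqrt{2r + \theta^2} &= -\frac{r}{\sigma^2} + \frac{1}{2} - \frac{1}{\sigma}\sqrt{2r + \left(\frac{r}{\sigma} - \frac{\sigma}{2}\right)^2} \\
&= -\frac{r}{\sigma^2} + \frac{1}{2} - \frac{1}{\sigma}\sqrt{2r + \frac{r^2}{\sigma^2} - r + \frac{\sigma^2}{4}} \\
&= -\frac{r}{\sigma^2} + \frac{1}{2} - \frac{1}{\sigma}\sqrt{\frac{r^2}{\sigma^2} + r + \frac{\sigma^2}{4}} \\
&= -\frac{r}{\sigma^2} + \frac{1}{2} - \frac{1}{\sigma}\sqrt{\left(\frac{r}{\sigma} + \frac{\sigma}{2}\right)^2} \\
&= -\frac{r}{\sigma^2} + \frac{1}{2} - \frac{1}{\sigma}\left(\frac{r}{\sigma} + \frac{\sigma}{2}\right) \\
&= -\frac{2r}{\sigma^2}.
\end{aligned}$$
Therefore,
$$v_L(x) = \begin{cases} K - x, & 0 \le x \le L, \\ (K-L)\left(\frac{x}{L}\right)^{-2r/\sigma^2}, & x \ge L. \end{cases}$$
The curves $(K-L)\left(\frac{x}{L}\right)^{-2r/\sigma^2}$ are all of the form $Cx^{-2r/\sigma^2}$.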
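Figure 25.2: Value of perpetual American put (the line $K - x$ and the curve $(K-L)(x/L)^{-2r/\sigma^2}$ plotted against stock price $x$).
Figure 25.3: Curves (the family $C_1x^{-2r/\sigma^2}$, $C_2x^{-2r/\sigma^2}$, $C_3x^{-2r/\sigma^2}$ plotted against stock price $x$).
We want to choose the largest possible constant. The constant is
$$C = (K-L)L^{2r/\sigma^2},$$
and
$$\begin{aligned}
\frac{\partial C}{\partial L} &= -L^{2r/\sigma^2} + \frac{2r}{\sigma^2}(K-L)L^{\frac{2r}{\sigma^2}-1} \\
&= L^{2r/\sigma^2}\left[-1 + \frac{2r}{\sigma^2}(K-L)\frac{1}{L}\right] \\
&= L^{2r/\sigma^2}\left[-\left(1 + \frac{2r}{\sigma^2}\right) + \frac{2r}{\sigma^2}\frac{K}{L}\right].
\end{aligned}$$
We solve
$$-\left(1 + \frac{2r}{\sigma^2}\right) + \frac{2r}{\sigma^2}\frac{K}{L} = 0$$
to get
$$L = \frac{2rK}{\sigma^2 + 2r}.$$
Since $0 < 2r < \sigma^2 + 2r$, we have
$$0 < L < K.$$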
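Solution to the perpetual American put pricing problem (see Fig. 25.4):
$$v(x) = \begin{cases} K - x, & 0 \le x \le L^*, \\ (K - L^*)\left(\frac{x}{L^*}\right)^{-2r/\sigma^2}, & x \ge L^*, \end{cases}$$
where
$$L^* = \frac{2rK}{\sigma^2 + 2r}.$$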
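The maximization over the exercise level can be illustrated numerically. The following sketch is an illustration with assumed names and sample parameters (strike 100, rate 5%, volatility 20%): it evaluates the value of the "exercise at level L" strategy for a fixed stock price, compares a grid of candidate levels with the optimal level, and estimates the right-hand slope of the value function at the optimal level by a finite difference.

```python
def optimal_boundary(K, r, sigma):
    """Optimal exercise level 2rK / (sigma^2 + 2r)."""
    return 2.0 * r * K / (sigma**2 + 2.0 * r)


def put_value_given_boundary(x, L, K, r, sigma):
    """Value of exercising the first time the stock price is L or lower."""
    if x <= L:
        return K - x
    return (K - L) * (x / L) ** (-2.0 * r / sigma**2)


# Assumed sample parameters.
K, r, sigma = 100.0, 0.05, 0.2
L_star = optimal_boundary(K, r, sigma)

# For a fixed stock price above every candidate level, compare strategies.
x = 120.0
best = put_value_given_boundary(x, L_star, K, r, sigma)
grid = [put_value_given_boundary(x, float(L), K, r, sigma) for L in range(1, 100)]

# One-sided finite-difference slope just above the optimal level.
eps = 1e-6
slope_right = (put_value_given_boundary(L_star + eps, L_star, K, r, sigma)
               - put_value_given_boundary(L_star, L_star, K, r, sigma)) / eps
```

No level on the grid does better than the optimal level, and the estimated slope from the right is close to the slope of the intrinsic value line on the left, so the two pieces of the value function join smoothly.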
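Note that
$$v'(x) = \begin{cases} -1, & 0 \le x < L^*, \\ -\frac{2r}{\sigma^2}(K - L^*)(L^*)^{2r/\sigma^2}x^{-\frac{2r}{\sigma^2}-1}, & x > L^*. \end{cases}$$
We have
$$\begin{aligned}
\lim_{x \downarrow L^*}v'(x) &= -\frac{2r}{\sigma^2}(K - L^*)\frac{1}{L^*} \\
&= -\frac{2r}{\sigma^2}\left(K - \frac{2rK}{\sigma^2 + 2r}\right)\frac{\sigma^2 + 2r}{2rK} \\
&= -\frac{2r}{\sigma^2}\left(\frac{\sigma^2 + 2r - 2r}{\sigma^2 + 2r}\right)\frac{\sigma^2 + 2r}{2r} \\
&= -1 \\
&= \lim_{x \uparrow L^*}v'(x).
\end{aligned}$$
Figure 25.4: Solution to perpetual American put (the line $K - x$ and the curve $(K - L^*)(x/L^*)^{-2r/\sigma^2}$ plotted against stock price $x$, meeting at $x = L^*$).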
25.7
Value of the perpetual American put
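Set
$$\gamma = \frac{2r}{\sigma^2}, \qquad L^* = \frac{2rK}{\sigma^2 + 2r} = \frac{\gamma}{\gamma+1}K.$$
If $0 \le x < L^*$, then $v(x) = K - x$. If $L^* \le x < \infty$, then
$$v(x) = \underbrace{(K - L^*)(L^*)^{\gamma}}_{C}\,x^{-\gamma} \tag{7.1}$$
$$\phantom{v(x)} = \mathbb{E}^x\left[e^{-r\tau}(K - L^*)^+\,\mathbf{1}_{\{\tau < \infty\}}\right], \tag{7.2}$$
where
$$S(0) = x, \tag{7.3}$$
$$\tau = \min\{t \ge 0;\ S(t) = L^*\}. \tag{7.4}$$
If $0 \le x < L^*$, then
$$-rv(x) + rxv'(x) + \tfrac{1}{2}\sigma^2x^2v''(x) = -r(K - x) + rx(-1) = -rK.$$
If $L^* < x < \infty$, then
$$\begin{aligned}
-rv(x) + rxv'(x) + \tfrac{1}{2}\sigma^2x^2v''(x) &= C\left[-rx^{-\gamma} - rx\gamma x^{-\gamma-1} - \tfrac{1}{2}\sigma^2x^2\gamma(-\gamma-1)x^{-\gamma-2}\right] \\
&= Cx^{-\gamma}\left[-r - r\gamma - \tfrac{1}{2}\sigma^2\gamma(-\gamma-1)\right] \\
&= C(-\gamma-1)x^{-\gamma}\left[r - \tfrac{1}{2}\sigma^2\frac{2r}{\sigma^2}\right] \\
&= 0.
\end{aligned}$$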
In other words, $v$ solves the linear complementarity problem (see Fig. 25.5):
Figure 25.5: Linear complementarity ($v$ plotted against $x$, with $L^*$ and $K$ marked on the $x$-axis and $K$ on the vertical axis).
For all $x \in \mathbb{R}$, $x \ne L^*$,
$$rv - rxv' - \tfrac{1}{2}\sigma^2x^2v'' \ge 0, \tag{a}$$
$$v \ge (K - x)^+, \tag{b}$$
$$\text{One of the inequalities (a) or (b) is an equality.} \tag{c}$$
The half-line $[0,\infty)$ is divided into two regions:
$$\mathcal{C} = \{x;\ v(x) > (K-x)^+\},$$
$$\mathcal{S} = \{x;\ rv - rxv' - \tfrac{1}{2}\sigma^2x^2v'' > 0\},$$
and $L^*$ is the boundary between them. If the stock price is in $\mathcal{C}$, the owner of the put should not exercise (should "continue"). If the stock price is in $\mathcal{S}$ or at $L^*$, the owner of the put should exercise (should "stop").
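Conditions (a), (b) and (c) can be checked pointwise from the closed-form value function and its derivatives. The sketch below is illustrative; the function name `check_lcp` and the sample parameters are assumptions. It returns the left-hand side of (a) and the slack in (b) at a given stock price.

```python
def check_lcp(x, K, r, sigma):
    """Return (a, b) where a = r*v - r*x*v' - 0.5*sigma^2*x^2*v'' and
    b = v - (K - x)^+, using the closed-form perpetual put value."""
    gamma = 2.0 * r / sigma**2
    L_star = gamma / (gamma + 1.0) * K
    if x < L_star:
        # Stopping region: v equals the intrinsic value K - x.
        v, dv, d2v = K - x, -1.0, 0.0
    else:
        # Continuation region: v = C * x^(-gamma).
        C = (K - L_star) * L_star**gamma
        v = C * x ** (-gamma)
        dv = -gamma * C * x ** (-gamma - 1.0)
        d2v = gamma * (gamma + 1.0) * C * x ** (-gamma - 2.0)
    a = r * v - r * x * dv - 0.5 * sigma**2 * x**2 * d2v
    b = v - max(K - x, 0.0)
    return a, b


# Assumed sample parameters; the boundary is near 71.43 for these values.
K, r, sigma = 100.0, 0.05, 0.2
a_stop, b_stop = check_lcp(50.0, K, r, sigma)     # below the boundary
a_cont, b_cont = check_lcp(120.0, K, r, sigma)    # above the boundary
a_mid, b_mid = check_lcp(80.0, K, r, sigma)       # above the boundary, in the money
```

Below the boundary (b) holds with equality and (a) is strict, equal to r times K; above the boundary (a) holds with equality and (b) is strict, so exactly one inequality is tight on each side.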
25.8
Hedging the put
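Let $S(0)$ be given. Sell the put at time zero for $v(S(0))$. Invest the money, holding $\Delta(t)$ shares of stock and consuming at rate $C(t)$ at time $t$. The value $X(t)$ of this portfolio is governed by
$$dX(t) = \Delta(t)\,dS(t) + r\left(X(t) - \Delta(t)S(t)\right)dt - C(t)\,dt,$$
or equivalently,
$$d\left(e^{-rt}X(t)\right) = -e^{-rt}C(t)\,dt + e^{-rt}\Delta(t)\,\sigma S(t)\,dB(t).$$
The discounted value of the put satisfies
$$\begin{aligned}
d\left(e^{-rt}v(S(t))\right) &= e^{-rt}\left[-rv(S(t)) + rS(t)v'(S(t)) + \tfrac{1}{2}\sigma^2S^2(t)v''(S(t))\right]dt + e^{-rt}\sigma S(t)v'(S(t))\,dB(t) \\
&= -rKe^{-rt}\,\mathbf{1}_{\{S(t) < L^*\}}\,dt + e^{-rt}\sigma S(t)v'(S(t))\,dB(t).
\end{aligned}$$
We should set
$$C(t) = rK\,\mathbf{1}_{\{S(t) < L^*\}}, \qquad \Delta(t) = v'(S(t)).$$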
Remark 25.1 If $S(t) < L^*$, then
$$v(S(t)) = K - S(t), \qquad \Delta(t) = v'(S(t)) = -1.$$
To hedge the put when $S(t) < L^*$, short one share of stock and hold $K$ in the money market. As long as the owner does not exercise, you can consume the interest from the money market position, i.e.,
$$C(t) = rK\,\mathbf{1}_{\{S(t) < L^*\}}.$$
Properties of $e^{-rt}v(S(t))$:
1. $e^{-rt}v(S(t))$ is a supermartingale (see its differential above).
2. $e^{-rt}v(S(t)) \ge e^{-rt}(K - S(t))^+$, $0 \le t < \infty$;
3. $e^{-rt}v(S(t))$ is the smallest process with properties 1 and 2.
Explanation of property 3. Let $Y$ be a supermartingale satisfying
$$Y(t) \ge e^{-rt}(K - S(t))^+, \quad 0 \le t < \infty. \tag{8.1}$$
Then property 3 says that
$$Y(t) \ge e^{-rt}v(S(t)), \quad 0 \le t < \infty. \tag{8.2}$$
We use (8.1) to prove (8.2) for $t = 0$, i.e.,
$$Y(0) \ge v(S(0)). \tag{8.3}$$
If $t$ is not zero, we can take $t$ to be the initial time and $S(t)$ to be the initial stock price, and then adapt the argument below to prove property (8.2).
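Proof of (8.3), assuming $Y$ is a supermartingale satisfying (8.1):
Case I: $S(0) \le L^*$. We have
$$Y(0) \underbrace{\ge}_{(8.1)} (K - S(0))^+ = v(S(0)).$$
Case II: $S(0) > L^*$. For $T > 0$, we have
$$\begin{aligned}
Y(0) &\ge \mathbb{E}\,Y(\tau \wedge T) && \text{(Stopped supermartingale is a supermartingale)} \\
&\ge \mathbb{E}\left[Y(\tau\wedge T)\,\mathbf{1}_{\{\tau < \infty\}}\right]. && \text{(Since } Y \ge 0\text{)}
\end{aligned}$$
Now let $T \to \infty$ to get
$$\begin{aligned}
Y(0) &\ge \lim_{T\to\infty}\mathbb{E}\left[Y(\tau\wedge T)\,\mathbf{1}_{\{\tau<\infty\}}\right] \\
&\ge \mathbb{E}\left[Y(\tau)\,\mathbf{1}_{\{\tau<\infty\}}\right] && \text{(Fatou's Lemma)} \\
&\ge \mathbb{E}\Big[e^{-r\tau}\big(K - \underbrace{S(\tau)}_{L^*}\big)^+\,\mathbf{1}_{\{\tau<\infty\}}\Big] && \text{(by (8.1))} \\
&= v(S(0)). && \text{(See eq. (7.2))}
\end{aligned}$$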
25.9
Perpetual American contingent claim
Intrinsic value: $h(S(t))$.
Value of the American contingent claim:
$$v(x) = \sup_{\tau}\mathbb{E}^x\left[e^{-r\tau}h(S(\tau))\right],$$
where the supremum is over all stopping times.
Optimal exercise rule: Any stopping time $\tau$ which attains the supremum.
Characterization of $v$:
1. $e^{-rt}v(S(t))$ is a supermartingale;
2. $e^{-rt}v(S(t)) \ge e^{-rt}h(S(t))$, $0 \le t < \infty$;
3. $e^{-rt}v(S(t))$ is the smallest process with properties 1 and 2.
25.10
Perpetual American call
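$$v(x) = \sup_{\tau}\mathbb{E}^x\left[e^{-r\tau}(S(\tau) - K)^+\right]$$
Theorem 10.63
$$v(x) = x \quad \forall x \ge 0.$$
Proof: For every $t$,
$$\begin{aligned}
v(x) &\ge \mathbb{E}^x\left[e^{-rt}(S(t) - K)^+\right] \\
&\ge \mathbb{E}^x\left[e^{-rt}(S(t) - K)\right] \\
&= \mathbb{E}^x\left[e^{-rt}S(t)\right] - e^{-rt}K \\
&= x - e^{-rt}K.
\end{aligned}$$
Let $t \to \infty$ to get $v(x) \ge x$.
Now start with $S(0) = x$ and define
$$Y(t) = e^{-rt}S(t).$$
Then:
1. $Y$ is a supermartingale (in fact, $Y$ is a martingale);
2. $Y(t) \ge e^{-rt}(S(t) - K)^+$, $0 \le t < \infty$.
Therefore, $Y(0) \ge v(S(0))$, i.e.,
$$x \ge v(x).$$
Remark 25.2 No matter what $\tau$ we choose,
$$\mathbb{E}^x\left[e^{-r\tau}(S(\tau) - K)^+\right] < \mathbb{E}^x\left[e^{-r\tau}S(\tau)\right] \le x = v(x).$$
There is no optimal exercise time.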
25.11
Put with expiration
Expiration time: $T > 0$.
Intrinsic value: $(K - S(t))^+$.
Value of the put:
$$v(t,x) = (\text{value of the put at time } t \text{ if } S(t) = x) = \sup_{\substack{t \le \tau \le T \\ \tau:\ \text{stopping time}}}\mathbb{E}^x\,e^{-r(\tau - t)}(K - S(\tau))^+.$$
See Fig. 25.6. It can be shown that $v$, $v_t$, $v_x$ are continuous across the boundary, while $v_{xx}$ has a jump.
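Figure 25.6: Value of put with expiration (stock price $x$ against time $t$, $0 \le t \le T$, with levels $K$ and $L$ marked. Above the exercise boundary: $v > K - x$, $-rv + v_t + rxv_x + \tfrac{1}{2}\sigma^2x^2v_{xx} = 0$, and $v(T,x) = 0$ for $x \ge K$. Below the boundary: $v = K - x$, $v_t = 0$, $v_x = -1$, $v_{xx} = 0$, $-rv + v_t + rxv_x + \tfrac{1}{2}\sigma^2x^2v_{xx} = -rK$, and $v(T,x) = K - x$ for $0 \le x \le K$.)
Let $S(0)$ be given. Then
1. $e^{-rt}v(t,S(t))$, $0 \le t \le T$, is a supermartingale;
2. $e^{-rt}v(t,S(t)) \ge e^{-rt}(K - S(t))^+$, $0 \le t \le T$;
3. $e^{-rt}v(t,S(t))$ is the smallest process with properties 1 and 2.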
25.12
American contingent claim with expiration
Expiration time: $T > 0$.
Intrinsic value: $h(S(t))$.
Value of the contingent claim:
$$v(t,x) = \sup_{t \le \tau \le T}\mathbb{E}^x\,e^{-r(\tau - t)}h(S(\tau)).$$
Then
$$rv - v_t - rxv_x - \tfrac{1}{2}\sigma^2x^2v_{xx} \ge 0, \tag{a}$$
$$v \ge h(x), \tag{b}$$
$$\text{At every point } (t,x) \in [0,T]\times[0,\infty), \text{ either (a) or (b) is an equality.} \tag{c}$$
Characterization of $v$: Let $S(0)$ be given. Then
1. $e^{-rt}v(t,S(t))$, $0 \le t \le T$, is a supermartingale;
2. $e^{-rt}v(t,S(t)) \ge e^{-rt}h(S(t))$;
3. $e^{-rt}v(t,S(t))$ is the smallest process with properties 1 and 2.
The optimal exercise time is
$$\tau = \min\{t \ge 0;\ v(t,S(t)) = h(S(t))\}.$$
If $\tau(\omega) = \infty$, then there is no optimal exercise time along the particular path $\omega$.
Chapter 26
Options on dividend-paying stocks
26.1
American option with convex payoff function
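Theorem 1.64 Consider the stock price process
$$dS(t) = r(t)S(t)\,dt + \sigma(t)S(t)\,dB(t),$$
where $r$ and $\sigma$ are processes and $r(t) \ge 0$, $0 \le t \le T$, a.s. This stock pays no dividends.
Let $h(x)$ be a convex function of $x \ge 0$, and assume $h(0) = 0$. (E.g., $h(x) = (x - K)^+$.) An American contingent claim paying $h(S(t))$ if exercised at time $t$ does not need to be exercised before expiration, i.e., waiting until expiration to decide whether to exercise entails no loss of value.
Proof: For $0 \le \lambda \le 1$ and $x \ge 0$, we have
$$h(\lambda x) = h((1-\lambda)0 + \lambda x) \le (1-\lambda)h(0) + \lambda h(x) = \lambda h(x).$$
Let $T$ be the time of expiration of the contingent claim. For $0 \le t \le T$,
$$0 < \frac{\beta(t)}{\beta(T)} = \exp\left\{-\int_t^T r(u)\,du\right\} \le 1$$
and $S(T) \ge 0$, so
$$h\left(\frac{\beta(t)}{\beta(T)}S(T)\right) \le \frac{\beta(t)}{\beta(T)}h(S(T)). \tag{*}$$
Consider a European contingent claim paying $h(S(T))$ at time $T$. The value of this claim at time $t \in [0,T]$ is
$$X(t) = \beta(t)\,\mathbb{E}\left[\frac{1}{\beta(T)}h(S(T)) \,\middle|\, \mathcal{F}(t)\right].$$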
Figure 26.1: Convex payoff function (the graph of $h$ against $x$, with the point $(x, h(x))$ and the values $h(x)$, $\lambda h(x)$ and $h(\lambda x)$ marked).
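Therefore,
$$\begin{aligned}
\frac{X(t)}{\beta(t)} &= \frac{1}{\beta(t)}\mathbb{E}\left[\frac{\beta(t)}{\beta(T)}h(S(T)) \,\middle|\, \mathcal{F}(t)\right] \\
&\ge \frac{1}{\beta(t)}\mathbb{E}\left[h\left(\frac{\beta(t)}{\beta(T)}S(T)\right) \,\middle|\, \mathcal{F}(t)\right] && \text{(by (*))} \\
&\ge \frac{1}{\beta(t)}h\left(\beta(t)\,\mathbb{E}\left[\frac{S(T)}{\beta(T)} \,\middle|\, \mathcal{F}(t)\right]\right) && \text{(Jensen's inequality)} \\
&= \frac{1}{\beta(t)}h\left(\beta(t)\frac{S(t)}{\beta(t)}\right) && \left(\tfrac{S}{\beta} \text{ is a martingale}\right) \\
&= \frac{1}{\beta(t)}h(S(t)).
\end{aligned}$$
This shows that the value $X(t)$ of the European contingent claim dominates the intrinsic value $h(S(t))$ of the American claim. In fact, except in degenerate cases, the inequality
$$X(t) \ge h(S(t)), \quad 0 \le t \le T,$$
is strict, i.e., the American claim should not be exercised prior to expiration.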
26.2
Dividend paying stock
Let $r$ and $\sigma$ be constant, let $\delta$ be a "dividend coefficient" satisfying
$$0 < \delta < 1.$$
Let $T > 0$ be an expiration time, and let $t_1 \in (0,T)$ be the time of dividend payment. The stock price is given by
$$S(t) = \begin{cases} S(0)\exp\left\{\left(r - \tfrac{1}{2}\sigma^2\right)t + \sigma B(t)\right\}, & 0 \le t \le t_1, \\ (1-\delta)S(t_1)\exp\left\{\left(r - \tfrac{1}{2}\sigma^2\right)(t - t_1) + \sigma\left(B(t) - B(t_1)\right)\right\}, & t_1 < t \le T. \end{cases}$$
Consider an American call on this stock. At times $t \in (t_1, T)$, it is not optimal to exercise, so the value of the call is given by the usual Black-Scholes formula
$$v(t,x) = xN(d_+(T-t,x)) - Ke^{-r(T-t)}N(d_-(T-t,x)), \quad t_1 < t \le T,$$
where
$$d_\pm(T-t,x) = \frac{1}{\sigma\sqrt{T-t}}\left[\log\frac{x}{K} + (T-t)\left(r \pm \frac{\sigma^2}{2}\right)\right].$$
At time $t_1$, immediately after payment of the dividend, the value of the call is
$$v(t_1, (1-\delta)S(t_1)).$$
At time $t_1$, immediately before payment of the dividend, the value of the call is
$$w(t_1, S(t_1)),$$
where
$$w(t_1,x) = \max\left\{(x-K)^+,\ v(t_1,(1-\delta)x)\right\}.$$
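The pre-dividend value is the larger of the exercise value and the post-dividend Black-Scholes value evaluated at the reduced stock price. The sketch below is only an illustration: the function names, the convention that `tau` denotes the time remaining after the dividend date, and the sample parameters are assumptions.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_call(tau, x, K, r, sigma):
    """Black-Scholes call value with time to expiration tau > 0."""
    if x <= 0.0:
        return 0.0
    d_plus = (math.log(x / K) + tau * (r + 0.5 * sigma**2)) / (sigma * math.sqrt(tau))
    d_minus = (math.log(x / K) + tau * (r - 0.5 * sigma**2)) / (sigma * math.sqrt(tau))
    return x * norm_cdf(d_plus) - K * math.exp(-r * tau) * norm_cdf(d_minus)


def w_at_dividend(x, K, r, sigma, delta, tau):
    """Return (value just before the dividend, whether exercise is at least
    as good as continuing); tau is the time left after the dividend date."""
    exercise = max(x - K, 0.0)
    continuation = bs_call(tau, (1.0 - delta) * x, K, r, sigma)
    return max(exercise, continuation), exercise >= continuation


# Assumed sample parameters: 10% dividend, half a year left after the dividend.
K, r, sigma, delta, tau = 100.0, 0.05, 0.2, 0.1, 0.5
value_deep, exercise_deep = w_at_dividend(200.0, K, r, sigma, delta, tau)
value_atm, exercise_atm = w_at_dividend(100.0, K, r, sigma, delta, tau)
```

For the deep in-the-money price the dividend removes more value than the remaining optionality is worth, so exercising just before the dividend is preferable; at the money the exercise value is zero and continuing is preferable.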
Theorem 2.65 For $0 \le t \le t_1$, the value of the American call is $w(t,S(t))$, where
$$w(t,x) = \mathbb{E}^{t,x}\left[e^{-r(t_1 - t)}w(t_1, S(t_1))\right].$$
This function satisfies the usual Black-Scholes equation
$$-rw + w_t + rxw_x + \tfrac{1}{2}\sigma^2x^2w_{xx} = 0, \quad 0 \le t \le t_1,\ x \ge 0,$$
(where $w = w(t,x)$) with terminal condition
$$w(t_1,x) = \max\left\{(x-K)^+,\ v(t_1,(1-\delta)x)\right\}, \quad x \ge 0,$$
and boundary condition
$$w(t,0) = 0, \quad 0 \le t \le T.$$
The hedging portfolio is
$$\Delta(t) = \begin{cases} w_x(t,S(t)), & 0 \le t \le t_1, \\ v_x(t,S(t)), & t_1 < t \le T. \end{cases}$$
Proof: We only need to show that an American contingent claim with payoff $w(t_1, S(t_1))$ at time $t_1$ need not be exercised before time $t_1$. According to Theorem 1.64, it suffices to prove
1. $w(t_1, 0) = 0$,
2. $w(t_1, x)$ is convex in $x$.
Since $v(t_1, 0) = 0$, we have immediately that
$$w(t_1, 0) = \max\left\{(0-K)^+,\ v(t_1,(1-\delta)0)\right\} = 0.$$
To prove that $w(t_1,x)$ is convex in $x$, we need to show that $v(t_1,(1-\delta)x)$ is convex in $x$. Obviously, $(x-K)^+$ is convex in $x$, and the maximum of two convex functions is convex. The proof of the convexity of $v(t_1,(1-\delta)x)$ in $x$ is left as a homework problem.
26.3
Hedging at time $t_1$
Let $x = S(t_1)$.
Case I: $v(t_1,(1-\delta)x) \ge (x-K)^+$.
The option need not be exercised at time $t_1$ (should not be exercised if the inequality is strict). We have
$$w(t_1,x) = v(t_1,(1-\delta)x),$$
$$\Delta(t_1) = w_x(t_1,x) = (1-\delta)v_x(t_1,(1-\delta)x) = (1-\delta)\Delta(t_1+),$$
where
$$\Delta(t_1+) = \lim_{t \downarrow t_1}\Delta(t)$$
is the number of shares of stock held by the hedge immediately after payment of the dividend. The post-dividend position can be achieved by reinvesting in stock the dividends received on the stock held in the hedge. Indeed,
$$\begin{aligned}
\Delta(t_1+) &= \frac{1}{1-\delta}\Delta(t_1) = \Delta(t_1) + \frac{\delta}{1-\delta}\Delta(t_1) \\
&= \Delta(t_1) + \frac{\delta\Delta(t_1)S(t_1)}{(1-\delta)S(t_1)} \\
&= \text{\# of shares held when dividend is paid} + \frac{\text{dividends received}}{\text{price per share when dividend is reinvested}}.
\end{aligned}$$
Case II: $v(t_1,(1-\delta)x) < (x-K)^+$.
The owner of the option should exercise before the dividend payment at time $t_1$ and receive $(x-K)$. The hedge has been constructed so the seller of the option has $x - K$ before the dividend payment at time $t_1$. If the option is not exercised, its value drops from $x - K$ to $v(t_1,(1-\delta)x)$, and the seller of the option can pocket the difference and continue the hedge.
Chapter 27
Bonds, forward contracts and futures
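Let $\{W(t), \mathcal{F}(t);\ 0 \le t \le T\}$ be a Brownian motion (Wiener process) on some $(\Omega, \mathcal{F}, \mathbb{P})$. Consider an asset, which we call a stock, whose price satisfies
\[ dS(t) = r(t)S(t)\,dt + \sigma(t)S(t)\,dW(t). \]
Here, $r$ and $\sigma$ are adapted processes, and we have already switched to the risk-neutral measure, which we call $\mathbb{P}$. Assume that every martingale under $\mathbb{P}$ can be represented as an integral with respect to $W$.
Define the accumulation factor
\[ \beta(t) = \exp\left\{\int_0^t r(u)\,du\right\}. \]
A zero-coupon bond, maturing at time $T$, pays 1 at time $T$ and nothing before time $T$. According to the risk-neutral pricing formula, its value at time $t \in [0,T]$ is
\begin{align*}
B(t,T) &= \beta(t)\,\mathbb{E}\left[\frac{1}{\beta(T)}\,\Big|\,\mathcal{F}(t)\right] \\
&= \mathbb{E}\left[\frac{\beta(t)}{\beta(T)}\,\Big|\,\mathcal{F}(t)\right] \\
&= \mathbb{E}\left[\exp\left\{-\int_t^T r(u)\,du\right\}\Big|\,\mathcal{F}(t)\right].
\end{align*}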
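Given $B(t,T)$ dollars at time $t$, one can construct a portfolio of investment in the stock and money market so that the portfolio value at time $T$ is 1 almost surely. Indeed, for some process $\gamma$,
\begin{align*}
B(t,T) &= \beta(t)\,\underbrace{\mathbb{E}\left[\frac{1}{\beta(T)}\,\Big|\,\mathcal{F}(t)\right]}_{\text{martingale}} \\
&= \beta(t)\left[\mathbb{E}\left(\frac{1}{\beta(T)}\right) + \int_0^t \gamma(u)\,dW(u)\right] \\
&= \beta(t)\left[B(0,T) + \int_0^t \gamma(u)\,dW(u)\right], \\
dB(t,T) &= r(t)\beta(t)\left[B(0,T) + \int_0^t \gamma(u)\,dW(u)\right]dt + \beta(t)\gamma(t)\,dW(t) \\
&= r(t)B(t,T)\,dt + \beta(t)\gamma(t)\,dW(t).
\end{align*}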
The value of a portfolio satisfies
\begin{align*}
dX(t) &= \Delta(t)\,dS(t) + r(t)\left[X(t) - \Delta(t)S(t)\right]dt \\
&= r(t)X(t)\,dt + \Delta(t)\sigma(t)S(t)\,dW(t). \tag{*}
\end{align*}
We set
\[ \Delta(t) = \frac{\beta(t)\gamma(t)}{\sigma(t)S(t)}. \]
If, at any time $t$, $X(t) = B(t,T)$ and we use the portfolio $\Delta(u),\ t \le u \le T$, then we will have
\[ X(T) = B(T,T) = 1. \]
If $r(t)$ is nonrandom for all $t$, then
\[ B(t,T) = \exp\left\{-\int_t^T r(u)\,du\right\}, \]
\[ dB(t,T) = r(t)B(t,T)\,dt, \]
i.e., $\gamma = 0$. Then $\Delta$ given above is zero. If, at time $t$, you are given $B(t,T)$ dollars and you always invest only in the money market, then at time $T$ you will have
\[ B(t,T)\exp\left\{\int_t^T r(u)\,du\right\} = 1. \]
If $r(t)$ is random for all $t$, then $\gamma$ is not zero. One generally has three different instruments: the stock, the money market, and the zero-coupon bond. Any two of them are sufficient for hedging, and the two which are most convenient can depend on the instrument being hedged.
27.1
Forward contracts
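We continue with the set-up for zero-coupon bonds. The $T$-forward price $F(t)$ of the stock at time $t \in [0,T]$ is the $\mathcal{F}(t)$-measurable price, agreed upon at time $t$, for purchase of a share of stock at time $T$, chosen so the forward contract has value zero at time $t$. In other words,
\[ \mathbb{E}\left[\frac{1}{\beta(T)}\left(S(T) - F(t)\right)\Big|\,\mathcal{F}(t)\right] = 0, \quad 0 \le t \le T. \]
We solve for $F(t)$:
\begin{align*}
0 &= \mathbb{E}\left[\frac{1}{\beta(T)}\left(S(T) - F(t)\right)\Big|\,\mathcal{F}(t)\right] \\
&= \mathbb{E}\left[\frac{S(T)}{\beta(T)}\,\Big|\,\mathcal{F}(t)\right] - \frac{F(t)}{\beta(t)}\,\mathbb{E}\left[\frac{\beta(t)}{\beta(T)}\,\Big|\,\mathcal{F}(t)\right] \\
&= \frac{S(t)}{\beta(t)} - \frac{F(t)}{\beta(t)}\,B(t,T).
\end{align*}
This implies that
\[ F(t) = \frac{S(t)}{B(t,T)}. \]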
Remark 27.1 (Value vs. Forward price) The $T$-forward price $F(t)$ is not the value at time $t$ of the forward contract. The value of the contract at time $t$ is zero. $F(t)$ is the price agreed upon at time $t$ which will be paid for the stock at time $T$.
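The forward-price formula can be illustrated numerically. The sketch below assumes, purely for illustration, a constant interest rate, so that the zero-coupon bond price is $e^{-r(T-t)}$; the function names and the numbers are hypothetical and not part of the notes.

```python
import math

def zero_coupon_price(r, t, T):
    """Price at time t of a zero paying 1 at T, ASSUMING a constant rate r."""
    return math.exp(-r * (T - t))

def forward_price(spot, bond_price):
    """T-forward price: spot price divided by the T-maturity zero price."""
    return spot / bond_price

# Illustrative numbers only.
spot, r, t, T = 100.0, 0.05, 0.0, 1.0
B = zero_coupon_price(r, t, T)
F = forward_price(spot, B)

# Value at time t of a contract struck at F: S(t) - F * B(t,T).
# It is zero by construction, while F itself is not zero.
value_at_inception = spot - F * B
```

The distinction made in Remark 27.1 shows up directly: `F` is about 105.13 here, while `value_at_inception` vanishes.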
27.2
Hedging a forward contract
Enter a forward contract at time 0, i.e., agree to pay $F(0) = \frac{S(0)}{B(0,T)}$ for a share of stock at time $T$. At time zero, this contract has value 0. At later times, however, it does not. In fact, its value at time $t \in [0,T]$ is
\begin{align*}
V(t) &= \beta(t)\,\mathbb{E}\left[\frac{1}{\beta(T)}\left(S(T) - F(0)\right)\Big|\,\mathcal{F}(t)\right] \\
&= \beta(t)\,\mathbb{E}\left[\frac{S(T)}{\beta(T)}\,\Big|\,\mathcal{F}(t)\right] - F(0)\,\mathbb{E}\left[\frac{\beta(t)}{\beta(T)}\,\Big|\,\mathcal{F}(t)\right] \\
&= \beta(t)\,\frac{S(t)}{\beta(t)} - F(0)B(t,T) \\
&= S(t) - F(0)B(t,T).
\end{align*}
This suggests the following hedge of a short position in the forward contract. At time 0, short $F(0)$ $T$-maturity zero-coupon bonds. This generates income
\[ F(0)B(0,T) = \frac{S(0)}{B(0,T)}\,B(0,T) = S(0). \]
Buy one share of stock. This portfolio requires no initial investment. Maintain this position until time $T$, when the portfolio is worth
\[ S(T) - F(0)B(T,T) = S(T) - F(0). \]
Deliver the share of stock and receive payment $F(0)$.
A short position in the forward could also be hedged using the stock and money market, but the implementation of this hedge would require a term-structure model.
27.3
Future contracts
Future contracts are designed to remove the risk of default inherent in forward contracts. Through
the device of marking to market, the value of the future contract is maintained at zero at all times.
Thus, either party can close out his/her position at any time.
Let us first consider the situation with discrete trading dates
\[ 0 = t_0 < t_1 < \dots < t_n = T. \]
On each $[t_j, t_{j+1})$, $r$ is constant, so
\begin{align*}
\beta(t_{k+1}) &= \exp\left\{\int_0^{t_{k+1}} r(u)\,du\right\} \\
&= \exp\left\{\sum_{j=0}^{k} r(t_j)(t_{j+1} - t_j)\right\}
\end{align*}
is $\mathcal{F}(t_k)$-measurable.
Enter a future contract at time $t_k$, taking the long position, when the future price is $\phi(t_k)$. At time $t_{k+1}$, when the future price is $\phi(t_{k+1})$, you receive a payment $\phi(t_{k+1}) - \phi(t_k)$. (If the price has fallen, you make the payment $-(\phi(t_{k+1}) - \phi(t_k))$.) The mechanism for receiving and making these payments is the margin account held by the broker.
By time $T = t_n$, you have received the sequence of payments
\[ \phi(t_{k+1}) - \phi(t_k),\ \phi(t_{k+2}) - \phi(t_{k+1}),\ \dots,\ \phi(t_n) - \phi(t_{n-1}) \]
at times $t_{k+1}, t_{k+2}, \dots, t_n$. The value at time $t = t_k$ of this sequence is
\[ \beta(t)\,\mathbb{E}\left[\sum_{j=k}^{n-1} \frac{1}{\beta(t_{j+1})}\left(\phi(t_{j+1}) - \phi(t_j)\right)\Big|\,\mathcal{F}(t)\right]. \]
Because it costs nothing to enter the future contract at time $t$, this expression must be zero almost surely.
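The continuous-time version of this condition is
\[ \beta(t)\,\mathbb{E}\left[\int_t^T \frac{1}{\beta(u)}\,d\phi(u)\,\Big|\,\mathcal{F}(t)\right] = 0, \quad 0 \le t \le T. \]
Note that $\beta(t_{j+1})$ appearing in the discrete-time version is $\mathcal{F}(t_j)$-measurable, as it should be when approximating a stochastic integral.
Definition 27.1 The $T$-future price of the stock is any $\mathcal{F}(t)$-adapted stochastic process
\[ \{\phi(t);\ 0 \le t \le T\}, \]
satisfying
(a) $\phi(T) = S(T)$ a.s., and
(b) $\mathbb{E}\left[\int_t^T \frac{1}{\beta(u)}\,d\phi(u)\,\Big|\,\mathcal{F}(t)\right] = 0, \quad 0 \le t \le T.$
Theorem 3.66 The unique process satisfying (a) and (b) is
\[ \phi(t) = \mathbb{E}\left[S(T)\,|\,\mathcal{F}(t)\right], \quad 0 \le t \le T. \]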
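Proof: We first show that (b) holds if and only if $\phi$ is a martingale. If $\phi$ is a martingale, then $\int_0^t \frac{1}{\beta(u)}\,d\phi(u)$ is also a martingale, so
\begin{align*}
\mathbb{E}\left[\int_t^T \frac{1}{\beta(u)}\,d\phi(u)\,\Big|\,\mathcal{F}(t)\right]
&= \mathbb{E}\left[\int_0^T \frac{1}{\beta(u)}\,d\phi(u)\,\Big|\,\mathcal{F}(t)\right] - \int_0^t \frac{1}{\beta(u)}\,d\phi(u) \\
&= 0.
\end{align*}
On the other hand, if (b) holds, then the martingale
\[ M(t) = \mathbb{E}\left[\int_0^T \frac{1}{\beta(u)}\,d\phi(u)\,\Big|\,\mathcal{F}(t)\right] \]
satisfies
\begin{align*}
M(t) &= \int_0^t \frac{1}{\beta(u)}\,d\phi(u) + \mathbb{E}\left[\int_t^T \frac{1}{\beta(u)}\,d\phi(u)\,\Big|\,\mathcal{F}(t)\right] \\
&= \int_0^t \frac{1}{\beta(u)}\,d\phi(u), \quad 0 \le t \le T.
\end{align*}
This implies
\[ dM(t) = \frac{1}{\beta(t)}\,d\phi(t), \qquad d\phi(t) = \beta(t)\,dM(t), \]
and so $\phi$ is a martingale (its differential has no $dt$ term).
Now define
\[ \phi(t) = \mathbb{E}\left[S(T)\,|\,\mathcal{F}(t)\right], \quad 0 \le t \le T. \]
Clearly (a) is satisfied. By the tower property, $\phi$ is a martingale, so (b) is also satisfied. Indeed, this is the only martingale satisfying (a).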
27.4
Cash flow from a future contract
With a forward contract, entered at time 0, the buyer agrees to pay $F(0)$ for an asset valued at $S(T)$. The only payment is at time $T$.
With a future contract, entered at time 0, the buyer receives a cash flow (which may at times be negative) between times 0 and $T$. If he still holds the contract at time $T$, then he pays $S(T)$ at time $T$ for an asset valued at $S(T)$. The cash flow received between times 0 and $T$ sums to
\[ \int_0^T d\phi(u) = \phi(T) - \phi(0) = S(T) - \phi(0). \]
Thus, if the future contract holder takes delivery at time $T$, he has paid a total of
\[ \left(\phi(0) - S(T)\right) + S(T) = \phi(0) \]
for an asset valued at $S(T)$.
27.5
Forward-future spread
Future price: $\phi(t) = \mathbb{E}\left[S(T)\,|\,\mathcal{F}(t)\right]$.
Forward price:
\[ F(t) = \frac{S(t)}{B(t,T)} = \frac{S(t)}{\beta(t)\,\mathbb{E}\left[\frac{1}{\beta(T)}\,\big|\,\mathcal{F}(t)\right]}. \]
Forward-future spread:
\begin{align*}
\phi(0) - F(0) &= \mathbb{E}[S(T)] - \frac{S(0)}{\mathbb{E}\left[\frac{1}{\beta(T)}\right]} \\
&= \frac{1}{\mathbb{E}\left[\frac{1}{\beta(T)}\right]}\left\{\mathbb{E}\left[\frac{1}{\beta(T)}\right]\mathbb{E}[S(T)] - \mathbb{E}\left[\frac{S(T)}{\beta(T)}\right]\right\}.
\end{align*}
If $\frac{1}{\beta(T)}$ and $S(T)$ are uncorrelated,
\[ \phi(0) = F(0). \]
If $\frac{1}{\beta(T)}$ and $S(T)$ are positively correlated, then
\[ \phi(0) \le F(0). \]
This is the case that a rise in stock price tends to occur with a fall in the interest rate. The owner of the future tends to receive income when the stock price rises, but invests it at a declining interest rate. If the stock price falls, the owner usually must make payments on the future contract. He withdraws from the money market to do this just as the interest rate rises. In short, the long position in the future is hurt by positive correlation between $\frac{1}{\beta(T)}$ and $S(T)$. The buyer of the future is compensated by a reduction of the future price below the forward price.
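The sign of the forward-future spread can be checked on a toy example. The sketch below assumes a hypothetical two-scenario risk-neutral model (probabilities, discount factors and stock prices are made up) and uses only the three expectations appearing in the spread formula.

```python
def future_and_forward(scenarios):
    """Return (future price, forward price) at time 0.

    scenarios: list of (probability, discount factor 1/beta(T), S(T))
    under the risk-neutral measure; the numbers below are illustrative.
    """
    e_disc = sum(p * d for p, d, s in scenarios)    # E[1/beta(T)] = B(0,T)
    e_stock = sum(p * s for p, d, s in scenarios)   # E[S(T)] = future price
    s0 = sum(p * d * s for p, d, s in scenarios)    # S(0) = E[S(T)/beta(T)]
    return e_stock, s0 / e_disc

# High stock price paired with a high discount factor (low rates):
positive = [(0.5, 0.97, 120.0), (0.5, 0.93, 80.0)]
# Same stock prices, nonrandom discount factor:
uncorrelated = [(0.5, 0.95, 120.0), (0.5, 0.95, 80.0)]

fut_pos, fwd_pos = future_and_forward(positive)
fut_unc, fwd_unc = future_and_forward(uncorrelated)
```

With positive correlation the future price (100) falls below the forward price (about 100.42); with a nonrandom discount factor the two coincide.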
27.6
Backwardation and contango
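Suppose
\[ dS(t) = \mu S(t)\,dt + \sigma S(t)\,dW(t). \]
Define $\theta = \frac{\mu - r}{\sigma}$, $\widetilde{W}(t) = \theta t + W(t)$,
\[ Z(T) = \exp\left\{-\theta W(T) - \tfrac{1}{2}\theta^2 T\right\}, \]
\[ \widetilde{\mathbb{P}}(A) = \int_A Z(T)\,d\mathbb{P}, \quad \forall A \in \mathcal{F}(T). \]
Then $\widetilde{W}$ is a Brownian motion under $\widetilde{\mathbb{P}}$, and
\[ dS(t) = rS(t)\,dt + \sigma S(t)\,d\widetilde{W}(t). \]
We have
\begin{align*}
\beta(t) &= e^{rt}, \\
S(t) &= S(0)\exp\left\{(\mu - \tfrac{1}{2}\sigma^2)t + \sigma W(t)\right\} \\
&= S(0)\exp\left\{(r - \tfrac{1}{2}\sigma^2)t + \sigma \widetilde{W}(t)\right\}.
\end{align*}
Because $\frac{1}{\beta(T)} = e^{-rT}$ is nonrandom, $S(T)$ and $\frac{1}{\beta(T)}$ are uncorrelated under $\widetilde{\mathbb{P}}$. Therefore,
\begin{align*}
\phi(t) &= \widetilde{\mathbb{E}}\left[S(T)\,|\,\mathcal{F}(t)\right] \\
&= F(t) \\
&= \frac{S(t)}{B(t,T)} = e^{r(T-t)}S(t).
\end{align*}
The expected future spot price of the stock under $\mathbb{P}$ is
\begin{align*}
\mathbb{E}S(T) &= S(0)e^{\mu T}\,\mathbb{E}\left[\exp\left\{-\tfrac{1}{2}\sigma^2 T + \sigma W(T)\right\}\right] \\
&= e^{\mu T}S(0).
\end{align*}
The future price at time 0 is
\[ \phi(0) = e^{rT}S(0). \]
If $\mu > r$, then $\phi(0) < \mathbb{E}S(T)$. This situation is called normal backwardation (see Hull). If $\mu < r$, then $\phi(0) > \mathbb{E}S(T)$. This is called contango.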
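The comparison between the future price and the expected spot price can be written out in a few lines. This is a minimal sketch for the constant-coefficient model of this section; the function names and the sample parameters are hypothetical.

```python
import math

def expected_spot(s0, mu, T):
    """Expected spot price at T under the market measure: S(0) e^{mu T}."""
    return s0 * math.exp(mu * T)

def future_price(s0, r, T):
    """Future price at time 0 with constant rate r: S(0) e^{r T}."""
    return s0 * math.exp(r * T)

def regime(mu, r):
    """Classify the market by comparing mean return mu with interest rate r."""
    if mu > r:
        return "normal backwardation"   # future price below expected spot
    if mu < r:
        return "contango"               # future price above expected spot
    return "neither"

s0, T = 100.0, 1.0          # illustrative numbers only
case_a = regime(0.08, 0.03)
case_b = regime(0.01, 0.03)
```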
Chapter 28
Term-structure models
Throughout this discussion, $\{W(t);\ 0 \le t \le T^*\}$ is a Brownian motion on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$, and $\{\mathcal{F}(t);\ 0 \le t \le T^*\}$ is the filtration generated by $W$.
Suppose we are given an adapted interest rate process $\{r(t);\ 0 \le t \le T^*\}$. We define the accumulation factor
\[ \beta(t) = \exp\left\{\int_0^t r(u)\,du\right\}, \quad 0 \le t \le T^*. \]
In a term-structure model, we take the zero-coupon bonds ("zeroes") of various maturities to be the primitive assets. We assume these bonds are default-free and pay \$1 at maturity. For $0 \le t \le T \le T^*$, let
\[ B(t,T) = \text{price at time } t \text{ of the zero-coupon bond paying \$1 at time } T. \]
Theorem 0.67 (Fundamental Theorem of Asset Pricing) A term-structure model is free of arbitrage if and only if there is a probability measure $\widetilde{\mathbb{P}}$ on $\Omega$ (a risk-neutral measure) with the same probability-zero sets as $\mathbb{P}$ (i.e., equivalent to $\mathbb{P}$), such that for each $T \in (0,T^*]$, the process
\[ \frac{B(t,T)}{\beta(t)}, \quad 0 \le t \le T, \]
is a martingale under $\widetilde{\mathbb{P}}$.
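Remark 28.1 We shall always have
\[ dB(t,T) = \mu(t,T)B(t,T)\,dt + \sigma(t,T)B(t,T)\,dW(t), \quad 0 \le t \le T, \]
for some functions $\mu(t,T)$ and $\sigma(t,T)$. Therefore
\begin{align*}
d\left(\frac{B(t,T)}{\beta(t)}\right) &= B(t,T)\,d\left(\frac{1}{\beta(t)}\right) + \frac{1}{\beta(t)}\,dB(t,T) \\
&= \left[\mu(t,T) - r(t)\right]\frac{B(t,T)}{\beta(t)}\,dt + \sigma(t,T)\frac{B(t,T)}{\beta(t)}\,dW(t),
\end{align*}
so $\mathbb{P}$ is a risk-neutral measure if and only if $\mu(t,T)$, the mean rate of return of $B(t,T)$ under $\mathbb{P}$, is the interest rate $r(t)$. If the mean rate of return of $B(t,T)$ under $\mathbb{P}$ is not $r(t)$ at each time $t$ and for each maturity $T$, we should change to a measure $\widetilde{\mathbb{P}}$ under which the mean rate of return is $r(t)$. If such a measure does not exist, then the model admits an arbitrage by trading in zero-coupon bonds.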
28.1
Computing arbitrage-free bond prices: first method
Begin with a stochastic differential equation (SDE)
\[ dX(t) = a(t,X(t))\,dt + b(t,X(t))\,dW(t). \]
The solution $X(t)$ is the factor. If we want to have $n$ factors, we let $W$ be an $n$-dimensional Brownian motion and let $X$ be an $n$-dimensional process. We let the interest rate $r(t)$ be a function of $X(t)$. In the usual one-factor models, we take $r(t)$ to be $X(t)$ (e.g., Cox-Ingersoll-Ross, Hull-White).
Now that we have an interest rate process $\{r(t);\ 0 \le t \le T^*\}$, we define the zero-coupon bond prices to be
\begin{align*}
B(t,T) &= \beta(t)\,\mathbb{E}\left[\frac{1}{\beta(T)}\,\Big|\,\mathcal{F}(t)\right] \\
&= \mathbb{E}\left[\exp\left\{-\int_t^T r(u)\,du\right\}\Big|\,\mathcal{F}(t)\right], \quad 0 \le t \le T \le T^*.
\end{align*}
We showed in Chapter 27 that
\[ dB(t,T) = r(t)B(t,T)\,dt + \beta(t)\gamma(t)\,dW(t) \]
for some process $\gamma$. Since $B(t,T)$ has mean rate of return $r(t)$ under $\mathbb{P}$, $\mathbb{P}$ is a risk-neutral measure and there is no arbitrage.
28.2
Some interest-rate dependent assets
Coupon-paying bond: Payments $P_1, P_2, \dots, P_n$ at times $T_1, T_2, \dots, T_n$. Price at time $t$ is
\[ \sum_{\{k:\ t < T_k\}} P_k\,B(t,T_k). \]
Call option on a zero-coupon bond: Bond matures at time $T$. Option expires at time $T_1 < T$. Price at time $t$ is
\[ \beta(t)\,\mathbb{E}\left[\frac{1}{\beta(T_1)}\left(B(T_1,T) - K\right)^+\Big|\,\mathcal{F}(t)\right], \quad 0 \le t \le T_1. \]
28.3
Terminology
Definition 28.1 (Term-structure model) Any mathematical model which determines, at least theoretically, the stochastic processes
\[ B(t,T), \quad 0 \le t \le T, \]
for all $T \in (0,T^*]$.
Definition 28.2 (Yield to maturity) For $0 \le t < T \le T^*$, the yield to maturity $Y(t,T)$ is the $\mathcal{F}(t)$-measurable random variable satisfying
\[ B(t,T)\exp\left\{(T-t)\,Y(t,T)\right\} = 1, \]
or equivalently,
\[ Y(t,T) = -\frac{1}{T-t}\log B(t,T). \]
Determining
\[ B(t,T), \quad 0 \le t \le T \le T^*, \]
is equivalent to determining
\[ Y(t,T), \quad 0 \le t < T \le T^*. \]
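The equivalence between bond prices and yields is a one-line change of variables in each direction. The following sketch shows the round trip; the function names and the sample price are illustrative assumptions.

```python
import math

def yield_to_maturity(bond_price, t, T):
    """Yield Y(t,T) solving B(t,T) * exp((T - t) * Y) = 1; requires t < T."""
    return -math.log(bond_price) / (T - t)

def bond_price_from_yield(y, t, T):
    """Inverse map: B(t,T) = exp(-(T - t) * Y(t,T))."""
    return math.exp(-(T - t) * y)

# Illustrative: a 2-year zero observed at 0.92 today.
y = yield_to_maturity(0.92, 0.0, 2.0)
b = bond_price_from_yield(y, 0.0, 2.0)   # recovers the observed price
```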
28.4
Forward rate agreement
Let $0 \le t \le T < T + \epsilon \le T^*$ be given. Suppose you want to borrow \$1 at time $T$ with repayment (plus interest) at time $T+\epsilon$, at an interest rate agreed upon at time $t$. To synthesize a forward-rate agreement to do this, at time $t$ buy a $T$-maturity zero and short $\frac{B(t,T)}{B(t,T+\epsilon)}$ $(T+\epsilon)$-maturity zeroes. The value of this portfolio at time $t$ is
\[ B(t,T) - \frac{B(t,T)}{B(t,T+\epsilon)}\,B(t,T+\epsilon) = 0. \]
At time $T$, you receive \$1 from the $T$-maturity zero. At time $T+\epsilon$, you pay \$$\frac{B(t,T)}{B(t,T+\epsilon)}$. The effective interest rate on the dollar you receive at time $T$ is $R(t,T,T+\epsilon)$ given by
\[ \frac{B(t,T)}{B(t,T+\epsilon)} = \exp\left\{\epsilon\,R(t,T,T+\epsilon)\right\}, \]
or equivalently,
\[ R(t,T,T+\epsilon) = -\frac{\log B(t,T+\epsilon) - \log B(t,T)}{\epsilon}. \]
The forward rate is
\[ f(t,T) = \lim_{\epsilon \downarrow 0} R(t,T,T+\epsilon) = -\frac{\partial}{\partial T}\log B(t,T). \tag{4.1} \]
This is the instantaneous interest rate, agreed upon at time $t$, for money borrowed at time $T$.
Integrating the above equation, we obtain
\begin{align*}
\int_t^T f(t,u)\,du &= -\int_t^T \frac{\partial}{\partial u}\log B(t,u)\,du \\
&= -\log B(t,u)\Big|_{u=t}^{u=T} \\
&= -\log B(t,T),
\end{align*}
so
\[ B(t,T) = \exp\left\{-\int_t^T f(t,u)\,du\right\}. \]
You can agree at time $t$ to receive interest rate $f(t,u)$ at each time $u \in [t,T]$. If you invest \$$B(t,T)$ at time $t$ and receive interest rate $f(t,u)$ at each time $u$ between $t$ and $T$, this will grow to
\[ B(t,T)\exp\left\{\int_t^T f(t,u)\,du\right\} = 1 \]
at time $T$.
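The passage between bond prices and forward rates can be sketched numerically. The example below assumes a hypothetical initial curve with log-price $-(0.03\,T + 0.005\,T^2)$, for which the forward rate is $0.03 + 0.01\,T$; the helper names, the step size and the midpoint rule are illustrative choices, not part of the notes.

```python
import math

def fra_rate(b_T, b_T_eps, eps):
    """Rate locked in at t for borrowing over [T, T+eps], from two zero prices."""
    return -(math.log(b_T_eps) - math.log(b_T)) / eps

def bond_from_forwards(forward, t, T, n=1000):
    """Midpoint-rule approximation of exp(-int_t^T forward(u) du)."""
    step = (T - t) / n
    total = sum(forward(t + (i + 0.5) * step) for i in range(n)) * step
    return math.exp(-total)

def B0(T):
    """Hypothetical time-zero zero-coupon curve (an assumption)."""
    return math.exp(-(0.03 * T + 0.005 * T * T))

eps = 1e-6
f_at_2 = fra_rate(B0(2.0), B0(2.0 + eps), eps)          # close to 0.03 + 0.01*2
b_rebuilt = bond_from_forwards(lambda u: 0.03 + 0.01 * u, 0.0, 2.0)
```

A small `eps` approximates the limit in (4.1); integrating the forward curve back recovers the bond price.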
28.5
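Recovering the interest rate $r(t)$ from the forward rate
\begin{align*}
B(t,T) &= \mathbb{E}\left[\exp\left\{-\int_t^T r(u)\,du\right\}\Big|\,\mathcal{F}(t)\right], \\
\frac{\partial}{\partial T}B(t,T) &= \mathbb{E}\left[-r(T)\exp\left\{-\int_t^T r(u)\,du\right\}\Big|\,\mathcal{F}(t)\right], \\
\frac{\partial}{\partial T}B(t,T)\Big|_{T=t} &= \mathbb{E}\left[-r(t)\,|\,\mathcal{F}(t)\right] = -r(t).
\end{align*}
On the other hand,
\begin{align*}
B(t,T) &= \exp\left\{-\int_t^T f(t,u)\,du\right\}, \\
\frac{\partial}{\partial T}B(t,T) &= -f(t,T)\exp\left\{-\int_t^T f(t,u)\,du\right\}, \\
\frac{\partial}{\partial T}B(t,T)\Big|_{T=t} &= -f(t,t).
\end{align*}
Conclusion: $r(t) = f(t,t)$.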
28.6
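Computing arbitrage-free bond prices: Heath-Jarrow-Morton method
For each $T \in (0,T^*]$, let the forward rate be given by
\[ f(t,T) = f(0,T) + \int_0^t \alpha(u,T)\,du + \int_0^t \sigma(u,T)\,dW(u), \quad 0 \le t \le T. \]
Here $\{\alpha(u,T);\ 0 \le u \le T\}$ and $\{\sigma(u,T);\ 0 \le u \le T\}$ are adapted processes.
In other words,
\[ df(t,T) = \alpha(t,T)\,dt + \sigma(t,T)\,dW(t). \]
Recall that
\[ B(t,T) = \exp\left\{-\int_t^T f(t,u)\,du\right\}. \]
Now
\begin{align*}
d\left\{-\int_t^T f(t,u)\,du\right\} &= f(t,t)\,dt - \int_t^T df(t,u)\,du \\
&= r(t)\,dt - \int_t^T \left[\alpha(t,u)\,dt + \sigma(t,u)\,dW(t)\right]du \\
&= r(t)\,dt - \underbrace{\left[\int_t^T \alpha(t,u)\,du\right]}_{\alpha^*(t,T)}dt - \underbrace{\left[\int_t^T \sigma(t,u)\,du\right]}_{\sigma^*(t,T)}dW(t) \\
&= r(t)\,dt - \alpha^*(t,T)\,dt - \sigma^*(t,T)\,dW(t).
\end{align*}
Let
\[ g(x) = e^x, \quad g'(x) = e^x, \quad g''(x) = e^x. \]
Then
\[ B(t,T) = g\left(-\int_t^T f(t,u)\,du\right), \]
and
\begin{align*}
dB(t,T) &= dg\left(-\int_t^T f(t,u)\,du\right) \\
&= g'\left(-\int_t^T f(t,u)\,du\right)\left(r\,dt - \alpha^*\,dt - \sigma^*\,dW\right) + \tfrac{1}{2}\,g''\left(-\int_t^T f(t,u)\,du\right)(\sigma^*)^2\,dt \\
&= B(t,T)\left[r(t) - \alpha^*(t,T) + \tfrac{1}{2}\left(\sigma^*(t,T)\right)^2\right]dt - \sigma^*(t,T)B(t,T)\,dW(t).
\end{align*}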
28.7
Checking for absence of arbitrage
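$\mathbb{P}$ is a risk-neutral measure if and only if
\[ \alpha^*(t,T) = \tfrac{1}{2}\left(\sigma^*(t,T)\right)^2, \quad 0 \le t \le T \le T^*, \]
i.e.,
\[ \int_t^T \alpha(t,u)\,du = \frac{1}{2}\left(\int_t^T \sigma(t,u)\,du\right)^2, \quad 0 \le t \le T \le T^*. \tag{7.1} \]
Differentiating this w.r.t. $T$, we obtain
\[ \alpha(t,T) = \sigma(t,T)\int_t^T \sigma(t,u)\,du, \quad 0 \le t \le T \le T^*. \tag{7.2} \]
Not only does (7.1) imply (7.2), (7.2) also implies (7.1). This will be a homework problem.
Suppose (7.1) does not hold. Then $\mathbb{P}$ is not a risk-neutral measure, but there might still be a risk-neutral measure. Let $\{\theta(t);\ 0 \le t \le T^*\}$ be an adapted process, and define
\begin{align*}
\widetilde{W}(t) &= \int_0^t \theta(u)\,du + W(t), \\
Z(t) &= \exp\left\{-\int_0^t \theta(u)\,dW(u) - \frac{1}{2}\int_0^t \theta^2(u)\,du\right\}, \\
\widetilde{\mathbb{P}}(A) &= \int_A Z(T^*)\,d\mathbb{P}, \quad \forall A \in \mathcal{F}(T^*).
\end{align*}
Then
\begin{align*}
dB(t,T) &= B(t,T)\left[r(t) - \alpha^*(t,T) + \tfrac{1}{2}\left(\sigma^*(t,T)\right)^2\right]dt - \sigma^*(t,T)B(t,T)\,dW(t) \\
&= B(t,T)\left[r(t) - \alpha^*(t,T) + \tfrac{1}{2}\left(\sigma^*(t,T)\right)^2 + \sigma^*(t,T)\theta(t)\right]dt - \sigma^*(t,T)B(t,T)\,d\widetilde{W}(t), \quad 0 \le t \le T.
\end{align*}
In order for $B(t,T)$ to have mean rate of return $r(t)$ under $\widetilde{\mathbb{P}}$, we must have
\[ \alpha^*(t,T) = \tfrac{1}{2}\left(\sigma^*(t,T)\right)^2 + \sigma^*(t,T)\theta(t), \quad 0 \le t \le T \le T^*. \tag{7.3} \]
Differentiation w.r.t. $T$ yields the equivalent condition
\[ \alpha(t,T) = \sigma(t,T)\sigma^*(t,T) + \sigma(t,T)\theta(t), \quad 0 \le t \le T \le T^*. \tag{7.4} \]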
Theorem 7.68 (Heath-Jarrow-Morton) For each $T \in (0,T^*]$, let $\alpha(u,T),\ 0 \le u \le T$, and $\sigma(u,T),\ 0 \le u \le T$, be adapted processes, and assume $\sigma(u,T) > 0$ for all $u$ and $T$. Let $f(0,T),\ 0 \le T \le T^*$, be a deterministic function, and define
\[ f(t,T) = f(0,T) + \int_0^t \alpha(u,T)\,du + \int_0^t \sigma(u,T)\,dW(u). \]
Then $f(t,T),\ 0 \le t \le T \le T^*$, is a family of forward rate processes for a term-structure model without arbitrage if and only if there is an adapted process $\theta(t),\ 0 \le t \le T^*$, satisfying (7.3), or equivalently, satisfying (7.4).
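Remark 28.2 Under $\mathbb{P}$, the zero-coupon bond with maturity $T$ has mean rate of return
\[ r(t) - \alpha^*(t,T) + \tfrac{1}{2}\left(\sigma^*(t,T)\right)^2 \]
and volatility $\sigma^*(t,T)$. The excess mean rate of return, above the interest rate, is
\[ -\alpha^*(t,T) + \tfrac{1}{2}\left(\sigma^*(t,T)\right)^2, \]
and when normalized by the volatility, this becomes the market price of risk
\[ \frac{-\alpha^*(t,T) + \tfrac{1}{2}\left(\sigma^*(t,T)\right)^2}{\sigma^*(t,T)}. \]
The no-arbitrage condition is that this market price of risk at time $t$ does not depend on the maturity $T$ of the bond. We can then set
\[ \theta(t) = -\left[\frac{-\alpha^*(t,T) + \tfrac{1}{2}\left(\sigma^*(t,T)\right)^2}{\sigma^*(t,T)}\right], \]
and (7.3) is satisfied.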
(The remainder of this chapter was taught Mar 21)
Suppose the market price of risk does not depend on the maturity $T$, so we can solve (7.3) for $\theta$. Plugging this into the stochastic differential equation for $B(t,T)$, we obtain for every maturity $T$:
\[ dB(t,T) = r(t)B(t,T)\,dt - \sigma^*(t,T)B(t,T)\,d\widetilde{W}(t). \]
Because (7.4) is equivalent to (7.3), we may plug (7.4) into the stochastic differential equation for $f(t,T)$ to obtain, for every maturity $T$:
\begin{align*}
df(t,T) &= \left[\sigma(t,T)\sigma^*(t,T) + \sigma(t,T)\theta(t)\right]dt + \sigma(t,T)\,dW(t) \\
&= \sigma(t,T)\sigma^*(t,T)\,dt + \sigma(t,T)\,d\widetilde{W}(t).
\end{align*}
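Under the risk-neutral measure the forward-rate drift is fixed by the volatility alone: it is the volatility times its integral over maturities from $t$ to $T$. The sketch below evaluates this drift by a midpoint rule for a hypothetical exponentially decaying volatility; the volatility specification, the parameter values and the function names are assumptions made for illustration.

```python
import math

def sigma_star(sigma, t, T, n=2000):
    """Midpoint-rule approximation of int_t^T sigma(t, u) du."""
    step = (T - t) / n
    return sum(sigma(t, t + (i + 0.5) * step) for i in range(n)) * step

def hjm_drift(sigma, t, T):
    """Risk-neutral forward-rate drift: sigma(t,T) * sigma_star(t,T)."""
    return sigma(t, T) * sigma_star(sigma, t, T)

# Hypothetical volatility: s * exp(-a * (T - t)).
s, a = 0.01, 0.1

def vol(t, T):
    return s * math.exp(-a * (T - t))

drift = hjm_drift(vol, 1.0, 3.0)
# Closed form for this choice: sigma_star = s * (1 - exp(-a (T - t))) / a.
closed_form = vol(1.0, 3.0) * s * (1.0 - math.exp(-a * 2.0)) / a
```

For a constant volatility the same routine gives a drift proportional to $T - t$, which vanishes at $T = t$.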
28.8
Implementation of the Heath-Jarrow-Morton model
Choose
\[ \sigma(t,T),\ 0 \le t \le T \le T^*; \qquad \theta(t),\ 0 \le t \le T^*. \]
These may be stochastic processes, but are usually taken to be deterministic functions. Define
\begin{align*}
\alpha(t,T) &= \sigma(t,T)\sigma^*(t,T) + \sigma(t,T)\theta(t), \\
\widetilde{W}(t) &= \int_0^t \theta(u)\,du + W(t), \\
Z(t) &= \exp\left\{-\int_0^t \theta(u)\,dW(u) - \frac{1}{2}\int_0^t \theta^2(u)\,du\right\}, \\
\widetilde{\mathbb{P}}(A) &= \int_A Z(T^*)\,d\mathbb{P}, \quad \forall A \in \mathcal{F}(T^*).
\end{align*}
Let $f(0,T),\ 0 \le T \le T^*$, be determined by the market; recall from equation (4.1):
\[ f(0,T) = -\frac{\partial}{\partial T}\log B(0,T), \quad 0 \le T \le T^*. \]
Then $f(t,T)$ for $0 \le t \le T$ is determined by the equation
\[ df(t,T) = \sigma(t,T)\sigma^*(t,T)\,dt + \sigma(t,T)\,d\widetilde{W}(t), \tag{8.1} \]
this determines the interest rate process
\[ r(t) = f(t,t), \quad 0 \le t \le T^*, \tag{8.2} \]
and then the zero-coupon bond prices are determined by the initial conditions $B(0,T),\ 0 \le T \le T^*$, gotten from the market, combined with the stochastic differential equation
\[ dB(t,T) = r(t)B(t,T)\,dt - \sigma^*(t,T)B(t,T)\,d\widetilde{W}(t). \tag{8.3} \]
Because all pricing of interest rate dependent assets will be done under the risk-neutral measure $\widetilde{\mathbb{P}}$, under which $\widetilde{W}$ is a Brownian motion, we have written (8.1) and (8.3) in terms of $\widetilde{W}$ rather than $W$. Written this way, it is apparent that neither $\theta(t)$ nor $\alpha(t,T)$ will enter subsequent computations. The only process which matters is $\sigma(t,T),\ 0 \le t \le T \le T^*$, and the process
\[ \sigma^*(t,T) = \int_t^T \sigma(t,u)\,du, \quad 0 \le t \le T \le T^*, \tag{8.4} \]
obtained from $\sigma(t,T)$.
From (8.3) we see that $\sigma^*(t,T)$ is the volatility at time $t$ of the zero-coupon bond maturing at time $T$. Equation (8.4) implies
\[ \sigma^*(T,T) = 0, \quad 0 \le T \le T^*. \tag{8.5} \]
This is because $B(T,T) = 1$ and so as $t$ approaches $T$ (from below), the volatility in $B(t,T)$ must vanish.
In conclusion, to implement the HJM model, it suffices to have the initial market data $B(0,T),\ 0 \le T \le T^*$, and the volatilities
\[ \sigma^*(t,T), \quad 0 \le t \le T \le T^*. \]
We require that $\sigma^*(t,T)$ be differentiable in $T$ and satisfy (8.5). We can then define
\[ \sigma(t,T) = \frac{\partial}{\partial T}\sigma^*(t,T), \]
and (8.4) will be satisfied because
\[ \sigma^*(t,T) = \sigma^*(t,T) - \sigma^*(t,t) = \int_t^T \frac{\partial}{\partial u}\sigma^*(t,u)\,du. \]
We then let $\widetilde{W}$ be a Brownian motion under a probability measure $\widetilde{\mathbb{P}}$, and we let $B(t,T),\ 0 \le t \le T \le T^*$, be given by (8.3), where $r(t)$ is given by (8.2) and $f(t,T)$ by (8.1). In (8.1) we use the initial conditions
\[ f(0,T) = -\frac{\partial}{\partial T}\log B(0,T), \quad 0 \le T \le T^*. \]
Remark 28.3 It is customary in the literature to write $W$ rather than $\widetilde{W}$ and $\mathbb{P}$ rather than $\widetilde{\mathbb{P}}$, so that $\mathbb{P}$ is the symbol used for the risk-neutral measure and no reference is ever made to the market measure. The only parameter which must be estimated from the market is the bond volatility $\sigma^*(t,T)$, and volatility is unaffected by the change of measure.
Chapter 29
Gaussian processes
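Definition 29.1 (Gaussian Process) A Gaussian process $X(t)$, $t \ge 0$, is a stochastic process with the property that for every set of times $0 \le t_1 \le t_2 \le \dots \le t_n$, the set of random variables
\[ X(t_1), X(t_2), \dots, X(t_n) \]
is jointly normally distributed.
Remark 29.1 If $X$ is a Gaussian process, then its distribution is determined by its mean function
\[ m(t) = \mathbb{E}X(t) \]
and its covariance function
\[ \rho(s,t) = \mathbb{E}\left[(X(s) - m(s))\,(X(t) - m(t))\right]. \]
Indeed, the joint density of $X(t_1), \dots, X(t_n)$ is
\begin{align*}
&\mathbb{P}\{X(t_1) \in dx_1, \dots, X(t_n) \in dx_n\} \\
&\quad = \frac{1}{(2\pi)^{n/2}\sqrt{\det\Sigma}}\exp\left\{-\tfrac{1}{2}\,(x - m(t))\,\Sigma^{-1}\,(x - m(t))^T\right\}dx_1 \cdots dx_n,
\end{align*}
where $\Sigma$ is the covariance matrix
\[ \Sigma = \begin{bmatrix} \rho(t_1,t_1) & \rho(t_1,t_2) & \cdots & \rho(t_1,t_n) \\ \rho(t_2,t_1) & \rho(t_2,t_2) & \cdots & \rho(t_2,t_n) \\ \vdots & \vdots & \ddots & \vdots \\ \rho(t_n,t_1) & \rho(t_n,t_2) & \cdots & \rho(t_n,t_n) \end{bmatrix}, \]
$x$ is the row vector $[x_1, x_2, \dots, x_n]$, $t$ is the row vector $[t_1, t_2, \dots, t_n]$, and $m(t) = [m(t_1), m(t_2), \dots, m(t_n)]$.
The moment generating function is
\[ \mathbb{E}\exp\left\{\sum_{k=1}^n u_k X(t_k)\right\} = \exp\left\{u\,m(t)^T + \tfrac{1}{2}\,u\,\Sigma\,u^T\right\}, \]
where $u = [u_1, u_2, \dots, u_n]$.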
29.1
An example: Brownian Motion
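Brownian motion $W$ is a Gaussian process with $m(t) = 0$ and $\rho(s,t) = s \wedge t$. Indeed, if $0 \le s \le t$, then
\begin{align*}
\rho(s,t) &= \mathbb{E}\left[W(s)W(t)\right] = \mathbb{E}\left[W(s)\,(W(t) - W(s)) + W^2(s)\right] \\
&= \mathbb{E}W(s)\cdot\mathbb{E}\left(W(t) - W(s)\right) + \mathbb{E}W^2(s) \\
&= \mathbb{E}W^2(s) \\
&= s \wedge t.
\end{align*}
To prove that a process is Gaussian, one must show that $X(t_1), \dots, X(t_n)$ has either a density or a moment generating function of the appropriate form. We shall use the m.g.f., and shall cheat a bit by considering only two times, which we usually call $s$ and $t$. We will want to show that
\[ \mathbb{E}\exp\left\{u_1X(s) + u_2X(t)\right\} = \exp\left\{u_1m_1 + u_2m_2 + \frac{1}{2}\begin{bmatrix} u_1 & u_2 \end{bmatrix}\begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}\right\}. \]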
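Theorem 1.69 (Integral w.r.t. a Brownian) Let $W(t)$ be a Brownian motion and $\delta(t)$ a nonrandom function. Then
\[ X(t) = \int_0^t \delta(u)\,dW(u) \]
is a Gaussian process with $m(t) = 0$ and
\[ \rho(s,t) = \int_0^{s \wedge t} \delta^2(u)\,du. \]
Proof: (Sketch.) We have
\[ dX = \delta\,dW. \]
Therefore,
\begin{align*}
de^{uX(s)} &= ue^{uX(s)}\delta(s)\,dW(s) + \tfrac{1}{2}u^2e^{uX(s)}\delta^2(s)\,ds, \\
e^{uX(s)} &= e^{uX(0)} + \underbrace{u\int_0^s e^{uX(v)}\delta(v)\,dW(v)}_{\text{Martingale}} + \tfrac{1}{2}u^2\int_0^s e^{uX(v)}\delta^2(v)\,dv, \\
\mathbb{E}e^{uX(s)} &= 1 + \tfrac{1}{2}u^2\int_0^s \delta^2(v)\,\mathbb{E}e^{uX(v)}\,dv, \\
\frac{d}{ds}\mathbb{E}e^{uX(s)} &= \tfrac{1}{2}u^2\delta^2(s)\,\mathbb{E}e^{uX(s)}, \\
\mathbb{E}e^{uX(s)} &= e^{uX(0)}\exp\left\{\tfrac{1}{2}u^2\int_0^s \delta^2(v)\,dv\right\} \\
&= \exp\left\{\tfrac{1}{2}u^2\int_0^s \delta^2(v)\,dv\right\}. \tag{1.1}
\end{align*}
This shows that $X(s)$ is normal with mean 0 and variance $\int_0^s \delta^2(v)\,dv$.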
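Now let $0 \le s < t$ be given. Just as before,
\[ de^{uX(t)} = ue^{uX(t)}\delta(t)\,dW(t) + \tfrac{1}{2}u^2e^{uX(t)}\delta^2(t)\,dt. \]
Integrate from $s$ to $t$ to get
\[ e^{uX(t)} = e^{uX(s)} + u\int_s^t \delta(v)e^{uX(v)}\,dW(v) + \tfrac{1}{2}u^2\int_s^t \delta^2(v)e^{uX(v)}\,dv. \]
Take $\mathbb{E}[\,\cdots\,|\,\mathcal{F}(s)]$ conditional expectations and use the martingale property
\begin{align*}
\mathbb{E}\left[\int_s^t \delta(v)e^{uX(v)}\,dW(v)\,\Big|\,\mathcal{F}(s)\right] &= \mathbb{E}\left[\int_0^t \delta(v)e^{uX(v)}\,dW(v)\,\Big|\,\mathcal{F}(s)\right] - \int_0^s \delta(v)e^{uX(v)}\,dW(v) \\
&= 0
\end{align*}
to get
\begin{align*}
\mathbb{E}\left[e^{uX(t)}\,|\,\mathcal{F}(s)\right] &= e^{uX(s)} + \tfrac{1}{2}u^2\int_s^t \delta^2(v)\,\mathbb{E}\left[e^{uX(v)}\,|\,\mathcal{F}(s)\right]dv, \\
\frac{d}{dt}\mathbb{E}\left[e^{uX(t)}\,|\,\mathcal{F}(s)\right] &= \tfrac{1}{2}u^2\delta^2(t)\,\mathbb{E}\left[e^{uX(t)}\,|\,\mathcal{F}(s)\right], \quad t \ge s.
\end{align*}
The solution to this ordinary differential equation with initial time $s$ is
\[ \mathbb{E}\left[e^{uX(t)}\,|\,\mathcal{F}(s)\right] = e^{uX(s)}\exp\left\{\tfrac{1}{2}u^2\int_s^t \delta^2(v)\,dv\right\}, \quad t \ge s. \tag{1.2} \]
We now compute the m.g.f. for $(X(s), X(t))$, where $0 \le s \le t$:
\begin{align*}
\mathbb{E}\left[e^{u_1X(s) + u_2X(t)}\,|\,\mathcal{F}(s)\right] &= e^{u_1X(s)}\,\mathbb{E}\left[e^{u_2X(t)}\,|\,\mathcal{F}(s)\right] \\
&\overset{(1.2)}{=} e^{(u_1+u_2)X(s)}\exp\left\{\tfrac{1}{2}u_2^2\int_s^t \delta^2(v)\,dv\right\}, \\
\mathbb{E}\left[e^{u_1X(s) + u_2X(t)}\right] &= \mathbb{E}\left[\mathbb{E}\left[e^{u_1X(s) + u_2X(t)}\,|\,\mathcal{F}(s)\right]\right] \\
&= \mathbb{E}\left\{e^{(u_1+u_2)X(s)}\right\}\cdot\exp\left\{\tfrac{1}{2}u_2^2\int_s^t \delta^2(v)\,dv\right\} \\
&\overset{(1.1)}{=} \exp\left\{\tfrac{1}{2}(u_1+u_2)^2\int_0^s \delta^2(v)\,dv + \tfrac{1}{2}u_2^2\int_s^t \delta^2(v)\,dv\right\} \\
&= \exp\left\{\tfrac{1}{2}(u_1^2 + 2u_1u_2)\int_0^s \delta^2(v)\,dv + \tfrac{1}{2}u_2^2\int_0^t \delta^2(v)\,dv\right\} \\
&= \exp\left\{\frac{1}{2}\begin{bmatrix} u_1 & u_2 \end{bmatrix}\begin{bmatrix} \int_0^s \delta^2 & \int_0^s \delta^2 \\ \int_0^s \delta^2 & \int_0^t \delta^2 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}\right\}.
\end{align*}
This shows that $(X(s), X(t))$ is jointly normal with $\mathbb{E}X(s) = \mathbb{E}X(t) = 0$,
\[ \mathbb{E}X^2(s) = \int_0^s \delta^2(v)\,dv, \qquad \mathbb{E}X^2(t) = \int_0^t \delta^2(v)\,dv, \]
\[ \mathbb{E}\left[X(s)X(t)\right] = \int_0^s \delta^2(v)\,dv. \]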
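Remark 29.2 The hard part of the above argument, and the reason we use moment generating functions, is to prove the normality. The computation of means and variances does not require the use of moment generating functions. Indeed,
\[ X(t) = \int_0^t \delta(u)\,dW(u) \]
is a martingale and $X(0) = 0$, so
\[ m(t) = \mathbb{E}X(t) = 0 \quad \forall t \ge 0. \]
For fixed $s \ge 0$,
\[ \mathbb{E}X^2(s) = \int_0^s \delta^2(v)\,dv \]
by the Itô isometry. For $0 \le s \le t$,
\begin{align*}
\mathbb{E}\left[X(s)\,(X(t) - X(s))\right] &= \mathbb{E}\left[\mathbb{E}\left[X(s)\,(X(t) - X(s))\,|\,\mathcal{F}(s)\right]\right] \\
&= \mathbb{E}\Big[X(s)\,\underbrace{\left(\mathbb{E}\left[X(t)\,|\,\mathcal{F}(s)\right] - X(s)\right)}_{0}\Big] \\
&= 0.
\end{align*}
Therefore,
\begin{align*}
\mathbb{E}\left[X(s)X(t)\right] &= \mathbb{E}\left[X(s)\,(X(t) - X(s)) + X^2(s)\right] \\
&= \mathbb{E}X^2(s) = \int_0^s \delta^2(v)\,dv.
\end{align*}
If $\delta$ were a stochastic process, the Itô isometry says
\[ \mathbb{E}X^2(s) = \int_0^s \mathbb{E}\,\delta^2(v)\,dv \]
and the same argument used above shows that for $0 \le s \le t$,
\[ \mathbb{E}\left[X(s)X(t)\right] = \mathbb{E}X^2(s) = \int_0^s \mathbb{E}\,\delta^2(v)\,dv. \]
However, when $\delta$ is stochastic, $X$ is not necessarily a Gaussian process, so its distribution is not determined from its mean and covariance functions.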
Remark 29.3 When $\delta$ is nonrandom,
\[ X(t) = \int_0^t \delta(u)\,dW(u) \]
is also Markov. We proved this before, but note again that the Markov property follows immediately from (1.2). The equation (1.2) says that conditioned on $\mathcal{F}(s)$, the distribution of $X(t)$ depends only on $X(s)$; in fact, $X(t)$ is normal with mean $X(s)$ and variance $\int_s^t \delta^2(v)\,dv$.
Figure 29.1: Range of values of $y, z, v$ for the integrals in the proof of Theorem 1.70. (Three panels (a), (b), (c) showing the regions of integration.)
Theorem 1.70 Let $W(t)$ be a Brownian motion, and let $\delta(t)$ and $h(t)$ be nonrandom functions. Define
\[ X(t) = \int_0^t \delta(u)\,dW(u), \qquad Y(t) = \int_0^t h(u)X(u)\,du. \]
Then $Y$ is a Gaussian process with mean function $m_Y(t) = 0$ and covariance function
\[ \rho_Y(s,t) = \int_0^{s \wedge t} \delta^2(v)\left(\int_v^s h(y)\,dy\right)\left(\int_v^t h(y)\,dy\right)dv. \tag{1.3} \]
Proof: (Partial) Computation of $\rho_Y(s,t)$: Let $0 \le s \le t$ be given. It is shown in a homework problem that $(Y(s), Y(t))$ is a jointly normal pair of random variables. Here we observe that
\[ m_Y(t) = \mathbb{E}Y(t) = \int_0^t h(u)\,\mathbb{E}X(u)\,du = 0, \]
and we verify that (1.3) holds.
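We have
\begin{align*}
\rho_Y(s,t) &= \mathbb{E}\left[Y(s)Y(t)\right] \\
&= \mathbb{E}\left[\int_0^s h(y)X(y)\,dy \cdot \int_0^t h(z)X(z)\,dz\right] \\
&= \mathbb{E}\int_0^s\!\!\int_0^t h(y)h(z)X(y)X(z)\,dy\,dz \\
&= \int_0^s\!\!\int_0^t h(y)h(z)\,\mathbb{E}\left[X(y)X(z)\right]dy\,dz \\
&= \int_0^s\!\!\int_0^t h(y)h(z)\int_0^{y \wedge z}\delta^2(v)\,dv\,dy\,dz \\
&= \int_0^s\!\!\int_z^t h(y)h(z)\left(\int_0^z \delta^2(v)\,dv\right)dy\,dz \\
&\quad + \int_0^s\!\!\int_y^s h(y)h(z)\left(\int_0^y \delta^2(v)\,dv\right)dz\,dy \quad \text{(See Fig. 29.1(a))} \\
&= \int_0^s h(z)\left(\int_z^t h(y)\,dy\right)\left(\int_0^z \delta^2(v)\,dv\right)dz \\
&\quad + \int_0^s h(y)\left(\int_y^s h(z)\,dz\right)\left(\int_0^y \delta^2(v)\,dv\right)dy \\
&= \int_0^s\!\!\int_0^z h(z)\delta^2(v)\left(\int_z^t h(y)\,dy\right)dv\,dz \\
&\quad + \int_0^s\!\!\int_0^y h(y)\delta^2(v)\left(\int_y^s h(z)\,dz\right)dv\,dy \\
&= \int_0^s\!\!\int_v^s h(z)\delta^2(v)\left(\int_z^t h(y)\,dy\right)dz\,dv \\
&\quad + \int_0^s\!\!\int_v^s h(y)\delta^2(v)\left(\int_y^s h(z)\,dz\right)dy\,dv \quad \text{(See Fig. 29.1(b))} \\
&= \int_0^s \delta^2(v)\left(\int_v^s\!\!\int_z^t h(y)h(z)\,dy\,dz\right)dv \\
&\quad + \int_0^s \delta^2(v)\left(\int_v^s\!\!\int_y^s h(y)h(z)\,dz\,dy\right)dv \\
&= \int_0^s \delta^2(v)\left(\int_v^s\!\!\int_v^t h(y)h(z)\,dy\,dz\right)dv \quad \text{(See Fig. 29.1(c))} \\
&= \int_0^s \delta^2(v)\left(\int_v^s h(y)\,dy\right)\left(\int_v^t h(z)\,dz\right)dv \\
&= \int_0^s \delta^2(v)\left(\int_v^s h(y)\,dy\right)\left(\int_v^t h(y)\,dy\right)dv.
\end{align*}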
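The covariance formula (1.3) can be evaluated numerically. The sketch below takes the integrand of the stochastic integral and the weight function both identically equal to one, so that $Y$ is integrated Brownian motion, whose variance at time $t$ is known to be $t^3/3$; the helper names and the midpoint-rule quadrature are illustrative choices.

```python
def integral(func, a, b, n=400):
    """Midpoint-rule approximation of int_a^b func(x) dx."""
    step = (b - a) / n
    return sum(func(a + (i + 0.5) * step) for i in range(n)) * step

def rho_Y(delta, h, s, t, n=400):
    """Covariance (1.3): integrate delta^2(v) * (int_v^s h) * (int_v^t h)
    over v from 0 to min(s, t)."""
    def integrand(v):
        return delta(v) ** 2 * integral(h, v, s, n) * integral(h, v, t, n)
    return integral(integrand, 0.0, min(s, t), n)

def one(u):
    return 1.0

# With delta = h = 1, Y(t) is the time integral of Brownian motion.
var_1 = rho_Y(one, one, 1.0, 1.0)    # int_0^1 (1-v)^2 dv = 1/3
cov_12 = rho_Y(one, one, 1.0, 2.0)   # int_0^1 (1-v)(2-v) dv = 5/6
```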
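Remark 29.4 Unlike the process $X(t) = \int_0^t \delta(u)\,dW(u)$, the process $Y(t) = \int_0^t h(u)X(u)\,du$ is neither Markov nor a martingale. For $0 \le s < t$,
\begin{align*}
\mathbb{E}\left[Y(t)\,|\,\mathcal{F}(s)\right] &= \int_0^s h(u)X(u)\,du + \mathbb{E}\left[\int_s^t h(u)X(u)\,du\,\Big|\,\mathcal{F}(s)\right] \\
&= Y(s) + \int_s^t h(u)\,\mathbb{E}\left[X(u)\,|\,\mathcal{F}(s)\right]du \\
&= Y(s) + \int_s^t h(u)X(s)\,du \\
&= Y(s) + X(s)\int_s^t h(u)\,du,
\end{align*}
where we have used the fact that $X$ is a martingale. The conditional expectation $\mathbb{E}\left[Y(t)\,|\,\mathcal{F}(s)\right]$ is not equal to $Y(s)$, nor is it a function of $Y(s)$ alone.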
Chapter 30
Hull and White model
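Consider
\[ dr(t) = \left(\alpha(t) - \beta(t)r(t)\right)dt + \sigma(t)\,dW(t), \]
where $\alpha(t)$, $\beta(t)$ and $\sigma(t)$ are nonrandom functions of $t$.
We can solve the stochastic differential equation. Set
\[ K(t) = \int_0^t \beta(u)\,du. \]
Then
\begin{align*}
d\left(e^{K(t)}r(t)\right) &= e^{K(t)}\left(\beta(t)r(t)\,dt + dr(t)\right) \\
&= e^{K(t)}\left(\alpha(t)\,dt + \sigma(t)\,dW(t)\right).
\end{align*}
Integrating, we get
\[ e^{K(t)}r(t) = r(0) + \int_0^t e^{K(u)}\alpha(u)\,du + \int_0^t e^{K(u)}\sigma(u)\,dW(u), \]
so
\[ r(t) = e^{-K(t)}\left[r(0) + \int_0^t e^{K(u)}\alpha(u)\,du + \int_0^t e^{K(u)}\sigma(u)\,dW(u)\right]. \]
From Theorem 1.69 in Chapter 29, we see that $r(t)$ is a Gaussian process with mean function
\[ m_r(t) = e^{-K(t)}\left[r(0) + \int_0^t e^{K(u)}\alpha(u)\,du\right] \tag{0.1} \]
and covariance function
\[ \rho_r(s,t) = e^{-K(s)-K(t)}\int_0^{s \wedge t} e^{2K(u)}\sigma^2(u)\,du. \tag{0.2} \]
The process $r(t)$ is also Markov.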
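Formulas (0.1) and (0.2) can be sanity-checked in the special case of constant coefficients, where the integrals have familiar closed forms. The sketch below assumes constant $\alpha$, $\beta$, $\sigma$ (so that $K(t) = \beta t$), evaluates the integrals by a midpoint rule, and compares with the closed forms; all names and parameter values are hypothetical.

```python
import math

def integral(func, a, b, n=2000):
    """Midpoint-rule approximation of int_a^b func(x) dx."""
    step = (b - a) / n
    return sum(func(a + (i + 0.5) * step) for i in range(n)) * step

def hw_mean(r0, alpha, beta, t):
    """Mean function (0.1) with constant alpha, beta, so K(t) = beta * t."""
    return math.exp(-beta * t) * (
        r0 + integral(lambda u: math.exp(beta * u) * alpha, 0.0, t))

def hw_cov(beta, sigma, s, t):
    """Covariance function (0.2) with constant beta, sigma."""
    lo = min(s, t)
    return math.exp(-beta * (s + t)) * integral(
        lambda u: math.exp(2.0 * beta * u) * sigma ** 2, 0.0, lo)

r0, alpha, beta, sigma, t = 0.03, 0.02, 0.5, 0.01, 2.0   # illustrative
mean_num = hw_mean(r0, alpha, beta, t)
var_num = hw_cov(beta, sigma, t, t)
mean_closed = r0 * math.exp(-beta * t) + alpha / beta * (1 - math.exp(-beta * t))
var_closed = sigma ** 2 * (1 - math.exp(-2 * beta * t)) / (2 * beta)
```

In this constant-coefficient case the mean relaxes toward $\alpha/\beta$ and the variance saturates at $\sigma^2/(2\beta)$.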
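We want to study $\int_0^T r(t)\,dt$. To do this, we define
\[ X(t) = \int_0^t e^{K(u)}\sigma(u)\,dW(u), \qquad Y(T) = \int_0^T e^{-K(t)}X(t)\,dt. \]
Then
\begin{align*}
r(t) &= e^{-K(t)}\left[r(0) + \int_0^t e^{K(u)}\alpha(u)\,du\right] + e^{-K(t)}X(t), \\
\int_0^T r(t)\,dt &= \int_0^T e^{-K(t)}\left[r(0) + \int_0^t e^{K(u)}\alpha(u)\,du\right]dt + Y(T).
\end{align*}
According to Theorem 1.70 in Chapter 29, $\int_0^T r(t)\,dt$ is normal. Its mean is
\[ \mathbb{E}\int_0^T r(t)\,dt = \int_0^T e^{-K(t)}\left[r(0) + \int_0^t e^{K(u)}\alpha(u)\,du\right]dt, \tag{0.3} \]
and its variance is
\begin{align*}
\mathrm{var}\left(\int_0^T r(t)\,dt\right) &= \mathbb{E}Y^2(T) \\
&= \int_0^T e^{2K(v)}\sigma^2(v)\left(\int_v^T e^{-K(y)}\,dy\right)^2 dv.
\end{align*}
The price at time 0 of a zero-coupon bond paying \$1 at time $T$ is
\begin{align*}
B(0,T) &= \mathbb{E}\exp\left\{-\int_0^T r(t)\,dt\right\} \\
&= \exp\left\{(-1)\,\mathbb{E}\int_0^T r(t)\,dt + \tfrac{1}{2}(-1)^2\,\mathrm{var}\left(\int_0^T r(t)\,dt\right)\right\} \\
&= \exp\Bigg\{-r(0)\int_0^T e^{-K(t)}\,dt - \int_0^T\!\!\int_0^t e^{-K(t)+K(u)}\alpha(u)\,du\,dt \\
&\qquad\qquad + \frac{1}{2}\int_0^T e^{2K(v)}\sigma^2(v)\left(\int_v^T e^{-K(y)}\,dy\right)^2 dv\Bigg\} \\
&= \exp\left\{-r(0)C(0,T) - A(0,T)\right\},
\end{align*}
where
\begin{align*}
C(0,T) &= \int_0^T e^{-K(t)}\,dt, \\
A(0,T) &= \int_0^T\!\!\int_0^t e^{-K(t)+K(u)}\alpha(u)\,du\,dt - \frac{1}{2}\int_0^T e^{2K(v)}\sigma^2(v)\left(\int_v^T e^{-K(y)}\,dy\right)^2 dv.
\end{align*}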
Figure 30.1: Range of values of $u, t$ for the integral. (Triangular region below the line $u = t$, $0 \le u \le t \le T$.)
30.1
Fiddling with the formulas
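Note that (see Fig 30.1)
\begin{align*}
\int_0^T\!\!\int_0^t e^{-K(t)+K(u)}\alpha(u)\,du\,dt &= \int_0^T\!\!\int_u^T e^{-K(t)+K(u)}\alpha(u)\,dt\,du \\
&= \int_0^T e^{K(v)}\alpha(v)\left(\int_v^T e^{-K(y)}\,dy\right)dv \qquad (y = t,\ v = u).
\end{align*}
Therefore,
\begin{align*}
A(0,T) &= \int_0^T\left[e^{K(v)}\alpha(v)\left(\int_v^T e^{-K(y)}\,dy\right) - \frac{1}{2}e^{2K(v)}\sigma^2(v)\left(\int_v^T e^{-K(y)}\,dy\right)^2\right]dv, \\
C(0,T) &= \int_0^T e^{-K(y)}\,dy, \\
B(0,T) &= \exp\left\{-r(0)C(0,T) - A(0,T)\right\}.
\end{align*}
Consider the price at time $t \in [0,T]$ of the zero-coupon bond:
\[ B(t,T) = \mathbb{E}\left[\exp\left\{-\int_t^T r(u)\,du\right\}\Big|\,\mathcal{F}(t)\right]. \]
Because $r$ is a Markov process, this should be random only through a dependence on $r(t)$. In fact,
\[ B(t,T) = \exp\left\{-r(t)C(t,T) - A(t,T)\right\}, \]
where
\begin{align*}
A(t,T) &= \int_t^T\left[e^{K(v)}\alpha(v)\left(\int_v^T e^{-K(y)}\,dy\right) - \frac{1}{2}e^{2K(v)}\sigma^2(v)\left(\int_v^T e^{-K(y)}\,dy\right)^2\right]dv, \\
C(t,T) &= e^{K(t)}\int_t^T e^{-K(y)}\,dy.
\end{align*}
The reason for these changes is the following. We are now taking the initial time to be $t$ rather than zero, so it is plausible that $\int_0^T \dots\,dv$ should be replaced by $\int_t^T \dots\,dv$. Recall that
\[ K(v) = \int_0^v \beta(u)\,du, \]
and this should be replaced by
\[ K(v) - K(t) = \int_t^v \beta(u)\,du. \]
Similarly, $K(y)$ should be replaced by $K(y) - K(t)$. Making these replacements in $A(0,T)$, we see that the $K(t)$ terms cancel. In $C(0,T)$, however, the $K(t)$ term does not cancel.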
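The bond-price formula can be evaluated by nested numerical integration and compared with the closed form available for constant coefficients (the Vasicek special case). The sketch below is only an illustration: the quadrature rule, the function names and the parameter values are assumptions, and the coefficient functions are passed in as Python callables.

```python
import math

def integral(func, a, b, n=400):
    """Midpoint-rule approximation of int_a^b func(x) dx."""
    step = (b - a) / n
    return sum(func(a + (i + 0.5) * step) for i in range(n)) * step

def hw_bond_price(r_t, t, T, alpha, sigma, K, n=400):
    """exp(-r(t) C(t,T) - A(t,T)) with A and C from the integral formulas."""
    def inner(v):                      # int_v^T exp(-K(y)) dy
        return integral(lambda y: math.exp(-K(y)), v, T, n)
    def integrand(v):
        I = inner(v)
        return (math.exp(K(v)) * alpha(v) * I
                - 0.5 * math.exp(2.0 * K(v)) * sigma(v) ** 2 * I * I)
    A = integral(integrand, t, T, n)
    C = math.exp(K(t)) * inner(t)
    return math.exp(-r_t * C - A)

# Constant coefficients (illustrative values): K(t) = b * t.
a_, b_, s_ = 0.02, 0.5, 0.01
price = hw_bond_price(0.03, 0.0, 2.0,
                      lambda v: a_, lambda v: s_, lambda v: b_ * v)

# Closed form in the constant-coefficient case.
tau = 2.0
C_closed = (1.0 - math.exp(-b_ * tau)) / b_
A_closed = ((a_ / b_ - s_ ** 2 / (2 * b_ ** 2)) * (tau - C_closed)
            + s_ ** 2 * C_closed ** 2 / (4 * b_))
price_closed = math.exp(-0.03 * C_closed - A_closed)
```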
30.2
Dynamics of the bond price
Let $C_t(t,T)$ and $A_t(t,T)$ denote the partial derivatives with respect to $t$. From the formula
\[ B(t,T) = \exp\left\{-r(t)C(t,T) - A(t,T)\right\}, \]
we have
\begin{align*}
dB(t,T) &= B(t,T)\left[-C(t,T)\,dr(t) + \tfrac{1}{2}C^2(t,T)\,dr(t)\,dr(t) - r(t)C_t(t,T)\,dt - A_t(t,T)\,dt\right] \\
&= B(t,T)\Big[-C(t,T)\left(\alpha(t) - \beta(t)r(t)\right)dt - C(t,T)\sigma(t)\,dW(t) + \tfrac{1}{2}C^2(t,T)\sigma^2(t)\,dt \\
&\qquad\qquad - r(t)C_t(t,T)\,dt - A_t(t,T)\,dt\Big].
\end{align*}
Because we have used the risk-neutral pricing formula
\[ B(t,T) = \mathbb{E}\left[\exp\left\{-\int_t^T r(u)\,du\right\}\Big|\,\mathcal{F}(t)\right] \]
to obtain the bond price, its differential must be of the form
\[ dB(t,T) = r(t)B(t,T)\,dt + (\dots)\,dW(t). \]
Therefore, we must have
\[ -C(t,T)\left(\alpha(t) - \beta(t)r(t)\right) + \tfrac{1}{2}C^2(t,T)\sigma^2(t) - r(t)C_t(t,T) - A_t(t,T) = r(t). \]
We leave the verification of this equation to the homework. After this verification, we have the formula
\[ dB(t,T) = r(t)B(t,T)\,dt - \sigma(t)C(t,T)B(t,T)\,dW(t). \]
In particular, the volatility of the bond price is $\sigma(t)C(t,T)$.
30.3
Calibration of the Hull & White model
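Recall:
\begin{align*}
dr(t) &= \left(\alpha(t) - \beta(t)r(t)\right)dt + \sigma(t)\,dW(t), \\
K(t) &= \int_0^t \beta(u)\,du, \\
A(t,T) &= \int_t^T\left[e^{K(v)}\alpha(v)\left(\int_v^T e^{-K(y)}\,dy\right) - \frac{1}{2}e^{2K(v)}\sigma^2(v)\left(\int_v^T e^{-K(y)}\,dy\right)^2\right]dv, \\
C(t,T) &= e^{K(t)}\int_t^T e^{-K(y)}\,dy, \\
B(t,T) &= \exp\left\{-r(t)C(t,T) - A(t,T)\right\}.
\end{align*}
Suppose we obtain $B(0,T)$ for all $T \in [0,T^*]$ from market data (with some interpolation). Can we determine the functions $\alpha(t)$, $\beta(t)$, and $\sigma(t)$ for all $t \in [0,T^*]$? Not quite. Here is what we can do.
We take the following input data for the calibration:
1. $B(0,T),\ 0 \le T \le T^*$;
2. $r(0)$;
3. $\alpha(0)$;
4. $\sigma(t),\ 0 \le t \le T^*$ (usually assumed to be constant);
5. $\sigma(0)C(0,T),\ 0 \le T \le T^*$, i.e., the volatility at time zero of bonds of all maturities.
Step 1. From 4 and 5 we solve for
\[ C(0,T) = \int_0^T e^{-K(y)}\,dy. \]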
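We can then compute
\begin{align*}
\frac{\partial}{\partial T}C(0,T) &= e^{-K(T)} \implies K(T) = -\log\frac{\partial}{\partial T}C(0,T), \\
\frac{\partial}{\partial T}K(T) &= \frac{\partial}{\partial T}\int_0^T \beta(u)\,du = \beta(T).
\end{align*}
We now have $\beta(T)$ for all $T \in [0,T^*]$.
Step 2. From the formula
\[ B(0,T) = \exp\left\{-r(0)C(0,T) - A(0,T)\right\}, \]
we can solve for $A(0,T)$ for all $T \in [0,T^*]$. Recall that
\[ A(0,T) = \int_0^T\left[e^{K(v)}\alpha(v)\left(\int_v^T e^{-K(y)}\,dy\right) - \frac{1}{2}e^{2K(v)}\sigma^2(v)\left(\int_v^T e^{-K(y)}\,dy\right)^2\right]dv. \]
We can use this formula to determine $\alpha(T),\ 0 \le T \le T^*$, as follows:
\begin{align*}
\frac{\partial}{\partial T}A(0,T) &= \int_0^T\left[e^{K(v)}\alpha(v)e^{-K(T)} - e^{2K(v)}\sigma^2(v)e^{-K(T)}\left(\int_v^T e^{-K(y)}\,dy\right)\right]dv, \\
e^{K(T)}\frac{\partial}{\partial T}A(0,T) &= \int_0^T\left[e^{K(v)}\alpha(v) - e^{2K(v)}\sigma^2(v)\left(\int_v^T e^{-K(y)}\,dy\right)\right]dv, \\
\frac{\partial}{\partial T}\left[e^{K(T)}\frac{\partial}{\partial T}A(0,T)\right] &= e^{K(T)}\alpha(T) - \int_0^T e^{2K(v)}\sigma^2(v)e^{-K(T)}\,dv, \\
e^{K(T)}\frac{\partial}{\partial T}\left[e^{K(T)}\frac{\partial}{\partial T}A(0,T)\right] &= e^{2K(T)}\alpha(T) - \int_0^T e^{2K(v)}\sigma^2(v)\,dv, \\
\frac{\partial}{\partial T}\left[e^{K(T)}\frac{\partial}{\partial T}\left[e^{K(T)}\frac{\partial}{\partial T}A(0,T)\right]\right] &= \alpha'(T)e^{2K(T)} + 2\alpha(T)\beta(T)e^{2K(T)} - e^{2K(T)}\sigma^2(T), \quad 0 \le T \le T^*.
\end{align*}
This gives us an ordinary differential equation for $\alpha$, i.e.,
\[ \alpha'(t)e^{2K(t)} + 2\alpha(t)\beta(t)e^{2K(t)} - e^{2K(t)}\sigma^2(t) = \text{known function of } t. \]
From assumption 4 and step 1, we know all the coefficients in this equation. From assumption 3, we have the initial condition $\alpha(0)$. We can solve the equation numerically to determine the function $\alpha(t),\ 0 \le t \le T^*$.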
Remark 30.1 The derivation of the ordinary differential equation for $\alpha(t)$ requires three differentiations. Differentiation is an unstable procedure, i.e., functions which are close can have very different derivatives. Consider, for example,
\begin{align*}
f(x) &= 0 \quad \forall x \in \mathbb{R}, \\
g(x) &= \frac{\sin(1000x)}{100} \quad \forall x \in \mathbb{R}.
\end{align*}
Then
\[ |f(x) - g(x)| \le \frac{1}{100} \quad \forall x \in \mathbb{R}, \]
but because
\[ g'(x) = 10\cos(1000x), \]
we have
\[ |f'(x) - g'(x)| = 10 \]
for many values of $x$.
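The instability described in Remark 30.1 is easy to reproduce. The short sketch below uses the two functions of the remark; the evaluation grids are arbitrary illustrative choices.

```python
import math

def f(x):
    return 0.0

def g(x):
    return math.sin(1000.0 * x) / 100.0

def f_prime(x):
    return 0.0

def g_prime(x):
    return 10.0 * math.cos(1000.0 * x)

# The functions never differ by more than 1/100 ...
grid = [i * 1e-4 for i in range(10000)]
max_gap = max(abs(f(x) - g(x)) for x in grid)

# ... but their derivatives differ by 10 wherever cos(1000 x) = +1 or -1.
peaks = [k * math.pi / 1000.0 for k in range(4)]
derivative_gaps = [abs(f_prime(x) - g_prime(x)) for x in peaks]
```

In a calibration, market noise of this kind in the input bond prices would be amplified by each of the three differentiations.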
Assumption 5 for the calibration was that we know the volatility at time zero of bonds of all maturi-
ties. These volatilities can be implied by the prices of options on bonds. We consider now how the
model prices options.
30.4
Option on a bond
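Consider a European call option on a zero-coupon bond with strike price $K$ and expiration time $T_1$. The bond matures at time $T_2 > T_1$. The price of the option at time 0 is
\begin{align*}
&\mathbb{E}\left[e^{-\int_0^{T_1} r(u)\,du}\left(B(T_1,T_2) - K\right)^+\right] \\
&\quad = \mathbb{E}\left[e^{-\int_0^{T_1} r(u)\,du}\left(\exp\left\{-r(T_1)C(T_1,T_2) - A(T_1,T_2)\right\} - K\right)^+\right] \\
&\quad = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} e^{-x}\left(\exp\left\{-yC(T_1,T_2) - A(T_1,T_2)\right\} - K\right)^+ f(x,y)\,dx\,dy,
\end{align*}
where $f(x,y)$ is the joint density of $\left(\int_0^{T_1} r(u)\,du,\ r(T_1)\right)$.
We observed at the beginning of this Chapter (equation (0.3)) that $\int_0^{T_1} r(u)\,du$ is normal with
\begin{align*}
\mu_1 &\triangleq \mathbb{E}\left[\int_0^{T_1} r(u)\,du\right] = \int_0^{T_1}\mathbb{E}r(u)\,du \\
&= \int_0^{T_1}\left[r(0)e^{-K(v)} + e^{-K(v)}\int_0^v e^{K(u)}\alpha(u)\,du\right]dv, \\
\sigma_1^2 &\triangleq \mathrm{var}\left[\int_0^{T_1} r(u)\,du\right] = \int_0^{T_1} e^{2K(v)}\sigma^2(v)\left(\int_v^{T_1} e^{-K(y)}\,dy\right)^2 dv.
\end{align*}
We also observed (equation (0.1)) that $r(T_1)$ is normal with
\begin{align*}
\mu_2 &\triangleq \mathbb{E}r(T_1) = r(0)e^{-K(T_1)} + e^{-K(T_1)}\int_0^{T_1} e^{K(u)}\alpha(u)\,du, \\
\sigma_2^2 &\triangleq \mathrm{var}\left(r(T_1)\right) = e^{-2K(T_1)}\int_0^{T_1} e^{2K(u)}\sigma^2(u)\,du.
\end{align*}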
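In fact, $\left(\int_0^{T_1} r(u)\,du,\ r(T_1)\right)$ is jointly normal, and the covariance is
\begin{align*}
\rho\,\sigma_1\sigma_2 &= \mathbb{E}\left[\int_0^{T_1}\left(r(u) - \mathbb{E}r(u)\right)du\cdot\left(r(T_1) - \mathbb{E}r(T_1)\right)\right] \\
&= \int_0^{T_1}\mathbb{E}\left[\left(r(u) - \mathbb{E}r(u)\right)\left(r(T_1) - \mathbb{E}r(T_1)\right)\right]du \\
&= \int_0^{T_1}\rho_r(u,T_1)\,du,
\end{align*}
where $\rho_r(u,T_1)$ is defined in Equation (0.2).
The option on the bond has price at time zero of
\begin{align*}
&\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} e^{-x}\left(\exp\left\{-yC(T_1,T_2) - A(T_1,T_2)\right\} - K\right)^+ \\
&\quad \cdot\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(x-\mu_1)^2}{\sigma_1^2} - \frac{2\rho(x-\mu_1)(y-\mu_2)}{\sigma_1\sigma_2} + \frac{(y-\mu_2)^2}{\sigma_2^2}\right]\right\}dx\,dy. \tag{4.1}
\end{align*}
The price of the option at time $t \in [0,T_1]$ is
\begin{align*}
&\mathbb{E}\left[e^{-\int_t^{T_1} r(u)\,du}\left(B(T_1,T_2) - K\right)^+\Big|\,\mathcal{F}(t)\right] \\
&\quad = \mathbb{E}\left[e^{-\int_t^{T_1} r(u)\,du}\left(\exp\left\{-r(T_1)C(T_1,T_2) - A(T_1,T_2)\right\} - K\right)^+\Big|\,\mathcal{F}(t)\right]. \tag{4.2}
\end{align*}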
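Because of the Markov property, this is random only through a dependence on $r(t)$. To compute this option price, we need the joint distribution of $\left(\int_t^{T_1} r(u)\,du,\ r(T_1)\right)$ conditioned on $r(t)$. This pair of random variables has a jointly normal conditional distribution, and
\begin{align*}
\mu_1(t) &= \mathbb{E}\left[\int_t^{T_1} r(u)\,du\,\Big|\,\mathcal{F}(t)\right] \\
&= \int_t^{T_1}\left[r(t)e^{-K(v)+K(t)} + e^{-K(v)}\int_t^v e^{K(u)}\alpha(u)\,du\right]dv, \\
\sigma_1^2(t) &= \mathbb{E}\left[\left(\int_t^{T_1} r(u)\,du - \mu_1(t)\right)^2\Big|\,\mathcal{F}(t)\right] \\
&= \int_t^{T_1} e^{2K(v)}\sigma^2(v)\left(\int_v^{T_1} e^{-K(y)}\,dy\right)^2 dv, \\
\mu_2(t) &= \mathbb{E}\left[r(T_1)\,|\,r(t)\right] \\
&= r(t)e^{-K(T_1)+K(t)} + e^{-K(T_1)}\int_t^{T_1} e^{K(u)}\alpha(u)\,du, \\
\sigma_2^2(t) &= \mathbb{E}\left[\left(r(T_1) - \mu_2(t)\right)^2\Big|\,\mathcal{F}(t)\right] \\
&= e^{-2K(T_1)}\int_t^{T_1} e^{2K(u)}\sigma^2(u)\,du, \\
\rho(t)\sigma_1(t)\sigma_2(t) &= \mathbb{E}\left[\left(\int_t^{T_1} r(u)\,du - \mu_1(t)\right)\left(r(T_1) - \mu_2(t)\right)\Big|\,\mathcal{F}(t)\right] \\
&= \int_t^{T_1} e^{-K(u)-K(T_1)}\int_t^u e^{2K(v)}\sigma^2(v)\,dv\,du.
\end{align*}
The variances and covariances are not random. The means are random through a dependence on $r(t)$.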
Advantages of the Hull & White model:
1. Leads to closed-form pricing formulas.
2. Allows calibration to fit initial yield curve exactly.
Shortcomings of the Hull & White model:
1. One-factor, so only allows parallel shifts of the yield curve, i.e.,
\[ B(t,T) = \exp\left\{-r(t)C(t,T) - A(t,T)\right\}, \]
so bond prices of all maturities are perfectly correlated.
2. Interest rate is normally distributed, and hence can take negative values. Consequently, the bond price
\[ B(t,T) = \mathbb{E}\left[\exp\left\{-\int_t^T r(u)\,du\right\}\Big|\,\mathcal{F}(t)\right] \]
can exceed 1.
Chapter 31
Cox-Ingersoll-Ross model
In the Hull & White model, $r(t)$ is a Gaussian process. Since, for each $t$, $r(t)$ is normally distributed, there is a positive probability that $r(t) < 0$. The Cox-Ingersoll-Ross model is the simplest one which avoids negative interest rates.
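We begin with a $d$-dimensional Brownian motion $(W_1, W_2, \ldots, W_d)$. Let $\beta > 0$ and $\sigma > 0$ be constants. For $j = 1, \ldots, d$, let $X_j(0) \in \mathbb{R}$ be given so that
$$X_1^2(0) + X_2^2(0) + \cdots + X_d^2(0) \ge 0,$$
and let $X_j$ be the solution to the stochastic differential equation
$$dX_j(t) = -\tfrac{1}{2}\beta X_j(t)\,dt + \tfrac{1}{2}\sigma\,dW_j(t).$$
$X_j$ is called the Ornstein-Uhlenbeck process. It always has a drift toward the origin. The solution to this stochastic differential equation is
$$X_j(t) = e^{-\frac{1}{2}\beta t}\left[X_j(0) + \tfrac{1}{2}\sigma\int_0^t e^{\frac{1}{2}\beta u}\,dW_j(u)\right].$$
This solution is a Gaussian process with mean function
$$m_j(t) = e^{-\frac{1}{2}\beta t}X_j(0)$$
and covariance function
$$\rho(s,t) = \tfrac{1}{4}\sigma^2 e^{-\frac{1}{2}\beta(s+t)}\int_0^{s\wedge t} e^{\beta u}\,du.$$
Define
$$r(t) \triangleq X_1^2(t) + X_2^2(t) + \cdots + X_d^2(t).$$
If $d = 1$, we have $r(t) = X_1^2(t)$ and for each $t$, $\mathbb{P}\{r(t) > 0\} = 1$, but (see Fig. 31.1)
$$\mathbb{P}\{\text{There are infinitely many values of } t > 0 \text{ for which } r(t) = 0\} = 1.$$
Figure 31.1: $r(t)$ can be zero. (Left panel: $r(t) = X_1^2(t)$ plotted against $t$. Right panel: the path $(X_1(t), X_2(t))$ in the $(x_1, x_2)$ plane.)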
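If $d \ge 2$, (see Fig. 31.1)
$$\mathbb{P}\{\text{There is at least one value of } t > 0 \text{ for which } r(t) = 0\} = 0.$$
Let $f(x_1, x_2, \ldots, x_d) = x_1^2 + x_2^2 + \cdots + x_d^2$. Then
$$f_{x_i} = 2x_i, \qquad f_{x_i x_j} = \begin{cases} 2 & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases}$$
Itô's formula implies
$$dr(t) = \sum_{i=1}^d f_{x_i}\,dX_i + \tfrac{1}{2}\sum_{i=1}^d f_{x_i x_i}\,dX_i\,dX_i$$
$$= \sum_{i=1}^d 2X_i\left(-\tfrac{1}{2}\beta X_i\,dt + \tfrac{1}{2}\sigma\,dW_i(t)\right) + \sum_{i=1}^d \tfrac{1}{4}\sigma^2\,dW_i\,dW_i$$
$$= -\beta r(t)\,dt + \sigma\sum_{i=1}^d X_i\,dW_i + \frac{d\sigma^2}{4}\,dt$$
$$= \left(\frac{d\sigma^2}{4} - \beta r(t)\right)dt + \sigma\sqrt{r(t)}\sum_{i=1}^d \frac{X_i(t)}{\sqrt{r(t)}}\,dW_i(t).$$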
Define
$$W(t) = \sum_{i=1}^d \int_0^t \frac{X_i(u)}{\sqrt{r(u)}}\,dW_i(u).$$
Then $W$ is a martingale,
$$dW = \sum_{i=1}^d \frac{X_i}{\sqrt{r}}\,dW_i, \qquad dW\,dW = \sum_{i=1}^d \frac{X_i^2}{r}\,dt = dt,$$
so $W$ is a Brownian motion. We have
$$dr(t) = \left(\frac{d\sigma^2}{4} - \beta r(t)\right)dt + \sigma\sqrt{r(t)}\,dW(t).$$
The Cox-Ingersoll-Ross (CIR) process is given by
$$dr(t) = \bigl(\alpha - \beta r(t)\bigr)\,dt + \sigma\sqrt{r(t)}\,dW(t).$$
We define
$$d = \frac{4\alpha}{\sigma^2} > 0.$$
If $d$ happens to be an integer, then we have the representation
$$r(t) = \sum_{i=1}^d X_i^2(t),$$
but we do not require $d$ to be an integer. If $d < 2$ (i.e., $\alpha < \tfrac{1}{2}\sigma^2$), then
$$\mathbb{P}\{\text{There are infinitely many values of } t > 0 \text{ for which } r(t) = 0\} = 1.$$
This is not a good parameter choice.
If $d \ge 2$ (i.e., $\alpha \ge \tfrac{1}{2}\sigma^2$), then
$$\mathbb{P}\{\text{There is at least one value of } t > 0 \text{ for which } r(t) = 0\} = 0.$$
With the CIR process, one can derive formulas under the assumption that $d = \frac{4\alpha}{\sigma^2}$ is a positive integer, and they are still correct even when $d$ is not an integer.
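For example, here is the distribution of $r(t)$ for fixed $t > 0$. Let $r(0) \ge 0$ be given. Take
$$X_1(0) = 0,\ X_2(0) = 0,\ \ldots,\ X_{d-1}(0) = 0,\ X_d(0) = \sqrt{r(0)}.$$
For $i = 1, 2, \ldots, d-1$, $X_i(t)$ is normal with mean zero and variance
$$\rho(t,t) = \frac{\sigma^2}{4\beta}\left(1 - e^{-\beta t}\right).$$
$X_d(t)$ is normal with mean
$$m_d(t) = e^{-\frac{1}{2}\beta t}\sqrt{r(0)}$$
and variance $\rho(t,t)$. Then
$$r(t) = \rho(t,t)\underbrace{\sum_{i=1}^{d-1}\left(\frac{X_i(t)}{\sqrt{\rho(t,t)}}\right)^2}_{\text{chi-square with } d-1 = \frac{4\alpha}{\sigma^2} - 1 \text{ degrees of freedom}} + \underbrace{X_d^2(t)}_{\text{normal squared and independent of the other term}} \qquad (0.1)$$
Thus $r(t)$ has a non-central chi-square distribution.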
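31.1 Equilibrium distribution of $r(t)$
As $t \to \infty$, $m_d(t) \to 0$. We have
$$r(t) = \rho(t,t)\sum_{i=1}^{d}\left(\frac{X_i(t)}{\sqrt{\rho(t,t)}}\right)^2.$$
As $t \to \infty$, we have $\rho(t,t) \to \frac{\sigma^2}{4\beta}$, and so the limiting distribution of $r(t)$ is $\frac{\sigma^2}{4\beta}$ times a chi-square with $d = \frac{4\alpha}{\sigma^2}$ degrees of freedom. The chi-square density with $\frac{4\alpha}{\sigma^2}$ degrees of freedom is
$$f(y) = \frac{1}{2^{2\alpha/\sigma^2}\,\Gamma\!\left(\frac{2\alpha}{\sigma^2}\right)}\, y^{\frac{2\alpha - \sigma^2}{\sigma^2}}\, e^{-y/2}.$$
We make the change of variable $r = \frac{\sigma^2}{4\beta}y$. The limiting density for $r(t)$ is
$$p(r) = \frac{4\beta}{\sigma^2}\cdot\frac{1}{2^{2\alpha/\sigma^2}\,\Gamma\!\left(\frac{2\alpha}{\sigma^2}\right)}\left(\frac{4\beta}{\sigma^2}r\right)^{\frac{2\alpha-\sigma^2}{\sigma^2}} e^{-\frac{2\beta}{\sigma^2}r} = \left(\frac{2\beta}{\sigma^2}\right)^{\frac{2\alpha}{\sigma^2}}\frac{1}{\Gamma\!\left(\frac{2\alpha}{\sigma^2}\right)}\, r^{\frac{2\alpha-\sigma^2}{\sigma^2}}\, e^{-\frac{2\beta}{\sigma^2}r}.$$
We computed the mean and variance of $r(t)$ in Section 15.7.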
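The limiting density above is a gamma density with shape $2\alpha/\sigma^2$ and rate $2\beta/\sigma^2$, so it should integrate to one and have mean $\alpha/\beta$. The following is a minimal numerical sketch of that check; the parameter values, the function names, and the truncation of the integral at $r = 2$ are illustrative assumptions, not part of the notes.

```python
import math

def cir_equilibrium_density(r, alpha, beta, sigma):
    """Limiting CIR density: gamma with shape 2*alpha/sigma^2, rate 2*beta/sigma^2."""
    shape = 2.0 * alpha / sigma**2
    rate = 2.0 * beta / sigma**2
    if r <= 0.0:
        return 0.0  # valid here because shape > 1 for the values used below
    log_p = (shape * math.log(rate) - math.lgamma(shape)
             + (shape - 1.0) * math.log(r) - rate * r)
    return math.exp(log_p)

def trapezoid(f, a, b, n):
    """Composite trapezoid rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Hypothetical parameters: d = 4*alpha/sigma^2 = 6, so r never hits zero.
alpha, beta, sigma = 0.06, 0.5, 0.2
p = lambda r: cir_equilibrium_density(r, alpha, beta, sigma)

# The tail beyond r = 2 is negligible for these parameters.
total_mass = trapezoid(p, 0.0, 2.0, 20000)
mean = trapezoid(lambda r: r * p(r), 0.0, 2.0, 20000)
print(total_mass, mean, alpha / beta)
```

The two printed integrals should be close to 1 and to $\alpha/\beta$ respectively.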
31.2 Kolmogorov forward equation
Consider a Markov process governed by the stochastic differential equation
$$dX(t) = b(X(t))\,dt + \sigma(X(t))\,dW(t).$$
Figure 31.2: The function $h(y)$.
Because we are going to apply the following analysis to the case $X(t) = r(t)$, we assume that $X(t) \ge 0$ for all $t$.
We start at $X(0) = x \ge 0$ at time 0. Then $X(t)$ is random with density $p(0,t;x,y)$ (in the $y$ variable). Since 0 and $x$ will not change during the following, we omit them and write $p(t,y)$ rather than $p(0,t;x,y)$. We have
$$\mathbb{E}\,h(X(t)) = \int_0^\infty h(y)\,p(t,y)\,dy$$
for any function $h$.
The Kolmogorov forward equation (KFE) is a partial differential equation in the “forward” variables
t and y. We derive it below.
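Let $h(y)$ be a smooth function of $y \ge 0$ which vanishes near $y = 0$ and for all large values of $y$ (see Fig. 31.2). Itô's formula implies
$$dh(X(t)) = \left[h'(X(t))\,b(X(t)) + \tfrac{1}{2}h''(X(t))\,\sigma^2(X(t))\right]dt + h'(X(t))\,\sigma(X(t))\,dW(t),$$
so
$$h(X(t)) = h(X(0)) + \int_0^t\left[h'(X(s))\,b(X(s)) + \tfrac{1}{2}h''(X(s))\,\sigma^2(X(s))\right]ds + \int_0^t h'(X(s))\,\sigma(X(s))\,dW(s),$$
$$\mathbb{E}\,h(X(t)) = h(X(0)) + \mathbb{E}\int_0^t\left[h'(X(s))\,b(X(s)) + \tfrac{1}{2}h''(X(s))\,\sigma^2(X(s))\right]ds,$$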
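or equivalently,
$$\int_0^\infty h(y)\,p(t,y)\,dy = h(X(0)) + \int_0^t\!\!\int_0^\infty h'(y)\,b(y)\,p(s,y)\,dy\,ds + \frac{1}{2}\int_0^t\!\!\int_0^\infty h''(y)\,\sigma^2(y)\,p(s,y)\,dy\,ds.$$
Differentiate with respect to $t$ to get
$$\int_0^\infty h(y)\,p_t(t,y)\,dy = \int_0^\infty h'(y)\,b(y)\,p(t,y)\,dy + \frac{1}{2}\int_0^\infty h''(y)\,\sigma^2(y)\,p(t,y)\,dy.$$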
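Integration by parts yields
$$\int_0^\infty h'(y)\,b(y)\,p(t,y)\,dy = \underbrace{h(y)\,b(y)\,p(t,y)\Big|_{y=0}^{y=\infty}}_{=0} - \int_0^\infty h(y)\,\frac{\partial}{\partial y}\bigl(b(y)\,p(t,y)\bigr)\,dy,$$
$$\int_0^\infty h''(y)\,\sigma^2(y)\,p(t,y)\,dy = \underbrace{h'(y)\,\sigma^2(y)\,p(t,y)\Big|_{y=0}^{y=\infty}}_{=0} - \int_0^\infty h'(y)\,\frac{\partial}{\partial y}\bigl(\sigma^2(y)\,p(t,y)\bigr)\,dy$$
$$= \underbrace{-h(y)\,\frac{\partial}{\partial y}\bigl(\sigma^2(y)\,p(t,y)\bigr)\Big|_{y=0}^{y=\infty}}_{=0} + \int_0^\infty h(y)\,\frac{\partial^2}{\partial y^2}\bigl(\sigma^2(y)\,p(t,y)\bigr)\,dy.$$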
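Therefore,
$$\int_0^\infty h(y)\,p_t(t,y)\,dy = -\int_0^\infty h(y)\,\frac{\partial}{\partial y}\bigl(b(y)\,p(t,y)\bigr)\,dy + \frac{1}{2}\int_0^\infty h(y)\,\frac{\partial^2}{\partial y^2}\bigl(\sigma^2(y)\,p(t,y)\bigr)\,dy,$$
or equivalently,
$$\int_0^\infty h(y)\left[p_t(t,y) + \frac{\partial}{\partial y}\bigl(b(y)\,p(t,y)\bigr) - \frac{1}{2}\frac{\partial^2}{\partial y^2}\bigl(\sigma^2(y)\,p(t,y)\bigr)\right]dy = 0.$$
This last equation holds for every function $h$ of the form in Figure 31.2. It implies that
$$p_t(t,y) + \frac{\partial}{\partial y}\bigl(b(y)\,p(t,y)\bigr) - \frac{1}{2}\frac{\partial^2}{\partial y^2}\bigl(\sigma^2(y)\,p(t,y)\bigr) = 0. \qquad \text{(KFE)}$$
If there were a place where (KFE) did not hold, then we could take $h(y) > 0$ at that and nearby points, but take $h$ to be zero elsewhere, and we would obtain
$$\int_0^\infty h\left[p_t + \frac{\partial}{\partial y}(bp) - \frac{1}{2}\frac{\partial^2}{\partial y^2}(\sigma^2 p)\right]dy \ne 0.$$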
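If the process $X(t)$ has an equilibrium density, it will be
$$p(y) = \lim_{t\to\infty} p(t,y).$$
In order for this limit to exist, we must have
$$0 = \lim_{t\to\infty} p_t(t,y).$$
Letting $t \to \infty$ in (KFE), we obtain the equilibrium Kolmogorov forward equation
$$\frac{\partial}{\partial y}\bigl(b(y)\,p(y)\bigr) - \frac{1}{2}\frac{\partial^2}{\partial y^2}\bigl(\sigma^2(y)\,p(y)\bigr) = 0.$$
When an equilibrium density exists, it is the unique solution to this equation satisfying
$$p(y) \ge 0 \quad \forall y \ge 0, \qquad \int_0^\infty p(y)\,dy = 1.$$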
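31.3 Cox-Ingersoll-Ross equilibrium density
We computed this to be
$$p(r) = C\,r^{\frac{2\alpha-\sigma^2}{\sigma^2}}\,e^{-\frac{2\beta}{\sigma^2}r},$$
where
$$C = \left(\frac{2\beta}{\sigma^2}\right)^{\frac{2\alpha}{\sigma^2}}\frac{1}{\Gamma\!\left(\frac{2\alpha}{\sigma^2}\right)}.$$
We compute
$$p'(r) = \frac{2\alpha-\sigma^2}{\sigma^2}\cdot\frac{p(r)}{r} - \frac{2\beta}{\sigma^2}\,p(r) = \frac{2}{\sigma^2 r}\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right)p(r),$$
$$p''(r) = -\frac{2}{\sigma^2 r^2}\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right)p(r) + \frac{2}{\sigma^2 r}(-\beta)\,p(r) + \frac{2}{\sigma^2 r}\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right)p'(r)$$
$$= \frac{2}{\sigma^2 r}\left[-\frac{1}{r}\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right) - \beta + \frac{2}{\sigma^2 r}\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right)^2\right]p(r).$$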
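We want to verify the equilibrium Kolmogorov forward equation for the CIR process:
$$\frac{\partial}{\partial r}\bigl((\alpha - \beta r)\,p(r)\bigr) - \frac{1}{2}\frac{\partial^2}{\partial r^2}\bigl(\sigma^2 r\,p(r)\bigr) = 0. \qquad \text{(EKFE)}$$
Now
$$\frac{\partial}{\partial r}\bigl((\alpha - \beta r)\,p(r)\bigr) = -\beta\,p(r) + (\alpha - \beta r)\,p'(r),$$
$$\frac{\partial^2}{\partial r^2}\bigl(\sigma^2 r\,p(r)\bigr) = \frac{\partial}{\partial r}\bigl(\sigma^2 p(r) + \sigma^2 r\,p'(r)\bigr) = 2\sigma^2 p'(r) + \sigma^2 r\,p''(r).$$
The LHS of (EKFE) becomes
$$-\beta\,p(r) + (\alpha - \beta r)\,p'(r) - \sigma^2 p'(r) - \tfrac{1}{2}\sigma^2 r\,p''(r)$$
$$= p(r)\left[-\beta + (\alpha - \beta r - \sigma^2)\frac{2}{\sigma^2 r}\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right) + \frac{1}{r}\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right) + \beta - \frac{2}{\sigma^2 r}\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right)^2\right]$$
$$= p(r)\left[\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right)\frac{2}{\sigma^2 r}\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right) - \tfrac{1}{2}\sigma^2\cdot\frac{2}{\sigma^2 r}\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right) + \frac{1}{r}\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right) - \frac{2}{\sigma^2 r}\left(\alpha - \tfrac{1}{2}\sigma^2 - \beta r\right)^2\right]$$
$$= 0,$$
as expected.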
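The same verification can be repeated numerically. The sketch below evaluates the left-hand side of the equilibrium forward equation for the CIR drift $\alpha - \beta r$ and squared diffusion $\sigma^2 r$ by central finite differences, using the gamma form of the equilibrium density. The parameter values, step size and evaluation points are illustrative assumptions.

```python
import math

# Hypothetical CIR parameters (not taken from the notes).
alpha, beta, sigma = 0.06, 0.5, 0.2

def p(r):
    """Equilibrium density C * r^((2a - s^2)/s^2) * exp(-2 b r / s^2), r > 0."""
    shape = 2.0 * alpha / sigma**2
    rate = 2.0 * beta / sigma**2
    return math.exp(shape * math.log(rate) - math.lgamma(shape)
                    + (shape - 1.0) * math.log(r) - rate * r)

def ekfe_residual(r, h=1e-4):
    """d/dr[(alpha - beta r) p] - 0.5 d^2/dr^2[sigma^2 r p], by central differences."""
    drift_term = lambda x: (alpha - beta * x) * p(x)
    diffusion_term = lambda x: sigma**2 * x * p(x)
    first = (drift_term(r + h) - drift_term(r - h)) / (2.0 * h)
    second = (diffusion_term(r + h) - 2.0 * diffusion_term(r)
              + diffusion_term(r - h)) / h**2
    return first - 0.5 * second

residuals = [ekfe_residual(r) for r in (0.05, 0.1, 0.2, 0.4)]
print(residuals)
```

Each residual should be zero up to discretization error, while the individual terms being cancelled are of order one.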
31.4 Bond prices in the CIR model
The interest rate process $r(t)$ is given by
$$dr(t) = \bigl(\alpha - \beta r(t)\bigr)\,dt + \sigma\sqrt{r(t)}\,dW(t),$$
where $r(0)$ is given. The bond price process is
$$B(t,T) = \mathbb{E}\left[\exp\left\{-\int_t^T r(u)\,du\right\} \Big|\, \mathcal{F}(t)\right].$$
Because
$$\exp\left\{-\int_0^t r(u)\,du\right\} B(t,T) = \mathbb{E}\left[\exp\left\{-\int_0^T r(u)\,du\right\} \Big|\, \mathcal{F}(t)\right],$$
the tower property implies that this is a martingale. The Markov property implies that $B(t,T)$ is random only through a dependence on $r(t)$. Thus, there is a function $B(r,t,T)$ of the three dummy variables $r, t, T$ such that the process $B(t,T)$ is the function $B(r,t,T)$ evaluated at $r(t), t, T$, i.e.,
$$B(t,T) = B(r(t),t,T).$$
Because $\exp\left\{-\int_0^t r(u)\,du\right\} B(r(t),t,T)$ is a martingale, its differential has no $dt$ term. We compute
$$d\left(\exp\left\{-\int_0^t r(u)\,du\right\} B(r(t),t,T)\right) = \exp\left\{-\int_0^t r(u)\,du\right\}\Bigl[-r(t)B(r(t),t,T)\,dt + B_r(r(t),t,T)\,dr(t) + \tfrac{1}{2}B_{rr}(r(t),t,T)\,dr(t)\,dr(t) + B_t(r(t),t,T)\,dt\Bigr].$$
The expression in $[\ldots]$ equals
$$-rB\,dt + B_r(\alpha - \beta r)\,dt + B_r\sigma\sqrt{r}\,dW + \tfrac{1}{2}B_{rr}\sigma^2 r\,dt + B_t\,dt.$$
Setting the $dt$ term to zero, we obtain the partial differential equation
$$-rB(r,t,T) + B_t(r,t,T) + (\alpha - \beta r)B_r(r,t,T) + \tfrac{1}{2}\sigma^2 r\,B_{rr}(r,t,T) = 0, \qquad 0 \le t < T,\ r \ge 0. \qquad (4.1)$$
The terminal condition is
$$B(r,T,T) = 1, \qquad r \ge 0.$$
Surprisingly, this equation has a closed form solution. Using the Hull & White model as a guide, we look for a solution of the form
$$B(r,t,T) = e^{-rC(t,T) - A(t,T)},$$
where $C(T,T) = 0$, $A(T,T) = 0$. Then we have
$$B_t = (-rC_t - A_t)B, \qquad B_r = -CB, \qquad B_{rr} = C^2 B,$$
and the partial differential equation becomes
$$0 = -rB + (-rC_t - A_t)B - (\alpha - \beta r)CB + \tfrac{1}{2}\sigma^2 rC^2 B$$
$$= rB\left(-1 - C_t + \beta C + \tfrac{1}{2}\sigma^2 C^2\right) - B(A_t + \alpha C).$$
We first solve the ordinary differential equation
$$-1 - C_t(t,T) + \beta C(t,T) + \tfrac{1}{2}\sigma^2 C^2(t,T) = 0, \qquad C(T,T) = 0,$$
and then set
$$A(t,T) = \alpha\int_t^T C(u,T)\,du,$$
so $A(T,T) = 0$ and
$$A_t(t,T) = -\alpha C(t,T).$$
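It is tedious but straightforward to check that the solutions are given by
$$C(t,T) = \frac{\sinh(\gamma(T-t))}{\gamma\cosh(\gamma(T-t)) + \frac{1}{2}\beta\sinh(\gamma(T-t))},$$
$$A(t,T) = -\frac{2\alpha}{\sigma^2}\log\left[\frac{\gamma e^{\frac{1}{2}\beta(T-t)}}{\gamma\cosh(\gamma(T-t)) + \frac{1}{2}\beta\sinh(\gamma(T-t))}\right],$$
where
$$\gamma = \tfrac{1}{2}\sqrt{\beta^2 + 2\sigma^2}, \qquad \sinh u = \frac{e^u - e^{-u}}{2}, \qquad \cosh u = \frac{e^u + e^{-u}}{2}.$$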
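As a rough illustration of these formulas, the sketch below codes $C$ and $A$ as functions of the time to maturity $\tau = T - t$ and checks by finite differences that they satisfy $C' = 1 - \beta C - \tfrac{1}{2}\sigma^2 C^2$ and $A' = \alpha C$ (the two ordinary differential equations rewritten in $\tau$). Function names and parameter values are assumptions made for the example.

```python
import math

def cir_C(tau, beta, sigma):
    """C as a function of time to maturity tau = T - t."""
    g = 0.5 * math.sqrt(beta**2 + 2.0 * sigma**2)
    return math.sinh(g * tau) / (g * math.cosh(g * tau)
                                 + 0.5 * beta * math.sinh(g * tau))

def cir_A(tau, alpha, beta, sigma):
    """A as a function of time to maturity tau = T - t."""
    g = 0.5 * math.sqrt(beta**2 + 2.0 * sigma**2)
    denom = g * math.cosh(g * tau) + 0.5 * beta * math.sinh(g * tau)
    return -(2.0 * alpha / sigma**2) * math.log(g * math.exp(0.5 * beta * tau) / denom)

def cir_bond_price(r, tau, alpha, beta, sigma):
    """Zero-coupon bond price exp(-r C - A)."""
    return math.exp(-r * cir_C(tau, beta, sigma) - cir_A(tau, alpha, beta, sigma))

# Hypothetical parameter values.
alpha, beta, sigma = 0.06, 0.5, 0.2
tau, h = 3.0, 1e-5

# Finite-difference check of the two ODEs in the variable tau.
c = cir_C(tau, beta, sigma)
dC = (cir_C(tau + h, beta, sigma) - cir_C(tau - h, beta, sigma)) / (2.0 * h)
dA = (cir_A(tau + h, alpha, beta, sigma) - cir_A(tau - h, alpha, beta, sigma)) / (2.0 * h)
ode_residual_C = dC - (1.0 - beta * c - 0.5 * sigma**2 * c**2)
ode_residual_A = dA - alpha * c

price = cir_bond_price(0.05, tau, alpha, beta, sigma)
print(price, ode_residual_C, ode_residual_A)
```

For very long maturities `cosh` and `sinh` overflow in floating point; a production version would rescale by $e^{-\gamma\tau}$, which this sketch omits.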
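Thus in the CIR model, we have
$$\mathbb{E}\left[\exp\left\{-\int_t^T r(u)\,du\right\} \Big|\, \mathcal{F}(t)\right] = B(r(t),t,T),$$
where
$$B(r,t,T) = \exp\{-rC(t,T) - A(t,T)\}, \qquad 0 \le t < T,\ r \ge 0,$$
and $C(t,T)$ and $A(t,T)$ are given by the formulas above. Because the coefficients in
$$dr(t) = \bigl(\alpha - \beta r(t)\bigr)\,dt + \sigma\sqrt{r(t)}\,dW(t)$$
do not depend on $t$, the function $B(r,t,T)$ depends on $t$ and $T$ only through their difference $\tau = T - t$. Similarly, $C(t,T)$ and $A(t,T)$ are functions of $\tau = T - t$. We write $B(r,\tau)$ instead of $B(r,t,T)$, and we have
$$B(r,\tau) = \exp\{-rC(\tau) - A(\tau)\}, \qquad \tau \ge 0,\ r \ge 0,$$
where
$$C(\tau) = \frac{\sinh(\gamma\tau)}{\gamma\cosh(\gamma\tau) + \frac{1}{2}\beta\sinh(\gamma\tau)}, \qquad A(\tau) = -\frac{2\alpha}{\sigma^2}\log\left[\frac{\gamma e^{\frac{1}{2}\beta\tau}}{\gamma\cosh(\gamma\tau) + \frac{1}{2}\beta\sinh(\gamma\tau)}\right], \qquad \gamma = \tfrac{1}{2}\sqrt{\beta^2 + 2\sigma^2}.$$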
We have
$$B(r(0),T) = \mathbb{E}\exp\left\{-\int_0^T r(u)\,du\right\}.$$
Now $r(u) > 0$ for each $u$, almost surely, so $B(r(0),T)$ is strictly decreasing in $T$. Moreover,
$$B(r(0),0) = 1, \qquad \lim_{T\to\infty} B(r(0),T) = \mathbb{E}\exp\left\{-\int_0^\infty r(u)\,du\right\} = 0.$$
But also,
$$B(r(0),T) = \exp\{-r(0)C(T) - A(T)\},$$
so
$$r(0)C(0) + A(0) = 0, \qquad \lim_{T\to\infty}\bigl[r(0)C(T) + A(T)\bigr] = \infty,$$
and $r(0)C(T) + A(T)$ is strictly increasing in $T$.
31.5 Option on a bond
The value at time $t$ of an option on a bond in the CIR model is
$$v(t,r(t)) = \mathbb{E}\left[\exp\left\{-\int_t^{T_1} r(u)\,du\right\}\bigl(B(T_1,T_2) - K\bigr)^+ \Big|\, \mathcal{F}(t)\right],$$
where $T_1$ is the expiration time of the option, $T_2$ is the maturity time of the bond, and $0 \le t \le T_1 \le T_2$. As usual, $\exp\left\{-\int_0^t r(u)\,du\right\} v(t,r(t))$ is a martingale, and this leads to the partial differential equation
$$-rv + v_t + (\alpha - \beta r)v_r + \tfrac{1}{2}\sigma^2 r\,v_{rr} = 0, \qquad 0 \le t < T_1,\ r \ge 0$$
(where $v = v(t,r)$). The terminal condition is
$$v(T_1,r) = \bigl(B(r,T_1,T_2) - K\bigr)^+, \qquad r \ge 0.$$
Other European derivative securities on the bond are priced using the same partial differential equation with the terminal condition appropriate for the particular security.
31.6 Deterministic time change of CIR model
Process time scale: In this time scale, the interest rate $r(t)$ is given by the constant coefficient CIR equation
$$dr(t) = \bigl(\alpha - \beta r(t)\bigr)\,dt + \sigma\sqrt{r(t)}\,dW(t).$$
Real time scale: In this time scale, the interest rate $\hat r(\hat t)$ is given by a time-dependent CIR equation
$$d\hat r(\hat t) = \bigl(\hat\alpha(\hat t) - \hat\beta(\hat t)\,\hat r(\hat t)\bigr)\,d\hat t + \hat\sigma(\hat t)\sqrt{\hat r(\hat t)}\,d\hat W(\hat t).$$
Figure 31.3: Time change function. (Horizontal axis: real time $\hat t$. Vertical axis: process time $t$. The curve $t = \varphi(\hat t)$ is steeper over a period of high interest rate volatility.)
There is a strictly increasing time change function $t = \varphi(\hat t)$ which relates the two time scales (see Fig. 31.3).
Let $\hat B(\hat r,\hat t,\hat T)$ denote the price at real time $\hat t$ of a bond with maturity $\hat T$ when the interest rate at time $\hat t$ is $\hat r$. We want to set things up so
$$\hat B(\hat r,\hat t,\hat T) = B(r,t,T) = e^{-rC(t,T) - A(t,T)},$$
where $t = \varphi(\hat t)$, $T = \varphi(\hat T)$, and $C(t,T)$ and $A(t,T)$ are as defined previously.
We need to determine the relationship between $\hat r$ and $r$. We have
$$B(r(0),0,T) = \mathbb{E}\exp\left\{-\int_0^T r(t)\,dt\right\}, \qquad \hat B(\hat r(0),0,\hat T) = \mathbb{E}\exp\left\{-\int_0^{\hat T}\hat r(\hat t)\,d\hat t\right\}.$$
With $T = \varphi(\hat T)$, make the change of variable $t = \varphi(\hat t)$, $dt = \varphi'(\hat t)\,d\hat t$ in the first integral to get
$$B(r(0),0,T) = \mathbb{E}\exp\left\{-\int_0^{\hat T} r(\varphi(\hat t))\,\varphi'(\hat t)\,d\hat t\right\},$$
and this will be $\hat B(\hat r(0),0,\hat T)$ if we set
$$\hat r(\hat t) = r(\varphi(\hat t))\,\varphi'(\hat t).$$
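31.7 Calibration
$$\hat B(\hat r(\hat t),\hat t,\hat T) = B\left(\frac{\hat r(\hat t)}{\varphi'(\hat t)},\ \varphi(\hat t),\ \varphi(\hat T)\right) = \exp\left\{-\hat r(\hat t)\,\frac{C(\varphi(\hat t),\varphi(\hat T))}{\varphi'(\hat t)} - A(\varphi(\hat t),\varphi(\hat T))\right\} = \exp\left\{-\hat r(\hat t)\,\hat C(\hat t,\hat T) - \hat A(\hat t,\hat T)\right\},$$
where
$$\hat C(\hat t,\hat T) = \frac{C(\varphi(\hat t),\varphi(\hat T))}{\varphi'(\hat t)}, \qquad \hat A(\hat t,\hat T) = A(\varphi(\hat t),\varphi(\hat T))$$
depend on $\hat t$ and $\hat T$ separately, not only through the difference $\hat T - \hat t$, since, in the real time scale, the model coefficients are time dependent.
Suppose we know $\hat r(0)$ and $\hat B(\hat r(0),0,\hat T)$ for all $\hat T \in [0,\hat T^*]$. We calibrate by writing the equation
$$\hat B(\hat r(0),0,\hat T) = \exp\left\{-\hat r(0)\,\hat C(0,\hat T) - \hat A(0,\hat T)\right\},$$
or equivalently,
$$-\log\hat B(\hat r(0),0,\hat T) = \frac{\hat r(0)}{\varphi'(0)}\,C(\varphi(0),\varphi(\hat T)) + A(\varphi(0),\varphi(\hat T)).$$
Take $\alpha$, $\beta$ and $\sigma$ so that the equilibrium distribution of $r(t)$ seems reasonable. These values determine the functions $C$, $A$. Take $\varphi'(0) = 1$ (we justify this in the next section). For each $\hat T$, solve the equation for $\varphi(\hat T)$:
$$-\log\hat B(\hat r(0),0,\hat T) = \hat r(0)\,C(0,\varphi(\hat T)) + A(0,\varphi(\hat T)). \qquad (*)$$
The right-hand side of this equation is increasing in the $\varphi(\hat T)$ variable, starting at 0 at time 0 and having limit $\infty$ at $\infty$, i.e.,
$$\hat r(0)\,C(0,0) + A(0,0) = 0, \qquad \lim_{T\to\infty}\bigl[\hat r(0)\,C(0,T) + A(0,T)\bigr] = \infty.$$
Since $0 \le -\log\hat B(\hat r(0),0,\hat T) < \infty$, (*) has a unique solution for each $\hat T$. For $\hat T = 0$, this solution is $\varphi(0) = 0$. If $\hat T_1 < \hat T_2$, then
$$-\log\hat B(\hat r(0),0,\hat T_1) < -\log\hat B(\hat r(0),0,\hat T_2),$$
so $\varphi(\hat T_1) < \varphi(\hat T_2)$. Thus $\varphi$ is a strictly increasing time-change function with the right properties.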
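Solving equation (*) for the time change at each real-time maturity is a one-dimensional root-finding problem with a monotone right-hand side, so bisection suffices. The sketch below is one possible implementation under stated assumptions: the CIR parameters, the initial rate `R0_HAT`, and the "market" curve (generated here from a made-up time change `phi_true` with unit slope at zero) are all hypothetical and serve only to show that the procedure recovers the time change.

```python
import math

# Hypothetical process-time CIR parameters and observed initial rate.
ALPHA, BETA, SIGMA = 0.06, 0.5, 0.2
R0_HAT = 0.05
GAMMA = 0.5 * math.sqrt(BETA**2 + 2.0 * SIGMA**2)

def C(T):
    return math.sinh(GAMMA * T) / (GAMMA * math.cosh(GAMMA * T)
                                   + 0.5 * BETA * math.sinh(GAMMA * T))

def A(T):
    denom = GAMMA * math.cosh(GAMMA * T) + 0.5 * BETA * math.sinh(GAMMA * T)
    return -(2.0 * ALPHA / SIGMA**2) * math.log(GAMMA * math.exp(0.5 * BETA * T) / denom)

def solve_phi(neg_log_price, r0_hat, tol=1e-12):
    """Find T >= 0 with r0_hat*C(T) + A(T) = neg_log_price by bisection."""
    if neg_log_price <= 0.0:
        return 0.0
    g = lambda T: r0_hat * C(T) + A(T) - neg_log_price
    lo, hi = 0.0, 1.0
    while g(hi) < 0.0:          # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Synthetic "market" data from an assumed time change with phi'(0) = 1.
phi_true = lambda t_hat: t_hat + 0.3 * t_hat**2
market_neg_log_price = lambda t_hat: R0_HAT * C(phi_true(t_hat)) + A(phi_true(t_hat))

for t_hat in (0.5, 1.0, 2.0, 5.0):
    print(t_hat, solve_phi(market_neg_log_price(t_hat), R0_HAT), phi_true(t_hat))
```

With real data, `market_neg_log_price` would be replaced by minus the logarithm of observed zero-coupon prices.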
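31.8 Tracking down $\varphi'(0)$ in the time change of the CIR model
Result for general term structure models:
$$-\frac{\partial}{\partial T}\log B(0,T)\Big|_{T=0} = r(0).$$
Justification:
$$B(0,T) = \mathbb{E}\exp\left\{-\int_0^T r(u)\,du\right\},$$
$$-\log B(0,T) = -\log\mathbb{E}\exp\left\{-\int_0^T r(u)\,du\right\},$$
$$-\frac{\partial}{\partial T}\log B(0,T) = \frac{\mathbb{E}\left[r(T)\,e^{-\int_0^T r(u)\,du}\right]}{\mathbb{E}\,e^{-\int_0^T r(u)\,du}},$$
$$-\frac{\partial}{\partial T}\log B(0,T)\Big|_{T=0} = r(0).$$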
In the real time scale associated with the calibration of CIR by time change, we write the bond price as
$$\hat B(\hat r(0),0,\hat T),$$
thereby indicating explicitly the initial interest rate. The above says that
$$-\frac{\partial}{\partial\hat T}\log\hat B(\hat r(0),0,\hat T)\Big|_{\hat T=0} = \hat r(0).$$
The calibration of CIR by time change requires that we find a strictly increasing function $\varphi$ with $\varphi(0) = 0$ such that
$$-\log\hat B(\hat r(0),0,\hat T) = \frac{1}{\varphi'(0)}\,\hat r(0)\,C(\varphi(\hat T)) + A(\varphi(\hat T)), \qquad \hat T \ge 0, \qquad \text{(cal)}$$
where $\hat B(\hat r(0),0,\hat T)$, determined by market data, is strictly decreasing in $\hat T$, starts at 1 when $\hat T = 0$, and goes to zero as $\hat T \to \infty$. Therefore, $-\log\hat B(\hat r(0),0,\hat T)$ is as shown in Fig. 31.4.
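Consider the function
$$\hat r(0)\,C(T) + A(T).$$
Here $C(T)$ and $A(T)$ are given by
$$C(T) = \frac{\sinh(\gamma T)}{\gamma\cosh(\gamma T) + \frac{1}{2}\beta\sinh(\gamma T)}, \qquad A(T) = -\frac{2\alpha}{\sigma^2}\log\left[\frac{\gamma e^{\frac{1}{2}\beta T}}{\gamma\cosh(\gamma T) + \frac{1}{2}\beta\sinh(\gamma T)}\right], \qquad \gamma = \tfrac{1}{2}\sqrt{\beta^2 + 2\sigma^2}.$$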
Figure 31.4: Bond price in CIR model. (Plot of $-\log\hat B(\hat r(0),0,\hat T)$ against $\hat T$: strictly increasing, goes to $\infty$.)
Figure 31.5: Calibration. (Plot of $\hat r(0)\,C(T) + A(T)$ against $T$; the level $-\log\hat B(\hat r(0),0,\hat T)$ is attained at $T = \varphi(\hat T)$.)
The function $\hat r(0)\,C(T) + A(T)$ is zero at $T = 0$, is strictly increasing in $T$, and goes to $\infty$ as $T \to \infty$. This is because the interest rate is positive in the CIR model (see last paragraph of Section 31.4).
To solve (cal), let us first consider the related equation
$$-\log\hat B(\hat r(0),0,\hat T) = \hat r(0)\,C(\varphi(\hat T)) + A(\varphi(\hat T)). \qquad \text{(cal')}$$
Fix $\hat T$ and define $\varphi(\hat T)$ to be the unique $T$ for which (see Fig. 31.5)
$$-\log\hat B(\hat r(0),0,\hat T) = \hat r(0)\,C(T) + A(T).$$
If $\hat T = 0$, then $\varphi(\hat T) = 0$. If $\hat T_1 < \hat T_2$, then $\varphi(\hat T_1) < \varphi(\hat T_2)$. As $\hat T \to \infty$, $\varphi(\hat T) \to \infty$. We have thus defined a time-change function $\varphi$ which has all the right properties, except it satisfies (cal') rather than (cal).
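We conclude by showing that $\varphi'(0) = 1$, so $\varphi$ also satisfies (cal). From (cal') we compute
$$\hat r(0) = -\frac{\partial}{\partial\hat T}\log\hat B(\hat r(0),0,\hat T)\Big|_{\hat T=0} = \hat r(0)\,C'(\varphi(0))\,\varphi'(0) + A'(\varphi(0))\,\varphi'(0) = \hat r(0)\,C'(0)\,\varphi'(0) + A'(0)\,\varphi'(0).$$
We show in a moment that $C'(0) = 1$, $A'(0) = 0$, so we have
$$\hat r(0) = \hat r(0)\,\varphi'(0).$$
Note that $\hat r(0)$ is the initial interest rate, observed in the market, and is strictly positive. Dividing by $\hat r(0)$, we obtain
$$\varphi'(0) = 1.$$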
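Computation of $C'(0)$:
$$C'(\tau) = \frac{1}{\left(\gamma\cosh(\gamma\tau) + \frac{1}{2}\beta\sinh(\gamma\tau)\right)^2}\left[\gamma\cosh(\gamma\tau)\left(\gamma\cosh(\gamma\tau) + \tfrac{1}{2}\beta\sinh(\gamma\tau)\right) - \sinh(\gamma\tau)\left(\gamma^2\sinh(\gamma\tau) + \tfrac{1}{2}\beta\gamma\cosh(\gamma\tau)\right)\right],$$
$$C'(0) = \frac{1}{\gamma^2}\left[\gamma(\gamma + 0) - 0\left(0 + \tfrac{1}{2}\beta\gamma\right)\right] = 1.$$
Computation of $A'(0)$:
$$A'(\tau) = -\frac{2\alpha}{\sigma^2}\left[\frac{\gamma\cosh(\gamma\tau) + \frac{1}{2}\beta\sinh(\gamma\tau)}{\gamma e^{\beta\tau/2}}\right]\left[\frac{\frac{1}{2}\beta\gamma e^{\beta\tau/2}\left(\gamma\cosh(\gamma\tau) + \frac{1}{2}\beta\sinh(\gamma\tau)\right) - \gamma e^{\beta\tau/2}\left(\gamma^2\sinh(\gamma\tau) + \frac{1}{2}\beta\gamma\cosh(\gamma\tau)\right)}{\left(\gamma\cosh(\gamma\tau) + \frac{1}{2}\beta\sinh(\gamma\tau)\right)^2}\right],$$
$$A'(0) = -\frac{2\alpha}{\sigma^2}\left[\frac{\gamma + 0}{\gamma}\right]\left[\frac{\frac{1}{2}\beta\gamma(\gamma + 0) - \gamma\left(0 + \frac{1}{2}\beta\gamma\right)}{(\gamma + 0)^2}\right] = -\frac{2\alpha}{\sigma^2}\cdot\frac{1}{\gamma^2}\left[\tfrac{1}{2}\beta\gamma^2 - \tfrac{1}{2}\beta\gamma^2\right] = 0.$$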
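Chapter 32
A two-factor model (Duffie & Kan)
Let us define:
$X_1(t)$ = Interest rate at time $t$
$X_2(t)$ = Yield at time $t$ on a bond maturing at time $t + \tau_0$
Let $X_1(0) > 0$, $X_2(0) > 0$ be given, and let $X_1(t)$ and $X_2(t)$ be given by the coupled stochastic differential equations
$$dX_1(t) = \bigl(a_{11}X_1(t) + a_{12}X_2(t) + b_1\bigr)\,dt + \sigma_1\sqrt{\beta_1 X_1(t) + \beta_2 X_2(t) + \alpha}\;dW_1(t), \qquad \text{(SDE1)}$$
$$dX_2(t) = \bigl(a_{21}X_1(t) + a_{22}X_2(t) + b_2\bigr)\,dt + \sigma_2\sqrt{\beta_1 X_1(t) + \beta_2 X_2(t) + \alpha}\;\Bigl(\rho\,dW_1(t) + \sqrt{1-\rho^2}\,dW_2(t)\Bigr), \qquad \text{(SDE2)}$$
where $W_1$ and $W_2$ are independent Brownian motions. To simplify notation, we define
$$Y(t) \triangleq \beta_1 X_1(t) + \beta_2 X_2(t) + \alpha, \qquad W_3(t) \triangleq \rho W_1(t) + \sqrt{1-\rho^2}\,W_2(t).$$
Then $W_3$ is a Brownian motion with
$$dW_1(t)\,dW_3(t) = \rho\,dt,$$
and
$$dX_1\,dX_1 = \sigma_1^2 Y\,dt, \qquad dX_2\,dX_2 = \sigma_2^2 Y\,dt, \qquad dX_1\,dX_2 = \rho\sigma_1\sigma_2 Y\,dt.$$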
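32.1 Non-negativity of $Y$
$$dY = \beta_1\,dX_1 + \beta_2\,dX_2$$
$$= (\beta_1 a_{11}X_1 + \beta_1 a_{12}X_2 + \beta_1 b_1)\,dt + (\beta_2 a_{21}X_1 + \beta_2 a_{22}X_2 + \beta_2 b_2)\,dt + \sqrt{Y}\left(\beta_1\sigma_1\,dW_1 + \beta_2\rho\sigma_2\,dW_1 + \beta_2\sqrt{1-\rho^2}\,\sigma_2\,dW_2\right)$$
$$= \bigl[(\beta_1 a_{11} + \beta_2 a_{21})X_1 + (\beta_1 a_{12} + \beta_2 a_{22})X_2\bigr]\,dt + (\beta_1 b_1 + \beta_2 b_2)\,dt + \left(\beta_1^2\sigma_1^2 + 2\beta_1\beta_2\rho\sigma_1\sigma_2 + \beta_2^2\sigma_2^2\right)^{1/2}\sqrt{Y(t)}\,dW_4(t),$$
where
$$W_4(t) = \frac{(\beta_1\sigma_1 + \beta_2\rho\sigma_2)\,W_1(t) + \beta_2\sqrt{1-\rho^2}\,\sigma_2\,W_2(t)}{\sqrt{\beta_1^2\sigma_1^2 + 2\beta_1\beta_2\rho\sigma_1\sigma_2 + \beta_2^2\sigma_2^2}}$$
is a Brownian motion. We shall choose the parameters so that:
Assumption 1: For some $\gamma$,
$$\beta_1 a_{11} + \beta_2 a_{21} = \gamma\beta_1, \qquad \beta_1 a_{12} + \beta_2 a_{22} = \gamma\beta_2.$$
Then
$$dY = \gamma(\beta_1 X_1 + \beta_2 X_2 + \alpha)\,dt + (\beta_1 b_1 + \beta_2 b_2 - \gamma\alpha)\,dt + \left(\beta_1^2\sigma_1^2 + 2\beta_1\beta_2\rho\sigma_1\sigma_2 + \beta_2^2\sigma_2^2\right)^{1/2}\sqrt{Y}\,dW_4$$
$$= \gamma Y\,dt + (\beta_1 b_1 + \beta_2 b_2 - \gamma\alpha)\,dt + \left(\beta_1^2\sigma_1^2 + 2\beta_1\beta_2\rho\sigma_1\sigma_2 + \beta_2^2\sigma_2^2\right)^{1/2}\sqrt{Y}\,dW_4.$$
From our discussion of the CIR process, we recall that $Y$ will stay strictly positive provided that:
Assumption 2: $Y(0) = \beta_1 X_1(0) + \beta_2 X_2(0) + \alpha > 0$,
and
Assumption 3: $\beta_1 b_1 + \beta_2 b_2 - \gamma\alpha \ge \tfrac{1}{2}\left(\beta_1^2\sigma_1^2 + 2\beta_1\beta_2\rho\sigma_1\sigma_2 + \beta_2^2\sigma_2^2\right)$.
Under Assumptions 1, 2, and 3,
$$Y(t) > 0, \quad 0 \le t < \infty, \quad \text{almost surely},$$
and (SDE1) and (SDE2) make sense. These can be rewritten as
$$dX_1(t) = \bigl(a_{11}X_1(t) + a_{12}X_2(t) + b_1\bigr)\,dt + \sigma_1\sqrt{Y(t)}\,dW_1(t), \qquad \text{(SDE1')}$$
$$dX_2(t) = \bigl(a_{21}X_1(t) + a_{22}X_2(t) + b_2\bigr)\,dt + \sigma_2\sqrt{Y(t)}\,dW_3(t). \qquad \text{(SDE2')}$$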
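32.2 Zero-coupon bond prices
The value at time $t \le T$ of a zero-coupon bond paying \$1 at time $T$ is
$$B(t,T) = \mathbb{E}\left[\exp\left\{-\int_t^T X_1(u)\,du\right\} \Big|\, \mathcal{F}(t)\right].$$
Since the pair $(X_1, X_2)$ of processes is Markov, this is random only through a dependence on $X_1(t), X_2(t)$. Since the coefficients in (SDE1) and (SDE2) do not depend on time, the bond price depends on $t$ and $T$ only through their difference $\tau = T - t$. Thus, there is a function $B(x_1,x_2,\tau)$ of the dummy variables $x_1, x_2$ and $\tau$, so that
$$B(X_1(t),X_2(t),T-t) = \mathbb{E}\left[\exp\left\{-\int_t^T X_1(u)\,du\right\} \Big|\, \mathcal{F}(t)\right].$$
The usual tower property argument shows that
$$\exp\left\{-\int_0^t X_1(u)\,du\right\} B(X_1(t),X_2(t),T-t)$$
is a martingale. We compute its stochastic differential and set the $dt$ term equal to zero.
$$d\left(\exp\left\{-\int_0^t X_1(u)\,du\right\} B(X_1(t),X_2(t),T-t)\right)$$
$$= \exp\left\{-\int_0^t X_1(u)\,du\right\}\Bigl[-X_1 B\,dt + B_{x_1}\,dX_1 + B_{x_2}\,dX_2 - B_\tau\,dt + \tfrac{1}{2}B_{x_1x_1}\,dX_1\,dX_1 + B_{x_1x_2}\,dX_1\,dX_2 + \tfrac{1}{2}B_{x_2x_2}\,dX_2\,dX_2\Bigr]$$
$$= \exp\left\{-\int_0^t X_1(u)\,du\right\}\Bigl[\Bigl(-X_1 B + (a_{11}X_1 + a_{12}X_2 + b_1)B_{x_1} + (a_{21}X_1 + a_{22}X_2 + b_2)B_{x_2} - B_\tau + \tfrac{1}{2}\sigma_1^2 Y B_{x_1x_1} + \rho\sigma_1\sigma_2 Y B_{x_1x_2} + \tfrac{1}{2}\sigma_2^2 Y B_{x_2x_2}\Bigr)dt + \sigma_1\sqrt{Y}\,B_{x_1}\,dW_1 + \sigma_2\sqrt{Y}\,B_{x_2}\,dW_3\Bigr].$$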
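The partial differential equation for $B(x_1,x_2,\tau)$ is
$$-x_1 B - B_\tau + (a_{11}x_1 + a_{12}x_2 + b_1)B_{x_1} + (a_{21}x_1 + a_{22}x_2 + b_2)B_{x_2} + \tfrac{1}{2}\sigma_1^2(\beta_1 x_1 + \beta_2 x_2 + \alpha)B_{x_1x_1} + \rho\sigma_1\sigma_2(\beta_1 x_1 + \beta_2 x_2 + \alpha)B_{x_1x_2} + \tfrac{1}{2}\sigma_2^2(\beta_1 x_1 + \beta_2 x_2 + \alpha)B_{x_2x_2} = 0. \qquad \text{(PDE)}$$
We seek a solution of the form
$$B(x_1,x_2,\tau) = \exp\{-x_1 C_1(\tau) - x_2 C_2(\tau) - A(\tau)\},$$
valid for all $\tau \ge 0$ and all $x_1, x_2$ satisfying
$$\beta_1 x_1 + \beta_2 x_2 + \alpha > 0. \qquad (*)$$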
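We must have
$$B(x_1,x_2,0) = 1 \quad \forall x_1, x_2 \text{ satisfying } (*),$$
because $\tau = 0$ corresponds to $t = T$. This implies the initial conditions
$$C_1(0) = C_2(0) = A(0) = 0. \qquad \text{(IC)}$$
We want to find $C_1(\tau), C_2(\tau), A(\tau)$ for $\tau > 0$. We have
$$B_\tau = \bigl(-x_1 C_1'(\tau) - x_2 C_2'(\tau) - A'(\tau)\bigr)B, \qquad B_{x_1} = -C_1(\tau)B, \qquad B_{x_2} = -C_2(\tau)B,$$
$$B_{x_1x_1} = C_1^2(\tau)B, \qquad B_{x_1x_2} = C_1(\tau)C_2(\tau)B, \qquad B_{x_2x_2} = C_2^2(\tau)B.$$
(PDE) becomes
$$0 = B\Bigl[-x_1 + x_1 C_1' + x_2 C_2' + A' - (a_{11}x_1 + a_{12}x_2 + b_1)C_1 - (a_{21}x_1 + a_{22}x_2 + b_2)C_2 + \tfrac{1}{2}\sigma_1^2(\beta_1 x_1 + \beta_2 x_2 + \alpha)C_1^2 + \rho\sigma_1\sigma_2(\beta_1 x_1 + \beta_2 x_2 + \alpha)C_1 C_2 + \tfrac{1}{2}\sigma_2^2(\beta_1 x_1 + \beta_2 x_2 + \alpha)C_2^2\Bigr]$$
$$= x_1 B\Bigl[-1 + C_1' - a_{11}C_1 - a_{21}C_2 + \tfrac{1}{2}\sigma_1^2\beta_1 C_1^2 + \rho\sigma_1\sigma_2\beta_1 C_1 C_2 + \tfrac{1}{2}\sigma_2^2\beta_1 C_2^2\Bigr]$$
$$+\ x_2 B\Bigl[C_2' - a_{12}C_1 - a_{22}C_2 + \tfrac{1}{2}\sigma_1^2\beta_2 C_1^2 + \rho\sigma_1\sigma_2\beta_2 C_1 C_2 + \tfrac{1}{2}\sigma_2^2\beta_2 C_2^2\Bigr]$$
$$+\ B\Bigl[A' - b_1 C_1 - b_2 C_2 + \tfrac{1}{2}\sigma_1^2\alpha C_1^2 + \rho\sigma_1\sigma_2\alpha C_1 C_2 + \tfrac{1}{2}\sigma_2^2\alpha C_2^2\Bigr].$$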
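We get three equations:
$$C_1'(\tau) = 1 + a_{11}C_1 + a_{21}C_2 - \tfrac{1}{2}\sigma_1^2\beta_1 C_1^2 - \rho\sigma_1\sigma_2\beta_1 C_1 C_2 - \tfrac{1}{2}\sigma_2^2\beta_1 C_2^2, \qquad C_1(0) = 0; \qquad (1)$$
$$C_2'(\tau) = a_{12}C_1 + a_{22}C_2 - \tfrac{1}{2}\sigma_1^2\beta_2 C_1^2 - \rho\sigma_1\sigma_2\beta_2 C_1 C_2 - \tfrac{1}{2}\sigma_2^2\beta_2 C_2^2, \qquad C_2(0) = 0; \qquad (2)$$
$$A'(\tau) = b_1 C_1 + b_2 C_2 - \tfrac{1}{2}\sigma_1^2\alpha C_1^2 - \rho\sigma_1\sigma_2\alpha C_1 C_2 - \tfrac{1}{2}\sigma_2^2\alpha C_2^2, \qquad A(0) = 0. \qquad (3)$$
We first solve (1) and (2) simultaneously numerically, and then integrate (3) to obtain the function $A(\tau)$.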
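One way to carry out this numerical step is a classical fourth-order Runge-Kutta scheme applied to the three equations at once. The sketch below is only an illustration: the function name, argument layout and parameter values are assumptions. The sample call uses a degenerate parameter set ($\beta_2 = 0$, $\alpha = 0$, $a_{12} = a_{21} = a_{22} = 0$) in which equations (1) and (3) reduce to the one-factor CIR equations, so the output can be compared with the CIR closed form.

```python
import math

def duffie_kan_coefficients(tau, n_steps, a, b, beta, alpha, sigma, rho):
    """Integrate equations (1)-(3) from 0 to tau with RK4; returns (C1, C2, A).

    a = ((a11, a12), (a21, a22)), b = (b1, b2),
    beta = (beta1, beta2), sigma = (sigma1, sigma2).
    """
    (a11, a12), (a21, a22) = a
    b1, b2 = b
    beta1, beta2 = beta
    sigma1, sigma2 = sigma

    def rhs(y):
        c1, c2, _ = y
        # Quadratic form shared by all three right-hand sides.
        q = (0.5 * sigma1**2 * c1**2 + rho * sigma1 * sigma2 * c1 * c2
             + 0.5 * sigma2**2 * c2**2)
        return (1.0 + a11 * c1 + a21 * c2 - beta1 * q,
                a12 * c1 + a22 * c2 - beta2 * q,
                b1 * c1 + b2 * c2 - alpha * q)

    def shifted(y, k, step):
        return tuple(yi + step * ki for yi, ki in zip(y, k))

    y = (0.0, 0.0, 0.0)            # initial conditions (IC)
    h = tau / n_steps
    for _ in range(n_steps):
        k1 = rhs(y)
        k2 = rhs(shifted(y, k1, 0.5 * h))
        k3 = rhs(shifted(y, k2, 0.5 * h))
        k4 = rhs(shifted(y, k3, h))
        y = tuple(yi + h / 6.0 * (w1 + 2.0 * w2 + 2.0 * w3 + w4)
                  for yi, w1, w2, w3, w4 in zip(y, k1, k2, k3, k4))
    return y

# Degenerate, hypothetical parameters: reduces to CIR with
# alpha_CIR = 0.06, beta_CIR = 0.5, sigma_CIR = 0.2.
c1, c2, a_val = duffie_kan_coefficients(
    2.0, 2000,
    a=((-0.5, 0.0), (0.0, 0.0)), b=(0.06, 0.0),
    beta=(1.0, 0.0), alpha=0.0, sigma=(0.2, 0.1), rho=0.0)
print(c1, c2, a_val)
```

In a genuine two-factor calibration the parameters would be chosen to satisfy Assumptions 1 to 3; the degenerate set above is used only because it has a known answer.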
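32.3 Calibration
Let $\tau_0 > 0$ be given. The value at time $t$ of a bond maturing at time $t + \tau_0$ is
$$B(X_1(t),X_2(t),\tau_0) = \exp\{-X_1(t)C_1(\tau_0) - X_2(t)C_2(\tau_0) - A(\tau_0)\}$$
and the yield is
$$-\frac{1}{\tau_0}\log B(X_1(t),X_2(t),\tau_0) = \frac{1}{\tau_0}\bigl[X_1(t)C_1(\tau_0) + X_2(t)C_2(\tau_0) + A(\tau_0)\bigr].$$
But we have set up the model so that $X_2(t)$ is the yield at time $t$ of a bond maturing at time $t + \tau_0$. Thus
$$X_2(t) = \frac{1}{\tau_0}\bigl[X_1(t)C_1(\tau_0) + X_2(t)C_2(\tau_0) + A(\tau_0)\bigr].$$
This equation must hold for every value of $X_1(t)$ and $X_2(t)$, which implies that
$$C_1(\tau_0) = 0, \qquad C_2(\tau_0) = \tau_0, \qquad A(\tau_0) = 0.$$
We must choose the parameters
$$a_{11}, a_{12}, b_1;\quad a_{21}, a_{22}, b_2;\quad \beta_1, \beta_2, \alpha;\quad \sigma_1, \rho, \sigma_2$$
so that these three equations are satisfied.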
Chapter 33
Change of numéraire
Consider a Brownian motion driven market model with time horizon $T^*$. For now, we will have one asset, which we call a "stock" even though in applications it will usually be an interest rate dependent claim. The price of the stock is modeled by
$$dS(t) = r(t)\,S(t)\,dt + \sigma(t)\,S(t)\,dW(t), \qquad (0.1)$$
where the interest rate process $r(t)$ and the volatility process $\sigma(t)$ are adapted to some filtration $\{\mathcal{F}(t);\ 0 \le t \le T^*\}$. $W$ is a Brownian motion relative to this filtration, but $\{\mathcal{F}(t);\ 0 \le t \le T^*\}$ may be larger than the filtration generated by $W$.
This is not a geometric Brownian motion model. We are particularly interested in the case that the interest rate is stochastic, given by a term structure model we have not yet specified.
We shall work only under the risk-neutral measure, which is reflected by the fact that the mean rate of return for the stock is $r(t)$.
We define the accumulation factor
$$\beta(t) = \exp\left\{\int_0^t r(u)\,du\right\},$$
so that the discounted stock price $\frac{S(t)}{\beta(t)}$ is a martingale. Indeed,
$$d\left(\frac{S(t)}{\beta(t)}\right) = \frac{S(t)}{\beta(t)}\,\sigma(t)\,dW(t).$$
The zero-coupon bond prices are given by
$$B(t,T) = \mathbb{E}\left[\exp\left\{-\int_t^T r(u)\,du\right\} \Big|\, \mathcal{F}(t)\right] = \mathbb{E}\left[\frac{\beta(t)}{\beta(T)} \Big|\, \mathcal{F}(t)\right],$$
so
$$\frac{B(t,T)}{\beta(t)} = \mathbb{E}\left[\frac{1}{\beta(T)} \Big|\, \mathcal{F}(t)\right]$$
is also a martingale (tower property).
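The $T$-forward price $F(t,T)$ of the stock is the price set at time $t$ for delivery of one share of stock at time $T$ with payment at time $T$. The value of the forward contract at time $t$ is zero, so
$$0 = \mathbb{E}\left[\frac{\beta(t)}{\beta(T)}\bigl(S(T) - F(t,T)\bigr) \Big|\, \mathcal{F}(t)\right] = \beta(t)\,\mathbb{E}\left[\frac{S(T)}{\beta(T)} \Big|\, \mathcal{F}(t)\right] - F(t,T)\,\mathbb{E}\left[\frac{\beta(t)}{\beta(T)} \Big|\, \mathcal{F}(t)\right]$$
$$= \beta(t)\,\frac{S(t)}{\beta(t)} - F(t,T)\,B(t,T) = S(t) - F(t,T)\,B(t,T).$$
Therefore,
$$F(t,T) = \frac{S(t)}{B(t,T)}.$$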
Definition 33.1 (Numéraire) Any asset in the model whose price is always strictly positive can be taken as the numéraire. We then denominate all other assets in units of this numéraire.
Example 33.1 (Money market as numéraire) The money market could be the numéraire. At time $t$, the stock is worth $\frac{S(t)}{\beta(t)}$ units of money market and the $T$-maturity bond is worth $\frac{B(t,T)}{\beta(t)}$ units of money market.
Example 33.2 (Bond as numéraire) The $T$-maturity bond could be the numéraire. At time $t \le T$, the stock is worth $F(t,T)$ units of $T$-maturity bond and the $T$-maturity bond is worth 1 unit.
We will say that a probability measure $\mathbb{P}_N$ is risk-neutral for the numéraire $N$ if every asset price, divided by $N$, is a martingale under $\mathbb{P}_N$. The original probability measure $\mathbb{P}$ is risk-neutral for the numéraire $\beta$ (Example 33.1).
Theorem 0.71 Let $N$ be a numéraire, i.e., the price process for some asset whose price is always strictly positive. Then $\mathbb{P}_N$ defined by
$$\mathbb{P}_N(A) = \frac{1}{N(0)}\int_A \frac{N(T^*)}{\beta(T^*)}\,d\mathbb{P}, \qquad \forall A \in \mathcal{F}(T^*),$$
is risk-neutral for $N$.
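Note: $\mathbb{P}$ and $\mathbb{P}_N$ are equivalent, i.e., have the same probability zero sets, and
$$\mathbb{P}(A) = N(0)\int_A \frac{\beta(T^*)}{N(T^*)}\,d\mathbb{P}_N, \qquad \forall A \in \mathcal{F}(T^*).$$
Proof: Because $N$ is the price process for some asset, $N/\beta$ is a martingale under $\mathbb{P}$. Therefore,
$$\mathbb{P}_N(\Omega) = \frac{1}{N(0)}\int_\Omega \frac{N(T^*)}{\beta(T^*)}\,d\mathbb{P} = \frac{1}{N(0)}\cdot\mathbb{E}\left[\frac{N(T^*)}{\beta(T^*)}\right] = \frac{1}{N(0)}\cdot\frac{N(0)}{\beta(0)} = 1,$$
and we see that $\mathbb{P}_N$ is a probability measure.
Let $Y$ be an asset price. Under $\mathbb{P}$, $Y/\beta$ is a martingale. We must show that under $\mathbb{P}_N$, $Y/N$ is a martingale. For this, we need to recall how to combine conditional expectations with change of measure (Lemma 1.54). If $0 \le t \le T \le T^*$ and $X$ is $\mathcal{F}(T)$-measurable, then
$$\mathbb{E}_N\left[X \,\big|\, \mathcal{F}(t)\right] = \frac{N(0)\,\beta(t)}{N(t)}\,\mathbb{E}\left[\frac{N(T)}{N(0)\,\beta(T)}\,X \,\Big|\, \mathcal{F}(t)\right] = \frac{\beta(t)}{N(t)}\,\mathbb{E}\left[\frac{N(T)}{\beta(T)}\,X \,\Big|\, \mathcal{F}(t)\right].$$
Therefore,
$$\mathbb{E}_N\left[\frac{Y(T)}{N(T)} \,\Big|\, \mathcal{F}(t)\right] = \frac{\beta(t)}{N(t)}\,\mathbb{E}\left[\frac{N(T)}{\beta(T)}\cdot\frac{Y(T)}{N(T)} \,\Big|\, \mathcal{F}(t)\right] = \frac{\beta(t)}{N(t)}\cdot\frac{Y(t)}{\beta(t)} = \frac{Y(t)}{N(t)},$$
which is the martingale property for $Y/N$ under $\mathbb{P}_N$.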
33.1 Bond price as numéraire
Fix $T \in (0,T^*]$ and let $B(t,T)$ be the numéraire. The risk-neutral measure for this numéraire is
$$\mathbb{P}_T(A) = \frac{1}{B(0,T)}\int_A \frac{B(T,T)}{\beta(T)}\,d\mathbb{P} = \frac{1}{B(0,T)}\int_A \frac{1}{\beta(T)}\,d\mathbb{P}, \qquad \forall A \in \mathcal{F}(T).$$
Because this bond is not defined after time $T$, we change the measure only "up to time $T$", i.e., using $\frac{1}{B(0,T)}\frac{B(T,T)}{\beta(T)}$ and only for $A \in \mathcal{F}(T)$.
$\mathbb{P}_T$ is called the $T$-forward measure. Denominated in units of $T$-maturity bond, the value of the stock is
$$F(t,T) = \frac{S(t)}{B(t,T)}, \qquad 0 \le t \le T.$$
This is a martingale under $\mathbb{P}_T$, and so has a differential of the form
$$dF(t,T) = \sigma_F(t,T)\,F(t,T)\,dW_T(t), \qquad 0 \le t \le T, \qquad (1.1)$$
i.e., a differential without a $dt$ term. The process $\{W_T(t);\ 0 \le t \le T\}$ is a Brownian motion under $\mathbb{P}_T$. We may assume without loss of generality that $\sigma_F(t,T) \ge 0$.
We write $F(t)$ rather than $F(t,T)$ from now on.
33.2 Stock price as numéraire
Let $S(t)$ be the numéraire. In terms of this numéraire, the stock price is identically 1. The risk-neutral measure under this numéraire is
$$\mathbb{P}_S(A) = \frac{1}{S(0)}\int_A \frac{S(T^*)}{\beta(T^*)}\,d\mathbb{P}, \qquad \forall A \in \mathcal{F}(T^*).$$
Denominated in shares of stock, the value of the $T$-maturity bond is
$$\frac{B(t,T)}{S(t)} = \frac{1}{F(t)}.$$
This is a martingale under $\mathbb{P}_S$, and so has a differential of the form
$$d\left(\frac{1}{F(t)}\right) = \gamma(t,T)\,\frac{1}{F(t)}\,dW_S(t), \qquad (2.1)$$
where $\{W_S(t);\ 0 \le t \le T^*\}$ is a Brownian motion under $\mathbb{P}_S$. We may assume without loss of generality that $\gamma(t,T) \ge 0$.
Theorem 2.72 The volatility $\gamma(t,T)$ in (2.1) is equal to the volatility $\sigma_F(t,T)$ in (1.1). In other words, (2.1) can be rewritten as
$$d\left(\frac{1}{F(t)}\right) = \sigma_F(t,T)\,\frac{1}{F(t)}\,dW_S(t). \qquad (2.1')$$
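Proof: Let $g(x) = 1/x$, so $g'(x) = -1/x^2$, $g''(x) = 2/x^3$. Then
$$d\left(\frac{1}{F(t)}\right) = dg(F(t)) = g'(F(t))\,dF(t) + \tfrac{1}{2}g''(F(t))\,dF(t)\,dF(t)$$
$$= -\frac{1}{F^2(t)}\,\sigma_F(t,T)\,F(t)\,dW_T(t) + \frac{1}{F^3(t)}\,\sigma_F^2(t,T)\,F^2(t)\,dt$$
$$= \frac{1}{F(t)}\left[-\sigma_F(t,T)\,dW_T(t) + \sigma_F^2(t,T)\,dt\right] = \sigma_F(t,T)\,\frac{1}{F(t)}\left[-dW_T(t) + \sigma_F(t,T)\,dt\right].$$
Under $\mathbb{P}_T$, $-W_T$ is a Brownian motion. Under this measure, $\frac{1}{F(t)}$ has volatility $\sigma_F(t,T)$ and mean rate of return $\sigma_F^2(t,T)$. The change of measure from $\mathbb{P}_T$ to $\mathbb{P}_S$ makes $\frac{1}{F(t)}$ a martingale, i.e., it changes the mean return to zero, but the change of measure does not affect the volatility. Therefore, $\gamma(t,T)$ in (2.1) must be $\sigma_F(t,T)$ and $W_S$ must be
$$W_S(t) = -W_T(t) + \int_0^t \sigma_F(u,T)\,du.$$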
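33.3 Merton option pricing formula
The price at time zero of a European call is
$$V(0) = \mathbb{E}\left[\frac{1}{\beta(T)}\bigl(S(T) - K\bigr)^+\right] = \mathbb{E}\left[\frac{S(T)}{\beta(T)}\mathbf{1}_{\{S(T) > K\}}\right] - K\,\mathbb{E}\left[\frac{1}{\beta(T)}\mathbf{1}_{\{S(T) > K\}}\right]$$
$$= S(0)\int_{\{S(T) > K\}}\frac{S(T)}{S(0)\,\beta(T)}\,d\mathbb{P} - K\,B(0,T)\int_{\{S(T) > K\}}\frac{1}{B(0,T)\,\beta(T)}\,d\mathbb{P}$$
$$= S(0)\,\mathbb{P}_S\{S(T) > K\} - K\,B(0,T)\,\mathbb{P}_T\{S(T) > K\}$$
$$= S(0)\,\mathbb{P}_S\{F(T) > K\} - K\,B(0,T)\,\mathbb{P}_T\{F(T) > K\}$$
$$= S(0)\,\mathbb{P}_S\left\{\frac{1}{F(T)} < \frac{1}{K}\right\} - K\,B(0,T)\,\mathbb{P}_T\{F(T) > K\}.$$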
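This is a completely general formula which permits computation as soon as we specify $\sigma_F(t,T)$. If we assume that $\sigma_F(t,T)$ is a constant $\sigma_F$, we have the following:
$$\frac{1}{F(T)} = \frac{B(0,T)}{S(0)}\exp\left\{\sigma_F W_S(T) - \tfrac{1}{2}\sigma_F^2 T\right\},$$
$$\mathbb{P}_S\left\{\frac{1}{F(T)} < \frac{1}{K}\right\} = \mathbb{P}_S\left\{\sigma_F W_S(T) - \tfrac{1}{2}\sigma_F^2 T < \log\frac{S(0)}{K\,B(0,T)}\right\} = \mathbb{P}_S\left\{\frac{W_S(T)}{\sqrt{T}} < \frac{1}{\sigma_F\sqrt{T}}\log\frac{S(0)}{K\,B(0,T)} + \tfrac{1}{2}\sigma_F\sqrt{T}\right\} = N(\rho_1),$$
where
$$\rho_1 = \frac{1}{\sigma_F\sqrt{T}}\left[\log\frac{S(0)}{K\,B(0,T)} + \tfrac{1}{2}\sigma_F^2 T\right].$$
Similarly,
$$F(T) = \frac{S(0)}{B(0,T)}\exp\left\{\sigma_F W_T(T) - \tfrac{1}{2}\sigma_F^2 T\right\},$$
$$\mathbb{P}_T\{F(T) > K\} = \mathbb{P}_T\left\{\sigma_F W_T(T) - \tfrac{1}{2}\sigma_F^2 T > \log\frac{K\,B(0,T)}{S(0)}\right\} = \mathbb{P}_T\left\{\frac{W_T(T)}{\sqrt{T}} > \frac{1}{\sigma_F\sqrt{T}}\left[\log\frac{K\,B(0,T)}{S(0)} + \tfrac{1}{2}\sigma_F^2 T\right]\right\}$$
$$= \mathbb{P}_T\left\{\frac{-W_T(T)}{\sqrt{T}} < \frac{1}{\sigma_F\sqrt{T}}\left[\log\frac{S(0)}{K\,B(0,T)} - \tfrac{1}{2}\sigma_F^2 T\right]\right\} = N(\rho_2),$$
where
$$\rho_2 = \frac{1}{\sigma_F\sqrt{T}}\left[\log\frac{S(0)}{K\,B(0,T)} - \tfrac{1}{2}\sigma_F^2 T\right].$$
If $r$ is constant, then $B(0,T) = e^{-rT}$,
$$\rho_1 = \frac{1}{\sigma_F\sqrt{T}}\left[\log\frac{S(0)}{K} + \left(r + \tfrac{1}{2}\sigma_F^2\right)T\right], \qquad \rho_2 = \frac{1}{\sigma_F\sqrt{T}}\left[\log\frac{S(0)}{K} + \left(r - \tfrac{1}{2}\sigma_F^2\right)T\right],$$
and we have the usual Black-Scholes formula. When $r$ is not constant, we still have the explicit formula
$$V(0) = S(0)\,N(\rho_1) - K\,B(0,T)\,N(\rho_2).$$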
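The time-zero formula depends on the interest rate only through the bond price $B(0,T)$, which makes it easy to code. The sketch below is a minimal illustration; the function names and input values are assumptions. The sample call takes a constant rate, so that $B(0,T) = e^{-rT}$ and the result should coincide with the Black-Scholes price.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def merton_call_price(s0, strike, bond_price, sigma_f, maturity):
    """European call price S(0) N(rho1) - K B(0,T) N(rho2).

    bond_price is B(0,T); sigma_f is the constant forward-price volatility.
    """
    total_vol = sigma_f * math.sqrt(maturity)
    rho1 = (math.log(s0 / (strike * bond_price)) + 0.5 * total_vol**2) / total_vol
    rho2 = rho1 - total_vol
    return s0 * normal_cdf(rho1) - strike * bond_price * normal_cdf(rho2)

# Hypothetical inputs: constant rate 5%, so B(0,T) = exp(-r T).
r, T = 0.05, 1.0
price = merton_call_price(100.0, 100.0, math.exp(-r * T), 0.2, T)
print(price)
```

With a stochastic rate model, `bond_price` would instead come from that model's zero-coupon bond formula or from market quotes.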
As this formula suggests, if $\sigma_F$ is constant, then for $0 \le t \le T$, the value of a European call expiring at time $T$ is
$$V(t) = S(t)\,N(\rho_1(t)) - K\,B(t,T)\,N(\rho_2(t)),$$
where
$$\rho_1(t) = \frac{1}{\sigma_F\sqrt{T-t}}\left[\log\frac{F(t)}{K} + \tfrac{1}{2}\sigma_F^2(T-t)\right], \qquad \rho_2(t) = \frac{1}{\sigma_F\sqrt{T-t}}\left[\log\frac{F(t)}{K} - \tfrac{1}{2}\sigma_F^2(T-t)\right].$$
This formula also suggests a hedge: at each time $t$, hold $N(\rho_1(t))$ shares of stock and short $K\,N(\rho_2(t))$ bonds.
We want to verify that this hedge is self-financing. Suppose we begin with \$$V(0)$ and at each time $t$ hold $N(\rho_1(t))$ shares of stock. We short bonds as necessary to finance this. Will the position in the bond always be $-K\,N(\rho_2(t))$? If so, the value of the portfolio will always be
$$S(t)\,N(\rho_1(t)) - K\,B(t,T)\,N(\rho_2(t)) = V(t),$$
and we will have a hedge.
Mathematically, this question takes the following form. Let
$$\Delta(t) = N(\rho_1(t)).$$
At time $t$, hold $\Delta(t)$ shares of stock. If $X(t)$ is the value of the portfolio at time $t$, then $X(t) - \Delta(t)S(t)$ will be invested in the bond, so the number of bonds owned is $\frac{X(t) - \Delta(t)S(t)}{B(t,T)}$ and the portfolio value evolves according to
$$dX(t) = \Delta(t)\,dS(t) + \frac{X(t) - \Delta(t)S(t)}{B(t,T)}\,dB(t,T). \qquad (3.1)$$
The value of the option evolves according to
$$dV(t) = N(\rho_1(t))\,dS(t) + S(t)\,dN(\rho_1(t)) + dS(t)\,dN(\rho_1(t)) - K\,N(\rho_2(t))\,dB(t,T) - K\,dB(t,T)\,dN(\rho_2(t)) - K\,B(t,T)\,dN(\rho_2(t)). \qquad (3.2)$$
If $X(0) = V(0)$, will $X(t) = V(t)$ for $0 \le t \le T$?
Formulas (3.1) and (3.2) are difficult to compare, so we simplify them by a change of numéraire. This change is justified by the following theorem.
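Theorem 3.73 Changes of numéraire affect portfolio values in the way you would expect.
Proof: Suppose we have a model with $k$ assets with prices $S_1, S_2, \ldots, S_k$. At each time $t$, hold $\Delta_i(t)$ shares of asset $i$, $i = 1, 2, \ldots, k-1$, and invest the remaining wealth in asset $k$. Begin with a nonrandom initial wealth $X(0)$, and let $X(t)$ be the value of the portfolio at time $t$. The number of shares of asset $k$ held at time $t$ is
$$\Delta_k(t) = \frac{X(t) - \sum_{i=1}^{k-1}\Delta_i(t)S_i(t)}{S_k(t)},$$
and $X$ evolves according to the equation
$$dX = \sum_{i=1}^{k-1}\Delta_i\,dS_i + \left(X - \sum_{i=1}^{k-1}\Delta_i S_i\right)\frac{dS_k}{S_k} = \sum_{i=1}^{k}\Delta_i\,dS_i.$$
Note that
$$X(t) = \sum_{i=1}^{k}\Delta_i(t)S_i(t),$$
and we only get to specify $\Delta_1, \ldots, \Delta_{k-1}$, not $\Delta_k$, in advance.
Let $N$ be a numéraire, and define
$$\hat X(t) = \frac{X(t)}{N(t)}, \qquad \hat S_i(t) = \frac{S_i(t)}{N(t)}, \quad i = 1, 2, \ldots, k.$$
Then
$$d\hat X = \frac{1}{N}\,dX + X\,d\left(\frac{1}{N}\right) + dX\,d\left(\frac{1}{N}\right)$$
$$= \frac{1}{N}\sum_{i=1}^{k}\Delta_i\,dS_i + \left(\sum_{i=1}^{k}\Delta_i S_i\right)d\left(\frac{1}{N}\right) + \sum_{i=1}^{k}\Delta_i\,dS_i\,d\left(\frac{1}{N}\right)$$
$$= \sum_{i=1}^{k}\Delta_i\left[\frac{1}{N}\,dS_i + S_i\,d\left(\frac{1}{N}\right) + dS_i\,d\left(\frac{1}{N}\right)\right] = \sum_{i=1}^{k}\Delta_i\,d\hat S_i.$$
Now
$$\Delta_k = \frac{X - \sum_{i=1}^{k-1}\Delta_i S_i}{S_k} = \frac{X/N - \sum_{i=1}^{k-1}\Delta_i S_i/N}{S_k/N} = \frac{\hat X - \sum_{i=1}^{k-1}\Delta_i\hat S_i}{\hat S_k}.$$
Therefore,
$$d\hat X = \sum_{i=1}^{k-1}\Delta_i\,d\hat S_i + \left(\hat X - \sum_{i=1}^{k-1}\Delta_i\hat S_i\right)\frac{d\hat S_k}{\hat S_k}.$$
This is the formula for the evolution of a portfolio which holds $\Delta_i$ shares of asset $i$, $i = 1, 2, \ldots, k-1$, and all assets and the portfolio are denominated in units of $N$.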
We return to the European call hedging problem (comparison of (3.1) and (3.2)), but we now use the zero-coupon bond as numéraire. We still hold $\Delta(t) = N(\rho_1(t))$ shares of stock at each time $t$. In terms of the new numéraire, the asset values are
Stock: $\frac{S(t)}{B(t,T)} = F(t)$,
Bond: $\frac{B(t,T)}{B(t,T)} = 1$.
The portfolio value evolves according to
$$d\hat X(t) = \Delta(t)\,dF(t) + \frac{\hat X(t) - \Delta(t)F(t)}{1}\,d(1) = \Delta(t)\,dF(t). \qquad (3.1')$$
In the new numéraire, the option value formula
$$V(t) = N(\rho_1(t))\,S(t) - K\,B(t,T)\,N(\rho_2(t))$$
becomes
$$\hat V(t) = \frac{V(t)}{B(t,T)} = N(\rho_1(t))\,F(t) - K\,N(\rho_2(t)),$$
and
$$d\hat V = N(\rho_1(t))\,dF(t) + F(t)\,dN(\rho_1(t)) + dN(\rho_1(t))\,dF(t) - K\,dN(\rho_2(t)). \qquad (3.2')$$
To show that the hedge works, we must show that
$$F(t)\,dN(\rho_1(t)) + dN(\rho_1(t))\,dF(t) - K\,dN(\rho_2(t)) = 0.$$
This is a homework problem.
Chapter 34
Brace-Gatarek-Musiela model
34.1 Review of HJM under risk-neutral $\mathbb{P}$
$f(t,T)$ = Forward rate at time $t$ for borrowing at time $T$.
$$df(t,T) = \sigma(t,T)\,\sigma^*(t,T)\,dt + \sigma(t,T)\,dW(t),$$
where
$$\sigma^*(t,T) = \int_t^T \sigma(t,u)\,du.$$
The interest rate is $r(t) = f(t,t)$. The bond prices
$$B(t,T) = \mathbb{E}\left[\exp\left\{-\int_t^T r(u)\,du\right\} \Big|\, \mathcal{F}(t)\right] = \exp\left\{-\int_t^T f(t,u)\,du\right\}$$
satisfy
$$dB(t,T) = r(t)\,B(t,T)\,dt - \underbrace{\sigma^*(t,T)}_{\text{volatility of } T\text{-maturity bond}} B(t,T)\,dW(t).$$
To implement HJM, you specify a function
$$\sigma(t,T), \qquad 0 \le t \le T.$$
A simple choice we would like to use is
$$\sigma(t,T) = \sigma f(t,T),$$
where $\sigma > 0$ is the constant "volatility of the forward rate". This is not possible because it leads to
$$\sigma^*(t,T) = \sigma\int_t^T f(t,u)\,du,$$
$$df(t,T) = \sigma^2 f(t,T)\left(\int_t^T f(t,u)\,du\right)dt + \sigma f(t,T)\,dW(t),$$
and Heath, Jarrow and Morton show that solutions to this equation explode before $T$.
The problem with the above equation is that the $dt$ term grows like the square of the forward rate. To see what problem this causes, consider the similar deterministic ordinary differential equation
$$f'(t) = f^2(t),$$
where $f(0) = c > 0$. We have
$$\frac{f'(t)}{f^2(t)} = 1, \qquad -\frac{d}{dt}\frac{1}{f(t)} = 1,$$
$$-\frac{1}{f(t)} + \frac{1}{f(0)} = \int_0^t 1\,du = t,$$
$$-\frac{1}{f(t)} = t - \frac{1}{f(0)} = t - \frac{1}{c} = \frac{ct - 1}{c},$$
$$f(t) = \frac{c}{1 - ct}.$$
This solution explodes at $t = 1/c$.
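The blow-up can also be seen numerically. The sketch below integrates $f' = f^2$ with a fourth-order Runge-Kutta scheme and compares the result with the closed form $c/(1-ct)$ at times approaching $1/c$; the initial value, step count and function names are illustrative assumptions.

```python
def exact_solution(c, t):
    """Closed-form solution f(t) = c / (1 - c t), valid for t < 1/c."""
    return c / (1.0 - c * t)

def rk4_solution(c, t_end, n_steps):
    """Integrate f' = f^2, f(0) = c, up to t_end with classical RK4."""
    h = t_end / n_steps
    f = c
    for _ in range(n_steps):
        k1 = f * f
        k2 = (f + 0.5 * h * k1) ** 2
        k3 = (f + 0.5 * h * k2) ** 2
        k4 = (f + h * k3) ** 2
        f += h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return f

c = 2.0                       # hypothetical initial value; explosion time is 1/c = 0.5
for t in (0.25, 0.4, 0.45, 0.49):
    print(t, rk4_solution(c, t, 1000), exact_solution(c, t))
```

The printed values grow without bound as $t$ approaches $0.5$, matching the formula.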
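34.2 Brace-Gatarek-Musiela model
New variables:
Current time $t$
Time to maturity $\tau = T - t$.
Forward rates:
$$r(t,\tau) = f(t,t+\tau), \qquad r(t,0) = f(t,t) = r(t), \qquad (2.1)$$
$$\frac{\partial}{\partial\tau}r(t,\tau) = \frac{\partial}{\partial T}f(t,t+\tau). \qquad (2.2)$$
Bond prices:
$$D(t,\tau) = B(t,t+\tau) \qquad (2.3)$$
$$= \exp\left\{-\int_t^{t+\tau} f(t,v)\,dv\right\}$$
$$(u = v - t,\ du = dv): \quad = \exp\left\{-\int_0^\tau f(t,t+u)\,du\right\} = \exp\left\{-\int_0^\tau r(t,u)\,du\right\},$$
$$\frac{\partial}{\partial\tau}D(t,\tau) = \frac{\partial}{\partial T}B(t,t+\tau) = -r(t,\tau)\,D(t,\tau). \qquad (2.4)$$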
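We will now write $\sigma(t,\tau) = \sigma(t,T-t)$ rather than $\sigma(t,T)$. In this notation, the HJM model is
$$df(t,T) = \sigma(t,\tau)\,\sigma^*(t,\tau)\,dt + \sigma(t,\tau)\,dW(t), \qquad (2.5)$$
$$dB(t,T) = r(t)\,B(t,T)\,dt - \sigma^*(t,\tau)\,B(t,T)\,dW(t), \qquad (2.6)$$
where
$$\sigma^*(t,\tau) = \int_0^\tau \sigma(t,u)\,du, \qquad (2.7)$$
$$\frac{\partial}{\partial\tau}\sigma^*(t,\tau) = \sigma(t,\tau). \qquad (2.8)$$
We now derive the differentials of $r(t,\tau)$ and $D(t,\tau)$, analogous to (2.5) and (2.6). We have
$$dr(t,\tau) = \underbrace{df(t,t+\tau)}_{\text{differential applies only to first argument}} + \frac{\partial}{\partial T}f(t,t+\tau)\,dt$$
$$\overset{(2.5),(2.2)}{=} \sigma(t,\tau)\,\sigma^*(t,\tau)\,dt + \sigma(t,\tau)\,dW(t) + \frac{\partial}{\partial\tau}r(t,\tau)\,dt$$
$$\overset{(2.8)}{=} \frac{\partial}{\partial\tau}\left[r(t,\tau) + \tfrac{1}{2}\bigl(\sigma^*(t,\tau)\bigr)^2\right]dt + \sigma(t,\tau)\,dW(t). \qquad (2.9)$$
Also,
$$dD(t,\tau) = \underbrace{dB(t,t+\tau)}_{\text{differential applies only to first argument}} + \frac{\partial}{\partial T}B(t,t+\tau)\,dt$$
$$\overset{(2.6),(2.4)}{=} r(t)\,B(t,t+\tau)\,dt - \sigma^*(t,\tau)\,B(t,t+\tau)\,dW(t) - r(t,\tau)\,D(t,\tau)\,dt$$
$$\overset{(2.1)}{=} \bigl[r(t,0) - r(t,\tau)\bigr]D(t,\tau)\,dt - \sigma^*(t,\tau)\,D(t,\tau)\,dW(t). \qquad (2.10)$$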
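34.3 LIBOR
Fix $\delta > 0$ (say, $\delta = \tfrac14$ year). \$ $D(t,\delta)$ invested at time $t$ in a $(t+\delta)$-maturity bond grows to \$ 1 at time $t+\delta$. $L(t,0)$ is defined to be the corresponding rate of simple interest:
\begin{align*}
D(t,\delta)\left( 1 + \delta L(t,0) \right) &= 1, \\
1 + \delta L(t,0) &= \frac{1}{D(t,\delta)} = \exp\left\{ \int_0^{\delta} r(t,u)\,du \right\}, \\
L(t,0) &= \frac{\exp\left\{ \int_0^{\delta} r(t,u)\,du \right\} - 1}{\delta}.
\end{align*}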
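34.4 Forward LIBOR
$\delta > 0$ is still fixed. At time $t$, agree to invest \$ $\frac{D(t,\tau+\delta)}{D(t,\tau)}$ at time $t+\tau$, with payback of \$ 1 at time $t+\tau+\delta$. Can do this at time $t$ by shorting $\frac{D(t,\tau+\delta)}{D(t,\tau)}$ bonds maturing at time $t+\tau$ and going long one bond maturing at time $t+\tau+\delta$. The value of this portfolio at time $t$ is
\[ -\frac{D(t,\tau+\delta)}{D(t,\tau)}\, D(t,\tau) + D(t,\tau+\delta) = 0. \]
The forward LIBOR $L(t,\tau)$ is defined to be the simple (forward) interest rate for this investment:
\begin{align*}
\frac{D(t,\tau+\delta)}{D(t,\tau)} \left( 1 + \delta L(t,\tau) \right) &= 1, \\
1 + \delta L(t,\tau) &= \frac{D(t,\tau)}{D(t,\tau+\delta)} = \frac{\exp\left\{ -\int_0^{\tau} r(t,u)\,du \right\}}{\exp\left\{ -\int_0^{\tau+\delta} r(t,u)\,du \right\}} \\
&= \exp\left\{ \int_{\tau}^{\tau+\delta} r(t,u)\,du \right\}, \\
L(t,\tau) &= \frac{\exp\left\{ \int_{\tau}^{\tau+\delta} r(t,u)\,du \right\} - 1}{\delta}. \tag{4.1}
\end{align*}
Connection with forward rates:
\[ \frac{\partial}{\partial\delta} \exp\left\{ \int_{\tau}^{\tau+\delta} r(t,u)\,du \right\} \bigg|_{\delta=0} = r(t,\tau+\delta) \exp\left\{ \int_{\tau}^{\tau+\delta} r(t,u)\,du \right\} \bigg|_{\delta=0} = r(t,\tau), \]
so
\begin{align*}
f(t,t+\tau) = r(t,\tau) &= \lim_{\delta \downarrow 0} \frac{\exp\left\{ \int_{\tau}^{\tau+\delta} r(t,u)\,du \right\} - 1}{\delta}, \\
L(t,\tau) &= \frac{\exp\left\{ \int_{\tau}^{\tau+\delta} r(t,u)\,du \right\} - 1}{\delta}, \quad \delta > 0 \text{ fixed}. \tag{4.2}
\end{align*}
$r(t,\tau)$ is the continuously compounded rate. $L(t,\tau)$ is the simple rate over a period of duration $\delta$.
We cannot have a log-normal model for $r(t,\tau)$ because solutions explode as we saw in Section 34.1.
For fixed positive $\delta$, we can have a log-normal model for $L(t,\tau)$.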
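To make (4.1) and (4.2) concrete, here is a small sketch that computes forward LIBOR from a forward-rate curve at one fixed time $t$. The function name `forward_libor`, the callable `r` standing for $u \mapsto r(t,u)$, and the midpoint quadrature are assumptions made for illustration; they are not part of the notes.

```python
import math
from typing import Callable


def forward_libor(r: Callable[[float], float], tau: float, delta: float,
                  n: int = 1000) -> float:
    """Forward LIBOR L(t, tau) from (4.1), for one fixed t.

    r     : hypothetical callable u -> r(t, u), the forward-rate curve at time t
    tau   : time to the start of the accrual period
    delta : length of the accrual period (e.g. 0.25 years)
    The integral of r over [tau, tau + delta] uses the midpoint rule.
    """
    h = delta / n
    integral = sum(r(tau + (i + 0.5) * h) for i in range(n)) * h
    # expm1(x) = exp(x) - 1, computed accurately for small x
    return math.expm1(integral) / delta
```

For a flat curve $r(t,u) = 0.05$ and $\delta = 0.25$, the result is $(e^{0.0125} - 1)/0.25$, slightly above 0.05. Shrinking `delta` toward zero moves the result toward 0.05, in line with the limit in (4.2).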
34.5 The dynamics of $L(t,\tau)$
We want to choose $\sigma(t,\tau)$, $t \ge 0$, $\tau \ge 0$, appearing in (2.5) so that
\[ dL(t,\tau) = (\ldots)\,dt + L(t,\tau)\,\gamma(t,\tau)\,dW(t) \]
for some $\gamma(t,\tau)$, $t \ge 0$, $\tau \ge 0$. This is the BGM model, and is a subclass of HJM models, corresponding to particular choices of $\sigma(t,\tau)$.
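Recall (2.9):
\[ dr(t,u) = \frac{\partial}{\partial u} \left[ r(t,u) + \tfrac12 \left( \sigma^*(t,u) \right)^2 \right] dt + \sigma(t,u)\,dW(t). \]
Therefore,
\begin{align*}
d\left( \int_{\tau}^{\tau+\delta} r(t,u)\,du \right) &= \int_{\tau}^{\tau+\delta} dr(t,u)\,du \tag{5.1} \\
&= \int_{\tau}^{\tau+\delta} \frac{\partial}{\partial u} \left[ r(t,u) + \tfrac12 \left( \sigma^*(t,u) \right)^2 \right] du\,dt + \int_{\tau}^{\tau+\delta} \sigma(t,u)\,du\,dW(t) \\
&= \left[ r(t,\tau+\delta) - r(t,\tau) + \tfrac12 \left( \sigma^*(t,\tau+\delta) \right)^2 - \tfrac12 \left( \sigma^*(t,\tau) \right)^2 \right] dt \\
&\quad + \left[ \sigma^*(t,\tau+\delta) - \sigma^*(t,\tau) \right] dW(t)
\end{align*}
and
\begin{align*}
dL(t,\tau) &= d\left[ \frac{\exp\left\{ \int_{\tau}^{\tau+\delta} r(t,u)\,du \right\} - 1}{\delta} \right] && \text{by (4.1)} \\
&= \frac{1}{\delta} \exp\left\{ \int_{\tau}^{\tau+\delta} r(t,u)\,du \right\} d\int_{\tau}^{\tau+\delta} r(t,u)\,du \\
&\quad + \frac{1}{2\delta} \exp\left\{ \int_{\tau}^{\tau+\delta} r(t,u)\,du \right\} \left( d\int_{\tau}^{\tau+\delta} r(t,u)\,du \right)^2 \\
&= \frac{1}{\delta} \left[ 1 + \delta L(t,\tau) \right] \Big\{ \left[ r(t,\tau+\delta) - r(t,\tau) + \tfrac12 \left( \sigma^*(t,\tau+\delta) \right)^2 - \tfrac12 \left( \sigma^*(t,\tau) \right)^2 \right] dt && \text{by (4.1), (5.1)} \tag{5.2} \\
&\quad + \left[ \sigma^*(t,\tau+\delta) - \sigma^*(t,\tau) \right] dW(t) + \tfrac12 \left[ \sigma^*(t,\tau+\delta) - \sigma^*(t,\tau) \right]^2 dt \Big\} \\
&= \frac{1}{\delta} \left[ 1 + \delta L(t,\tau) \right] \Big\{ \left[ r(t,\tau+\delta) - r(t,\tau) \right] dt + \sigma^*(t,\tau+\delta) \left[ \sigma^*(t,\tau+\delta) - \sigma^*(t,\tau) \right] dt \\
&\quad + \left[ \sigma^*(t,\tau+\delta) - \sigma^*(t,\tau) \right] dW(t) \Big\}.
\end{align*}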
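But
\begin{align*}
\frac{\partial}{\partial\tau}\, L(t,\tau) &= \frac{\partial}{\partial\tau} \left[ \frac{\exp\left\{ \int_{\tau}^{\tau+\delta} r(t,u)\,du \right\} - 1}{\delta} \right] \\
&= \frac{1}{\delta} \exp\left\{ \int_{\tau}^{\tau+\delta} r(t,u)\,du \right\} \cdot \left[ r(t,\tau+\delta) - r(t,\tau) \right] \\
&= \frac{1}{\delta} \left[ 1 + \delta L(t,\tau) \right] \left[ r(t,\tau+\delta) - r(t,\tau) \right].
\end{align*}
Therefore,
\[ dL(t,\tau) = \frac{\partial}{\partial\tau}\, L(t,\tau)\,dt + \frac{1}{\delta} \left[ 1 + \delta L(t,\tau) \right] \left[ \sigma^*(t,\tau+\delta) - \sigma^*(t,\tau) \right] \cdot \left[ \sigma^*(t,\tau+\delta)\,dt + dW(t) \right]. \]
Take $\gamma(t,\tau)$ to be given by
\[ \gamma(t,\tau)\,L(t,\tau) = \frac{1}{\delta} \left[ 1 + \delta L(t,\tau) \right] \left[ \sigma^*(t,\tau+\delta) - \sigma^*(t,\tau) \right]. \tag{5.3} \]
Then
\[ dL(t,\tau) = \left[ \frac{\partial}{\partial\tau}\, L(t,\tau) + \gamma(t,\tau)\,L(t,\tau)\,\sigma^*(t,\tau+\delta) \right] dt + \gamma(t,\tau)\,L(t,\tau)\,dW(t). \tag{5.4} \]
Note that (5.3) is equivalent to
\[ \sigma^*(t,\tau+\delta) = \sigma^*(t,\tau) + \frac{\delta L(t,\tau)\,\gamma(t,\tau)}{1 + \delta L(t,\tau)}. \tag{5.3'} \]
Plugging this into (5.4) yields
\[ dL(t,\tau) = \left[ \frac{\partial}{\partial\tau}\, L(t,\tau) + \gamma(t,\tau)\,L(t,\tau)\,\sigma^*(t,\tau) + \frac{\delta L^2(t,\tau)\,\gamma^2(t,\tau)}{1 + \delta L(t,\tau)} \right] dt + \gamma(t,\tau)\,L(t,\tau)\,dW(t). \tag{5.4'} \]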
34.6 Implementation of BGM
Obtain the initial forward LIBOR curve
\[ L(0,\tau), \quad \tau \ge 0, \]
from market data. Choose a forward LIBOR volatility function (usually nonrandom)
\[ \gamma(t,\tau), \quad t \ge 0,\ \tau \ge 0. \]
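Because LIBOR gives no rate information on time periods smaller than $\delta$, we must also choose a partial bond volatility function
\[ \sigma^*(t,\tau), \quad t \ge 0,\ 0 \le \tau < \delta, \]
for maturities less than $\delta$ from the current time variable $t$.
With these functions, we can for each $\tau \in [0,\delta)$ solve (5.4') to obtain
\[ L(t,\tau), \quad t \ge 0,\ 0 \le \tau < \delta. \]
Plugging the solution into (5.3'), we obtain $\sigma^*(t,\tau)$ for $\delta \le \tau < 2\delta$. We then solve (5.4') to obtain
\[ L(t,\tau), \quad t \ge 0,\ \delta \le \tau < 2\delta, \]
and we continue recursively.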
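The recursive use of (5.3') can be sketched for a single time $t$ on the grid $\tau = 0, \delta, 2\delta, \ldots$ The sketch below is a simplification and not the full procedure: it takes the forward LIBORs at time $t$ as given instead of solving (5.4'), it starts from $\sigma^*(t,0) = 0$ (which follows from (2.7)), and the name `bond_vol_ladder` and its arguments are hypothetical.

```python
from typing import List, Sequence


def bond_vol_ladder(libors: Sequence[float], gammas: Sequence[float],
                    delta: float) -> List[float]:
    """Bond volatilities sigma*(t, j*delta), j = 0..n, generated by (5.3').

    libors[j] : assumed value of L(t, j*delta) at the fixed time t
    gammas[j] : assumed value of gamma(t, j*delta)
    Returns [sigma*(t, 0), sigma*(t, delta), ..., sigma*(t, n*delta)].
    """
    sigma_star = [0.0]  # sigma*(t, 0) = 0 by definition (2.7)
    for L, g in zip(libors, gammas):
        step = delta * L * g / (1.0 + delta * L)  # increment from (5.3')
        sigma_star.append(sigma_star[-1] + step)
    return sigma_star
```

Each increment depends on the current LIBOR level, so even with nonrandom `gammas` the resulting bond volatilities inherit the randomness of $L(t,\tau)$.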
Remark 34.1 BGM is a special case of HJM with HJM’s $\sigma^*(t,\tau)$ generated recursively by (5.3'). In BGM, $\gamma(t,\tau)$ is usually taken to be nonrandom; the resulting $\sigma^*(t,\tau)$ is random.
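Remark 34.2 (5.4) (equivalently, (5.4')) is a stochastic partial differential equation because of the $\frac{\partial}{\partial\tau} L(t,\tau)$ term. This is not as terrible as it first appears. Returning to the HJM variables $t$ and $T$, set
\[ K(t,T) = L(t,T-t). \]
Then
\[ dK(t,T) = dL(t,T-t) - \frac{\partial}{\partial\tau}\, L(t,T-t)\,dt, \]
and (5.4) and (5.4') become
\begin{align*}
dK(t,T) &= \gamma(t,T-t)\,K(t,T) \left[ \sigma^*(t,T-t+\delta)\,dt + dW(t) \right] \\
&= \gamma(t,T-t)\,K(t,T) \left[ \sigma^*(t,T-t)\,dt + \frac{\delta K(t,T)\,\gamma(t,T-t)}{1 + \delta K(t,T)}\,dt + dW(t) \right]. \tag{6.1}
\end{align*}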
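Remark 34.3 From (5.3) we have
\[ \gamma(t,\tau)\,L(t,\tau) = \left[ 1 + \delta L(t,\tau) \right] \frac{\sigma^*(t,\tau+\delta) - \sigma^*(t,\tau)}{\delta}. \]
If we let $\delta \downarrow 0$, then
\[ \gamma(t,\tau)\,L(t,\tau) \to \frac{\partial}{\partial\delta}\, \sigma^*(t,\tau+\delta) \bigg|_{\delta=0} = \sigma(t,\tau), \]
and so
\[ \gamma(t,T-t)\,K(t,T) \to \sigma(t,T-t). \]
We saw before (eq. 4.2) that as $\delta \downarrow 0$,
\[ L(t,\tau) \to r(t,\tau) = f(t,t+\tau), \]
so
\[ K(t,T) \to f(t,T). \]
Therefore, the limit as $\delta \downarrow 0$ of (6.1) is given by equation (2.5):
\[ df(t,T) = \sigma(t,T-t) \left[ \sigma^*(t,T-t)\,dt + dW(t) \right]. \]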
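Remark 34.4 Although the $dt$ term in (6.1) has the term $\frac{\delta \gamma^2(t,T-t)\,K^2(t,T)}{1 + \delta K(t,T)}$ involving $K^2$, solutions to this equation do not explode because
\[ \frac{\delta \gamma^2(t,T-t)\,K^2(t,T)}{1 + \delta K(t,T)} \le \frac{\delta \gamma^2(t,T-t)\,K^2(t,T)}{\delta K(t,T)} = \gamma^2(t,T-t)\,K(t,T). \]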
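34.7 Bond prices
Let $\beta(t) = \exp\left\{ \int_0^t r(u)\,du \right\}$. From (2.6) we have
\begin{align*}
d\left( \frac{B(t,T)}{\beta(t)} \right) &= \frac{1}{\beta(t)} \left[ -r(t)\,B(t,T)\,dt + dB(t,T) \right] \\
&= -\frac{B(t,T)}{\beta(t)}\, \sigma^*(t,T-t)\,dW(t).
\end{align*}
The solution $\frac{B(t,T)}{\beta(t)}$ to this stochastic differential equation is given by
\[ \frac{B(t,T)}{\beta(t)\,B(0,T)} = \exp\left\{ -\int_0^t \sigma^*(u,T-u)\,dW(u) - \frac12 \int_0^t \left( \sigma^*(u,T-u) \right)^2 du \right\}. \]
This is a martingale, and we can use it to switch to the forward measure
\[ \mathbb{P}_T(A) = \frac{1}{B(0,T)} \int_A \frac{1}{\beta(T)}\,d\mathbb{P} = \int_A \frac{B(T,T)}{\beta(T)\,B(0,T)}\,d\mathbb{P}, \quad \forall A \in \mathcal{F}(T). \]
Girsanov’s Theorem implies that
\[ W_T(t) = W(t) + \int_0^t \sigma^*(u,T-u)\,du, \quad 0 \le t \le T, \]
is a Brownian motion under $\mathbb{P}_T$.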
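34.8 Forward LIBOR under more forward measure
From (6.1) we have
\begin{align*}
dK(t,T) &= \gamma(t,T-t)\,K(t,T) \left[ \sigma^*(t,T-t+\delta)\,dt + dW(t) \right] \\
&= \gamma(t,T-t)\,K(t,T)\,dW_{T+\delta}(t),
\end{align*}
so
\[ K(t,T) = K(0,T) \exp\left\{ \int_0^t \gamma(u,T-u)\,dW_{T+\delta}(u) - \frac12 \int_0^t \gamma^2(u,T-u)\,du \right\} \]
and
\begin{align*}
K(T,T) &= K(0,T) \exp\left\{ \int_0^T \gamma(u,T-u)\,dW_{T+\delta}(u) - \frac12 \int_0^T \gamma^2(u,T-u)\,du \right\} \tag{8.1} \\
&= K(t,T) \exp\left\{ \int_t^T \gamma(u,T-u)\,dW_{T+\delta}(u) - \frac12 \int_t^T \gamma^2(u,T-u)\,du \right\}.
\end{align*}
We assume that $\gamma$ is nonrandom. Then
\[ X(t) = \int_t^T \gamma(u,T-u)\,dW_{T+\delta}(u) - \frac12 \int_t^T \gamma^2(u,T-u)\,du \tag{8.2} \]
is normal with variance
\[ \rho^2(t) = \int_t^T \gamma^2(u,T-u)\,du \]
and mean $-\tfrac12 \rho^2(t)$.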
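A quick simulation illustrates (8.1) and (8.2) in the special case of constant $\gamma$, where the variance of $X(t)$ is $\gamma^2 (T-t)$. This is only a sketch under that assumption; the function name `simulate_terminal_libor`, the sample size and the seed are illustrative choices, not part of the notes.

```python
import math
import random
from typing import List


def simulate_terminal_libor(k_t: float, gamma: float, time_to_reset: float,
                            n_paths: int, seed: int = 0) -> List[float]:
    """Sample K(T,T) = K(t,T) * exp(X(t)) under the (T + delta)-forward measure.

    Assumes constant gamma, so X(t) is normal with variance
    gamma^2 * (T - t) and mean minus one half of that variance.
    """
    rng = random.Random(seed)
    var = gamma * gamma * time_to_reset
    std = math.sqrt(var)
    return [k_t * math.exp(rng.gauss(-0.5 * var, std)) for _ in range(n_paths)]
```

Because the mean of $X(t)$ is minus half its variance, the sample average of the simulated values should be close to $K(t,T)$, which reflects the martingale property of $K(\cdot,T)$ under $\mathbb{P}_{T+\delta}$. All samples are strictly positive, as expected for a log-normal quantity.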
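34.9 Pricing an interest rate caplet
Consider a floating rate interest payment settled in arrears. At time $T+\delta$, the floating rate interest payment due is $\delta L(T,0) = \delta K(T,T)$, the LIBOR at time $T$. A caplet protects its owner by requiring him to pay only the cap $\delta c$ if $K(T,T) > c$. Thus, the value of the caplet at time $T+\delta$ is $\delta (K(T,T) - c)^+$. We determine its value at times $0 \le t \le T+\delta$.
Case I: $T \le t \le T+\delta$.
\begin{align*}
C_{T+\delta}(t) &= \mathbb{E}\left[ \frac{\beta(t)}{\beta(T+\delta)}\, \delta (K(T,T) - c)^+ \,\middle|\, \mathcal{F}(t) \right] \tag{9.1} \\
&= \delta (K(T,T) - c)^+\, \mathbb{E}\left[ \frac{\beta(t)}{\beta(T+\delta)} \,\middle|\, \mathcal{F}(t) \right] \\
&= \delta (K(T,T) - c)^+\, B(t,T+\delta).
\end{align*}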
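Case II: $0 \le t \le T$.
Recall that
\[ \mathbb{P}_{T+\delta}(A) = \int_A Z(T+\delta)\,d\mathbb{P}, \quad \forall A \in \mathcal{F}(T+\delta), \]
where
\[ Z(t) = \frac{B(t,T+\delta)}{\beta(t)\,B(0,T+\delta)}. \]
We have
\begin{align*}
C_{T+\delta}(t) &= \mathbb{E}\left[ \frac{\beta(t)}{\beta(T+\delta)}\, \delta (K(T,T) - c)^+ \,\middle|\, \mathcal{F}(t) \right] \\
&= \delta B(t,T+\delta)\, \underbrace{\frac{\beta(t)\,B(0,T+\delta)}{B(t,T+\delta)}}_{1/Z(t)}\, \mathbb{E}\Bigg[ \underbrace{\frac{B(T+\delta,T+\delta)}{\beta(T+\delta)\,B(0,T+\delta)}}_{Z(T+\delta)}\, (K(T,T) - c)^+ \,\Bigg|\, \mathcal{F}(t) \Bigg] \\
&= \delta B(t,T+\delta)\, \mathbb{E}_{T+\delta}\left[ (K(T,T) - c)^+ \,\middle|\, \mathcal{F}(t) \right].
\end{align*}
From (8.1) and (8.2) we have
\[ K(T,T) = K(t,T) \exp\{ X(t) \}, \]
where $X(t)$ is normal under $\mathbb{P}_{T+\delta}$ with variance $\rho^2(t) = \int_t^T \gamma^2(u,T-u)\,du$ and mean $-\tfrac12 \rho^2(t)$. Furthermore, $X(t)$ is independent of $\mathcal{F}(t)$.
\[ C_{T+\delta}(t) = \delta B(t,T+\delta)\, \mathbb{E}_{T+\delta}\left[ \left( K(t,T) \exp\{ X(t) \} - c \right)^+ \,\middle|\, \mathcal{F}(t) \right]. \]
Set
\begin{align*}
g(y) &= \mathbb{E}_{T+\delta}\left[ \left( y \exp\{ X(t) \} - c \right)^+ \right] \\
&= y\, N\!\left( \frac{1}{\rho(t)} \log\frac{y}{c} + \frac12 \rho(t) \right) - c\, N\!\left( \frac{1}{\rho(t)} \log\frac{y}{c} - \frac12 \rho(t) \right).
\end{align*}
Then
\[ C_{T+\delta}(t) = \delta B(t,T+\delta)\, g(K(t,T)), \quad 0 \le t \le T. \tag{9.2} \]
In the case of constant $\gamma$, we have
\[ \rho(t) = \gamma \sqrt{T-t}, \]
and (9.2) is called the Black caplet formula.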
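Formula (9.2) with constant $\gamma$ is straightforward to evaluate. The sketch below is one possible implementation; the function and parameter names (`norm_cdf`, `black_g`, `caplet_value`, `bond_price`, and so on) are hypothetical, and the standard normal distribution function $N$ is computed from the error function.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_g(y: float, c: float, rho: float) -> float:
    """g(y) from Section 34.9: E[(y exp(X) - c)^+], X ~ N(-rho^2/2, rho^2)."""
    if rho <= 0.0:
        return max(y - c, 0.0)  # no remaining randomness: intrinsic value
    d_plus = math.log(y / c) / rho + 0.5 * rho
    d_minus = d_plus - rho
    return y * norm_cdf(d_plus) - c * norm_cdf(d_minus)


def caplet_value(bond_price: float, forward_libor: float, cap_rate: float,
                 gamma: float, time_to_reset: float, delta: float) -> float:
    """Caplet value (9.2) for constant gamma, valid for 0 <= t <= T.

    bond_price    : B(t, T + delta)
    forward_libor : K(t, T)
    time_to_reset : T - t
    """
    rho = gamma * math.sqrt(time_to_reset)
    return delta * bond_price * black_g(forward_libor, cap_rate, rho)
```

For example, `caplet_value(0.95, 0.05, 0.05, 0.2, 1.0, 0.25)` values an at-the-money caplet with one year to reset; at the money, $g$ reduces to $y\,(2N(\rho/2) - 1)$.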
34.10 Pricing an interest rate cap
Let
\[ T_0 = 0,\ T_1 = \delta,\ T_2 = 2\delta,\ \ldots,\ T_n = n\delta. \]
A cap is a series of payments
\[ \delta \left( K(T_k,T_k) - c \right)^+ \text{ at time } T_{k+1}, \quad k = 0, 1, \ldots, n-1. \]
The value at time $t$ of the cap is the value of all remaining caplets, i.e.,
\[ C(t) = \sum_{k:\, t \le T_k} C_{T_k}(t). \]
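34.11 Calibration of BGM
The interest rate caplet $c$ on $L(0,T)$ at time $T+\delta$ has time-zero value
\[ C_{T+\delta}(0) = \delta B(0,T+\delta)\, g(K(0,T)), \]
where $g$ (defined in the last section) depends on
\[ \int_0^T \gamma^2(u,T-u)\,du. \]
Let us suppose $\gamma$ is a deterministic function of its second argument, i.e.,
\[ \gamma(t,\tau) = \gamma(\tau). \]
Then $g$ depends on
\[ \int_0^T \gamma^2(T-u)\,du = \int_0^T \gamma^2(v)\,dv. \]
If we know the caplet price $C_{T+\delta}(0)$, we can “back out” the squared volatility $\int_0^T \gamma^2(v)\,dv$. If we know caplet prices
\[ C_{T_0+\delta}(0),\ C_{T_1+\delta}(0),\ \ldots,\ C_{T_n+\delta}(0), \]
where $T_0 < T_1 < \ldots < T_n$, we can “back out”
\[ \int_0^{T_0} \gamma^2(v)\,dv, \quad \int_{T_0}^{T_1} \gamma^2(v)\,dv = \int_0^{T_1} \gamma^2(v)\,dv - \int_0^{T_0} \gamma^2(v)\,dv, \quad \ldots, \quad \int_{T_{n-1}}^{T_n} \gamma^2(v)\,dv. \tag{11.1} \]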
In this case, we may assume that $\gamma$ is constant on each of the intervals
\[ (0,T_0),\ (T_0,T_1),\ \ldots,\ (T_{n-1},T_n), \]
and choose these constants to make the above integrals have the values implied by the caplet prices.
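The piecewise-constant choice can be sketched as follows, assuming the integrated squared volatilities $\int_0^{T_i} \gamma^2(v)\,dv$ have already been backed out of the caplet prices. The function name `piecewise_gamma` and its argument names are hypothetical.

```python
import math
from typing import List, Sequence


def piecewise_gamma(maturities: Sequence[float],
                    integrated_variances: Sequence[float]) -> List[float]:
    """Constant gamma on (0,T_0), (T_0,T_1), ..., (T_{n-1},T_n) matching (11.1).

    maturities[i]           : T_i, strictly increasing and positive
    integrated_variances[i] : integral of gamma^2(v) over [0, T_i]
    """
    gammas = []
    prev_T, prev_V = 0.0, 0.0
    for T, V in zip(maturities, integrated_variances):
        if T <= prev_T or V < prev_V:
            raise ValueError("maturities must increase and variances must not decrease")
        # the increment of integrated variance equals gamma^2 * interval length
        gammas.append(math.sqrt((V - prev_V) / (T - prev_T)))
        prev_T, prev_V = T, V
    return gammas
```

For instance, maturities `[0.25, 0.5, 1.0]` with integrated variances `[0.01, 0.02, 0.04]` give a value of about 0.2 on every interval. A decreasing sequence of integrated variances is rejected, since no real-valued $\gamma$ could produce it.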
If we know caplet prices $C_{T+\delta}(0)$ for all $T \ge 0$, we can “back out” $\int_0^T \gamma^2(v)\,dv$ and then differentiate to discover $\gamma^2(\tau)$ and $\gamma(\tau) = \sqrt{\gamma^2(\tau)}$ for all $\tau \ge 0$.
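To implement BGM, we need both $\gamma(\tau)$, $\tau \ge 0$, and
\[ \sigma^*(t,\tau), \quad t \ge 0,\ 0 \le \tau < \delta. \]
Now $\sigma^*(t,\tau)$ is the volatility at time $t$ of a zero coupon bond maturing at time $t+\tau$ (see (2.6)). Since $\delta$ is small (say $\tfrac14$ year), and $0 \le \tau < \delta$, it is reasonable to set
\[ \sigma^*(t,\tau) = 0, \quad t \ge 0,\ 0 \le \tau < \delta. \]
We can now solve (or simulate) to get
\[ L(t,\tau), \quad t \ge 0,\ \tau \ge 0, \]
or equivalently,
\[ K(t,T), \quad t \ge 0,\ T \ge 0, \]
using the recursive procedure outlined at the start of Section 34.6.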
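34.12 Long rates
The long rate is determined by long maturity bond prices. Let $n$ be a large fixed positive integer, so that $n\delta$ is 20 or 30 years. Then
\begin{align*}
\frac{1}{D(t,n\delta)} &= \exp\left\{ \int_0^{n\delta} r(t,u)\,du \right\} \\
&= \prod_{k=1}^{n} \exp\left\{ \int_{(k-1)\delta}^{k\delta} r(t,u)\,du \right\} \\
&= \prod_{k=1}^{n} \left[ 1 + \delta L(t,(k-1)\delta) \right],
\end{align*}
where the last equality follows from (4.1). The long rate is
\[ \frac{1}{n\delta} \log\frac{1}{D(t,n\delta)} = \frac{1}{n\delta} \sum_{k=1}^{n} \log\left[ 1 + \delta L(t,(k-1)\delta) \right]. \]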
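The long-rate formula is a direct sum over the forward LIBOR curve. A minimal sketch, assuming the forward LIBORs $L(t,(k-1)\delta)$, $k = 1, \ldots, n$, at one fixed time $t$ are supplied as a list; the function name `long_rate` is hypothetical.

```python
import math
from typing import Sequence


def long_rate(forward_libors: Sequence[float], delta: float) -> float:
    """Long rate (1 / (n delta)) * sum_k log(1 + delta * L(t, (k-1) delta)).

    forward_libors[k-1] is assumed to hold L(t, (k-1)*delta), k = 1..n.
    """
    n = len(forward_libors)
    # log1p(x) = log(1 + x), accurate for the small values delta * L
    return sum(math.log1p(delta * L) for L in forward_libors) / (n * delta)
```

With a flat curve of 4% and $\delta = 0.25$ over 80 periods (20 years), the long rate equals $\log(1.01)/0.25$, a little below 4% because it is a continuously compounded rate.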
34.13 Pricing a swap
Let $T_0 \ge 0$ be given, and set
\[ T_1 = T_0 + \delta,\ T_2 = T_0 + 2\delta,\ \ldots,\ T_n = T_0 + n\delta. \]
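The swap is the series of payments
\[ \delta \left( L(T_k,0) - c \right) \text{ at time } T_{k+1}, \quad k = 0, 1, \ldots, n-1. \]
For $0 \le t \le T_0$, the value of the swap is
\[ \sum_{k=0}^{n-1} \mathbb{E}\left[ \frac{\beta(t)}{\beta(T_{k+1})}\, \delta \left( L(T_k,0) - c \right) \,\middle|\, \mathcal{F}(t) \right]. \]
Now
\[ 1 + \delta L(T_k,0) = \frac{1}{B(T_k,T_{k+1})}, \]
so
\[ L(T_k,0) = \frac{1}{\delta} \left[ \frac{1}{B(T_k,T_{k+1})} - 1 \right]. \]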
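We compute
\begin{align*}
\mathbb{E}\left[ \frac{\beta(t)}{\beta(T_{k+1})}\, \delta \left( L(T_k,0) - c \right) \,\middle|\, \mathcal{F}(t) \right]
&= \mathbb{E}\left[ \frac{\beta(t)}{\beta(T_{k+1})} \left( \frac{1}{B(T_k,T_{k+1})} - 1 - \delta c \right) \,\middle|\, \mathcal{F}(t) \right] \\
&= \mathbb{E}\Bigg[ \frac{\beta(t)}{\beta(T_k)\,B(T_k,T_{k+1})}\, \underbrace{\mathbb{E}\left[ \frac{\beta(T_k)}{\beta(T_{k+1})} \,\middle|\, \mathcal{F}(T_k) \right]}_{B(T_k,T_{k+1})} \,\Bigg|\, \mathcal{F}(t) \Bigg] - (1 + \delta c)\,B(t,T_{k+1}) \\
&= \mathbb{E}\left[ \frac{\beta(t)}{\beta(T_k)} \,\middle|\, \mathcal{F}(t) \right] - (1 + \delta c)\,B(t,T_{k+1}) \\
&= B(t,T_k) - (1 + \delta c)\,B(t,T_{k+1}).
\end{align*}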
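The per-payment value just derived needs only the time-$t$ bond prices. The sketch below evaluates it for each $k$; the function name `swap_payment_values` and the list layout of the bond prices are assumptions made for illustration.

```python
from typing import List, Sequence


def swap_payment_values(bond_prices: Sequence[float], swap_rate: float,
                        delta: float) -> List[float]:
    """Time-t value of each swap payment, B(t,T_k) - (1 + delta*c) B(t,T_{k+1}).

    bond_prices[k] : assumed to hold B(t, T_k) for k = 0..n
    swap_rate      : the fixed rate c
    Returns one value per payment date T_{k+1}, k = 0..n-1.
    """
    factor = 1.0 + delta * swap_rate
    return [bond_prices[k] - factor * bond_prices[k + 1]
            for k in range(len(bond_prices) - 1)]
```

Summing the returned list gives the value of the whole swap at time $t$. For example, bond prices `[1.0, 0.99, 0.98]` with $c = 0.04$ and $\delta = 0.25$ give payment values of about 0.0001 and 0.0002.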
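The value of the swap at time $t$ is
\begin{align*}
\sum_{k=0}^{n-1} \mathbb{E}\left[ \frac{\beta(t)}{\beta(T_{k+1})}\, \delta \left( L(T_k,0) - c \right) \,\middle|\, \mathcal{F}(t) \right]
&= \sum_{k=0}^{n-1} \left[ B(t,T_k) - (1 + \delta c)\,B(t,T_{k+1}) \right] \\
&= B(t,T_0) - (1 + \delta c)\,B(t,T_1) + B(t,T_1) - (1 + \delta c)\,B(t,T_2) + \ldots + B(t,T_{n-1}) - (1 + \delta c)\,B(t,T_n)
\end{align*}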
= Bt;T0