Imagine a particle moving randomly on an infinite square grid. The particle begins at some initial point $P$, and at each time step it moves to one of its four neighboring points, each with probability $\frac{1}{4}$. This is an example of a random walk. Consider the following question: if this random walk is allowed to proceed forever, what is the probability that the particle, at some point, revisits its starting point $P$?
We can simplify the model above and consider instead a one-dimensional grid, i.e. a line. In this case, the particle can only move left or right, each with probability $\frac{1}{2}$.
This model is easier to relate to real scenarios. For example, imagine repeatedly flipping a fair coin in a gambling game: heads means you gain \$1, and tails means you lose \$1. A famous theorem says that if you continue this game indefinitely (starting with a finite amount of money), you will eventually go bankrupt with probability 1. (See gambler's ruin.) This is closely related to the statement that a random walk on a one-dimensional grid returns to its starting point with probability 1, which turns out to be true.
It turns out that a random walk on a two-dimensional grid also returns to its starting point with probability 1, meaning that the answer to the initial question is 100%. However, on a three-dimensional grid, the probability of returning to the starting point drops to roughly 34%. In four, five, and six dimensions, it drops further to roughly 19.3%, 13.5%, and 10.4%, respectively.
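These values can be checked empirically. Below is a minimal Monte Carlo sketch in Python (the trial count and the step cap `max_steps` are arbitrary choices of the sketch, not part of any theorem); because walks are truncated, the estimate is biased downward, severely so for $D=1$ and $D=2$, where returns can take extremely long.

```python
import random

def estimate_return_probability(D, trials=2000, max_steps=5000):
    """Estimate the chance that a walk on Z^D revisits the origin,
    by simulating truncated walks. Truncation biases the estimate
    downward -- severely for D = 1, 2, where returns can take very long."""
    returned = 0
    for _ in range(trials):
        pos = [0] * D
        for _ in range(max_steps):
            axis = random.randrange(D)           # pick a coordinate axis
            pos[axis] += random.choice((-1, 1))  # step along it
            if all(c == 0 for c in pos):
                returned += 1
                break
    return returned / trials

for D in (1, 2, 3):
    # the limits are 1.0, 1.0, and about 0.34 as max_steps grows
    print(D, estimate_return_probability(D))
```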
Why? Intuitively, the drop in probability makes sense. In higher dimensions, there are more degrees of freedom to move around, so the probability of returning to the starting point decreases. But why is there a sudden phase transition between dimensions 2 and 3? And how does one calculate the percentage values above? Lastly, how long does it take for a random walk to return to its starting point?
An additional surprising result is the following. Although a random walk on a one-dimensional or two-dimensional grid returns to its starting point with probability 1, the expected number of steps it takes to do so is infinite. Why?
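A simulation hints at why: in one dimension the first-return time $T$ has a heavy tail ($\Pr[T > n]$ decays only like $1/\sqrt{n}$), so the median return time is tiny while the sample mean is dominated by rare, enormous excursions. Here is a hedged sketch; the cap `max_steps` is an artifact of the simulation, and raising it drives the sample mean higher and higher.

```python
import random

def first_return_time(max_steps=10**6):
    """Steps until a walk on Z first returns to 0 (None if it hits the cap)."""
    pos = 0
    for step in range(1, max_steps + 1):
        pos += random.choice((-1, 1))
        if pos == 0:
            return step
    return None

times = [t for t in (first_return_time() for _ in range(2000)) if t is not None]
times.sort()
print("median return time:", times[len(times) // 2])   # typically 2
print("mean return time:  ", sum(times) / len(times))  # huge, and grows with the cap
```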
The core goals of this page are to develop an intuitive understanding of the phase transition between dimensions 2 and 3 and of the infinite expected return time. We'll introduce some mathematical tools which enable precise calculation. We'll also explore Brownian motion, a continuous analogue of the random walk which makes some appearances in nature.
TODO Explain how to define Brownian motion
Brownian motion in the plane, set against a pair of axes.
Brownian motion in 3D space.
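One standard way to generate pictures like these: Brownian motion has independent Gaussian increments, $B_{t+\Delta t} - B_t \sim \mathcal{N}(0, \Delta t)$ in each coordinate, so a path can be sampled on a time grid by cumulatively summing Gaussian steps. A minimal NumPy sketch (the horizon $T$, step count, and seed are arbitrary choices):

```python
import numpy as np

def brownian_path(dim, n_steps=10000, T=1.0, seed=0):
    """Sample a Brownian path in R^dim at times k*T/n_steps.

    Each increment over a time slice of length dt is an independent
    N(0, dt) draw in every coordinate, so the path is the cumulative
    sum of Gaussian steps, starting from the origin."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    increments = rng.normal(0.0, np.sqrt(dt), size=(n_steps, dim))
    return np.vstack([np.zeros(dim), np.cumsum(increments, axis=0)])

plane_path = brownian_path(dim=2)  # e.g. plot plane_path[:, 0] vs plane_path[:, 1]
space_path = brownian_path(dim=3)
```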
Here, simulation runs are shown for many random walks in $\mathbb{Z}^D$ beginning at the origin, for dimensions $D=2$ and $D=3$. A random walk stops if it ever reaches the origin again. A histogram is given, showing the distribution of the current $L^1$-distance from the origin (distance from the origin shown along the x-axis, percentage of walks with that distance shown along the y-axis).
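Here is a sketch of the simulation just described, under the stated rules: each walk freezes once it returns to the origin, and the $L^1$ distances are tallied after a fixed number of steps. The walk count and step count below are arbitrary choices.

```python
import random
from collections import Counter

def l1_histogram(D, n_walks=5000, n_steps=500):
    """Percentage of walks in Z^D at each L^1 distance from the origin
    after n_steps steps; a walk freezes once it returns to the origin,
    so returned walks are counted at distance 0."""
    counts = Counter()
    for _ in range(n_walks):
        pos = [0] * D
        for _ in range(n_steps):
            axis = random.randrange(D)
            pos[axis] += random.choice((-1, 1))
            if all(c == 0 for c in pos):
                break  # walk has returned; it stays at distance 0
        counts[sum(abs(c) for c in pos)] += 1
    return {dist: 100 * n / n_walks for dist, n in sorted(counts.items())}

for D in (2, 3):
    print(D, l1_histogram(D))
```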
Below is a sketch of the proof of the following fact. Consider a random walk in $\mathbb{Z}^D$ beginning at the origin, for some dimension $D \ge 1$. At each timestep, the walk moves to one of the $2D$ neighboring points with equal probability. Then, the probability that the walk never returns to the origin is (a) zero if $D=1$ or $D=2$, and (b) positive if $D > 2$.
Let $a_n$ denote the number of paths of length $n$ in $\mathbb{Z}^D$ which begin and end at the origin. Let $b_n$ denote the number of such paths which never touch the origin between the start and end points. By convention, let $a_0=1$ and $b_0=0$. Let $$A(t) = 1 + a_1t + a_2t^2 + a_3t^3 + \cdots$$ $$B(t) = b_1t + b_2t^2 + b_3t^3 +\cdots$$ be the generating functions for $a_n$ and $b_n$, respectively.
First, since all $(2D)^n$ paths of length $n$ are equally likely, the probability that a random walk returns to the origin for the first time after exactly $n$ steps is equal to $b_n/(2D)^n$. So the probability that a random walk ever returns to the origin is equal to $\sum_n b_n/(2D)^n = B(1/(2D))$.
Second, we have the identity $$A(t) = 1 + B(t) + B(t)^2 + \ldots = \frac{1}{1-B(t)}$$ This follows from a combinatorial argument: every loop at the origin decomposes uniquely into a sequence of first-return loops, and the term $B(t)^k$ counts the loops made up of exactly $k$ first returns. From this identity, it follows that $B(1/(2D)) = 1 - \frac{1}{A(1/(2D))}$.
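This identity is easy to sanity-check by brute-force counting in $D = 1$, where in coefficient form it reads $a_n = \sum_{k=1}^{n} b_k\,a_{n-k}$ for $n \ge 1$ (split each loop at its first return to the origin). The dynamic-programming sketch below is illustrative only, not part of the proof.

```python
from collections import Counter

def loop_counts(n_max):
    """a_n: number of length-n walks on Z that start and end at 0."""
    a = [1] + [0] * n_max
    dist = Counter({0: 1})  # dist[p] = number of walks currently at position p
    for n in range(1, n_max + 1):
        new = Counter()
        for pos, ways in dist.items():
            new[pos - 1] += ways
            new[pos + 1] += ways
        dist = new
        a[n] = dist[0]
    return a

def first_return_counts(n_max):
    """b_n: walks 0 -> 0 of length n that avoid 0 strictly in between."""
    b = [0] * (n_max + 1)
    dist = Counter({0: 1})
    for n in range(1, n_max + 1):
        new = Counter()
        for pos, ways in dist.items():
            new[pos - 1] += ways
            new[pos + 1] += ways
        b[n] = new[0]
        new.pop(0, None)  # forbid revisiting 0 before the end
        dist = new
    return b

a, b = loop_counts(12), first_return_counts(12)
# A(t) = 1/(1 - B(t)) is equivalent to a_n = sum_k b_k * a_{n-k} for n >= 1
for n in range(1, 13):
    assert a[n] == sum(b[k] * a[n - k] for k in range(1, n + 1))
print("a:", a[:8])  # 1, 0, 2, 0, 6, 0, 20, 0
print("b:", b[:8])  # 0, 0, 2, 0, 2, 0, 4, 0
```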
Third, consider the $(D+1)$-variable generating function $$F(x_1, x_2, \ldots, x_D, t) =1 + (x_1 + x_1^{-1} + \ldots + x_D + x_D^{-1})t + (x_1 + x_1^{-1} + \ldots + x_D + x_D^{-1})^2t^2 + \ldots$$ $$=\frac{1}{1-(x_1 + x_1^{-1} + \ldots + x_D + x_D^{-1})t}$$ The coefficient of $x_1^{c_1}x_2^{c_2}\ldots x_D^{c_D}t^n$ in this generating function is equal to the number of paths of length $n$ which begin at the origin and end at the point $(c_1, c_2, \ldots, c_D)$: each factor of $(x_1 + x_1^{-1} + \ldots + x_D + x_D^{-1})$ corresponds to one step of the walk. It then follows that $a_n$ is the coefficient of $x_1^0x_2^0\ldots x_D^0t^n$ in $F(x_1, x_2, \ldots, x_D, t)$.
The coefficient of $z^0$ in a generating function $f(z)$ can be computed using the complex contour integral $$[z^0]f(z) = \frac{1}{2\pi i}\oint_{|z|=1}\frac{f(z)}{z}dz = \frac{1}{2\pi}\int\limits_{0}^{2\pi}f(e^{i\theta})d\theta$$ The second step above comes from the substitution $z=e^{i\theta}$, under which $\frac{dz}{z} = i\,d\theta$. We apply this for each of the $x_k$'s, letting $x_k=e^{i\theta_k}$ and noting that $x_k + x_k^{-1} = 2\cos(\theta_k)$. This retrieves the entire function $A(t)$ as an integral: $$A(t) =\left(\frac{1}{2\pi }\right)^D\int\limits_{[0,2\pi]^D}F(e^{i\theta_1}, \ldots, e^{i\theta_D}, t)d\theta_1\ldots d\theta_D = \left(\frac{1}{2\pi }\right)^D\int\limits_{[0,2\pi]^D}\frac{1}{1-2(\cos(\theta_1) + \ldots + \cos(\theta_D))t}d\theta_1\ldots d\theta_D$$ Therefore, $$A(1/(2D)) = \left(\frac{1}{2\pi }\right)^D\int\limits_{[0,2\pi]^D}\frac{1}{1-(\cos(\theta_1) + \ldots + \cos(\theta_D))/D}d\theta_1\ldots d\theta_D$$
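For $D = 3$ this last integral can be estimated numerically. Below is a midpoint-rule sketch; the grid resolution `N` is an arbitrary choice, and the midpoints conveniently avoid the integrable singularities at the corners, where every $\cos(\theta_k) = 1$.

```python
import numpy as np

def A_at_critical(D=3, N=200):
    """Rough midpoint-rule estimate of A(1/(2D)) for D = 3."""
    theta = (np.arange(N) + 0.5) * 2 * np.pi / N  # cell midpoints in [0, 2*pi]
    c = np.cos(theta)
    s = c[:, None, None] + c[None, :, None] + c[None, None, :]
    integrand = 1.0 / (1.0 - s / D)
    # the (1/(2*pi))^D prefactor cancels against the cell volumes, leaving a mean
    return integrand.mean()

A = A_at_critical()
print("A(1/6) ≈", A)                     # the known value is about 1.5164
print("return probability ≈", 1 - 1/A)  # about 0.34, matching the figure above
```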
We claim that the above integral diverges when $D \le 2$ (implying that $1-\frac{1}{A(1/(2D))} = 1$, so the walk returns to the origin in finite time with probability 1) and converges when $D > 2$ (implying that $1-\frac{1}{A(1/(2D))} < 1$, so there is a nonzero probability that the walk never returns to the origin). Why is this claim true? The integrand blows up at the points where every $\cos(\theta_k) = 1$, such as $\theta = (0, \ldots, 0)$. Near such a point, the Taylor expansion $\cos(\theta_k) \approx 1 - \theta_k^2/2$ shows that the denominator behaves like $(\theta_1^2 + \ldots + \theta_D^2)/(2D)$, so the integrand grows like $2D/|\theta|^2$. In $D$-dimensional polar coordinates, the contribution near the singularity is on the order of $\int_0 \frac{r^{D-1}}{r^2}\,dr = \int_0 r^{D-3}\,dr$, which diverges precisely when $D \le 2$. (TODO Visualization in 2D.)