High-Dimensional Probability

In recent years the study of probability problems in high dimensions has become increasingly important. Such problems arise in a number of fields, from biology to computer science. The main take-away is that our “low-dimensional” intuition can fail badly in high dimensions. This is generally known as the curse of dimensionality, a term coined by Richard E. Bellman. We begin by considering some questions from geometry:

Example (4.3.1)

What is the area of the hyper-sphere relative to the hyper-cube in which it is inscribed?

Here is the answer in two dimensions:

\[\frac{\text{area of circle}}{\text{area of square}} = \frac{\pi}{4}\approx 0.785\]

So our intuition tells us that the circle takes up a large part of the square. However, in d dimensions we find

\[ \begin{aligned} &\text{area of square} =2^d \\ &\text{area of circle} = \frac{2\pi^{d/2}}{d\Gamma(d/2)} \\ &\frac{\text{area of circle}}{\text{area of square}} = \frac{\pi^{d/2}}{d2^{d-1}\Gamma(d/2)}\\ \end{aligned} \]

# ratio of the area of the circle to the area of the square as a function of d
d=2:20
plot(d, pi^(d/2)/d/2^(d-1)/gamma(d/2), ylab="")

and so we see that this ratio goes to 0! In high-dimensional space almost none of the area is in the circle; almost all of it is in the corners.
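
We can check this with a quick Monte Carlo simulation (a sketch using only base R; the sample size B and the dimensions shown are arbitrary choices): sample points uniformly in the cube \([-1,1]^d\) and count how many fall inside the sphere.

# estimate by simulation the fraction of the cube's area that lies inside the circle
B = 1e4
for(d in c(2, 5, 10)) {
  x = matrix(runif(B*d, min=-1, max=1), ncol=d)  # B random points in the cube
  cat("d =", d, " simulated:", mean(rowSums(x^2) <= 1),
      " exact:", round(pi^(d/2)/d/2^(d-1)/gamma(d/2), 4), "\n")
}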

Can one understand this somehow? Observe that a cube in d dimensions has \(2^d\) corners, so when we go from d to d+1 dimensions the number of corners doubles. On the other hand there is always only one inner circle.

Consider the distance from the center (at the origin) to a corner. By Pythagoras' theorem, in two dimensions this is \(\sqrt{1^2+1^2}=\sqrt 2\). In d dimensions the corner \((1,1,\dots,1)\) is at distance \(\sqrt{1^2+\dots+1^2}=\sqrt d\), which goes to infinity! So although we never leave the cube, even straight-line distances within it go to infinity.

Next some examples from probability:

Example (4.3.2)

Say we have a d-dimensional standard normal distribution. In one dimension we have of course the famous \(P(|X|>2)\approx 0.05\), that is, the probability that a number chosen at random from the standard normal distribution is far from 0 (more than 2 units) is low (about 5%). How does this look in d dimensions?

So let \(\mathbf{X} = (X_1,\dots,X_d)\) be a point in \(\mathbb{R}^d\), with \(X_i\sim N(0,1)\) and all independent. Then the Euclidean distance of \(\mathbf{X}\) from the origin is \(\sqrt{\sum X_i^2}\). Note that \(\sum X_i^2\sim \Gamma(d/2,\frac12)\); this is also called the chi-square distribution with d degrees of freedom. So now

\[P(\sqrt{\sum X_i^2} >2) = P(\sum X_i^2 >4)\]

# P(distance from origin > 2) as a function of the dimension d;
# equivalently 1-pchisq(4, d), the chi-square tail probability
d=1:20
plot(d, 1-pgamma(4, d/2, 0.5), ylab="")

and so in high-dimensional space almost all the points are far from the origin!

Again this is not so surprising once you think about it a bit: for \(\sum X_i^2\) to be small, all of the \(X_i\) have to be small. While each of them is small with high probability, among d of them sooner or later there will be a larger one, making the sum large as well. In fact, since \(E[X_i^2]=1\), the law of large numbers tells us that \(\sum X_i^2\approx d\), so the typical distance from the origin grows like \(\sqrt d\).
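
A small simulation illustrates this (a sketch; the number of replications B and the dimensions shown are arbitrary choices):

# simulate B points from the d-dimensional standard normal and
# estimate the probability that their distance from the origin exceeds 2
B = 1e4
for(d in c(1, 5, 10, 20)) {
  x = matrix(rnorm(B*d), ncol=d)
  cat("d =", d, " simulated:", mean(sqrt(rowSums(x^2)) > 2),
      " exact:", round(1-pgamma(4, d/2, 0.5), 4), "\n")
}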

One of the most amazing results is this one:

Example (4.3.3)

Consider the hypersphere in \(\mathbb{R}^d\). Choose a point \(y\) at random on it and consider the set of points on the sphere whose distance from \(y\) is \(0.5\pm \epsilon\), a thin band around the sphere.

Now we choose another point at random on the hypersphere. What is the probability that it lies in this band? Our intuition tells us that the probability is small (for small \(\epsilon\)), as it would be in the two-dimensional case. However, one can show that

\[P\ge 1-\exp(-d\epsilon^2/8)\]

which goes to 1 as \(d\) goes to infinity! So in high-dimensional space almost all the points will be in this band.
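
We can plot this lower bound as a function of the dimension, here for \(\epsilon=0.1\) (an arbitrary choice, just for illustration), in the same style as the plots above:

# lower bound 1-exp(-d*eps^2/8) on the probability, for eps = 0.1
eps = 0.1
d = 1:5000
plot(d, 1-exp(-d*eps^2/8), ylab="")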

So, how can one do probability in such high-dimensional spaces? One of the most important tools is the concentration inequalities we discussed before. They turn out to still work even in high dimensions and can often give us a good idea of just where the high-probability regions are.

For more on this fairly recent topic see the book High-Dimensional Probability by Roman Vershynin.