We have previously seen that if we have a finite sample space S and all the outcomes are equally likely, then P(A)=n(A)/n(S), so finding probabilities means counting the number of outcomes in A and in S. The mathematical theory that deals with counting is called combinatorics. Here we will consider some special cases and their formulas.
Fundamental Theorem of Counting
If a job consists of k separate tasks, the ith of which can be carried out in ni ways, the entire job can be done in \(n_1 \times n_2 \times..\times n_k\) ways
proof
say k=2, say task 1 has outcomes a1 ,..,an and task 2 has outcomes b1 ,..,bm , then clearly there are \(n \times m\) combinations. The general case follows from mathematical induction.
You possess 5 pairs of shoes, 10 pairs of socks, 4 pairs of trousers and 9 shirts. How many combinations of outfits are there?
\[5\times 10 \times 4 \times 9\]
Many of the problems in combinatorics are variations of the following: say we have a box with n balls, numbered 1 to n, and we select k of them. In how many ways can this be done? In order to answer this question we need to be more specific on how the draws are done:
Case 1: with order and with repetition
Balls are drawn as follows: pick one ball, write down the number, replace the ball in the box, draw a second ball etc. In this case we have the order in which the balls are drawn, and we have repetition , that is the same ball can be chosen more than once.
Say n=10 and k=3, then some possible outcomes are: (7,2,3), (7,6,7), (6,7,7)
According to the Fundamental Theorem of counting this each task (drawing a ball) can be done in n ways, and there are k tasks, so the total number of ways is
\[n \times n \times n \times...\times n=n^k\]
Case 2: with order but without repetition
Balls are drawn as follows: pick one ball, write down the number, put the ball aside, not back in the box, draw a second ball etc. In this case we have the order in which the balls are drawn but each ball can be drawn only once, so there is no repetition
Say n=10 and k=3, then some possible outcomes are: (7,2,3), (7,6,4) but not (6,7,7)
According to the Fundamental Theorem of counting the first task (drawing the first ball) can be done in n ways, the second task can be done in n-1 ways and so on until the k^th task which can be done in n-k+1 ways, so the total number of ways is
\[n \times (n-1) \times(n-2) \times...\times (n-k+1)\]
This is often call the number of permutations of n objects, k at a time, and we use the notation \(P^n_k\).
An important special case is k=n, which is just called the permutations of n objects.
n!=n(n-1)(n-2)..1 is called “n factorial”
Note by definition 0!=1
with this definition we have \(P^n_k =n!/(n-k)!\).
Case 3: without order and without repetition
Balls are drawn as follows: pick all the balls simultaneously. In this case we have no order and no repetition
Say n=10 and k=3, then some possible outcomes are: (7,2,3), (7,6,4) but (6,4,7) is the same as (7,6,4)
This is called the number of combinations of n objects, k at a a time, and is denoted by \(C^n_k\).
To do this think in terms of a two tasks: first select without order and without repetition (which can be done in \(C^n_k\) ways) and then order the k selected objects (in k! ways) The result is a selection with order but without repetition, but this is \(P^n_k\)!. So we find:
\[P^n_k =C^n_k \times k!\] \[C^n_k =\frac{n}{(n-k)!k!}\]
\[C^n_k = \frac{n!}{(n-k)!k!} = {{n}\choose{k}}\]
We say “n choose k”
Case 4: without order but repetition
as is case 1, but the order is now irrelevant. This is somewhat more complicated, but the answer is
\[{{n+k-1}\choose{k}}\]
How many different license plates can there be in PR? A license plate has three letters and three numbers, order is important and there is repetition, there are
\[26\times 26\times 26\times 10\times 10\times 10 = 17,576,000\]
possible plates
Poker is played in a large number of different ways. Here we will keep it simple: we have a deck of 52 cards. Each card has a suit (Hearts, Diamonds, Clubs and Spades) and a denomination (2-10, Jack, Queen, King and Ace). A “hand” is any selection of 5 cards. The order is not important, and each combination of suit-denomination appears only once, so selection is done without order and without repetition.
How many 5-card hands are there?
\[C^{52}_5 = 52!/47!/5! = 2598960\]
What is the probability of a “four of a kind”, that is four cards of the same denomination?
First choose the denomination (13 ways), net select all those cards (1 way), finally choose a card from the rest of the deck (48 ways) so
\[P(\text{four of a kind}) =13\times 1\times 48/2598960 = 0.00024\].
What is the probability of a “full house”, that is three cards of one denomination and two cards of a second denomination?
First choose the denomination for the three cards (13 ways) next pick the three cards (\(C^4_3 =4\) ways) then pick the denominations for the two cards (12 ways) and finally pick those two cards (\(C^4_2 =6\) ways), so
\[P(\text{full house}) =13\times 4\times 12\times 6/2598960 = 0.00144\]
Notice that the probability of a “four of a kind” is smaller than the one for “full house”, and there fore a “four of a kind” beats a “full house”
The Pigeon Hole Principle states that if there are n pigeons, m pigeon holes and n>m, there must be at least one hole with more than one pigeon.
This completely obvious principle has many, often surprising, consequences. One of the most famous is this one:
There must be at least two people in Puerto Rico with the exact same number of hairs on their heads!
We can demonstrate this as follows. Since a typical human head has an average of around 150,000 hairs, it is reasonable to assume (as an upper bound) that no one has more than 1,000,000 hairs on their head (m = 1 million holes). There are more than 1,000,000 people in Puerto Rico (n is bigger than 1 million items). Assigning a pigeon hole to each number of hairs on a person’s head, and assigning people to pigeon holes according to the number of hairs on their head, there must be at least two people assigned to the same pigeon hole by the 1,000,001st assignment.
Perhaps the first written reference to the pigeon hole principle appears in 1622 in a short sentence of the Latin work Selectae Propositiones, by the French Jesuit Jean Leurechon, where he wrote “It is necessary that two men have the same number of hairs, or other things, as each other”.
Let \(|A|\) denote the number of elements in the set \(A\). Now it is easy to see that
\[|A\cup B|=|A|+|B|-|A\cap B|\] and so
\[ \begin{aligned} &S = (A\cup B)\cup (A^c\cap B^c)\\ &|S| = |A\cup B|+|A^c\cap B^c| = \\ &|A|+|B|-|A\cap B|+|A^c\cap B^c| \end{aligned} \] or
\[|A^c\cap B^c| = |S|-|A|-|B|+|A\cap B|\] This is a special case of
Inclusion-Exclusion Formula
If \(A_i \subseteq S\) for \(1\le i \le n\) then
\[|\bigcap A_i^c|=|S|-\sum_i |A_i|+\sum_{i,j}|A_i\cap A_j|\pm ... \pm\bigcap_i |A_i|\]
proof
We will count the number of times an element x is counted on both sides of the equation. Say \(x\in \bigcap A_i^c\), then it is counted once on the left side. It is also counted once in \(|S|\) but never in any of the other terms because x is not in any of the \(A_i\).
Now assume \(x\not\in \bigcap A_i^c\), so it is not counted on the left. On the right it is counted once in \(|S|\). For some values \(i_1,...,i_k\), \(x\in A_{i_m}\), and x is not in any of the other \(A_i\). So x is counted 0 times in any of the terms involving an \(A_i\) with \(i\not\in \{i_1,...,i_k\}\), and is counted once by each term involving only \(A_{i_1},...,A_{i_k}\). There are k terms of the form \(-|A_{i_m}|\), which count x -k times. There are \(k\choose 2\) terms of the form \(A_{i_l}\cap A_{i_m}\), counting x a total of \(k\choose 2\) times. Continuing in this way we the the total count on the right side of the equation is
\[ \begin{aligned} &1-k+{k\choose 2}-{k\choose 2}\pm...+(-1)^k{k\choose k} = \\ &\sum_{i=0}^k (-1)^k{k\choose i} = \\ &\sum_{i=0}^k {k\choose i} (-1)^k1^{k-i} = (-1+1)^k=0\\ \end{aligned} \]
\[|\bigcup A_i|=\sum_{k=1}^n (-1)^{k+1} \sum |\bigcap_{j=1}^k A_{i_j}|\] where the inner sum goes over all subsets \(\{i_1,...,i_k\}\) of \(1,...,n\).
Suppose we shuffle a deck of cards, what is the probability that no card remains in its original position? More generally, how many permutations of \([n]=\{1,2,...,n\}\) have none of the integers in the correct location? Such a permutation is called a derangement.
Let \(S\) be the set of all permutations of \([n]\) and \(A_i\) be the permutation in which \(i\) is in the correct position. The we want to find \(|\bigcap_{i=1}^n A_i^c|\).
For any \(i\) \(|A_i|=(n-1)!\) because once \(i\) is fixed the other elements can be in any location. Similarly \(|A_i\cap A_j|=(n-2)!\) and so on.
So by the inclusion-exclusion formula we find
\[ \begin{aligned} &|\bigcap_{i=1}^n A_i^c| = |S|+\sum_{k=1}^n (-1)^k {n\choose k}(n-k)!=\\ &n!+\sum_{k=1}^n (-1)^k \frac{n!}{(n-k)!k!}(n-k)!= \\ &n!+n!\sum_{k=1}^n (-1)^k \frac{1}{k!}= \\ &n!\left(1+\sum_{k=1}^n (-1)^k \frac{1}{k!}\right)= \\ &n!\sum_{k=0}^n (-1)^k \frac{1}{k!} \end{aligned} \] So the probability of getting a derangement by random chance is
\[\frac{1}{n!} n!\sum_{k=0}^n (-1)^k \frac{1}{k!} = \sum_{k=0}^n (-1)^k\frac{1}{k!} \approx e^{-1} \approx 0.368\]
\(f(x)\) is called a generating function for the sequence \(a_0, a_1, ...\) if
\[f(x) = \sum_{i=1}^\infty a_i x^i\]
\[(x+1)^n = \sum_{i=0}^n {n\choose k}x^k\] so \(f(x) = (x+1)^n\) is a generating function for the sequence of binomial coefficients \({n\choose k};k=0,...,n\).
This is easy: 1+2=2+1=3, so there are 2 solutions. However, we can also consider the polynomials
\[(1+x+x^2)(1+x+x^2)\] We can multiply this out and collect the terms. Then the coefficient of \(x^k\) will be the number of ways to choose one term from each factor so that the exponents add up to \(k\). But this is exactly the number of solutions of \(x+y=k\) where \(0\le x, y\le 2\). So, our problem is to find the coefficient of \(x^{3}\).
Now
\[ \begin{aligned} &(1+x+x^2)(1+x+x^2) = \\ &1+2x+3x^2+2x^3+x^4 \end{aligned} \] and so we see it is indeed 2!
This also works in more complicate cases:
Let’s consider the function
\[(1+x+x^2+x^3+x^4)(1+x+x^2+x^3+x^4)(x^2+x^3+x^4+x^5+x^6)\] Now our problem is to find the coefficient of \(x^{10}\).
We can use some online symbolic computer algebra program to do the work. We find
\[(1+x+x^2+x^3+x^4)(1+x+x^2+x^3+x^4)(x^5+x^6+x^2+x^3+x^4)=\] \[x^{14}+3x^{13}+6x^{12}+10x^{11}+15x^{10}+18x^9+19x^8+18x^7+15x^6+10x^5+6x^4+3x^3+x^2\] and so the answer is 15.
So this is the same as the problem before, except \(x\) is not bounded. So we should consider the generating function
\[(1+x+x^2+x^3+x^4+...)(1+x+x^2+x^3+x^4)(x^2+x^3+x^4+x^5+x^6)=\] \[\frac{(1+x+x^2+x^3+x^4)(x^2+x^3+x^4+x^5+x^6)}{1-x}\]
using the geometric series. We find
\[f(x)=x^2+3x^3+6x^4+10x^5+15x^6+19x^7+22x^8+24x^9+25x^{10}+...\] and so the answer is 25.
So now there are no restrictions, and we find
\[ \begin{aligned} &f(x) = (1+x+x^2+x^3+x^4+...)^3 = \frac{1}{(1+x)^3} = \\ &1+3x+6x^2+10x^3+15x^4+21x^5+28x^6+36x^7+45x^8+55x^9+66x^{10}+... \end{aligned} \]
Combinatorics is a huge area of mathematics, with applications in almost all fields of science.