Definitions and Notation
The Matching Experiment
The matching experiment is a random experiment that can be formulated in a number of colorful ways. Let $n$ be a positive integer.
- Suppose that $n$ male-female couples are at a party and that the males and females are randomly paired for a dance. A match occurs if a couple happens to be paired together.
- An absent-minded secretary prepares $n$ letters and envelopes to send to $n$ different people, but then randomly stuffs the letters into the envelopes. A match occurs if a letter is inserted in the proper envelope.
- $n$ people with hats have had a bit too much to drink at a party. As they leave the party, each person randomly grabs a hat. A match occurs if a person gets his or her own hat.
The experiments in [1] are equivalent from a mathematical point of view, and correspond to selecting a random permutation $\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$ of the population $D_n = \{1, 2, \ldots, n\}$.
- Number the couples from 1 to $n$. Then $X_i$ is the number of the woman paired with the $i$th man.
- Number the letters and corresponding envelopes from 1 to $n$. Then $X_i$ is the number of the envelope containing the $i$th letter.
- Number the people and their corresponding hats from 1 to $n$. Then $X_i$ is the number of the hat chosen by the $i$th person.
Our modeling assumption, of course, is that $\boldsymbol{X}$ is uniformly distributed on the set of permutations of $D_n$. The number of objects $n$ is the basic parameter of the experiment. We will also consider the case of sampling with replacement from the population $D_n$, because the analysis is much easier but still provides insight. In this case, $\boldsymbol{X}$ is a sequence of independent random variables, each uniformly distributed over $D_n$.
Matches
A match occurs at position $i$ if $X_i = i$. So the number of matches is the random variable $N_n$ defined mathematically by
$$N_n = \sum_{i=1}^n I_i$$
where $I_i = \boldsymbol{1}(X_i = i)$ is the indicator variable for the event of a match at position $i$.
Our problem is to compute the probability distribution of the number of matches. This is an old and famous problem in probability that was first considered by Pierre Rémond de Montmort; it is sometimes referred to as Montmort's matching problem in his honor.
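As a rough illustration of the experiment and of the match count, the following Python sketch draws uniform random permutations with `random.shuffle` and counts the positions that hold their own number. The function names `count_matches` and `simulate_matches`, the seed, and the choice of 5 objects with 1000 runs are illustrative assumptions, not part of the text.

```python
import random


def count_matches(perm):
    """Number of positions i (1-based) with perm[i - 1] == i."""
    return sum(1 for i, x in enumerate(perm, start=1) if x == i)


def simulate_matches(n, runs, seed=0):
    """Draw `runs` uniform random permutations of {1, ..., n}.

    Returns the list of match counts, one per run.
    """
    rng = random.Random(seed)  # fixed seed only for reproducibility
    population = list(range(1, n + 1))
    counts = []
    for _ in range(runs):
        perm = population[:]   # copy, then shuffle in place
        rng.shuffle(perm)
        counts.append(count_matches(perm))
    return counts


counts = simulate_matches(n=5, runs=1000)
print("sample mean of the number of matches:", sum(counts) / len(counts))
```

The sample mean printed at the end can be compared with the theoretical mean of the number of matches.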
Sampling With Replacement
First let's solve the matching problem in the easy case, when the sampling is with replacement. Of course, this is not the way that the matching game is usually played, but the analysis will give us some insight.
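$(I_1, I_2, \ldots, I_n)$ is a sequence of $n$ Bernoulli trials, with success probability $\frac{1}{n}$.
Details:
The variables are independent since the sampling is with replacement. Since $X_i$ is uniformly distributed on $D_n$, $\mathbb{P}(I_i = 1) = \mathbb{P}(X_i = i) = \frac{1}{n}$.
The number of matches $N_n$ has the binomial distribution with trial parameter $n$ and success parameter $\frac{1}{n}$:
$$\mathbb{P}(N_n = k) = \binom{n}{k} \left(\frac{1}{n}\right)^k \left(1 - \frac{1}{n}\right)^{n-k}, \quad k \in \{0, 1, \ldots, n\}$$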
Details:
This follows immediately from [4].
The mean and variance of the number of matches are
- $\mathbb{E}(N_n) = 1$
- $\operatorname{var}(N_n) = \frac{n - 1}{n}$
Details:
These results follow from [5]. Recall that the binomial distribution with parameters $n$ and $p$ has mean $n p$ and variance $n p (1 - p)$.
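The with-replacement case can be checked numerically. The sketch below builds the binomial probabilities with trial parameter $n$ and success parameter $1/n$ using exact rational arithmetic, then computes the mean and variance directly from the definition. The helper names `pmf_with_replacement` and `mean_and_variance` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction
from math import comb


def pmf_with_replacement(n):
    """Binomial(n, 1/n) probabilities P(N_n = k) for k = 0, ..., n."""
    p = Fraction(1, n)
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]


def mean_and_variance(pmf):
    """Mean and variance of a distribution on {0, 1, ..., len(pmf) - 1}."""
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean)**2 * q for k, q in enumerate(pmf))
    return mean, var


for n in (2, 5, 10):
    mean, var = mean_and_variance(pmf_with_replacement(n))
    print(n, mean, var)
```

Using `Fraction` rather than floats keeps the comparison with $1$ and $(n - 1)/n$ exact.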
Sampling Without Replacement
Now let's consider the case of real interest, when the sampling is without replacement, so that $\boldsymbol{X}$ is a random permutation of the elements of $D_n$.
Counting Permutations with Matches
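To find the probability density function of $N_n$, we need to count the number of permutations of $D_n$ with a specified number of matches. This will turn out to be easy once we have counted the number of permutations with no matches; these are called derangements of $D_n$. We will denote the number of permutations of $D_n$ with exactly $k$ matches by $b_n(k) = \#\{N_n = k\}$ for $k \in \{0, 1, \ldots, n\}$. In particular, $b_n(0)$ is the number of derangements of $D_n$.
The number of derangements is
$$b_n(0) = n! \sum_{j=0}^n \frac{(-1)^j}{j!}$$
Details:
By the complement rule for counting measure, $b_n(0) = n! - \#\left(\bigcup_{i=1}^n \{X_i = i\}\right)$. From the inclusion-exclusion formula,
$$b_n(0) = n! - \sum_{j=1}^n (-1)^{j-1} \sum_{J \subseteq D_n, \; \#(J) = j} \#\{X_i = i \text{ for all } i \in J\}$$
But if $J \subseteq D_n$ with $\#(J) = j$ then $\#\{X_i = i \text{ for all } i \in J\} = (n - j)!$. Finally, the number of subsets $J$ of $D_n$ with $\#(J) = j$ is $\binom{n}{j}$. Substituting into the displayed equation and simplifying gives the result.
The number of permutations with exactly $k$ matches is
$$b_n(k) = \frac{n!}{k!} \sum_{j=0}^{n-k} \frac{(-1)^j}{j!}, \quad k \in \{0, 1, \ldots, n\}$$
Details:
The following is a two-step procedure that generates all permutations with exactly $k$ matches: First select the $k$ integers that will match. The number of ways of performing this step is $\binom{n}{k}$. Second, select a permutation of the remaining $n - k$ integers with no matches. The number of ways of performing this step is $b_{n-k}(0)$. By the multiplication principle of combinatorics it follows that $b_n(k) = \binom{n}{k} b_{n-k}(0)$. Using [8] and simplifying gives the result.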
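The two counting results can be cross-checked against brute-force enumeration for small populations. In this sketch, `derangements` evaluates the alternating sum for the number of permutations with no matches in exact integers, `perms_with_matches` applies the two-step multiplication argument, and `brute_force_counts` simply lists every permutation. All three function names are assumptions made for the example.

```python
from itertools import permutations
from math import comb, factorial


def derangements(n):
    """Number of permutations of {1..n} with no matches: n! * sum (-1)^j / j!."""
    # n! / j! is an integer for j <= n, so integer division is exact here.
    return sum((-1)**j * (factorial(n) // factorial(j)) for j in range(n + 1))


def perms_with_matches(n, k):
    """Choose the k matching positions, then derange the remaining n - k."""
    return comb(n, k) * derangements(n - k)


def brute_force_counts(n):
    """Count permutations of {1..n} by number of matches, by full enumeration."""
    counts = [0] * (n + 1)
    for perm in permutations(range(1, n + 1)):
        matches = sum(1 for i, x in enumerate(perm, start=1) if x == i)
        counts[matches] += 1
    return counts


n = 6
formula = [perms_with_matches(n, k) for k in range(n + 1)]
print(formula == brute_force_counts(n))
```

Enumeration costs $n!$ steps, so it is only practical as a check for small $n$; the formula has no such limit.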
The Probability Density Function
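The probability density function of the number of matches is
$$\mathbb{P}(N_n = k) = \frac{1}{k!} \sum_{j=0}^{n-k} \frac{(-1)^j}{j!}, \quad k \in \{0, 1, \ldots, n\}$$
Details:
This follows directly from [9], since $\mathbb{P}(N_n = k) = \#\{N_n = k\} / n!$.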
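A minimal sketch of the probability density function, assuming exact rational arithmetic is acceptable: `match_pmf` (a name chosen here for illustration) evaluates $\frac{1}{k!} \sum_{j=0}^{n-k} \frac{(-1)^j}{j!}$ for each $k$ and returns the values as `Fraction` objects.

```python
from fractions import Fraction
from math import factorial


def match_pmf(n):
    """P(N_n = k) for k = 0, ..., n when sampling without replacement."""
    pmf = []
    for k in range(n + 1):
        # Alternating partial sum of the series for exp(-1), with n - k + 1 terms.
        partial = sum(Fraction((-1)**j, factorial(j)) for j in range(n - k + 1))
        pmf.append(partial / factorial(k))
    return pmf


pmf = match_pmf(5)
print([str(p) for p in pmf])
print("total probability:", sum(pmf))
```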
In the matching experiment, vary the parameter $n$ and note the shape and location of the probability density function. For selected values of $n$, run the simulation 1000 times and compare the empirical density function to the true probability density function.
$\mathbb{P}(N_n = n - 1) = 0$.
Details:
A simple probabilistic proof is to note that the event $\{N_n = n - 1\}$ is impossible: if there are $n - 1$ matches, then there must be $n$ matches. An algebraic proof can also be constructed from the probability density function in [10].
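The distribution of the number of matches converges to the Poisson distribution with parameter 1 as $n \to \infty$:
$$\mathbb{P}(N_n = k) \to \frac{e^{-1}}{k!} \text{ as } n \to \infty, \quad k \in \{0, 1, 2, \ldots\}$$
Details:
From the power series for the exponential function,
$$\sum_{j=0}^{n-k} \frac{(-1)^j}{j!} \to \sum_{j=0}^\infty \frac{(-1)^j}{j!} = e^{-1} \text{ as } n \to \infty$$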
So the result follows from the probability density function in [10].
The convergence is remarkably rapid.
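One way to see the speed of convergence is to measure the largest pointwise gap between the matching probabilities and the Poisson probabilities $e^{-1}/k!$ over $k \in \{0, 1, \ldots, n\}$. The sketch below does this in floating point; the function names and the particular values of $n$ are assumptions for illustration, and the Poisson mass beyond $n$ is ignored.

```python
from math import exp, factorial


def match_pmf(n):
    """Floating-point P(N_n = k), k = 0..n, for sampling without replacement."""
    return [
        sum((-1)**j / factorial(j) for j in range(n - k + 1)) / factorial(k)
        for k in range(n + 1)
    ]


def max_gap_to_poisson(n):
    """Largest |P(N_n = k) - exp(-1)/k!| over k = 0, ..., n."""
    return max(
        abs(p - exp(-1) / factorial(k)) for k, p in enumerate(match_pmf(n))
    )


for n in (3, 5, 10, 15):
    print(n, max_gap_to_poisson(n))
```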
In the matching experiment, increase $n$ and note how the probability density function stabilizes rapidly. For selected values of $n$, run the simulation 1000 times and compare the relative frequency function to the probability density function.
Moments
The mean and variance of the number of matches could be computed directly from the distribution. However, it is much better to use the representation in terms of indicator variables. The exchangeable property is an important tool in this section.
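$\mathbb{E}(I_i) = \frac{1}{n}$ for $i \in \{1, 2, \ldots, n\}$.
Details:
$X_i$ is uniformly distributed on $D_n$ for each $i$ so $\mathbb{P}(I_i = 1) = \mathbb{P}(X_i = i) = \frac{1}{n}$.
So the expected number of matches is $\mathbb{E}(N_n) = \sum_{i=1}^n \mathbb{E}(I_i) = 1$, regardless of $n$, just as when the sampling is with replacement.
$\operatorname{var}(I_i) = \frac{n - 1}{n^2}$ for $i \in \{1, 2, \ldots, n\}$.
Details:
This follows from $\mathbb{P}(I_i = 1) = \frac{1}{n}$, since an indicator variable with success probability $p$ has variance $p (1 - p)$.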
A match in one position would seem to make it more likely that there would be a match in another position. Thus, we might guess that the indicator variables are positively correlated.
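For distinct $i, \, j \in \{1, 2, \ldots, n\}$,
- $\operatorname{cov}(I_i, I_j) = \frac{1}{n^2 (n - 1)}$
- $\operatorname{cor}(I_i, I_j) = \frac{1}{(n - 1)^2}$
Details:
Note that $I_i I_j$ is the indicator variable of the event of a match in position $i$ and a match in position $j$. Hence by the exchangeable property $\mathbb{P}(I_i I_j = 1) = \mathbb{P}(I_i = 1) \mathbb{P}(I_j = 1 \mid I_i = 1) = \frac{1}{n} \cdot \frac{1}{n - 1}$. As before, $\mathbb{P}(I_i = 1) = \mathbb{P}(I_j = 1) = \frac{1}{n}$. The results now follow from standard computational formulas for covariance and correlation.
Note that when $n = 2$, the event that there is a match in position 1 is perfectly correlated with the event that there is a match in position 2. This makes sense, since there will either be 0 matches or 2 matches.
$\operatorname{var}(N_n) = 1$ for every $n \in \{2, 3, \ldots\}$.
Details:
This follows from [17] and [18], and basic properties of covariance. Recall that $\operatorname{var}(N_n) = \sum_{i=1}^n \sum_{j=1}^n \operatorname{cov}(I_i, I_j)$, so that $\operatorname{var}(N_n) = n \cdot \frac{n - 1}{n^2} + n (n - 1) \cdot \frac{1}{n^2 (n - 1)} = 1$.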
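For small $n$ the moment results can be confirmed exactly by enumerating all $n!$ equally likely permutations. The sketch below does this for the indicators of a match in positions 1 and 2; by exchangeability any other pair of distinct positions would give the same values. The function name `exact_moments` and the dictionary keys are assumptions for illustration, and the code requires $n \ge 2$.

```python
from fractions import Fraction
from itertools import permutations


def exact_moments(n):
    """Exact moments of the match indicators and of N_n, by enumeration (n >= 2)."""
    perms = list(permutations(range(1, n + 1)))
    total = len(perms)
    e1 = Fraction(sum(p[0] == 1 for p in perms), total)   # P(match at position 1)
    e2 = Fraction(sum(p[1] == 2 for p in perms), total)   # P(match at position 2)
    e12 = Fraction(sum(p[0] == 1 and p[1] == 2 for p in perms), total)
    cov = e12 - e1 * e2
    var_indicator = e1 * (1 - e1)
    matches = [sum(x == i for i, x in enumerate(p, start=1)) for p in perms]
    mean = Fraction(sum(matches), total)
    var = Fraction(sum(m * m for m in matches), total) - mean**2
    return {
        "cov": cov,
        "cor": cov / var_indicator,
        "mean": mean,
        "var": var,
    }


print(exact_moments(5))
```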
In the matching experiment, vary the parameter $n$ and note the shape and location of the mean $\pm$ standard deviation bar. For selected values of the parameter, run the simulation 1000 times and compare the sample mean and standard deviation to the distribution mean and standard deviation.
Since the correlation between $I_i$ and $I_j$ tends to 0 as $n \to \infty$, the event that a match occurs in position $i$ is nearly independent of the event that a match occurs in position $j$ if $n$ is large. For large $n$, the indicator variables behave nearly like $n$ Bernoulli trials with success probability $\frac{1}{n}$, which of course, is what happens when the sampling is with replacement.
A Recursion Relation
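In this subsection, we will give an alternate derivation of the distribution of the number of matches, in a sense by embedding the experiment with parameter $n$ into the experiment with parameter $n + 1$.
The probability density function of the number of matches satisfies the following recursion relation and initial condition:
- $\mathbb{P}(N_n = k) = (k + 1) \mathbb{P}(N_{n+1} = k + 1)$ for $k \in \{0, 1, \ldots, n\}$.
- $\mathbb{P}(N_1 = 1) = 1$.
Details:
First, consider the random permutation $(X_1, X_2, \ldots, X_n, X_{n+1})$ of $D_{n+1}$. Note that $(X_1, X_2, \ldots, X_n)$ is a random permutation of $D_n$ if and only if $X_{n+1} = n + 1$ if and only if $I_{n+1} = 1$. It follows that
$$\mathbb{P}(N_n = k) = \mathbb{P}(N_{n+1} = k + 1 \mid I_{n+1} = 1), \quad k \in \{0, 1, \ldots, n\}$$
From the definition of conditional probability we have
$$\mathbb{P}(N_n = k) = \mathbb{P}(N_{n+1} = k + 1) \frac{\mathbb{P}(I_{n+1} = 1 \mid N_{n+1} = k + 1)}{\mathbb{P}(I_{n+1} = 1)}, \quad k \in \{0, 1, \ldots, n\}$$
But $\mathbb{P}(I_{n+1} = 1) = \frac{1}{n + 1}$ and $\mathbb{P}(I_{n+1} = 1 \mid N_{n+1} = k + 1) = \frac{k + 1}{n + 1}$. Substituting into the last displayed equation gives the recurrence relation. The initial condition is obvious, since if $n = 1$ we must have one match.
This result can be used to obtain the probability density function of $N_n$ recursively for any $n$.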
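The recursion can be turned into a short computation. The relation determines the probability of $k + 1$ matches among $n + 1$ objects from the probability of $k$ matches among $n$ objects; the remaining probability of zero matches is then obtained by making the probabilities sum to 1. That normalization step is an assumption of this sketch rather than something spelled out in the text, and the function name `match_pmf_recursive` is hypothetical.

```python
from fractions import Fraction


def match_pmf_recursive(n):
    """P(N_n = k) for k = 0, ..., n, built up from the case of a single object."""
    pmf = [Fraction(0), Fraction(1)]  # one object: one match with probability 1
    for _ in range(n - 1):
        # Recursion: P(N_{m+1} = k + 1) = P(N_m = k) / (k + 1) for k = 0, ..., m.
        shifted = [p / (k + 1) for k, p in enumerate(pmf)]
        # Probability of zero matches by normalization (assumption of this sketch).
        pmf = [1 - sum(shifted)] + shifted
    return pmf


print([str(p) for p in match_pmf_recursive(5)])
```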
The Probability Generating Function
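Next recall that the probability generating function of $N_n$ is given by
$$G_n(t) = \mathbb{E}\left(t^{N_n}\right) = \sum_{k=0}^n \mathbb{P}(N_n = k) t^k, \quad t \in \mathbb{R}$$
The family of probability generating functions satisfies the following differential equations and ancillary conditions:
- $G_{n+1}^\prime(t) = G_n(t)$ for $t \in \mathbb{R}$ and $n \in \{1, 2, \ldots\}$
- $G_n(1) = 1$ for $n \in \{1, 2, \ldots\}$
Note also that $G_1(t) = t$ for $t \in \mathbb{R}$. Thus, the system of differential equations can be used to compute $G_n$ for any $n \in \{1, 2, \ldots\}$.
For $k, \, n \in \{1, 2, \ldots\}$ with $k \lt n$,
$$G_n^{(k)}(t) = G_{n-k}(t), \quad t \in \mathbb{R}$$
Details:
This follows from the differential equation in [23].
For $n \in \{1, 2, \ldots\}$,
$$\mathbb{P}(N_n = k) = \frac{1}{k!} \mathbb{P}(N_{n-k} = 0), \quad k \in \{0, 1, \ldots, n - 1\}$$
Details:
This follows from [25] and basic properties of generating functions, since $G_n^{(k)}(0) = k! \, \mathbb{P}(N_n = k)$ and $G_{n-k}(0) = \mathbb{P}(N_{n-k} = 0)$.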
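The system of differential equations can be solved mechanically by representing each generating function as a list of polynomial coefficients: start from the generating function $t$ for a single object, integrate term by term, and fix the constant of integration so that the value at $t = 1$ is 1. The sketch below is one such approach; the coefficient-list representation and the names `integrate_to_pgf`, `pgf_coeffs`, and `derivative` are assumptions for illustration.

```python
from fractions import Fraction


def integrate_to_pgf(coeffs):
    """Antiderivative of sum(coeffs[k] * t**k), normalized to equal 1 at t = 1."""
    higher = [c / (k + 1) for k, c in enumerate(coeffs)]  # coefficients of t^(k+1)
    return [1 - sum(higher)] + higher                     # constant term from G(1) = 1


def pgf_coeffs(n):
    """Coefficients of the generating function of the number of matches, n objects."""
    g = [Fraction(0), Fraction(1)]  # single object: G(t) = t
    for _ in range(n - 1):
        g = integrate_to_pgf(g)
    return g


def derivative(coeffs):
    """Coefficients of the derivative of sum(coeffs[k] * t**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


g6 = pgf_coeffs(6)
print(derivative(g6) == pgf_coeffs(5))              # first derivative
print(derivative(derivative(g6)) == pgf_coeffs(4))  # second derivative
```

The coefficient of $t^k$ in the result is the probability of exactly $k$ matches, so the same lists double as probability density functions.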
Examples and Applications
A secretary randomly stuffs 5 letters into 5 envelopes. Find each of the following:
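- The number of outcomes with exactly $k$ matches, for each $k \in \{0, 1, 2, 3, 4, 5\}$.
- The probability density function of the number of matches.
- The covariance and correlation of a match in one envelope and a match in another envelope.
Details:
- The number of outcomes $b_5(k)$ with exactly $k$ matches:
| $k$ | 0 | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- | --- |
| $b_5(k)$ | 44 | 45 | 20 | 10 | 0 | 1 |
- The probability density function of $N_5$:
| $k$ | 0 | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- | --- |
| $\mathbb{P}(N_5 = k)$ | 0.3667 | 0.3750 | 0.1667 | 0.0833 | 0 | 0.0083 |
- Covariance: $\frac{1}{100}$, correlation $\frac{1}{16}$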
Ten married couples are randomly paired for a dance. Find each of the following:
- The probability density function of the number of matches.
- The mean and variance of the number of matches.
- The probability of at least 3 matches.
Details:
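- $\mathbb{P}(N_{10} = k) = \frac{1}{k!} \sum_{j=0}^{10-k} \frac{(-1)^j}{j!}$ for $k \in \{0, 1, \ldots, 10\}$
- $\mathbb{E}(N_{10}) = 1$, $\operatorname{var}(N_{10}) = 1$
- $\mathbb{P}(N_{10} \ge 3) = \frac{145\,697}{1\,814\,400} \approx 0.0803$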
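The ten-couple exercise can be worked through with exact arithmetic. This sketch evaluates the probability density function of the number of matches for 10 couples, then the mean, the variance, and the probability of at least 3 matches as one minus the probabilities of 0, 1, and 2 matches. The helper name `match_pmf` and the variable names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial


def match_pmf(n):
    """Exact P(N_n = k), k = 0..n: (1/k!) * sum_{j=0}^{n-k} (-1)^j / j!."""
    return [
        sum(Fraction((-1)**j, factorial(j)) for j in range(n - k + 1)) / factorial(k)
        for k in range(n + 1)
    ]


pmf = match_pmf(10)
mean = sum(k * p for k, p in enumerate(pmf))
variance = sum((k - mean)**2 * p for k, p in enumerate(pmf))
at_least_3 = 1 - sum(pmf[:3])  # complement of 0, 1, or 2 matches

print("mean:", mean, "variance:", variance)
print("P(at least 3 matches):", at_least_3, "approx", float(at_least_3))
```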
In the matching experiment, set $n = 10$. Run the experiment 1000 times and compare the following for the number of matches:
- The true probabilities
- The relative frequencies from the simulation
- The limiting Poisson probabilities
Details:
- See part (a) of [28].
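- The limiting Poisson probabilities $e^{-1} / k!$:
| $k$ | $e^{-1} / k!$ |
| --- | --- |
| 0 | 0.3678794 |
| 1 | 0.3678794 |
| 2 | 0.1839397 |
| 3 | 0.06131324 |
| 4 | 0.01532831 |
| 5 | 0.003065662 |
| 6 | 0.0005109437 |
| 7 | 0.00007299195 |
| 8 | $9.123994 \times 10^{-6}$ |
| 9 | $1.013777 \times 10^{-6}$ |
| 10 | $1.013777 \times 10^{-7}$ |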