Two weeks in a row with two solvers! You flatter me! Thanks to
and for their solutions—both of which were correct. Both of them solved the problem more or less the same as below.But first, let’s remind ourselves of the problem from last week.
[…] as it happens with any great bar crawl, the group has begun to splinter. Your group of N people has rather miraculously separated such that each person is at a different one of the N bars in the neighborhood. You want to reassemble so that you can properly toast for St. Patrick’s Day, but in your stubborn drunken state, none of you could agree on a bar to meet up at. Instead, you all decide to proceed as you were—going to a different bar each hour—but if you end up at a bar with one of the N people in the group, you stick together and travel as a group. The question this week is how long will it take for your whole group to reassemble?
Let’s first start with some simple examples and work our way up. Let’s first calculate the probability that no groups merge after a swap. This would imply that each of your N friends selected a different bar. Hopefully it’s clear that the probability of this happing is
How about if we start with N groups and end up with N-1 groups? That would imply 2 groups selected the same destination and everyone else selected unique destinations. In this case, there are (N choose 2) different ways of selecting the 2 groups that will merge. Of course, we also need to count the number of different ways of assigning the groups to the N-1 destinations. To do this, we need to select N-1 destinations (N choose N-1) and then assign each of them a group (N-1)!. Overall that leaves us with
Ok, and how about N’=N-2? Here again we need to select N-2 destination (N choose N-2) but for assigning groups to the destinations, things are a little trickier. For N-1 we knew one group had to go into one of the existing groups, but for N-2, you could have two different overlap destinations one overlap destination that 3 groups all selected. We could attempt to account for these different possibilities but the keen reader will realize that this complexity will only get more and more severe as we consider N-3 and so on. So we need a better way.
The key is recognizing that we can use Stirling numbers of the second kind to count the number of ways N groups can select N-d destinations.1 Keep in mind though that Stirling number of the second kind count the number of ways we can partition N elements into M unlabeled groups. The way we’re counting, we do want to consider the destination of the groups so we need to add a (N-d)! term that counts the number of different ways we can assign the N-d groups to the N-d destinations.
With this in mind, we can generalize the formula from above into
where S(n,m) is a Stirling number of the second kind.
From here, the rest is pretty simple. We just set up a recurrence relationship for the expected number of hours. Note that we set m=N-d.
To actually compute this, we need to rearrange a little to ensure we’re not trying to compute T(N) using an expression that includes T(N) so with some algebra we get
Of course T(1)=0 and using recursion, we can compute the answer!
Interestingly, T(N)~2N as N get’s large. I tried doing a little bit of asymptotic analysis to verify it analytically but I didn’t have much luck. Feel free to comment if you had any better luck.
And that’s it! I hope you enjoyed this one.
As always, please comment with any thoughts or feedback you had on this problem or if you approached it differently. Thanks for reading!
The formula for the Stirling number of the second kind can be derived using the inclusion-exclusion principal but we’ve used that method a few times in the last few months so I will omit a proof.
I think I figured out your slope, but it's a bit hand-wavy
For large N and N=m: P(N, m -> m) ~ N!/N^N << 1
What this means is that we are essentially averaging T(N) assuming it is greater than O(N).
Next regime then is what happens when N >> m: P(N, m -> m) ~ 1 - (m)(m+1)/(2N) so 1/(1-P(N,m -> m)) = 2N/(m (m+1))
In this regime P(N, m -> m-i) ~ N^(m-i)/N^m so we would expect the m-> m-1 transition to dominate. It would then be sufficient to sum up the contribution to the expected value. Thus
T(N) ~ sum(2 N / (m (m+1)) with m from 1 to infinity)
The sum is telescopic so we get T(N) ~ 2 N