Tazos


(courtesy of Richard Weber)


Dear Don,

A couple of my friends here have spent a while trying to unravel a problem which just gets very messy very quickly, yet seems like it should have a simple solution. Could you possibly help us out? If its simple and we should be able to solve it using techiques you taught us once, sorry, it was a long time (!) ago.

Walkers crisps have just finished a promotion in which some bags contain a Tazo. A tazo is a small piece of plastic with a number on the back. These numbers lie between 1 and 50, and the idea is you collect the lot. The numbers seem to pretty uniform in distribution, and if this is the case the problem is how many Tazos would one require to have an evens probability of having the complete set.

Computer simulation seems to give a answer of 213-4. But this is not entirely satisfactory, as we prefer analytic approches. Is there a simple way of getting this number?

Signed, A puzzled third year.


Dear Puzzled third year,

Indeed, you should have been able to do this if you had learnt Probability IA properly. You must have had a dud supervisor!

Let A_i be the event that Tazo #i is not included in the set. You want

1-P(union A_i)=P(you have the complete set).

Remember the principle of inclusion-exclusion?

P(union A_i) = sum P(A_i) - sum P(A_i A_j) + sum P(A_i A_j A_k) - etc

= b(50,1)(1-1/50)^n - b(50,2)(1-2/50)^n + b(50,3)(1-3/50)^n - ... ... + b(50,49)(1-49/50)^n

where b(50,j) is the binomial coefficient.

Now you have to write a little program to find the smallest n that makes this less than 0.5. Truncating the summation at an odd or even number of terms gives upper and lower bounds on the total sum (why?) Try n=214. (I do not think the sum simplifies.) You'll find that n=306 gives you a >90% chance of having all Tazos. Note how this differs from the expected time to get the complete set, which is about 50ln(50) = 196.

Signed, your IA supervisor