随机算法 (Fall 2011)/Coupon Collector

From EtoneWiki
Jump to: navigation, search

Suppose that a chocolate company releases different types of coupons. Each box of chocolates contains one coupon with a uniformly random type. Once you have collected all types of coupons, you will get a prize. So how many boxes of chocolates you are expected to buy to win the prize?

The coupon collector problem can be described in the balls-into-bins model as follows. We keep throwing balls one-by-one into bins (coupons), such that each ball is thrown into a bin uniformly and independently at random. Each ball corresponds to a box of chocolate, and each bin corresponds to a type of coupon. Thus, the number of boxes bought to collect coupons is just the number of balls thrown until none of the bins is empty.

Theorem
Let be the number of balls thrown uniformly and independently to bins until no bin is empty. Then , where is the th harmonic number.
Proof.
Let be the number of balls thrown while there are exactly nonempty bins, then clearly .

When there are exactly nonempty bins, throwing a ball, the probability that the number of nonempty bins increases (i.e. the ball is thrown to an empty bin) is

is the number of balls thrown to make the number of nonempty bins increases from to , i.e. the number of balls thrown until a ball is thrown to a current empty bin. Thus, follows the geometric distribution, such that

For a geometric random variable, .

Applying the linearity of expectations,

where is the th Harmonic number, and . Thus, for the coupon collectors problem, the expected number of coupons required to obtain all types of coupons is .


Only knowing the expectation is not good enough. We would like to know how fast the probability decrease as a random variable deviates from its mean value.

Theorem
Let be the number of balls thrown uniformly and independently to bins until no bin is empty. Then for any .
Proof.
For any particular bin , the probability that bin is empty after throwing balls is

By the union bound, the probability that there exists an empty bin after throwing balls is