随机算法 (Fall 2011)/Balls-into-balls Occupancy Problem

From TCS Wiki
Revision as of 00:56, 19 September 2011 by imported>Etone
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

Now we ask about the loads of bins. Assuming that m balls are uniformly and independently assigned to n bins, for 1in, let Xi be the load of the ith bin, i.e. the number of balls in the ith bin.

An easy analysis shows that for every bin i, the expected load E[Xi] is equal to the average load m/n.

Because there are totally m balls, it is always true that i=1nXi=m.

Therefore, due to the linearity of expectations,

i=1nE[Xi]=E[i=1nXi]=E[m]=m.

Because for each ball, the bin to which the ball is assigned is uniformly and independently chosen, the distributions of the loads of bins are identical. Thus E[Xi] is the same for each i. Combining with the above equation, it holds that for every 1im, E[Xi]=mn. So the average is indeed the average!


Next we analyze the distribution of the maximum load. We show that when m=n, i.e. n balls are uniformly and independently thrown into n bins, the maximum load is O(lognloglogn) with high probability.

Theorem
Suppose that n balls are thrown independently and uniformly at random into n bins. For 1in, let Xi be the random variable denoting the number of balls in the ith bin. Then
Pr[max1inXi3lnnlnlnn]<1n.
Proof.
Let M be an integer. Take bin 1. For any particular M balls, these M balls are all thrown to bin 1 with probability (1/n)M, and there are totally (nM) distinct sets of M balls. Therefore, applying the union bound,
Pr[X1M](nM)(1n)M=n!M!(nM)!nM=1M!n(n1)(n2)(nM+1)nM=1M!i=0M1(1in)1M!.

According to Stirling's approximation, M!2πM(Me)M, thus

1M!(eM)M.
Figure 1

Due to the symmetry. All Xi have the same distribution. Apply the union bound again,

Pr[max1inXiM]=Pr[(X1M)(X2M)(XnM)]nPr[X1M]n(eM)M.

When M=3lnn/lnlnn,

(eM)M=(elnlnn3lnn)3lnn/lnlnn<(lnlnnlnn)3lnn/lnlnn=e3(lnlnlnnlnlnn)lnn/lnlnn=e3lnn+3lnlnlnnlnn/lnlnne2lnn=1n2.

Therefore,

Pr[max1inXi3lnnlnlnn]n(eM)M<1n.

When m>n, Figure 1 illustrates the results of several random experiments, which show that the distribution of the loads of bins becomes more even as the number of balls grows larger than the number of bins.

Formally, it can be proved that for m=Ω(nlogn), with high probability, the maximum load is within O(mn), which is asymptotically equal to the average load.