随机算法 (Fall 2011)/Problem set 1: Difference between revisions

Revision as of 08:10, 18 September 2011

Problem 0

你的姓名、学号。

Problem 1

(Interviewing problem of Google Inc.)

Give an streaming algorithm maintaining a uniform sample from a data stream. The meaning of this sentence is explained as follows:

Suppose that the input is a sequence of items [math]\displaystyle{ A[1], A[2], A[3], \ldots, A[n] }[/math], which is passed to your algorithm by one item at a time in the sequential order. (Equivalently, you can imagine that your algorithm scans over a large array [math]\displaystyle{ A }[/math] in one direction from left to right.)

You algorithm should return an [math]\displaystyle{ A[r] }[/math], where [math]\displaystyle{ r }[/math] is uniformly distributed over [math]\displaystyle{ \{1,2,\ldots, n\} }[/math].

Usually the input "data stream" is from a massive data set (e.g. search engine data), so you cannot afford storing the entire input. Make your algorithm use as small space as possible. We hope for a small space storing only constant number of items.

Develop an algorithm for the above problem. Give rigorous analysis for the algorithm to justify its correctness and efficiency.
Develop an algorithm which works even if [math]\displaystyle{ n }[/math] is not known in advance, and also give your analysis for the algorithm. (If your algorithm already satisfies this requirement, it's OK to use one algorithm answering both questions.)

@@ Line 9: / Line 9: @@
 Suppose that the input is a sequence of items <math>A[1], A[2], A[3], \ldots, A[n]</math>, which is passed to your algorithm by one item at a time in the sequential order. (Equivalently, you can imagine that your algorithm ''scans'' over a large array <math>A</math> in one direction from left to right.)
-You algorithm should return an <math>A[r]</math>, where <math>r</math> is uniformly distributed over <math>{1,2,\ldots, n}</math>.
+You algorithm should return an <math>A[r]</math>, where <math>r</math> is uniformly distributed over <math>\{1,2,\ldots, n\}</math>.
 Usually the input "data stream" is from a massive data set (e.g. search engine data), so you cannot afford storing the entire input. Make your algorithm use as small space as possible. We hope for a small space storing only constant number of items.

随机算法 (Fall 2011)/Problem set 1: Difference between revisions

Revision as of 08:10, 18 September 2011

Contents

Problem 0

Problem 1

Problem 2

Problem 3

Navigation menu

随机算法 (Fall 2011)/Problem set 1: Difference between revisions

Revision as of 08:10, 18 September 2011

Problem 0

Problem 1

Problem 2

Problem 3

Navigation menu

Search