= The Chernoff Bound =
Suppose that we have a fair coin. If we toss it once, the outcome is completely unpredictable. But if we toss it, say, 1000 times, then the number of HEADs is very likely to be around 500. This striking phenomenon, illustrated in the figure on the right, is called '''concentration'''. The Chernoff bound captures the concentration of independent trials.

[[File:Coinflip.png|border|450px|right]]
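The concentration phenomenon is easy to observe numerically. The following is a small simulation sketch (our own illustration, not part of the notes; the function name is ours): it tosses a fair coin 1000 times, repeats the experiment many times, and counts how often the number of HEADs lands within 10% of the expectation 500.

```python
import random

# Count HEADs in n tosses of a fair coin.
def count_heads(n, rng):
    return sum(rng.random() < 0.5 for _ in range(n))

rng = random.Random(42)
n, trials = 1000, 5000

# Fraction of repetitions in which the number of HEADs lies in [450, 550],
# i.e. within 10% of the expectation 500.
hits = sum(450 <= count_heads(n, rng) <= 550 for _ in range(trials))
print(hits / trials)  # close to 1
```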
The Chernoff bound is also a tail bound for the sum of independent random variables, which may give us ''exponentially'' sharp bounds.

Before proving the Chernoff bound, we should first discuss moment generating functions.

== Moment generating functions ==
The more we know about the moments of a random variable <math>X</math>, the more information we have about <math>X</math>. The so-called '''moment generating function''' "packs" all the information about the moments of <math>X</math> into one function.
{{Theorem
|Definition|
:The moment generating function of a random variable <math>X</math> is defined as <math>\mathbf{E}\left[\mathrm{e}^{\lambda X}\right]</math>, where <math>\lambda</math> is the parameter of the function.
}}

By Taylor's expansion and the linearity of expectations,
:<math>\begin{align}
\mathbf{E}\left[\mathrm{e}^{\lambda X}\right]
&=
\mathbf{E}\left[\sum_{k=0}^\infty\frac{\lambda^k}{k!}X^k\right]\\
&=\sum_{k=0}^\infty\frac{\lambda^k}{k!}\mathbf{E}\left[X^k\right].
\end{align}</math>

The moment generating function <math>\mathbf{E}\left[\mathrm{e}^{\lambda X}\right]</math> is a function of <math>\lambda</math>.
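As a quick sanity check of the definition, one can compare the closed-form MGF of a Bernoulli variable, <math>\mathbf{E}\left[e^{\lambda X}\right]=p e^{\lambda}+(1-p)</math>, with an empirical average over random samples. This is an illustrative sketch (the function names are our own):

```python
import math
import random

# Closed-form MGF of a Bernoulli(p) variable: E[e^(λX)] = p·e^λ + (1 - p).
def mgf_bernoulli(p, lam):
    return p * math.exp(lam) + (1 - p)

# Monte Carlo estimate of E[e^(λX)] from independent samples of X.
def mgf_empirical(p, lam, samples, rng):
    return sum(math.exp(lam * (rng.random() < p)) for _ in range(samples)) / samples

rng = random.Random(0)
p, lam = 0.3, 0.5
print(mgf_bernoulli(p, lam))
print(mgf_empirical(p, lam, 100000, rng))  # close to the closed-form value
```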
== The Chernoff bound ==
The Chernoff bounds are exponentially sharp tail inequalities for the sum of independent trials.
The bounds are obtained by applying Markov's inequality to the moment generating function of the sum of independent trials, with an appropriate choice of the parameter <math>\lambda</math>.

{{Theorem
|Chernoff bound (the upper tail)|
:Let <math>X=\sum_{i=1}^n X_i</math>, where <math>X_1, X_2, \ldots, X_n</math> are independent Poisson trials. Let <math>\mu=\mathbf{E}[X]</math>.
:Then for any <math>\delta>0</math>,
::<math>\Pr[X\ge (1+\delta)\mu]\le\left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}.</math>
}}
{{Proof| For any <math>\lambda>0</math>, <math>X\ge (1+\delta)\mu</math> is equivalent to <math>e^{\lambda X}\ge e^{\lambda (1+\delta)\mu}</math>, thus
:<math>\begin{align}
\Pr[X\ge (1+\delta)\mu]
&=
\Pr\left[e^{\lambda X}\ge e^{\lambda (1+\delta)\mu}\right]\\
&\le
\frac{\mathbf{E}\left[e^{\lambda X}\right]}{e^{\lambda (1+\delta)\mu}},
\end{align}</math>
where the last step follows by Markov's inequality.

Computing the moment generating function <math>\mathbf{E}[e^{\lambda X}]</math>:
:<math>\begin{align}
\mathbf{E}\left[e^{\lambda X}\right]
&=
\mathbf{E}\left[e^{\lambda \sum_{i=1}^n X_i}\right]\\
&=
\mathbf{E}\left[\prod_{i=1}^n e^{\lambda X_i}\right]\\
&=
\prod_{i=1}^n \mathbf{E}\left[e^{\lambda X_i}\right],
\end{align}</math>
where the last equality uses the independence of the <math>X_i</math>'s.

Let <math>p_i=\Pr[X_i=1]</math> for <math>i=1,2,\ldots,n</math>. Then,
:<math>\mu=\mathbf{E}[X]=\mathbf{E}\left[\sum_{i=1}^n X_i\right]=\sum_{i=1}^n\mathbf{E}[X_i]=\sum_{i=1}^n p_i</math>.

We bound the moment generating function for each individual <math>X_i</math> as follows.
:<math>\begin{align}
\mathbf{E}\left[e^{\lambda X_i}\right]
&=
p_i\cdot e^{\lambda\cdot 1}+(1-p_i)\cdot e^{\lambda\cdot 0}\\
&=
1+p_i(e^\lambda -1)\\
&\le
e^{p_i(e^\lambda-1)},
\end{align}</math>
where in the last step we apply the inequality <math>e^y\ge 1+y</math> with <math>y=p_i(e^\lambda-1)\ge 0</math>. (By doing this, we transform the product into a sum over the <math>p_i</math>'s, which equals <math>\mu</math>.)

Therefore,
:<math>\begin{align}
\mathbf{E}\left[e^{\lambda X}\right]
&=
\prod_{i=1}^n \mathbf{E}\left[e^{\lambda X_i}\right]\\
&\le
\prod_{i=1}^n e^{p_i(e^\lambda-1)}\\
&=
\exp\left(\sum_{i=1}^n p_i(e^{\lambda}-1)\right)\\
&=
e^{(e^\lambda-1)\mu}.
\end{align}</math>

Thus, we have shown that for any <math>\lambda>0</math>,
:<math>\begin{align}
\Pr[X\ge (1+\delta)\mu]
&\le
\frac{\mathbf{E}\left[e^{\lambda X}\right]}{e^{\lambda (1+\delta)\mu}}\\
&\le
\frac{e^{(e^\lambda-1)\mu}}{e^{\lambda (1+\delta)\mu}}\\
&=
\left(\frac{e^{(e^\lambda-1)}}{e^{\lambda (1+\delta)}}\right)^\mu.
\end{align}</math>
For any <math>\delta>0</math>, we can let <math>\lambda=\ln(1+\delta)>0</math> to get
:<math>\Pr[X\ge (1+\delta)\mu]\le\left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}.</math>
}}
The idea of the proof is quite clear: we apply Markov's inequality to <math>e^{\lambda X}</math>, and the rest is just estimating the moment generating function <math>\mathbf{E}[e^{\lambda X}]</math>. To make the bound as tight as possible, we minimize <math>\frac{e^{(e^\lambda-1)}}{e^{\lambda (1+\delta)}}</math> by setting <math>\lambda=\ln(1+\delta)</math>, which can be justified by taking the derivative of <math>\frac{e^{(e^\lambda-1)}}{e^{\lambda (1+\delta)}}</math> with respect to <math>\lambda</math>.

----
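The theorem can also be checked empirically. The following sketch (our own illustration, with hypothetical function names) estimates <math>\Pr[X\ge(1+\delta)\mu]</math> for a sum of independent fair Bernoulli trials and compares it with the bound <math>\left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}</math>.

```python
import math
import random

# The Chernoff upper-tail bound (e^δ / (1+δ)^(1+δ))^μ.
def chernoff_upper(mu, delta):
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

# Empirical Pr[X ≥ (1+δ)μ] for X a sum of n independent Bernoulli(p) trials.
def empirical_tail(n, p, delta, trials, rng):
    mu = n * p
    hits = sum(
        sum(rng.random() < p for _ in range(n)) >= (1 + delta) * mu
        for _ in range(trials)
    )
    return hits / trials

rng = random.Random(7)
n, p, delta = 100, 0.5, 0.4
print(empirical_tail(n, p, delta, 10000, rng))  # empirical tail frequency
print(chernoff_upper(n * p, delta))             # the bound, which dominates it
```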
We then proceed to the lower tail, the probability that the random variable deviates below the mean value:

{{Theorem
|Chernoff bound (the lower tail)|
:Let <math>X=\sum_{i=1}^n X_i</math>, where <math>X_1, X_2, \ldots, X_n</math> are independent Poisson trials. Let <math>\mu=\mathbf{E}[X]</math>.
:Then for any <math>0<\delta<1</math>,
::<math>\Pr[X\le (1-\delta)\mu]\le\left(\frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}}\right)^{\mu}.</math>
}}
{{Proof| For any <math>\lambda<0</math>, <math>X\le (1-\delta)\mu</math> is equivalent to <math>e^{\lambda X}\ge e^{\lambda (1-\delta)\mu}</math> (the direction flips because <math>\lambda</math> is negative). By the same analysis as in the upper tail version,
:<math>\begin{align}
\Pr[X\le (1-\delta)\mu]
&=
\Pr\left[e^{\lambda X}\ge e^{\lambda (1-\delta)\mu}\right]\\
&\le
\frac{\mathbf{E}\left[e^{\lambda X}\right]}{e^{\lambda (1-\delta)\mu}}\\
&\le
\left(\frac{e^{(e^\lambda-1)}}{e^{\lambda (1-\delta)}}\right)^\mu.
\end{align}</math>
For any <math>0<\delta<1</math>, we can let <math>\lambda=\ln(1-\delta)<0</math> to get
:<math>\Pr[X\le (1-\delta)\mu]\le\left(\frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}}\right)^{\mu}.</math>
}}
----
Some useful special forms of the bounds can be derived directly from the general forms above. They also make it clearer why we say that the bounds are exponentially sharp.

{{Theorem
|Useful forms of the Chernoff bound|
:Let <math>X=\sum_{i=1}^n X_i</math>, where <math>X_1, X_2, \ldots, X_n</math> are independent Poisson trials. Let <math>\mu=\mathbf{E}[X]</math>. Then
:1. for <math>0<\delta\le 1</math>,
::<math>\Pr[X\ge (1+\delta)\mu]<\exp\left(-\frac{\mu\delta^2}{3}\right);</math>
::<math>\Pr[X\le (1-\delta)\mu]<\exp\left(-\frac{\mu\delta^2}{2}\right);</math>
:2. for <math>t\ge 2e\mu</math>,
::<math>\Pr[X\ge t]\le 2^{-t}.</math>
}}
{{Proof| To obtain the bounds in (1), we need to show that for <math>0<\delta< 1</math>, <math>\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\le e^{-\delta^2/3}</math> and <math>\frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}}\le e^{-\delta^2/2}</math>. We can verify both inequalities by standard analysis techniques.

To obtain the bound in (2), let <math>t=(1+\delta)\mu</math>. Then <math>\delta=t/\mu-1\ge 2e-1</math>. Hence,
:<math>\begin{align}
\Pr[X\ge(1+\delta)\mu]
&\le
\left(\frac{e^\delta}{(1+\delta)^{(1+\delta)}}\right)^\mu\\
&\le
\left(\frac{e}{1+\delta}\right)^{(1+\delta)\mu}\\
&\le
\left(\frac{e}{2e}\right)^t\\
&\le
2^{-t}.
\end{align}</math>
}}
|
|
== Balls into bins, revisited ==
Throwing <math>m</math> balls uniformly and independently into <math>n</math> bins, what is the maximum load of all bins with high probability? In the last class, we analyzed this problem using a counting argument.

Now we give a more "advanced" analysis using Chernoff bounds.

For any <math>i\in[n]</math> and <math>j\in[m]</math>, let <math>X_{ij}</math> be the indicator variable for the event that ball <math>j</math> is thrown into bin <math>i</math>. Obviously
:<math>\mathbf{E}[X_{ij}]=\Pr[\mbox{ball }j\mbox{ is thrown to bin }i]=\frac{1}{n}.</math>
Let <math>Y_i=\sum_{j\in[m]}X_{ij}</math> be the load of bin <math>i</math>.

Then the expected load of bin <math>i</math> is
:<math>(*)\qquad \mu=\mathbf{E}[Y_i]=\mathbf{E}\left[\sum_{j\in[m]}X_{ij}\right]=\sum_{j\in[m]}\mathbf{E}[X_{ij}]=m/n. </math>

For the case <math>m=n</math>, it holds that <math>\mu=1</math>.

Note that <math>Y_i</math> is a sum of <math>m</math> mutually independent indicator variables. Applying the Chernoff bound, for any particular bin <math>i\in[n]</math>,
:<math>
\Pr[Y_i>(1+\delta)\mu] \le \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^\mu.
</math>
=== When <math>m=n</math> ===

When <math>m=n</math>, <math>\mu=1</math>. Write <math>c=1+\delta</math>. The above bound can be written as
:<math>
\Pr[Y_i>c] \le \frac{e^{c-1}}{c^c}.
</math>

Let <math>c=\frac{e\ln n}{\ln\ln n}</math>. We evaluate <math>\frac{e^{c-1}}{c^c}</math> by taking the logarithm of its reciprocal:
:<math>
\begin{align}
\ln\left(\frac{c^c}{e^{c-1}}\right)
&=
c\ln c-c+1\\
&=
c(\ln c-1)+1\\
&=
\frac{e\ln n}{\ln\ln n}\left(\ln\ln n-\ln\ln\ln n\right)+1\\
&\ge
\frac{e\ln n}{\ln\ln n}\cdot\frac{2}{e}\ln\ln n+1\\
&\ge
2\ln n,
\end{align}
</math>
where the first inequality holds for sufficiently large <math>n</math>, since then <math>\ln\ln\ln n\le\left(1-\frac{2}{e}\right)\ln\ln n</math>.
Thus,
:<math>
\Pr\left[Y_i>\frac{e\ln n}{\ln\ln n}\right] \le \frac{1}{n^2}.
</math>

Applying the union bound, the probability that there exists a bin with load <math>>\frac{e\ln n}{\ln\ln n}</math> is at most
:<math>n\cdot \Pr\left[Y_1>\frac{e\ln n}{\ln\ln n}\right] \le \frac{1}{n}</math>.
Therefore, for <math>m=n</math>, with high probability, the maximum load is <math>O\left(\frac{\ln n}{\ln\ln n}\right)</math>.
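A simulation sketch of this bound (our own illustration, function name is ours): throw <math>n</math> balls into <math>n</math> bins and compare the observed maximum load with the threshold <math>\frac{e\ln n}{\ln\ln n}</math> from the analysis above.

```python
import math
import random

# Throw m balls into n bins uniformly and independently; return the max load.
def max_load(m, n, rng):
    load = [0] * n
    for _ in range(m):
        load[rng.randrange(n)] += 1
    return max(load)

rng = random.Random(1)
n = 10000
# The threshold e·ln(n)/ln(ln(n)) from the analysis above.
threshold = math.e * math.log(n) / math.log(math.log(n))
observed = max(max_load(n, n, rng) for _ in range(20))
print(observed, threshold)  # the observed max load stays below the threshold
```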
|
|
=== For larger <math>m</math> ===
When <math>m\ge n\ln n</math>, then according to <math>(*)</math>, <math>\mu=\frac{m}{n}\ge \ln n</math>.

We can apply the easier form of the Chernoff bound to get
:<math>
\Pr[Y_i\ge 2e\mu]\le 2^{-2e\mu}\le 2^{-2e\ln n}<\frac{1}{n^2}.
</math>
By the union bound, the probability that there exists a bin with load <math>\ge 2e\frac{m}{n}</math> is at most
:<math>n\cdot \Pr\left[Y_1>2e\frac{m}{n}\right] = n\cdot \Pr\left[Y_1>2e\mu\right]\le \frac{1}{n}</math>.
Therefore, for <math>m\ge n\ln n</math>, with high probability, the maximum load is <math>O\left(\frac{m}{n}\right)</math>.
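This regime can also be observed in simulation (an illustrative sketch; the function name is ours): with <math>m\approx n\ln n</math> balls, the maximum load stays within the constant factor <math>2e</math> of the average load <math>m/n</math>.

```python
import math
import random

# Throw m balls into n bins uniformly and independently; return the max load.
def max_load(m, n, rng):
    load = [0] * n
    for _ in range(m):
        load[rng.randrange(n)] += 1
    return max(load)

rng = random.Random(5)
n = 1000
m = int(n * math.log(n))  # m ≈ n·ln(n), so μ = m/n ≈ ln(n)
mu = m / n
print(max_load(m, n, rng), 2 * math.e * mu)  # observed max load vs. 2e·μ
```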
= Set Balancing =
Suppose that we have an <math>n\times m</math> matrix <math>A</math> with 0-1 entries. We are looking for a <math>b\in\{-1,+1\}^m</math> that minimizes <math>\|Ab\|_\infty</math>.

Recall that <math>\|\cdot\|_\infty</math> is the infinity norm (also called the <math>L_\infty</math> norm) of a vector, and for the vector <math>c=Ab</math>,
:<math>\|Ab\|_\infty=\max_{i=1,2,\ldots,n}|c_i|</math>.
We can also describe this problem as an optimization:
:<math>\begin{align}
\mbox{minimize }
&\quad
\|Ab\|_\infty\\
\mbox{subject to: }
&\quad
b\in\{-1,+1\}^m.
\end{align}</math>

This problem is called set balancing for a reason.
{|border="1"
|The problem arises in designing statistical experiments. Suppose that we have <math>m</math> '''subjects''', each of which may have up to <math>n</math> '''features'''. This gives us an <math>n\times m</math> matrix <math>A</math>:
:<math>
\begin{array}{c}
\mbox{feature 1:}\\
\mbox{feature 2:}\\
\vdots\\
\mbox{feature }n\mbox{:}\\
\end{array}
\left[
\begin{array}{cccc}
a_{11} & a_{12} & \cdots & a_{1m}\\
a_{21} & a_{22} & \cdots & a_{2m}\\
\vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nm}\\
\end{array}
\right],
</math>
where each column represents a subject and each row represents a feature. An entry <math>a_{ij}\in\{0,1\}</math> indicates whether subject <math>j</math> has feature <math>i</math>.
By multiplying a vector <math>b\in\{-1,+1\}^m</math>,
:<math>
\left[
\begin{array}{cccc}
a_{11} & a_{12} & \cdots & a_{1m}\\
a_{21} & a_{22} & \cdots & a_{2m}\\
\vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nm}\\
\end{array}
\right]
\left[
\begin{array}{c}
b_{1}\\
b_{2}\\
\vdots\\
b_{m}\\
\end{array}
\right]
=
\left[
\begin{array}{c}
c_{1}\\
c_{2}\\
\vdots\\
c_{n}\\
\end{array}
\right],
</math>
the subjects are partitioned into two disjoint groups: one for -1 and the other for +1. Each <math>c_i</math> gives the difference between the numbers of subjects with feature <math>i</math> in the two groups. By minimizing <math>\|Ab\|_\infty=\|c\|_\infty</math>, we ask for an optimal partition so that each feature is as balanced as possible between the two groups.

In a scientific experiment, one of the groups serves as a [http://en.wikipedia.org/wiki/Scientific_control control group]. Ideally, we want the two groups to be statistically identical, which is usually impossible to achieve in practice. The requirement of minimizing <math>\|Ab\|_\infty</math> means that the statistical difference between the two groups is minimized.
|}
We propose an extremely simple "randomized algorithm" for computing a <math>b\in\{-1,+1\}^m</math>: for each <math>i=1,2,\ldots, m</math>, let <math>b_i</math> be chosen independently from <math>\{-1,+1\}</math>, such that
:<math>b_i=
\begin{cases}
-1 & \mbox{with probability }\frac{1}{2}\\
+1 &\mbox{with probability }\frac{1}{2}.
\end{cases}
</math>

This procedure can hardly be called an "algorithm", because its decision is made regardless of the input <math>A</math>. We then show that despite this obliviousness, the algorithm chooses a good enough <math>b</math>: for any <math>A</math>, <math>\|Ab\|_\infty=O(\sqrt{m\ln n})</math> with high probability.
{{Theorem
|Theorem|
:Let <math>A</math> be an <math>n\times m</math> matrix with 0-1 entries. For a random vector <math>b</math> with <math>m</math> entries chosen independently and with equal probability from <math>\{-1,+1\}</math>,
::<math>\Pr[\|Ab\|_\infty>2\sqrt{2m\ln n}]\le\frac{2}{n}</math>.
}}
{{Proof|
Consider particularly the <math>i</math>-th row of <math>A</math>. The entry of <math>Ab</math> contributed by row <math>i</math> is <math>c_i=\sum_{j=1}^m a_{ij}b_j</math>.

Let <math>k</math> be the number of non-zero entries in the row. If <math>k\le2\sqrt{2m\ln n}</math>, then clearly <math>|c_i|</math> is no greater than <math>2\sqrt{2m\ln n}</math>. On the other hand, if <math>k>2\sqrt{2m\ln n}</math>, then the <math>k</math> nonzero terms in the sum
:<math>c_i=\sum_{j=1}^m a_{ij}b_j</math>
are independent, each with probability 1/2 of being either +1 or -1.

Thus, for these <math>k</math> nonzero terms, each <math>b_j</math> is either positive or negative independently with equal probability. In expectation, <math>\mu=\frac{k}{2}</math> of these <math>k</math> terms are positive, and <math>c_i<-2\sqrt{2m\ln n}</math> occurs only when there are fewer than <math>\frac{k}{2}-\sqrt{2m\ln n}=\left(1-\delta\right)\mu</math> positive <math>b_j</math>'s, where <math>\delta=\frac{2\sqrt{2m\ln n}}{k}</math>. Applying the Chernoff bound, this event occurs with probability at most
:<math>\begin{align}
\exp\left(-\frac{\mu\delta^2}{2}\right)
&=
\exp\left(-\frac{k}{2}\cdot\frac{8m\ln n}{2k^2}\right)\\
&=
\exp\left(-\frac{2m\ln n}{k}\right)\\
&\le
\exp\left(-\frac{2m\ln n}{m}\right)\\
&\le n^{-2}.
\end{align}
</math>

The same argument applies to the negative <math>b_j</math>'s, so the probability that <math>c_i>2\sqrt{2m\ln n}</math> is also at most <math>n^{-2}</math>. Therefore, by the union bound,
:<math>\Pr[|c_i|> 2\sqrt{2m\ln n}]\le\frac{2}{n^2}</math>.
Applying the union bound to all <math>n</math> rows,
:<math>\Pr[\|Ab\|_\infty>2\sqrt{2m\ln n}]\le n\cdot\Pr[|c_i|> 2\sqrt{2m\ln n}]\le\frac{2}{n}</math>.
}}
How good is this randomized algorithm? In fact, when <math>m=n</math> there exists a matrix <math>A</math> such that <math>\|Ab\|_\infty=\Omega(\sqrt{n})</math> for any choice of <math>b\in\{-1,+1\}^n</math>.
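The theorem suggests a direct experiment (an illustrative sketch, not code from the notes; the function names are ours): draw a random <math>b</math> and check <math>\|Ab\|_\infty</math> against <math>2\sqrt{2m\ln n}</math>.

```python
import math
import random

# Oblivious randomized choice: each b_i uniform in {-1, +1}, ignoring A.
def random_b(m, rng):
    return [rng.choice((-1, 1)) for _ in range(m)]

# ||Ab||_∞ = max over rows of |sum_j a_ij · b_j|.
def inf_norm_Ab(A, b):
    return max(abs(sum(a * x for a, x in zip(row, b))) for row in A)

rng = random.Random(3)
n = m = 200
# A random 0-1 input matrix; the theorem's guarantee holds for every A.
A = [[rng.randrange(2) for _ in range(m)] for _ in range(n)]
b = random_b(m, rng)
print(inf_norm_Ab(A, b), 2 * math.sqrt(2 * m * math.log(n)))
```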