高级算法 (Fall 2019)/Problem Set 2: Difference between revisions

Revision as of 12:26, 22 October 2019

作业电子版于2019/11/5 23:59 之前提交到邮箱 njuadvalg@163.com
每道题目的解答都要有完整的解题过程。中英文不限。

Problem 1

Let [math]\displaystyle{ X }[/math] be a real-valued random variable with finite [math]\displaystyle{ \mathbb{E}[X] }[/math] and finite [math]\displaystyle{ \mathbb{E}\left[\mathrm{e}^{\lambda X}\right] }[/math] for all [math]\displaystyle{ \lambda\ge 0 }[/math]. We define the log-moment-generating function as

[math]\displaystyle{ \Psi_X(\lambda):=\ln\mathbb{E}[\mathrm{e}^{\lambda X}] \quad\text{ for all }\lambda\ge 0 }[/math],

and its dual function:

[math]\displaystyle{ \Psi_X^*(t):=\sup_{\lambda\ge 0}(\lambda t-\Psi_X(\lambda)) }[/math].

Assume that [math]\displaystyle{ X }[/math] is NOT almost surely constant. Then due to the convexity of [math]\displaystyle{ \mathrm{e}^{\lambda X} }[/math] with respect to [math]\displaystyle{ \lambda }[/math], the function [math]\displaystyle{ \Psi_X(\lambda) }[/math] is strictly convex over [math]\displaystyle{ \lambda\ge 0 }[/math].

Prove the following Chernoff bound:

[math]\displaystyle{ \Pr[X\ge t]\le\exp(-\Psi_X^*(t)) }[/math].

In particular if [math]\displaystyle{ \Psi_X(\lambda) }[/math] is continuously differentiable, prove that the supreme in [math]\displaystyle{ \Psi_X^*(t) }[/math] is achieved at the unique [math]\displaystyle{ \lambda\ge 0 }[/math] satisfying

[math]\displaystyle{ \Psi_X'(\lambda)=t }[/math]

where [math]\displaystyle{ \Psi_X'(\lambda) }[/math] denotes the derivative of [math]\displaystyle{ \Psi_X(\lambda) }[/math] with respect to [math]\displaystyle{ \lambda }[/math].

Normal random variables. Let [math]\displaystyle{ X\sim \mathrm{N}(\mu,\sigma) }[/math] be a Gaussian random variable with mean [math]\displaystyle{ \mu }[/math] and standard deviation [math]\displaystyle{ \sigma }[/math]. What are the [math]\displaystyle{ \Psi_X(\lambda) }[/math] and [math]\displaystyle{ \Psi_X^*(t) }[/math]? And give a tail inequality to upper bound the probability [math]\displaystyle{ \Pr[X\ge t] }[/math].

Poisson random variables. Let [math]\displaystyle{ X\sim \mathrm{Pois}(\nu) }[/math] be a Poisson random variable with parameter [math]\displaystyle{ \nu }[/math], that is, [math]\displaystyle{ \Pr[X=k]=\mathrm{e}^{-\nu}\nu^k/k! }[/math] for all [math]\displaystyle{ k=0,1,2,\ldots }[/math]. What are the [math]\displaystyle{ \Psi_X(\lambda) }[/math] and [math]\displaystyle{ \Psi_X^*(t) }[/math]? And give a tail inequality to upper bound the probability [math]\displaystyle{ \Pr[X\ge t] }[/math].

Bernoulli random variables. Let [math]\displaystyle{ X\in\{0,1\} }[/math] be a single Bernoulli trial with probability of success [math]\displaystyle{ p }[/math], that is, [math]\displaystyle{ \Pr[X=1]=1-\Pr[X=0]=p }[/math]. Show that for any [math]\displaystyle{ t\in(p,1) }[/math], we have [math]\displaystyle{ \Psi_X^*(t)=D(Y \| X) }[/math] where [math]\displaystyle{ Y\in\{0,1\} }[/math] is a Bernoulli random variable with parameter [math]\displaystyle{ t }[/math] and [math]\displaystyle{ D(Y \| X)=(1-t)\ln\frac{1-t}{1-p}+t\ln\frac{t}{p} }[/math] is the Kullback-Leibler divergence between [math]\displaystyle{ Y }[/math] and [math]\displaystyle{ X }[/math].

Sum of independent random variables. Let [math]\displaystyle{ X=\sum_{i=1}^nX_i }[/math] be the sum of [math]\displaystyle{ n }[/math] independently and identically distributed random variables [math]\displaystyle{ X_1,X_2,\ldots, X_n }[/math]. Show that [math]\displaystyle{ \Psi_X(\lambda)=\sum_{i=1}^n\Psi_{X_i}(\lambda) }[/math] and [math]\displaystyle{ \Psi_X^*(t)=n\Psi^*_{X_i}(\frac{t}{n}) }[/math]. Also for binomial random variable [math]\displaystyle{ X\sim \mathrm{Bin}(n,p) }[/math], give an upper bound to the tail inequality [math]\displaystyle{ \Pr[X\ge t] }[/math] in terms of KL-divergence.

Give an upper bound to [math]\displaystyle{ \Pr[X\ge t] }[/math] when every [math]\displaystyle{ X_i }[/math] follows the geometric distribution with a probability [math]\displaystyle{ p }[/math] of success.

Problem 2

An [math]\displaystyle{ n }[/math]-dimensional hypercube [math]\displaystyle{ Q_n }[/math] is a graph with [math]\displaystyle{ 2^n }[/math] vertices, where each vertex is represented by an [math]\displaystyle{ n }[/math]-bit vector, and there is an edge between two vertices if and only if their bit-vectors differ in exactly one bit.

Given an [math]\displaystyle{ n }[/math]-dimensional hypercube with some non-empty subset of vertices [math]\displaystyle{ S }[/math], which is called marked black. Let [math]\displaystyle{ f(u) }[/math] denote the shortest distance from vertex [math]\displaystyle{ u }[/math] to any black vertex. Formally,

[math]\displaystyle{ f(u) = \min_{v \in S}\mathrm{dist}_{Q_n}(u,v), }[/math]

where [math]\displaystyle{ \mathrm{dist}_{Q_n}(u,v) }[/math] denotes the length of the shortest path between [math]\displaystyle{ u }[/math] and [math]\displaystyle{ v }[/math] in graph [math]\displaystyle{ Q_n }[/math] .

Prove that if we choose [math]\displaystyle{ u }[/math] from all [math]\displaystyle{ 2^n }[/math] vertices uniformly at random, then, with high probability, [math]\displaystyle{ f(u) }[/math] can not deviate from its expectation too much:

[math]\displaystyle{ \mathrm{Pr}[|f(u) - \mathbb{E}[f(u)]| \geq t\sqrt{n \log n}] \leq n^{-c}. }[/math]

Give the relation between [math]\displaystyle{ c }[/math] and [math]\displaystyle{ t }[/math].

@@ Line 2: / Line 2: @@
 *每道题目的解答都要有<font color="red" size=5>完整的解题过程</font>。中英文不限。
-== Problem 1 ==
+== Problem 1==
+Let <math>X</math> be a real-valued random variable with finite <math>\mathbb{E}[X]</math> and finite <math>\mathbb{E}\left[\mathrm{e}^{\lambda X}\right]</math> for all <math>\lambda\ge 0</math>. We define the '''log-moment-generating function''' as
+:<math>\Psi_X(\lambda):=\ln\mathbb{E}[\mathrm{e}^{\lambda X}] \quad\text{ for all }\lambda\ge 0</math>,
+and its ''dual function'':
+:<math>\Psi_X^*(t):=\sup_{\lambda\ge 0}(\lambda t-\Psi_X(\lambda))</math>.
+Assume that <math>X</math> is NOT almost surely constant. Then due to the convexity of <math>\mathrm{e}^{\lambda X}</math> with respect to <math>\lambda</math>, the function <math>\Psi_X(\lambda)</math> is ''strictly'' convex over <math>\lambda\ge 0</math>.
+*Prove the following Chernoff bound:
+::<math>\Pr[X\ge t]\le\exp(-\Psi_X^*(t))</math>.
+:In particular if <math>\Psi_X(\lambda)</math> is continuously differentiable, prove that the supreme in <math>\Psi_X^*(t)</math> is achieved at the unique <math>\lambda\ge 0</math> satisfying
+::<math>\Psi_X'(\lambda)=t</math>
+:where <math>\Psi_X'(\lambda)</math> denotes the derivative of <math>\Psi_X(\lambda)</math> with respect to <math>\lambda</math>.
+*'''Normal random variables.''' Let <math>X\sim \mathrm{N}(\mu,\sigma)</math> be a Gaussian random variable with mean <math>\mu</math> and standard deviation <math>\sigma</math>. What are the <math>\Psi_X(\lambda)</math> and <math>\Psi_X^*(t)</math>? And give a tail inequality to upper bound the probability <math>\Pr[X\ge t]</math>.
+*'''Poisson random variables.''' Let <math>X\sim \mathrm{Pois}(\nu)</math> be a Poisson random variable with parameter <math>\nu</math>, that is, <math>\Pr[X=k]=\mathrm{e}^{-\nu}\nu^k/k!</math> for all <math>k=0,1,2,\ldots</math>. What are the <math>\Psi_X(\lambda)</math> and <math>\Psi_X^*(t)</math>? And give a tail inequality to upper bound the probability <math>\Pr[X\ge t]</math>.
+*'''Bernoulli random variables.''' Let <math>X\in\{0,1\}</math> be a single Bernoulli trial with probability of success <math>p</math>, that is, <math>\Pr[X=1]=1-\Pr[X=0]=p</math>. Show that for any <math>t\in(p,1)</math>, we have <math>\Psi_X^*(t)=D(Y \| X)</math> where <math>Y\in\{0,1\}</math> is a Bernoulli random variable with parameter <math>t</math> and <math>D(Y \| X)=(1-t)\ln\frac{1-t}{1-p}+t\ln\frac{t}{p}</math> is the [https://en.wikipedia.org/wiki/Kullback–Leibler_divergence '''Kullback-Leibler divergence'''] between <math>Y</math> and <math>X</math>.
+*'''Sum of independent random variables.''' Let <math>X=\sum_{i=1}^nX_i</math> be the sum of <math>n</math> independently and identically distributed random variables <math>X_1,X_2,\ldots, X_n</math>. Show that <math>\Psi_X(\lambda)=\sum_{i=1}^n\Psi_{X_i}(\lambda)</math> and <math>\Psi_X^*(t)=n\Psi^*_{X_i}(\frac{t}{n})</math>. Also for binomial random variable <math>X\sim \mathrm{Bin}(n,p)</math>, give an upper bound to the tail inequality <math>\Pr[X\ge t]</math> in terms of KL-divergence.
+:Give an upper bound to <math>\Pr[X\ge t]</math> when every <math>X_i</math> follows the geometric distribution with a probability <math>p</math> of success.
+== Problem 2 ==
 An <math>n</math>-dimensional hypercube <math>Q_n</math> is a graph with <math>2^n</math> vertices, where each vertex is represented by an <math>n</math>-bit vector, and there is an edge between two vertices if and only if their bit-vectors differ in exactly one bit.
@@ Line 16: / Line 39: @@
 Give the relation between <math>c</math> and <math>t</math>.
+== Problem 3 ==
+== Problem 4 ==

高级算法 (Fall 2019)/Problem Set 2: Difference between revisions

Revision as of 12:26, 22 October 2019

Contents

Problem 1

Problem 2

Problem 3

Problem 4

Navigation menu

高级算法 (Fall 2019)/Problem Set 2: Difference between revisions

Revision as of 12:26, 22 October 2019

Problem 1

Problem 2

Problem 3

Problem 4

Navigation menu

Search