= Conditional Expectations =
The '''conditional expectation''' of a random variable <math>Y</math> with respect to an event <math>\mathcal{E}</math> is defined by
:<math>
\mathbf{E}[Y\mid \mathcal{E}]=\sum_{y}y\Pr[Y=y\mid\mathcal{E}].
</math>
In particular, if the event <math>\mathcal{E}</math> is <math>X=a</math>, the conditional expectation
:<math>
\mathbf{E}[Y\mid X=a]
</math>
defines a function
:<math>
f(a)=\mathbf{E}[Y\mid X=a].
</math>
Thus, <math>\mathbf{E}[Y\mid X]</math> can be regarded as a random variable <math>f(X)</math>.

;Example
:Suppose that we uniformly sample a human from all human beings. Let <math>Y</math> be his/her height, and let <math>X</math> be the country where he/she is from. For any country <math>a</math>, <math>\mathbf{E}[Y\mid X=a]</math> gives the average height of that country. And <math>\mathbf{E}[Y\mid X]</math> is the random variable which can be defined in either of the following ways:
:* We choose a human uniformly at random from all human beings, and <math>\mathbf{E}[Y\mid X]</math> is the average height of the country where he/she comes from.
:* We choose a country at random with probability ''proportional to its population'', and <math>\mathbf{E}[Y\mid X]</math> is the average height of the chosen country.

The following proposition states some fundamental facts about conditional expectation.

{{Theorem
|Proposition (fundamental facts about conditional expectation)|
:Let <math>X,Y</math> and <math>Z</math> be arbitrary random variables. Let <math>f</math> and <math>g</math> be arbitrary functions. Then
:# <math>\mathbf{E}[X]=\mathbf{E}[\mathbf{E}[X\mid Y]]</math>.
:# <math>\mathbf{E}[X\mid Z]=\mathbf{E}[\mathbf{E}[X\mid Y,Z]\mid Z]</math>.
:# <math>\mathbf{E}[\mathbf{E}[f(X)g(X,Y)\mid X]]=\mathbf{E}[f(X)\cdot \mathbf{E}[g(X,Y)\mid X]]</math>.
}}
The proposition can be formally verified by computing these expectations. Although the equations look formal, their intuitive interpretations are quite clear.

The first equation:
:<math>\mathbf{E}[X]=\mathbf{E}[\mathbf{E}[X\mid Y]]</math>
says that there are two ways to compute an average. Suppose again that <math>X</math> is the height of a uniformly random human and <math>Y</math> is the country where he/she is from. There are two ways to compute the average human height: one is to directly average over the heights of all humans; the other is to first compute the average height for each country, and then average these heights weighted by the populations of the countries.
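
The two ways of averaging can be checked numerically. Below is a minimal sketch computing both sides of the equation on a toy joint distribution of (country, height); the countries, weights, and heights are made-up illustrative numbers.
<syntaxhighlight lang="python">
# Check E[X] = E[E[X | Y]] on a toy joint distribution of
# (country Y, height X). All numbers are made up for illustration.

# Pr[Y = country, X = height]
joint = {
    ("A", 160): 0.125, ("A", 170): 0.125,   # country A: total weight 0.25
    ("B", 170): 0.375, ("B", 180): 0.375,   # country B: total weight 0.75
}

# direct way: average over all humans at once
direct = sum(x * p for (y, x), p in joint.items())

# country-by-country way: E[E[X | Y]] = sum_y Pr[Y=y] * E[X | Y=y]
nested = 0.0
for c in {y for y, _ in joint}:
    p_c = sum(p for (y, _), p in joint.items() if y == c)              # Pr[Y=c]
    avg_c = sum(x * p for (y, x), p in joint.items() if y == c) / p_c  # E[X|Y=c]
    nested += p_c * avg_c

print(direct, nested)  # both print 172.5
</syntaxhighlight>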

The second equation:
:<math>\mathbf{E}[X\mid Z]=\mathbf{E}[\mathbf{E}[X\mid Y,Z]\mid Z]</math>
is the same as the first one, restricted to a particular subspace. As in the previous example, in addition to the height <math>X</math> and the country <math>Y</math>, let <math>Z</math> be the gender of the individual. Thus, <math>\mathbf{E}[X\mid Z]</math> is the average height of a human being of a given gender. Again, this can be computed either directly or on a country-by-country basis.

The third equation:
:<math>\mathbf{E}[\mathbf{E}[f(X)g(X,Y)\mid X]]=\mathbf{E}[f(X)\cdot \mathbf{E}[g(X,Y)\mid X]]</math>
looks obscure at first glance, especially considering that <math>X</math> and <math>Y</math> are not necessarily independent. Nevertheless, the equation follows from the simple fact that conditioning on any <math>X=a</math>, the function value <math>f(X)=f(a)</math> becomes a constant, and thus can be safely taken outside the expectation due to the linearity of expectation. For any value <math>X=a</math>,
:<math>
\mathbf{E}[f(X)g(X,Y)\mid X=a]=\mathbf{E}[f(a)g(X,Y)\mid X=a]=f(a)\cdot \mathbf{E}[g(X,Y)\mid X=a].
</math>
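
This, too, is easy to confirm numerically. In the sketch below, the joint distribution makes <math>X</math> and <math>Y</math> correlated, and <math>f,g</math> are arbitrary illustrative choices; by the first equation, the left-hand side equals <math>\mathbf{E}[f(X)g(X,Y)]</math>, which is what the code computes directly.
<syntaxhighlight lang="python">
# Check E[E[f(X)g(X,Y) | X]] = E[f(X) * E[g(X,Y) | X]] on a toy
# joint distribution with correlated X and Y (arbitrary numbers).
joint = {(0, 0): 0.125, (0, 1): 0.375, (1, 0): 0.25, (1, 1): 0.25}  # Pr[X=x, Y=y]
f = lambda x: 2 * x + 1
g = lambda x, y: x + 3 * y

# left-hand side: by the first equation it is just E[f(X)g(X,Y)]
lhs = sum(f(x) * g(x, y) * p for (x, y), p in joint.items())

# right-hand side: E[f(X) * E[g(X,Y) | X]]
rhs = 0.0
for a in {x for x, _ in joint}:
    p_a = sum(p for (x, _), p in joint.items() if x == a)                   # Pr[X=a]
    cond = sum(g(x, y) * p for (x, y), p in joint.items() if x == a) / p_a  # E[g|X=a]
    rhs += f(a) * cond * p_a

print(lhs, rhs)  # both print 4.875
</syntaxhighlight>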

The proposition also holds in the more general cases where each of <math>X</math>, <math>Y</math> and <math>Z</math> is replaced by a sequence of random variables.

= Martingales =
A '''martingale''' is a random sequence <math>X_0,X_1,\ldots</math> satisfying the following so-called ''martingale property''.

{{Theorem
|Definition (martingale)|
:A sequence of random variables <math>X_0,X_1,\ldots</math> is a '''martingale''' if for all <math>i>0</math>,
::<math>\begin{align}
\mathbf{E}[X_{i}\mid X_0,\ldots,X_{i-1}]=X_{i-1}.
\end{align}</math>
}}

== Examples ==
;coin flips
:A fair coin is flipped a number of times. Let <math>Z_j\in\{-1,1\}</math> denote the outcome of the <math>j</math>th flip. Let
::<math>X_0=0\quad \mbox{ and } \quad X_i=\sum_{j\le i}Z_j</math>.
:The random variables <math>X_0,X_1,\ldots</math> define a martingale.
{{Proof| We first observe that <math>\mathbf{E}[X_i\mid X_0,\ldots,X_{i-1}] = \mathbf{E}[X_i\mid X_{i-1}]</math>, which intuitively says that the sum after the next flip depends only on the current sum. This property is also called the '''Markov property''' in stochastic processes.
::<math>
\begin{align}
\mathbf{E}[X_i\mid X_0,\ldots,X_{i-1}]
&= \mathbf{E}[X_i\mid X_{i-1}]\\
&= \mathbf{E}[X_{i-1}+Z_{i}\mid X_{i-1}]\\
&= \mathbf{E}[X_{i-1}\mid X_{i-1}]+\mathbf{E}[Z_{i}\mid X_{i-1}]\\
&= X_{i-1}+\mathbf{E}[Z_{i}\mid X_{i-1}]\\
&= X_{i-1}+\mathbf{E}[Z_{i}] &\quad (\mbox{independence of coin flips})\\
&= X_{i-1}
\end{align}
</math>
}}
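
The martingale property can also be confirmed mechanically. The following brute-force sketch enumerates all <math>2^n</math> flip sequences and checks that, conditioned on every possible history <math>(X_0,\ldots,X_{i-1})</math>, the average of <math>X_i</math> equals <math>X_{i-1}</math>; the parameters <math>n</math> and <math>i</math> are arbitrary small choices.
<syntaxhighlight lang="python">
# Verify E[X_i | X_0,...,X_{i-1}] = X_{i-1} for the +/-1 coin-flip sums
# by enumerating all 2^n outcomes (illustrative brute force).
from itertools import product

n, i = 10, 6
buckets = {}  # history (X_0,...,X_{i-1}) -> list of observed X_i values
for flips in product([-1, 1], repeat=n):
    xs = [0]                   # X_0 = 0
    for z in flips:
        xs.append(xs[-1] + z)  # X_j = X_{j-1} + Z_j
    buckets.setdefault(tuple(xs[:i]), []).append(xs[i])

# every conditional average equals the last entry of the history
assert all(sum(v) / len(v) == h[-1] for h, v in buckets.items())
print("martingale property verified for i =", i)
</syntaxhighlight>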

;edge exposure in a random graph
:Consider a '''random graph''' <math>G</math> generated as follows. Let <math>[n]</math> be the set of vertices, and let <math>m={n\choose 2}</math> be the number of possible edges. For convenience, we enumerate these potential edges as <math>e_1,\ldots, e_m</math>. For each potential edge <math>e_j</math>, we independently flip a fair coin to decide whether the edge <math>e_j</math> appears in <math>G</math>. Let <math>I_j</math> be the random variable that indicates whether <math>e_j\in G</math>. We are interested in some graph-theoretical parameter, say the [http://mathworld.wolfram.com/ChromaticNumber.html chromatic number], of the random graph <math>G</math>. Let <math>\chi(G)</math> be the chromatic number of <math>G</math>. Let <math>X_0=\mathbf{E}[\chi(G)]</math>, and for each <math>i\ge 1</math>, let <math>X_i=\mathbf{E}[\chi(G)\mid I_1,\ldots,I_{i}]</math>, namely, the expected chromatic number of the random graph after fixing the first <math>i</math> edges. This process is called the edge exposure of a random graph, as we "expose" the potential edges one by one.

It is nontrivial to formally verify that the edge exposure sequence for a random graph is a martingale. However, we will later see that this construction can be put into a more general context.

== Generalization ==
The martingale can be generalized to be with respect to another sequence of random variables.

{{Theorem
|Definition (martingale, general version)|
:A sequence of random variables <math>Y_0,Y_1,\ldots</math> is a martingale with respect to the sequence <math>X_0,X_1,\ldots</math> if, for all <math>i\ge 0</math>, the following conditions hold:
:* <math>Y_i</math> is a function of <math>X_0,X_1,\ldots,X_i</math>;
:* <math>\begin{align}
\mathbf{E}[Y_{i+1}\mid X_0,\ldots,X_{i}]=Y_{i}.
\end{align}</math>
}}
Therefore, a sequence <math>X_0,X_1,\ldots</math> is a martingale if it is a martingale with respect to itself.

The purpose of this generalization is that we are usually more interested in a function of a sequence of random variables than in the sequence itself.
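
A classic illustration of the general version: for the <math>\pm 1</math> coin flips above, the sequence <math>Y_i=X_i^2-i</math> is a martingale with respect to <math>X_0,X_1,\ldots</math>, since <math>\mathbf{E}[X_{i+1}^2\mid X_0,\ldots,X_i]=X_i^2+1</math>. The sketch below double-checks this by brute-force enumeration.
<syntaxhighlight lang="python">
# Check that Y_i = X_i^2 - i is a martingale with respect to the
# +/-1 coin-flip sums X_0, X_1, ...: E[Y_{i+1} | X_0..X_i] = Y_i.
from itertools import product

n, i = 8, 5
buckets = {}  # history (X_0,...,X_i) -> list of observed Y_{i+1} values
for flips in product([-1, 1], repeat=n):
    xs = [0]
    for z in flips:
        xs.append(xs[-1] + z)
    buckets.setdefault(tuple(xs[:i + 1]), []).append(xs[i + 1] ** 2 - (i + 1))

# E[Y_{i+1} | history] equals Y_i = X_i^2 - i for every history
assert all(sum(v) / len(v) == h[-1] ** 2 - i for h, v in buckets.items())
print("Y_i = X_i^2 - i is a martingale w.r.t. the coin flips")
</syntaxhighlight>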

The following definition describes a very general approach to constructing an important type of martingales.

{{Theorem
|Definition (The Doob sequence)|
: The Doob sequence of a function <math>f</math> with respect to a sequence of random variables <math>X_1,\ldots,X_n</math> is defined by
::<math>
Y_i=\mathbf{E}[f(X_1,\ldots,X_n)\mid X_1,\ldots,X_{i}], \quad 0\le i\le n.
</math>
:In particular, <math>Y_0=\mathbf{E}[f(X_1,\ldots,X_n)]</math> and <math>Y_n=f(X_1,\ldots,X_n)</math>.
}}

The Doob sequence of a function defines a martingale. That is,
::<math>
\mathbf{E}[Y_i\mid X_1,\ldots,X_{i-1}]=Y_{i-1},
</math>
for any <math>1\le i\le n</math>.

To prove this claim, we recall the definition that <math>Y_i=\mathbf{E}[f(X_1,\ldots,X_n)\mid X_1,\ldots,X_{i}]</math>, thus,
:<math>
\begin{align}
\mathbf{E}[Y_i\mid X_1,\ldots,X_{i-1}]
&=\mathbf{E}[\mathbf{E}[f(X_1,\ldots,X_n)\mid X_1,\ldots,X_{i}]\mid X_1,\ldots,X_{i-1}]\\
&=\mathbf{E}[f(X_1,\ldots,X_n)\mid X_1,\ldots,X_{i-1}]\\
&=Y_{i-1},
\end{align}
</math>
where the second equation is due to the second of the fundamental facts about conditional expectation introduced in the first section.

The Doob martingale describes a very natural procedure to determine a function value of a sequence of random variables. Suppose that we want to predict the value of a function <math>f(X_1,\ldots,X_n)</math> of random variables <math>X_1,\ldots,X_n</math>. The Doob sequence <math>Y_0,Y_1,\ldots,Y_n</math> represents a sequence of refined estimates of the value of <math>f(X_1,\ldots,X_n)</math>, gradually using more information on the values of the random variables <math>X_1,\ldots,X_n</math>. The first element <math>Y_0</math> is just the expectation of <math>f(X_1,\ldots,X_n)</math>. Element <math>Y_i</math> is the expected value of <math>f(X_1,\ldots,X_n)</math> when the values of <math>X_1,\ldots,X_{i}</math> are known, and <math>Y_n=f(X_1,\ldots,X_n)</math> when <math>f(X_1,\ldots,X_n)</math> is fully determined by <math>X_1,\ldots,X_n</math>.
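
This refinement procedure is easy to watch on a tiny example. In the sketch below, <math>f</math> counts the HEADs among <math>n=4</math> fair coin flips (encoded as 0/1), and each <math>Y_i</math> is computed by brute-force averaging over the not-yet-revealed flips; the outcome sequence is an arbitrary illustrative choice.
<syntaxhighlight lang="python">
# Compute the Doob sequence Y_i = E[f(X_1..X_n) | X_1..X_i] where
# f = number of HEADs among n fair 0/1 coin flips (illustrative).
from itertools import product

n = 4
f = lambda xs: sum(xs)  # number of HEADs

def doob(revealed):
    """Average f over all settings of the unrevealed flips."""
    rest = list(product([0, 1], repeat=n - len(revealed)))
    return sum(f(revealed + r) for r in rest) / len(rest)

outcome = (1, 0, 1, 1)
print([doob(outcome[:i]) for i in range(n + 1)])
# [2.0, 2.5, 2.0, 2.5, 3.0]: starts at Y_0 = E[f] = n/2, ends at Y_n = f
</syntaxhighlight>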

The following two Doob martingales arise in evaluating the parameters of random graphs.

=== edge exposure martingale ===
:Let <math>G</math> be a random graph on <math>n</math> vertices. Let <math>f</math> be a real-valued function of graphs, such as the chromatic number, the number of triangles, or the size of the largest clique or independent set. Let <math>m={n\choose 2}</math>. Fix an arbitrary numbering of the potential edges between the <math>n</math> vertices, and denote the edges as <math>e_1,\ldots,e_m</math>. Let
::<math>
X_i=\begin{cases}
1& \mbox{if }e_i\in G,\\
0& \mbox{otherwise}.
\end{cases}
</math>
:Let <math>Y_0=\mathbf{E}[f(G)]</math> and for <math>i=1,\ldots,m</math>, let <math>Y_i=\mathbf{E}[f(G)\mid X_1,\ldots,X_i]</math>.
:The sequence <math>Y_0,Y_1,\ldots,Y_m</math> gives a Doob martingale that is commonly called the '''edge exposure martingale'''.
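
The following sketch computes an edge exposure martingale explicitly, taking <math>f</math> to be the number of triangles (small enough to brute-force) on <math>n=4</math> vertices; the exposed outcome is an arbitrary illustrative sample.
<syntaxhighlight lang="python">
# Edge exposure martingale for f = number of triangles in G(4, 1/2);
# each Y_i averages f over all settings of the unexposed edges.
from itertools import combinations, product

n = 4
edges = list(combinations(range(n), 2))  # e_1, ..., e_m with m = C(4,2) = 6
m = len(edges)

def triangles(present):
    """f(G), given the 0/1 indicator vector of all m potential edges."""
    on = {e for e, b in zip(edges, present) if b}
    return sum(1 for t in combinations(range(n), 3)
               if all(p in on for p in combinations(t, 2)))

def Y(exposed):
    """Y_i = E[f(G) | X_1,...,X_i], brute-forced over hidden edges."""
    rest = list(product([0, 1], repeat=m - len(exposed)))
    return sum(triangles(exposed + r) for r in rest) / len(rest)

outcome = (1, 1, 0, 1, 1, 1)             # one sample of (X_1,...,X_m)
print([Y(outcome[:i]) for i in range(m + 1)])
# starts at Y_0 = E[f] = 4/8 = 0.5 and ends at Y_6 = f(G) = 2.0
</syntaxhighlight>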

=== vertex exposure martingale ===
:Instead of revealing edges one at a time, we could reveal the set of edges connected to a given vertex, one vertex at a time. Suppose that the vertex set is <math>[n]</math>. Let <math>X_i</math> be the subgraph of <math>G</math> induced by the vertex set <math>[i]</math>, i.e. the first <math>i</math> vertices.
:Let <math>Y_0=\mathbf{E}[f(G)]</math> and for <math>i=1,\ldots,n</math>, let <math>Y_i=\mathbf{E}[f(G)\mid X_1,\ldots,X_i]</math>.
:The sequence <math>Y_0,Y_1,\ldots,Y_n</math> gives a Doob martingale that is commonly called the '''vertex exposure martingale'''.
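
For comparison, here is a matching sketch of the vertex exposure martingale on the same toy <math>f</math> (triangle count, <math>n=4</math>): exposing vertex <math>i</math> reveals exactly the potential edges with both endpoints among the first <math>i</math> vertices. Note that <math>Y_1=Y_0</math> always, since a single exposed vertex carries no edge information.
<syntaxhighlight lang="python">
# Vertex exposure martingale for f = number of triangles on n = 4
# vertices: after exposing vertex i, exactly the edges inside the
# first i vertices are known (illustrative brute force).
from itertools import combinations, product

n = 4
edges = list(combinations(range(n), 2))
m = len(edges)

def triangles(present):
    on = {e for e, b in zip(edges, present) if b}
    return sum(1 for t in combinations(range(n), 3)
               if all(p in on for p in combinations(t, 2)))

def Y(outcome, i):
    """Y_i = E[f(G) | subgraph induced by the first i vertices]."""
    hidden = [j for j, e in enumerate(edges) if max(e) >= i]  # unexposed edges
    settings = list(product([0, 1], repeat=len(hidden)))
    total = 0
    for r in settings:
        full = list(outcome)
        for j, b in zip(hidden, r):
            full[j] = b               # resample every unexposed edge
        total += triangles(full)
    return total / len(settings)

outcome = [1, 1, 0, 1, 1, 1]
print([Y(outcome, i) for i in range(n + 1)])  # [0.5, 0.5, 0.75, 0.5, 2.0]
</syntaxhighlight>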