数据科学基础 (Fall 2024)/Problem Set 6: Difference between revisions

From TCS Wiki
Jump to navigation Jump to search
Liumingmou (talk | contribs)
Liumingmou (talk | contribs)
(4 intermediate revisions by the same user not shown)
Line 4: Line 4:


*没有条件的同学可以用纸笔完成作业之后拍照。
*没有条件的同学可以用纸笔完成作业之后拍照。
* '''under construction'''
 
==  Assumption throughout Problem Set 6 ==  
==  Assumption throughout Problem Set 6 ==  
<p>Without further notice, we are working on probability space <math>(\Omega,\mathcal{F},\Pr)</math>.</p>
<p>Without further notice, we are working on probability space <math>(\Omega,\mathcal{F},\Pr)</math>.</p>
Line 14: Line 14:
* ['''Dominated convergence'''] Suppose <math>|X_n|\le Z</math> for all <math>n\in\mathbb N</math>, where <math>\mathbb E(Z)<\infty</math>. Prove that if <math>X_n \xrightarrow P X</math> then <math>X_n \xrightarrow 1 X</math>.
* ['''Dominated convergence'''] Suppose <math>|X_n|\le Z</math> for all <math>n\in\mathbb N</math>, where <math>\mathbb E(Z)<\infty</math>. Prove that if <math>X_n \xrightarrow P X</math> then <math>X_n \xrightarrow 1 X</math>.
* ['''Slutsky’s theorem'''] Let <math>(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y</math> be random variables and <math>c\in\mathbb{R}</math> be a real number.  
* ['''Slutsky’s theorem'''] Let <math>(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y</math> be random variables and <math>c\in\mathbb{R}</math> be a real number.  
*# Suppose <math>X_n \overset{D}{\to} X</math> and <math>Y_n \overset{D}{\to} c</math>. Prove that <math>X_nY_n \overset{D}{\to} cX</math>.  
# Suppose <math>X_n \overset{D}{\to} X</math> and <math>Y_n \overset{D}{\to} c</math>. Prove that <math>X_nY_n \overset{D}{\to} cX</math>.  
*# Construct an example such that <math>X_n \overset{D}{\to} X</math> and <math>Y_n \overset{D}{\to} Y</math> but <math>X_nY_n</math> does not converge to <math>XY</math> in distribution.
# Construct an example such that <math>X_n \overset{D}{\to} X</math> and <math>Y_n \overset{D}{\to} Y</math> but <math>X_nY_n</math> does not converge to <math>XY</math> in distribution.


== Problem 2 (LLN & CLT)==
== Problem 2 (LLN & CLT)==
* ['''Proportional betting'''] In each of a sequence of independent bets, a gambler either wins 30%, or loses 25% of her current fortune, each with probability <math>1/2</math>. Denoting her fortune after <math>n</math> bets by <math>F_n</math>, show that <math>\mathbb E(F_n)\to\infty</math> as <math>n \to\infty</math>, while <math>F_n \to 0</math> almost surely.
* ['''Proportional betting'''] In each of a sequence of independent bets, a gambler either wins 30%, or loses 25% of her current fortune, each with probability <math>1/2</math>. Denoting her fortune after <math>n</math> bets by <math>F_n</math>, show that <math>\mathbb E(F_n)\to\infty</math> as <math>n \to\infty</math>, while <math>F_n \to 0</math> almost surely.
* ['''Transience'''] Let <math>X_1,X_2,\dots</math> be independent identically distributed random variables taking values in the integers <math>\mathbb Z</math> and having a finite mean. Show that the Markov chain <math>S = \{S_n\}</math> given by <math>S_n = \sum^n_{i=1} X_i</math> is transient, i.e. <math>\forall n\in\mathbb N,\Pr(\exists n'>n, S_{n'}=S_n)<1</math>, if <math>\mathbb E(X_1)\ne 0</math>.
* ['''Transience'''] Let <math>X_1,X_2,\dots</math> be independent identically distributed random variables taking values in the integers <math>\mathbb Z</math> and having a finite mean. Show that the Markov chain <math>S = \{S_n\}</math> given by <math>S_n = \sum^n_{i=1} X_i</math> is transient, i.e. <math>\forall n\in\mathbb N,\Pr(\exists n'>n, S_{n'}=S_n)<1</math>, if <math>\mathbb E(X_1)\ne 0</math>.
* ['''Controlling a Fair Voting'''] In a society of <math>n</math> isolated (independent) and neutral (uniform) peoples, how many peoples are there enough to manipulate the result of a majority vote with <math>1-\delta</math> certainty? You have to use the Berry–Esseen theorem to solve this problem.
* ['''Controlling a Voting'''] In a society of <math>n</math> isolated (independent) peoples, each might vote for the proposal with probability <math>p</math> and vote against the proposal with probability <math>1-p</math>, how many peoples are there enough to manipulate the result of a majority vote with <math>1-\delta</math> certainty? Please apply the Berry–Esseen theorem to solve this problem.


== Problem 3 (Concentration of measure)==
== Problem 3 (Concentration of measure)==
* ['''Tossing coins'''] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable <math>X</math> be the number of throws required to obtain a total of <math>n</math> heads. Show that <math>\Pr[X > 2n + \delta\sqrt{n\log n}]\leq n^{-\delta^2/6}</math> for any real <math>0<\delta<\sqrt{\frac{4n}{\log n}}</math>.
* ['''<math>k</math>-th moment bound'''] Let <math>X</math> be a random variable with expectation <math>0</math> such that moment generating function <math>\mathbf{E}[\exp(t|X|)]</math> is finite for some <math> t > 0 </math>. We can use the following two kinds of tail inequalities for <math> X </math>:
** Chernoff Bound:  <math>\Pr[|X| \geq \delta] \leq \min_{t \geq 0} {\mathbb{E}[e^{t|X|}]}/{e^{t\delta}}</math>;
** <math>k</math>th-Moment Bound:  <math>\Pr[|X| \geq \delta] \leq {\mathbb{E}[|X|^k]}/{\delta^k}</math>.
# Show that for each <math>\delta</math>, there exists a choice of <math>k</math> such that the <math>k</math>th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)
# Why would we still prefer the Chernoff bound to the (seemingly) stronger <math>k</math>-th moment bound?
* ['''Cut size in random graph'''] Show that with probability at least <math>2/3</math>, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] <math>G(n,1/2)</math> is at most <math>n^2/8 + O(n^{1.5})</math>. In the <math>G(n,1/2)</math> model, each edge is included in the graph with probability <math>1/2</math>, independently of every other edge.
== Problem 4 (Random processes)==
== Problem 4 (Random processes)==
* ['''Hypergeometry'''] Given a bag with <math>r</math> red balls and <math>g</math> green balls, suppose that we uniformly sample <math>n</math> balls from the bin without replacement. Set up an appropriate martingale and use it to show that the number of red balls in the sample is tightly concentrated around <math>nr/(r+ g)</math>.
* ['''Pólya’s urn''']A bag contains red and blue balls, with initially  <math>r</math> red and  <math>b</math> blue where  <math>rb >0 </math>. A ball is drawn from the bag, its color noted, and then it is returned to the bag together with a new ball of the same color. Let  <math>R_n </math> be the number of red balls after  <math>n </math> such operations.
# Show that  <math>Y_n = R_n/(n + r + b) </math> is a martingale.
# Let  <math>T </math> be the number of balls drawn until the first blue ball appears, and suppose that  <math>r = b= 1 </math>. Show that  <math>\mathbb E[(T + 2)^{−1}] = 1/4 </math>.
* ['''Family planning'''] Children are either female or male. Their sexes are independent random variables, being female with probability <math>q</math> or male with probability <math>p= 1− q</math>. A woman ceases childbearing at stage <math>T</math> , and we write <math>G_n</math> and <math>B_n</math> for the numbers of girls and boys born to her up to and including stage <math>n</math>. Assume that <math>T</math> is a finite stopping time for the sequence <math>\{(G_n,B_n) : n \le 1\}</math>. Show that, no matter the stopping rule that yields <math>T</math> , we have <math>\mathbb E(G_T )/\mathbb E(B_T )= q/p</math>. What can be said about <math>\mathbb E(G_T /B_T )</math>?
* ['''Random walk on a graph'''] A particle performs a random walk on the vertex set of a connected graph <math>G</math>, which for simplicity we assume to have neither loops nor multiple edges. At each stage it moves to a neighbor of its current position, each such neighbor being chosen with equal probability. If <math>G</math> has <math>\eta<\infty</math> edges, show that the stationary distribution is given by <math>\pi(v) = d_v/(2\eta)</math>, where <math>d_v</math> is the degree of vertex <math>v</math>.
* ['''Reversibility versus periodicity'''] Can a reversible chain be periodic?
* ['''Metropolis–Hastings algorithm'''] To sample a state, for each state <math>x</math>, the Glauber dynamics uniformly chooses a state among the adjacent states of <math>x</math> together with state <math>x</math> itself at random in each step, and moves to the chosen state. The Metropolis-Hastings algorithm generalizes the idea of Glauber dynamics. Let us assume that we have designed an irreducible state space for our Markov chain; now we want to construct a Markov chain on this state space with a stationary distribution <math>\pi_x = b(x)/B</math>, where for all <math>x \in \Omega</math> we have <math>b(x) > 0</math> and such that <math>B =\sum_{x\in\Omega} b(x)</math> is finite.
# For a finite state space <math>\Omega</math> and neighborhood structure <math>\{N(X ) | x \in\Omega\}</math>, let <math>N = \max_{x\in\Omega} |N(x)|</math>. Let <math>M</math> be any number such that <math>M \ge N</math>. For all <math>x \in \Omega</math>, let <math>\pi_x > 0</math> be the desired probability of state <math>x</math> in the stationary distribution. Consider a Markov chain where <math>P_{x,y} =
\begin{cases}(1/M) \min(1, \pi_y/\pi_x ) &\text{if $x \ne y$ and $y \in N(x)$},\\
0 &\text{if $x \ne y$ and $y \notin N(x)$},\\
1 − \sum_{y\ne x} P_{x,y} &\text{if $x = y$}\end{cases}</math>. Assume this chain is irreducible and aperiodic, verify that the stationary distribution is given by the probabilities <math>\pi_x</math>. (Hint: Show the time-reversibility.)
# Let <math>S = \sum_{i=1}^\infty i^{−2} = \pi^2/6</math>. Design a Markov chain based on the Metropolis-Hastings algorithm on the positive integers such that, in the stationary distribution, <math>\pi_i = 1/(Si^2)</math> . The neighbors of any integer <math>i > 1</math> for your chain should be only <math>i − 1</math> and <math>i + 1</math>, and the only neighbor of <math>1</math> should be the integer <math>2</math>.

Revision as of 06:48, 1 December 2024

  • 每道题目的解答都要有完整的解题过程,中英文不限。
  • 我们推荐大家使用LaTeX, markdown等对作业进行排版。
  • 没有条件的同学可以用纸笔完成作业之后拍照。

Assumption throughout Problem Set 6

Without further notice, we are working on probability space [math]\displaystyle{ (\Omega,\mathcal{F},\Pr) }[/math].

Without further notice, we assume that the expectation of random variables are well-defined.

Problem 1 (Convergence) (Bonus)

  • [Convergence in [math]\displaystyle{ r }[/math]-th mean] Suppose [math]\displaystyle{ X_n{\xrightarrow {r}} X }[/math], where [math]\displaystyle{ r\ge 1 }[/math]. Prove or disprove that [math]\displaystyle{ \mathbb E[X_n^r]\to\mathbb E[X^r] }[/math].
  • [Dominated convergence] Suppose [math]\displaystyle{ |X_n|\le Z }[/math] for all [math]\displaystyle{ n\in\mathbb N }[/math], where [math]\displaystyle{ \mathbb E(Z)\lt \infty }[/math]. Prove that if [math]\displaystyle{ X_n \xrightarrow P X }[/math] then [math]\displaystyle{ X_n \xrightarrow 1 X }[/math].
  • [Slutsky’s theorem] Let [math]\displaystyle{ (X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y }[/math] be random variables and [math]\displaystyle{ c\in\mathbb{R} }[/math] be a real number.
  1. Suppose [math]\displaystyle{ X_n \overset{D}{\to} X }[/math] and [math]\displaystyle{ Y_n \overset{D}{\to} c }[/math]. Prove that [math]\displaystyle{ X_nY_n \overset{D}{\to} cX }[/math].
  2. Construct an example such that [math]\displaystyle{ X_n \overset{D}{\to} X }[/math] and [math]\displaystyle{ Y_n \overset{D}{\to} Y }[/math] but [math]\displaystyle{ X_nY_n }[/math] does not converge to [math]\displaystyle{ XY }[/math] in distribution.

Problem 2 (LLN & CLT)

  • [Proportional betting] In each of a sequence of independent bets, a gambler either wins 30%, or loses 25% of her current fortune, each with probability [math]\displaystyle{ 1/2 }[/math]. Denoting her fortune after [math]\displaystyle{ n }[/math] bets by [math]\displaystyle{ F_n }[/math], show that [math]\displaystyle{ \mathbb E(F_n)\to\infty }[/math] as [math]\displaystyle{ n \to\infty }[/math], while [math]\displaystyle{ F_n \to 0 }[/math] almost surely.
  • [Transience] Let [math]\displaystyle{ X_1,X_2,\dots }[/math] be independent identically distributed random variables taking values in the integers [math]\displaystyle{ \mathbb Z }[/math] and having a finite mean. Show that the Markov chain [math]\displaystyle{ S = \{S_n\} }[/math] given by [math]\displaystyle{ S_n = \sum^n_{i=1} X_i }[/math] is transient, i.e. [math]\displaystyle{ \forall n\in\mathbb N,\Pr(\exists n'\gt n, S_{n'}=S_n)\lt 1 }[/math], if [math]\displaystyle{ \mathbb E(X_1)\ne 0 }[/math].
  • [Controlling a Voting] In a society of [math]\displaystyle{ n }[/math] isolated (independent) peoples, each might vote for the proposal with probability [math]\displaystyle{ p }[/math] and vote against the proposal with probability [math]\displaystyle{ 1-p }[/math], how many peoples are there enough to manipulate the result of a majority vote with [math]\displaystyle{ 1-\delta }[/math] certainty? Please apply the Berry–Esseen theorem to solve this problem.

Problem 3 (Concentration of measure)

  • [Tossing coins] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable [math]\displaystyle{ X }[/math] be the number of throws required to obtain a total of [math]\displaystyle{ n }[/math] heads. Show that [math]\displaystyle{ \Pr[X \gt 2n + \delta\sqrt{n\log n}]\leq n^{-\delta^2/6} }[/math] for any real [math]\displaystyle{ 0\lt \delta\lt \sqrt{\frac{4n}{\log n}} }[/math].
  • [[math]\displaystyle{ k }[/math]-th moment bound] Let [math]\displaystyle{ X }[/math] be a random variable with expectation [math]\displaystyle{ 0 }[/math] such that moment generating function [math]\displaystyle{ \mathbf{E}[\exp(t|X|)] }[/math] is finite for some [math]\displaystyle{ t \gt 0 }[/math]. We can use the following two kinds of tail inequalities for [math]\displaystyle{ X }[/math]:
    • Chernoff Bound: [math]\displaystyle{ \Pr[|X| \geq \delta] \leq \min_{t \geq 0} {\mathbb{E}[e^{t|X|}]}/{e^{t\delta}} }[/math];
    • [math]\displaystyle{ k }[/math]th-Moment Bound: [math]\displaystyle{ \Pr[|X| \geq \delta] \leq {\mathbb{E}[|X|^k]}/{\delta^k} }[/math].
  1. Show that for each [math]\displaystyle{ \delta }[/math], there exists a choice of [math]\displaystyle{ k }[/math] such that the [math]\displaystyle{ k }[/math]th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)
  2. Why would we still prefer the Chernoff bound to the (seemingly) stronger [math]\displaystyle{ k }[/math]-th moment bound?
  • [Cut size in random graph] Show that with probability at least [math]\displaystyle{ 2/3 }[/math], the size of the max-cut in Erdős–Rényi random graph [math]\displaystyle{ G(n,1/2) }[/math] is at most [math]\displaystyle{ n^2/8 + O(n^{1.5}) }[/math]. In the [math]\displaystyle{ G(n,1/2) }[/math] model, each edge is included in the graph with probability [math]\displaystyle{ 1/2 }[/math], independently of every other edge.

Problem 4 (Random processes)

  • [Hypergeometry] Given a bag with [math]\displaystyle{ r }[/math] red balls and [math]\displaystyle{ g }[/math] green balls, suppose that we uniformly sample [math]\displaystyle{ n }[/math] balls from the bin without replacement. Set up an appropriate martingale and use it to show that the number of red balls in the sample is tightly concentrated around [math]\displaystyle{ nr/(r+ g) }[/math].
  • [Pólya’s urn]A bag contains red and blue balls, with initially [math]\displaystyle{ r }[/math] red and [math]\displaystyle{ b }[/math] blue where [math]\displaystyle{ rb \gt 0 }[/math]. A ball is drawn from the bag, its color noted, and then it is returned to the bag together with a new ball of the same color. Let [math]\displaystyle{ R_n }[/math] be the number of red balls after [math]\displaystyle{ n }[/math] such operations.
  1. Show that [math]\displaystyle{ Y_n = R_n/(n + r + b) }[/math] is a martingale.
  2. Let [math]\displaystyle{ T }[/math] be the number of balls drawn until the first blue ball appears, and suppose that [math]\displaystyle{ r = b= 1 }[/math]. Show that [math]\displaystyle{ \mathbb E[(T + 2)^{−1}] = 1/4 }[/math].
  • [Family planning] Children are either female or male. Their sexes are independent random variables, being female with probability [math]\displaystyle{ q }[/math] or male with probability [math]\displaystyle{ p= 1− q }[/math]. A woman ceases childbearing at stage [math]\displaystyle{ T }[/math] , and we write [math]\displaystyle{ G_n }[/math] and [math]\displaystyle{ B_n }[/math] for the numbers of girls and boys born to her up to and including stage [math]\displaystyle{ n }[/math]. Assume that [math]\displaystyle{ T }[/math] is a finite stopping time for the sequence [math]\displaystyle{ \{(G_n,B_n) : n \le 1\} }[/math]. Show that, no matter the stopping rule that yields [math]\displaystyle{ T }[/math] , we have [math]\displaystyle{ \mathbb E(G_T )/\mathbb E(B_T )= q/p }[/math]. What can be said about [math]\displaystyle{ \mathbb E(G_T /B_T ) }[/math]?
  • [Random walk on a graph] A particle performs a random walk on the vertex set of a connected graph [math]\displaystyle{ G }[/math], which for simplicity we assume to have neither loops nor multiple edges. At each stage it moves to a neighbor of its current position, each such neighbor being chosen with equal probability. If [math]\displaystyle{ G }[/math] has [math]\displaystyle{ \eta\lt \infty }[/math] edges, show that the stationary distribution is given by [math]\displaystyle{ \pi(v) = d_v/(2\eta) }[/math], where [math]\displaystyle{ d_v }[/math] is the degree of vertex [math]\displaystyle{ v }[/math].
  • [Reversibility versus periodicity] Can a reversible chain be periodic?
  • [Metropolis–Hastings algorithm] To sample a state, for each state [math]\displaystyle{ x }[/math], the Glauber dynamics uniformly chooses a state among the adjacent states of [math]\displaystyle{ x }[/math] together with state [math]\displaystyle{ x }[/math] itself at random in each step, and moves to the chosen state. The Metropolis-Hastings algorithm generalizes the idea of Glauber dynamics. Let us assume that we have designed an irreducible state space for our Markov chain; now we want to construct a Markov chain on this state space with a stationary distribution [math]\displaystyle{ \pi_x = b(x)/B }[/math], where for all [math]\displaystyle{ x \in \Omega }[/math] we have [math]\displaystyle{ b(x) \gt 0 }[/math] and such that [math]\displaystyle{ B =\sum_{x\in\Omega} b(x) }[/math] is finite.
  1. For a finite state space [math]\displaystyle{ \Omega }[/math] and neighborhood structure [math]\displaystyle{ \{N(X ) | x \in\Omega\} }[/math], let [math]\displaystyle{ N = \max_{x\in\Omega} |N(x)| }[/math]. Let [math]\displaystyle{ M }[/math] be any number such that [math]\displaystyle{ M \ge N }[/math]. For all [math]\displaystyle{ x \in \Omega }[/math], let [math]\displaystyle{ \pi_x \gt 0 }[/math] be the desired probability of state [math]\displaystyle{ x }[/math] in the stationary distribution. Consider a Markov chain where [math]\displaystyle{ P_{x,y} = \begin{cases}(1/M) \min(1, \pi_y/\pi_x ) &\text{if $x \ne y$ and $y \in N(x)$},\\ 0 &\text{if $x \ne y$ and $y \notin N(x)$},\\ 1 − \sum_{y\ne x} P_{x,y} &\text{if $x = y$}\end{cases} }[/math]. Assume this chain is irreducible and aperiodic, verify that the stationary distribution is given by the probabilities [math]\displaystyle{ \pi_x }[/math]. (Hint: Show the time-reversibility.)
  2. Let [math]\displaystyle{ S = \sum_{i=1}^\infty i^{−2} = \pi^2/6 }[/math]. Design a Markov chain based on the Metropolis-Hastings algorithm on the positive integers such that, in the stationary distribution, [math]\displaystyle{ \pi_i = 1/(Si^2) }[/math] . The neighbors of any integer [math]\displaystyle{ i \gt 1 }[/math] for your chain should be only [math]\displaystyle{ i − 1 }[/math] and [math]\displaystyle{ i + 1 }[/math], and the only neighbor of [math]\displaystyle{ 1 }[/math] should be the integer [math]\displaystyle{ 2 }[/math].