数据科学基础 (Fall 2024)/Problem Set 6
- 每道题目的解答都要有完整的解题过程,中英文不限。
- 我们推荐大家使用LaTeX, markdown等对作业进行排版。
- 没有条件的同学可以用纸笔完成作业之后拍照。
- under construction
Assumption throughout Problem Set 6
Without further notice, we are working on probability space [math]\displaystyle{ (\Omega,\mathcal{F},\Pr) }[/math].
Without further notice, we assume that the expectation of random variables are well-defined.
Problem 1 (Convergence) (Bonus)
- [Convergence in [math]\displaystyle{ r }[/math]-th mean] Suppose [math]\displaystyle{ X_n{\xrightarrow {r}} X }[/math], where [math]\displaystyle{ r\ge 1 }[/math]. Prove or disprove that [math]\displaystyle{ \mathbb E[X_n^r]\to\mathbb E[X^r] }[/math].
- [Dominated convergence] Suppose [math]\displaystyle{ |X_n|\le Z }[/math] for all [math]\displaystyle{ n\in\mathbb N }[/math], where [math]\displaystyle{ \mathbb E(Z)\lt \infty }[/math]. Prove that if [math]\displaystyle{ X_n \xrightarrow P X }[/math] then [math]\displaystyle{ X_n \xrightarrow 1 X }[/math].
- [Slutsky’s theorem] Let [math]\displaystyle{ (X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y }[/math] be random variables and [math]\displaystyle{ c\in\mathbb{R} }[/math] be a real number.
- Suppose [math]\displaystyle{ X_n \overset{D}{\to} X }[/math] and [math]\displaystyle{ Y_n \overset{D}{\to} c }[/math]. Prove that [math]\displaystyle{ X_nY_n \overset{D}{\to} cX }[/math].
- Construct an example such that [math]\displaystyle{ X_n \overset{D}{\to} X }[/math] and [math]\displaystyle{ Y_n \overset{D}{\to} Y }[/math] but [math]\displaystyle{ X_nY_n }[/math] does not converge to [math]\displaystyle{ XY }[/math] in distribution.
Problem 2 (LLN & CLT)
- [Proportional betting] In each of a sequence of independent bets, a gambler either wins 30%, or loses 25% of her current fortune, each with probability [math]\displaystyle{ 1/2 }[/math]. Denoting her fortune after [math]\displaystyle{ n }[/math] bets by [math]\displaystyle{ F_n }[/math], show that [math]\displaystyle{ \mathbb E(F_n)\to\infty }[/math] as [math]\displaystyle{ n \to\infty }[/math], while [math]\displaystyle{ F_n \to 0 }[/math] almost surely.
- [Transience] Let [math]\displaystyle{ X_1,X_2,\dots }[/math] be independent identically distributed random variables taking values in the integers [math]\displaystyle{ \mathbb Z }[/math] and having a finite mean. Show that the Markov chain [math]\displaystyle{ S = \{S_n\} }[/math] given by [math]\displaystyle{ S_n = \sum^n_{i=1} X_i }[/math] is transient, i.e. [math]\displaystyle{ \forall n\in\mathbb N,\Pr(\exists n'\gt n, S_{n'}=S_n)\lt 1 }[/math], if [math]\displaystyle{ \mathbb E(X_1)\ne 0 }[/math].
- [Controlling a Fair Voting] In a society of [math]\displaystyle{ n }[/math] isolated (independent) and neutral (uniform) peoples, how many peoples are there enough to manipulate the result of a majority vote with [math]\displaystyle{ 1-\delta }[/math] certainty? You have to use the Berry–Esseen theorem to solve this problem.
Problem 3 (Concentration of measure)
- [Tossing coins] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable [math]\displaystyle{ X }[/math] be the number of throws required to obtain a total of [math]\displaystyle{ n }[/math] heads. Show that [math]\displaystyle{ \Pr[X \gt 2n + \delta\sqrt{n\log n}]\leq n^{-\delta^2/6} }[/math] for any real [math]\displaystyle{ 0\lt \delta\lt \sqrt{\frac{4n}{\log n}} }[/math].
- [[math]\displaystyle{ k }[/math]-th moment bound] Let [math]\displaystyle{ X }[/math] be a random variable with expectation [math]\displaystyle{ 0 }[/math] such that moment generating function [math]\displaystyle{ \mathbf{E}[\exp(t|X|)] }[/math] is finite for some [math]\displaystyle{ t \gt 0 }[/math]. We can use the following two kinds of tail inequalities for [math]\displaystyle{ X }[/math]:
- Chernoff Bound: [math]\displaystyle{ \Pr[|X| \geq \delta] \leq \min_{t \geq 0} {\mathbb{E}[e^{t|X|}]}/{e^{t\delta}} }[/math]
- [math]\displaystyle{ k }[/math]th-Moment Bound: [math]\displaystyle{ \Pr[|X| \geq \delta] \leq {\mathbb{E}[|X|^k]}/{\delta^k} }[/math]
- Show that for each [math]\displaystyle{ \delta }[/math], there exists a choice of [math]\displaystyle{ k }[/math] such that the [math]\displaystyle{ k }[/math]th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)
- Why would we still prefer the Chernoff bound to the (seemingly) stronger [math]\displaystyle{ k }[/math]-th moment bound?
- [Cut size in random graph] Show that with probability at least [math]\displaystyle{ 2/3 }[/math], the size of the max-cut in Erdős–Rényi random graph [math]\displaystyle{ G(n,1/2) }[/math] is at most [math]\displaystyle{ n^2/8 + O(n^{1.5}) }[/math]. In the [math]\displaystyle{ G(n,1/2) }[/math] model, each edge is included in the graph with probability [math]\displaystyle{ 1/2 }[/math], independently of every other edge.