Foundations of Data Science (Fall 2024) / Problem Set 2
- Every solution must include the complete reasoning; answers may be written in either Chinese or English.
- We recommend typesetting your homework with LaTeX, Markdown, or similar tools.
- Students without access to such tools may complete the homework on paper and submit photographs.
Assumptions throughout Problem Set 2
Unless stated otherwise, we work on the probability space [math]\displaystyle{ (\Omega,\mathcal{F},\mathbf{Pr}) }[/math].
Unless stated otherwise, we assume that the expectations of all random variables are well-defined.
Throughout, [math]\displaystyle{ \log }[/math] denotes the natural logarithm.
Problem 1 (Warm-up problems)
- [Cumulative distribution function (CDF)] Show that if [math]F[/math] and [math]G[/math] are distribution functions and [math]0\le\lambda\le 1[/math], then [math]\lambda F+(1-\lambda)G[/math] is a distribution function. Is the product [math]FG[/math] a distribution function? (The functions [math]\lambda F+(1-\lambda)G[/math] and [math]FG[/math] map [math]x[/math] to [math]\lambda F(x)+(1-\lambda)G(x)[/math] and [math]F(x)G(x)[/math], respectively.)
- [Probability mass function (PMF)] We toss [math]\displaystyle{ n }[/math] coins, and each one shows heads with probability [math]\displaystyle{ p }[/math], independently of the others. Each coin that shows heads is tossed again (a coin that shows tails is not tossed again). Let [math]\displaystyle{ X }[/math] be the number of heads resulting from the second round of tosses, and let [math]\displaystyle{ Y }[/math] be the number of heads resulting from all tosses, i.e. from the first round and the possible second round of each coin. Find the PMFs of [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math], and compute [math]\displaystyle{ \mathbf{E}[X] }[/math] and [math]\displaystyle{ \mathbf{E}[Y] }[/math]. (A Monte Carlo sanity check for this problem is sketched after this list.)
- [Function of random variable] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math] (i.e., [math]X[/math] is uniform on [math][0,1][/math]). Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math]. (This is the basis of inverse transform sampling; a sketch appears after this list.)
- [Estimator] Let [math]\{X_r : r \ge 1\}[/math] be observations which are independent and identically distributed with unknown distribution function [math]F[/math]. Describe and justify a method for estimating [math]F(x)[/math]. (One candidate estimator is sketched in code after this list.)
- [Independence] Let [math]\displaystyle{ X_r }[/math], [math]\displaystyle{ 1\leq r\leq n }[/math], be independent random variables which are symmetric about [math]\displaystyle{ 0 }[/math]; that is, [math]\displaystyle{ X_r }[/math] and [math]\displaystyle{ -X_r }[/math] have the same distribution. Show that, for all [math]\displaystyle{ x }[/math], [math]\displaystyle{ \mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x] }[/math], where [math]\displaystyle{ S_n = \sum_{r=1}^n X_r }[/math]. Is the conclusion true without the assumption of independence?
- [Joint distribution] Is the function [math]\displaystyle{ F(x,y) = 1 - e^{-xy}, 0 \le x,y \lt \infty }[/math], the joint distribution function of some pair of random variables?
- [Entropy] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n=1}^{N} p(n) \log p(n)[/math] with the convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen's inequality.
- [Law of total expectation] Let [math]\displaystyle{ \{X_n\}_{n \ge 1} }[/math] be identically distributed random variables and [math]\displaystyle{ N }[/math] be a random variable taking values in the non-negative integers, independent of the [math]\displaystyle{ X_n }[/math] for all [math]\displaystyle{ n \ge 1 }[/math]. Prove that [math]\displaystyle{ \mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1] }[/math]. (A simulation check of this identity appears after this list.)
- [Composing random variables]
- Show that any discrete random variable may be written as a linear combination of indicator variables.
- Show that any random variable may be expressed as the limit of an increasing sequence of discrete random variables.
- Show that the limit of any increasing convergent sequence of random variables is a random variable.
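The problems above are meant to be solved analytically, but closed forms are easy to sanity-check numerically. For the two-round coin problem, the derivation should yield [math]\mathbf{E}[X] = np^2[/math] and [math]\mathbf{E}[Y] = np + np^2[/math]; the minimal Python sketch below estimates both by Monte Carlo (the parameters [math]n = 10, p = 0.3[/math] are arbitrary choices, not part of the problem).

```python
import random

def simulate(n, p, trials=200_000):
    """Monte Carlo estimate of E[X] and E[Y] for the two-round coin problem:
    X = heads in the second round only, Y = heads over all tosses."""
    ex = ey = 0.0
    for _ in range(trials):
        first = sum(random.random() < p for _ in range(n))       # round-1 heads
        second = sum(random.random() < p for _ in range(first))  # re-tossed heads
        ex += second
        ey += first + second
    return ex / trials, ey / trials

n, p = 10, 0.3
mx, my = simulate(n, p)
print(f"E[X] ~ {mx:.3f}   vs   n*p^2       = {n * p**2:.3f}")
print(f"E[Y] ~ {my:.3f}   vs   n*p + n*p^2 = {n * p + n * p**2:.3f}")
```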
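The function-of-random-variable problem explains why inverse transform sampling works: pushing a uniform variable through [math]F^{-1}[/math] yields samples with CDF [math]F[/math]. A minimal sketch, taking [math]F(x) = 1 - e^{-x}[/math] as an arbitrary continuous, strictly increasing choice of [math]F[/math]:

```python
import math
import random

def F_inv(u, lam=1.0):
    """Inverse of F(x) = 1 - exp(-lam * x), the Exp(lam) CDF; used here only
    as one concrete continuous, strictly increasing F."""
    return -math.log(1.0 - u) / lam

# X is uniform on [0, 1] (its CDF is max(0, min(1, x))), so random.random() samples it.
samples = [F_inv(random.random()) for _ in range(100_000)]
for x in (0.5, 1.0, 2.0):
    empirical = sum(s <= x for s in samples) / len(samples)
    print(f"x = {x}:  empirical CDF {empirical:.3f}   vs   F(x) = {1 - math.exp(-x):.3f}")
```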
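For the estimator problem, one natural candidate (not necessarily the only acceptable answer) is the empirical distribution function [math]\hat{F}_n(x) = \frac{1}{n}\#\{r \le n : X_r \le x\}[/math], which converges to [math]F(x)[/math] by the law of large numbers. A minimal sketch:

```python
import bisect
import random

def ecdf(observations):
    """Empirical distribution function: F_hat(x) = (1/n) * #{r : X_r <= x}."""
    xs = sorted(observations)
    n = len(xs)
    return lambda x: bisect.bisect_right(xs, x) / n

F_hat = ecdf([random.gauss(0, 1) for _ in range(50_000)])
print(F_hat(0.0))   # close to 0.5 for a standard normal sample
```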
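The law-of-total-expectation problem is Wald's identity, and it too can be checked by simulation. In the sketch below, [math]N \sim \mathrm{Uniform}\{0,\dots,m\}[/math] and [math]X_i \sim \mathrm{Exp}(1)[/math] are arbitrary choices, made only because both expectations are known in closed form:

```python
import random

def wald_check(m=10, trials=200_000):
    """Monte Carlo check of E[sum_{i=1}^N X_i] = E[N] * E[X_1], with
    N ~ Uniform{0, ..., m} drawn independently of the i.i.d. X_i ~ Exp(1)."""
    total = 0.0
    for _ in range(trials):
        N = random.randint(0, m)  # independent of the X_i
        total += sum(random.expovariate(1.0) for _ in range(N))
    print(f"LHS ~ {total / trials:.3f}   vs   E[N] * E[X_1] = {m / 2:.3f}")

wald_check()
```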
Problem 2 (Discrete random variable)
- [Geometric distribution] Prove that the geometric distribution is normalized, i.e. [math]\displaystyle{ \sum_i p_X(i)=1 }[/math]. Prove that the geometric distribution is the only discrete memoryless distribution with range [math]\displaystyle{ \mathbb{N}_+ }[/math].
- [Binomial distribution] Let [math]\displaystyle{ n_1,n_2 \in \mathbb{N}_+ }[/math] and [math]\displaystyle{ 0 \le p \le 1 }[/math] be parameters, and let [math]\displaystyle{ X \sim \mathrm{Bin}(n_1,p), Y \sim \mathrm{Bin}(n_2,p) }[/math] be independent random variables. Prove that [math]\displaystyle{ X+Y \sim \mathrm{Bin}(n_1+n_2,p) }[/math]. Prove that the binomial distribution is normalized, i.e. [math]\displaystyle{ \sum_i p_X(i)=1 }[/math]. (An exact numerical check of both claims is sketched after this list.)
- [Negative binomial distribution] Let [math]\displaystyle{ X }[/math] follow the negative binomial distribution with parameters [math]\displaystyle{ r \in \mathbb{N}_+ }[/math] and [math]\displaystyle{ p \in (0,1) }[/math]. Calculate [math]\displaystyle{ \mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2 }[/math]. (A simulation check appears after this list.)
- [Hypergeometric distribution (i)] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N - b[/math] of which are red. A random sample of [math]n[/math] balls is drawn without replacement from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math]. (A numerical illustration of this limit appears after this list.)
- [Hypergeometric distribution (ii)] Let [math]X[/math] and [math]Y[/math] be independent [math]\mathrm{Bin}(n,p)[/math] random variables, and let [math]Z = X + Y[/math]. Show that the conditional distribution of [math]X[/math] given [math]Z = N[/math] is the hypergeometric distribution.
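Because binomial PMFs are finite sums, both claims in the binomial-distribution problem (normalization and the convolution identity) can be checked exactly for small parameters. A minimal sketch ([math]n_1 = 4, n_2 = 6, p = 0.3[/math] are arbitrary):

```python
from math import comb

def binom_pmf(n, p, k):
    """PMF of Bin(n, p) at k."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n1, n2, p = 4, 6, 0.3
# Normalization of Bin(n1 + n2, p).
assert abs(sum(binom_pmf(n1 + n2, p, k) for k in range(n1 + n2 + 1)) - 1) < 1e-12
# The convolution of the two PMFs matches Bin(n1 + n2, p) term by term.
for k in range(n1 + n2 + 1):
    conv = sum(binom_pmf(n1, p, i) * binom_pmf(n2, p, k - i)
               for i in range(max(0, k - n2), min(n1, k) + 1))
    assert abs(conv - binom_pmf(n1 + n2, p, k)) < 1e-12
print("Bin(n1, p) convolved with Bin(n2, p) equals Bin(n1 + n2, p); PMF sums to 1")
```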
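The negative-binomial sketch below assumes the "number of trials until the [math]r[/math]-th success" convention, under which the answer works out to [math]\mathbf{Var}[X] = r(1-p)/p^2[/math]; the "number of failures" convention shifts [math]X[/math] by the constant [math]r[/math] and therefore has the same variance. If the course uses a different convention, adjust accordingly.

```python
import random

def negbin_sample(r, p):
    """Number of Bernoulli(p) trials needed to see the r-th success."""
    trials = successes = 0
    while successes < r:
        trials += 1
        if random.random() < p:
            successes += 1
    return trials

r, p, n = 3, 0.4, 200_000
xs = [negbin_sample(r, p) for _ in range(n)]
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
print(f"Var[X] ~ {var:.3f}   vs   r(1-p)/p^2 = {r * (1 - p) / p**2:.3f}")
```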
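For the hypergeometric limit, the convergence can be watched numerically by fixing [math]n, p, k[/math] and letting [math]N[/math] grow (the values below are arbitrary):

```python
from math import comb

def hyper_pmf(N, b, n, k):
    """P(B = k) when n balls are drawn without replacement from N balls, b blue."""
    return comb(b, k) * comb(N - b, n - k) / comb(N, n)

n, p, k = 10, 0.3, 4
for N in (50, 500, 5_000, 50_000):
    b = round(p * N)
    print(f"N = {N:6d}:  Pr(B = {k}) = {hyper_pmf(N, b, n, k):.6f}")
print(f"binomial limit:  {comb(n, k) * p**k * (1 - p)**(n - k):.6f}")
```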