数据科学基础 (Fall 2024)/Problem Set 2

  • Every solution must include the complete working; answers may be written in either Chinese or English.
  • We recommend typesetting your homework with LaTeX, Markdown, or similar tools.
  • Students without access to such tools may complete the homework on paper and submit photos.

Assumption throughout Problem Set 2

Unless stated otherwise, we work on the probability space [math]\displaystyle{ (\Omega,\mathcal{F},\mathbf{Pr}) }[/math].

Unless stated otherwise, we assume that the expectations of all random variables are well-defined.

The term [math]\displaystyle{ \log }[/math] used in this context refers to the natural logarithm.

Problem 1 (Warm-up problems)

  • [Cumulative distribution function (CDF)] Show that if [math]F[/math] and [math]G[/math] are distribution functions and [math]0\le\lambda\le 1[/math] then [math]\lambda F+(1-\lambda)G[/math] is a distribution function. Is the product [math]FG[/math] a distribution function? (The functions [math]\lambda F+(1-\lambda)G[/math] and [math]FG[/math] map [math]x[/math] to [math]\lambda F(x)+(1-\lambda)G(x)[/math] and [math]F(x)G(x)[/math], respectively)
  • [Probability mass function (PMF)] We toss [math]\displaystyle{ n }[/math] coins, and each one shows heads with probability [math]\displaystyle{ p }[/math], independently of each of the others. Each coin which shows heads is tossed again (a coin which shows tails is not tossed again). Let [math]\displaystyle{ X }[/math] be the number of heads resulting from the second round of tosses, and [math]\displaystyle{ Y }[/math] be the number of heads resulting from all tosses, including the first and (possibly) the second round of each coin. Find the PMFs of [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math], and compute [math]\displaystyle{ \mathbf{E}[X] }[/math] and [math]\displaystyle{ \mathbf{E}[Y] }[/math]. (A simulation sketch for sanity-checking the answer appears after this problem list.)
  • [Function of random variable] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math]. (This is the idea behind inverse transform sampling; see the sketch after this list.)
  • [Estimator] Let [math]\{X_r : r \ge 1\}[/math] be observations which are independent and identically distributed with unknown distribution function [math]F[/math]. Describe and justify a method for estimating [math]F(x)[/math]. (A sketch of one natural estimator appears after this list.)
  • [Independence] Let [math]\displaystyle{ X_r }[/math], [math]\displaystyle{ 1\leq r\leq n }[/math], be independent random variables which are symmetric about [math]\displaystyle{ 0 }[/math]; that is, [math]\displaystyle{ X_r }[/math] and [math]\displaystyle{ -X_r }[/math] have the same distribution. Show that, for all [math]\displaystyle{ x }[/math], [math]\displaystyle{ \mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x] }[/math] where [math]\displaystyle{ S_n = \sum_{r=1}^n X_r }[/math]. Is the conclusion true without the assumption of independence?
  • [Joint distribution] Is the function [math]\displaystyle{ F(x,y) = 1 - e^{-xy}, 0 \le x,y \lt \infty }[/math], the joint distribution function of some pair of random variables?
  • [Entropy] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen's inequality.
  • [Law of total expectation] Let [math]\displaystyle{ \{X_n\}_{n \ge 1} }[/math] be identically distributed random variables and [math]\displaystyle{ N }[/math] be a random variable taking values in the non-negative integers, independent of the [math]\displaystyle{ X_n }[/math] for all [math]\displaystyle{ n \ge 1 }[/math]. Prove that [math]\displaystyle{ \mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1] }[/math]. (A numerical check of this identity appears after this list.)
  • [Composing random variables] A sequence of functions [math]\displaystyle{ f_1,\dots,f_n:\Omega\to\mathbb R }[/math] is increasing if [math]\displaystyle{ \forall w\in\Omega, i\in\{2,\dots,n\}, f_i(w)\gt f_{i-1}(w) }[/math]. A sequence of random variables [math]\displaystyle{ X_1,X_2,\dots:\Omega\to\mathbb R }[/math] converges (pointwise) to a random variable [math]\displaystyle{ X }[/math] if [math]\displaystyle{ \forall w\in\Omega,\lim_{n\to\infty}X_n(w)=X(w) }[/math].
  1. Show that any discrete random variable may be written as a linear combination of indicator variables.
  2. Show that any random variable may be expressed as the limit of an increasing sequence of discrete random variables.
  3. Show that the limit of any increasing convergent sequence of random variables is a random variable.
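
For the PMF problem above, a Monte Carlo sanity check is often useful before (or after) deriving the closed forms. The sketch below simulates the two-round experiment and reports empirical estimates of [math]\mathbf{E}[X][/math] and [math]\mathbf{E}[Y][/math]; the parameter values are arbitrary choices, not part of the problem.

<syntaxhighlight lang="python">
import random

def two_round_tosses(n, p):
    """One experiment: toss n coins, then re-toss only those showing heads.
    Returns (X, Y) = (heads in the second round, heads over both rounds)."""
    first = sum(random.random() < p for _ in range(n))       # heads in round 1
    second = sum(random.random() < p for _ in range(first))  # re-toss those heads
    return second, first + second

def estimate(n=10, p=0.3, trials=200_000):
    sx = sy = 0.0
    for _ in range(trials):
        x, y = two_round_tosses(n, p)
        sx += x
        sy += y
    return sx / trials, sy / trials  # empirical E[X] and E[Y]

print(estimate())  # compare with the closed forms you derive
</syntaxhighlight>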
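The function-of-random-variable problem is exactly the correctness claim behind inverse transform sampling: since [math]X[/math] has distribution function [math]\max(0,\min(1,x))[/math], it is uniform on [math][0,1][/math], and pushing it through [math]F^{-1}[/math] yields a sample from [math]F[/math]. A minimal sketch, using the [math]\mathrm{Exp}(1)[/math] distribution as an illustrative choice of [math]F[/math]:

<syntaxhighlight lang="python">
import math
import random

def inverse_transform_samples(F_inv, m):
    """Push Uniform(0,1) samples through F^{-1} to sample from F."""
    return [F_inv(random.random()) for _ in range(m)]

# Illustrative choice of F: the Exp(1) CDF F(x) = 1 - e^{-x}.
F_inv = lambda u: -math.log(1.0 - u)

xs = inverse_transform_samples(F_inv, 100_000)
t = 1.0
# The empirical CDF at t should approximate F(t) = 1 - e^{-1}.
print(sum(x <= t for x in xs) / len(xs), 1.0 - math.exp(-t))
</syntaxhighlight>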
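For the estimator problem, one natural candidate is the empirical distribution function [math]\hat F_n(x) = \frac{1}{n}|\{r \le n : X_r \le x\}|[/math], which converges to [math]F(x)[/math] by the law of large numbers. A sketch; the exponential sample here is only a stand-in for data from the unknown [math]F[/math]:

<syntaxhighlight lang="python">
import math
import random

def ecdf(sample):
    """Return the empirical distribution function of the sample."""
    data = sorted(sample)
    n = len(data)
    def F_hat(x):
        # fraction of observations that are <= x
        return sum(v <= x for v in data) / n
    return F_hat

# Stand-in data: the "unknown" F here is the Exp(1) CDF.
sample = [random.expovariate(1.0) for _ in range(10_000)]
F_hat = ecdf(sample)
print(F_hat(1.0), 1.0 - math.exp(-1.0))  # estimate vs. the true F(1)
</syntaxhighlight>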
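The identity in the law-of-total-expectation problem can also be checked numerically before it is proved. In the sketch below, [math]N[/math] is uniform on [math]\{0,\dots,20\}[/math] and the [math]X_i[/math] are uniform on [math][0,1][/math]; both are arbitrary choices satisfying the hypotheses.

<syntaxhighlight lang="python">
import random

def one_run(max_n=20):
    # N is uniform on {0, ..., max_n} and independent of the X_i
    N = random.randint(0, max_n)
    return sum(random.random() for _ in range(N))  # X_i ~ Uniform(0, 1)

trials = 200_000
avg = sum(one_run() for _ in range(trials)) / trials
print(avg)  # should be close to E[N] * E[X_1] = 10 * 0.5 = 5
</syntaxhighlight>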

Problem 2 (Discrete random variable)

  • [Geometric distribution] Prove that the geometric distribution is normalized, i.e., [math]\displaystyle{ \sum_i p_X(i)=1 }[/math]. Prove that the geometric distribution is the only discrete memoryless distribution with range of values [math]\displaystyle{ \mathbb{N}_+ }[/math]. (A numerical illustration of memorylessness appears after this problem list.)
  • [Binomial distribution] Let [math]\displaystyle{ n_1,n_2 \in \mathbb{N}_+ }[/math] and [math]\displaystyle{ 0 \le p \le 1 }[/math] be parameters, and let [math]\displaystyle{ X \sim \mathrm{Bin}(n_1,p) }[/math] and [math]\displaystyle{ Y \sim \mathrm{Bin}(n_2,p) }[/math] be independent random variables. Prove that [math]\displaystyle{ X+Y \sim \mathrm{Bin}(n_1+n_2,p) }[/math]. Prove that the binomial distribution is normalized, i.e., [math]\displaystyle{ \sum_i p_X(i)=1 }[/math]. (A simulation check of the convolution claim appears after this list.)
  • [Negative binomial distribution] Let [math]\displaystyle{ X }[/math] follow the negative binomial distribution with parameters [math]\displaystyle{ r \in \mathbb{N}_+ }[/math] and [math]\displaystyle{ p \in (0,1) }[/math]. Calculate [math]\displaystyle{ \mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2 }[/math]. (A simulation sketch for estimating this variance appears after this list.)
  • [Hypergeometric distribution (i)] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N - b[/math] of which are red. A random sample of [math]n[/math] balls is drawn without replacement from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math]. (A numerical illustration of this limit appears after this list.)
  • [Hypergeometric distribution (ii)] Let [math]X[/math] and [math]Y[/math] be independent [math]\mathrm{Bin}(n,p)[/math] variables, and let [math]Z = X + Y[/math]. Show that the conditional distribution of [math]X[/math] given [math]Z = N[/math] is the hypergeometric distribution. (A simulation check appears after this list.)
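
For the geometric problem, memorylessness means [math]\mathbf{Pr}[X \gt s+t \mid X \gt s] = \mathbf{Pr}[X \gt t][/math] for all [math]s,t \in \mathbb{N}[/math]. The sketch below illustrates this numerically; the values of [math]p[/math], [math]s[/math], and [math]t[/math] are arbitrary.

<syntaxhighlight lang="python">
import random

def geometric_sample(p):
    """Number of Bernoulli(p) trials up to and including the first success."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

p, s, t, trials = 0.3, 2, 3, 200_000
xs = [geometric_sample(p) for _ in range(trials)]
tail = lambda u: sum(x > u for x in xs) / trials
# Memorylessness: Pr[X > s + t | X > s] should equal Pr[X > t].
print(tail(s + t) / tail(s), tail(t))
</syntaxhighlight>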
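For the binomial problem, the convolution claim can be checked by comparing the empirical PMF of [math]X+Y[/math] against the [math]\mathrm{Bin}(n_1+n_2,p)[/math] PMF; the parameter values below are arbitrary.

<syntaxhighlight lang="python">
import random
from math import comb

def binom_sample(n, p):
    """One draw from Bin(n, p) as a sum of n Bernoulli(p) indicators."""
    return sum(random.random() < p for _ in range(n))

n1, n2, p, trials = 6, 4, 0.3, 200_000
counts = [0] * (n1 + n2 + 1)
for _ in range(trials):
    counts[binom_sample(n1, p) + binom_sample(n2, p)] += 1

for k, c in enumerate(counts):
    exact = comb(n1 + n2, k) * p**k * (1 - p)**(n1 + n2 - k)
    print(k, c / trials, exact)  # empirical vs. Bin(n1 + n2, p)
</syntaxhighlight>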
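For the negative binomial problem, an empirical variance gives a target to check the derived formula against. The sketch uses the convention that [math]X[/math] counts the number of trials needed to see the [math]r[/math]-th success; adjust if your course uses the failures-only convention.

<syntaxhighlight lang="python">
import random

def negbin_sample(r, p):
    """Number of Bernoulli(p) trials needed to see the r-th success."""
    successes, trials = 0, 0
    while successes < r:
        trials += 1
        if random.random() < p:
            successes += 1
    return trials

r, p, m = 5, 0.4, 200_000
xs = [negbin_sample(r, p) for _ in range(m)]
mean = sum(xs) / m
var = sum((x - mean) ** 2 for x in xs) / m  # empirical Var[X]
print(mean, var)  # compare with the formulas you derive
</syntaxhighlight>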
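For hypergeometric (i), the limit can be watched numerically: fix [math]n[/math], [math]k[/math], and [math]p[/math], let [math]N[/math] grow with [math]b[/math] the nearest integer to [math]pN[/math], and compare the hypergeometric probability with its binomial limit.

<syntaxhighlight lang="python">
from math import comb

def hypergeom_pmf(N, b, n, k):
    """Pr(B = k): choose k blue out of b and n - k red out of N - b."""
    return comb(b, k) * comb(N - b, n - k) / comb(N, n)

n, k, p = 10, 3, 0.4
limit = comb(n, k) * p**k * (1 - p)**(n - k)
for N in (50, 500, 5000, 50000):
    b = round(p * N)
    print(N, hypergeom_pmf(N, b, n, k), limit)
</syntaxhighlight>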
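For hypergeometric (ii), conditioning can be simulated by rejection: draw pairs [math](X,Y)[/math] and keep only those with [math]X+Y=N[/math]. The empirical conditional PMF of [math]X[/math] should then match the hypergeometric PMF; the parameters below are arbitrary.

<syntaxhighlight lang="python">
import random
from math import comb

def binom_sample(n, p):
    """One draw from Bin(n, p) as a sum of n Bernoulli(p) indicators."""
    return sum(random.random() < p for _ in range(n))

n, p, N, trials = 8, 0.5, 8, 400_000
counts, hits = [0] * (n + 1), 0
for _ in range(trials):
    x, y = binom_sample(n, p), binom_sample(n, p)
    if x + y == N:  # keep only samples on the event Z = N
        counts[x] += 1
        hits += 1

for k in range(max(0, N - n), min(n, N) + 1):
    exact = comb(n, k) * comb(n, N - k) / comb(2 * n, N)  # hypergeometric PMF
    print(k, counts[k] / hits, exact)
</syntaxhighlight>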