数据科学基础 (Fall 2024)/Problem Set 2
- Every problem's solution must include the complete working; you may write in Chinese or English.
- We recommend typesetting your homework with LaTeX, Markdown, or similar tools.
- Students who do not have access to such tools may complete the homework on paper and submit photos of it.
Assumptions throughout Problem Set 2
Unless stated otherwise, we work on the probability space [math]\displaystyle{ (\Omega,\mathcal{F},\mathbf{Pr}) }[/math].
Unless stated otherwise, we assume that the expectations of all random variables involved are well-defined.
Throughout, [math]\displaystyle{ \log }[/math] denotes the natural logarithm.
Problem 1 (Warm-up problems)
- [Cumulative distribution function (CDF)] Show that if [math]F[/math] and [math]G[/math] are distribution functions and [math]0\le\lambda\le 1[/math] then [math]\lambda F+(1-\lambda)G[/math] is a distribution function. Is the product [math]FG[/math] a distribution function? (The functions [math]\lambda F+(1-\lambda)G[/math] and [math]FG[/math] map [math]x[/math] to [math]\lambda F(x)+(1-\lambda)G(x)[/math] and [math]F(x)G(x)[/math], respectively)
- [Probability mass function (PMF)] We toss [math]\displaystyle{ n }[/math] coins, and each one shows heads with probability [math]\displaystyle{ p }[/math], independently of each of the others. Each coin that shows heads is tossed again. (If the coin shows tails, it is not tossed again.) Let [math]\displaystyle{ X }[/math] be the number of heads resulting from the second round of tosses, and [math]\displaystyle{ Y }[/math] be the number of heads resulting from all tosses, including both the first and the (possible) second toss of each coin. Find the PMF of [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math], and [math]\displaystyle{ \mathbf{E}[X] }[/math] and [math]\displaystyle{ \mathbf{E}[Y] }[/math]. (A simulation sketch for sanity-checking these appears after this problem list.)
- [Function of random variable] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math]. (A numerical sketch of this transform appears after this problem list.)
- [Estimator] Let [math]\{X_r : r \ge 1\}[/math] be observations which are independent and identically distributed with unknown distribution function [math]F[/math]. Describe and justify a method for estimating [math]F(x)[/math].
- [Independence] Let [math]\displaystyle{ X_r }[/math], [math]\displaystyle{ 1\leq r\leq n }[/math] be independent random variables which are symmetric about [math]\displaystyle{ 0 }[/math]; that is, [math]\displaystyle{ X_r }[/math] and [math]\displaystyle{ -X_r }[/math] have the same distributions. Show that, for all [math]\displaystyle{ x }[/math], [math]\displaystyle{ \mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x] }[/math] where [math]\displaystyle{ S_n = \sum_{r=1}^n X_r }[/math]. Is the conclusion true without the assumption of independence?
- [Joint distribution] Is the function [math]\displaystyle{ F(x,y)= 1 - e^{-xy}, 0 \le x,y \lt \infty }[/math], the joint distribution function of some pair of random variables?
- [Entropy] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with the convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen's inequality. (A numerical check of this bound appears after this problem list.)
- [Law of total expectation] Let [math]\displaystyle{ \{X_n\}_{n \ge 1} }[/math] be identically distributed random variables and [math]\displaystyle{ N }[/math] be a random variable taking values in the non-negative integers and independent of the [math]\displaystyle{ X_n }[/math] for all [math]\displaystyle{ n \ge 1 }[/math]. Prove that [math]\displaystyle{ \mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1] }[/math].
- [Composing random variables] A sequence of functions [math]\displaystyle{ f_1,\dots,f_n:\Omega\to\mathbb R }[/math] is increasing if [math]\displaystyle{ \forall w\in\Omega, i\in\{2,\dots,n\}, f_i(w)\gt f_{i-1}(w) }[/math]. A sequence of random variables [math]\displaystyle{ X_1,X_2,\dots:\Omega\to\mathbb R }[/math] converges (pointwise) to a random variable [math]\displaystyle{ X }[/math] if [math]\displaystyle{ \forall w\in\Omega,\lim_{n\to\infty}X_n(w)=X(w) }[/math].
- Show that any discrete random variable may be written as a linear combination of indicator variables.
- Show that any random variable may be expressed as the limit of an increasing sequence of discrete random variables.
- Show that the limit of any increasing convergent sequence of random variables is a random variable.
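Below is a minimal Python simulation sketch (not part of the assignment) for the coin-toss PMF problem above: it estimates the PMFs and expectations of [math]X[/math] and [math]Y[/math] empirically, so a derived closed form can be checked against it. The values of n, p, and the number of trials are illustrative choices only.

```python
# Minimal simulation sketch for the two-round coin-toss problem.
# Assumption: n, p, and the trial count below are arbitrary illustrative values.
import numpy as np

rng = np.random.default_rng(0)
n, p, trials = 5, 0.6, 200_000

first = rng.random((trials, n)) < p              # first-round tosses showing heads
second = (rng.random((trials, n)) < p) & first   # second round: only heads are re-tossed

X = second.sum(axis=1)        # heads in the second round
Y = first.sum(axis=1) + X     # heads over all tosses (first + possible second)

for name, Z in (("X", X), ("Y", Y)):
    pmf = np.bincount(Z) / trials
    print(name, "empirical PMF:", np.round(pmf, 4), "empirical mean:", round(Z.mean(), 4))
```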
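For the function-of-a-random-variable problem, here is a short numerical sketch of the same transform, assuming for illustration that [math]F[/math] is the Exp(1) distribution function [math]F(x) = 1 - e^{-x}[/math] (so [math]F^{-1}(u) = -\log(1-u)[/math]); the empirical CDF of [math]Y = F^{-1}(X)[/math] should then match [math]F[/math].

```python
# Sketch of the inverse-transform idea: X is uniform on [0,1]
# (its CDF is max(0, min(1, x))), and Y = F^{-1}(X) should have CDF F.
# Assumption: F is taken to be the Exp(1) CDF purely as an example.
import numpy as np

rng = np.random.default_rng(1)
x = rng.random(100_000)        # samples of X ~ Uniform[0, 1]
y = -np.log(1.0 - x)           # F^{-1}(u) = -log(1 - u) for F(t) = 1 - exp(-t)

for t in (0.5, 1.0, 2.0):      # compare the empirical CDF of Y with F at a few points
    print(t, round((y <= t).mean(), 4), round(1 - np.exp(-t), 4))
```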
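For the entropy problem, a quick numerical check (not a proof) of the claimed bound on a few randomly generated PMFs; log is the natural logarithm, as assumed above, and the values of N and the number of PMFs are illustrative.

```python
# Sanity check of the entropy bound H(X) <= log N on random PMFs.
# Assumption: N and the number of random PMFs are arbitrary illustrative values.
import numpy as np

rng = np.random.default_rng(2)
N = 8
for _ in range(3):
    p = rng.dirichlet(np.ones(N))     # a random PMF on {1, ..., N} (entries > 0 a.s.)
    H = -np.sum(p * np.log(p))        # natural logarithm, matching the convention above
    print(round(float(H), 4), "<=", round(float(np.log(N)), 4))
```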
Problem 2 (Discrete random variable)
- [Geometric distribution] Prove that the geometric distribution is normalized, i.e. [math]\displaystyle{ \sum_i p_X(i)=1 }[/math]. Prove that the geometric distribution is the only discrete memoryless distribution with range of values [math]\displaystyle{ \mathbb{N}_+ }[/math].
- [Binomial distribution] Let [math]\displaystyle{ n_1,n_2 \in \mathbb{N}_+ }[/math] and [math]\displaystyle{ 0 \le p \le 1 }[/math] be parameters, and [math]\displaystyle{ X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p) }[/math] be independent random variables. Prove that [math]\displaystyle{ X+Y \sim \mathrm{Bin}(n_1+n_2,p) }[/math]. Prove that the binomial distribution is normalized, i.e. [math]\displaystyle{ \sum_i p_X(i)=1 }[/math]. (A numerical check of the sum claim appears after this problem list.)
- [Negative binomial distribution] Let [math]\displaystyle{ X }[/math] follow the negative binomial distribution with parameters [math]\displaystyle{ r \in \mathbb{N}_+ }[/math] and [math]\displaystyle{ p \in (0,1) }[/math]. Calculate [math]\displaystyle{ \mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2 }[/math].
- [Hypergeometric distribution (i)] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn without replacement from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math]. (A numerical illustration of this limit appears after this problem list.)
- [Hypergeometric distribution (ii)] Let [math]X[/math] and [math]Y[/math] be independent [math]\mathrm{Bin}(n,p)[/math] variables, and let [math]Z = X + Y[/math]. Show that the conditional distribution of [math]X[/math] given [math]Z = N[/math] is the hypergeometric distribution.
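Below is a small Python sketch for the binomial-sum claim above: the PMF of [math]X+Y[/math], obtained by a direct convolution of the Bin(n1, p) and Bin(n2, p) PMFs, is compared with the Bin(n1 + n2, p) PMF. The parameter values are illustrative only.

```python
# Numerical check that the convolution of Bin(n1, p) and Bin(n2, p) PMFs
# equals the Bin(n1 + n2, p) PMF. Assumption: n1, n2, p are example values.
from math import comb
import numpy as np

def binom_pmf(n, p):
    return np.array([comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)])

n1, n2, p = 4, 6, 0.3
conv = np.convolve(binom_pmf(n1, p), binom_pmf(n2, p))   # PMF of X + Y
direct = binom_pmf(n1 + n2, p)
print(np.max(np.abs(conv - direct)))                     # ~0 up to floating-point error
```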
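And a numerical illustration of the hypergeometric-to-binomial limit: with b = floor(pN), Pr(B = k) is computed exactly and compared with the Bin(n, p) probability as N grows. The choices of n, p, k and the values of N are arbitrary.

```python
# Illustration of the limit Pr(B = k) -> C(n, k) p^k (1-p)^(n-k) as N grows
# with b/N -> p. Assumption: n, p, k and the values of N are example choices.
from math import comb

n, p, k = 10, 0.4, 3
for N in (50, 500, 5000, 50000):
    b = int(p * N)                                           # so b/N -> p
    hyper = comb(b, k) * comb(N - b, n - k) / comb(N, n)     # Pr(B = k), exact
    print(N, round(hyper, 6))
print("Bin(n, p):", round(comb(n, k) * p**k * (1 - p)**(n - k), 6))
```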