<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://tcs.nju.edu.cn/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Zhangxy</id>
	<title>TCS Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://tcs.nju.edu.cn/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Zhangxy"/>
	<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=Special:Contributions/Zhangxy"/>
	<updated>2026-04-28T19:26:04Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.45.3</generator>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13125</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13125"/>
		<updated>2025-05-07T08:33:49Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 3 (Strong Law of Large Numbers, 15 points, Bonus Problem) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*The solution to each problem must show the complete working; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are additional problems (optional).&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
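&lt;br /&gt;
A brief Monte Carlo sanity check (a Python sketch, not part of the required solution; the helper names and trial counts are illustrative assumptions) that estimates this expectation directly from the definition:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def count_peripheral(points):&lt;br /&gt;
    # A point P_i is peripheral if every point P_r satisfies X_r &amp;lt;= X_i or Y_r &amp;lt;= Y_i.&lt;br /&gt;
    total = 0&lt;br /&gt;
    for (xi, yi) in points:&lt;br /&gt;
        if all(xr &amp;lt;= xi or yr &amp;lt;= yi for (xr, yr) in points):&lt;br /&gt;
            total += 1&lt;br /&gt;
    return total&lt;br /&gt;
&lt;br /&gt;
def estimate_expected_peripheral(n, trials=20000):&lt;br /&gt;
    # Average the peripheral count over independent samples of n uniform points in the unit square.&lt;br /&gt;
    acc = 0&lt;br /&gt;
    for _ in range(trials):&lt;br /&gt;
        pts = [(random.random(), random.random()) for _ in range(n)]&lt;br /&gt;
        acc += count_peripheral(pts)&lt;br /&gt;
    return acc / trials&lt;br /&gt;
&lt;br /&gt;
print(estimate_expected_peripheral(5))  # compare with your closed-form answer for n = 5&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;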
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; be independent and uniformly distributed on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;. Find the joint density function of &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Z^2&amp;lt;/math&amp;gt;, and show that &amp;lt;math&amp;gt;\textbf{Pr}[XY&amp;lt;Z^2] = 5/9&amp;lt;/math&amp;gt;.&lt;br /&gt;
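&lt;br /&gt;
Since the target value 5/9 is stated in the problem, a quick Monte Carlo check (a Python sketch, assuming nothing beyond the problem statement) can be used to sanity-check the joint density you derive:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def estimate_prob(trials=10**6):&lt;br /&gt;
    # Estimate Pr[XY &amp;lt; Z^2] for independent X, Y, Z uniform on [0,1].&lt;br /&gt;
    hits = 0&lt;br /&gt;
    for _ in range(trials):&lt;br /&gt;
        x, y, z = random.random(), random.random(), random.random()&lt;br /&gt;
        if x * y &amp;lt; z * z:&lt;br /&gt;
            hits += 1&lt;br /&gt;
    return hits / trials&lt;br /&gt;
&lt;br /&gt;
print(estimate_prob())  # should be close to 5/9, i.e. about 0.5556&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;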
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density function of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039; a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
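&lt;br /&gt;
A minimal Monte Carlo sketch (Python; not part of the required solution, and the function names and trial count are illustrative assumptions) for estimating the expected return value of the process above:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def simulate_count(U):&lt;br /&gt;
    # Run the process once: add Uniform(0,1) draws until the running sum exceeds U.&lt;br /&gt;
    x, count = 0.0, 0&lt;br /&gt;
    while x &amp;lt; U:&lt;br /&gt;
        x += random.random()&lt;br /&gt;
        count += 1&lt;br /&gt;
    return count&lt;br /&gt;
&lt;br /&gt;
def estimate_expected_count(U, trials=200000):&lt;br /&gt;
    # Monte Carlo estimate of the expected return value.&lt;br /&gt;
    return sum(simulate_count(U) for _ in range(trials)) / trials&lt;br /&gt;
&lt;br /&gt;
print(estimate_expected_count(0.5))  # compare with your closed-form answer at U = 0.5&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;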
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
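&lt;br /&gt;
A simulation sketch (Python; the rejection sampler, the finite grid of candidate diameters, and the trial counts are illustrative assumptions, and the grid check is only approximate) for estimating this probability empirically:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math, random&lt;br /&gt;
&lt;br /&gt;
def sample_point_in_disk():&lt;br /&gt;
    # Rejection sampling from the bounding square gives a uniform point in the unit disk.&lt;br /&gt;
    while True:&lt;br /&gt;
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)&lt;br /&gt;
        if x * x + y * y &amp;lt;= 1:&lt;br /&gt;
            return x, y&lt;br /&gt;
&lt;br /&gt;
def all_in_some_semicircle(points, directions=720):&lt;br /&gt;
    # Approximate check: try many candidate diameters through the centre and see&lt;br /&gt;
    # whether all points lie on one side of some diameter.&lt;br /&gt;
    for k in range(directions):&lt;br /&gt;
        phi = 2 * math.pi * k / directions&lt;br /&gt;
        if all(x * math.cos(phi) + y * math.sin(phi) &amp;gt;= 0 for (x, y) in points):&lt;br /&gt;
            return True&lt;br /&gt;
    return False&lt;br /&gt;
&lt;br /&gt;
def estimate(n, trials=20000):&lt;br /&gt;
    hits = sum(all_in_some_semicircle([sample_point_in_disk() for _ in range(n)]) for _ in range(trials))&lt;br /&gt;
    return hits / trials&lt;br /&gt;
&lt;br /&gt;
print(estimate(4))  # compare with your closed-form answer for n = 4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;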
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
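&lt;br /&gt;
A small numerical sketch (Python; the grid search over t and the sample values of n are illustrative assumptions) that evaluates the two generic bounds so you can compare them against your analytic answers:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math&lt;br /&gt;
&lt;br /&gt;
def chebyshev_bound(n):&lt;br /&gt;
    # Chebyshev: Pr[X &amp;gt;= n/4] &amp;lt;= Var(X) / (n/4 - E[X])^2 for X ~ Binomial(n, 1/6).&lt;br /&gt;
    mean, var = n / 6, n * (1 / 6) * (5 / 6)&lt;br /&gt;
    return var / (n / 4 - mean) ** 2&lt;br /&gt;
&lt;br /&gt;
def chernoff_bound(n, grid=2000):&lt;br /&gt;
    # Generic Chernoff bound: minimise E[exp(tX)] / exp(t n/4) over t &amp;gt; 0 on a grid&lt;br /&gt;
    # (computed in logs to avoid overflow); t = 0 gives the trivial bound 1.&lt;br /&gt;
    best = 0.0&lt;br /&gt;
    for k in range(1, grid + 1):&lt;br /&gt;
        t = 5.0 * k / grid&lt;br /&gt;
        log_mgf = n * math.log(math.exp(t) / 6 + 5 / 6)&lt;br /&gt;
        best = min(best, log_mgf - t * n / 4)&lt;br /&gt;
    return math.exp(best)&lt;br /&gt;
&lt;br /&gt;
for n in (60, 600):&lt;br /&gt;
    print(n, chebyshev_bound(n), chernoff_bound(n))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;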
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;:&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Strong Law of Large Numbers, 15 points, &#039;&#039;&#039;Bonus Problem&#039;&#039;&#039;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Throughout this problem, we assume that &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; are jointly independent square-integrable real random variables with mean zero. We will prove the strong law of large numbers using the Kolmogorov maximal inequality.&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Kolmogorov maximal inequality&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left(\max_{1 \le i \le n} |S_i| \ge t \right) \le \frac{\mathbf{Var}(S_n)}{t^2}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Convergence of random series&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Var}(X_i) &amp;lt; \infty&amp;lt;/math&amp;gt;. Prove that the series &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} X_i&amp;lt;/math&amp;gt; is almost surely convergent.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Strong law of large numbers&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove the strong law of large numbers using previous propositions.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 10 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT).&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars.) A short simulation sketch illustrating this scaling appears after this list.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be i.i.d. copies of a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with the standard Cauchy distribution (i.e., the probability density function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;\frac{1}{\pi} \frac{1}{1+x^2}&amp;lt;/math&amp;gt;). Show that &amp;lt;math&amp;gt;\frac{S_n}{n \log n}&amp;lt;/math&amp;gt; converges in probability to the constant &amp;lt;math&amp;gt;\frac{2}{\pi}&amp;lt;/math&amp;gt; but is almost surely unbounded, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n |X_i|&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
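&lt;br /&gt;
As referenced above, a minimal simulation sketch (Python; the round counts shown are illustrative assumptions) of the ratio &amp;lt;math&amp;gt;S_n/(n \log_2 n)&amp;lt;/math&amp;gt;; individual runs fluctuate heavily because the rewards are heavy-tailed:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math, random&lt;br /&gt;
&lt;br /&gt;
def one_round():&lt;br /&gt;
    # Toss a fair coin until the first head; if it takes k tosses, the reward is 2**k dollars.&lt;br /&gt;
    k = 1&lt;br /&gt;
    while random.random() &amp;lt; 0.5:&lt;br /&gt;
        k += 1&lt;br /&gt;
    return 2 ** k&lt;br /&gt;
&lt;br /&gt;
for n in (10**3, 10**4, 10**5):&lt;br /&gt;
    s = sum(one_round() for _ in range(n))&lt;br /&gt;
    print(n, s / (n * math.log2(n)))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;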
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;)&lt;br /&gt;
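&lt;br /&gt;
A minimal sketch (Python; the test integrand exp(x) and the sample sizes are arbitrary illustrative choices) of the estimator &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from the statement:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math, random&lt;br /&gt;
&lt;br /&gt;
def mc_integral(f, n):&lt;br /&gt;
    # Monte Carlo estimate of the integral of f over [0,1] using n uniform samples.&lt;br /&gt;
    return sum(f(random.random()) for _ in range(n)) / n&lt;br /&gt;
&lt;br /&gt;
# Test with f(x) = exp(x); the exact integral over [0,1] is e - 1, about 1.71828.&lt;br /&gt;
for n in (10**2, 10**4, 10**6):&lt;br /&gt;
    print(n, mc_integral(math.exp, n))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;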
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Stirling&#039;s formula&amp;lt;/strong&amp;gt;]&lt;br /&gt;
By considering the central limit theorem for the sum of independent Poisson-distributed random variables, show that&lt;br /&gt;
&amp;lt;math&amp;gt;n! \sim \sqrt{2\pi n} \cdot \left(\frac{n}{\mathrm{e}}\right)^n&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13102</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13102"/>
		<updated>2025-04-27T15:34:19Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (LLN and CLT, 15 points + 10 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*The solution to each problem must show the complete working; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are additional problems (optional).&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., the uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density function of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039; a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;:&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Strong Law of Large Numbers, 15 points, &#039;&#039;&#039;Bonus Problem&#039;&#039;&#039;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Throughout this problem, we assume that &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; are jointly independent square-integrable real random variables with mean zero. We will prove the strong law of large numbers using the Kolmogorov maximal inequality.&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Kolmogorov maximal inequality&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left(\max_{1 \le i \le n} |S_i| \ge t \right) \le \frac{n \mathbf{Var}(X_1)}{t^2}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Convergence of random series&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Var}(X_i) &amp;lt; \infty&amp;lt;/math&amp;gt;. Prove that the series &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} X_i&amp;lt;/math&amp;gt; is almost surely convergent.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Strong law of large numbers&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove the strong law of large numbers using previous propositions.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 10 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT).&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,k}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be i.i.d. copies of a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with the standard Cauchy distribution (i.e., the probability density function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;\frac{1}{\pi} \frac{1}{1+x^2}&amp;lt;/math&amp;gt;). Show that &amp;lt;math&amp;gt;\frac{S_n}{n \log n}&amp;lt;/math&amp;gt; converges in probability to the constant &amp;lt;math&amp;gt;\frac{2}{\pi}&amp;lt;/math&amp;gt; but is almost surely unbounded, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n |X_i|&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Stirling&#039;s formula&amp;lt;/strong&amp;gt;]&lt;br /&gt;
By considering the central limit theorem for the sum of independent Poisson-distributed random variables, show that&lt;br /&gt;
&amp;lt;math&amp;gt;n! \sim \sqrt{2\pi n} \cdot \left(\frac{n}{\mathrm{e}}\right)^n&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13101</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13101"/>
		<updated>2025-04-27T15:34:06Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (LLN and CLT, 15 points + 10 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*The solution to each problem must show the complete working; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are additional problems (optional).&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., the uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density function of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039; a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;:&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Strong Law of Large Numbers, 15 points, &#039;&#039;&#039;Bonus Problem&#039;&#039;&#039;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Throughout this problem, we assume that &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; are jointly independent square-integrable real random variables with mean zero. We will prove the strong law of large numbers using the Kolmogorov maximal inequality.&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Kolmogorov maximal inequality&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left(\max_{1 \le i \le n} |S_i| \ge t \right) \le \frac{n \mathbf{Var}(X_1)}{t^2}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Convergence of random series&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Var}(X_i) &amp;lt; \infty&amp;lt;/math&amp;gt;. Prove that the series &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} X_i&amp;lt;/math&amp;gt; is almost surely convergent.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Strong law of large numbers&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove the strong law of large numbers using the previous propositions.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 10 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT).&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
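(Optional illustration, not part of the original problem: a minimal Python simulation of the game; the random seed and the round counts are arbitrary choices, and since the payoffs are heavy-tailed the printed ratio approaches 1 only slowly and keeps fluctuating.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def one_round():&lt;br /&gt;
    # toss a fair coin until the first head; if it takes k tosses, the payoff is 2**k&lt;br /&gt;
    k = 1&lt;br /&gt;
    while random.random() &amp;lt; 0.5:&lt;br /&gt;
        k += 1&lt;br /&gt;
    return 2 ** k&lt;br /&gt;
&lt;br /&gt;
random.seed(0)&lt;br /&gt;
for n in (10 ** 3, 10 ** 4, 10 ** 5):&lt;br /&gt;
    s_n = sum(one_round() for _ in range(n))&lt;br /&gt;
    print(n, s_n / (n * math.log2(n)))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;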
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be i.i.d. copies of a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with the standard Cauchy distribution (i.e., the probability density function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;\frac{1}{\pi} \frac{1}{1+x^2}&amp;lt;/math&amp;gt;). Show that &amp;lt;math&amp;gt;\frac{S_n}{n \log n}&amp;lt;/math&amp;gt; converges in probability to the constant &amp;lt;math&amp;gt;\frac{2}{\pi}&amp;lt;/math&amp;gt; but is almost surely unbounded, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n |X_i|&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;)&lt;br /&gt;
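(Optional illustration, not part of the original problem: a minimal Python sketch of the estimator applied to the test function f(x) = x^2, whose exact integral over [0,1] is 1/3; the sample sizes are arbitrary choices.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def mc_estimate(f, n):&lt;br /&gt;
    # average of f over n i.i.d. Uniform(0,1) samples&lt;br /&gt;
    return sum(f(random.random()) for _ in range(n)) / n&lt;br /&gt;
&lt;br /&gt;
random.seed(0)&lt;br /&gt;
for n in (10 ** 2, 10 ** 4, 10 ** 6):&lt;br /&gt;
    print(n, mc_estimate(lambda x: x * x, n))  # exact value is 1/3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;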
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Stirling&#039;s formula&amp;lt;/strong&amp;gt;]&lt;br /&gt;
By considering the central limit theorem for the sum of independent Poisson-distributed random variables, show that&lt;br /&gt;
&amp;lt;math&amp;gt;n! \sim \sqrt{2\pi n} \left(\frac{n}{\mathrm{e}}\right)^n&amp;lt;/math&amp;gt;&lt;br /&gt;
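(Optional numerical check, not part of the original problem and not a proof: the Python sketch below compares &amp;lt;math&amp;gt;n!&amp;lt;/math&amp;gt; with the claimed asymptotic expression for a few values of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;; the printed ratio should approach 1.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math&lt;br /&gt;
&lt;br /&gt;
# ratio of n! to sqrt(2*pi*n) * (n/e)**n&lt;br /&gt;
for n in (5, 10, 50, 100):&lt;br /&gt;
    stirling = math.sqrt(2 * math.pi * n) * (n / math.e) ** n&lt;br /&gt;
    print(n, math.factorial(n) / stirling)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;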
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13100</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13100"/>
		<updated>2025-04-27T15:26:17Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 3 (Modes of Convergence and Characteristic Function, 15 points) (Bonus problem) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete derivation; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are additional, optional problems.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density function of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find out the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039; a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
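(Optional illustration, not part of the original problem: a minimal Python simulation of the process above; the input value 0.5 and the number of trials are arbitrary choices.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def run_process(u):&lt;br /&gt;
    # one run of the process: keep adding Uniform(0,1) draws to x until x reaches u&lt;br /&gt;
    x, count = 0.0, 0&lt;br /&gt;
    while x &amp;lt; u:&lt;br /&gt;
        x += random.random()&lt;br /&gt;
        count += 1&lt;br /&gt;
    return count&lt;br /&gt;
&lt;br /&gt;
random.seed(0)&lt;br /&gt;
u, trials = 0.5, 10 ** 5&lt;br /&gt;
print(u, sum(run_process(u) for _ in range(trials)) / trials)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;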
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
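(Optional illustration, not part of the original problem: a minimal Python simulation sketch; it draws uniform points from the disk by rejection sampling and uses the elementary fact that all points lie in a common closed semicircle if and only if the largest circular gap between their angles is at least &amp;lt;math&amp;gt;\pi&amp;lt;/math&amp;gt;. The choice of 4 points and the number of trials are arbitrary; the estimate can be compared with the closed-form answer you derive.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def sample_point():&lt;br /&gt;
    # rejection sampling of a uniform point from the unit disk&lt;br /&gt;
    while True:&lt;br /&gt;
        x, y = 2 * random.random() - 1, 2 * random.random() - 1&lt;br /&gt;
        if x * x + y * y &amp;lt;= 1:&lt;br /&gt;
            return x, y&lt;br /&gt;
&lt;br /&gt;
def in_some_semicircle(points):&lt;br /&gt;
    # all points lie in a common closed semicircle iff the largest circular gap&lt;br /&gt;
    # between their angles around the centre is at least pi&lt;br /&gt;
    angles = sorted(math.atan2(y, x) for x, y in points)&lt;br /&gt;
    gaps = [b - a for a, b in zip(angles, angles[1:])]&lt;br /&gt;
    gaps.append(2 * math.pi - (angles[-1] - angles[0]))&lt;br /&gt;
    return max(gaps) &amp;gt;= math.pi&lt;br /&gt;
&lt;br /&gt;
random.seed(0)&lt;br /&gt;
n, trials = 4, 10 ** 5&lt;br /&gt;
hits = sum(in_some_semicircle([sample_point() for _ in range(n)]) for _ in range(trials))&lt;br /&gt;
print(n, hits / trials)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;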
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Strong Law of Large Numbers, 15 points, &#039;&#039;&#039;Bonus Problem&#039;&#039;&#039;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Throughout this problem, we assume that &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; are jointly independent square-integrable real random variables with mean zero. We will prove the strong law of large numbers via the Kolmogorov maximal inequality.&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Kolmogorov maximal inequality&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left(\max_{1 \le i \le n} |S_i| \ge t \right) \le \frac{n \mathbf{Var}(X_1)}{t^2}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Convergence of random series&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Var}(X_i) &amp;lt; \infty&amp;lt;/math&amp;gt;. Prove that the series &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} X_i&amp;lt;/math&amp;gt; is almost surely convergent.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Strong law of large numbers&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove the strong law of large numbers using the previous propositions.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 10 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT).&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be i.i.d. copies of a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with the standard Cauchy distribution (i.e., the probability density function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;\frac{1}{\pi} \frac{1}{1+x^2}&amp;lt;/math&amp;gt;). Show that &amp;lt;math&amp;gt;\frac{S_n}{n \log n}&amp;lt;/math&amp;gt; converges in probability to the constant &amp;lt;math&amp;gt;\frac{2}{\pi}&amp;lt;/math&amp;gt; but is almost surely unbounded, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n |X_i|&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show &amp;lt;math&amp;gt;2(\sqrt{S_n} - \sqrt{n}) \overset{D}{\to} \sigma N(0,1)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the statement in Problem 3.)&lt;br /&gt;
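(Optional illustration, not part of the original problem: a Python simulation sketch in which the &amp;lt;math&amp;gt;X_i&amp;lt;/math&amp;gt; are drawn from the exponential distribution with rate 1, so that &amp;lt;math&amp;gt;\sigma = 1&amp;lt;/math&amp;gt;; the printed sample mean and sample variance of &amp;lt;math&amp;gt;2(\sqrt{S_n} - \sqrt{n})&amp;lt;/math&amp;gt; should be close to 0 and 1 respectively. The sample size and the number of trials are arbitrary choices.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
random.seed(0)&lt;br /&gt;
n, trials = 10 ** 4, 1000&lt;br /&gt;
vals = []&lt;br /&gt;
for _ in range(trials):&lt;br /&gt;
    # Exp(1) has mean 1 and variance 1, so sigma equals 1 in the statement above&lt;br /&gt;
    s_n = sum(random.expovariate(1.0) for _ in range(n))&lt;br /&gt;
    vals.append(2 * (math.sqrt(s_n) - math.sqrt(n)))&lt;br /&gt;
mean = sum(vals) / trials&lt;br /&gt;
var = sum((v - mean) ** 2 for v in vals) / trials&lt;br /&gt;
print(mean, var)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;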
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13099</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13099"/>
		<updated>2025-04-27T15:11:31Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 3 (Modes of Convergence and Characteristic Function, 15 points) (Bonus problem) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete derivation; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are additional, optional problems.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density function of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find out the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039; a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Modes of Convergence and Characteristic Function, 15 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Convergence and Independence&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be a sequence of scalar random variables converging in probability to another random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. Suppose there is a random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; that is independent of &amp;lt;math&amp;gt;X_i&amp;lt;/math&amp;gt; for each individual &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is also independent of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 10 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT).&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be i.i.d. copies of a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with the standard Cauchy distribution (i.e., the probability density function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;\frac{1}{\pi} \frac{1}{1+x^2}&amp;lt;/math&amp;gt;). Show that &amp;lt;math&amp;gt;\frac{S_n}{n \log n}&amp;lt;/math&amp;gt; converges in probability to the constant &amp;lt;math&amp;gt;\frac{2}{\pi}&amp;lt;/math&amp;gt; but is almost surely unbounded, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n |X_i|&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show &amp;lt;math&amp;gt;2(\sqrt{S_n} - \sqrt{n}) \overset{D}{\to} \sigma N(0,1)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the statement in Problem 3.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13056</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13056"/>
		<updated>2025-04-07T07:04:01Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 3 (Probability meets distinct sums) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete derivation; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,...,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y= X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is a moment-generating function of a random variable, and write the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbf{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; &amp;lt;strong&amp;gt;[Union of events]&amp;lt;/strong&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be events with &amp;lt;math&amp;gt;\mathbf{Pr}[A_i] &amp;gt; 0&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le n&amp;lt;/math&amp;gt;. Define &amp;lt;math&amp;gt;a = \sum_{i=1}^n \mathbf{Pr}[A_i]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b = \sum_{1 \le i&amp;lt;j\le n} \mathbf{Pr}[A_i \cap A_j]&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}[A_1 \cup A_2 \cup \ldots \cup A_n] \ge \max\left\{2-\frac{a+2b}{a^2}, \frac{a^2}{a+2b}\right\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
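An optional numerical sanity check of the Cantelli and union-of-events bounds above (the three-point distribution and the dice events below are arbitrary illustrative choices, not part of the problems):&lt;br /&gt;
&amp;lt;pre&amp;gt;
# Sketch only; the distribution and the events are assumptions made for illustration.
import itertools

# Cantelli: a centered three-point variable with mean 0 and variance sigma^2 = 1.2.
vals, probs = [-1, 0, 2], [0.4, 0.4, 0.2]
mean = sum(v*q for v, q in zip(vals, probs))                # 0 by construction
var = sum((v - mean)**2 * q for v, q in zip(vals, probs))   # sigma^2
for lam in (0.5, 1.0, 1.5):
    tail = sum(q for v, q in zip(vals, probs) if v >= lam)
    print("lambda =", lam, " Pr[X >= lambda] =", tail, " bound =", var/(lam**2 + var))

# Union of events: A_1 = {1,2}, A_2 = {1,2,3}, A_3 = {1,2,3,4} on a fair die.
events = [set(range(1, k + 1)) for k in (2, 3, 4)]

def pr(S):
    return len(S)/6

a = sum(pr(A) for A in events)
b = sum(pr(A.intersection(B)) for A, B in itertools.combinations(events, 2))
print("Pr[union] =", pr(set().union(*events)),
      " lower bounds:", 2 - (a + 2*b)/a**2, a**2/(a + 2*b))
&amp;lt;/pre&amp;gt;&lt;br /&gt;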
&lt;br /&gt;
== Problem 3 (Probability meets distinct sums) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;f(n)&amp;lt;/math&amp;gt; denote the maximal &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; such that there exists a set of &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; distinct numbers &amp;lt;math&amp;gt;\{x_1,x_2,\ldots,x_m\}&amp;lt;/math&amp;gt;&lt;br /&gt;
in &amp;lt;math&amp;gt;[n] = \{1,2,\ldots,n\}&amp;lt;/math&amp;gt; all of whose subset sums are distinct. Namely, the sums &amp;lt;math&amp;gt;\sum_{i \in S} x_i&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;S \subseteq \{1,2,\ldots,m\}&amp;lt;/math&amp;gt;, are pairwise distinct.&lt;br /&gt;
Use the second moment method (i.e., Chebyshev&#039;s inequality) to show that &amp;lt;math&amp;gt;f(n) \le \log_2 n + \frac{1}{2} \log_2 \log_2 n + O(1)&amp;lt;/math&amp;gt;. (Remark: Erdős&#039; [https://www.erdosproblems.com/1 first open problem] asks if &amp;lt;math&amp;gt;f(n) \le \log_2 n + C&amp;lt;/math&amp;gt; for some universal constant &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&lt;br /&gt;
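An optional brute-force computation of &amp;lt;math&amp;gt;f(n)&amp;lt;/math&amp;gt; for tiny &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;, useful only for internalizing the definition (it is exponential-time and plays no role in the requested second-moment argument):&lt;br /&gt;
&amp;lt;pre&amp;gt;
# Sketch only: exhaustive search, feasible for very small n.
from itertools import combinations

def has_distinct_subset_sums(xs):
    sums = set()
    for r in range(len(xs) + 1):
        for S in combinations(xs, r):
            s = sum(S)
            if s in sums:
                return False
            sums.add(s)
    return True

def f(n):
    # Any subset of a sum-distinct set is sum-distinct, so stop at the first m that fails.
    best, m = 0, 1
    while True:
        if not any(has_distinct_subset_sums(c) for c in combinations(range(1, n + 1), m)):
            return best
        best, m = m, m + 1

for n in (2, 4, 8, 16):
    print(n, f(n))   # the powers of two in [n] already give m = floor(log2 n) + 1
&amp;lt;/pre&amp;gt;&lt;br /&gt;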
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13052</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13052"/>
		<updated>2025-04-06T11:43:01Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 3 (Probability meets distinct sums) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,...,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y= X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is a moment-generating function of a random variable, and write the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbf{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; &amp;lt;strong&amp;gt;[Union of events]&amp;lt;/strong&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be events with &amp;lt;math&amp;gt;\mathbf{Pr}[A_i] &amp;gt; 0&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le n&amp;lt;/math&amp;gt;. Define &amp;lt;math&amp;gt;a = \sum_{i=1}^n \mathbf{Pr}[A_i]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b = \sum_{1 \le i&amp;lt;j\le n} \mathbf{Pr}[A_i \cap A_j]&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}[A_1 \cup A_2 \cup \ldots \cup A_n] \ge \max\left\{2-\frac{a+2b}{a^2}, \frac{a^2}{a+2b}\right\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets distinct sums) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;f(n)&amp;lt;/math&amp;gt; denote the maximal &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; such that there exists a set of &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; distinct numbers &amp;lt;math&amp;gt;\{x_1,x_2,\ldots,x_m\}&amp;lt;/math&amp;gt;&lt;br /&gt;
in &amp;lt;math&amp;gt;[n] = \{1,2,\ldots,n\}&amp;lt;/math&amp;gt; all of whose subset sums are distinct. Namely, the sums &amp;lt;math&amp;gt;\sum_{i \in S} x_i&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;S \subseteq \{1,2,\ldots,m\}&amp;lt;/math&amp;gt;, are pairwise distinct.&lt;br /&gt;
Use the second moment method to show that &amp;lt;math&amp;gt;f(n) \le \log_2 n + \frac{1}{2} \log_2 \log_2 n + O(1)&amp;lt;/math&amp;gt;. (Remark: Erdős&#039; [https://www.erdosproblems.com/1 first open problem] asks if &amp;lt;math&amp;gt;f(n) \le \log_2 n + C&amp;lt;/math&amp;gt; for some universal constant &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13051</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13051"/>
		<updated>2025-04-06T11:39:18Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 3 (Probability meets distinct sums) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,...,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y= X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is a moment-generating function of a random variable, and write the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbf{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; &amp;lt;strong&amp;gt;[Union of events]&amp;lt;/strong&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be events with &amp;lt;math&amp;gt;\mathbf{Pr}[A_i] &amp;gt; 0&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le n&amp;lt;/math&amp;gt;. Define &amp;lt;math&amp;gt;a = \sum_{i=1}^n \mathbf{Pr}[A_i]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b = \sum_{1 \le i&amp;lt;j\le n} \mathbf{Pr}[A_i \cap A_j]&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}[A_1 \cup A_2 \cup \ldots \cup A_n] \ge \max\left\{2-\frac{a+2b}{a^2}, \frac{a^2}{a+2b}\right\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets distinct sums) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;f(n)&amp;lt;/math&amp;gt; denote the maximal &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; such that there exists a set of &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; distinct numbers&lt;br /&gt;
in &amp;lt;math&amp;gt;[n] = \{1,2,\ldots,n\}&amp;lt;/math&amp;gt; all of whose sums are distinct. &lt;br /&gt;
Use the second moment method to show that &amp;lt;math&amp;gt;f(n) \le \log_2 n + \frac{1}{2} \log_2 \log_2 n + O(1)&amp;lt;/math&amp;gt;. (Remark: Erdős&#039; [https://www.erdosproblems.com/1 first open problem] asks if &amp;lt;math&amp;gt;f(n) \le \log_2 n + C&amp;lt;/math&amp;gt; for some universal constant &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13050</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13050"/>
		<updated>2025-04-06T11:36:12Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 3 (Probability meets distinct sums) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,...,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y= X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is a moment-generating function of a random variable, and write the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbf{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; &amp;lt;strong&amp;gt;[Union of events]&amp;lt;/strong&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be events with &amp;lt;math&amp;gt;\mathbf{Pr}[A_i] &amp;gt; 0&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le n&amp;lt;/math&amp;gt;. Define &amp;lt;math&amp;gt;a = \sum_{i=1}^n \mathbf{Pr}[A_i]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b = \sum_{1 \le i&amp;lt;j\le n} \mathbf{Pr}[A_i \cap A_j]&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}[A_1 \cup A_2 \cup \ldots \cup A_n] \ge \max\left\{2-\frac{a+2b}{a^2}, \frac{a^2}{a+2b}\right\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets distinct sums) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;f(n)&amp;lt;/math&amp;gt; denote the maximal &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; such that there exists a set of &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; distinct numbers&lt;br /&gt;
in &amp;lt;math&amp;gt;[n] = \{1,2,\ldots,n\}&amp;lt;/math&amp;gt; all of whose sums are distinct. &lt;br /&gt;
Show that &amp;lt;math&amp;gt;f(n) \le \log_2 n + \frac{1}{2} \log_2 \log_2 n + O(1)&amp;lt;/math&amp;gt;. (Remark: Erdős&#039; [https://www.erdosproblems.com/1 first open problem] asks if &amp;lt;math&amp;gt;f(n) \le \log_2 n + C&amp;lt;/math&amp;gt; for some universal constant &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13049</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13049"/>
		<updated>2025-04-06T11:35:39Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,...,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y= X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is a moment-generating function of a random variable, and write the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbf{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; &amp;lt;strong&amp;gt;[Union of events]&amp;lt;/strong&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be events with &amp;lt;math&amp;gt;\mathbf{Pr}[A_i] &amp;gt; 0&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le n&amp;lt;/math&amp;gt;. Define &amp;lt;math&amp;gt;a = \sum_{i=1}^n \mathbf{Pr}[A_i]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b = \sum_{1 \le i&amp;lt;j\le n} \mathbf{Pr}[A_i \cap A_j]&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}[A_1 \cup A_2 \cup \ldots \cup A_n] \ge \max\left\{2-\frac{a+2b}{a^2}, \frac{a^2}{a+2b}\right\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets distinct sums) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;f(n)&amp;lt;/math&amp;gt; denote the maximal &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; such that there exists a set of &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; distinct numbers&lt;br /&gt;
in &amp;lt;math&amp;gt;[n] = \{1,2,\ldots,n\}&amp;lt;/math&amp;gt; all of whose sums are distinct. &lt;br /&gt;
Show that &amp;lt;math&amp;gt;f(n) \le \log_2 n + \frac{1}{2} \log_2 \log_2 n + O(1)&amp;lt;/math&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
(Remark: Erdős&#039; [https://www.erdosproblems.com/1 first open problem] asks if &amp;lt;math&amp;gt;f(n) \le \log_2 n + C&amp;lt;/math&amp;gt; for some universal constant &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13048</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13048"/>
		<updated>2025-04-06T11:29:14Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 3 (Probability meets graph theory) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,...,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y= X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is a moment-generating function of a random variable, and write the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbf{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; &amp;lt;strong&amp;gt;[Union of events]&amp;lt;/strong&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be events with &amp;lt;math&amp;gt;\mathbf{Pr}[A_i] &amp;gt; 0&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le n&amp;lt;/math&amp;gt;. Define &amp;lt;math&amp;gt;a = \sum_{i=1}^n \mathbf{Pr}[A_i]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b = \sum_{1 \le i&amp;lt;j\le n} \mathbf{Pr}[A_i \cap A_j]&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}[A_1 \cup A_2 \cup \ldots \cup A_n] \ge \max\left\{2-\frac{a+2b}{a^2}, \frac{a^2}{a+2b}\right\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets distinct sums) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;f(n)&amp;lt;/math&amp;gt; denote the maximal &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; such that there exists a set of &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; distinct numbers&lt;br /&gt;
in &amp;lt;math&amp;gt;[n] = \{1,2,\ldots,n\}&amp;lt;/math&amp;gt; all of whose sums are distinct. &lt;br /&gt;
Show that &amp;lt;math&amp;gt;f(n) \le \log_2 n + \frac{1}{2} \log_2 \log_2 n + O(1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13047</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13047"/>
		<updated>2025-04-06T11:10:23Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 2 (Inequalities) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,...,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y= X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that [math]G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}[/math] is a moment-generating function of a random variable, and write the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let [math]X[/math] and [math]Y[/math] be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation [math]\rho[/math]. Show that [math]\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}[/math]. (Hint: use the identity [math]\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)[/math].)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let [math]X[/math] and [math]Y[/math] be independent Bernoulli random variables with parameter [math]1/2[/math]. Show that [math]X+Y[/math] and [math]|X-Y|[/math] are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
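&lt;br /&gt;
The following is a small Python sanity check of the Variance (IV) identity above; it is a sketch only, assuming numpy is available, and the particular distributions chosen for &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; and the &amp;lt;math&amp;gt;X_i&amp;lt;/math&amp;gt; are illustrative.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Monte Carlo check of Var[X] = Var[X_1] E[N] + E[X_1]^2 Var[N]&lt;br /&gt;
# with the illustrative choices N = 1 + Poisson(3) and X_i ~ Bernoulli(0.3),&lt;br /&gt;
# so the predicted value is 0.21*4 + 0.09*3 = 1.11.&lt;br /&gt;
import numpy as np&lt;br /&gt;
&lt;br /&gt;
rng = np.random.default_rng(0)&lt;br /&gt;
trials = 200_000&lt;br /&gt;
N = 1 + rng.poisson(3.0, size=trials)      # positive integer-valued N&lt;br /&gt;
X = rng.binomial(N, 0.3).astype(float)     # sum of N i.i.d. Bernoulli(0.3) variables&lt;br /&gt;
print(X.var(), 0.21 * 4 + 0.09 * 3)        # the two printed numbers should be close&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;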
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbf{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; &amp;lt;strong&amp;gt;[Union of events]&amp;lt;/strong&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be events with &amp;lt;math&amp;gt;\mathbf{Pr}[A_i] &amp;gt; 0&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le n&amp;lt;/math&amp;gt;. Define &amp;lt;math&amp;gt;a = \sum_{i=1}^n \mathbf{Pr}[A_i]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b = \sum_{1 \le i&amp;lt;j\le n} \mathbf{Pr}[A_i \cap A_j]&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}[A_1 \cup A_2 \cup \ldots \cup A_n] \ge \max\left\{2-\frac{a+2b}{a^2}, \frac{a^2}{a+2b}\right\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;4-clique threshold&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Prove that &amp;lt;math&amp;gt;p = n^{-2/3}&amp;lt;/math&amp;gt; is the threshold probability for the existence of a 4-clique. &lt;br /&gt;
        Formally, you are required to show that &lt;br /&gt;
        &amp;lt;ol type=&amp;quot;a&amp;quot;&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = \omega(n^{-2/3})&amp;lt;/math&amp;gt;; (Hint: use Chebyshev&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = o(n^{-2/3})&amp;lt;/math&amp;gt;. (Hint: use Markov&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;/ol&amp;gt;&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13046</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13046"/>
		<updated>2025-04-06T10:57:27Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 2 (Inequalities) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,...,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y= X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that [math]G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}[/math] is a moment-generating function of a random variable, and write the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let [math]X[/math] and [math]Y[/math] be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation [math]\rho[/math]. Show that [math]\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}[/math]. (Hint: use the identity [math]\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)[/math].)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let [math]X[/math] and [math]Y[/math] be independent Bernoulli random variables with parameter [math]1/2[/math]. Show that [math]X+Y[/math] and [math]|X-Y|[/math] are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbf{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; &amp;lt;strong&amp;gt;[Union of events]&amp;lt;/strong&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be events with &amp;lt;math&amp;gt;\mathbf{Pr}[A_i] &amp;gt; 0&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le n&amp;lt;/math&amp;gt;. Define &amp;lt;math&amp;gt;a = \sum_{i=1}^n \mathbf{Pr}[A_i]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b = \sum_{1 \le i&amp;lt;j\le n} \mathbf{Pr}[A_i \cap A_j]&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}[A_1 \cup A_2 \cup \ldots \cup A_n] \ge \max\left\{2-\frac{a+2b}{a^2}, \frac{a^2}{a+b}\right\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;4-clique threshold&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Prove that &amp;lt;math&amp;gt;p = n^{-2/3}&amp;lt;/math&amp;gt; is the threshold probability for the existence of a 4-clique. &lt;br /&gt;
        Formally, you are required to show that &lt;br /&gt;
        &amp;lt;ol type=&amp;quot;a&amp;quot;&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = \omega(n^{-2/3})&amp;lt;/math&amp;gt;; (Hint: use Chebyshev&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = o(n^{-2/3})&amp;lt;/math&amp;gt;. (Hint: use Markov&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;/ol&amp;gt;&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13045</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13045"/>
		<updated>2025-04-06T10:50:13Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 2 (Inequalities) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,...,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y= X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that [math]G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}[/math] is a moment-generating function of a random variable, and write the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let [math]X[/math] and [math]Y[/math] be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation [math]\rho[/math]. Show that [math]\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}[/math]. (Hint: use the identity [math]\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)[/math].)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let [math]X[/math] and [math]Y[/math] be independent Bernoulli random variables with parameter [math]1/2[/math]. Show that [math]X+Y[/math] and [math]|X-Y|[/math] are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbf{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; &amp;lt;strong&amp;gt;[Union of events]&amp;lt;/strong&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be events. Define &amp;lt;math&amp;gt;a = \sum_{i=1}^n \mathbf{Pr}[A_i]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b = \sum_{1 \le i&amp;lt;j\le n} \mathbf{Pr}[A_i \cap A_j]&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}[A_1 \cup A_2 \cup \ldots \cup A_n] \ge \max\left\{2-\frac{a+2b}{a^2}, \frac{a^2}{a+b}\right\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;4-clique threshold&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Prove that &amp;lt;math&amp;gt;p = n^{-2/3}&amp;lt;/math&amp;gt; is the threshold probability for the existence of a 4-clique. &lt;br /&gt;
        Formally, you are required to show that &lt;br /&gt;
        &amp;lt;ol type=&amp;quot;a&amp;quot;&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = \omega(n^{-2/3})&amp;lt;/math&amp;gt;; (Hint: use Chebyshev&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = o(n^{-2/3})&amp;lt;/math&amp;gt;. (Hint: use Markov&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;/ol&amp;gt;&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13044</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13044"/>
		<updated>2025-04-06T10:49:52Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 2 (Inequalities) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,...,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y= X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that [math]G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}[/math] is a moment-generating function of a random variable, and write the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let [math]X[/math] and [math]Y[/math] be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation [math]\rho[/math]. Show that [math]\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}[/math]. (Hint: use the identity [math]\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)[/math].)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let [math]X[/math] and [math]Y[/math] be independent Bernoulli random variables with parameter [math]1/2[/math]. Show that [math]X+Y[/math] and [math]|X-Y|[/math] are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbf{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; &amp;lt;strong&amp;gt;[Union of events]&amp;lt;/strong&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be events. Define &amp;lt;math&amp;gt;a = \sum_{i=1}^n \mathbf{Pr}[A_i]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b = \sum_{1 \le i&amp;lt;j\le n} \mathbf{Pr}[A_i \cap A_j]&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}[A_1 \cup A_2 \cup \ldots A_n] \ge \max\left\{2-\frac{a+2b}{a^2}, \frac{a^2}{a+b}\right\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;4-clique threshold&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Prove that &amp;lt;math&amp;gt;p = n^{-2/3}&amp;lt;/math&amp;gt; is the threshold probability for the existence of a 4-clique. &lt;br /&gt;
        Formally, you are required to show that &lt;br /&gt;
        &amp;lt;ol type=&amp;quot;a&amp;quot;&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = \omega(n^{-2/3})&amp;lt;/math&amp;gt;; (Hint: use Chebyshev&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = o(n^{-2/3})&amp;lt;/math&amp;gt;. (Hint: use Markov&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;/ol&amp;gt;&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12956</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12956"/>
		<updated>2025-03-13T15:31:13Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 5 (Probability meets graph theory) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math]. (An inverse-transform sampling sketch illustrating this fact is given after this list.)&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distribution. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the expected number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
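&lt;br /&gt;
A minimal Python sketch of the inverse transform idea behind the Function of random variable (II) item above (numpy assumed; the Exponential(1) CDF is just an example of a continuous, strictly increasing &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# If X is uniform on (0,1) and F(y) = 1 - exp(-y), then Y = F^{-1}(X) = -log(1 - X)&lt;br /&gt;
# should have CDF F; empirical quantiles of Y are compared with those of F.&lt;br /&gt;
import numpy as np&lt;br /&gt;
&lt;br /&gt;
rng = np.random.default_rng(1)&lt;br /&gt;
u = rng.random(100_000)              # X ~ Uniform(0,1)&lt;br /&gt;
y = -np.log(1.0 - u)                 # Y = F^{-1}(X)&lt;br /&gt;
for q in (0.25, 0.5, 0.9):&lt;br /&gt;
    print(q, np.quantile(y, q), -np.log(1.0 - q))   # empirical vs. exact quantile&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;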
&lt;br /&gt;
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; in terms of &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability mass function (PMF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, including the first and (possible) second round of each coin. (A short simulation of this scheme is sketched after this problem list.)&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
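&lt;br /&gt;
A short Python simulation of the two-round tossing scheme from the PMF problem above (numpy assumed; n = 10 and p = 0.3 are illustrative values), which can be used to check a derived PMF or expectation numerically:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import numpy as np&lt;br /&gt;
&lt;br /&gt;
rng = np.random.default_rng(2)&lt;br /&gt;
n, p, trials = 10, 0.3, 200_000&lt;br /&gt;
first = rng.binomial(n, p, size=trials)    # heads in the first round&lt;br /&gt;
X = rng.binomial(first, p)                 # each first-round head is tossed again&lt;br /&gt;
Y = first + X                              # heads over all tosses&lt;br /&gt;
print(X.mean(), Y.mean())                  # empirical E[X] and E[Y]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;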
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution with range of values &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails. (A short simulation of this experiment is sketched after this problem list.)&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find out the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
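&lt;br /&gt;
A short Python simulation of the coin-in-pocket experiment from the Poisson problem above (numpy assumed; &amp;lt;math&amp;gt;\lambda = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 0.4&amp;lt;/math&amp;gt; are illustrative choices). Empirical summaries such as the covariance of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt; can be compared against whatever joint mass function is derived.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import numpy as np&lt;br /&gt;
&lt;br /&gt;
rng = np.random.default_rng(3)&lt;br /&gt;
lam, p, trials = 2.0, 0.4, 200_000&lt;br /&gt;
N = rng.poisson(lam, size=trials)   # number of coins in the pocket&lt;br /&gt;
X = rng.binomial(N, p)              # number of heads&lt;br /&gt;
Y = N - X                           # number of tails&lt;br /&gt;
print(X.mean(), Y.mean())&lt;br /&gt;
print(np.cov(X, Y))                 # empirical covariance matrix of (X, Y)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;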
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently of the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required; a rough Monte Carlo sketch in this spirit is given after this problem list.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
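&lt;br /&gt;
A rough Monte Carlo sketch in Python for the Expected Mex problem above (numpy assumed). It only stabilises the first couple of digits, so it serves as a sanity check for an exact computation rather than a way to reach the required accuracy of &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import numpy as np&lt;br /&gt;
&lt;br /&gt;
rng = np.random.default_rng(4)&lt;br /&gt;
samples = rng.geometric(0.5, size=(100_000, 100))   # Geo(1/2) values in {1, 2, 3, ...}&lt;br /&gt;
&lt;br /&gt;
def mex(row):&lt;br /&gt;
    seen = set(row)&lt;br /&gt;
    m = 1&lt;br /&gt;
    while m in seen:&lt;br /&gt;
        m += 1&lt;br /&gt;
    return m&lt;br /&gt;
&lt;br /&gt;
print(np.mean([mex(row) for row in samples]))       # rough estimate of E[mex]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;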
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.) A code sketch of this random procedure, for illustration only, is given after this problem list.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
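&lt;br /&gt;
A minimal sketch of the random permutation procedure from the hint of the &amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt; problem (the function name and the toy 4-cycle are illustrative choices, not part of the problem); it only samples one independent set, it does not prove the bound:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def random_greedy_independent_set(n, edges):&lt;br /&gt;
    # adjacency sets for an undirected graph on vertices 0..n-1&lt;br /&gt;
    adj = [set() for _ in range(n)]&lt;br /&gt;
    for u, v in edges:&lt;br /&gt;
        adj[u].add(v)&lt;br /&gt;
        adj[v].add(u)&lt;br /&gt;
    order = list(range(n))&lt;br /&gt;
    random.shuffle(order)            # uniformly random permutation of the vertices&lt;br /&gt;
    chosen, seen = [], set()&lt;br /&gt;
    for v in order:&lt;br /&gt;
        if seen.isdisjoint(adj[v]):  # no predecessor of v in the permutation is a neighbor of v&lt;br /&gt;
            chosen.append(v)&lt;br /&gt;
        seen.add(v)&lt;br /&gt;
    return chosen                    # always an independent set of the graph&lt;br /&gt;
&lt;br /&gt;
# toy usage: a 4-cycle on vertices 0, 1, 2, 3&lt;br /&gt;
print(random_greedy_independent_set(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Averaging the size of the returned set over many runs gives an empirical check of the claimed lower bound.&lt;br /&gt;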
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12953</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12953"/>
		<updated>2025-03-13T15:15:16Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (Linearity of Expectation) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every problem must be answered with a complete solution process; either Chinese or English is acceptable.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] Is it generally true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]? Is it ever true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]?&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; by &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF function. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF function, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability mass function (PMF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows head is tossed again. (If the coin shows tail, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possible) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&amp;lt;/p&amp;gt;&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math]. (A numerical illustration of this limit is given after this problem list.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find out the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
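&lt;br /&gt;
A quick numerical illustration of the limit in the &amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt; problem above (not a proof; the helper names and the parameter choices n = 10, k = 3, b/N = 0.3 are arbitrary choices of this sketch):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
from math import comb&lt;br /&gt;
&lt;br /&gt;
def hypergeom_pmf(N, b, n, k):&lt;br /&gt;
    # Pr(B = k): choose k of the b blue balls and n - k of the N - b red balls&lt;br /&gt;
    return comb(b, k) * comb(N - b, n - k) / comb(N, n)&lt;br /&gt;
&lt;br /&gt;
def binom_pmf(n, p, k):&lt;br /&gt;
    return comb(n, k) * p**k * (1 - p)**(n - k)&lt;br /&gt;
&lt;br /&gt;
n, k = 10, 3&lt;br /&gt;
for N in (50, 500, 5000, 50000):&lt;br /&gt;
    b = int(0.3 * N)                 # keep b/N fixed at 0.3 as N grows&lt;br /&gt;
    print(N, hypergeom_pmf(N, b, n, k), binom_pmf(n, b / N, k))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The two printed probabilities agree in more and more digits as N grows, which is exactly the convergence the problem asks you to prove.&lt;br /&gt;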
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Dominating set&amp;lt;/strong&amp;gt;]&lt;br /&gt;
A &#039;&#039;dominating set&#039;&#039; of vertices in an undirected graph &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; is a set &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; such that every vertex of&lt;br /&gt;
&amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; belongs to &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; or has a neighbor in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be an &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;-vertex graph with minimum degree &amp;lt;math&amp;gt;d &amp;gt; 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; has a dominating set with at most &amp;lt;math&amp;gt;\frac{n\left(1+\log(d+1)\right)}{d+1}&amp;lt;/math&amp;gt; vertices. (Hint: Consider a random vertex subset &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; by including each vertex independently with&lt;br /&gt;
probability &amp;lt;math&amp;gt;p := \log(d + 1)/(d + 1)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12952</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12952"/>
		<updated>2025-03-13T15:13:02Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (Linearity of Expectation) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every problem must be answered with a complete solution process; either Chinese or English is acceptable.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] Is it generally true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]? Is it ever true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]?&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; by &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF function. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF function, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability mass function (PMF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows head is tossed again. (If the coin shows tail, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possible) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&amp;lt;/p&amp;gt;&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find out the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (Linearity of Expectation) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Dominating set&amp;lt;/strong&amp;gt;]&lt;br /&gt;
A &#039;&#039;dominating set&#039;&#039; of vertices in an undirected graph &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; is a set &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; such that every vertex of&lt;br /&gt;
&amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; belongs to &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; or has a neighbor in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be an &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;-vertex graph with minimum degree &amp;lt;math&amp;gt;d &amp;gt; 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; has a dominating set with at most &amp;lt;math&amp;gt;\frac{n\left(1+\log(d+1)\right)}{d+1}&amp;lt;/math&amp;gt; vertices. (Hint: Consider a random vertex subset &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; by including each vertex independently with&lt;br /&gt;
probability &amp;lt;math&amp;gt;p := \log(d + 1)/(d + 1)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12951</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12951"/>
		<updated>2025-03-13T15:09:49Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (Linearity of Expectation, 12 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every problem must be answered with a complete solution process; either Chinese or English is acceptable.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] Is it generally true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]? Is it ever true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]?&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; by &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF function. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF function, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability mass function (PMF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) the second round for each coin.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&amp;lt;/p&amp;gt;&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution with range of values &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find out the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (Linearity of Expectation) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph without self-loops or multiple edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Find the expected value of the number of times the operation is done.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.) A numerical sketch is given after this list.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
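&lt;br /&gt;
&#039;&#039;Remark (editorial, not part of the problem).&#039;&#039; A minimal numerical sketch for the last item above, assuming &amp;lt;math&amp;gt;\mathbf{Pr}(X_i = v) = 2^{-v}&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;v \ge 1&amp;lt;/math&amp;gt; and using the identity &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}] = \sum_{k \ge 0} \mathbf{Pr}(\mathrm{mex} &amp;gt; k)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex} &amp;gt; k&amp;lt;/math&amp;gt; holds exactly when each of &amp;lt;math&amp;gt;1,\ldots,k&amp;lt;/math&amp;gt; appears. The truncation level below is heuristic; bounding the neglected tail against the &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt; tolerance is still part of the exercise.&lt;br /&gt;
 from itertools import combinations&lt;br /&gt;
 n, trunc = 100, 20   # number of samples; heuristic truncation level for k&lt;br /&gt;
 def prob_all_appear(k):&lt;br /&gt;
     # Pr(each of the values 1..k appears among the n samples), by inclusion-exclusion&lt;br /&gt;
     total = 0.0&lt;br /&gt;
     for r in range(k + 1):&lt;br /&gt;
         for missing in combinations(range(1, k + 1), r):&lt;br /&gt;
             p_miss = sum(0.5 ** v for v in missing)   # mass of the values forced to be absent&lt;br /&gt;
             total += (-1) ** r * (1.0 - p_miss) ** n&lt;br /&gt;
     return total&lt;br /&gt;
 # E[mex] is the sum over k of Pr(mex exceeds k), and mex exceeds k iff 1..k all appear&lt;br /&gt;
 print(sum(prob_all_appear(k) for k in range(trunc + 1)))&lt;br /&gt;
Inclusion-exclusion is only one possible route; a recursion or a carefully controlled Monte Carlo estimate may be preferable, and the choice is left to the reader.&lt;br /&gt;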
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Dominating set&amp;lt;/strong&amp;gt;]&lt;br /&gt;
A &#039;&#039;dominating set&#039;&#039; of vertices in an undirected graph &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; is a set &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; such that every vertex of&lt;br /&gt;
&amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; belongs to &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; or has a neighbor in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be an &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;-vertex graph with minimum degree &amp;lt;math&amp;gt;d &amp;gt; 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; has a dominating set with at most &amp;lt;math&amp;gt;\frac{n\left(1+\log(d+1)\right)}{d+1}&amp;lt;/math&amp;gt; vertices. (Hint: Consider a random vertex subset &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; by including each vertex independently with&lt;br /&gt;
probability &amp;lt;math&amp;gt;p := \log(d + 1)/(d + 1)&amp;lt;/math&amp;gt;.) A short calculation expanding this hint is given after this list.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
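&lt;br /&gt;
&#039;&#039;Remark (editorial, expanding the hint of the last item; not a full solution).&#039;&#039; With &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; chosen as in the hint and &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; the set of vertices that are neither in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; nor adjacent to a vertex of &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;, the union &amp;lt;math&amp;gt;S \cup T&amp;lt;/math&amp;gt; is a dominating set, and since every vertex has at least &amp;lt;math&amp;gt;d&amp;lt;/math&amp;gt; neighbors,&lt;br /&gt;
&amp;lt;math&amp;gt;\mathbf{E}\left[|S| + |T|\right] \le np + n(1-p)^{d+1} \le np + n e^{-p(d+1)} = \frac{n\left(1+\log(d+1)\right)}{d+1}.&amp;lt;/math&amp;gt;&lt;br /&gt;
Some outcome therefore attains at most this value; justifying each step above is exactly what the problem asks for.&lt;br /&gt;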
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12947</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12947"/>
		<updated>2025-03-13T14:58:54Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (Linearity of Expectation, 12 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Show that, if &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; are random variables, then so are &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] Is it generally true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]? Is it ever true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]?&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; by &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF function. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF function, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability mass function (PMF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) the second round for each coin.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&amp;lt;/p&amp;gt;&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution with range of values &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find out the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds? A simulation sketch of this process is given after this list.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
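&lt;br /&gt;
&#039;&#039;Remark (editorial, not part of the problem).&#039;&#039; A Monte Carlo sketch of one reading of the handshake process in the second item (hands are paired uniformly at random, and a person may end up holding their own two hands); the function name and the trial count are arbitrary choices made for this sketch, and the estimate is only a sanity check for a closed-form answer.&lt;br /&gt;
 import random&lt;br /&gt;
 def cycles_after_handshakes(n):&lt;br /&gt;
     # hands are numbered 0..2n-1, and hand h belongs to person h // 2&lt;br /&gt;
     idle = list(range(2 * n))&lt;br /&gt;
     parent = list(range(n))             # union-find structure over the n people&lt;br /&gt;
     def find(x):&lt;br /&gt;
         while parent[x] != x:&lt;br /&gt;
             parent[x] = parent[parent[x]]&lt;br /&gt;
             x = parent[x]&lt;br /&gt;
         return x&lt;br /&gt;
     cycles = 0&lt;br /&gt;
     for _ in range(n):                  # n rounds of handshakes&lt;br /&gt;
         a, b = random.sample(idle, 2)   # two distinct idle hands, chosen uniformly&lt;br /&gt;
         idle.remove(a)&lt;br /&gt;
         idle.remove(b)&lt;br /&gt;
         ra, rb = find(a // 2), find(b // 2)&lt;br /&gt;
         if ra == rb:                    # joining two hands already in one chain closes a cycle&lt;br /&gt;
             cycles += 1&lt;br /&gt;
         else:&lt;br /&gt;
             parent[ra] = rb&lt;br /&gt;
     return cycles&lt;br /&gt;
 n, trials = 3, 100000&lt;br /&gt;
 print(sum(cycles_after_handshakes(n) for _ in range(trials)) / trials)&lt;br /&gt;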
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.) A code sketch of this procedure is given after this list.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Dominating set&amp;lt;/strong&amp;gt;]&lt;br /&gt;
A &#039;&#039;dominating set&#039;&#039; of vertices in an undirected graph &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; is a set &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; such that every vertex of&lt;br /&gt;
&amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; belongs to &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; or has a neighbor in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be an &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;-vertex graph with minimum degree &amp;lt;math&amp;gt;d &amp;gt; 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; has a dominating set with at most &amp;lt;math&amp;gt;\frac{n\left(1+\log(d+1)\right)}{d+1}&amp;lt;/math&amp;gt; vertices. (Hint: Consider a random vertex subset &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; by including each vertex independently with&lt;br /&gt;
probability &amp;lt;math&amp;gt;p := \log(d + 1)/(d + 1)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
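&lt;br /&gt;
&#039;&#039;Remark (editorial, not part of the problem).&#039;&#039; The random procedure described in the hint of the Turán item, written out as code for concreteness; the adjacency representation (a dictionary mapping each vertex to the set of its neighbors) and the toy graph are assumptions made only for this sketch.&lt;br /&gt;
 import random&lt;br /&gt;
 def hint_independent_set(adj):&lt;br /&gt;
     order = list(adj)&lt;br /&gt;
     random.shuffle(order)                # a uniformly random permutation of the vertices&lt;br /&gt;
     chosen, earlier = set(), set()&lt;br /&gt;
     for v in order:&lt;br /&gt;
         if earlier.isdisjoint(adj[v]):   # no neighbor of v appears earlier in the permutation&lt;br /&gt;
             chosen.add(v)&lt;br /&gt;
         earlier.add(v)&lt;br /&gt;
     return chosen&lt;br /&gt;
 # toy example: a path on four vertices&lt;br /&gt;
 adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}&lt;br /&gt;
 print(hint_independent_set(adj))&lt;br /&gt;
The set returned is always independent, and averaging its size over the random permutation is what the hint suggests comparing with &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;.&lt;br /&gt;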
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12946</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12946"/>
		<updated>2025-03-13T14:55:56Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: Created page with &amp;quot;*每道题目的解答都要有完整的解题过程，中英文不限。  *我们推荐大家使用LaTeX, markdown等对作业进行排版。  == Assumption throughout Problem Set 2== &amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;  &amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;  &amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural l...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Show that, if &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; are random variables, then so are &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] Is it generally true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]? Is it ever true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]?&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; by &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF function. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF function, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability mass function (PMF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) the second round for each coin.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&amp;lt;/p&amp;gt;&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution with range of values &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find out the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Paper Cutting&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a rectangular piece of paper divided into &amp;lt;math&amp;gt;H \times W&amp;lt;/math&amp;gt; squares, where two of those squares are painted black and the rest are painted white. If we let &amp;lt;math&amp;gt;(i,j)&amp;lt;/math&amp;gt; denote the square at the &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt;-th row and &amp;lt;math&amp;gt;j&amp;lt;/math&amp;gt;-th column, the squares painted black are &amp;lt;math&amp;gt;(h_1,w_1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;(h_2,w_2)&amp;lt;/math&amp;gt;. Bob will repeat the following operation to cut the piece of paper: &lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
Assume that we have &amp;lt;math&amp;gt;h \times w&amp;lt;/math&amp;gt; squares remaining. There are &amp;lt;math&amp;gt;(h-1)&amp;lt;/math&amp;gt; horizontal lines and &amp;lt;math&amp;gt;(w-1)&amp;lt;/math&amp;gt; vertical lines that are parallel to the edges of the piece and pass the borders of the squares. He chooses one of these lines uniformly at random and cuts the piece into two along that line. Then, if the two black squares are on the same piece, he throws away the other piece and continues the process; otherwise, he ends the process. &lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Find the expected value of the number of times Bob cuts a piece of paper until he ends the process. A simulation sketch of this process is given after this list.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
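&lt;br /&gt;
&#039;&#039;Remark (editorial, not part of the problem).&#039;&#039; A Monte Carlo sketch of the paper-cutting process, mainly to make the statement concrete and to cross-check a closed-form answer on small instances; the helper name, the example grid size and the black-square positions below are arbitrary choices made for this sketch.&lt;br /&gt;
 import random&lt;br /&gt;
 def cuts_until_separated(h, w, r1, c1, r2, c2):&lt;br /&gt;
     # one run of the process; the black squares sit at (r1, c1) and (r2, c2), 1-indexed&lt;br /&gt;
     cuts = 0&lt;br /&gt;
     while True:&lt;br /&gt;
         lines = [(0, i) for i in range(1, h)] + [(1, j) for j in range(1, w)]&lt;br /&gt;
         axis, pos = random.choice(lines)     # uniform over the (h-1)+(w-1) cut lines&lt;br /&gt;
         cuts += 1&lt;br /&gt;
         if axis == 0:                        # horizontal cut below row pos&lt;br /&gt;
             first = r1 in range(1, pos + 1)&lt;br /&gt;
             second = r2 in range(1, pos + 1)&lt;br /&gt;
             if first != second:&lt;br /&gt;
                 return cuts                  # the black squares are now on different pieces&lt;br /&gt;
             if first:&lt;br /&gt;
                 h = pos                      # keep the upper piece&lt;br /&gt;
             else:&lt;br /&gt;
                 h, r1, r2 = h - pos, r1 - pos, r2 - pos&lt;br /&gt;
         else:                                # vertical cut to the right of column pos&lt;br /&gt;
             first = c1 in range(1, pos + 1)&lt;br /&gt;
             second = c2 in range(1, pos + 1)&lt;br /&gt;
             if first != second:&lt;br /&gt;
                 return cuts&lt;br /&gt;
             if first:&lt;br /&gt;
                 w = pos                      # keep the left piece&lt;br /&gt;
             else:&lt;br /&gt;
                 w, c1, c2 = w - pos, c1 - pos, c2 - pos&lt;br /&gt;
 trials = 100000&lt;br /&gt;
 print(sum(cuts_until_separated(4, 3, 1, 1, 3, 2) for _ in range(trials)) / trials)&lt;br /&gt;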
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
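&lt;br /&gt;
The inequality can be checked empirically on any concrete graph. A minimal Python sketch (the example graph, the seed and the edge probability are arbitrary choices):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def expected_degrees(adj):&lt;br /&gt;
    # E[d_Y]: degree of a uniform vertex; E[d_Z]: degree of a uniform neighbor of Y&lt;br /&gt;
    n = len(adj)&lt;br /&gt;
    e_dy = sum(len(adj[v]) for v in range(n)) / n&lt;br /&gt;
    e_dz = sum(sum(len(adj[u]) for u in adj[v]) / len(adj[v]) for v in range(n)) / n&lt;br /&gt;
    return e_dy, e_dz&lt;br /&gt;
&lt;br /&gt;
random.seed(0)&lt;br /&gt;
n = 12&lt;br /&gt;
adj = [set() for _ in range(n)]&lt;br /&gt;
for v in range(n):&lt;br /&gt;
    for u in range(v):&lt;br /&gt;
        if random.random() &amp;lt; 0.4:&lt;br /&gt;
            adj[v].add(u)&lt;br /&gt;
            adj[u].add(v)&lt;br /&gt;
for v in range(n):&lt;br /&gt;
    if not adj[v]:                 # patch isolated vertices so the model applies&lt;br /&gt;
        adj[v].add((v + 1) % n)&lt;br /&gt;
        adj[(v + 1) % n].add(v)&lt;br /&gt;
&lt;br /&gt;
print(expected_degrees(adj))       # the second number should be at least the first&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;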
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
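&lt;br /&gt;
The random procedure described in the hint is straightforward to implement; the Python sketch below (with an arbitrary example graph) compares the average size of &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def random_greedy_independent_set(adj):&lt;br /&gt;
    # keep v iff every neighbor of v comes after v in a uniformly random permutation&lt;br /&gt;
    n = len(adj)&lt;br /&gt;
    order = list(range(n))&lt;br /&gt;
    random.shuffle(order)&lt;br /&gt;
    position = [0] * n&lt;br /&gt;
    for idx, v in enumerate(order):&lt;br /&gt;
        position[v] = idx&lt;br /&gt;
    return [v for v in range(n) if all(position[u] &amp;gt; position[v] for u in adj[v])]&lt;br /&gt;
&lt;br /&gt;
adj = [{1, 2}, {0, 2}, {0, 1, 3}, {2}]      # a triangle with a pendant vertex&lt;br /&gt;
trials = 20000&lt;br /&gt;
avg = sum(len(random_greedy_independent_set(adj)) for _ in range(trials)) / trials&lt;br /&gt;
bound = sum(1 / (len(adj[v]) + 1) for v in range(len(adj)))&lt;br /&gt;
print(avg, bound)      # compare the empirical average with the claimed lower bound&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;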
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Dominating set&amp;lt;/strong&amp;gt;]&lt;br /&gt;
A &#039;&#039;dominating set&#039;&#039; of vertices in an undirected graph &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; is a set &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; such that every vertex of&lt;br /&gt;
&amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; belongs to &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; or has a neighbor in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be an &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;-vertex graph with minimum degree &amp;lt;math&amp;gt;d &amp;gt; 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; has a dominating set with at most &amp;lt;math&amp;gt;\frac{n\left(1+\log(d+1)\right)}{d+1}&amp;lt;/math&amp;gt; vertices. (Hint: Consider a random vertex subset &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; by including each vertex independently with&lt;br /&gt;
probability &amp;lt;math&amp;gt;p := \log(d + 1)/(d + 1)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
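&lt;br /&gt;
For the Dominating set item, one way to check the bound empirically is to sample the random set &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; from the hint and then add every vertex that &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; fails to dominate; this augmentation step is only an illustrative choice here, not something the hint prescribes. A Python sketch with an arbitrary example graph:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def random_dominating_set(adj, d):&lt;br /&gt;
    n = len(adj)&lt;br /&gt;
    p = math.log(d + 1) / (d + 1)&lt;br /&gt;
    S = {v for v in range(n) if random.random() &amp;lt; p}&lt;br /&gt;
    # add every vertex that is neither in S nor adjacent to a vertex of S&lt;br /&gt;
    missed = {v for v in range(n) if v not in S and adj[v].isdisjoint(S)}&lt;br /&gt;
    return S | missed&lt;br /&gt;
&lt;br /&gt;
n, d = 30, 2&lt;br /&gt;
adj = [{(v - 1) % n, (v + 1) % n} for v in range(n)]      # a cycle: minimum degree 2&lt;br /&gt;
sizes = [len(random_dominating_set(adj, d)) for _ in range(2000)]&lt;br /&gt;
print(sum(sizes) / len(sizes), n * (1 + math.log(d + 1)) / (d + 1))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;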
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
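&lt;br /&gt;
A quick simulation of &amp;lt;math&amp;gt;R_n/n&amp;lt;/math&amp;gt; against &amp;lt;math&amp;gt;|2p-1|&amp;lt;/math&amp;gt; (a sketch only; all parameters below are arbitrary choices):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def walk_range(n, p):&lt;br /&gt;
    # number of distinct values taken by S_0, S_1, ..., S_n&lt;br /&gt;
    s, visited = 0, {0}&lt;br /&gt;
    for _ in range(n):&lt;br /&gt;
        s += 1 if random.random() &amp;lt; p else -1&lt;br /&gt;
        visited.add(s)&lt;br /&gt;
    return len(visited)&lt;br /&gt;
&lt;br /&gt;
for p in (0.5, 0.7, 0.9):&lt;br /&gt;
    n, trials = 20000, 200&lt;br /&gt;
    avg = sum(walk_range(n, p) for _ in range(trials)) / trials&lt;br /&gt;
    print(p, avg / n, abs(2 * p - 1))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;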
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
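&lt;br /&gt;
Similarly, one can watch &amp;lt;math&amp;gt;\mathbf{E}[|S_n|]/\sqrt{n}&amp;lt;/math&amp;gt; stabilize numerically; the limiting constant &amp;lt;math&amp;gt;\sqrt{2/\pi}&amp;lt;/math&amp;gt; is a known fact but is not needed for the &amp;lt;math&amp;gt;\Theta(\sqrt{n})&amp;lt;/math&amp;gt; claim. A sketch (parameters arbitrary):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def abs_endpoint(n):&lt;br /&gt;
    # |S_n| for the symmetric walk (p = 1/2)&lt;br /&gt;
    return abs(sum(1 if random.getrandbits(1) else -1 for _ in range(n)))&lt;br /&gt;
&lt;br /&gt;
for n in (100, 400, 1600, 6400):&lt;br /&gt;
    trials = 4000&lt;br /&gt;
    est = sum(abs_endpoint(n) for _ in range(trials)) / trials&lt;br /&gt;
    print(n, est / math.sqrt(n))    # should hover near a constant (roughly 0.8)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;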
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12945</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12945"/>
		<updated>2025-03-13T14:47:27Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = Office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (odd weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Friday, 2pm-3pm &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039; (2nd edition, revised)&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; translated by 郑忠国 and 童行伟; 人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** Monday: 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
** Wednesday (odd weeks): 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* Friday 2pm-3pm, 计算机系 804（尹一通）&lt;br /&gt;
:* Friday 2pm-3pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ group&#039;&#039;&#039;: 974742320 (to apply to join, provide your name, department, and student ID)&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
The course consists of three major parts:&lt;br /&gt;
* &#039;&#039;&#039;Classical probability theory&#039;&#039;&#039;: probability spaces, random variables and their numerical characteristics, multivariate and continuous random variables, limit theorems, etc.&lt;br /&gt;
* &#039;&#039;&#039;Probability and computing&#039;&#039;&#039;: concentration of measure, the probabilistic method, and selected topics on discrete stochastic processes&lt;br /&gt;
* &#039;&#039;&#039;Mathematical statistics&#039;&#039;&#039;: parameter estimation, hypothesis testing, Bayesian estimation, linear regression, and other concepts of statistical inference&lt;br /&gt;
&lt;br /&gt;
For the first two parts, students are expected to have a clear command of the basic concepts, a deep understanding of the key phenomena and laws and the principles behind them, and the ability to apply the methods flexibly to solve related problems. For the third part, students are expected to be familiar with several basic concepts of mathematical statistics, as well as with typical statistical models and statistical inference problems.&lt;br /&gt;
&lt;br /&gt;
Through the training in this course, students should become fluent in the language of probability, learn to use probabilistic thinking to understand and model the real world, and be able to wield the mathematical tools of probability to analyze and solve problems in their own field.&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论 (2nd edition, revised), by Dimitri P. Bertsekas and John N. Tsitsiklis, translated by 郑忠国 and 童行伟; 人民邮电出版社 (2022).&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* Course grade: there will be several homework assignments and one final exam. The final grade will be determined by combining the homework scores and the final exam score.&lt;br /&gt;
* Late submission: if, for a special reason, you cannot finish an assignment on time, please contact the instructor in advance with a proper justification; otherwise late homework will not be accepted.&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
Academic integrity is the most basic professional and ethical baseline for all students and scholars engaged in academic work. This course will spare no effort to uphold the norms of academic integrity, and violations of this baseline will not be tolerated.&lt;br /&gt;
&lt;br /&gt;
Principles for completing assignments: any work that bears your name must be your own contribution. Discussion is allowed while working on assignments, provided that all participants in the discussion are at a comparable stage of completion. However, the execution of the key ideas and the writing of the submitted text must be done independently, and everyone who took part in the discussion must be acknowledged in the submission. Discussion and acknowledgement that follow these rules will not affect your score. No other form of collaboration is allowed, in particular no &amp;quot;discussion&amp;quot; with classmates who have already finished the assignment.&lt;br /&gt;
&lt;br /&gt;
This course takes a zero-tolerance attitude towards plagiarism. When completing assignments, directly copying text from the work of others (publications, online materials, other students&#039; homework, etc.), as well as copying key ideas or key elements, will be regarded as plagiarism in the sense of the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]. Plagiarists will have their grade cancelled. If mutual copying is discovered, &amp;lt;font color=red&amp;gt; the grades of both the copying party and the copied party will be cancelled&amp;lt;/font&amp;gt;. Please therefore take the initiative to prevent your homework from being copied by others.&lt;br /&gt;
&lt;br /&gt;
Academic integrity concerns both a student&#039;s personal character and the proper functioning of the entire educational system. Committing academic misconduct for the sake of a few points not only turns you into a cheater, but also renders the honest efforts of others meaningless. Let us work together to maintain an environment of integrity.&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 1|Problem Set 1]]  Please submit to [mailto:pr_nju@163.com pr_nju@163.com] before the lecture on 2025/3/10 (10am UTC+8) (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第一次作业提交名单|Submission list for Problem Set 1]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 2|Problem Set 2]]  Please submit to [mailto:pr_nju@163.com pr_nju@163.com] before the lecture on --- (10am UTC+8) (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A2.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Intro.pdf Course introduction]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/ProbSpace.pdf Probability space]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Entropy and volume of Hamming balls|Entropy and volume of Hamming balls]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/RandVar.pdf Random variables]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 2&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 2, Sections 3.1~3.5, 3.7&#039;&#039;&#039;&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[MU] Chapter 2&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12910</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12910"/>
		<updated>2025-02-20T06:49:25Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion, 8 points)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability that a uniform random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; is surjective (满射) is &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: This is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You can try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-probability-space&amp;quot;&amp;gt;Problem 2 (Probability space, 12 points)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, that there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for any interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (Actually, such a probability measure does exist and is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;.) &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field (I)&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\sigma(S) := \bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;S,T \subseteq 2^{\Omega}&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\sigma(S) = \sigma(T)&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;S \subseteq \sigma(T)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;T \subseteq \sigma(S)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field. Let &amp;lt;math&amp;gt;\mathcal{F}_1 \subseteq \mathcal{F}_2\subseteq \mathcal{F}_3\subseteq\ldots&amp;lt;/math&amp;gt; be a sequence of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields. Is &amp;lt;math&amp;gt;\bigcup_{i=1}^{+\infty} \mathcal{F}_i&amp;lt;/math&amp;gt; a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Projection&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;T \subseteq \Omega&amp;lt;/math&amp;gt; be a subset. Show that &amp;lt;math&amp;gt;\{S \cap T \mid S \in \mathcal{F}\}&amp;lt;/math&amp;gt; is a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (the complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable, and let &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-probability-space&amp;quot;&amp;gt;Problem 3 (Birthday paradox)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Please design a &amp;lt;strong&amp;gt;randomized algorithm using the birthday paradox&amp;lt;/strong&amp;gt; that solves the following problem in &amp;lt;math&amp;gt;\mathrm{poly}(n) \cdot 2^{n/2}&amp;lt;/math&amp;gt; time with high probability (for example, &amp;lt;math&amp;gt;0.99&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; is sufficiently large). Please provide a detailed error analysis as well. (&amp;lt;strong&amp;gt;WARNING:&amp;lt;/strong&amp;gt; You will &amp;lt;strong&amp;gt;NOT&amp;lt;/strong&amp;gt; receive any points if you solve this task using Gaussian elimination)&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
You are given an integer sequence &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_{100n}&amp;lt;/math&amp;gt; of length &amp;lt;math&amp;gt;100 n&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;0 \le a_i &amp;lt; 2^n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le 100 n&amp;lt;/math&amp;gt;. Please find a non-empty subset &amp;lt;math&amp;gt;S \subseteq \{1,2,\ldots,100n\}&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;\bigoplus_{i \in S} a_i = 0&amp;lt;/math&amp;gt;, i.e., the exclusive or of the elements whose indices are in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; equals &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
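&lt;br /&gt;
The Python sketch below only illustrates the birthday-paradox phenomenon that such an algorithm relies on, namely that collisions become likely once roughly &amp;lt;math&amp;gt;2^{n/2}&amp;lt;/math&amp;gt; uniform samples are drawn from a universe of size &amp;lt;math&amp;gt;2^n&amp;lt;/math&amp;gt;; it is not the required algorithm, and all parameters below are arbitrary choices.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def collision_probability(n_bits, m, trials=1000):&lt;br /&gt;
    # empirical probability that m uniform samples from a universe of size 2**n_bits collide&lt;br /&gt;
    hits = 0&lt;br /&gt;
    for _ in range(trials):&lt;br /&gt;
        seen = set()&lt;br /&gt;
        for _ in range(m):&lt;br /&gt;
            x = random.getrandbits(n_bits)&lt;br /&gt;
            if x in seen:&lt;br /&gt;
                hits += 1&lt;br /&gt;
                break&lt;br /&gt;
            seen.add(x)&lt;br /&gt;
    return hits / trials&lt;br /&gt;
&lt;br /&gt;
n_bits = 24&lt;br /&gt;
for c in (1, 2, 4):&lt;br /&gt;
    print(c, collision_probability(n_bits, c * 2 ** (n_bits // 2)))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;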
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that events &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is chosen to be removed with equal probability). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each time uniformly and independently at random, without replacement (which means we do not put the chosen ball back after each drawing). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let&amp;amp;#39;s consider a series of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; outputs &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability  &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she told you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial while the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trial is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find such &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcomes most probable, namely you need to solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A [[wikipedia:Tournament_(graph_theory)|tournament]] can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once, and no draws occur. Given two players (vertices) &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; (and vice versa). We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. For example, a triangle &amp;lt;math&amp;gt;x\rightarrow y \rightarrow z \rightarrow x&amp;lt;/math&amp;gt; has property &amp;lt;math&amp;gt;S_1&amp;lt;/math&amp;gt; but not &amp;lt;math&amp;gt;S_2&amp;lt;/math&amp;gt;. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there  is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has the property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12909</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12909"/>
		<updated>2025-02-20T06:47:46Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion, 8 points)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability that a uniform random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; is surjective (满射) is &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: This is sometimes called Kounias&#039; inequality which is weaker than the Bonferroni&#039;s inequality. You can try using Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-probability-space&amp;quot;&amp;gt;Problem 2 (Probability space, 12 points)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, there is no such probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; that for any interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (Actually, such probability measure does exist and is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;). &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field (I)&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\sigma(S) := \bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;S,T \subseteq 2^{\Omega}&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\sigma(S) = \sigma(T)&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;S \subseteq \sigma(T)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;T \subseteq \sigma(S)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field. Let &amp;lt;math&amp;gt;\mathcal{F}_1, \mathcal{F}_2,\ldots&amp;lt;/math&amp;gt; be a sequence of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields satisfying &amp;lt;math&amp;gt;\mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \ldots&amp;lt;/math&amp;gt;. Is &amp;lt;math&amp;gt;\bigcup_{i=1}^{+\infty} \mathcal{F}_i&amp;lt;/math&amp;gt; a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Projection&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;T \subseteq \Omega&amp;lt;/math&amp;gt; be a subset. Show that &amp;lt;math&amp;gt;\{S \cap T \mid S \in \mathcal{F}\}&amp;lt;/math&amp;gt; is a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; is the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; so that &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable, &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-probability-space&amp;quot;&amp;gt;Problem 3 (Birthday paradox)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Please design a &amp;lt;strong&amp;gt;randomized algorithm using the birthday paradox&amp;lt;/strong&amp;gt; that solves the following problem in &amp;lt;math&amp;gt;\mathrm{poly}(n) \cdot 2^{n/2}&amp;lt;/math&amp;gt; time with high probability (for example, &amp;lt;math&amp;gt;0.99&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; is sufficiently large). Please provide a detailed error analysis as well. (&amp;lt;strong&amp;gt;WARNING:&amp;lt;/strong&amp;gt; You will &amp;lt;strong&amp;gt;NOT&amp;lt;/strong&amp;gt; receive any points if you solve this task using Gaussian elimination)&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
You are given an integer sequence &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_{100n}&amp;lt;/math&amp;gt; of length &amp;lt;math&amp;gt;100 n&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;0 \le a_i &amp;lt; 2^n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le 100 n&amp;lt;/math&amp;gt;. Please find a non-empty subset &amp;lt;math&amp;gt;S \subseteq \{1,2,\ldots,100n\}&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;\bigoplus_{i \in S} a_i = 0&amp;lt;/math&amp;gt;, i.e., the exclusive or of the elements whose indices are in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; equals &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that events &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls are chosen to be removed with equal probability). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each time uniformly and independently at random, without replacement (which means we do not put the chosen ball back after each drawing). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let&amp;amp;#39;s consider a series of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; outputs &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability  &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she told you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial when the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trials is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find the &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcome most probable (an illustrative numerical sketch follows this problem); namely, you need to solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
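&amp;lt;p&amp;gt;&#039;&#039;(Editorial aside, not required for submission.)&#039;&#039; Before solving the optimization analytically, the likelihood in the preceding [Product distribution] part can be explored numerically. The following minimal Python sketch evaluates it on a grid; the sample values &amp;lt;math&amp;gt;n=10&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;k=3&amp;lt;/math&amp;gt; are arbitrary assumptions chosen only for illustration.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Illustrative grid search for the likelihood maximizer (not an analytic solution).&lt;br /&gt;
# n = 10 and k = 3 are arbitrary sample values.&lt;br /&gt;
from math import comb&lt;br /&gt;
&lt;br /&gt;
n, k = 10, 3&lt;br /&gt;
grid = [i / 1000 for i in range(1, 1000)]   # candidate values of p in (0, 1)&lt;br /&gt;
def likelihood(p):&lt;br /&gt;
    # probability that exactly k of the n independent trials succeed&lt;br /&gt;
    return comb(n, k) * p**k * (1 - p)**(n - k)&lt;br /&gt;
p_hat = max(grid, key=likelihood)           # grid point with the largest likelihood&lt;br /&gt;
print(p_hat)                                # 0.3 for these sample values&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;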
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A [[wikipedia:Tournament_(graph_theory)|tournament]] can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once, and no draws occur. Given two players (vertices) &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; (and from &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;). We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. For example, a triangle &amp;lt;math&amp;gt;x\rightarrow y \rightarrow z \rightarrow x&amp;lt;/math&amp;gt; has property &amp;lt;math&amp;gt;S_1&amp;lt;/math&amp;gt; but not &amp;lt;math&amp;gt;S_2&amp;lt;/math&amp;gt;. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;. (A generic union-bound template is recalled below.)&amp;lt;/li&amp;gt;&lt;br /&gt;
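&amp;lt;p&amp;gt;&#039;&#039;(Reading aid, editorial addition: this recalls the generic first step of the probabilistic method, not the solution itself.)&#039;&#039; If &amp;lt;math&amp;gt;B_1,\ldots,B_m&amp;lt;/math&amp;gt; are the &amp;quot;bad&amp;quot; events of a random construction and &amp;lt;math&amp;gt;\sum_{i=1}^m \mathbf{Pr}(B_i) &amp;lt; 1&amp;lt;/math&amp;gt;, then by the union bound &amp;lt;math&amp;gt;\mathbf{Pr}\left(\bigcap_{i=1}^m \overline{B_i}\right) \ge 1-\sum_{i=1}^m \mathbf{Pr}(B_i) &amp;gt; 0&amp;lt;/math&amp;gt;, so an object avoiding all bad events exists.&amp;lt;/p&amp;gt;&lt;br /&gt;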
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12908</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12908"/>
		<updated>2025-02-20T06:46:26Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Assumption throughout Problem Set 1 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
*Every problem must be answered with a complete solution process; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-1-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion, 8 points)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;. (A worked &amp;lt;math&amp;gt;n=2&amp;lt;/math&amp;gt; instance follows this list.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability that a uniform random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; is surjective equals &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: This is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You can try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
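&amp;lt;p&amp;gt;&#039;&#039;(Editorial sanity check.)&#039;&#039; Instantiating the principle with &amp;lt;math&amp;gt;n=2&amp;lt;/math&amp;gt; recovers the familiar identity &amp;lt;math&amp;gt;\mathbf{Pr}(A_1 \cup A_2) = \mathbf{Pr}(A_1) + \mathbf{Pr}(A_2) - \mathbf{Pr}(A_1 \cap A_2)&amp;lt;/math&amp;gt;, which is a useful check on the signs in the general formula.&amp;lt;/p&amp;gt;&lt;br /&gt;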
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-probability-space&amp;quot;&amp;gt;Problem 2 (Probability space, 12 points)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, that there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for every interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (In fact, such a probability measure does exist; it is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;.) &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field (I)&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\sigma(S) := \bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;. A concrete instance follows this list.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;S,T \subseteq 2^{\Omega}&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\sigma(S) = \sigma(T)&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;S \subseteq \sigma(T)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;T \subseteq \sigma(S)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field. Now let &amp;lt;math&amp;gt;\mathcal{F}_1, \mathcal{F}_2,\ldots&amp;lt;/math&amp;gt; be a sequence of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields. Is &amp;lt;math&amp;gt;\bigcup_{i=1}^{+\infty} \mathcal{F}_i&amp;lt;/math&amp;gt; necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Projection&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;T \subseteq \Omega&amp;lt;/math&amp;gt; be a subset. Show that &amp;lt;math&amp;gt;\{S \cap T \mid S \in \mathcal{F}\}&amp;lt;/math&amp;gt; is a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (the complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable, and define &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
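&amp;lt;p&amp;gt;&#039;&#039;(Editorial illustration of the notation, not part of the graded problems.)&#039;&#039; For a single event &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\emptyset \neq A \neq \Omega&amp;lt;/math&amp;gt;, the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;\{A\}&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;\sigma(\{A\}) = \{\emptyset, A, \overline{A}, \Omega\}&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;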
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-birthday-paradox&amp;quot;&amp;gt;Problem 3 (Birthday paradox)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Please design a &amp;lt;strong&amp;gt;randomized algorithm using the birthday paradox&amp;lt;/strong&amp;gt; that solves the following problem in &amp;lt;math&amp;gt;\mathrm{poly}(n) \cdot 2^{n/2}&amp;lt;/math&amp;gt; time with high probability (for example, with probability &amp;lt;math&amp;gt;0.99&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; is sufficiently large). Please provide a detailed error analysis as well; an illustrative sketch of the generic collision idea follows the problem statement. (&amp;lt;strong&amp;gt;WARNING:&amp;lt;/strong&amp;gt; You will &amp;lt;strong&amp;gt;NOT&amp;lt;/strong&amp;gt; receive any points if you solve this task using Gaussian elimination.)&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Given an integer sequence &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_{100n}&amp;lt;/math&amp;gt; of length &amp;lt;math&amp;gt;100 n&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;0 \le a_i &amp;lt; 2^n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le 100 n&amp;lt;/math&amp;gt;, find a non-empty subset &amp;lt;math&amp;gt;S \subseteq \{1,2,\ldots,100n\}&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;\bigoplus_{i \in S} a_i = 0&amp;lt;/math&amp;gt;, i.e., the exclusive or (XOR) of the elements whose indices are in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; equals &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
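&amp;lt;p&amp;gt;&#039;&#039;(Editorial sketch; it is not a full solution and omits the required error analysis.)&#039;&#039; The generic birthday-collision pattern stores random fingerprints in a dictionary and stops at the first repeat among distinct objects; adapting it to the XOR instance above (for example, by fingerprinting random index sets with the XOR of their elements) and analyzing it is the actual exercise. A minimal Python sketch of the pattern, under these stated assumptions:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Generic birthday-collision pattern (illustrative only, not a solution):&lt;br /&gt;
# draw random objects until two distinct ones share a fingerprint. By the&lt;br /&gt;
# birthday paradox, about 2**(n/2) draws suffice for n-bit fingerprints.&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def find_collision(sample, fingerprint):&lt;br /&gt;
    seen = {}                      # maps each fingerprint to the first object seen&lt;br /&gt;
    while True:&lt;br /&gt;
        obj = sample()&lt;br /&gt;
        fp = fingerprint(obj)&lt;br /&gt;
        if fp in seen and seen[fp] != obj:&lt;br /&gt;
            return seen[fp], obj   # two distinct objects with equal fingerprints&lt;br /&gt;
        seen.setdefault(fp, obj)&lt;br /&gt;
&lt;br /&gt;
# Toy usage with 16-bit fingerprints of random 64-bit integers.&lt;br /&gt;
print(find_collision(lambda: random.getrandbits(64), lambda x: x % 65536))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;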
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that event &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns, of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is chosen to be removed with equal probability). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each time uniformly at random from the balls remaining in the urn, without replacement (that is, the chosen ball is not put back after each drawing). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let&amp;amp;#39;s consider the sequence of outputs &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she told you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial when the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trials is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find the &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcome most probable; namely, you need to solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A [[wikipedia:Tournament_(graph_theory)|tournament]] can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once, and no draws occur. Given two players (vertices) &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; (and from &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;). We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. For example, a triangle &amp;lt;math&amp;gt;x\rightarrow y \rightarrow z \rightarrow x&amp;lt;/math&amp;gt; has property &amp;lt;math&amp;gt;S_1&amp;lt;/math&amp;gt; but not &amp;lt;math&amp;gt;S_2&amp;lt;/math&amp;gt;. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12905</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12905"/>
		<updated>2025-02-19T13:24:12Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
*Every problem must be answered with a complete solution process; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-1-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability that a uniform random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; is surjective equals &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: This is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You can try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-probability-space&amp;quot;&amp;gt;Problem 2 (Probability space)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, that there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for every interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (In fact, such a probability measure does exist; it is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;.) &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field (I)&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\sigma(S) := \bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;S,T \subseteq 2^{\Omega}&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\sigma(S) = \sigma(T)&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;S \subseteq \sigma(T)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;T \subseteq \sigma(S)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Projection&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;T \subseteq \Omega&amp;lt;/math&amp;gt; be a subset. Show that &amp;lt;math&amp;gt;\{S \cap T \mid S \in \mathcal{F}\}&amp;lt;/math&amp;gt; is a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (the complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable, and define &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-birthday-paradox&amp;quot;&amp;gt;Problem 3 (Birthday paradox)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Please design a &amp;lt;strong&amp;gt;randomized algorithm using the birthday paradox&amp;lt;/strong&amp;gt; that solves the following problem in &amp;lt;math&amp;gt;\mathrm{poly}(n) \cdot 2^{n/2}&amp;lt;/math&amp;gt; time with high probability (for example, with probability &amp;lt;math&amp;gt;0.99&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; is sufficiently large). Please provide a detailed error analysis as well. (&amp;lt;strong&amp;gt;WARNING:&amp;lt;/strong&amp;gt; You will &amp;lt;strong&amp;gt;NOT&amp;lt;/strong&amp;gt; receive any points if you solve this task using Gaussian elimination.)&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Given an integer sequence &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_{100n}&amp;lt;/math&amp;gt; of length &amp;lt;math&amp;gt;100 n&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;0 \le a_i &amp;lt; 2^n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le 100 n&amp;lt;/math&amp;gt;, find a non-empty subset &amp;lt;math&amp;gt;S \subseteq \{1,2,\ldots,100n\}&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;\bigoplus_{i \in S} a_i = 0&amp;lt;/math&amp;gt;, i.e., the exclusive or (XOR) of the elements whose indices are in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; equals &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that event &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns, of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is chosen to be removed with equal probability). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each time uniformly at random from the balls remaining in the urn, without replacement (that is, the chosen ball is not put back after each drawing). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let&amp;amp;#39;s consider the sequence of outputs &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she told you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial when the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trials is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find the &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcome most probable; namely, you need to solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A [[wikipedia:Tournament_(graph_theory)|tournament]] can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once, and no draws occur. Given two players (vertices) &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; (and from &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;). We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. For example, a triangle &amp;lt;math&amp;gt;x\rightarrow y \rightarrow z \rightarrow x&amp;lt;/math&amp;gt; has property &amp;lt;math&amp;gt;S_1&amp;lt;/math&amp;gt; but not &amp;lt;math&amp;gt;S_2&amp;lt;/math&amp;gt;. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12904</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12904"/>
		<updated>2025-02-19T13:20:27Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (odd weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Friday, 2pm-3pm &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039; (Introduction to Probability, 2nd revised edition)&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; translated by 郑忠国 and 童行伟; Posts &amp;amp; Telecom Press (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor&#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** Monday: 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
** Wednesday (odd weeks): 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* Friday 2pm-3pm, 计算机系 804 (尹一通)&lt;br /&gt;
:* Friday 2pm-3pm, 计算机系 516 (刘景铖)&lt;br /&gt;
:* &#039;&#039;&#039;QQ group&#039;&#039;&#039;: 974742320 (to join, provide your name, department, and student ID)&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
The course consists of three major parts:&lt;br /&gt;
* &#039;&#039;&#039;Classical probability theory&#039;&#039;&#039;: probability spaces, random variables and their numerical characteristics, multivariate and continuous random variables, limit theorems, etc.&lt;br /&gt;
* &#039;&#039;&#039;Probability and computing&#039;&#039;&#039;: concentration of measure, the probabilistic method, and selected topics on discrete stochastic processes&lt;br /&gt;
* &#039;&#039;&#039;Mathematical statistics&#039;&#039;&#039;: statistical inference topics such as parameter estimation, hypothesis testing, Bayesian estimation, and linear regression&lt;br /&gt;
&lt;br /&gt;
For the first two parts, students are expected to master the basic concepts, to understand in depth the key phenomena and laws together with the principles behind them, and to apply the methods learned flexibly to solve related problems. For the third part, students are expected to be familiar with several basic concepts of mathematical statistics, as well as with typical statistical models and statistical inference problems.&lt;br /&gt;
&lt;br /&gt;
Through the training of this course, students should become fluent in the language of probability, able to use probabilistic thinking to understand and model the real world, and able to wield the mathematical tools of probability to analyze and solve problems in their own fields.&lt;br /&gt;
&lt;br /&gt;
=== Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; &#039;&#039;概率导论&#039;&#039; (Introduction to Probability), 2nd revised edition, by Dimitri P. Bertsekas and John N. Tsitsiklis; translated by 郑忠国 and 童行伟; Posts &amp;amp; Telecom Press (2022).&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== Grading Policy ===&lt;br /&gt;
* Course grade: the course will have several homework assignments and one final exam. The final grade will be determined by combining the homework scores with the final exam score.&lt;br /&gt;
* Late submission: if, for special reasons, you cannot finish an assignment on time, please contact the instructor in advance with a valid justification; otherwise, late homework will not be accepted.&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
Academic integrity is the most basic professional and ethical baseline for every student and scholar engaged in academic work. This course will spare no effort to uphold the norms of academic integrity, and violations of this baseline will not be tolerated.&lt;br /&gt;
&lt;br /&gt;
Principle for completing homework: work that bears your name must be your own contribution. Discussion is allowed while working on an assignment, provided that all participants in the discussion are at a comparable stage of completion. However, carrying out the key ideas and writing the text of the assignment must be done independently, and you must acknowledge everyone who took part in the discussion. Discussion and acknowledgement that follow these rules will not affect your score. No other form of collaboration is allowed; in particular, do not &amp;quot;discuss&amp;quot; with classmates who have already finished the assignment.&lt;br /&gt;
&lt;br /&gt;
This course takes a zero-tolerance stance on plagiarism. When completing assignments, direct textual copying from the work of others (publications, internet materials, other people&#039;s homework, etc.), as well as copying of key ideas or key elements, will be regarded as plagiarism under the interpretation of the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]. Plagiarists will have their grades voided. If mutual copying is discovered, &amp;lt;font color=red&amp;gt; the grades of both the copying and the copied parties will be voided&amp;lt;/font&amp;gt;. Therefore, please take the initiative to prevent your homework from being copied by others.&lt;br /&gt;
&lt;br /&gt;
Academic integrity shapes a student&#039;s personal character and also bears on the proper functioning of the entire educational system. Committing academic misconduct for the sake of a few points not only turns you into a cheater but also robs the honest efforts of others of their meaning. Let us work together to maintain an environment of integrity.&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 1|Problem Set 1]] Please submit it to [mailto:pr2024_nju@163.com pr2024_nju@163.com] before class (10am UTC+8) on 2025/3/10, with the file named &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039; (i.e., StudentID_Name_A1.pdf).&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Intro.pdf Course introduction]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/ProbSpace.pdf Probability space]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12903</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12903"/>
		<updated>2025-02-19T13:16:56Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
*Every solution must include the complete working; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability that a uniform random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; is surjective (满射) equals &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: This is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You can try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-probability-space&amp;quot;&amp;gt;Problem 2 (Probability space)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, that there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for any interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (Actually, such a probability measure does exist and is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;). &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra (I)&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\sigma(S) := \bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;S,T \subseteq 2^{\Omega}&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\sigma(S) = \sigma(T)&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;S \subseteq \sigma(T)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;T \subseteq \sigma(S)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Projection&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;T \subseteq \Omega&amp;lt;/math&amp;gt; be a subset. Show that &amp;lt;math&amp;gt;\{S \cap T \mid S \in \mathcal{F}\}&amp;lt;/math&amp;gt; is a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; is the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; so that &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable, &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-probability-space&amp;quot;&amp;gt;Problem 3 (Birthday paradox)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Please design a &amp;lt;strong&amp;gt;randomized algorithm using the birthday paradox&amp;lt;/strong&amp;gt; that solves the following problem in &amp;lt;math&amp;gt;\mathrm{poly}(n) \cdot 2^{n/2}&amp;lt;/math&amp;gt; time with high probability (for example, &amp;lt;math&amp;gt;0.99&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; is sufficiently large). Please provide a detailed error analysis as well. (WARNING: You will &amp;lt;strong&amp;gt;NOT&amp;lt;/strong&amp;gt; receive any points if you solve this task using Gaussian elimination)&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
You are given an integer sequence &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_{100n}&amp;lt;/math&amp;gt; of length &amp;lt;math&amp;gt;100 n&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;0 \le a_i &amp;lt; 2^n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le 100 n&amp;lt;/math&amp;gt;. Please find a non-empty subset &amp;lt;math&amp;gt;S \subseteq \{1,2,\ldots,100n\}&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;\bigoplus_{i \in S} a_i = 0&amp;lt;/math&amp;gt;, i.e. the exclusive or of the elements whose indices are in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; equals &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;. (A short illustrative sketch of the birthday-collision idea is given after this problem.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
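&lt;br /&gt;
&amp;lt;p&amp;gt;The following Python sketch is included only as a hedged illustration of the birthday-collision idea mentioned above; it is not the required solution or error analysis, and the function name, the 0-based indexing, and the trial budget are illustrative assumptions. It samples random index subsets, stores their XOR values in a dictionary, and returns the symmetric difference of two distinct colliding subsets, which is a non-empty set whose XOR is zero.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def find_zero_xor_subset(a, trials):&lt;br /&gt;
    # Birthday-style search: sample random index subsets (0-based), hash their&lt;br /&gt;
    # XOR values, and return the symmetric difference of two colliding subsets.&lt;br /&gt;
    # (Memory use is not optimized; this only illustrates the collision idea.)&lt;br /&gt;
    seen = {}                            # maps an XOR value to one sampled index set&lt;br /&gt;
    for _ in range(trials):&lt;br /&gt;
        S = frozenset(i for i in range(len(a)) if random.getrandbits(1))&lt;br /&gt;
        x = 0&lt;br /&gt;
        for i in S:&lt;br /&gt;
            x ^= a[i]&lt;br /&gt;
        if x in seen and seen[x] != S:&lt;br /&gt;
            return sorted(seen[x] ^ S)   # symmetric difference: non-empty, XOR equals 0&lt;br /&gt;
        seen[x] = S&lt;br /&gt;
    return None                          # no collision found within the trial budget&lt;br /&gt;
&lt;br /&gt;
# With n-bit inputs, roughly 2 ** (n // 2) trials yield a collision with constant&lt;br /&gt;
# probability by the birthday paradox, matching the poly(n) * 2^(n/2) time budget.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;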
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that event &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is chosen to be removed with equal probability). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each time uniformly and independently at random, without replacement (which means we do not put the chosen ball back after each drawing). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
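&lt;br /&gt;
&amp;lt;p&amp;gt;As a hedged Monte Carlo sketch for checking answers to Balls in urns (I) above (the sample budget and the numeric encoding of colours are arbitrary assumptions, and the output is only an empirical estimate, not a derivation), the code below simulates the two-ball experiment and reports estimates of the two requested probabilities.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def simulate_balls_in_urns(n, samples=200000):&lt;br /&gt;
    # Monte Carlo check: pick an urn uniformly at random, then draw two balls&lt;br /&gt;
    # without replacement; 1 encodes a black ball and 0 encodes a white ball.&lt;br /&gt;
    second_black = 0&lt;br /&gt;
    first_black = 0&lt;br /&gt;
    both_black = 0&lt;br /&gt;
    for _ in range(samples):&lt;br /&gt;
        r = random.randint(1, n)                 # urn r holds r-1 white and n-r black balls&lt;br /&gt;
        balls = [0] * (r - 1) + [1] * (n - r)&lt;br /&gt;
        first, second = random.sample(balls, 2)  # uniformly random ordered pair&lt;br /&gt;
        second_black += second&lt;br /&gt;
        first_black += first&lt;br /&gt;
        both_black += first * second&lt;br /&gt;
    return second_black / samples, both_black / max(first_black, 1)&lt;br /&gt;
&lt;br /&gt;
# Example usage: simulate_balls_in_urns(10) returns empirical estimates of&lt;br /&gt;
# Pr(second ball is black) and Pr(second ball is black, given the first is black).&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;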
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let&amp;amp;#39;s consider a series of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; outputs &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability  &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she told you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial while the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trial is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find such &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcomes most probable, namely you need to solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
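&lt;br /&gt;
&amp;lt;p&amp;gt;As a purely numerical companion to the maximum-likelihood question in Problem 5 above (a hedged sketch, not the requested derivation; it assumes Python 3.8 or later for math.comb, and the grid resolution is arbitrary), the code below evaluates the binomial likelihood over a grid of candidate values of &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; and reports the grid maximizer, which can be compared against the closed form you derive.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
from math import comb&lt;br /&gt;
&lt;br /&gt;
def likelihood(p_hat, n, k):&lt;br /&gt;
    # Probability of observing exactly k successes in n independent trials.&lt;br /&gt;
    return comb(n, k) * p_hat ** k * (1 - p_hat) ** (n - k)&lt;br /&gt;
&lt;br /&gt;
def grid_argmax(n, k, steps=10000):&lt;br /&gt;
    # Scan a grid of candidate values in (0, 1) and return the one with the&lt;br /&gt;
    # largest likelihood; this only approximates the true maximizer.&lt;br /&gt;
    grid = [j / steps for j in range(1, steps)]&lt;br /&gt;
    return max(grid, key=lambda p_hat: likelihood(p_hat, n, k))&lt;br /&gt;
&lt;br /&gt;
# Example usage: grid_argmax(10, 3) returns the grid point maximizing the likelihood.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;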
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A [[wikipedia:Tournament_(graph_theory)|tournament]] can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once, and no draws occur. Given two players (vertices) &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; (and vice versa). We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. For example, a triangle &amp;lt;math&amp;gt;x\rightarrow y \rightarrow z \rightarrow x&amp;lt;/math&amp;gt; has property &amp;lt;math&amp;gt;S_1&amp;lt;/math&amp;gt; but not &amp;lt;math&amp;gt;S_2&amp;lt;/math&amp;gt;. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there  is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has the property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12902</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12902"/>
		<updated>2025-02-19T13:09:02Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
*Every solution must include the complete working; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability that a uniform random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; is surjective (满射) equals &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: This is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You can try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-probability-space&amp;quot;&amp;gt;Problem 2 (Probability space)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, that there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for any interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (Actually, such a probability measure does exist and is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;). &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra (I)&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\sigma(S) := \bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;S,T \subseteq 2^{\Omega}&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\sigma(S) = \sigma(T)&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;S \subseteq \sigma(T)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;T \subseteq \sigma(S)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Projection&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;T \subseteq \Omega&amp;lt;/math&amp;gt; be a subset. Show that &amp;lt;math&amp;gt;\{S \cap T \mid S \in \mathcal{F}\}&amp;lt;/math&amp;gt; is a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; is the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; so that &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable, &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-probability-space&amp;quot;&amp;gt;Problem 3 (Birthday paradox)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Please design a randomized algorithm using the birthday paradox that solves the following problem in &amp;lt;math&amp;gt;\mathrm{poly}(n) \cdot 2^{n/2}&amp;lt;/math&amp;gt; time with high probability (for example, &amp;lt;math&amp;gt;0.99&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; is sufficiently large). Please provide a detailed error analysis as well.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
You are given an integer sequence &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_{100n}&amp;lt;/math&amp;gt; of length &amp;lt;math&amp;gt;100 n&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;0 \le a_i &amp;lt; 2^n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le i \le 100 n&amp;lt;/math&amp;gt;. Please find a non-empty subset &amp;lt;math&amp;gt;S \subseteq \{1,2,\ldots,100n\}&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;\bigoplus_{i \in S} a_i = 0&amp;lt;/math&amp;gt;, i.e. the exclusive or of the elements whose indices are in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; equals &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that event &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is chosen to be removed with equal probability). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each time uniformly and independently at random, without replacement (which means we do not put the chosen ball back after each drawing). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let&amp;amp;#39;s consider a series of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; outputs &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability  &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she told you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial while the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trial is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find such &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcomes most probable, namely you need to solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A [[wikipedia:Tournament_(graph_theory)|tournament]] can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once, and no draws occur. Given two players (vertices) &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; (and vice versa). We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. For example, a triangle &amp;lt;math&amp;gt;x\rightarrow y \rightarrow z \rightarrow x&amp;lt;/math&amp;gt; has property &amp;lt;math&amp;gt;S_1&amp;lt;/math&amp;gt; but not &amp;lt;math&amp;gt;S_2&amp;lt;/math&amp;gt;. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there  is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has the property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12890</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12890"/>
		<updated>2025-02-19T12:04:53Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: Created page with &amp;quot;*每道题目的解答都要有完整的解题过程，中英文不限。  *我们推荐大家使用LaTeX, markdown等对作业进行排版。  == Assumption throughout Problem Set 1== &amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;  &amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion)&amp;lt;/h2&amp;gt; &amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete working; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability that a uniform random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; is surjective (满射) equals &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Coprime integers&amp;lt;/strong&amp;gt;] Given a positive integer &amp;lt;math&amp;gt;n \ge 2&amp;lt;/math&amp;gt;, calculate the number of integer pairs &amp;lt;math&amp;gt;(x,y)&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;1 \le x &amp;lt; y \le n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathrm{gcd}(x,y) = 1&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Remark: This suggests that the probability that two randomly chosen integers from &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; are coprime tends to &amp;lt;math&amp;gt;\frac{6}{\pi^2}&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: This is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You can try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-1-classic-examples&amp;quot;&amp;gt;Problem 2 (Symmetric 1D random walk)&amp;lt;/h2&amp;gt;&lt;br /&gt;
A gambler plays a fair gambling game: at each round, he flips a fair coin, earns &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point if it comes up HEADS, and loses &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point otherwise.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;A_i&amp;lt;/math&amp;gt; be the event that the gambler earns &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; points after playing &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds of the game, that is, the number of times the coin lands on heads is equal to the number of times it lands on tails. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i)&amp;lt;/math&amp;gt;. (Hint: You may use Stirling&#039;s approximation to estimate &amp;lt;math&amp;gt;\mathbf{Pr}(A_i)&amp;lt;/math&amp;gt; and derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (II)&amp;lt;/strong&amp;gt;] Suppose that the game ends as soon as the gambler loses all his &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; points. Let &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; be the event that the game ends within &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i})&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Hint: You may first consider the case &amp;lt;math&amp;gt;m = 1&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;C_i&amp;lt;/math&amp;gt; be the event that the game ends at the &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt;-th round. (i) Prove that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = \sum_{i=1}^{+\infty} (i-1) \mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;. (ii) Compute &amp;lt;math&amp;gt;\mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;, which is a special case of the ballot problem. (iii) Finally, use Stirling&#039;s approximation to derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
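&lt;br /&gt;
&amp;lt;p&amp;gt;To complement the Stirling hint in Problem 2 above, here is a minimal Monte Carlo sketch (an illustration under the stated fair-coin model, not part of the required proof; the sample budget is an arbitrary assumption) that estimates &amp;lt;math&amp;gt;\mathbf{Pr}(A_i)&amp;lt;/math&amp;gt; by simulation and compares it with the exact binomial value for even &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
from math import comb&lt;br /&gt;
&lt;br /&gt;
def estimate_return_prob(i, samples=100000):&lt;br /&gt;
    # Monte Carlo estimate of Pr(A_i): the walk is back at 0 after i fair +1/-1 steps.&lt;br /&gt;
    hits = 0&lt;br /&gt;
    for _ in range(samples):&lt;br /&gt;
        position = sum(1 if random.getrandbits(1) else -1 for _ in range(i))&lt;br /&gt;
        if position == 0:&lt;br /&gt;
            hits += 1&lt;br /&gt;
    return hits / samples&lt;br /&gt;
&lt;br /&gt;
def exact_return_prob(i):&lt;br /&gt;
    # Exact value: choose which i/2 of the i steps are heads, out of 2 ** i outcomes.&lt;br /&gt;
    return comb(i, i // 2) / 2 ** i if i % 2 == 0 else 0.0&lt;br /&gt;
&lt;br /&gt;
# For even i the two values agree up to sampling error, and the exact value decays&lt;br /&gt;
# only polynomially in i, consistent with the divergence claimed in the hint.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;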
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-probability-space&amp;quot;&amp;gt;Problem 3 (Probability space)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, that there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for any interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (Actually, such a probability measure does exist and is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;). &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; is the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; so that &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable, &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field?&amp;lt;/strong&amp;gt;] A set &amp;lt;math&amp;gt;A \subseteq \mathbb{N}&amp;lt;/math&amp;gt; is said to have asymptotic density &amp;lt;math&amp;gt;\theta&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\lim_{n \to \infty} |A \cap \{1,2,\ldots,n\}| / n = \theta&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; be the collection of sets for which the asymptotic density exists. Is &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that event &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is chosen to be removed with equal probability). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each time uniformly and independently at random, without replacement (which means we do not put the chosen ball back after each drawing). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&#039;&#039;&#039;Noisy channel&#039;&#039;&#039;] When coded messages are sent, there are sometimes errors in transmission. In particular, Morse code uses &amp;quot;dots&amp;quot; and &amp;quot;dashes&amp;quot;, which are known to occur in the proportion of 3:4. In other words, the probability of a &amp;quot;dot&amp;quot; being sent is &amp;lt;math&amp;gt;3/7&amp;lt;/math&amp;gt;, and the probability of a &amp;quot;dash&amp;quot; being sent is &amp;lt;math&amp;gt;4/7&amp;lt;/math&amp;gt;. Suppose due to interference on the transmission channel, with probability &amp;lt;math&amp;gt;1/8&amp;lt;/math&amp;gt;, a dot will be mistakenly received as a dash, and vice versa. If we receive a dot, what is the probability that the transmitted symbol is indeed a dot? &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let&amp;amp;#39;s consider a series of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; outputs &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability  &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she told you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial while the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trial is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find such &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcomes most probable, namely you need to solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;A &#039;&#039;&#039;CNF formula&#039;&#039;&#039; （&#039;&#039;&#039;合取范式&#039;&#039;&#039;） &amp;lt;math&amp;gt;\Phi&amp;lt;/math&amp;gt; over &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Boolean variables &amp;lt;math&amp;gt;x_1,\cdots, x_n&amp;lt;/math&amp;gt; is a &#039;&#039;&#039;conjunction&#039;&#039;&#039; (&amp;lt;math&amp;gt;\land&amp;lt;/math&amp;gt;) of &#039;&#039;&#039;clauses&#039;&#039;&#039; (&#039;&#039;&#039;子句&#039;&#039;&#039;) &amp;lt;math&amp;gt;\Phi=C_1\land C_2\land\cdots\land C_m&amp;lt;/math&amp;gt;, where each clause &amp;lt;math&amp;gt;C_j=\ell_{j_1}\lor\ell_{j_2}\lor\cdots\lor\ell_{j_k}&amp;lt;/math&amp;gt; is a &#039;&#039;&#039;disjunction&#039;&#039;&#039; (&amp;lt;math&amp;gt;\lor&amp;lt;/math&amp;gt;) of &#039;&#039;&#039;literals&#039;&#039;&#039; (&#039;&#039;&#039;文字&#039;&#039;&#039;), where a literal &amp;lt;math&amp;gt;\ell_r&amp;lt;/math&amp;gt; is either a variable &amp;lt;math&amp;gt;x_i&amp;lt;/math&amp;gt; or the negation &amp;lt;math&amp;gt;\bar{x}_i&amp;lt;/math&amp;gt; of a variable. A CNF formula is &#039;&#039;&#039;satisfiable&#039;&#039;&#039; (&#039;&#039;&#039;可满足&#039;&#039;&#039;) if there is a truth assignment &amp;lt;math&amp;gt;x=(x_1,\cdots, x_n)\in \{\mathtt{true},\mathtt{false}\}^n&amp;lt;/math&amp;gt; to the variables such that &amp;lt;math&amp;gt;\Phi(x)=\mathtt{true}&amp;lt;/math&amp;gt;. A &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-CNF formula is a CNF formula in which each clause contains exactly &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; literals (without repetition).&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Satisfiability (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Phi&amp;lt;/math&amp;gt; be a &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-CNF with less than &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; clauses. Use the probabilistic method to show that &amp;lt;math&amp;gt;\Phi&amp;lt;/math&amp;gt; must be satisfiable. You should be explicit about the probability space that is used. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Satisfiability (II)&amp;lt;/strong&amp;gt;] Give a constructive proof of the same problem above. That is, prove that &amp;lt;math&amp;gt;\Phi&amp;lt;/math&amp;gt; is satisfiable by showing how to construct a truth assignment  &amp;lt;math&amp;gt;x=(x_1,\cdots, x_n)\in \{\mathtt{true},\mathtt{false}\}^n&amp;lt;/math&amp;gt; such that  &amp;lt;math&amp;gt;\Phi(x)=\mathtt{true}&amp;lt;/math&amp;gt;. Your construction does NOT have to be efficient. Please explain the difference between this constructive proof and the proof by the probabilistic method.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Satisfiability (III)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Phi&amp;lt;/math&amp;gt; be a &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-CNF with &amp;lt;math&amp;gt;m\geq 2^k&amp;lt;/math&amp;gt; clauses. Use the probabilistic method to show that there exists a truth assignment &amp;lt;math&amp;gt;x=(x_1,\cdots, x_n)\in \{\mathtt{true},\mathtt{false}\}^n&amp;lt;/math&amp;gt; satisfying at least &amp;lt;math&amp;gt;\lfloor m(1-1/2^k) \rfloor&amp;lt;/math&amp;gt; clauses in &amp;lt;math&amp;gt;\Phi&amp;lt;/math&amp;gt;. (Hint: Consider overlaps of events in Venn diagram.) You should be explicit about the probability space that is used.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12865</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12865"/>
		<updated>2025-02-17T06:26:53Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (odd weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Friday, 2pm-3pm &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** Monday: 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
** Wednesday (odd weeks): 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* Friday 2pm-3pm, 计算机系 804（尹一通）&lt;br /&gt;
:* Friday 2pm-3pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ group&#039;&#039;&#039;: 974742320 (to request to join, provide your name, department, and student ID)&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
The course content is divided into three main parts:&lt;br /&gt;
* &#039;&#039;&#039;Classical probability theory&#039;&#039;&#039;: probability spaces, random variables and their numerical characteristics, multivariate and continuous random variables, limit theorems, and related topics&lt;br /&gt;
* &#039;&#039;&#039;Probability and computing&#039;&#039;&#039;: concentration of measure, the probabilistic method, and selected topics on discrete stochastic processes&lt;br /&gt;
* &#039;&#039;&#039;Mathematical statistics&#039;&#039;&#039;: statistical-inference concepts such as parameter estimation, hypothesis testing, Bayesian estimation, and linear regression&lt;br /&gt;
&lt;br /&gt;
For the first two parts, students are expected to have a clear grasp of the basic concepts, a deep understanding of the key phenomena and the principles behind them, and the ability to apply the methods they have learned flexibly to solve related problems. For the third part, students are expected to be familiar with the basic concepts of mathematical statistics as well as with typical statistical models and statistical-inference problems.&lt;br /&gt;
&lt;br /&gt;
Through the training in this course, students should become fluent in the language of probability, learn to use probabilistic thinking to understand and model the world, and be able to wield the mathematical tools of probability to analyze and solve problems in their own field.&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论 (&#039;&#039;Introduction to Probability&#039;&#039;, 2nd edition, revised Chinese edition), by Dimitri P. Bertsekas and John N. Tsitsiklis; Chinese translation by 郑忠国 and 童行伟; 人民邮电出版社 (2022).&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* Course grade: there will be several homework assignments and one final exam; the final grade will be a combination of the homework scores and the final-exam score.&lt;br /&gt;
* Late submission: if, for a special reason, you cannot complete an assignment on time, contact the instructor in advance and provide a valid justification; otherwise, late homework will not be accepted.&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
Academic integrity is the most basic professional and ethical baseline for every student and scholar engaged in academic work. This course will spare no effort to uphold the standards of academic integrity, and behavior that violates this baseline will not be tolerated.&lt;br /&gt;
&lt;br /&gt;
Principles for completing assignments: work submitted under your name must be your own contribution. Discussion is allowed while working on an assignment, provided that all participants in the discussion are at a comparable stage of completion. However, the execution of the key ideas and the writing of the submitted text must be done independently, and everyone who took part in the discussion must be acknowledged in the assignment. Discussion and acknowledgement that follow these rules will not affect your score. No other form of collaboration is allowed, in particular "discussing" with classmates who have already finished the assignment.&lt;br /&gt;
&lt;br /&gt;
This course takes a zero-tolerance attitude toward plagiarism. While completing assignments, direct copying of text from others' work (publications, online materials, other students' assignments, etc.), as well as copying of key ideas or key elements, is regarded as plagiarism according to the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]. Plagiarists will have their grades cancelled. If mutual copying is discovered, &amp;lt;font color=red&amp;gt; the grades of both the student who copied and the student who was copied from will be cancelled&amp;lt;/font&amp;gt;. Please therefore take the initiative to prevent your assignments from being copied by others.&lt;br /&gt;
&lt;br /&gt;
Academic integrity concerns not only a student's personal character but also the proper functioning of the entire educational system. Committing academic misconduct for a few points not only turns you into a cheater but also renders the honest effort of others meaningless. Let us work together to maintain an environment of integrity.&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
TBA&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Intro.pdf Course introduction]&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12864</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12864"/>
		<updated>2025-02-17T06:25:33Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Course info */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (odd weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = TBD &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** Monday: 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
** Wednesday (odd weeks): 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* Friday 2pm-3pm, 计算机系 804（尹一通）&lt;br /&gt;
:* Friday 2pm-3pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ group&#039;&#039;&#039;: 974742320 (to request to join, provide your name, department, and student ID)&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
The course content is divided into three main parts:&lt;br /&gt;
* &#039;&#039;&#039;Classical probability theory&#039;&#039;&#039;: probability spaces, random variables and their numerical characteristics, multivariate and continuous random variables, limit theorems, and related topics&lt;br /&gt;
* &#039;&#039;&#039;Probability and computing&#039;&#039;&#039;: concentration of measure, the probabilistic method, and selected topics on discrete stochastic processes&lt;br /&gt;
* &#039;&#039;&#039;Mathematical statistics&#039;&#039;&#039;: statistical-inference concepts such as parameter estimation, hypothesis testing, Bayesian estimation, and linear regression&lt;br /&gt;
&lt;br /&gt;
For the first two parts, students are expected to have a clear grasp of the basic concepts, a deep understanding of the key phenomena and the principles behind them, and the ability to apply the methods they have learned flexibly to solve related problems. For the third part, students are expected to be familiar with the basic concepts of mathematical statistics as well as with typical statistical models and statistical-inference problems.&lt;br /&gt;
&lt;br /&gt;
Through the training in this course, students should become fluent in the language of probability, learn to use probabilistic thinking to understand and model the world, and be able to wield the mathematical tools of probability to analyze and solve problems in their own field.&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论 (&#039;&#039;Introduction to Probability&#039;&#039;, 2nd edition, revised Chinese edition), by Dimitri P. Bertsekas and John N. Tsitsiklis; Chinese translation by 郑忠国 and 童行伟; 人民邮电出版社 (2022).&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* Course grade: there will be several homework assignments and one final exam; the final grade will be a combination of the homework scores and the final-exam score.&lt;br /&gt;
* Late submission: if, for a special reason, you cannot complete an assignment on time, contact the instructor in advance and provide a valid justification; otherwise, late homework will not be accepted.&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
Academic integrity is the most basic professional and ethical baseline for every student and scholar engaged in academic work. This course will spare no effort to uphold the standards of academic integrity, and behavior that violates this baseline will not be tolerated.&lt;br /&gt;
&lt;br /&gt;
Principles for completing assignments: work submitted under your name must be your own contribution. Discussion is allowed while working on an assignment, provided that all participants in the discussion are at a comparable stage of completion. However, the execution of the key ideas and the writing of the submitted text must be done independently, and everyone who took part in the discussion must be acknowledged in the assignment. Discussion and acknowledgement that follow these rules will not affect your score. No other form of collaboration is allowed, in particular "discussing" with classmates who have already finished the assignment.&lt;br /&gt;
&lt;br /&gt;
This course takes a zero-tolerance attitude toward plagiarism. While completing assignments, direct copying of text from others' work (publications, online materials, other students' assignments, etc.), as well as copying of key ideas or key elements, is regarded as plagiarism according to the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]. Plagiarists will have their grades cancelled. If mutual copying is discovered, &amp;lt;font color=red&amp;gt; the grades of both the student who copied and the student who was copied from will be cancelled&amp;lt;/font&amp;gt;. Please therefore take the initiative to prevent your assignments from being copied by others.&lt;br /&gt;
&lt;br /&gt;
Academic integrity concerns not only a student's personal character but also the proper functioning of the entire educational system. Committing academic misconduct for a few points not only turns you into a cheater but also renders the honest effort of others meaningless. Let us work together to maintain an environment of integrity.&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
TBA&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Intro.pdf Course introduction]&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12854</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12854"/>
		<updated>2025-02-16T07:34:01Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (odd weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = TBD &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** Monday: 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
** Wednesday (odd weeks): 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* TBD, 计算机系 804（尹一通）&lt;br /&gt;
:* TBD, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ group&#039;&#039;&#039;: 974742320 (to request to join, provide your name, department, and student ID)&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
The course content is divided into three main parts:&lt;br /&gt;
* &#039;&#039;&#039;Classical probability theory&#039;&#039;&#039;: probability spaces, random variables and their numerical characteristics, multivariate and continuous random variables, limit theorems, and related topics&lt;br /&gt;
* &#039;&#039;&#039;Probability and computing&#039;&#039;&#039;: concentration of measure, the probabilistic method, and selected topics on discrete stochastic processes&lt;br /&gt;
* &#039;&#039;&#039;Mathematical statistics&#039;&#039;&#039;: statistical-inference concepts such as parameter estimation, hypothesis testing, Bayesian estimation, and linear regression&lt;br /&gt;
&lt;br /&gt;
For the first two parts, students are expected to have a clear grasp of the basic concepts, a deep understanding of the key phenomena and the principles behind them, and the ability to apply the methods they have learned flexibly to solve related problems. For the third part, students are expected to be familiar with the basic concepts of mathematical statistics as well as with typical statistical models and statistical-inference problems.&lt;br /&gt;
&lt;br /&gt;
Through the training in this course, students should become fluent in the language of probability, learn to use probabilistic thinking to understand and model the world, and be able to wield the mathematical tools of probability to analyze and solve problems in their own field.&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论 (&#039;&#039;Introduction to Probability&#039;&#039;, 2nd edition, revised Chinese edition), by Dimitri P. Bertsekas and John N. Tsitsiklis; Chinese translation by 郑忠国 and 童行伟; 人民邮电出版社 (2022).&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* Course grade: there will be several homework assignments and one final exam; the final grade will be a combination of the homework scores and the final-exam score.&lt;br /&gt;
* Late submission: if, for a special reason, you cannot complete an assignment on time, contact the instructor in advance and provide a valid justification; otherwise, late homework will not be accepted.&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
Academic integrity is the most basic professional and ethical baseline for every student and scholar engaged in academic work. This course will spare no effort to uphold the standards of academic integrity, and behavior that violates this baseline will not be tolerated.&lt;br /&gt;
&lt;br /&gt;
Principles for completing assignments: work submitted under your name must be your own contribution. Discussion is allowed while working on an assignment, provided that all participants in the discussion are at a comparable stage of completion. However, the execution of the key ideas and the writing of the submitted text must be done independently, and everyone who took part in the discussion must be acknowledged in the assignment. Discussion and acknowledgement that follow these rules will not affect your score. No other form of collaboration is allowed, in particular "discussing" with classmates who have already finished the assignment.&lt;br /&gt;
&lt;br /&gt;
This course takes a zero-tolerance attitude toward plagiarism. While completing assignments, direct copying of text from others' work (publications, online materials, other students' assignments, etc.), as well as copying of key ideas or key elements, is regarded as plagiarism according to the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]. Plagiarists will have their grades cancelled. If mutual copying is discovered, &amp;lt;font color=red&amp;gt; the grades of both the student who copied and the student who was copied from will be cancelled&amp;lt;/font&amp;gt;. Please therefore take the initiative to prevent your assignments from being copied by others.&lt;br /&gt;
&lt;br /&gt;
Academic integrity concerns not only a student's personal character but also the proper functioning of the entire educational system. Committing academic misconduct for a few points not only turns you into a cheater but also renders the honest effort of others meaningless. Let us work together to maintain an environment of integrity.&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
TBA&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Intro.pdf Course introduction]&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12851</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12851"/>
		<updated>2025-02-16T07:31:19Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Course info */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (odd weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = TBD &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** Monday: 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
** Wednesday (odd weeks): 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* TBD, 计算机系 804（尹一通）&lt;br /&gt;
:* TBD, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ group&#039;&#039;&#039;: 974742320 (to join, provide your name, department, and student ID)&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
The course consists of three major parts:&lt;br /&gt;
* &#039;&#039;&#039;Classical probability theory&#039;&#039;&#039;: probability spaces, random variables and their numerical characteristics, multivariate and continuous random variables, limit theorems, etc.&lt;br /&gt;
* &#039;&#039;&#039;Probability and computing&#039;&#039;&#039;: concentration of measure, the probabilistic method, and selected topics on discrete stochastic processes&lt;br /&gt;
* &#039;&#039;&#039;Mathematical statistics&#039;&#039;&#039;: concepts of statistical inference such as parameter estimation, hypothesis testing, Bayesian estimation, and linear regression&lt;br /&gt;
&lt;br /&gt;
For the first two parts, students are expected to master the basic concepts, to understand the key phenomena and laws together with the principles behind them, and to apply the methods flexibly to solve related problems. For the third part, students are expected to be familiar with the basic concepts of mathematical statistics and with typical statistical models and inference problems.&lt;br /&gt;
&lt;br /&gt;
Through this course, students should become fluent in the language of probability, learn to use probabilistic thinking to understand and model the real world, and command the mathematical tools of probability to analyze and solve problems in their own field.&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论（第2版·修订版），[美]伯特瑟卡斯（Dimitri P.Bertsekas）[美]齐齐克利斯（John N.Tsitsiklis）著，郑忠国 童行伟 译，人民邮电出版社（2022）。&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* Course grade: there will be several homework assignments and one final exam. The final grade is a combination of the homework scores and the final exam score.&lt;br /&gt;
* Late submission: if, for a special reason, you cannot finish an assignment on time, contact the instructor in advance with a valid justification; otherwise, late submissions will not be accepted.&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
Academic integrity is the most basic professional and ethical baseline for every student and scholar engaged in academic work. This course will spare no effort in upholding the standards of academic integrity, and violations of this baseline will not be tolerated.&lt;br /&gt;
&lt;br /&gt;
Principles for completing assignments: work bearing your name must be your own contribution. Discussion is allowed while working on an assignment, provided that all participants are at a comparable stage of completion. However, the execution of key ideas and the writing of the submitted text must be done independently, and everyone who took part in the discussion must be acknowledged in the submission. Discussion and acknowledgment that follow these rules will not affect your score. No other form of collaboration is allowed, in particular no “discussion” with students who have already finished the assignment.&lt;br /&gt;
&lt;br /&gt;
This course takes a zero-tolerance attitude toward plagiarism. When completing assignments, direct copying of text from the work of others (publications, internet sources, the assignments of other students, etc.), as well as copying of key ideas or key elements, is regarded as plagiarism according to the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]. Plagiarists will have their grades cancelled. If mutual copying is discovered, &amp;lt;font color=red&amp;gt; the grades of both the copying party and the copied party will be cancelled&amp;lt;/font&amp;gt;. Please therefore take the initiative to prevent your own work from being copied by others.&lt;br /&gt;
&lt;br /&gt;
Academic integrity concerns both the personal character of each student and the proper functioning of the whole educational system. Committing academic misconduct for a few points not only makes you a cheater, but also devalues the honest efforts of others. Let us work together to maintain an environment of integrity.&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
TBD&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Intro.pdf Course introduction]&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12849</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12849"/>
		<updated>2025-02-16T07:30:11Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: Created page with &amp;quot;{{Infobox |name         = Infobox |bodystyle    =  |title        = &amp;lt;font size=3&amp;gt;&amp;#039;&amp;#039;&amp;#039;概率论与数理统计&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt; &amp;#039;&amp;#039;&amp;#039;Probability Theory&amp;#039;&amp;#039;&amp;#039; &amp;lt;br&amp;gt; &amp;amp; &amp;#039;&amp;#039;&amp;#039;Mathematical Statistics&amp;#039;&amp;#039;&amp;#039;&amp;lt;/font&amp;gt; |titlestyle   =   |image        =  |imagestyle   =  |caption      =  |captionstyle =  |headerstyle  = background:#ccf; |labelstyle   = background:#ddf; |datastyle    =   |header1 =Instructor |label1  =  |data1   =  |header2 =  |label2  =  |data2   = &amp;#039;&amp;#039;&amp;#039;尹一通&amp;#039;&amp;#039;&amp;#039; |header3 =  |label3  = Em...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (odd weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = TBD &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** Monday: 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
** Wednesday (odd weeks): 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* Monday, 4pm-5pm, 计算机系 804（尹一通）&lt;br /&gt;
:* Tuesday, 3pm-4pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ group&#039;&#039;&#039;: 629856946 (to join, provide your name, department, and student ID)&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
The course consists of three major parts:&lt;br /&gt;
* &#039;&#039;&#039;Classical probability theory&#039;&#039;&#039;: probability spaces, random variables and their numerical characteristics, multivariate and continuous random variables, limit theorems, etc.&lt;br /&gt;
* &#039;&#039;&#039;Probability and computing&#039;&#039;&#039;: concentration of measure, the probabilistic method, and selected topics on discrete stochastic processes&lt;br /&gt;
* &#039;&#039;&#039;Mathematical statistics&#039;&#039;&#039;: concepts of statistical inference such as parameter estimation, hypothesis testing, Bayesian estimation, and linear regression&lt;br /&gt;
&lt;br /&gt;
For the first two parts, students are expected to master the basic concepts, to understand the key phenomena and laws together with the principles behind them, and to apply the methods flexibly to solve related problems. For the third part, students are expected to be familiar with the basic concepts of mathematical statistics and with typical statistical models and inference problems.&lt;br /&gt;
&lt;br /&gt;
Through this course, students should become fluent in the language of probability, learn to use probabilistic thinking to understand and model the real world, and command the mathematical tools of probability to analyze and solve problems in their own field.&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论（第2版·修订版），[美]伯特瑟卡斯（Dimitri P.Bertsekas）[美]齐齐克利斯（John N.Tsitsiklis）著，郑忠国 童行伟 译，人民邮电出版社（2022）。&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* Course grade: there will be several homework assignments and one final exam. The final grade is a combination of the homework scores and the final exam score.&lt;br /&gt;
* Late submission: if, for a special reason, you cannot finish an assignment on time, contact the instructor in advance with a valid justification; otherwise, late submissions will not be accepted.&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
Academic integrity is the most basic professional and ethical baseline for every student and scholar engaged in academic work. This course will spare no effort in upholding the standards of academic integrity, and violations of this baseline will not be tolerated.&lt;br /&gt;
&lt;br /&gt;
Principles for completing assignments: work bearing your name must be your own contribution. Discussion is allowed while working on an assignment, provided that all participants are at a comparable stage of completion. However, the execution of key ideas and the writing of the submitted text must be done independently, and everyone who took part in the discussion must be acknowledged in the submission. Discussion and acknowledgment that follow these rules will not affect your score. No other form of collaboration is allowed, in particular no “discussion” with students who have already finished the assignment.&lt;br /&gt;
&lt;br /&gt;
This course takes a zero-tolerance attitude toward plagiarism. When completing assignments, direct copying of text from the work of others (publications, internet sources, the assignments of other students, etc.), as well as copying of key ideas or key elements, is regarded as plagiarism according to the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]. Plagiarists will have their grades cancelled. If mutual copying is discovered, &amp;lt;font color=red&amp;gt; the grades of both the copying party and the copied party will be cancelled&amp;lt;/font&amp;gt;. Please therefore take the initiative to prevent your own work from being copied by others.&lt;br /&gt;
&lt;br /&gt;
Academic integrity concerns both the personal character of each student and the proper functioning of the whole educational system. Committing academic misconduct for a few points not only makes you a cheater, but also devalues the honest efforts of others. Let us work together to maintain an environment of integrity.&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
TBD&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Intro.pdf Course introduction]&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=Main_Page&amp;diff=12847</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=Main_Page&amp;diff=12847"/>
		<updated>2025-02-16T07:26:07Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Home Pages for Courses and Seminars */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This is a course/seminar wiki run by the [http://tcs.nju.edu.cn theory group] in the Department of Computer Science and Technology at Nanjing University.&lt;br /&gt;
&lt;br /&gt;
== Home Pages for Courses and Seminars==&lt;br /&gt;
*[[高级算法 (Fall 2024)|高级算法 Advanced Algorithms (Fall 2024)]]&lt;br /&gt;
&lt;br /&gt;
*[[高级算法 (Spring 2025)|高级算法 Advanced Algorithms (Spring 2025 苏州校区)]]&lt;br /&gt;
&lt;br /&gt;
* [[概率论与数理统计 (Spring 2025) | 概率论与数理统计 Probability Theory (Spring 2025)]]&lt;br /&gt;
&lt;br /&gt;
*[[Theory Seminar|理论计算机科学讨论班]]&lt;br /&gt;
&lt;br /&gt;
*[[Study Group|理论计算机科学学习小组]]&lt;br /&gt;
&lt;br /&gt;
;Past courses&lt;br /&gt;
&lt;br /&gt;
* Advanced Algorithms: [[高级算法 (Fall 2024)|Fall 2024]], [[高级算法 (Fall 2023)|Fall 2023]], [[高级算法 (Fall 2022)|Fall 2022]], [[高级算法 (Fall 2021)|Fall 2021]], [[高级算法 (Fall 2020)|Fall 2020]], [[高级算法 (Fall 2019)|Fall 2019]], [[高级算法 (Fall 2018)|Fall 2018]], [[高级算法 (Fall 2017)|Fall 2017]], [[随机算法 \ 高级算法 (Fall 2016)|Fall 2016]].&lt;br /&gt;
&lt;br /&gt;
*Algorithm Design and Analysis: [https://tcs.nju.edu.cn/shili/courses/2024spring-algo/ Spring 2024]&lt;br /&gt;
&lt;br /&gt;
* Combinatorics: [[组合数学 (Spring 2024)|Spring 2024]], [[组合数学 (Spring 2023)|Spring 2023]], [[组合数学 (Fall 2019)|Fall 2019]], [[组合数学 (Fall 2017)|Fall 2017]], [[组合数学 (Fall 2016)|Fall 2016]], [[组合数学 (Fall 2015)|Fall 2015]], [[组合数学 (Spring 2014)|Spring 2014]], [[组合数学 (Spring 2013)|Spring 2013]], [[组合数学 (Fall 2011)|Fall 2011]], [[Combinatorics (Fall 2010)|Fall 2010]].&lt;br /&gt;
&lt;br /&gt;
* Computational Complexity: [[计算复杂性 (Spring 2025)|Spring 2025]], [[计算复杂性 (Spring 2024)|Spring 2024]], [[计算复杂性 (Spring 2023)|Spring 2023]], [[计算复杂性 (Fall 2019)|Fall 2019]], [[计算复杂性 (Fall 2018)|Fall 2018]].&lt;br /&gt;
&lt;br /&gt;
* Numerical Method: [[计算方法 Numerical method (Spring 2024)|Spring 2024]], [[计算方法 Numerical method (Spring 2023)|Spring 2023]], [https://liuexp.github.io/numerical.html Spring 2022].&lt;br /&gt;
&lt;br /&gt;
* Probability Theory: [[概率论与数理统计 (Spring 2024)|Spring 2024]], [[概率论与数理统计 (Spring 2023)|Spring 2023]].&lt;br /&gt;
&lt;br /&gt;
* Quantum Computation: [[量子计算 (Spring 2022)|Spring 2022]], [[量子计算 (Spring 2021)|Spring 2021]], [[量子计算 (Fall 2019)|Fall 2019]].&lt;br /&gt;
&lt;br /&gt;
* Randomized Algorithms:  [[随机算法 (Fall 2015)|Fall 2015]], [[随机算法 (Spring 2014)|Spring 2014]], [[随机算法 (Spring 2013)|Spring 2013]], [[随机算法 (Fall 2011)|Fall 2011]], [[Randomized Algorithms (Spring 2010)|Spring 2010]].&lt;br /&gt;
&lt;br /&gt;
;Past seminars, workshops and summer schools&lt;br /&gt;
*计算理论之美暑期学校: [[计算理论之美 (Summer 2024)|2024]], [[计算理论之美 (Summer 2023)|2023]], [[计算理论之美 (Summer 2021)|2021]]&lt;br /&gt;
*[[TCSPhD2020| 理论计算机科学优秀博士生论坛2020]]&lt;br /&gt;
*[[Quantum|量子算法与物理实现研讨会]]&lt;br /&gt;
*Nanjing Theory Day: [[Theory@Nanjing 2019|2019]], [[Theory@Nanjing 2018|2018]], [[Theory@Nanjing 2017|2017]]&lt;br /&gt;
*[[\Delta Seminar on Logic, Philosophy, and Computer Science|Δ Seminar on Logic, Philosophy, and Computer Science]]&lt;br /&gt;
*[[近似算法讨论班 (Fall 2011)|近似算法 Approximation Algorithms, Fall 2011.]]&lt;br /&gt;
&lt;br /&gt;
; Other links&lt;br /&gt;
* [[General Circulation(Fall 2024)|大气环流 General Circulation of the Atmosphere, Fall 2024]]&lt;br /&gt;
* [[General Circulation(Fall 2023)|大气环流 General Circulation of the Atmosphere, Fall 2023]]&lt;br /&gt;
&lt;br /&gt;
* [[概率论 (Summer 2014)| 概率与计算 (上海交大 Summer 2014)]]&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=Main_Page&amp;diff=12846</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=Main_Page&amp;diff=12846"/>
		<updated>2025-02-16T07:25:41Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Home Pages for Courses and Seminars */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This is a course/seminar wiki run by the [http://tcs.nju.edu.cn theory group] in the Department of Computer Science and Technology at Nanjing University.&lt;br /&gt;
&lt;br /&gt;
== Home Pages for Courses and Seminars==&lt;br /&gt;
*[[高级算法 (Fall 2024)|高级算法 Advanced Algorithms (Fall 2024)]]&lt;br /&gt;
&lt;br /&gt;
*[[高级算法 (Spring 2025)|高级算法 Advanced Algorithms (Spring 2025 苏州校区)]]&lt;br /&gt;
&lt;br /&gt;
* [[概率论与数理统计 (Spring 2025) | 概率论与数理统计 (Spring 2025)]]&lt;br /&gt;
&lt;br /&gt;
*[[Theory Seminar|理论计算机科学讨论班]]&lt;br /&gt;
&lt;br /&gt;
*[[Study Group|理论计算机科学学习小组]]&lt;br /&gt;
&lt;br /&gt;
;Past courses&lt;br /&gt;
&lt;br /&gt;
* Advanced Algorithms: [[高级算法 (Fall 2024)|Fall 2024]], [[高级算法 (Fall 2023)|Fall 2023]], [[高级算法 (Fall 2022)|Fall 2022]], [[高级算法 (Fall 2021)|Fall 2021]], [[高级算法 (Fall 2020)|Fall 2020]], [[高级算法 (Fall 2019)|Fall 2019]], [[高级算法 (Fall 2018)|Fall 2018]], [[高级算法 (Fall 2017)|Fall 2017]], [[随机算法 \ 高级算法 (Fall 2016)|Fall 2016]].&lt;br /&gt;
&lt;br /&gt;
*Algorithm Design and Analysis: [https://tcs.nju.edu.cn/shili/courses/2024spring-algo/ Spring 2024]&lt;br /&gt;
&lt;br /&gt;
* Combinatorics: [[组合数学 (Spring 2024)|Spring 2024]], [[组合数学 (Spring 2023)|Spring 2023]], [[组合数学 (Fall 2019)|Fall 2019]], [[组合数学 (Fall 2017)|Fall 2017]], [[组合数学 (Fall 2016)|Fall 2016]], [[组合数学 (Fall 2015)|Fall 2015]], [[组合数学 (Spring 2014)|Spring 2014]], [[组合数学 (Spring 2013)|Spring 2013]], [[组合数学 (Fall 2011)|Fall 2011]], [[Combinatorics (Fall 2010)|Fall 2010]].&lt;br /&gt;
&lt;br /&gt;
* Computational Complexity: [[计算复杂性 (Spring 2025)|Spring 2025]], [[计算复杂性 (Spring 2024)|Spring 2024]], [[计算复杂性 (Spring 2023)|Spring 2023]], [[计算复杂性 (Fall 2019)|Fall 2019]], [[计算复杂性 (Fall 2018)|Fall 2018]].&lt;br /&gt;
&lt;br /&gt;
* Numerical Method: [[计算方法 Numerical method (Spring 2024)|Spring 2024]], [[计算方法 Numerical method (Spring 2023)|Spring 2023]], [https://liuexp.github.io/numerical.html Spring 2022].&lt;br /&gt;
&lt;br /&gt;
* Probability Theory: [[概率论与数理统计 (Spring 2024)|Spring 2024]], [[概率论与数理统计 (Spring 2023)|Spring 2023]].&lt;br /&gt;
&lt;br /&gt;
* Quantum Computation: [[量子计算 (Spring 2022)|Spring 2022]], [[量子计算 (Spring 2021)|Spring 2021]], [[量子计算 (Fall 2019)|Fall 2019]].&lt;br /&gt;
&lt;br /&gt;
* Randomized Algorithms:  [[随机算法 (Fall 2015)|Fall 2015]], [[随机算法 (Spring 2014)|Spring 2014]], [[随机算法 (Spring 2013)|Spring 2013]], [[随机算法 (Fall 2011)|Fall 2011]], [[Randomized Algorithms (Spring 2010)|Spring 2010]].&lt;br /&gt;
&lt;br /&gt;
;Past seminars, workshops and summer schools&lt;br /&gt;
*计算理论之美暑期学校: [[计算理论之美 (Summer 2024)|2024]], [[计算理论之美 (Summer 2023)|2023]], [[计算理论之美 (Summer 2021)|2021]]&lt;br /&gt;
*[[TCSPhD2020| 理论计算机科学优秀博士生论坛2020]]&lt;br /&gt;
*[[Quantum|量子算法与物理实现研讨会]]&lt;br /&gt;
*Nanjing Theory Day: [[Theory@Nanjing 2019|2019]], [[Theory@Nanjing 2018|2018]], [[Theory@Nanjing 2017|2017]]&lt;br /&gt;
*[[\Delta Seminar on Logic, Philosophy, and Computer Science|Δ Seminar on Logic, Philosophy, and Computer Science]]&lt;br /&gt;
*[[近似算法讨论班 (Fall 2011)|近似算法 Approximation Algorithms, Fall 2011.]]&lt;br /&gt;
&lt;br /&gt;
; Other links&lt;br /&gt;
* [[General Circulation(Fall 2024)|大气环流 General Circulation of the Atmosphere, Fall 2024]]&lt;br /&gt;
* [[General Circulation(Fall 2023)|大气环流 General Circulation of the Atmosphere, Fall 2023]]&lt;br /&gt;
&lt;br /&gt;
* [[概率论 (Summer 2014)| 概率与计算 (上海交大 Summer 2014)]]&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12488</id>
		<title>概率论与数理统计 (Spring 2024)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12488"/>
		<updated>2024-06-03T07:26:13Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (LLN and CLT, 15 points + 10 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every problem must be answered with a complete derivation; solutions may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, etc.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are additional problems (optional).&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
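A minimal Monte Carlo sketch in Python (illustrative only, not part of the problem; the trial count is an arbitrary assumption) for checking an answer numerically:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
# Estimate the expected number of peripheral points among n uniform points in the unit square.&lt;br /&gt;
def expected_peripheral(n, trials=20000):&lt;br /&gt;
    total = 0&lt;br /&gt;
    for _ in range(trials):&lt;br /&gt;
        pts = [(random.random(), random.random()) for _ in range(n)]&lt;br /&gt;
        for (xi, yi) in pts:&lt;br /&gt;
            # P_i is peripheral if every point (X_r, Y_r) satisfies X_r &amp;lt;= X_i or Y_r &amp;lt;= Y_i.&lt;br /&gt;
            if all(xr &amp;lt;= xi or yr &amp;lt;= yi for (xr, yr) in pts):&lt;br /&gt;
                total += 1&lt;br /&gt;
    return total / trials&lt;br /&gt;
&lt;br /&gt;
print(expected_peripheral(5))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;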
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density functions of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt;, respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039;  a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
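A small Python simulation sketch (illustrative only; the default number of trials is an arbitrary assumption) for estimating the expected return value empirically:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
# Run the process once for a given U and report the final value of count.&lt;br /&gt;
def run_once(U):&lt;br /&gt;
    x, count = 0.0, 0&lt;br /&gt;
    while x &amp;lt; U:&lt;br /&gt;
        x += random.random()   # y drawn uniformly from [0,1), standing in for (0,1)&lt;br /&gt;
        count += 1&lt;br /&gt;
    return count&lt;br /&gt;
&lt;br /&gt;
# Average the return value over many independent runs.&lt;br /&gt;
def expected_count(U, trials=100000):&lt;br /&gt;
    return sum(run_once(U) for _ in range(trials)) / trials&lt;br /&gt;
&lt;br /&gt;
print(expected_count(0.5))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;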
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
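A Monte Carlo sketch in Python (illustrative only; the rejection sampler and the trial count are assumptions made here) for estimating this probability numerically:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math, random&lt;br /&gt;
&lt;br /&gt;
# Rejection-sample one point uniformly from the unit disk.&lt;br /&gt;
def point_in_disk():&lt;br /&gt;
    while True:&lt;br /&gt;
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)&lt;br /&gt;
        if x * x + y * y &amp;lt;= 1:&lt;br /&gt;
            return x, y&lt;br /&gt;
&lt;br /&gt;
# All points lie in some closed semicircle iff the largest angular gap is at least pi.&lt;br /&gt;
def in_some_semicircle(points):&lt;br /&gt;
    angles = sorted(math.atan2(y, x) for x, y in points)&lt;br /&gt;
    gaps = [b - a for a, b in zip(angles, angles[1:])]&lt;br /&gt;
    gaps.append(2 * math.pi - (angles[-1] - angles[0]))   # wrap-around gap&lt;br /&gt;
    return max(gaps) &amp;gt;= math.pi&lt;br /&gt;
&lt;br /&gt;
def estimate(n, trials=100000):&lt;br /&gt;
    hits = sum(in_some_semicircle([point_in_disk() for _ in range(n)]) for _ in range(trials))&lt;br /&gt;
    return hits / trials&lt;br /&gt;
&lt;br /&gt;
print(estimate(3))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;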
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
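A numeric sketch in Python (illustrative only; taking X to be standard normal and the grids over t and k are assumptions made here, not part of the problem) that evaluates both bounds on one concrete example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math&lt;br /&gt;
&lt;br /&gt;
# For X standard normal: E[exp(t|X|)] = 2*exp(t^2/2)*Phi(t) and E[|X|^k] = 2^(k/2)*Gamma((k+1)/2)/sqrt(pi).&lt;br /&gt;
def phi(t):   # standard normal CDF&lt;br /&gt;
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))&lt;br /&gt;
&lt;br /&gt;
def chernoff(delta):   # minimize E[exp(t|X|)] / exp(t*delta) over a grid of t in [0, 10]&lt;br /&gt;
    return min(2 * math.exp(t * t / 2) * phi(t) * math.exp(-t * delta)&lt;br /&gt;
               for t in (i / 100 for i in range(1001)))&lt;br /&gt;
&lt;br /&gt;
def moment_bound(delta, k):&lt;br /&gt;
    return 2 ** (k / 2) * math.gamma((k + 1) / 2) / math.sqrt(math.pi) / delta ** k&lt;br /&gt;
&lt;br /&gt;
delta = 3.0&lt;br /&gt;
print(chernoff(delta))                                    # Chernoff bound at delta&lt;br /&gt;
print(min(moment_bound(delta, k) for k in range(1, 41)))  # best k-th moment bound at delta&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;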
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Modes of Convergence and Characteristic Function, 15 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Characteristic Function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is a discrete random variable taking only integer values and &amp;lt;math&amp;gt;\mathbf{E}[X]=0&amp;lt;/math&amp;gt;. Suppose furthermore that there is no arithmetic progression &amp;lt;math&amp;gt;a+q\mathbb{Z}&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;q&amp;gt;1&amp;lt;/math&amp;gt; in which &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; takes its values almost surely. Prove that &amp;lt;math&amp;gt;|\phi_X(t)|&amp;lt;1&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;0 &amp;lt; t \le \pi&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\phi_X(t)=\mathbf{E}[\mathrm{e}^{itX}]&amp;lt;/math&amp;gt; is the characteristic function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 10 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the results of Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT) to solve the problems.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be i.i.d. copies of a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with the standard Cauchy distribution (i.e., the probability density function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;\frac{1}{\pi} \frac{1}{1+x^2}&amp;lt;/math&amp;gt;). Show that &amp;lt;math&amp;gt;\frac{S_n}{n \log n}&amp;lt;/math&amp;gt; converges in probability to the constant &amp;lt;math&amp;gt;\frac{2}{\pi}&amp;lt;/math&amp;gt; but is almost surely unbounded, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n |X_i|&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
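A simulation sketch in Python for the game described above (illustrative only; the choice of n is an arbitrary assumption) that lets one compare S_n/(n log_2 n) with the claimed limit empirically:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math, random&lt;br /&gt;
&lt;br /&gt;
# One round: toss a fair coin until the first head; k tosses pay 2^k dollars.&lt;br /&gt;
def one_round():&lt;br /&gt;
    k = 1&lt;br /&gt;
    while random.random() &amp;lt; 0.5:   # tails with probability 1/2&lt;br /&gt;
        k += 1&lt;br /&gt;
    return 2 ** k&lt;br /&gt;
&lt;br /&gt;
# Total winnings S_n over n rounds, scaled by n * log2(n).&lt;br /&gt;
def scaled_winnings(n):&lt;br /&gt;
    s_n = sum(one_round() for _ in range(n))&lt;br /&gt;
    return s_n / (n * math.log2(n))&lt;br /&gt;
&lt;br /&gt;
print(scaled_winnings(10 ** 6))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;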
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;)&lt;br /&gt;
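A short Python sketch of this estimator (illustrative only; the test integrand and the sample size are assumptions made here):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math, random&lt;br /&gt;
&lt;br /&gt;
# Monte Carlo estimate of the integral of f over [0,1] from n uniform samples.&lt;br /&gt;
def mc_integrate(f, n=100000):&lt;br /&gt;
    return sum(f(random.random()) for _ in range(n)) / n&lt;br /&gt;
&lt;br /&gt;
# Example integrand: sin(pi*x), whose integral over [0,1] is 2/pi.&lt;br /&gt;
print(mc_integrate(lambda x: math.sin(math.pi * x)))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;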
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show &amp;lt;math&amp;gt;2(\sqrt{S_n} - \sqrt{n}) \overset{D}{\to} \sigma N(0,1)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the statement in Problem 3.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12468</id>
		<title>概率论与数理统计 (Spring 2024)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12468"/>
		<updated>2024-05-29T04:02:01Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (LLN and CLT, 15 points + 10 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every problem must be answered with a complete solution process; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are optional extra-credit problems.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density functions of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt;, respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039;  a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Modes of Convergence and Characteristic Function, 15 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Characteristic Function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is a discrete random variable taking only integer values and &amp;lt;math&amp;gt;\mathbf{E}[X]=0&amp;lt;/math&amp;gt;. Suppose furthermore that there is no arithmetic progression &amp;lt;math&amp;gt;a+q\mathbb{Z}&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;q&amp;gt;1&amp;lt;/math&amp;gt; in which &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; takes its values almost surely. Prove that &amp;lt;math&amp;gt;|\phi_X(t)|&amp;lt;1&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;0 &amp;lt; t \le \pi&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\phi_X(t)=\mathbf{E}[\mathrm{e}^{itX}]&amp;lt;/math&amp;gt; is the characteristic function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 10 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the results of Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT) to solve the problems.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be i.i.d. copies of a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with the standard Cauchy distribution (i.e., the probability density function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;\frac{1}{\pi} \frac{1}{1+x^2}&amp;lt;/math&amp;gt;). Show that &amp;lt;math&amp;gt;\frac{S_n}{n \log n}&amp;lt;/math&amp;gt; converges in probability to the constant &amp;lt;math&amp;gt;\frac{2}{\pi}&amp;lt;/math&amp;gt; but is almost surely unbounded, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n |X_i|&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show &amp;lt;math&amp;gt;2(\sqrt{S_n} - \sqrt{n}) \overset{D}{\to} \sigma N(0,1)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the statement in Problem 3.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)&amp;diff=12450</id>
		<title>概率论与数理统计 (Spring 2024)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)&amp;diff=12450"/>
		<updated>2024-05-25T05:48:55Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 2pm-4pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (even weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Monday, 4pm-5pm &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2024. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** Monday: 2pm-4pm, 仙Ⅰ-204&lt;br /&gt;
** Wednesday (even weeks): 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* Monday, 4pm-5pm, 计算机系 804（尹一通）&lt;br /&gt;
:* Tuesday, 3pm-4pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ group&#039;&#039;&#039;: 629856946 (provide your name, department, and student ID when requesting to join)&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
The course content is organized into three main parts:&lt;br /&gt;
* &#039;&#039;&#039;Classical probability theory&#039;&#039;&#039;: probability spaces, random variables and their numerical characteristics, multivariate and continuous random variables, limit theorems, and related topics&lt;br /&gt;
* &#039;&#039;&#039;Probability and computing&#039;&#039;&#039;: concentration of measure, the probabilistic method, and selected topics on discrete stochastic processes&lt;br /&gt;
* &#039;&#039;&#039;Mathematical statistics&#039;&#039;&#039;: parameter estimation, hypothesis testing, Bayesian estimation, linear regression, and other concepts of statistical inference&lt;br /&gt;
&lt;br /&gt;
For the first two parts, students are expected to master the basic concepts, to understand the key phenomena and laws together with the principles behind them, and to apply the methods flexibly to solve related problems. For the third part, students are expected to be familiar with the basic concepts of mathematical statistics as well as with typical statistical models and inference problems.&lt;br /&gt;
&lt;br /&gt;
Through this course, students should become fluent in the language of probability, learn to use probabilistic thinking to understand and model the real world, and be able to command the mathematical tools of probability to analyze and solve problems in their own field.&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论（第2版·修订版），[美]伯特瑟卡斯（Dimitri P.Bertsekas）[美]齐齐克利斯（John N.Tsitsiklis）著，郑忠国 童行伟 译，人民邮电出版社（2022）。&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* Course grade: there will be several homework assignments and one final exam. The final grade will be determined by a combination of the homework scores and the final-exam score.&lt;br /&gt;
* Late submission: if you cannot finish an assignment on time for a special reason, contact the instructor in advance and give a valid justification; otherwise, late homework will not be accepted.&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
Academic integrity is the most basic professional and ethical baseline for every student and scholar engaged in academic work. This course will spare no effort in upholding the standards of academic integrity, and behavior that crosses this baseline will not be tolerated.&lt;br /&gt;
&lt;br /&gt;
Principles for completing homework: the work that bears your name must be your own contribution. Discussion is allowed while working on the assignments, provided that all participants in the discussion are at a comparable stage of completion. However, carrying out the key ideas and writing the text of the assignment must be done independently, and everyone who took part in the discussion must be acknowledged in the assignment. Discussion and acknowledgement that follow these rules will not affect your score. No other form of collaboration is allowed, in particular no “discussion” with classmates who have already finished the assignment.&lt;br /&gt;
&lt;br /&gt;
This course takes a zero-tolerance attitude toward plagiarism. When completing assignments, direct copying of text from the work of others (publications, internet material, the homework of other students, etc.), as well as copying of key ideas or key elements, will be regarded as plagiarism, following the interpretation of the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]. A plagiarist&#039;s grade will be cancelled. If mutual copying is discovered, &amp;lt;font color=red&amp;gt; the grades of both the copying and the copied parties will be cancelled&amp;lt;/font&amp;gt;. Please therefore take the initiative to prevent your own homework from being copied by others.&lt;br /&gt;
&lt;br /&gt;
Academic integrity concerns each student&#039;s personal character as well as the proper functioning of the whole educational system. Committing academic misconduct for a few points not only turns you into a cheater but also renders the honest efforts of others meaningless. Let us work together to maintain an environment of integrity.&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2024)/Problem Set 1|Problem Set 1]]  Please submit to [mailto:pr2024_nju@163.com pr2024_nju@163.com] before class (2pm UTC+8) on &amp;lt;strike&amp;gt;2024/4/3&amp;lt;/strike&amp;gt;&amp;lt;font color=red&amp;gt;2024/4/8&amp;lt;/font&amp;gt; (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039;). &lt;br /&gt;
** [[概率论与数理统计 (Spring 2024)/第一次作业提交名单|Submission list for Problem Set 1]]&lt;br /&gt;
*[[概率论与数理统计 (Spring 2024)/Problem Set 2|Problem Set 2]]  Please submit to [mailto:pr2024_nju@163.com pr2024_nju@163.com] before class (2pm UTC+8) on &amp;lt;strike&amp;gt;2024/4/29&amp;lt;/strike&amp;gt;&amp;lt;font color=red&amp;gt;2024/5/6&amp;lt;/font&amp;gt; (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A2.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2024)/第二次作业提交名单|Submission list for Problem Set 2]]&lt;br /&gt;
*[[概率论与数理统计 (Spring 2024)/Problem Set 3|Problem Set 3]]  Please submit to [mailto:pr2024_nju@163.com pr2024_nju@163.com] before class (2pm UTC+8) on 2024/5/20 (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A3.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2024)/第三次作业提交名单|Submission list for Problem Set 3]]&lt;br /&gt;
*[[概率论与数理统计 (Spring 2024)/Problem Set 4|Problem Set 4]]  Please submit to [mailto:pr2024_nju@163.com pr2024_nju@163.com] before class (10am UTC+8) on 2024/6/12 (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A4.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Intro.pdf Course introduction]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/ProbSpace.pdf Probability spaces]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Karger&#039;s min-cut algorithm| Karger&#039;s min-cut algorithm]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/RandVar.pdf Random variables]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 2&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 2, Sections 3.1~3.5, 3.7&#039;&#039;&#039;&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[MU] Chapter 2&#039;&#039;&#039;&lt;br /&gt;
#* the [http://tcs.nju.edu.cn/slides/prob2024/discrete-pmf.nb &#039;&#039;&#039;Mathematica notebook file&#039;&#039;&#039;] for the PMFs of basic discrete distributions&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Average-case analysis of QuickSort|Average-case analysis of &#039;&#039;&#039;&#039;&#039;QuickSort&#039;&#039;&#039;&#039;&#039;]]&lt;br /&gt;
#* [https://www.bilibili.com/video/BV1ta411A7fp/ Galton board video] and the [https://en.wikipedia.org/wiki/Galton_board Wikipedia page]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Deviation.pdf Moments and deviations]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[MU] Chapter 3&#039;&#039;&#039;&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Sections 2.4, 4.2, 4.3, 5.1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Sections 3.3, 3.6, 7.3&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Two-point sampling|Two-point sampling]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Threshold of k-clique in random graph|Threshold of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-clique in random graph]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Weierstrass Approximation Theorem|Weierstrass approximation]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Continuous.pdf Continuous distributions]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 3 and Section 4.1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 4&#039;&#039;&#039;&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[MU] Chapters 8, 9&#039;&#039;&#039;&lt;br /&gt;
#* [https://measure.axler.net/MIRA.pdf Measure, Integration &amp;amp; Real Analysis] by Sheldon Axler&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Convergence.pdf Limit theorems]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 5&#039;&#039;&#039; &lt;br /&gt;
#* Reading: &#039;&#039;&#039;[GS] Sections 5.7~5.10, 7.1~7.5&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)&amp;diff=12449</id>
		<title>概率论与数理统计 (Spring 2024)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)&amp;diff=12449"/>
		<updated>2024-05-24T15:34:55Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 2pm-4pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (even weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Monday, 4pm-5pm &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2024. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** Monday: 2pm-4pm, 仙Ⅰ-204&lt;br /&gt;
** Wednesday (even weeks): 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* Monday, 4pm-5pm, 计算机系 804（尹一通）&lt;br /&gt;
:* Tuesday, 3pm-4pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ group&#039;&#039;&#039;: 629856946 (to request to join, provide your name, department, and student ID)&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
The course content is organized into three parts:&lt;br /&gt;
* &#039;&#039;&#039;Classical probability theory&#039;&#039;&#039;: probability spaces, random variables and their numerical characteristics, multivariate and continuous random variables, limit theorems, etc.&lt;br /&gt;
* &#039;&#039;&#039;Probability and computing&#039;&#039;&#039;: concentration of measure, the probabilistic method, and selected topics on discrete stochastic processes&lt;br /&gt;
* &#039;&#039;&#039;Mathematical statistics&#039;&#039;&#039;: statistical inference topics such as parameter estimation, hypothesis testing, Bayesian estimation, and linear regression&lt;br /&gt;
&lt;br /&gt;
For the first two parts, students are expected to have a clear command of the basic concepts, a deep understanding of the key phenomena and laws and the principles behind them, and the ability to apply the methods flexibly to solve related problems. For the third part, students are expected to be familiar with the basic concepts of mathematical statistics and with typical statistical models and inference problems.&lt;br /&gt;
&lt;br /&gt;
Through the training in this course, students should become fluent in the language of probability, be able to use probabilistic thinking to understand and model the real world, and command the mathematical tools of probability to analyze and solve problems in their own field.&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论（第2版·修订版），[美]伯特瑟卡斯（Dimitri P.Bertsekas）[美]齐齐克利斯（John N.Tsitsiklis）著，郑忠国 童行伟 译，人民邮电出版社（2022）。&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* Course grade: this course will have several homework assignments and one final exam. The final grade is determined by combining the homework scores with the final exam score.&lt;br /&gt;
* Late submission: if, for a special reason, you cannot finish an assignment on time, contact the instructor in advance and give a valid justification; otherwise, late homework will not be accepted.&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
Academic integrity is the most basic professional and ethical baseline for every student and scholar engaged in academic work. This course will spare no effort to uphold the norms of academic integrity, and violations of this baseline will not be tolerated.&lt;br /&gt;
&lt;br /&gt;
Principle for completing homework: work bearing your name must be your own contribution. Discussion is allowed while working on the assignments, provided that all participants are at a comparable stage of completion. However, the execution of the key ideas and the writing of the submitted text must be done independently, and everyone who took part in the discussion must be acknowledged in the assignment. Discussion and acknowledgement that follow these rules will not affect your score. No other form of collaboration is allowed, in particular no &amp;quot;discussion&amp;quot; with classmates who have already finished the assignment.&lt;br /&gt;
&lt;br /&gt;
This course takes a zero-tolerance stance toward plagiarism. When completing the assignments, directly copying text from others&#039; work (publications, online materials, other students&#039; homework, etc.), as well as copying key ideas or key elements, is regarded as plagiarism, following the interpretation of the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]. A plagiarist&#039;s grade will be voided. If mutual copying is found, &amp;lt;font color=red&amp;gt;the grades of both the copying and the copied students will be voided&amp;lt;/font&amp;gt;. Please therefore take care to prevent your homework from being copied by others.&lt;br /&gt;
&lt;br /&gt;
Academic integrity concerns a student&#039;s personal character and also the proper functioning of the whole educational system. Committing academic misconduct for a few points not only turns you into a cheater, it also makes the honest effort of others meaningless. Let us work together to maintain an environment of integrity.&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2024)/Problem Set 1|Problem Set 1]]  Please submit before class (2pm UTC+8) on &amp;lt;strike&amp;gt;2024/4/3&amp;lt;/strike&amp;gt;&amp;lt;font color=red&amp;gt;2024/4/8&amp;lt;/font&amp;gt; to [mailto:pr2024_nju@163.com pr2024_nju@163.com] (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039;). &lt;br /&gt;
** [[概率论与数理统计 (Spring 2024)/第一次作业提交名单|Submission list for Problem Set 1]]&lt;br /&gt;
*[[概率论与数理统计 (Spring 2024)/Problem Set 2|Problem Set 2]]  Please submit before class (2pm UTC+8) on &amp;lt;strike&amp;gt;2024/4/29&amp;lt;/strike&amp;gt;&amp;lt;font color=red&amp;gt;2024/5/6&amp;lt;/font&amp;gt; to [mailto:pr2024_nju@163.com pr2024_nju@163.com] (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A2.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2024)/第二次作业提交名单|Submission list for Problem Set 2]]&lt;br /&gt;
*[[概率论与数理统计 (Spring 2024)/Problem Set 3|Problem Set 3]]  Please submit before class (2pm UTC+8) on 2024/5/20 to [mailto:pr2024_nju@163.com pr2024_nju@163.com] (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A3.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2024)/第三次作业提交名单|Submission list for Problem Set 3]]&lt;br /&gt;
*[[概率论与数理统计 (Spring 2024)/Problem Set 4|Problem Set 4]]  Please submit before class (2pm UTC+8) on 2024/6/10 to [mailto:pr2024_nju@163.com pr2024_nju@163.com] (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A4.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Intro.pdf Course introduction]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/ProbSpace.pdf Probability space]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Karger&#039;s min-cut algorithm| Karger&#039;s min-cut algorithm]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/RandVar.pdf Random variables]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 2&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 2, Sections 3.1~3.5, 3.7&#039;&#039;&#039;&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[MU] Chapter 2&#039;&#039;&#039;&lt;br /&gt;
#* the [http://tcs.nju.edu.cn/slides/prob2024/discrete-pmf.nb &#039;&#039;&#039;Mathematica notebook file&#039;&#039;&#039;] for the PMFs of basic discrete distributions&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Average-case analysis of QuickSort|Average-case analysis of &#039;&#039;&#039;&#039;&#039;QuickSort&#039;&#039;&#039;&#039;&#039;]]&lt;br /&gt;
#* [https://www.bilibili.com/video/BV1ta411A7fp/ Galton board video] and the [https://en.wikipedia.org/wiki/Galton_board Wikipedia page]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Deviation.pdf Moments and deviations]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[MU] Chapter 3&#039;&#039;&#039;&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Sections 2.4, 4.2, 4.3, 5.1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Sections 3.3, 3.6, 7.3&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Two-point sampling|Two-point sampling]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Threshold of k-clique in random graph|Threshold of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-clique in random graph]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Weierstrass Approximation Theorem|Weierstrass approximation]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Continuous.pdf Continuous distributions]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 3 and Section 4.1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 4&#039;&#039;&#039;&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[MU] Chapters 8, 9&#039;&#039;&#039;&lt;br /&gt;
#* [https://measure.axler.net/MIRA.pdf Measure, Integration &amp;amp; Real Analysis] by Sheldon Axler&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Convergence.pdf Limit theorems]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 5&#039;&#039;&#039; &lt;br /&gt;
#* Reading: &#039;&#039;&#039;[GS] Sections 5.7~5.10, 7.1~7.5&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12448</id>
		<title>概率论与数理统计 (Spring 2024)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12448"/>
		<updated>2024-05-24T15:33:12Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (LLN and CLT, 15 points + 5 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Each problem&#039;s solution must include the complete derivation; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, etc.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are additional (optional) problems.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density function of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039;  real numbers &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
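&lt;br /&gt;
A quick sanity check for this problem is to simulate the process directly; the following is a minimal Python sketch (the helper names run_process and estimate are only illustrative), whose printed estimate should be close to the analytical expectation for the chosen &amp;lt;math&amp;gt;U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Monte Carlo estimate of the expected return value of the process above.&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def run_process(U):&lt;br /&gt;
    x, count = 0.0, 0&lt;br /&gt;
    while x &amp;lt; U:&lt;br /&gt;
        x += random.random()   # y chosen uniformly at random from (0, 1)&lt;br /&gt;
        count += 1&lt;br /&gt;
    return count&lt;br /&gt;
&lt;br /&gt;
def estimate(U, trials=10**5):&lt;br /&gt;
    return sum(run_process(U) for _ in range(trials)) / trials&lt;br /&gt;
&lt;br /&gt;
print(estimate(0.5))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;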
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Modes of Convergence and Characteristic Function, 15 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Characteristic Function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is a discrete random variable only taking integer values and &amp;lt;math&amp;gt;\mathbf{E}[X]=0&amp;lt;/math&amp;gt;. Suppose furthermore that there is no infinite subprogression &amp;lt;math&amp;gt;a+q\mathbb{Z}&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;\mathbb{Z}&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;q&amp;gt;1&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; takes values almost surely in &amp;lt;math&amp;gt;a+q \mathbb{Z}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;|\phi_X(t)|&amp;lt;1&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;0 &amp;lt; t \le \pi&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\phi_X(t)=\mathbf{E}[\mathrm{e}^{itX}]&amp;lt;/math&amp;gt; is the characteristic function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 10 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the results of Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT) to solve the problems.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be i.i.d. copies of a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with the standard Cauchy distribution (i.e., the probability density function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;\frac{1}{\pi} \frac{1}{1+x^2}&amp;lt;/math&amp;gt;). Show that &amp;lt;math&amp;gt;\frac{S_n}{n \log n}&amp;lt;/math&amp;gt; converges in probability to the constant &amp;lt;math&amp;gt;\frac{2}{\pi}&amp;lt;/math&amp;gt; but is almost surely unbounded.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
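&lt;br /&gt;
The behaviour claimed for the St. Petersburg game above can also be checked empirically; the following is a minimal Python sketch (the helper names one_round and ratio are only illustrative) that plays &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds and prints &amp;lt;math&amp;gt;S_n/(n\log_2 n)&amp;lt;/math&amp;gt;, which for moderately large &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; is typically near 1, with occasional large excursions caused by the heavy tail.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Simulate n rounds of the St. Petersburg game and print S_n / (n * log2(n)).&lt;br /&gt;
import math, random&lt;br /&gt;
&lt;br /&gt;
def one_round():&lt;br /&gt;
    k = 1                            # number of tosses until the first head&lt;br /&gt;
    while random.random() &amp;lt; 0.5:     # each toss is a tail with probability 1/2&lt;br /&gt;
        k += 1&lt;br /&gt;
    return 2 ** k                    # reward of 2^k dollars&lt;br /&gt;
&lt;br /&gt;
def ratio(n):&lt;br /&gt;
    return sum(one_round() for _ in range(n)) / (n * math.log2(n))&lt;br /&gt;
&lt;br /&gt;
print(ratio(10**5))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;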
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
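&lt;br /&gt;
As an illustration of the estimator &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; above, the following minimal Python sketch (the name monte_carlo is only illustrative) uses the concrete choice &amp;lt;math&amp;gt;f(x) = x^2&amp;lt;/math&amp;gt;, whose exact integral over &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;1/3&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Monte Carlo estimate of the integral of f over [0,1] using n uniform samples.&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def monte_carlo(f, n):&lt;br /&gt;
    return sum(f(random.random()) for _ in range(n)) / n&lt;br /&gt;
&lt;br /&gt;
print(monte_carlo(lambda x: x * x, 10**5))   # typically close to 1/3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;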
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show &amp;lt;math&amp;gt;\sqrt{S_n} - \sqrt{n} \overset{D}{\to} N(0,1)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the statement in Problem 3.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12447</id>
		<title>概率论与数理统计 (Spring 2024)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12447"/>
		<updated>2024-05-24T15:09:05Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (LLN and CLT, 15 points + 5 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Each problem&#039;s solution must include the complete derivation; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, etc.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are additional (optional) problems.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density function of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find out the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039; a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
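The following is a minimal Python simulation sketch of this process (assuming only the standard random module; the trial count is an illustrative choice), which can be used to check a conjectured closed form for the expectation as a function of &amp;lt;math&amp;gt;U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def run_process(U):&lt;br /&gt;
    # add independent Uniform(0,1) draws until the running sum reaches U&lt;br /&gt;
    x, count = 0.0, 0&lt;br /&gt;
    while x &amp;lt; U:&lt;br /&gt;
        x += random.random()&lt;br /&gt;
        count += 1&lt;br /&gt;
    return count&lt;br /&gt;
&lt;br /&gt;
def expected_count(U, trials=200000):   # illustrative trial count&lt;br /&gt;
    return sum(run_process(U) for _ in range(trials)) / trials&lt;br /&gt;
&lt;br /&gt;
for U in (0.2, 0.5, 0.9):&lt;br /&gt;
    print(U, expected_count(U))   # compare with your closed-form answer in U&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;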
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
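A small Python sketch, assuming only the standard library, that estimates this probability by rejection sampling from the unit disk; the function names and trial counts are illustrative, and the run is only a numerical check of whatever closed form in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; you derive.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def sample_angles(n):&lt;br /&gt;
    # rejection-sample n uniform points in the unit disk and keep their angles&lt;br /&gt;
    angles = []&lt;br /&gt;
    while len(angles) &amp;lt; n:&lt;br /&gt;
        x, y = 2 * random.random() - 1, 2 * random.random() - 1&lt;br /&gt;
        if x * x + y * y &amp;lt;= 1:&lt;br /&gt;
            angles.append(math.atan2(y, x))&lt;br /&gt;
    return sorted(angles)&lt;br /&gt;
&lt;br /&gt;
def in_some_semicircle(angles):&lt;br /&gt;
    # the points fit in a semicircle iff some gap between consecutive angles is at least pi&lt;br /&gt;
    n = len(angles)&lt;br /&gt;
    gaps = [angles[(i + 1) % n] - angles[i] for i in range(n)]&lt;br /&gt;
    gaps[-1] += 2 * math.pi&lt;br /&gt;
    return max(gaps) &amp;gt;= math.pi&lt;br /&gt;
&lt;br /&gt;
def estimate(n, trials=100000):   # illustrative trial count&lt;br /&gt;
    return sum(in_some_semicircle(sample_angles(n)) for _ in range(trials)) / trials&lt;br /&gt;
&lt;br /&gt;
print(estimate(4))   # compare with the closed form you derive for n = 4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;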
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
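A short Python sketch, assuming only the standard library, that estimates this tail probability empirically for moderate &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;; the values of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; and the trial count are illustrative, and the output is only meant to make the claimed decay plausible.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def throws_until_n_heads(n):&lt;br /&gt;
    # number of fair-coin tosses needed to accumulate n heads&lt;br /&gt;
    throws, heads = 0, 0&lt;br /&gt;
    while heads &amp;lt; n:&lt;br /&gt;
        throws += 1&lt;br /&gt;
        heads += random.random() &amp;lt; 0.5&lt;br /&gt;
    return throws&lt;br /&gt;
&lt;br /&gt;
def tail_estimate(n, trials=10000):   # illustrative trial count&lt;br /&gt;
    cutoff = 2 * n + 2 * math.sqrt(n * math.log(n))&lt;br /&gt;
    return sum(throws_until_n_heads(n) &amp;gt; cutoff for _ in range(trials)) / trials&lt;br /&gt;
&lt;br /&gt;
for n in (50, 200, 800):&lt;br /&gt;
    print(n, tail_estimate(n))   # should decay roughly like a constant times 1/n&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;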
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
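As a complement to the analytic comparison asked for here, the following Python sketch evaluates both bounds numerically for one concrete &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;; the grid over &amp;lt;math&amp;gt;t&amp;lt;/math&amp;gt; and the choice &amp;lt;math&amp;gt;n = 1000&amp;lt;/math&amp;gt; are illustrative assumptions.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math&lt;br /&gt;
&lt;br /&gt;
def chebyshev_bound(n):&lt;br /&gt;
    # X ~ Binomial(n, 1/6); Pr[X - n/6 &amp;gt;= n/12] is at most Var(X) / (n/12)^2&lt;br /&gt;
    var = n * (1 / 6) * (5 / 6)&lt;br /&gt;
    return var / (n / 12) ** 2&lt;br /&gt;
&lt;br /&gt;
def chernoff_bound(n):&lt;br /&gt;
    # minimize E[e^(tX)] / e^(t n/4) over a grid of t, with E[e^(tX)] = (5/6 + e^t/6)^n&lt;br /&gt;
    best = 1.0&lt;br /&gt;
    for i in range(1, 2000):&lt;br /&gt;
        t = i / 1000&lt;br /&gt;
        mgf = (5 / 6 + math.exp(t) / 6) ** n&lt;br /&gt;
        best = min(best, mgf / math.exp(t * n / 4))&lt;br /&gt;
    return best&lt;br /&gt;
&lt;br /&gt;
n = 1000&lt;br /&gt;
print(chebyshev_bound(n), chernoff_bound(n))   # the Chernoff bound decays exponentially in n&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;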
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Modes of Convergence and Characteristic Function, 15 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Characteristic Function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is a discrete random variable only taking integer values and &amp;lt;math&amp;gt;\mathbf{E}[X]=0&amp;lt;/math&amp;gt;. Suppose furthermore that there is no infinite subprogression &amp;lt;math&amp;gt;a+q\mathbb{Z}&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;\mathbb{Z}&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;q&amp;gt;1&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; takes values almost surely in &amp;lt;math&amp;gt;a+q \mathbb{Z}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;|\phi_X(t)|&amp;lt;1&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;0 &amp;lt; t \le \pi&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\phi_X(t)=\mathbf{E}[\mathrm{e}^{itX}]&amp;lt;/math&amp;gt; is the characteristic function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 5 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT).&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. Show that if &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
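A small Python sketch, assuming only the standard library, that simulates &amp;lt;math&amp;gt;S_n/(n \log_2 n)&amp;lt;/math&amp;gt; for a few values of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;; it is only an empirical illustration of the behaviour addressed in the parts above, with illustrative sample sizes.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import math&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def one_round():&lt;br /&gt;
    # the payoff is 2^k when the first head appears on toss k&lt;br /&gt;
    k = 1&lt;br /&gt;
    while random.random() &amp;lt; 0.5:&lt;br /&gt;
        k += 1&lt;br /&gt;
    return 2 ** k&lt;br /&gt;
&lt;br /&gt;
def normalized_winnings(n):&lt;br /&gt;
    s = sum(one_round() for _ in range(n))&lt;br /&gt;
    return s / (n * math.log2(n))&lt;br /&gt;
&lt;br /&gt;
for n in (10 ** 3, 10 ** 4, 10 ** 5):&lt;br /&gt;
    print(n, normalized_winnings(n))   # typically close to 1, with occasional large spikes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;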
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;)&lt;br /&gt;
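A minimal Python sketch of this estimator, with &amp;lt;math&amp;gt;f(x) = x^2&amp;lt;/math&amp;gt; as an illustrative integrand (its integral over &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;1/3&amp;lt;/math&amp;gt;); the sample sizes are arbitrary choices.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def monte_carlo(f, n):&lt;br /&gt;
    # average of f over n independent Uniform(0,1) samples&lt;br /&gt;
    return sum(f(random.random()) for _ in range(n)) / n&lt;br /&gt;
&lt;br /&gt;
f = lambda x: x * x   # illustrative integrand&lt;br /&gt;
for n in (10 ** 2, 10 ** 4, 10 ** 6):&lt;br /&gt;
    print(n, monte_carlo(f, n))   # approaches 1/3 as n grows&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;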
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show &amp;lt;math&amp;gt;\sqrt{S_n} - \sqrt{n} \overset{D}{\to} N(0,\sigma^2/4)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the statement in Problem 3.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12446</id>
		<title>概率论与数理统计 (Spring 2024)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12446"/>
		<updated>2024-05-24T15:07:52Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 3 (Modes of Convergence and Characteristic Function, 20 points) (Bonus problem) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
*Bonus problem为附加题（选做）。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density function of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find out the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039; a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Modes of Convergence and Characteristic Function, 15 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Characteristic Function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is a discrete random variable only taking integer values and &amp;lt;math&amp;gt;\mathbf{E}[X]=0&amp;lt;/math&amp;gt;. Suppose furthermore that there is no infinite subprogression &amp;lt;math&amp;gt;a+q\mathbb{Z}&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;\mathbb{Z}&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;q&amp;gt;1&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; takes values almost surely in &amp;lt;math&amp;gt;a+q \mathbb{Z}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;|\phi_X(t)|&amp;lt;1&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;0 &amp;lt; t \le \pi&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\phi_X(t)=\mathbf{E}[\mathrm{e}^{itX}]&amp;lt;/math&amp;gt; is the characteristic function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 5 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT).&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. Show that if &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show &amp;lt;math&amp;gt;\sqrt{S_n} - \sqrt{n} \overset{D}{\to} N(0,\sigma^2/4)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the statement in Problem 3.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12445</id>
		<title>概率论与数理统计 (Spring 2024)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12445"/>
		<updated>2024-05-24T13:31:17Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 3 (Modes of Convergence and Characteristic Function, 20 points) (Bonus problem) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
*Bonus problem为附加题（选做）。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density function of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find out the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039; a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Modes of Convergence and Characteristic Function, 20 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Extension of Borel-Cantelli Lemma&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;(A_n)_{n \ge 1}&amp;lt;/math&amp;gt; be events. Suppose &amp;lt;math&amp;gt;\sum_{n \ge 1} \mathbf{Pr}(A_n)=+\infty&amp;lt;/math&amp;gt;. Show that &lt;br /&gt;
&amp;lt;math&amp;gt;\mathbf{Pr}(A_n \text{ i.o.}) \ge \limsup_{n \to \infty} \frac{ \left(\sum_{k=1}^n\mathbf{Pr}(A_k)\right)^2 }{\sum_{1\le j,k \le n} \mathbf{Pr}(A_j \cap A_k)}&amp;lt;/math&amp;gt;.&lt;br /&gt;
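A small Python sketch that evaluates the right-hand side for one concrete family of events, namely independent &amp;lt;math&amp;gt;A_k&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\mathbf{Pr}(A_k) = 1/k&amp;lt;/math&amp;gt;, where the joint probabilities are known explicitly; the cut-offs for &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; are illustrative.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
def rhs_ratio(n):&lt;br /&gt;
    # independent events with p_k = 1/k: the joint probability is p_j * p_k off the diagonal and p_k on it&lt;br /&gt;
    p = [1 / k for k in range(1, n + 1)]&lt;br /&gt;
    s = sum(p)&lt;br /&gt;
    denom = s * s + sum(pk * (1 - pk) for pk in p)   # corrects the diagonal terms of (sum p)^2&lt;br /&gt;
    return s * s / denom&lt;br /&gt;
&lt;br /&gt;
for n in (10 ** 2, 10 ** 4, 10 ** 6):&lt;br /&gt;
    print(n, rhs_ratio(n))   # approaches 1, matching Pr(A_n i.o.) = 1 for this family&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;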
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Characteristic Function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is a discrete random variable only taking integer values and &amp;lt;math&amp;gt;\mathbf{E}[X]=0&amp;lt;/math&amp;gt;. Suppose furthermore that there is no infinite subprogression &amp;lt;math&amp;gt;a+q\mathbb{Z}&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;\mathbb{Z}&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;q&amp;gt;1&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; takes values almost surely in &amp;lt;math&amp;gt;a+q \mathbb{Z}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;|\phi_X(t)|&amp;lt;1&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;0 &amp;lt; t \le \pi&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\phi_X(t)=\mathbf{E}[\mathrm{e}^{itX}]&amp;lt;/math&amp;gt; is the characteristic function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 5 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT).&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. Show that if &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
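&lt;br /&gt;
A minimal simulation sketch of the game above (an illustration only, assuming Python 3 with the standard library; the helper names are made up for this sketch, and the limit claimed in the second sub-problem is only checked empirically here):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Simulate n rounds of the St. Petersburg game and report S_n / (n * log2 n),&lt;br /&gt;
# which should drift towards 1 in probability as n grows (convergence is slow).&lt;br /&gt;
import math&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def one_round():&lt;br /&gt;
    # toss a fair coin until the first head; if it takes k tosses, the reward is 2**k&lt;br /&gt;
    k = 1&lt;br /&gt;
    while random.random() &amp;lt; 0.5:  # tails with probability 1/2&lt;br /&gt;
        k += 1&lt;br /&gt;
    return 2 ** k&lt;br /&gt;
&lt;br /&gt;
def normalized_winnings(n):&lt;br /&gt;
    s_n = sum(one_round() for _ in range(n))&lt;br /&gt;
    return s_n / (n * math.log2(n))&lt;br /&gt;
&lt;br /&gt;
for n in (10 ** 3, 10 ** 4, 10 ** 5):&lt;br /&gt;
    print(n, normalized_winnings(n))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;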
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;.)&lt;br /&gt;
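&lt;br /&gt;
A minimal sketch of this estimator (an illustration only, assuming Python 3 with the standard library; the test integrand &amp;lt;math&amp;gt;f(x)=x^2&amp;lt;/math&amp;gt; is an arbitrary choice made for this sketch):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Monte Carlo estimate of the integral of f over [0,1] by averaging f at U(0,1) samples;&lt;br /&gt;
# for f(x) = x**2 the exact value is 1/3, so the printed averages should approach 1/3.&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def f(x):&lt;br /&gt;
    # illustrative integrand; replace with any integrable function of interest&lt;br /&gt;
    return x ** 2&lt;br /&gt;
&lt;br /&gt;
def monte_carlo(f, n):&lt;br /&gt;
    return sum(f(random.random()) for _ in range(n)) / n&lt;br /&gt;
&lt;br /&gt;
for n in (100, 10000, 1000000):&lt;br /&gt;
    print(n, monte_carlo(f, n))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;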
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\sqrt{S_n} - \sqrt{n} \overset{D}{\to} N(0,\sigma^2/4)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the statement in Problem 3.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12444</id>
		<title>概率论与数理统计 (Spring 2024)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12444"/>
		<updated>2024-05-24T13:27:55Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 3 (Modes of Convergence, 15 points) (Bonus problem) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Each solution must include the complete reasoning; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, markdown, etc.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are optional extra-credit problems.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density function of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039;  real numbers &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
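&lt;br /&gt;
A short simulation sketch of the process above (an illustration only, assuming Python 3 with the standard library; it transcribes the pseudocode literally and does not presuppose the closed-form answer):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Estimate the expected return value by averaging many independent runs of the process.&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def run_process(u):&lt;br /&gt;
    # add independent U(0,1) draws to x until x reaches u, counting the draws&lt;br /&gt;
    x, count = 0.0, 0&lt;br /&gt;
    while x &amp;lt; u:&lt;br /&gt;
        x += random.random()&lt;br /&gt;
        count += 1&lt;br /&gt;
    return count&lt;br /&gt;
&lt;br /&gt;
def estimate(u, trials=100000):&lt;br /&gt;
    return sum(run_process(u) for _ in range(trials)) / trials&lt;br /&gt;
&lt;br /&gt;
for u in (0.1, 0.5, 0.9):&lt;br /&gt;
    print(u, estimate(u))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;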
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
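&lt;br /&gt;
A Monte Carlo sketch for sanity-checking an answer numerically (an illustration only, assuming Python 3 with the standard library; the angle-gap criterion used below is a standard reformulation and is stated in the comments, not part of the problem):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# All sampled points lie in some closed half-disk through the centre iff the largest&lt;br /&gt;
# circular gap between their polar angles is at least pi.&lt;br /&gt;
import math&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def in_some_semicircle(points):&lt;br /&gt;
    angles = sorted(math.atan2(y, x) for x, y in points)&lt;br /&gt;
    gaps = [b - a for a, b in zip(angles, angles[1:])]&lt;br /&gt;
    gaps.append(2 * math.pi - (angles[-1] - angles[0]))  # wrap-around gap&lt;br /&gt;
    return max(gaps) &amp;gt;= math.pi&lt;br /&gt;
&lt;br /&gt;
def sample_point():&lt;br /&gt;
    # rejection sampling of a uniform point in the unit disk&lt;br /&gt;
    while True:&lt;br /&gt;
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)&lt;br /&gt;
        if x * x + y * y &amp;lt;= 1:&lt;br /&gt;
            return (x, y)&lt;br /&gt;
&lt;br /&gt;
n, trials = 4, 100000&lt;br /&gt;
hits = sum(in_some_semicircle([sample_point() for _ in range(n)]) for _ in range(trials))&lt;br /&gt;
print(hits / trials)  # compare with the value your closed-form answer gives for n = 4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;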
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt; (a small numeric comparison of the two appears after the questions below): &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
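&lt;br /&gt;
A small numeric comparison of the two bounds above (an illustration only, assuming Python 3 with the standard library; the choice &amp;lt;math&amp;gt;|X| \sim \mathrm{Exp}(1)&amp;lt;/math&amp;gt; is made purely for this sketch, so that &amp;lt;math&amp;gt;\mathbf{E}[e^{t|X|}] = 1/(1-t)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;t&amp;lt;1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[|X|^k]=k!&amp;lt;/math&amp;gt;):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Compare the optimized Chernoff bound with the best k-th moment bound at delta = 10.&lt;br /&gt;
import math&lt;br /&gt;
&lt;br /&gt;
delta = 10.0&lt;br /&gt;
&lt;br /&gt;
# Chernoff: minimize E[e^{t|X|}] / e^{t*delta} = exp(-t*delta) / (1 - t) over t in [0, 1).&lt;br /&gt;
chernoff = min(math.exp(-t * delta) / (1.0 - t) for t in (i / 1000.0 for i in range(1000)))&lt;br /&gt;
&lt;br /&gt;
# k-th moment: minimize E[|X|^k] / delta^k = k! / delta^k over k.&lt;br /&gt;
moment = min(math.factorial(k) / delta ** k for k in range(1, 40))&lt;br /&gt;
&lt;br /&gt;
print(chernoff, moment)  # here the best k-th moment bound is the smaller of the two&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;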
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Modes of Convergence and Characteristic Function, 20 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Extension of Borel-Cantelli Lemma&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;(A_n)_{n \ge 1}&amp;lt;/math&amp;gt; be events. Suppose &amp;lt;math&amp;gt;\sum_{n \ge 1} \mathbf{Pr}(A_n)=+\infty&amp;lt;/math&amp;gt;. Show that &lt;br /&gt;
&amp;lt;math&amp;gt;\mathbf{Pr}(A_n \text{ i.o.}) \ge \limsup_{n \to \infty} \frac{ \left(\sum_{k=1}^n\mathbf{Pr}(A_k)\right)^2 }{\sum_{1\le j,k \le n} \mathbf{Pr}(A_j \cap A_k)}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Characteristic Function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is a discrete random variable taking only integer values with &amp;lt;math&amp;gt;\mathbf{E}[X]=0&amp;lt;/math&amp;gt;. Suppose furthermore that there is no infinite subprogression &amp;lt;math&amp;gt;a+q\mathbb{Z}&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;\mathbb{Z}&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;q&amp;gt;1&amp;lt;/math&amp;gt; in which &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; takes its values almost surely (without this assumption the claim fails, e.g. for &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; uniform on &amp;lt;math&amp;gt;\{-2,2\}&amp;lt;/math&amp;gt;). Prove that &amp;lt;math&amp;gt;|\phi_X(t)|&amp;lt;1&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;0 &amp;lt; t \le \pi&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\phi_X(t)=\mathbf{E}[\mathrm{e}^{itX}]&amp;lt;/math&amp;gt; is the characteristic function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 5 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the results of Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT) to solve the problems.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\sqrt{S_n} - \sqrt{n} \overset{D}{\to} N(0,\sigma^2/4)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the statement in Problem 3.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12440</id>
		<title>概率论与数理统计 (Spring 2024)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12440"/>
		<updated>2024-05-24T13:09:23Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (LLN and CLT, 15 points + 5 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Each solution must include the complete reasoning; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, markdown, etc.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are optional extra-credit problems.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density function of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process &amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039;  real numbers &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Chernoff bound meets graph theory&amp;lt;/strong&amp;gt;]&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Show that with a probability approaching 1 (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\textbf{G}(n,1/2)&amp;lt;/math&amp;gt; has the property that the maximum degree is &amp;lt;math&amp;gt;(\frac{n}{2} + O(\sqrt{n\log n}))&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Show that with a probability approaching 1 (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\textbf{G}(n,1/2)&amp;lt;/math&amp;gt; has the property that the diameter is exactly 2. The diameter of a graph &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; is the maximum distance between any pair of vertices.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Modes of Convergence, 15 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Extension of Borel-Cantelli Lemma&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;(A_n)_{n \ge 1}&amp;lt;/math&amp;gt; be events. Suppose &amp;lt;math&amp;gt;\sum_{n \ge 1} \mathbf{Pr}(A_n)=+\infty&amp;lt;/math&amp;gt;. Show that &lt;br /&gt;
&amp;lt;math&amp;gt;\mathbf{Pr}(A_n \text{ i.o.}) \ge \limsup_{n \to \infty} \frac{ \left(\sum_{k=1}^n\mathbf{Pr}(A_k)\right)^2 }{\sum_{1\le j,k \le n} \mathbf{Pr}(A_j \cap A_k)}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 5 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the results of Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT) to solve the problems.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\sqrt{S_n} - \sqrt{n} \overset{D}{\to} N(0,\sigma^2/4)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the statement in Problem 3.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12438</id>
		<title>概率论与数理统计 (Spring 2024)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12438"/>
		<updated>2024-05-24T13:08:58Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (LLN and CLT, 15 points + 5 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Each solution must include the complete reasoning; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, markdown, etc.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are optional extra-credit problems.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density function of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution(II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process 1&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039;  real numbers &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;gt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x * y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process 2&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039;  real numbers &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
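&lt;br /&gt;
&#039;&#039;Aside: a simulation sketch for Processes 1 and 2.&#039;&#039; The following minimal Python sketch estimates the expected return values of Process 1 and Process 2 by direct Monte Carlo simulation. It assumes &amp;lt;math&amp;gt;U \in (0,1)&amp;lt;/math&amp;gt; so that both loops terminate with probability 1, and the helper names (process1, process2, estimate) are illustrative only.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Monte Carlo estimates of the expected return values of Process 1 and Process 2.&lt;br /&gt;
# Assumption: U lies in (0,1); for U &amp;lt;= 0, Process 1 would never terminate.&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def process1(U):&lt;br /&gt;
    x, count = 1.0, 0&lt;br /&gt;
    while x &amp;gt; U:      # multiply by a fresh uniform until x drops below U&lt;br /&gt;
        x, count = x * random.random(), count + 1&lt;br /&gt;
    return count&lt;br /&gt;
&lt;br /&gt;
def process2(U):&lt;br /&gt;
    x, count = 0.0, 0&lt;br /&gt;
    while x &amp;lt; U:      # add a fresh uniform until x reaches U&lt;br /&gt;
        x, count = x + random.random(), count + 1&lt;br /&gt;
    return count&lt;br /&gt;
&lt;br /&gt;
def estimate(proc, U, trials=200000):&lt;br /&gt;
    return sum(proc(U) for _ in range(trials)) / trials&lt;br /&gt;
&lt;br /&gt;
for U in (0.1, 0.5, 0.9):&lt;br /&gt;
    print(U, estimate(process1, U), estimate(process2, U))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;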
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds (a small numerical comparison of the two bounds appears at the end of this problem).&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that its moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Chernoff bound meets graph theory&amp;lt;/strong&amp;gt;]&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Show that with a probability approaching 1 (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\textbf{G}(n,1/2)&amp;lt;/math&amp;gt; has the property that the maximum degree is &amp;lt;math&amp;gt;(\frac{n}{2} + O(\sqrt{n\log n}))&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Show that with a probability approaching 1 (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\textbf{G}(n,1/2)&amp;lt;/math&amp;gt; has the property that the diameter is exactly 2. The diameter of a graph &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; is the maximum distance between any pair of vertices.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
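&lt;br /&gt;
&#039;&#039;Aside: a numerical comparison for the Chernoff vs Chebyshev item above.&#039;&#039; The minimal Python sketch below evaluates both bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; binomial with parameters &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;1/6&amp;lt;/math&amp;gt;; the grid search over &amp;lt;math&amp;gt;t&amp;lt;/math&amp;gt; is only a crude numerical stand-in for the exact optimization the problem asks for, and the sample values of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; are arbitrary.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Chebyshev and (generic, numerically optimized) Chernoff bounds on Pr[X &amp;gt;= n/4]&lt;br /&gt;
# when X counts the sixes in n throws of a fair die, i.e. X ~ Binomial(n, 1/6).&lt;br /&gt;
import math&lt;br /&gt;
&lt;br /&gt;
def chebyshev_bound(n, p=1/6, q=1/4):&lt;br /&gt;
    mean, var = n * p, n * p * (1 - p)&lt;br /&gt;
    return min(1.0, var / (q * n - mean) ** 2)&lt;br /&gt;
&lt;br /&gt;
def chernoff_bound(n, p=1/6, q=1/4):&lt;br /&gt;
    # minimize E[exp(tX)] / exp(t*q*n) over a grid of positive t, in log space&lt;br /&gt;
    best = 0.0&lt;br /&gt;
    for i in range(1, 2001):&lt;br /&gt;
        t = i / 1000.0&lt;br /&gt;
        log_bound = n * math.log(1 - p + p * math.exp(t)) - t * q * n&lt;br /&gt;
        best = min(best, log_bound)&lt;br /&gt;
    return math.exp(best)&lt;br /&gt;
&lt;br /&gt;
for n in (60, 120, 600):&lt;br /&gt;
    print(n, chebyshev_bound(n), chernoff_bound(n))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;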
&lt;br /&gt;
== Problem 3 (Modes of Convergence, 15 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Extension of Borel-Cantelli Lemma&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;(A_n)_{n \ge 1}&amp;lt;/math&amp;gt; be events. Suppose &amp;lt;math&amp;gt;\sum_{n \ge 1} \mathbf{Pr}(A_n)=+\infty&amp;lt;/math&amp;gt;. Show that &lt;br /&gt;
&amp;lt;math&amp;gt;\mathbf{Pr}(A_n \text{ i.o.}) \ge \limsup_{n \to \infty} \frac{ \left(\sum_{k=1}^n\mathbf{Pr}(A_k)\right)^2 }{\sum_{1\le j,k \le n} \mathbf{Pr}(A_j \cap A_k)}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 5 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the results of Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT) to solve the problems.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
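&lt;br /&gt;
&#039;&#039;Aside: a simulation sketch for the St. Petersburg game above.&#039;&#039; The minimal Python sketch below plays &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent rounds and prints &amp;lt;math&amp;gt;S_n/(n\log_2 n)&amp;lt;/math&amp;gt;; the sample sizes are arbitrary, and since the limsup in the bonus part is infinite, occasional large spikes in the printed ratio are to be expected.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# St. Petersburg game: if the first head appears on toss k, the round pays 2^k.&lt;br /&gt;
import math, random&lt;br /&gt;
&lt;br /&gt;
def one_round():&lt;br /&gt;
    k = 1&lt;br /&gt;
    while random.random() &amp;lt; 0.5:    # keep tossing while we see tails&lt;br /&gt;
        k += 1&lt;br /&gt;
    return 2 ** k&lt;br /&gt;
&lt;br /&gt;
def ratio(n):&lt;br /&gt;
    s_n = sum(one_round() for _ in range(n))   # total winnings after n rounds&lt;br /&gt;
    return s_n / (n * math.log2(n))&lt;br /&gt;
&lt;br /&gt;
for n in (10**3, 10**5, 10**6):&lt;br /&gt;
    print(n, ratio(n))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;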
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
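&lt;br /&gt;
&#039;&#039;Aside: a sketch for the Monte Carlo Integration item above.&#039;&#039; The minimal Python sketch below illustrates the estimator &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt;; the integrand &amp;lt;math&amp;gt;f(x)=x^2&amp;lt;/math&amp;gt; is an arbitrary test choice (its integral over &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;1/3&amp;lt;/math&amp;gt;), used only to show the estimate settling down as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; grows.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Monte Carlo estimate of the integral of f over [0,1] from n uniform samples.&lt;br /&gt;
import random&lt;br /&gt;
&lt;br /&gt;
def monte_carlo(f, n):&lt;br /&gt;
    return sum(f(random.random()) for _ in range(n)) / n&lt;br /&gt;
&lt;br /&gt;
f = lambda x: x * x    # test integrand; exact integral over [0,1] is 1/3&lt;br /&gt;
for n in (100, 10000, 1000000):&lt;br /&gt;
    print(n, monte_carlo(f, n))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;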
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show &amp;lt;math&amp;gt;\sqrt{S_n} - \sqrt{n} \overset{D}{\to} N(0,\sigma^2/4)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the conclusion in )&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12437</id>
		<title>概率论与数理统计 (Spring 2024)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12437"/>
		<updated>2024-05-24T13:08:23Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 4 (LLN and CLT, 15 points + 5 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
*Bonus problem为附加题（选做）。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density functions of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt;, respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find out the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process 1&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039;  a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;gt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x * y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find out the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process 2&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039;  a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that its moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Chernoff bound meets graph theory&amp;lt;/strong&amp;gt;]&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Show that with a probability approaching 1 (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\textbf{G}(n,1/2)&amp;lt;/math&amp;gt; has the property that the maximum degree is &amp;lt;math&amp;gt;(\frac{n}{2} + O(\sqrt{n\log n}))&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Show that with a probability approaching 1 (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\textbf{G}(n,1/2)&amp;lt;/math&amp;gt; has the property that the diameter is exactly 2. The diameter of a graph &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; is the maximum distance between any pair of vertices.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Modes of Convergence, 15 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Extension of Borel-Cantelli Lemma&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;(A_n)_{n \ge 1}&amp;lt;/math&amp;gt; be events. Suppose &amp;lt;math&amp;gt;\sum_{n \ge 1} \mathbf{Pr}(A_n)=+\infty&amp;lt;/math&amp;gt;. Show that &lt;br /&gt;
&amp;lt;math&amp;gt;\mathbf{Pr}(A_n \text{ i.o.}) \ge \limsup_{n \to \infty} \frac{ \left(\sum_{k=1}^n\mathbf{Pr}(A_k)\right)^2 }{\sum_{1\le j,k \le n} \mathbf{Pr}(A_j \cap A_k)}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 5 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the results of Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT) to solve the problems.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show &amp;lt;math&amp;gt;\sqrt{S_n} - \sqrt{n} \overset{D}{\to} N(0,\sigma^2/4)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12435</id>
		<title>概率论与数理统计 (Spring 2024)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12435"/>
		<updated>2024-05-24T13:01:28Z</updated>

		<summary type="html">&lt;p&gt;Zhangxy: /* Problem 3 (LLN and CLT, 15 points + 5 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
*Bonus problem为附加题（选做）。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density functions of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt;, respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find out the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process 1&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039;  a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;gt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x * y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find out the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process 2&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039;  a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Modes of Convergence, 15 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Extension of Borel-Cantelli Lemma&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;(A_n)_{n \ge 1}&amp;lt;/math&amp;gt; be events. Suppose &amp;lt;math&amp;gt;\sum_{n \ge 1} \mathbf{Pr}(A_n)=+\infty&amp;lt;/math&amp;gt;. Show that &lt;br /&gt;
&amp;lt;math&amp;gt;\mathbf{Pr}(A_n \text{ i.o.}) \ge \limsup_{n \to \infty} \frac{ \left(\sum_{k=1}^n\mathbf{Pr}(A_k)\right)^2 }{\sum_{1\le j,k \le n} \mathbf{Pr}(A_j \cap A_k)}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (LLN and CLT, 15 points + 5 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the results of Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT) to solve the problems.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normalized sum&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show &amp;lt;math&amp;gt;\frac{\sum_{k=1}^n X_k}{\left(\sum_{k=1}^n X_k^2\right)^{1/2}} \overset{D}{\to} N(0,1)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (Concentration of measure) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that its moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Chernoff bound meets graph theory&amp;lt;/strong&amp;gt;]&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Show that with a probability approaching 1 (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\textbf{G}(n,1/2)&amp;lt;/math&amp;gt; has the property that the maximum degree is &amp;lt;math&amp;gt;(\frac{n}{2} + O(\sqrt{n\log n}))&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Show that with a probability approaching 1 (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\textbf{G}(n,1/2)&amp;lt;/math&amp;gt; has the property that the diameter is exactly 2. The diameter of a graph &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; is the maximum distance between any pair of vertices.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zhangxy</name></author>
	</entry>
</feed>