概率论与数理统计 (Spring 2026)/Problem Set 2: Difference between revisions
Jump to navigation
Jump to search
No edit summary |
|||
| (6 intermediate revisions by the same user not shown) | |||
| Line 1: | Line 1: | ||
*每道题目的解答都要有完整的解题过程,中英文不限。 | *每道题目的解答都要有完整的解题过程,中英文不限。 | ||
| Line 16: | Line 14: | ||
<p>The term <math>\log</math> used in this context refers to the natural logarithm.</p> | <p>The term <math>\log</math> used in this context refers to the natural logarithm.</p> | ||
== Problem 1 (Warm-up problems, | == Problem 1 (Warm-up problems, 14 points) == | ||
* ['''Cumulative distribution function (CDF | * ['''Function of random variable'''] Let [math]X[/math] be a random variable and [math]g:\mathbb{R} \to \mathbb{R}[/math] be a continuous and strictly increasing function. Show that [math]Y = g(X)[/math] is a random variable.</li> | ||
* [''' | * ['''Cumulative distribution function (CDF)'''] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], and [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math]. | ||
* [''' | * ['''Independence'''] Let <math>X_r</math>, <math>1\leq r\leq n</math> be independent random variables which are symmetric about <math>0</math>; that is, <math>X_r</math> and <math>-X_r</math> have the same distributions. Show that, for all <math>x</math>, <math>\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]</math> where <math>S_n = \sum_{r=1}^n X_r</math>. Is the conclusion true without the assumption of independence? | ||
* ['''Expectation'''] Provide specific examples of a random variable <math>X</math> and a function <math>f</math> for each of the following three scenarios. For each case, clearly define the probability distribution of <math>X</math>, define the function <math>f(x)</math>, and show the calculations for both <math>\mathbf{E}[f(X)]</math> and <math>f(\mathbf{E}[X])</math>. | |||
<ol> | <ol> | ||
<li> | <li><math>\mathbf{E}[f(X)] = f(\mathbf{E}[X])</math></li> | ||
<li><math>\mathbf{E}[f(X)] > f(\mathbf{E}[X])</math></li> | |||
<li><math>\mathbf{E}[f(X)] < f(\mathbf{E}[X])</math></li> | |||
< | </ol> | ||
* [<strong>Law of total expectation</strong>] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation. | * [<strong>Law of total expectation</strong>] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation. | ||
* [<strong>Random number of random variables</strong>] Let <math>\{X_n\}_{n \ge 1}</math> be identically distributed random variables and <math>N</math> be a random variable taking values in the non-negative integers and independent of the <math>X_n</math> for all <math>n \ge 1</math>. Prove that <math>\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]</math>. | |||
* [<strong>Entropy of discrete random variable</strong>] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ | * [<strong>Entropy of discrete random variable</strong>] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \in [N]} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen's inequality. | ||
== Problem 2 (Discrete random variable, 14 points) == | == Problem 2 (Discrete random variable, 14 points) == | ||
| Line 66: | Line 62: | ||
<ul> | <ul> | ||
<li> [<strong>Streak</strong>] | <li> [<strong>Streak</strong>] | ||
Suppose we flip a fair coin <math>n</math> times independently to obtain a sequence of flips <math>X_1, X_2, \ldots , X_n</math>. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if <math>X_3</math>, <math>X_4</math>, and <math>X_5</math> are all heads, there is a streak of length <math>3</math> starting at the third | Suppose we flip a fair coin <math>n</math> times independently to obtain a sequence of flips <math>X_1, X_2, \ldots , X_n</math>. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if <math>X_3</math>, <math>X_4</math>, and <math>X_5</math> are all heads, there is a streak of length <math>3</math> starting at the third flip. (If <math>X_6</math> is also heads, then there is also a streak of length <math>4</math> starting at the third lip.) Find the expected number of streaks of length <math>k</math> for some integer <math>k \ge 1</math>. | ||
</li> | </li> | ||
| Line 81: | Line 77: | ||
<ul> | <ul> | ||
<li>[<strong>Random social networks</strong>] | <li>[<strong>Random social networks</strong>] | ||
Let <math>G = (V, E)</math> be a <strong>fixed</strong> undirected graph without | Let <math>G = (V, E)</math> be a <strong>fixed</strong> undirected graph without isolated vertex. | ||
Let <math>d_v</math> be the degree of vertex <math>v</math>. Let <math>Y</math> be a uniformly chosen vertex, and <math>Z</math> a uniformly chosen neighbor of <math>Y</math>. | Let <math>d_v</math> be the degree of vertex <math>v</math>. Let <math>Y</math> be a uniformly chosen vertex, and <math>Z</math> a uniformly chosen neighbor of <math>Y</math>. | ||
<ol> | <ol> | ||
| Line 94: | Line 90: | ||
<li>[<strong>Turán's Theorem</strong>] | <li>[<strong>Turán's Theorem</strong>] | ||
Let <math>G=(V,E)</math> be a <strong>fixed</strong> undirected graph, and write <math>d_v</math> for the degree of the vertex <math>v</math>. Use | Let <math>G=(V,E)</math> be a <strong>fixed</strong> undirected graph, and write <math>d_v</math> for the degree of the vertex <math>v</math>. Use probabilistic method to prove that <math>\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}</math>, where <math>\alpha(G)</math> is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set <math>I</math> from a graph with vertex set <math>V</math>: First, generate a random permutation of the vertices, denoted as <math>v_1,v_2,\ldots,v_n</math>. Then, construct the independent set <math>I</math> as follows: For each vertex <math>v_i \in V</math>, add <math>v_i</math> to <math>I</math> if and only if none of its predecessors in the permutation, i.e., <math>v_1,\ldots,v_{i-1}</math>, are neighbors of <math>v_i</math>.) | ||
</li> | </li> | ||
Latest revision as of 01:58, 8 April 2026
- 每道题目的解答都要有完整的解题过程,中英文不限。
- 我们推荐大家使用LaTeX, markdown等对作业进行排版。
- 为督促大家认真完成平时作业、扎实掌握课程内容,本课程期末考试将从作业题目中随机抽取部分题目进行考查。请大家务必重视每一次作业,认真理解解题思路。
- 若考试中被抽取到的作业题目答错、答不完整或无法作答,将按照相关标准对作业进行扣分处理。
Assumption throughout Problem Set 2
Without further notice, we are working on probability space [math]\displaystyle{ (\Omega,\mathcal{F},\mathbf{Pr}) }[/math].
Without further notice, we assume that the expectation of random variables are well-defined.
The term [math]\displaystyle{ \log }[/math] used in this context refers to the natural logarithm.
Problem 1 (Warm-up problems, 14 points)
- [Function of random variable] Let [math]X[/math] be a random variable and [math]g:\mathbb{R} \to \mathbb{R}[/math] be a continuous and strictly increasing function. Show that [math]Y = g(X)[/math] is a random variable.
- [Cumulative distribution function (CDF)] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], and [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].
- [Independence] Let [math]\displaystyle{ X_r }[/math], [math]\displaystyle{ 1\leq r\leq n }[/math] be independent random variables which are symmetric about [math]\displaystyle{ 0 }[/math]; that is, [math]\displaystyle{ X_r }[/math] and [math]\displaystyle{ -X_r }[/math] have the same distributions. Show that, for all [math]\displaystyle{ x }[/math], [math]\displaystyle{ \mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x] }[/math] where [math]\displaystyle{ S_n = \sum_{r=1}^n X_r }[/math]. Is the conclusion true without the assumption of independence?
- [Expectation] Provide specific examples of a random variable [math]\displaystyle{ X }[/math] and a function [math]\displaystyle{ f }[/math] for each of the following three scenarios. For each case, clearly define the probability distribution of [math]\displaystyle{ X }[/math], define the function [math]\displaystyle{ f(x) }[/math], and show the calculations for both [math]\displaystyle{ \mathbf{E}[f(X)] }[/math] and [math]\displaystyle{ f(\mathbf{E}[X]) }[/math].
- [math]\displaystyle{ \mathbf{E}[f(X)] = f(\mathbf{E}[X]) }[/math]
- [math]\displaystyle{ \mathbf{E}[f(X)] \gt f(\mathbf{E}[X]) }[/math]
- [math]\displaystyle{ \mathbf{E}[f(X)] \lt f(\mathbf{E}[X]) }[/math]
- [Law of total expectation] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.
- [Random number of random variables] Let [math]\displaystyle{ \{X_n\}_{n \ge 1} }[/math] be identically distributed random variables and [math]\displaystyle{ N }[/math] be a random variable taking values in the non-negative integers and independent of the [math]\displaystyle{ X_n }[/math] for all [math]\displaystyle{ n \ge 1 }[/math]. Prove that [math]\displaystyle{ \mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1] }[/math].
- [Entropy of discrete random variable] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \in [N]} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen's inequality.
Problem 2 (Discrete random variable, 14 points)
- [Geometric distribution (I)] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.
- Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.
- Find the expected number of days which elapse before you have a full set of objects.
- [Geometric distribution (II)] Prove that geometric distribution is the only discrete memoryless distribution with range values [math]\displaystyle{ \mathbb{N}_+ }[/math].
- [Binomial distribution] Let [math]\displaystyle{ n_1,n_2 \in \mathbb{N}_+ }[/math] and [math]\displaystyle{ 0 \le p \le 1 }[/math] be parameters, and [math]\displaystyle{ X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p) }[/math] be independent random variables. Prove that [math]\displaystyle{ X+Y \sim \mathrm{Bin}(n_1+n_2,p) }[/math].
- [Negative binomial distribution] Let [math]\displaystyle{ X }[/math] follows the negative binomial distribution with parameter [math]\displaystyle{ r \in \mathbb{N}_+ }[/math] and [math]\displaystyle{ p \in (0,1) }[/math]. Calculate [math]\displaystyle{ \mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2 }[/math].
- [Hypergeometric distribution] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn without replacement (无放回) from the urn. Let [math]B[/math] the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].
- [Poisson distribution] In your pocket is a random number [math]\displaystyle{ N }[/math] of coins, where [math]\displaystyle{ N }[/math] has the Poisson distribution with parameter [math]\displaystyle{ \lambda }[/math]. You toss each coin once, with heads showing with probability [math]\displaystyle{ p }[/math] each time. Let [math]\displaystyle{ X }[/math] be the (random) number of heads outcomes and [math]\displaystyle{ Y }[/math] be the (also random) number of tails.
- Find the joint mass function of [math]\displaystyle{ (X,Y) }[/math].
- Find PDF of the marginal distribution of [math]\displaystyle{ X }[/math] in [math]\displaystyle{ (X,Y) }[/math]. Are [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] independent?
- [Conditional distribution] Let [math]\displaystyle{ \lambda,\mu \gt 0 }[/math] and [math]\displaystyle{ n \in \mathbb{N} }[/math] be parameters, and [math]\displaystyle{ X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu) }[/math] be independent random variables. Find out the conditional distribution of [math]\displaystyle{ X }[/math], given [math]\displaystyle{ X+Y = n }[/math].
Problem 3 (Linearity of Expectation, 10 points)
- [Streak] Suppose we flip a fair coin [math]\displaystyle{ n }[/math] times independently to obtain a sequence of flips [math]\displaystyle{ X_1, X_2, \ldots , X_n }[/math]. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if [math]\displaystyle{ X_3 }[/math], [math]\displaystyle{ X_4 }[/math], and [math]\displaystyle{ X_5 }[/math] are all heads, there is a streak of length [math]\displaystyle{ 3 }[/math] starting at the third flip. (If [math]\displaystyle{ X_6 }[/math] is also heads, then there is also a streak of length [math]\displaystyle{ 4 }[/math] starting at the third lip.) Find the expected number of streaks of length [math]\displaystyle{ k }[/math] for some integer [math]\displaystyle{ k \ge 1 }[/math].
- [Number of cycles] At a banquet, there are [math]\displaystyle{ n }[/math] people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (these two hands are no longer idle). After [math]\displaystyle{ n }[/math] rounds, there will be no idle hands left, and the [math]\displaystyle{ n }[/math] people will form several cycles. For example, when [math]\displaystyle{ n=3 }[/math], the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after [math]\displaystyle{ n }[/math] rounds?
- [Card shuffling] A deck of [math]\displaystyle{ n }[/math] cards, numbered [math]\displaystyle{ 1 }[/math] through [math]\displaystyle{ n }[/math], is initially laid out in order so that card [math]\displaystyle{ i }[/math] is in position [math]\displaystyle{ i }[/math]. At each step, a pair of distinct positions is chosen uniformly at random, and the two cards in those positions are swapped. This operation is repeated [math]\displaystyle{ k }[/math] times. Let [math]\displaystyle{ R_k }[/math] be the random variable representing the number of cards that are in their original starting positions after [math]\displaystyle{ k }[/math] swaps. Calculate the expectation [math]\displaystyle{ \mathbf{E}[R_k] }[/math].
Problem 4 (Probability meets graph theory, 14 points)
- [Random social networks]
Let [math]\displaystyle{ G = (V, E) }[/math] be a fixed undirected graph without isolated vertex.
Let [math]\displaystyle{ d_v }[/math] be the degree of vertex [math]\displaystyle{ v }[/math]. Let [math]\displaystyle{ Y }[/math] be a uniformly chosen vertex, and [math]\displaystyle{ Z }[/math] a uniformly chosen neighbor of [math]\displaystyle{ Y }[/math].
- Show that [math]\displaystyle{ \mathbf{E}[d_Z] \geq \mathbf{E}[d_Y] }[/math].
- Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.
- [Erdős–Rényi random graph] Consider [math]\displaystyle{ G\sim G(n,p) }[/math], where [math]\displaystyle{ G(n,p) }[/math] is the Erdős–Rényi random graph model. Let [math]\displaystyle{ \ell \ge 3 }[/math] be a fixed integer, and let [math]\displaystyle{ N_{\ell} }[/math] be the random variable representing the number of cycles of length exactly [math]\displaystyle{ \ell }[/math] in graph [math]\displaystyle{ G }[/math]. Find the expected value [math]\displaystyle{ \mathbf{E}[N_{\ell}] }[/math].
- [Turán's Theorem] Let [math]\displaystyle{ G=(V,E) }[/math] be a fixed undirected graph, and write [math]\displaystyle{ d_v }[/math] for the degree of the vertex [math]\displaystyle{ v }[/math]. Use probabilistic method to prove that [math]\displaystyle{ \alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1} }[/math], where [math]\displaystyle{ \alpha(G) }[/math] is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set [math]\displaystyle{ I }[/math] from a graph with vertex set [math]\displaystyle{ V }[/math]: First, generate a random permutation of the vertices, denoted as [math]\displaystyle{ v_1,v_2,\ldots,v_n }[/math]. Then, construct the independent set [math]\displaystyle{ I }[/math] as follows: For each vertex [math]\displaystyle{ v_i \in V }[/math], add [math]\displaystyle{ v_i }[/math] to [math]\displaystyle{ I }[/math] if and only if none of its predecessors in the permutation, i.e., [math]\displaystyle{ v_1,\ldots,v_{i-1} }[/math], are neighbors of [math]\displaystyle{ v_i }[/math].)
- [Dominating set] A dominating set of vertices in an undirected graph [math]\displaystyle{ G = (V, E) }[/math] is a set [math]\displaystyle{ S \subseteq V }[/math] such that every vertex of [math]\displaystyle{ G }[/math] belongs to [math]\displaystyle{ S }[/math] or has a neighbor in [math]\displaystyle{ S }[/math]. Let [math]\displaystyle{ G = (V, E) }[/math] be an [math]\displaystyle{ n }[/math]-vertex graph with minimum degree [math]\displaystyle{ d \gt 1 }[/math]. Prove that [math]\displaystyle{ G }[/math] has a dominating set with at most [math]\displaystyle{ \frac{n\left(1+\log(d+1)\right)}{d+1} }[/math] vertices. (Hint: Consider a random vertex subset [math]\displaystyle{ S \subseteq V }[/math] by including each vertex independently with probability [math]\displaystyle{ p := \log(d + 1)/(d + 1) }[/math].)
Problem 5 (1D random walk, 8 points)
Let [math]\displaystyle{ p \in (0,1) }[/math] be a constant, and [math]\displaystyle{ \{X_n\}_{n \ge 1} }[/math] be independent Bernoulli trials with successful probability [math]\displaystyle{ p }[/math]. Define [math]\displaystyle{ S_n = 2\sum_{i=1}^n X_i - n }[/math] and [math]\displaystyle{ S_0 = 0 }[/math].
- [Range of random walk] The range [math]\displaystyle{ R_n }[/math] of [math]\displaystyle{ S_0, S_1, \ldots, S_n }[/math] is defined as the number of distinct values taken by the sequence. Show that [math]\displaystyle{ \mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right) }[/math] as [math]\displaystyle{ n \to \infty }[/math], and deduce that [math]\displaystyle{ n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0) }[/math]. Hence show that [math]\displaystyle{ n^{-1} \mathbf{E}[R_n] \to |2p-1| }[/math] as [math]\displaystyle{ n \to \infty }[/math].
- [Symmetric 1D random walk] Suppose [math]\displaystyle{ p = \frac{1}{2} }[/math]. Prove that [math]\displaystyle{ \mathbf{E}[|S_n|] = \Theta(\sqrt{n}) }[/math].