概率论与数理统计 (Probability Theory and Mathematical Statistics, Spring 2026)/Problem Set 1

From TCS Wiki
Yqzhu (talk | contribs)

Revision as of 10:31, 17 March 2026

  • Every problem requires a complete solution with full working; you may write in Chinese or English.
  • We recommend typesetting your homework with LaTeX, Markdown, or similar tools.
  • To encourage you to complete the homework conscientiously and master the course material solidly, the final exam will include some problems drawn at random from the homework. Please take every assignment seriously and make sure you understand the solution ideas thoroughly.
  • If a homework problem drawn in the exam is answered incorrectly, incompletely, or not at all, points will be deducted according to the relevant grading standards.

Assumption throughout Problem Set 1

Unless stated otherwise, we work in a probability space [math]\displaystyle{ (\Omega,\mathcal{F},\mathbf{Pr}) }[/math].

Problem 1 (Principle of Inclusion and Exclusion)

Let [math]\displaystyle{ n\ge 1 }[/math] be a positive integer and [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] be [math]\displaystyle{ n }[/math] events.

  • [Union bound] Prove [math]\displaystyle{ \mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right) }[/math] using the definition of probability space.
  • [Principle of Inclusion and Exclusion (PIE)] Prove that [math]\displaystyle{ \mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right) }[/math], where [math]\displaystyle{ [n]=\{1,2,\ldots,n\} }[/math].
  • [Surjection] For positive integers [math]\displaystyle{ m\ge n }[/math], prove that the probability that a uniformly random function [math]\displaystyle{ f:[m]\to[n] }[/math] is surjective is [math]\displaystyle{ \sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m }[/math].
  • [Bonferroni's inequality and Kounias' inequality] Prove that [math]\displaystyle{ \sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i\lt j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i). }[/math] (Hint: The upper bound is sometimes called Kounias' inequality, which is weaker than Bonferroni's inequality. You can try using a Venn diagram to understand these inequalities.)
  • [Euler totient function] Let [math]\displaystyle{ n = p_1^{k_1} p_2^{k_2} \cdots p_r^{k_r} }[/math] be the prime factorization of [math]\displaystyle{ n }[/math], where [math]\displaystyle{ p_1, p_2, \dots, p_r }[/math] are the distinct prime divisors of [math]\displaystyle{ n }[/math]. The Euler totient function [math]\displaystyle{ \phi(n) }[/math] is defined to be the number of integers [math]\displaystyle{ k }[/math] such that [math]\displaystyle{ 1 \le k \le n }[/math] and [math]\displaystyle{ \gcd(k,n)=1 }[/math]. Prove that [math]\displaystyle{ \phi(n)=n\prod_{i=1}^r\left(1-\frac{1}{p_i}\right). }[/math]
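Before attempting the proof in the last item, the identity can be sanity-checked numerically. The sketch below is our own illustrative code (the helper names `phi_direct` and `phi_product` are not part of the problem); it compares the definition of [math]\displaystyle{ \phi(n) }[/math] with the product formula on small inputs, which is evidence rather than a proof.

```python
from math import gcd

def phi_direct(n):
    # Count the integers 1 <= k <= n with gcd(k, n) = 1, straight from the definition.
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def phi_product(n):
    # Evaluate n * prod_{p | n} (1 - 1/p) over the distinct prime divisors of n,
    # using exact integer arithmetic (divide by p before multiplying by p - 1).
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result = result // p * (p - 1)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:  # a single prime factor larger than sqrt(n) may remain
        result = result // m * (m - 1)
    return result

# The two computations agree on small inputs.
for n in range(1, 200):
    assert phi_direct(n) == phi_product(n)
```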

Problem 2 (Probability space)

  • [Nonexistence of probability space] Prove that it is impossible to define a uniform probability law on the natural numbers [math]\displaystyle{ \mathbb{N} }[/math]. More precisely, prove that there does not exist a probability space [math]\displaystyle{ (\mathbb{N},2^{\mathbb{N}},\mathbf{Pr}) }[/math] such that [math]\displaystyle{ \mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\}) }[/math] for all [math]\displaystyle{ i, j \in \mathbb{N} }[/math]. Then explain why the same argument fails to prove that there is no uniform probability law on the real interval [math]\displaystyle{ [0,1] }[/math], that is, that there is no probability space [math]\displaystyle{ ([0,1],\mathcal{F},\mathbf{Pr}) }[/math] such that for any interval [math]\displaystyle{ (l,r] \subseteq [0,1] }[/math], it holds that [math]\displaystyle{ (l,r] \in \mathcal{F} }[/math] and [math]\displaystyle{ \mathbf{Pr}( (l,r] ) = r-l }[/math]. (In fact, such a probability measure does exist; it is called the Lebesgue measure on [math]\displaystyle{ [0,1] }[/math].)

  • [Smallest [math]\displaystyle{ \sigma }[/math]-field (I)] For any subset [math]\displaystyle{ S \subseteq 2^\Omega }[/math], prove that the smallest [math]\displaystyle{ \sigma }[/math]-field containing [math]\displaystyle{ S }[/math] is given by [math]\displaystyle{ \sigma(S) := \bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F} }[/math]. (Hint: You should show that it is indeed a [math]\displaystyle{ \sigma }[/math]-field and also it is the smallest one containing [math]\displaystyle{ S }[/math].)

  • [Smallest [math]\displaystyle{ \sigma }[/math]-field (II)] Let [math]\displaystyle{ S,T \subseteq 2^{\Omega} }[/math]. Show that [math]\displaystyle{ \sigma(S) = \sigma(T) }[/math] if and only if [math]\displaystyle{ S \subseteq \sigma(T) }[/math] and [math]\displaystyle{ T \subseteq \sigma(S) }[/math].

  • [Union of [math]\displaystyle{ \sigma }[/math]-field] Let [math]\displaystyle{ \mathcal{F} }[/math] and [math]\displaystyle{ \mathcal{G} }[/math] be [math]\displaystyle{ \sigma }[/math]-fields of subsets of [math]\displaystyle{ \Omega }[/math]. Show that [math]\displaystyle{ \mathcal{F} \cup \mathcal{G} }[/math] is not necessarily a [math]\displaystyle{ \sigma }[/math]-field. Now suppose [math]\displaystyle{ \mathcal{F}_1 \subseteq \mathcal{F}_2\subseteq \mathcal{F}_3\subseteq\ldots }[/math] is an increasing sequence of [math]\displaystyle{ \sigma }[/math]-fields. Is [math]\displaystyle{ \bigcup_{i=1}^{+\infty} \mathcal{F}_i }[/math] necessarily a [math]\displaystyle{ \sigma }[/math]-field?

  • [Probability space?] Let [math]\displaystyle{ \Omega = \mathbb{R} }[/math], let [math]\displaystyle{ \mathcal{F} }[/math] be the set of all subsets [math]\displaystyle{ A \subseteq \Omega }[/math] such that [math]\displaystyle{ A }[/math] or [math]\displaystyle{ \overline{A} }[/math] (the complement of [math]\displaystyle{ A }[/math]) is countable, and let [math]\displaystyle{ P(A) = 0 }[/math] in the first case and [math]\displaystyle{ P(A) = 1 }[/math] in the second. Is [math]\displaystyle{ (\Omega,\mathcal{F},P) }[/math] a probability space? Please explain your answer.
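For the union question above, a finite example already shows how closure under unions can fail. The check below is a sketch with a hypothetical choice of [math]\displaystyle{ \Omega }[/math] and generating sets (the names F and G echo the problem; everything else is ours): on a finite ground set it suffices to test closure under complement and pairwise union.

```python
Omega = frozenset({1, 2, 3, 4})

def is_sigma_field(fam):
    # On a finite ground set, a sigma-field is a family containing Omega
    # that is closed under complement and under pairwise union.
    fam = set(fam)
    if Omega not in fam:
        return False
    if any(Omega - A not in fam for A in fam):
        return False
    return all(A | B in fam for A in fam for B in fam)

# The sigma-fields generated by the single sets {1,2} and {1,3}:
F = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), Omega}
G = {frozenset(), frozenset({1, 3}), frozenset({2, 4}), Omega}

# Each is a sigma-field, but their union misses {1,2} | {1,3} = {1,2,3}.
```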

Problem 3 (Birthday paradox)

Please design a randomized algorithm using the birthday paradox that solves the following problem in [math]\displaystyle{ \mathrm{poly}(n) \cdot 2^{n/2} }[/math] time with high probability (for example, [math]\displaystyle{ 0.99 }[/math] when [math]\displaystyle{ n }[/math] is sufficiently large). Please provide a detailed error analysis.

  • You are given an integer sequence [math]\displaystyle{ a_1,a_2,\ldots,a_{100n} }[/math] of length [math]\displaystyle{ 100 n }[/math] satisfying [math]\displaystyle{ 0 \le a_i \lt 2^n }[/math] for all [math]\displaystyle{ 1 \le i \le 100 n }[/math]. Find two disjoint, non-empty subsets [math]\displaystyle{ S_1,S_2 \subseteq \{1,2,\ldots,100n\} }[/math] satisfying [math]\displaystyle{ \sum \limits_{i \in S_1} a_i = \sum \limits_{i \in S_2} a_i }[/math].
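One common way to realize the [math]\displaystyle{ \mathrm{poly}(n) \cdot 2^{n/2} }[/math] budget is to sample random subsets and wait for two of them to collide on their sum, which the birthday paradox makes likely; removing the shared indices then yields disjoint sets with equal sums. The sketch below is our own illustrative code (with a small fixed sampling budget rather than a tuned [math]\displaystyle{ 2^{n/2} }[/math] one); it is meant for small instances and does not replace the required error analysis.

```python
import random

def find_equal_sum_subsets(a, trials=5000):
    # Sample random subsets of indices and record their sums.  By the
    # birthday paradox, two sampled subsets sharing a sum appear quickly
    # once the number of samples exceeds roughly the square root of the
    # number of attainable sums.
    seen = {}
    for _ in range(trials):
        S = frozenset(i for i in range(len(a)) if random.random() < 0.5)
        if not S:
            continue
        s = sum(a[i] for i in S)
        T = seen.get(s)
        if T is not None and T != S:
            # Dropping the common indices subtracts the same amount from
            # both sums, so the two remaining sets still have equal sums.
            S1, S2 = S - T, T - S
            if S1 and S2:
                return S1, S2
        seen[s] = S
    return None  # sampling budget exhausted without a usable collision
```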

Problem 4 (Conditional probability)

  • [Positive correlation] We say that event [math]\displaystyle{ B }[/math] gives positive information about event [math]\displaystyle{ A }[/math] if [math]\displaystyle{ \mathbf{Pr}(A|B) \gt \mathbf{Pr}(A) }[/math], that is, the occurrence of [math]\displaystyle{ B }[/math] makes the occurrence of [math]\displaystyle{ A }[/math] more likely. Now suppose that [math]\displaystyle{ B }[/math] gives positive information about [math]\displaystyle{ A }[/math].
    1. Does [math]\displaystyle{ A }[/math] give positive information about [math]\displaystyle{ B }[/math]?
    2. Does [math]\displaystyle{ \overline{B} }[/math] give negative information about [math]\displaystyle{ A }[/math], that is, is it true that [math]\displaystyle{ \mathbf{Pr}(A|\overline{B}) \lt \mathbf{Pr}(A) }[/math]?
    3. Does [math]\displaystyle{ \overline{B} }[/math] give positive information or negative information about [math]\displaystyle{ \overline{A} }[/math]?
  • [Balls in urns (I)] There are [math]\displaystyle{ n }[/math] urns of which the [math]\displaystyle{ r }[/math]-th contains [math]\displaystyle{ r-1 }[/math] white balls and [math]\displaystyle{ n-r }[/math] black balls. You pick an urn uniformly at random (here, "uniformly" means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the [math]\displaystyle{ {n-1\choose 2} }[/math] pairs of balls is chosen to be removed with equal probability). Find the following probabilities:

    1. the second ball is black;
    2. the second ball is black, given that the first is black.

  • [Balls in urns (II)] Suppose that an urn contains [math]\displaystyle{ w }[/math] white balls and [math]\displaystyle{ b }[/math] black balls. The balls are drawn from the urn one by one, each time uniformly at random and without replacement (which means we do not put the chosen ball back after each drawing). Find the probabilities of the events:

    1. the first white ball drawn is the [math]\displaystyle{ (k+1) }[/math]-th ball;
    2. the last ball drawn is white.
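Whatever closed forms you derive for the urn questions can be compared against quick Monte Carlo estimates. The simulators below are our own sketch for the second urn model (function names and trial counts are arbitrary choices); drawing all balls without replacement is the same as fixing a uniformly random order of the balls.

```python
import random

def simulate_last_white(w, b, trials=100_000):
    # Estimate the probability that the last ball drawn is white when
    # w white and b black balls are drawn one by one without replacement.
    hits = 0
    for _ in range(trials):
        balls = ['W'] * w + ['B'] * b
        random.shuffle(balls)  # a uniformly random drawing order
        hits += balls[-1] == 'W'
    return hits / trials

def simulate_first_white_at(w, b, k, trials=100_000):
    # Estimate the probability that the first white ball is the (k+1)-th
    # draw, i.e. the first k draws are all black and draw k+1 is white.
    hits = 0
    for _ in range(trials):
        balls = ['W'] * w + ['B'] * b
        random.shuffle(balls)
        hits += all(c == 'B' for c in balls[:k]) and balls[k] == 'W'
    return hits / trials
```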

Problem 5 (Independence)

Consider the sequence of [math]\displaystyle{ n }[/math] outcomes [math]\displaystyle{ (X_1, X_2, \cdots, X_n) \in \{0,1\}^n }[/math] of [math]\displaystyle{ n }[/math] independent Bernoulli trials, where each trial succeeds with the same probability [math]\displaystyle{ 0 \lt p \lt 1 }[/math].

  • [Limited independence] Construct three events [math]\displaystyle{ A,B }[/math] and [math]\displaystyle{ C }[/math] out of [math]\displaystyle{ n }[/math] Bernoulli trials such that [math]\displaystyle{ A, B }[/math] and [math]\displaystyle{ C }[/math] are pairwise independent but are not (mutually) independent. You need to prove that the constructed events [math]\displaystyle{ A, B }[/math] and [math]\displaystyle{ C }[/math] satisfy this. (Hint: Consider the case where [math]\displaystyle{ n = 2 }[/math] and [math]\displaystyle{ p = 1/2 }[/math].)

  • [Product distribution] Suppose someone has observed the outcomes of the [math]\displaystyle{ n }[/math] trials, and she tells you that precisely [math]\displaystyle{ k }[/math] out of the [math]\displaystyle{ n }[/math] trials succeeded, for some [math]\displaystyle{ 0\lt k\lt n }[/math]. Now you want to predict the outcome of the [math]\displaystyle{ (n+1) }[/math]-th trial, but the parameter [math]\displaystyle{ p }[/math] of the Bernoulli trials is unknown. One way to estimate [math]\displaystyle{ p }[/math] is to find the [math]\displaystyle{ \hat{p} }[/math] that makes the observed outcomes most probable, namely to solve [math]\displaystyle{ \arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}]. }[/math]

    1. Estimate [math]\displaystyle{ p }[/math] by solving the above optimization problem.
    2. If someone tells you exactly which [math]\displaystyle{ k }[/math] trials succeed (in addition to just telling you the number of successful trials, which is [math]\displaystyle{ k }[/math]), would it help you to estimate [math]\displaystyle{ p }[/math] more accurately? Why?
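Whatever closed form you derive for [math]\displaystyle{ \hat{p} }[/math], it can be checked against a brute-force grid search over candidate values. The snippet below is a sketch (the grid resolution and the sample values of [math]\displaystyle{ n }[/math] and [math]\displaystyle{ k }[/math] are arbitrary choices of ours):

```python
from math import comb

def likelihood(p_hat, n, k):
    # Probability that exactly k of n independent Bernoulli trials succeed
    # when each succeeds with probability p_hat.
    return comb(n, k) * p_hat ** k * (1 - p_hat) ** (n - k)

# Grid search over candidate values of p_hat for one sample instance.
n, k = 10, 3
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=lambda q: likelihood(q, n, k))
# Compare `best` with your closed-form maximizer for this n and k.
```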

Problem 6 (Probabilistic method)

  • [Ramsey number] Prove that if there exists a real number [math]\displaystyle{ p \in [0, 1] }[/math] satisfying:
    [math]\displaystyle{ \binom{n}{k} p^{\binom{k}{2}} + \binom{n}{l} (1 - p)^{\binom{l}{2}} \lt 1, }[/math]

    then the Ramsey number [math]\displaystyle{ R(k, l) }[/math] is strictly greater than [math]\displaystyle{ n }[/math]. Furthermore, use this result to show that there exists an absolute constant [math]\displaystyle{ c \gt 0 }[/math] such that:

    [math]\displaystyle{ R(4, l) \ge c \left( \frac{l}{\log l} \right)^{3/2}. }[/math]
    (Hint: [math]\displaystyle{ \binom{n}{k} \le \left(\frac{en}{k}\right)^k }[/math].)
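The probabilistic-method condition is easy to evaluate numerically, which can help when choosing [math]\displaystyle{ p }[/math] and [math]\displaystyle{ n }[/math] in the second part. The helper below (the name `ramsey_lhs` is ours) simply evaluates the left-hand side; whenever it returns a value below 1 for some [math]\displaystyle{ p }[/math], the claim gives [math]\displaystyle{ R(k,l) \gt n }[/math].

```python
from math import comb

def ramsey_lhs(n, k, l, p):
    # C(n,k) * p^C(k,2) + C(n,l) * (1-p)^C(l,2): if this is < 1 for some
    # p in [0,1], the problem's claim yields R(k, l) > n.
    return comb(n, k) * p ** comb(k, 2) + comb(n, l) * (1 - p) ** comb(l, 2)

# Symmetric example: with k = l and p = 1/2 the condition becomes
# C(n,k) * 2^(1 - C(k,2)) < 1, the classical Erdos lower bound for R(k,k).
```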