数据科学基础 (Fall 2024)/Problem Set 1
- Every problem should be answered with a complete solution; you may write in either Chinese or English.
- We recommend typesetting your homework with LaTeX, Markdown, or similar tools.
- Students without access to such tools may write their solutions on paper and submit photos.
Assumption throughout Problem Set 1
Unless stated otherwise, we work with the probability space [math]\displaystyle{ (\Omega,\mathcal{F},\mathbf{Pr}) }[/math].
Problem 1 (Probability space)
- [De Morgan’s Laws] Let [math]\displaystyle{ \{A_i\} }[/math] be a collection of sets. Prove [math]\displaystyle{ \left(\bigcup_iA_i\right)^c=\bigcap_iA_i^c }[/math].
- [[math]\displaystyle{ \sigma }[/math]-algebra] Let [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] belong to some [math]\displaystyle{ \sigma }[/math]-algebra [math]\displaystyle{ \mathcal F }[/math]. Show that [math]\displaystyle{ \mathcal F }[/math] contains the sets [math]\displaystyle{ A\cap B }[/math], [math]\displaystyle{ A\setminus B }[/math], and [math]\displaystyle{ A\oplus B }[/math]. (It is unnecessary to prove that [math]\displaystyle{ \mathcal F }[/math] contains [math]\displaystyle{ A\cup B }[/math]: that holds directly by the definition of a [math]\displaystyle{ \sigma }[/math]-algebra.)
- [Smallest [math]\displaystyle{ \sigma }[/math]-field] For any subset [math]\displaystyle{ S \subseteq 2^\Omega }[/math], prove that the smallest [math]\displaystyle{ \sigma }[/math]-field containing [math]\displaystyle{ S }[/math] is given by [math]\displaystyle{ \bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F} }[/math]. (Hint: You should show that it is indeed a [math]\displaystyle{ \sigma }[/math]-field and also that it is the smallest one containing [math]\displaystyle{ S }[/math].)
- [Nonexistence of probability space] Prove that it is impossible to define a uniform probability law on the natural numbers [math]\displaystyle{ \mathbb{N} }[/math]. More precisely, prove that there does not exist a probability space [math]\displaystyle{ (\mathbb{N},2^{\mathbb{N}},\mathbf{Pr}) }[/math] such that [math]\displaystyle{ \mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\}) }[/math] for all [math]\displaystyle{ i, j \in \mathbb{N} }[/math]. Please explain why the same argument fails to prove that there is no uniform probability law on the real interval [math]\displaystyle{ [0,1] }[/math], that is, there is no such probability space [math]\displaystyle{ ([0,1],\mathcal{F},\mathbf{Pr}) }[/math] that for any interval [math]\displaystyle{ (l,r] \subseteq [0,1] }[/math], it holds that [math]\displaystyle{ (l,r] \in \mathcal{F} }[/math] and [math]\displaystyle{ \mathbf{Pr}( (l,r] ) = r-l }[/math]. (Actually, such a probability measure does exist and is called the Lebesgue measure on [math]\displaystyle{ [0,1] }[/math].)
- [Basic properties] Prove the following basic properties of probability:
- [math]\displaystyle{ \mathbf{Pr}(A^c)=1-\mathbf{Pr}(A) }[/math]
- [math]\displaystyle{ \mathbf{Pr}(\emptyset)=0 }[/math]
- [math]\displaystyle{ \mathbf{Pr}(A\setminus B)=\mathbf{Pr}(A)-\mathbf{Pr}(A\cap B) }[/math]
- [math]\displaystyle{ A\subseteq B\implies \mathbf{Pr}(A)\le \mathbf{Pr}(B) }[/math]
- [math]\displaystyle{ \mathbf{Pr}(A\cup B)=\mathbf{Pr}(A)+\mathbf{Pr}(B)-\mathbf{Pr}(A\cap B) }[/math]
- [math]\displaystyle{ \mathbf{Pr}(A\oplus B)=\mathbf{Pr}(A)+\mathbf{Pr}(B)-2\mathbf{Pr}(A\cap B) }[/math]
- [Murphy’s Law] A fair coin is tossed repeatedly. Show that, with probability one, a head turns up sooner or later. Show similarly that any given finite sequence of heads and tails eventually occurs with probability one. Explain the connection with Murphy’s Law.
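The following minimal Monte Carlo sketch in Python (the pattern HTH and the toss counts are arbitrary illustrative choices, not part of the problem) estimates the probability that a fixed finite pattern appears within [math]\displaystyle{ n }[/math] tosses; the estimates should climb toward one as [math]\displaystyle{ n }[/math] grows, consistent with the claim above.

```python
import random

def pattern_appears(pattern: str, n_tosses: int) -> bool:
    """Toss a fair coin n_tosses times; report whether `pattern` occurs."""
    tosses = ''.join(random.choice('HT') for _ in range(n_tosses))
    return pattern in tosses

def estimate(pattern: str, n_tosses: int, trials: int = 10_000) -> float:
    """Monte Carlo estimate of Pr(pattern occurs within n_tosses tosses)."""
    hits = sum(pattern_appears(pattern, n_tosses) for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    # The estimates approach 1 as the number of tosses grows.
    for n in (10, 50, 200, 1000):
        print(n, estimate("HTH", n))
```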
Problem 2 (Principle of Inclusion and Exclusion)
Let [math]\displaystyle{ n\ge 1 }[/math] be a positive integer and [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] be [math]\displaystyle{ n }[/math] events.
- [Union bound] Prove [math]\displaystyle{ \mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right) }[/math] using the definition of a probability space.
- [Principle of Inclusion and Exclusion (PIE)] Prove that [math]\displaystyle{ \mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right) }[/math], where [math]\displaystyle{ [n]=\{1,2,\ldots,n\} }[/math].
- [Bonferroni's inequality and Kounias' inequality] Prove that [math]\displaystyle{ \sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i\lt j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i). }[/math] (Hint: The upper bound is sometimes called Kounias' inequality, which is weaker than Bonferroni's inequality. You can try using a Venn diagram to understand these inequalities.)
- [Euler's totient function] For any positive integer [math]\displaystyle{ n }[/math], let [math]\displaystyle{ \varphi(n) }[/math] be the number of positive integers [math]\displaystyle{ x\le n }[/math] such that [math]\displaystyle{ \mathrm{gcd}(x,n)=1 }[/math]. Prove that [math]\displaystyle{ \varphi(n)=n\prod_{i=1}^k(1-1/p_i) }[/math] using the principle of inclusion and exclusion, where the [math]\displaystyle{ p_i }[/math]'s are distinct primes and [math]\displaystyle{ n=\prod_{i=1}^k p_i^{c_i} }[/math] is the prime factorization of [math]\displaystyle{ n }[/math] for some positive integers [math]\displaystyle{ \{c_i\}_{i\le k} }[/math].
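As a quick numerical cross-check of the product formula, here is a short Python sketch (the sampled values of [math]\displaystyle{ n }[/math] are arbitrary; the trial-division totient computation is a standard routine, not part of the problem):

```python
from math import gcd

def phi_bruteforce(n: int) -> int:
    """Count the integers 1 <= x <= n with gcd(x, n) = 1."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

def phi_product(n: int) -> int:
    """Evaluate n * prod(1 - 1/p) over the distinct prime factors p of n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p          # multiply by (1 - 1/p), exactly
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                              # one prime factor remains
        result -= result // m
    return result

if __name__ == "__main__":
    for n in (1, 12, 36, 97, 360):
        assert phi_bruteforce(n) == phi_product(n)
    print("product formula agrees with brute force on all sampled values")
```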
Problem 3 (Symmetric 1D random walk)
A gambler plays a fair gambling game: at each round, he flips a fair coin, earns [math]\displaystyle{ 1 }[/math] point if it comes up HEADS, and loses [math]\displaystyle{ 1 }[/math] point otherwise.
- [Symmetric 1D random walk (I)] Let [math]\displaystyle{ A_i }[/math] be the event that the gambler earns [math]\displaystyle{ 0 }[/math] points after playing [math]\displaystyle{ i }[/math] rounds of the game, that is, the number of times the coin lands on heads is equal to the number of times it lands on tails. Compute [math]\displaystyle{ \sum_{i=1}^{+\infty} \mathbf{Pr}(A_i) }[/math]. (Hint: You may use Stirling's approximation to estimate [math]\mathbf{Pr}(A_i)[/math] and derive that [math]\displaystyle{ \sum_{i=1}^{+\infty} \mathbf{Pr}(A_i) = +\infty }[/math]; see the numerical sketch at the end of Problem 3.)
- [Symmetric 1D random walk (II)] Suppose that the game ends once the gambler loses all of his [math]\displaystyle{ m }[/math] points. Let [math]\displaystyle{ B_i }[/math] be the event that the game ends within [math]\displaystyle{ i }[/math] rounds. Compute [math]\displaystyle{ \sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) }[/math]. (Hint: You may first consider the [math]\displaystyle{ m = 1 }[/math] case. Let [math]\displaystyle{ C_i }[/math] be the event that the game ends at the [math]\displaystyle{ i }[/math]-th round. (i) Prove that [math]\displaystyle{ \sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = \sum_{i=1}^{+\infty} (i-1) \mathbf{Pr}(C_i) }[/math]. (ii) Compute [math]\displaystyle{ \mathbf{Pr}(C_i) }[/math], which is a special case of the ballot problem. (iii) Finally, use Stirling's approximation to derive that [math]\displaystyle{ \sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = +\infty }[/math].)
- [Symmetric 1D random walk (III)] Suppose that the game ends once the gambler loses all of his [math]\displaystyle{ m }[/math] points or earns another [math]\displaystyle{ m }[/math] points (i.e., the number of points he possesses reaches [math]\displaystyle{ 2m }[/math]). Let [math]\displaystyle{ C }[/math] be the event that the game ends within [math]\displaystyle{ n }[/math] rounds. Compute [math]\displaystyle{ \mathbf{Pr}(C) }[/math]. (Hint: A variant of the problem is sometimes called the ballot problem, and there is a famous trick known as André's reflection principle.)
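To get a feel for the Stirling hint in part (I), note that [math]\displaystyle{ \mathbf{Pr}(A_{2k}) = \binom{2k}{k}/4^k }[/math], which Stirling's approximation predicts behaves like [math]\displaystyle{ 1/\sqrt{\pi k} }[/math]. A small exact computation in Python (the sampled values of [math]\displaystyle{ k }[/math] are arbitrary) illustrates this:

```python
from math import comb, pi, sqrt

def p_return(k: int) -> float:
    """Exact Pr(A_{2k}) = C(2k, k) / 4^k: equally many heads and tails."""
    return comb(2 * k, k) / 4 ** k

if __name__ == "__main__":
    # The ratio tends to 1, so Pr(A_{2k}) ~ 1/sqrt(pi*k); summing over k
    # then diverges like a constant times the sum of 1/sqrt(k).
    for k in (1, 10, 100, 1000):
        exact, approx = p_return(k), 1 / sqrt(pi * k)
        print(k, exact, approx, exact / approx)
```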
Problem 4 (Conditional probability)
- [Two envelopes] You are handed two envelopes, and you know that each contains a positive integer dollar amount and that the two amounts are different. The values of these two amounts are modeled as constants that are unknown. Without knowing what the amounts are, you select at random one of the two envelopes, and after looking at the amount inside, you may switch envelopes if you wish. A friend claims that the following strategy will increase your probability of ending up with the envelope with the larger amount to above [math]\displaystyle{ 1/2 }[/math]: toss a coin repeatedly, let [math]\displaystyle{ X }[/math] be equal to [math]\displaystyle{ 1/2 }[/math] plus the number of tosses required to obtain heads for the first time, and switch if the amount in the envelope you selected is less than the value of [math]\displaystyle{ X }[/math]. Is your friend correct? (See the simulation sketch at the end of Problem 4.)
- [Paradox of induction] Consider a statement whose truth is unknown. If we see many examples that are compatible with it, we are tempted to view the statement as more probable. Such reasoning is often referred to as inductive inference (in a philosophical, rather than mathematical, sense). Consider now the statement that "all cows are white." An equivalent statement is that "everything that is not white is not a cow." We then observe several black crows. Our observations are clearly compatible with the statement, but do they make the hypothesis "all cows are white" more likely?
To analyze such a situation, we consider a probabilistic model. Let us assume that there are two possible states of the world, which we model as complementary events:
- [math]\displaystyle{ A }[/math]: all cows are white,
- [math]\displaystyle{ A^c }[/math]: [math]\displaystyle{ 50\% }[/math] of all cows are white.
Let [math]\displaystyle{ p }[/math] be the prior probability [math]\displaystyle{ \mathbf{Pr}(A) }[/math] that all cows are white. We make an observation of a cow or a crow, with probability [math]\displaystyle{ q }[/math] and [math]\displaystyle{ 1 - q }[/math], respectively, independently of whether event [math]\displaystyle{ A }[/math] occurs or not. Assume that [math]\displaystyle{ 0 \lt p \lt 1 }[/math], [math]\displaystyle{ 0 \lt q \lt 1 }[/math], and that all crows are black.
- Given the event [math]\displaystyle{ B = \{\text{a black crow was observed}\} }[/math], what is [math]\displaystyle{ \mathbf{Pr}(A | B) }[/math]?
- Given the event [math]\displaystyle{ C = \{\text{a white cow was observed}\} }[/math], what is [math]\displaystyle{ \mathbf{Pr}(A | C) }[/math]?
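Referring back to the two-envelopes part above, here is a minimal simulation sketch in Python for checking your answer; the dollar amounts passed to play_once are arbitrary hypothetical choices, not given in the problem.

```python
import random

def play_once(a: int, b: int) -> bool:
    """Play the friend's strategy once with amounts a < b in the envelopes.
    Returns True iff we end up holding the larger amount b."""
    observed, other = (a, b) if random.random() < 0.5 else (b, a)
    # X = 1/2 + (number of tosses until the first head).
    tosses = 1
    while random.random() < 0.5:       # tails with probability 1/2: toss again
        tosses += 1
    x = 0.5 + tosses
    final = other if observed < x else observed
    return final == b

if __name__ == "__main__":
    trials = 200_000
    for a, b in ((1, 2), (5, 10)):
        wins = sum(play_once(a, b) for _ in range(trials))
        print(f"amounts ({a}, {b}): estimated success probability",
              wins / trials)           # compare against 1/2
```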
Problem 5 (Independence)
- [Pairwise independence] Prove that the following statements are equivalent (assume [math]\displaystyle{ 0 \lt \mathbf{Pr}(B) \lt 1 }[/math] so that both conditional probabilities are well defined):
- [math]\displaystyle{ \mathbf{Pr}(A|B)=\mathbf{Pr}(A) }[/math]
- [math]\displaystyle{ \mathbf{Pr}(A\cap B)=\mathbf{Pr}(A)\cdot \mathbf{Pr}(B) }[/math]
- [math]\displaystyle{ \mathbf{Pr}(A|B^c)=\mathbf{Pr}(A) }[/math]
- [Mutual independence] Let [math]\displaystyle{ I=\{1,2,3\} }[/math], and let [math]\displaystyle{ \{A_i:i\in I\} }[/math] be a collection of events. Prove that the following statements are equivalent:
- For any [math]\displaystyle{ e\in I }[/math] and all subsets [math]\displaystyle{ J\subseteq I\setminus \{e\} }[/math], [math]\displaystyle{ \mathbf{Pr}\left(A_e\middle|\bigcap_{i\in J}A_i\cap\bigcap_{i\in (I\setminus\{e\})\setminus J} A_i^c\right)=\mathbf{Pr}(A_e) }[/math] whenever the conditioning event has positive probability
- For all subsets [math]\displaystyle{ J\subseteq I }[/math], [math]\displaystyle{ \mathbf{Pr}\left(\bigcap_{i\in J}A_i\right)=\prod_{i\in J}\mathbf{Pr}(A_i) }[/math]
For any integer [math]\displaystyle{ n\gt 3 }[/math], does the equivalence hold if [math]\displaystyle{ I=\{1,\dots,n\} }[/math]? Prove or disprove it.
- [Limited independence] Construct three events [math]\displaystyle{ A,B }[/math] and [math]\displaystyle{ C }[/math] out of [math]\displaystyle{ n }[/math] Bernoulli trials, each succeeding with probability [math]\displaystyle{ p }[/math], such that [math]\displaystyle{ A, B }[/math] and [math]\displaystyle{ C }[/math] are pairwise independent but are not (mutually) independent. You need to prove that the constructed events [math]\displaystyle{ A, B }[/math] and [math]\displaystyle{ C }[/math] satisfy these properties. (Hint: Consider the case where [math]\displaystyle{ n = 2 }[/math] and [math]\displaystyle{ p = 1/2 }[/math].)
- [Conditional independence] Show that the conditional independence of [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] given [math]\displaystyle{ C }[/math] neither implies, nor is implied by, the independence of [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math]. For which events [math]\displaystyle{ C }[/math] is it the case that, for all [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math], the events [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] are independent if and only if they are conditionally independent given [math]\displaystyle{ C }[/math]?
- [Two hounds] A hunter has two hunting dogs. One day, on the trail of some animal, the hunter comes to a place where the road diverges into two paths. He knows that each dog, independently of the other, will choose the correct path with probability [math]\displaystyle{ p }[/math]. The hunter decides to let each dog choose a path; if they agree, he takes that path, and if they disagree, he picks one of the two at random. Is his strategy better than just letting one of the two dogs decide on a path?
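A minimal simulation sketch in Python for checking your answer to the two-hounds problem (the value [math]\displaystyle{ p = 0.7 }[/math] and the trial count are arbitrary sample choices):

```python
import random

def consensus_strategy(p: float) -> bool:
    """Both dogs choose independently; follow an agreement, otherwise
    flip a fair coin. Returns True iff the correct path is taken."""
    dog1 = random.random() < p     # True means this dog picked the right path
    dog2 = random.random() < p
    if dog1 == dog2:
        return dog1
    return random.random() < 0.5   # disagreement: pick a path at random

if __name__ == "__main__":
    p, trials = 0.7, 200_000
    wins = sum(consensus_strategy(p) for _ in range(trials))
    print("consensus strategy:", wins / trials, "  single dog:", p)
```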
Problem 6 (Probabilistic method)
- [Tournaments] A tournament can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once. Given two players [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math], we draw an arrow from [math]\displaystyle{ x }[/math] to [math]\displaystyle{ y }[/math] if [math]\displaystyle{ x }[/math] beats [math]\displaystyle{ y }[/math] (we say “[math]\displaystyle{ x }[/math] dominates [math]\displaystyle{ y }[/math]”), and we draw an arrow from [math]\displaystyle{ y }[/math] to [math]\displaystyle{ x }[/math] if [math]\displaystyle{ y }[/math] beats [math]\displaystyle{ x }[/math] (there are no ties). We say a tournament has property [math]\displaystyle{ S_k }[/math] if for every set of [math]\displaystyle{ k }[/math] players, there exists another player [math]\displaystyle{ v }[/math] who defeats all of them. Prove that if [math]\displaystyle{ \binom n k(1-2^{-k})^{n-k}\lt 1 }[/math], then there is a tournament on [math]\displaystyle{ n }[/math] vertices that has the property [math]\displaystyle{ S_k }[/math].
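In the spirit of the probabilistic method, one can sample a uniformly random tournament (each game decided by a fair coin) and test it for property [math]\displaystyle{ S_k }[/math]. The Python sketch below does this for [math]\displaystyle{ n = 21 }[/math], [math]\displaystyle{ k = 2 }[/math], a pair satisfying the hypothesis [math]\displaystyle{ \binom{n}{k}(1-2^{-k})^{n-k} \lt 1 }[/math]; the specific parameters and sample count are arbitrary illustrative choices.

```python
import random
from itertools import combinations

def random_tournament(n: int):
    """beats[x][y] is True iff x beats y; each game is a fair coin flip."""
    beats = [[False] * n for _ in range(n)]
    for x in range(n):
        for y in range(x + 1, n):
            if random.random() < 0.5:
                beats[x][y] = True
            else:
                beats[y][x] = True
    return beats

def has_property_sk(beats, k: int) -> bool:
    """Check that every set of k players is beaten by some outside player."""
    n = len(beats)
    return all(
        any(all(beats[v][y] for y in group)
            for v in range(n) if v not in group)
        for group in combinations(range(n), k)
    )

if __name__ == "__main__":
    # For n = 21, k = 2: C(21,2) * (3/4)^19 is about 0.89 < 1, so a random
    # tournament has property S_2 with positive probability; sampling a few
    # tournaments usually exhibits one.
    n, k = 21, 2
    found = sum(has_property_sk(random_tournament(n), k) for _ in range(20))
    print(f"{found}/20 sampled tournaments on {n} vertices have S_{k}")
```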