<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://tcs.nju.edu.cn/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Zouzongrui</id>
	<title>TCS Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://tcs.nju.edu.cn/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Zouzongrui"/>
	<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=Special:Contributions/Zouzongrui"/>
	<updated>2026-04-30T18:31:42Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.45.3</generator>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/%E7%AC%AC%E5%9B%9B%E6%AC%A1%E4%BD%9C%E4%B8%9A%E6%8F%90%E4%BA%A4%E5%90%8D%E5%8D%95&amp;diff=13155</id>
		<title>概率论与数理统计 (Spring 2025)/第四次作业提交名单</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/%E7%AC%AC%E5%9B%9B%E6%AC%A1%E4%BD%9C%E4%B8%9A%E6%8F%90%E4%BA%A4%E5%90%8D%E5%8D%95&amp;diff=13155"/>
		<updated>2025-05-18T08:41:59Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: Created page with &amp;quot; 如有错漏请邮件联系助教. &amp;lt;center&amp;gt; {| class=&amp;quot;wikitable&amp;quot; |- ! 学号 !! 姓名 |- | 231240009 || 陈心怡  |- | 231240055 || 魏全烨  |- | 231240005 || 王甚易  |- | 231840058 || 王润希  |- | 231200035 || 葛翰飞  |- | 231840166 || 赵少凡  |- | 231240056 || 靳濡搏  |- | 231880394 || 翟笑晨  |- | 231098046 || 李方泽  |- | 231840242 || 李昊凌  |- | 231240018 || 渠翔凯  |- | 231240015 || 邱文宣  |- | 231220001 || 陈梓睿  |- | 23123...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; 如有错漏请邮件联系助教.&lt;br /&gt;
&amp;lt;center&amp;gt;&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! 学号 !! 姓名&lt;br /&gt;
|-&lt;br /&gt;
| 231240009 || 陈心怡 &lt;br /&gt;
|-&lt;br /&gt;
| 231240055 || 魏全烨 &lt;br /&gt;
|-&lt;br /&gt;
| 231240005 || 王甚易 &lt;br /&gt;
|-&lt;br /&gt;
| 231840058 || 王润希 &lt;br /&gt;
|-&lt;br /&gt;
| 231200035 || 葛翰飞 &lt;br /&gt;
|-&lt;br /&gt;
| 231840166 || 赵少凡 &lt;br /&gt;
|-&lt;br /&gt;
| 231240056 || 靳濡搏 &lt;br /&gt;
|-&lt;br /&gt;
| 231880394 || 翟笑晨 &lt;br /&gt;
|-&lt;br /&gt;
| 231098046 || 李方泽 &lt;br /&gt;
|-&lt;br /&gt;
| 231840242 || 李昊凌 &lt;br /&gt;
|-&lt;br /&gt;
| 231240018 || 渠翔凯 &lt;br /&gt;
|-&lt;br /&gt;
| 231240015 || 邱文宣 &lt;br /&gt;
|-&lt;br /&gt;
| 231220001 || 陈梓睿 &lt;br /&gt;
|-&lt;br /&gt;
| 231230102 || 庄铸锴 &lt;br /&gt;
|-&lt;br /&gt;
| 231880152 || 李佳骏 &lt;br /&gt;
|-&lt;br /&gt;
| 231880123 || 陆博文 &lt;br /&gt;
|-&lt;br /&gt;
| 231240047 || 姜淮仁 &lt;br /&gt;
|-&lt;br /&gt;
| 231240029 || 朱非凡 &lt;br /&gt;
|-&lt;br /&gt;
| 231240060 || 侯佳松 &lt;br /&gt;
|-&lt;br /&gt;
| 231240002 || 余孟凡 &lt;br /&gt;
|-&lt;br /&gt;
| 231240024 || 陈越 &lt;br /&gt;
|-&lt;br /&gt;
| 231240025 || 李熠辰 &lt;br /&gt;
|-&lt;br /&gt;
| 231240051 || 王昱霏 &lt;br /&gt;
|-&lt;br /&gt;
| 231240011 || 陈力琰 &lt;br /&gt;
|-&lt;br /&gt;
| 231880140 || 桂天麟 &lt;br /&gt;
|-&lt;br /&gt;
| 231240038 || 肖金宇 &lt;br /&gt;
|-&lt;br /&gt;
| 231880320 || 尹铁越 &lt;br /&gt;
|-&lt;br /&gt;
| 231240013 || 南晨曦 &lt;br /&gt;
|-&lt;br /&gt;
| 231240058 || 陈小川 &lt;br /&gt;
|-&lt;br /&gt;
| 231240004 || 何梓杨 &lt;br /&gt;
|-&lt;br /&gt;
| 231240007 || 高亮节 &lt;br /&gt;
|-&lt;br /&gt;
| 231220090 || 邓淇 &lt;br /&gt;
|-&lt;br /&gt;
| 231240043 || 朱宇曦 &lt;br /&gt;
|-&lt;br /&gt;
| 231240053 || 王艺文 &lt;br /&gt;
|-&lt;br /&gt;
| 211250032 || 吕卉森 &lt;br /&gt;
|-&lt;br /&gt;
| 231220002 || 潘谟天 &lt;br /&gt;
|-&lt;br /&gt;
| 231240027 || 彭浩楠 &lt;br /&gt;
|-&lt;br /&gt;
| 231240016 || 徐冰冰 &lt;br /&gt;
|-&lt;br /&gt;
| 231240019 || 邵与乔 &lt;br /&gt;
|-&lt;br /&gt;
| 231240026 || 蒙航 &lt;br /&gt;
|-&lt;br /&gt;
| 231220005 || 樊书岩 &lt;br /&gt;
|-&lt;br /&gt;
| 231098142 || 陈逸帆 &lt;br /&gt;
|-&lt;br /&gt;
| 231240023 || 叶宸见 &lt;br /&gt;
|-&lt;br /&gt;
| 231240021 || 陈奕帆 &lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=13154</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=13154"/>
		<updated>2025-05-18T08:41:31Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (单), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Friday, 2pm-3pm &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** 周一：10am-12pm，仙Ⅰ-204&lt;br /&gt;
** 周三（单）：10am-12pm，仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* 周五 2pm-3pm, 计算机系 804（尹一通）&lt;br /&gt;
:* 周五 2pm-3pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ群&#039;&#039;&#039;: 974742320（申请加入需提供姓名、院系、学号）&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
课程内容分为三大部分：&lt;br /&gt;
* &#039;&#039;&#039;经典概率论&#039;&#039;&#039;：概率空间、随机变量及其数字特征、多维与连续随机变量、极限定理等内容&lt;br /&gt;
* &#039;&#039;&#039;概率与计算&#039;&#039;&#039;：测度集中现象 (concentration of measure)、概率法 (the probabilistic method)、离散随机过程的相关专题&lt;br /&gt;
* &#039;&#039;&#039;数理统计&#039;&#039;&#039;：参数估计、假设检验、贝叶斯估计、线性回归等统计推断概念&lt;br /&gt;
&lt;br /&gt;
对于第一和第二部分，要求清楚掌握基本概念，深刻理解关键的现象与规律以及背后的原理，并可以灵活运用所学方法求解相关问题。对于第三部分，要求熟悉数理统计的若干基本概念，以及典型的统计模型和统计推断问题。&lt;br /&gt;
&lt;br /&gt;
经过本课程的训练，力求使学生能够熟练掌握概率的语言，并会利用概率思维来理解客观世界并对其建模，以及驾驭概率的数学工具来分析和求解专业问题。&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论（第2版·修订版），[美]伯特瑟卡斯（Dimitri P.Bertsekas）[美]齐齐克利斯（John N.Tsitsiklis）著，郑忠国 童行伟 译，人民邮电出版社（2022）。&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* 课程成绩：本课程将会有若干次作业和一次期末考试。最终成绩将由平时作业成绩和期末考试成绩综合得出。&lt;br /&gt;
* 迟交：如果有特殊的理由，无法按时完成作业，请提前联系授课老师，给出正当理由。否则迟交的作业将不被接受。&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
学术诚信是所有从事学术活动的学生和学者最基本的职业道德底线，本课程将不遗余力地维护学术诚信规范，违反这一底线的行为将不会被容忍。&lt;br /&gt;
&lt;br /&gt;
作业完成的原则：署你名字的工作必须是你个人的贡献。在完成作业的过程中，允许讨论，前提是讨论的所有参与者均处于同等完成度。但关键想法的执行、以及作业文本的写作必须独立完成，并在作业中致谢（acknowledge）所有参与讨论的人。符合规则的讨论与致谢将不会影响得分。不允许其他任何形式的合作——尤其是与已经完成作业的同学“讨论”。&lt;br /&gt;
&lt;br /&gt;
本课程将对剽窃行为采取零容忍的态度。在完成作业过程中，对他人工作（出版物、互联网资料、其他人的作业等）直接的文本抄袭和对关键思想、关键元素的抄袭，按照 [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]的解释，都将视为剽窃。剽窃者成绩将被取消。如果发现互相抄袭行为，&amp;lt;font color=red&amp;gt; 抄袭和被抄袭双方的成绩都将被取消&amp;lt;/font&amp;gt;。因此请主动防止自己的作业被他人抄袭。&lt;br /&gt;
&lt;br /&gt;
学术诚信影响学生个人的品行，也关乎整个教育系统的正常运转。为了一点分数而做出学术不端的行为，不仅使自己沦为一个欺骗者，也使他人的诚实努力失去意义。让我们一起努力维护一个诚信的环境。&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 1|Problem Set 1]]  请在 2025/3/10 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第一次作业提交名单|第一次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 2|Problem Set 2]]  请在 2025/3/31 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A2.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第二次作业提交名单|第二次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 3|Problem Set 3]]  请在 2025/4/21 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A3.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第三次作业提交名单|第三次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 4|Problem Set 4]]  请在 2025/5/14 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A4.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第四次作业提交名单|第四次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Intro.pdf 课程简介]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/ProbSpace.pdf 概率空间]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第1章&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Entropy and volume of Hamming balls|Entropy and volume of Hamming balls]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/RandVar.pdf 随机变量]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第2章&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 2, Sections 3.1~3.5, 3.7&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 2&#039;&#039;&#039;&lt;br /&gt;
#* the [http://tcs.nju.edu.cn/slides/prob2024/discrete-pmf.nb &#039;&#039;&#039;Mathematica notebook file&#039;&#039;&#039;] for the PMFs of basic discrete distributions&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Average-case analysis of QuickSort|Average-case analysis of &#039;&#039;&#039;&#039;&#039;QuickSort&#039;&#039;&#039;&#039;&#039;]]&lt;br /&gt;
#* [https://www.bilibili.com/video/BV1ta411A7fp/ 高尔顿板（Galton board）视频] 和 [https://en.wikipedia.org/wiki/Galton_board 维基百科页面]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Deviation.pdf 矩与偏差]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 3&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 章节 2.4, 4.2, 4.3, 5.1&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Sections 3.3, 3.6, 7.3&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Two-point sampling|Two-point sampling]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Threshold of k-clique in random graph|Threshold of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-clique in random graph]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Weierstrass Approximation Theorem|Weierstrass approximation]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Continuous.pdf 连续分布]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第3章, 和4.1节&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 4&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapters 8, 9&#039;&#039;&#039;&lt;br /&gt;
#* [https://measure.axler.net/MIRA.pdf Measure, Integration &amp;amp; Real Analysis] by Sheldon Axler&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/An exercise on induced distribution|An exercise on induced distribution]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Convergence.pdf 极限定理]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第5章&#039;&#039;&#039; &lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[GS] Sections 5.7~5.10, 7.1~7.5&#039;&#039;&#039;&lt;br /&gt;
# [[Media:Concentration-Prob2025.pdf|测度集中]]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 4&#039;&#039;&#039; and &#039;&#039;&#039;Sections 13.1, 13.4~13.5&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[GS] Sections 5.11, 12.1~12.3, 7.8~7.9&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Hoeffding&#039;s lemma|Hoeffding&#039;s lemma]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Bounded Difference Method|The bounded difference method]]&lt;br /&gt;
# [[Media:Process-Prob2025.pdf|随机过程]]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第6章, 第7章&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 7, Sections 13.1~13.3&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 6, Sections 12.4~12.5&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/OST and applications|OST and applications]]&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13105</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13105"/>
		<updated>2025-04-28T13:25:42Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Continuous Random Variables, 30 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
*Bonus problem为附加题（选做）。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? independent? You should prove your claim rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; be independent and uniformly distributed on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;. Find the joint density function of &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Z^2&amp;lt;/math&amp;gt;, and show that &amp;lt;math&amp;gt;\textbf{Pr}[XY&amp;lt;Z^2] = 5/9&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
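A Monte Carlo sanity check of the claimed value 5/9 (an illustration only; the exercise asks for the joint density and an exact proof):

```python
# Estimate the probability that XY is smaller than Z^2 for independent
# X, Y, Z ~ Uniform[0,1]. Monte Carlo illustration only, not a derivation.
import random

random.seed(0)
trials = 1_000_000
hits = sum(1 for _ in range(trials)
           if random.random() ** 2 > random.random() * random.random())
estimate = hits / trials
print(estimate)  # ≈ 5/9 ≈ 0.5556
```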
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density functions of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt;, respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find out the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039; a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
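A direct simulation of the process above (a sanity check, not the expectation computation the problem asks for). For U in [0,1] the classical answer is e^U, which the empirical mean should approach:

```python
# Simulate the process: add Uniform(0,1) draws until their running sum
# reaches U, and return how many draws were needed. For U in [0,1],
# the expected count is e^U.
import math
import random

def run_process(U):
    x, count = 0.0, 0
    while U > x:                  # same loop condition as the pseudocode
        x += random.random()      # y ~ Uniform(0,1)
        count += 1
    return count

random.seed(0)
U, trials = 0.5, 200_000
mean_count = sum(run_process(U) for _ in range(trials)) / trials
print(mean_count)  # ≈ e^0.5 ≈ 1.6487
```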
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
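The two bounds in the previous item can be compared numerically. The sketch below uses the textbook Chebyshev derivation and the KL-divergence (tightest exponential-moment) form of the Chernoff bound; both are this sketch's choices, not prescribed by the exercise:

```python
# Compare upper bounds on Pr[X >= n/4] for X ~ Binomial(n, 1/6)
# (n die throws, counting sixes). Chebyshev uses Var(X) = 5n/36 and the
# deviation n/4 - n/6 = n/12; Chernoff is taken in its tightest form
# exp(-n * KL(1/4 || 1/6)).
import math

def chebyshev_bound(n):
    return (n * (1 / 6) * (5 / 6)) / (n / 12) ** 2   # simplifies to 20/n

def chernoff_bound(n):
    p, q = 1 / 4, 1 / 6
    kl = p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))
    return math.exp(-n * kl)

n = 1000
print(chebyshev_bound(n))   # 20/n = 0.02
print(chernoff_bound(n))    # ≈ 2e-10, exponentially smaller
```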
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;:&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Strong Law of Large Numbers, 15 points, &#039;&#039;&#039;Bonus Problem&#039;&#039;&#039;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Throughout this problem, we assume that &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; are jointly independent square-integrable real random variables with mean zero. We will prove the strong law of large numbers using Kolmogorov&#039;s maximal inequality.&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Kolmogorov maximal inequality&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left(\max_{1 \le i \le n} |S_i| \ge t \right) \le \frac{\sum_{i=1}^n \mathbf{Var}(X_i)}{t^2}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Convergence of random series&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Var}(X_i) &amp;lt; \infty&amp;lt;/math&amp;gt;. Prove that the series &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} X_i&amp;lt;/math&amp;gt; is almost surely convergent.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Strong law of large numbers&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove the strong law of large numbers using the previous propositions.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 10 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the results of Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT) to solve the problems.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Prove the following claim: for each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. If &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be i.i.d. copies of a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with the standard Cauchy distribution (i.e., the probability density function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;\frac{1}{\pi} \frac{1}{1+x^2}&amp;lt;/math&amp;gt;). Show that &amp;lt;math&amp;gt;\frac{S_n}{n \log n}&amp;lt;/math&amp;gt; converges in probability to the constant &amp;lt;math&amp;gt;\frac{2}{\pi}&amp;lt;/math&amp;gt; but is almost surely unbounded, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n |X_i|&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
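A minimal Monte Carlo integration sketch illustrating the statement, with f(x) = x^2 as an arbitrary illustrative choice (exact value 1/3):

```python
# Monte Carlo estimate of the integral of f(x) = x^2 over [0, 1].
# The integrand here is an illustrative choice; the exact answer is 1/3.
import random

random.seed(0)
trials = 1_000_000
estimate = sum(random.random() ** 2 for _ in range(trials)) / trials
print(estimate)  # ≈ 1/3
```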
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Stirling&#039;s formula&amp;lt;/strong&amp;gt;]&lt;br /&gt;
By considering the central limit theorem for the sum of independent Poisson-distributed random variables, show that&lt;br /&gt;
&amp;lt;math&amp;gt;n! \sim \sqrt{2\pi n} \cdot \left(\frac{n}{\mathrm{e}}\right)^n&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
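The asymptotic can be checked numerically; working in log space via `math.lgamma` avoids overflowing floats for moderate n:

```python
# Check Stirling's formula numerically: n! / (sqrt(2*pi*n) * (n/e)^n) -> 1.
# Compute in log space with math.lgamma(n + 1) = log(n!) to avoid overflow.
import math

n = 200
log_stirling = 0.5 * math.log(2 * math.pi * n) + n * (math.log(n) - 1)
ratio = math.exp(math.lgamma(n + 1) - log_stirling)
print(ratio)  # ≈ 1 + 1/(12n) ≈ 1.0004
```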
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13098</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_4&amp;diff=13098"/>
		<updated>2025-04-27T12:53:39Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: Created page with &amp;quot;*每道题目的解答都要有完整的解题过程，中英文不限。  *我们推荐大家使用LaTeX, markdown等对作业进行排版。  *Bonus problem为附加题（选做）。  == Assumption throughout Problem Set 4== &amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;  &amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;  &amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; u...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Each problem&#039;s solution must include the complete derivation; you may write in Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are optional extra-credit problems.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on a probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Throughout, &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; denotes the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? Are they independent? Prove your claims rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., the uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
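A numeric cross-check (not the derivation): the closed form to verify against is M(t) = (e^t - 1)/t for t ≠ 0, with M(0) = 1.

```python
# Cross-check the Uniform(0,1) MGF: E[e^{tU}] should equal (e^t - 1)/t
# for t != 0 (and 1 at t = 0). The defining integral over [0, 1] is
# evaluated by the midpoint rule.
import math

def mgf_closed_form(t):
    return (math.exp(t) - 1) / t if t != 0 else 1.0

def mgf_numeric(t, n=100_000):
    h = 1.0 / n
    return h * sum(math.exp(t * (i + 0.5) * h) for i in range(n))

for t in (-2.0, 0.5, 3.0):
    print(t, mgf_closed_form(t), mgf_numeric(t))
```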
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density functions of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt;, respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find out the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039; a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; such that the moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;:&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Modes of Convergence and Characteristic Function, 15 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Characteristic Function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is a discrete random variable only taking integer values and &amp;lt;math&amp;gt;\mathbf{E}[X]=0&amp;lt;/math&amp;gt;. Suppose furthermore that there is no infinite subprogression &amp;lt;math&amp;gt;a+q\mathbb{Z}&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;\mathbb{Z}&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;q&amp;gt;1&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; takes values almost surely in &amp;lt;math&amp;gt;a+q \mathbb{Z}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;|\phi_X(t)|&amp;lt;1&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;0 &amp;lt; t \le \pi&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\phi_X(t)=\mathbf{E}[\mathrm{e}^{itX}]&amp;lt;/math&amp;gt; is the characteristic function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 10 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the results of Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT) to solve the problems.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. Prove that if &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be i.i.d. copies of a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with the standard Cauchy distribution (i.e., the probability density function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; is &amp;lt;math&amp;gt;\frac{1}{\pi} \frac{1}{1+x^2}&amp;lt;/math&amp;gt;). Show that &amp;lt;math&amp;gt;\frac{S_n}{n \log n}&amp;lt;/math&amp;gt; converges in probability to the constant &amp;lt;math&amp;gt;\frac{2}{\pi}&amp;lt;/math&amp;gt; but is almost surely unbounded, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n |X_i|&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
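The claim that &amp;lt;math&amp;gt;S_n/(n \log_2 n) \overset{P}{\to} 1&amp;lt;/math&amp;gt; can be sanity-checked by simulation. A minimal sketch (Python standard library; the helper `play_once` and the sample size are our illustrative assumptions, not part of the problem): for moderately large &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; the ratio is typically near 1, although the heavy upper tail still allows occasional large excursions, consistent with the almost-sure &amp;lt;math&amp;gt;\limsup = \infty&amp;lt;/math&amp;gt; in the bonus part.

```python
import math
import random

random.seed(1)

def play_once() -> int:
    """One round: toss a fair coin until heads; k tosses pay 2**k dollars."""
    k = 1
    while random.random() < 0.5:  # tails with probability 1/2
        k += 1
    return 2 ** k

n = 200_000
S_n = sum(play_once() for _ in range(n))
ratio = S_n / (n * math.log2(n))
print(f"S_n / (n log2 n) = {ratio:.3f}")  # typically close to 1 for large n
```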
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds so long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
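The estimator &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; is straightforward to implement. A minimal sketch (Python standard library; the test function &amp;lt;math&amp;gt;f(x) = e^x&amp;lt;/math&amp;gt; is our arbitrary choice): by the weak law of large numbers the sample mean of &amp;lt;math&amp;gt;f(U_i)&amp;lt;/math&amp;gt; approaches &amp;lt;math&amp;gt;\int_0^1 f(x)\,\mathrm{d}x&amp;lt;/math&amp;gt;.

```python
import math
import random

random.seed(42)

def mc_integral(f, n: int) -> float:
    """Monte Carlo estimate of the integral of f over [0,1] from n i.i.d. U(0,1) draws."""
    return sum(f(random.random()) for _ in range(n)) / n

est = mc_integral(math.exp, 100_000)
exact = math.e - 1  # the exact value of the integral of e^x over [0,1]
print(f"estimate {est:.4f} vs exact {exact:.4f}")
```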
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show &amp;lt;math&amp;gt;2(\sqrt{S_n} - \sqrt{n}) \overset{D}{\to} \sigma N(0,1)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the statement in Problem 3.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
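A quick numerical check of the statement under one assumed concrete distribution (our choice for illustration: &amp;lt;math&amp;gt;X_i \sim \mathrm{Exponential}(1)&amp;lt;/math&amp;gt;, so &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\sigma^2 = 1&amp;lt;/math&amp;gt;; Python standard library): the empirical distribution of &amp;lt;math&amp;gt;2(\sqrt{S_n} - \sqrt{n})&amp;lt;/math&amp;gt; should have mean close to 0 and standard deviation close to &amp;lt;math&amp;gt;\sigma = 1&amp;lt;/math&amp;gt;.

```python
import math
import random
import statistics

random.seed(7)

# Replicate 2(sqrt(S_n) - sqrt(n)) many times with X_i ~ Exponential(1),
# for which E[X] = 1 and Var[X] = sigma^2 = 1.
n, reps = 2_000, 1_000
vals = []
for _ in range(reps):
    s_n = sum(random.expovariate(1.0) for _ in range(n))
    vals.append(2.0 * (math.sqrt(s_n) - math.sqrt(n)))

m, sd = statistics.mean(vals), statistics.stdev(vals)
print(f"empirical mean {m:.3f} (should be near 0), empirical std {sd:.3f} (should be near 1)")
```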
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=13097</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=13097"/>
		<updated>2025-04-27T12:52:41Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (单), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Friday, 2pm-3pm &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** 周一：10am-12pm，仙Ⅰ-204&lt;br /&gt;
** 周三（单）：10am-12pm，仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* 周五 2pm-3pm, 计算机系 804（尹一通）&lt;br /&gt;
:* 周五 2pm-3pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ群&#039;&#039;&#039;: 974742320（申请加入需提供姓名、院系、学号）&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
课程内容分为三大部分：&lt;br /&gt;
* &#039;&#039;&#039;经典概率论&#039;&#039;&#039;：概率空间、随机变量及其数字特征、多维与连续随机变量、极限定理等内容&lt;br /&gt;
* &#039;&#039;&#039;概率与计算&#039;&#039;&#039;：测度集中现象 (concentration of measure)、概率法 (the probabilistic method)、离散随机过程的相关专题&lt;br /&gt;
* &#039;&#039;&#039;数理统计&#039;&#039;&#039;：参数估计、假设检验、贝叶斯估计、线性回归等统计推断概念&lt;br /&gt;
&lt;br /&gt;
对于第一和第二部分，要求清楚掌握基本概念，深刻理解关键的现象与规律以及背后的原理，并可以灵活运用所学方法求解相关问题。对于第三部分，要求熟悉数理统计的若干基本概念，以及典型的统计模型和统计推断问题。&lt;br /&gt;
&lt;br /&gt;
经过本课程的训练，力求使学生能够熟悉掌握概率的语言，并会利用概率思维来理解客观世界并对其建模，以及驾驭概率的数学工具来分析和求解专业问题。&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论（第2版·修订版），[美]伯特瑟卡斯（Dimitri P. Bertsekas）、[美]齐齐克利斯（John N. Tsitsiklis）著，郑忠国 童行伟 译，人民邮电出版社（2022）。&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* 课程成绩：本课程将会有若干次作业和一次期末考试。最终成绩将由平时作业成绩和期末考试成绩综合得出。&lt;br /&gt;
* 迟交：如果有特殊的理由，无法按时完成作业，请提前联系授课老师，给出正当理由。否则迟交的作业将不被接受。&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
学术诚信是所有从事学术活动的学生和学者最基本的职业道德底线，本课程将不遗余力的维护学术诚信规范，违反这一底线的行为将不会被容忍。&lt;br /&gt;
&lt;br /&gt;
作业完成的原则：署你名字的工作必须是你个人的贡献。在完成作业的过程中，允许讨论，前提是讨论的所有参与者均处于同等完成度。但关键想法的执行、以及作业文本的写作必须独立完成，并在作业中致谢（acknowledge）所有参与讨论的人。符合规则的讨论与致谢将不会影响得分。不允许其他任何形式的合作——尤其是与已经完成作业的同学“讨论”。&lt;br /&gt;
&lt;br /&gt;
本课程将对剽窃行为采取零容忍的态度。在完成作业过程中，对他人工作（出版物、互联网资料、其他人的作业等）直接的文本抄袭和对关键思想、关键元素的抄袭，按照 [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]的解释，都将视为剽窃。剽窃者成绩将被取消。如果发现互相抄袭行为，&amp;lt;font color=red&amp;gt; 抄袭和被抄袭双方的成绩都将被取消&amp;lt;/font&amp;gt;。因此请主动防止自己的作业被他人抄袭。&lt;br /&gt;
&lt;br /&gt;
学术诚信影响学生个人的品行，也关乎整个教育系统的正常运转。为了一点分数而做出学术不端的行为，不仅使自己沦为一个欺骗者，也使他人的诚实努力失去意义。让我们一起努力维护一个诚信的环境。&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 1|Problem Set 1]]  请在 2025/3/10 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第一次作业提交名单|第一次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 2|Problem Set 2]]  请在 2025/3/31 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A2.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第二次作业提交名单|第二次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 3|Problem Set 3]]  请在 2025/4/21 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A3.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第三次作业提交名单|第三次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 4|Problem Set 4]]  请在 2025/5/14 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A4.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Intro.pdf 课程简介]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/ProbSpace.pdf 概率空间]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第1章&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Entropy and volume of Hamming balls|Entropy and volume of Hamming balls]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/RandVar.pdf 随机变量]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第2章&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 2, Sections 3.1~3.5, 3.7&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 2&#039;&#039;&#039;&lt;br /&gt;
#* the [http://tcs.nju.edu.cn/slides/prob2024/discrete-pmf.nb &#039;&#039;&#039;Mathematica notebook file&#039;&#039;&#039;] for the PMFs of basic discrete distributions&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Average-case analysis of QuickSort|Average-case analysis of &#039;&#039;&#039;&#039;&#039;QuickSort&#039;&#039;&#039;&#039;&#039;]]&lt;br /&gt;
#* [https://www.bilibili.com/video/BV1ta411A7fp/ 高尔顿板（Galton board）视频] 和 [https://en.wikipedia.org/wiki/Galton_board 维基百科页面]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Deviation.pdf 矩与偏差]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 3&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 章节 2.4, 4.2, 4.3, 5.1&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Sections 3.3, 3.6, 7.3&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Two-point sampling|Two-point sampling]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Threshold of k-clique in random graph|Threshold of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-clique in random graph]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Weierstrass Approximation Theorem|Weierstrass approximation]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Continuous.pdf 连续分布]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第3章, 和4.1节&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 4&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapters 8, 9&#039;&#039;&#039;&lt;br /&gt;
#* [https://measure.axler.net/MIRA.pdf Measure, Integration &amp;amp; Real Analysis] by Sheldon Axler&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/An exercise on induced distribution|An exercise on induced distribution]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Convergence.pdf 极限定理]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第5章&#039;&#039;&#039; &lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[GS] Sections 5.7~5.10, 7.1~7.5&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/%E7%AC%AC%E4%B8%89%E6%AC%A1%E4%BD%9C%E4%B8%9A%E6%8F%90%E4%BA%A4%E5%90%8D%E5%8D%95&amp;diff=13078</id>
		<title>概率论与数理统计 (Spring 2025)/第三次作业提交名单</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/%E7%AC%AC%E4%B8%89%E6%AC%A1%E4%BD%9C%E4%B8%9A%E6%8F%90%E4%BA%A4%E5%90%8D%E5%8D%95&amp;diff=13078"/>
		<updated>2025-04-21T06:13:58Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: Created page with &amp;quot; 如有错漏请邮件联系助教. &amp;lt;center&amp;gt; {| class=&amp;quot;wikitable&amp;quot; |- ! 学号 !! 姓名 |- | 231240038 || 肖金宇  |- | 231240002 || 余孟凡  |- | 231200035 || 葛翰飞  |- | 231880394 || 翟笑晨  |- | 231240055 || 魏全烨  |- | 231880320 || 尹铁越  |- | 231880123 || 陆博文  |- | 231840166 || 赵少凡  |- | 231840242 || 李昊凌  |- | 231098046 || 李方泽  |- | 231840058 || 王润希  |- | 231240021 || 陈奕帆  |- | 231240018 || 渠翔凯  |- | 23124...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; 如有错漏请邮件联系助教.&lt;br /&gt;
&amp;lt;center&amp;gt;&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! 学号 !! 姓名&lt;br /&gt;
|-&lt;br /&gt;
| 231240038 || 肖金宇 &lt;br /&gt;
|-&lt;br /&gt;
| 231240002 || 余孟凡 &lt;br /&gt;
|-&lt;br /&gt;
| 231200035 || 葛翰飞 &lt;br /&gt;
|-&lt;br /&gt;
| 231880394 || 翟笑晨 &lt;br /&gt;
|-&lt;br /&gt;
| 231240055 || 魏全烨 &lt;br /&gt;
|-&lt;br /&gt;
| 231880320 || 尹铁越 &lt;br /&gt;
|-&lt;br /&gt;
| 231880123 || 陆博文 &lt;br /&gt;
|-&lt;br /&gt;
| 231840166 || 赵少凡 &lt;br /&gt;
|-&lt;br /&gt;
| 231840242 || 李昊凌 &lt;br /&gt;
|-&lt;br /&gt;
| 231098046 || 李方泽 &lt;br /&gt;
|-&lt;br /&gt;
| 231840058 || 王润希 &lt;br /&gt;
|-&lt;br /&gt;
| 231240021 || 陈奕帆 &lt;br /&gt;
|-&lt;br /&gt;
| 231240018 || 渠翔凯 &lt;br /&gt;
|-&lt;br /&gt;
| 231240005 || 王甚易 &lt;br /&gt;
|-&lt;br /&gt;
| 231880140 || 桂天麟 &lt;br /&gt;
|-&lt;br /&gt;
| 231240011 || 陈力琰 &lt;br /&gt;
|-&lt;br /&gt;
| 231230102 || 庄铸锴 &lt;br /&gt;
|-&lt;br /&gt;
| 231240025 || 李熠辰 &lt;br /&gt;
|-&lt;br /&gt;
| 231220005 || 樊书岩 &lt;br /&gt;
|-&lt;br /&gt;
| 231240056 || 靳濡搏 &lt;br /&gt;
|-&lt;br /&gt;
| 231240015 || 邱文宣 &lt;br /&gt;
|-&lt;br /&gt;
| 231240060 || 侯佳松 &lt;br /&gt;
|-&lt;br /&gt;
| 231240026 || 蒙航 &lt;br /&gt;
|-&lt;br /&gt;
| 231240013 || 南晨曦 &lt;br /&gt;
|-&lt;br /&gt;
| 231240047 || 姜淮仁 &lt;br /&gt;
|-&lt;br /&gt;
| 231880152 || 李佳骏 &lt;br /&gt;
|-&lt;br /&gt;
| 231240051 || 王昱霏 &lt;br /&gt;
|-&lt;br /&gt;
| 231240058 || 陈小川 &lt;br /&gt;
|-&lt;br /&gt;
| 231240004 || 何梓杨 &lt;br /&gt;
|-&lt;br /&gt;
| 231240029 || 朱非凡 &lt;br /&gt;
|-&lt;br /&gt;
| 231240024 || 陈越 &lt;br /&gt;
|-&lt;br /&gt;
| 231220001 || 陈梓睿 &lt;br /&gt;
|-&lt;br /&gt;
| 231240053 || 王艺文 &lt;br /&gt;
|-&lt;br /&gt;
| 231240045 || 杨俊炜 &lt;br /&gt;
|-&lt;br /&gt;
| 231240016 || 徐冰冰 &lt;br /&gt;
|-&lt;br /&gt;
| 231240023 || 叶宸见 &lt;br /&gt;
|-&lt;br /&gt;
| 231240009 || 陈心怡 &lt;br /&gt;
|-&lt;br /&gt;
| 211250032 || 吕卉森 &lt;br /&gt;
|-&lt;br /&gt;
| 231220090 || 邓淇 &lt;br /&gt;
|-&lt;br /&gt;
| 231240027 || 彭浩楠 &lt;br /&gt;
|-&lt;br /&gt;
| 231220002 || 潘谟天 &lt;br /&gt;
|-&lt;br /&gt;
| 231240019 || 邵与乔 &lt;br /&gt;
|-&lt;br /&gt;
| 231098142 || 陈逸帆 &lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=13077</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=13077"/>
		<updated>2025-04-21T06:13:41Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (单), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Friday, 2pm-3pm &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** 周一：10am-12pm，仙Ⅰ-204&lt;br /&gt;
** 周三（单）：10am-12pm，仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* 周五 2pm-3pm, 计算机系 804（尹一通）&lt;br /&gt;
:* 周五 2pm-3pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ群&#039;&#039;&#039;: 974742320（申请加入需提供姓名、院系、学号）&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
课程内容分为三大部分：&lt;br /&gt;
* &#039;&#039;&#039;经典概率论&#039;&#039;&#039;：概率空间、随机变量及其数字特征、多维与连续随机变量、极限定理等内容&lt;br /&gt;
* &#039;&#039;&#039;概率与计算&#039;&#039;&#039;：测度集中现象 (concentration of measure)、概率法 (the probabilistic method)、离散随机过程的相关专题&lt;br /&gt;
* &#039;&#039;&#039;数理统计&#039;&#039;&#039;：参数估计、假设检验、贝叶斯估计、线性回归等统计推断概念&lt;br /&gt;
&lt;br /&gt;
对于第一和第二部分，要求清楚掌握基本概念，深刻理解关键的现象与规律以及背后的原理，并可以灵活运用所学方法求解相关问题。对于第三部分，要求熟悉数理统计的若干基本概念，以及典型的统计模型和统计推断问题。&lt;br /&gt;
&lt;br /&gt;
经过本课程的训练，力求使学生能够熟练掌握概率的语言，并会利用概率思维来理解客观世界并对其建模，以及驾驭概率的数学工具来分析和求解专业问题。&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论（第2版·修订版），[美]伯特瑟卡斯（Dimitri P. Bertsekas）、[美]齐齐克利斯（John N. Tsitsiklis）著，郑忠国 童行伟 译，人民邮电出版社（2022）。&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* 课程成绩：本课程将会有若干次作业和一次期末考试。最终成绩将由平时作业成绩和期末考试成绩综合得出。&lt;br /&gt;
* 迟交：如果有特殊的理由，无法按时完成作业，请提前联系授课老师，给出正当理由。否则迟交的作业将不被接受。&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
学术诚信是所有从事学术活动的学生和学者最基本的职业道德底线，本课程将不遗余力的维护学术诚信规范，违反这一底线的行为将不会被容忍。&lt;br /&gt;
&lt;br /&gt;
作业完成的原则：署你名字的工作必须是你个人的贡献。在完成作业的过程中，允许讨论，前提是讨论的所有参与者均处于同等完成度。但关键想法的执行、以及作业文本的写作必须独立完成，并在作业中致谢（acknowledge）所有参与讨论的人。符合规则的讨论与致谢将不会影响得分。不允许其他任何形式的合作——尤其是与已经完成作业的同学“讨论”。&lt;br /&gt;
&lt;br /&gt;
本课程将对剽窃行为采取零容忍的态度。在完成作业过程中，对他人工作（出版物、互联网资料、其他人的作业等）直接的文本抄袭和对关键思想、关键元素的抄袭，按照 [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]的解释，都将视为剽窃。剽窃者成绩将被取消。如果发现互相抄袭行为，&amp;lt;font color=red&amp;gt; 抄袭和被抄袭双方的成绩都将被取消&amp;lt;/font&amp;gt;。因此请主动防止自己的作业被他人抄袭。&lt;br /&gt;
&lt;br /&gt;
学术诚信影响学生个人的品行，也关乎整个教育系统的正常运转。为了一点分数而做出学术不端的行为，不仅使自己沦为一个欺骗者，也使他人的诚实努力失去意义。让我们一起努力维护一个诚信的环境。&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 1|Problem Set 1]]  请在 2025/3/10 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第一次作业提交名单|第一次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 2|Problem Set 2]]  请在 2025/3/31 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A2.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第二次作业提交名单|第二次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 3|Problem Set 3]]  请在 2025/4/21 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A3.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第三次作业提交名单|第三次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Intro.pdf 课程简介]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/ProbSpace.pdf 概率空间]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第1章&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Entropy and volume of Hamming balls|Entropy and volume of Hamming balls]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/RandVar.pdf 随机变量]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第2章&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 2, Sections 3.1~3.5, 3.7&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 2&#039;&#039;&#039;&lt;br /&gt;
#* the [http://tcs.nju.edu.cn/slides/prob2024/discrete-pmf.nb &#039;&#039;&#039;Mathematica notebook file&#039;&#039;&#039;] for the PMFs of basic discrete distributions&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Average-case analysis of QuickSort|Average-case analysis of &#039;&#039;&#039;&#039;&#039;QuickSort&#039;&#039;&#039;&#039;&#039;]]&lt;br /&gt;
#* [https://www.bilibili.com/video/BV1ta411A7fp/ 高尔顿板（Galton board）视频] 和 [https://en.wikipedia.org/wiki/Galton_board 维基百科页面]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Deviation.pdf 矩与偏差]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 3&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 章节 2.4, 4.2, 4.3, 5.1&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Sections 3.3, 3.6, 7.3&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Two-point sampling|Two-point sampling]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Threshold of k-clique in random graph|Threshold of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-clique in random graph]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Weierstrass Approximation Theorem|Weierstrass approximation]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Continuous.pdf 连续分布]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第3章, 和4.1节&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 4&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapters 8, 9&#039;&#039;&#039;&lt;br /&gt;
#* [https://measure.axler.net/MIRA.pdf Measure, Integration &amp;amp; Real Analysis] by Sheldon Axler&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/An exercise on induced distribution|An exercise on induced distribution]]&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13043</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13043"/>
		<updated>2025-04-05T04:12:00Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up Problems) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,...,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y= X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq \mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is a moment-generating function of a random variable, and write the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;a \neq 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbb{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (I)]&amp;lt;/strong&amp;gt; Fix &amp;lt;math&amp;gt;0 &amp;lt; b \le a&amp;lt;/math&amp;gt;. Construct a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\mathbb{E}[X^2] = b^2&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;\mathbf{Pr}(|X| \ge a) = b^2/a^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (II)]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with &amp;lt;math&amp;gt;0 &amp;lt; \mathbb{E}[X^2] &amp;lt; \infty&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\lim_{a \to \infty} \frac{a^2 \mathbf{Pr}(|X| \ge a)}{ \mathbb{E}[X^2] } = 0&amp;lt;/math&amp;gt;. (Hint: Use the dominated convergence theorem. For discrete random variables, it can be formulated as follows: Let &amp;lt;math&amp;gt;Z,X, X_1,X_2,\ldots,X_n,\ldots&amp;lt;/math&amp;gt; be discrete random variables with finite second moments. If &amp;lt;math&amp;gt;|X_n| \le Z&amp;lt;/math&amp;gt; and for any &amp;lt;math&amp;gt;a \in \mathbb{R}&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X_n = a) \to \mathbf{Pr}(X = a)&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity, then &amp;lt;math&amp;gt;\mathbb{E}[X_n^2] \to \mathbb{E}[X^2]&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;4-clique threshold&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Prove that &amp;lt;math&amp;gt;p = n^{-2/3}&amp;lt;/math&amp;gt; is the threshold probability for the existence of a 4-clique. &lt;br /&gt;
        Formally, you are required to show that &lt;br /&gt;
        &amp;lt;ol type=&amp;quot;a&amp;quot;&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = \omega(n^{-2/3})&amp;lt;/math&amp;gt;; (Hint: use Chebyshev&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = o(n^{-2/3})&amp;lt;/math&amp;gt;. (Hint: use Markov&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;/ol&amp;gt;&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13042</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13042"/>
		<updated>2025-04-05T04:10:10Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up Problems) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete reasoning process; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Throughout, &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; denotes the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be independent random variables, suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;, and let &amp;lt;math&amp;gt;Y = X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when the individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a positive, integer-valued random variable, and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is the moment-generating function of some random variable, and write down the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbb{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (I)]&amp;lt;/strong&amp;gt; Fix &amp;lt;math&amp;gt;0 &amp;lt; b \le a&amp;lt;/math&amp;gt;. Construct a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\mathbb{E}[X^2] = b^2&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;\mathbf{Pr}(|X| \ge a) = b^2/a^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (II)]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with &amp;lt;math&amp;gt;0 &amp;lt; \mathbb{E}[X^2] &amp;lt; \infty&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\lim_{a \to \infty} \frac{a^2 \mathbf{Pr}(|X| \ge a)}{ \mathbb{E}[X^2] } = 0&amp;lt;/math&amp;gt;. (Hint: Use the dominated convergence theorem. For discrete random variables, it can be formulated as follows: Let &amp;lt;math&amp;gt;Z,X, X_1,X_2,\ldots,X_n,\ldots&amp;lt;/math&amp;gt; be discrete random variables with finite second moments. If &amp;lt;math&amp;gt;|X_n| \le Z&amp;lt;/math&amp;gt; and for any &amp;lt;math&amp;gt;a \in \mathbb{R}&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X_n = a) \to \mathbf{Pr}(X = a)&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity, then &amp;lt;math&amp;gt;\mathbb{E}[X_n^2] \to \mathbb{E}[X^2]&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;4-clique threshold&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Prove that &amp;lt;math&amp;gt;p = n^{-2/3}&amp;lt;/math&amp;gt; is the threshold probability for the existence of a 4-clique. &lt;br /&gt;
        Formally, you are required to show that &lt;br /&gt;
        &amp;lt;ol type=&amp;quot;a&amp;quot;&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = \omega(n^{-2/3})&amp;lt;/math&amp;gt;; (Hint: use Chebyshev&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = o(n^{-2/3})&amp;lt;/math&amp;gt;. (Hint: use Markov&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;/ol&amp;gt;&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13041</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13041"/>
		<updated>2025-04-05T04:08:56Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up Problems) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete reasoning process; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Throughout, &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; denotes the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be independent random variables, suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;, and let &amp;lt;math&amp;gt;Y = X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when the individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a positive, integer-valued random variable, and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is the moment-generating function of some random variable, and write down the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbb{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (I)]&amp;lt;/strong&amp;gt; Fix &amp;lt;math&amp;gt;0 &amp;lt; b \le a&amp;lt;/math&amp;gt;. Construct a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\mathbb{E}[X^2] = b^2&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;\mathbf{Pr}(|X| \ge a) = b^2/a^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (II)]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with &amp;lt;math&amp;gt;0 &amp;lt; \mathbb{E}[X^2] &amp;lt; \infty&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\lim_{a \to \infty} \frac{a^2 \mathbf{Pr}(|X| \ge a)}{ \mathbb{E}[X^2] } = 0&amp;lt;/math&amp;gt;. (Hint: Use the dominated convergence theorem. For discrete random variables, it can be formulated as follows: Let &amp;lt;math&amp;gt;Z,X, X_1,X_2,\ldots,X_n,\ldots&amp;lt;/math&amp;gt; be discrete random variables with finite second moments. If &amp;lt;math&amp;gt;|X_n| \le Z&amp;lt;/math&amp;gt; and for any &amp;lt;math&amp;gt;a \in \mathbb{R}&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X_n = a) \to \mathbf{Pr}(X = a)&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity, then &amp;lt;math&amp;gt;\mathbb{E}[X_n^2] \to \mathbb{E}[X^2]&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;4-clique threshold&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Prove that &amp;lt;math&amp;gt;p = n^{-2/3}&amp;lt;/math&amp;gt; is the threshold probability for the existence of a 4-clique. &lt;br /&gt;
        Formally, you are required to show that &lt;br /&gt;
        &amp;lt;ol type=&amp;quot;a&amp;quot;&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = \omega(n^{-2/3})&amp;lt;/math&amp;gt;; (Hint: use Chebyshev&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = o(n^{-2/3})&amp;lt;/math&amp;gt;. (Hint: use Markov&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;/ol&amp;gt;&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13040</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13040"/>
		<updated>2025-04-05T04:07:50Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up Problems) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete reasoning process; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Throughout, &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; denotes the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;Y = X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt; has mean and variance given by &amp;lt;math&amp;gt;\mathbb E[Y] = \sum_{k=1}^n p_k&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathrm{Var}(Y) = \sum_{k=1}^n p_k(1-p_k)&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when the individuals are most alike.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, and show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is the moment-generating function of a random variable, and write down the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; almost surely for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;a \neq 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
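The variance formula in [Variance (I)] above can be checked numerically. The following is an editorial sketch, not part of the problem set; the parameters `ps`, the trial count, and the helper `bernoulli` are arbitrary example choices.

```python
import math
import random

# Monte Carlo check of E[Y] = sum p_k and Var(Y) = sum p_k(1-p_k)
# for a sum of independent Bernoulli variables (assumed example parameters).
random.seed(0)
ps = [0.2, 0.5, 0.8]
trials = 200_000

def bernoulli(p):
    # int(u + p) equals 1 exactly when the uniform draw u lands in [1-p, 1),
    # which happens with probability p.
    return int(random.random() + p)

ys = [sum(bernoulli(p) for p in ps) for _ in range(trials)]
mean = sum(ys) / trials
var = sum((y - mean) ** 2 for y in ys) / trials

assert math.isclose(mean, sum(ps), abs_tol=0.02)
assert math.isclose(var, sum(p * (1 - p) for p in ps), abs_tol=0.02)
```

Replacing `ps` by three copies of its average (keeping the mean fixed) makes the empirical variance larger, in line with the claim that equal parameters maximize the variance.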
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbb{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (I)]&amp;lt;/strong&amp;gt; Fix &amp;lt;math&amp;gt;0 &amp;lt; b \le a&amp;lt;/math&amp;gt;. Construct a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\mathbb{E}[X^2] = b^2&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;\mathbf{Pr}(|X| \ge a) = b^2/a^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (II)]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with &amp;lt;math&amp;gt;0 &amp;lt; \mathbb{E}[X^2] &amp;lt; \infty&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\lim_{a \to \infty} \frac{a^2 \mathbf{Pr}(|X| \ge a)}{ \mathbb{E}[X^2] } = 0&amp;lt;/math&amp;gt;. (Hint: Use the dominated convergence theorem. For discrete random variables, it can be formulated as follows: Let &amp;lt;math&amp;gt;Z,X, X_1,X_2,\ldots,X_n,\ldots&amp;lt;/math&amp;gt; be discrete random variables with finite second moments. If &amp;lt;math&amp;gt;|X_n| \le Z&amp;lt;/math&amp;gt; and for any &amp;lt;math&amp;gt;a \in \mathbb{R}&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X_n = a) \to \mathbf{Pr}(X = a)&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity, then &amp;lt;math&amp;gt;\mathbb{E}[X_n^2] \to \mathbb{E}[X^2]&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
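Cantelli's bound from Problem 2 is in fact tight, which the hint's two-point construction makes concrete. The following is an editorial sketch, not part of the problem set; the values of `lam` and `sigma2` are arbitrary example choices.

```python
from fractions import Fraction

# Exact check that Cantelli's bound Pr[X at least lambda] is attained by a
# two-point distribution: X takes the value lam with probability
# sigma2/(lam^2 + sigma2) and the value -sigma2/lam otherwise.
lam = Fraction(2)
sigma2 = Fraction(1)

p_hi = sigma2 / (lam ** 2 + sigma2)   # Pr[X = lam]
x_lo = -sigma2 / lam                  # the other support point

mean = p_hi * lam + (1 - p_hi) * x_lo
var = p_hi * lam ** 2 + (1 - p_hi) * x_lo ** 2 - mean ** 2
assert mean == 0        # X has mean 0, as the problem requires
assert var == sigma2    # and variance sigma^2

# The upper tail Pr[X at least lam] equals p_hi, which matches the bound.
bound = sigma2 / (lam ** 2 + sigma2)
assert p_hi == bound
```

Using `fractions.Fraction` keeps the computation exact, so equality (rather than approximate closeness) can be asserted.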
&lt;br /&gt;
== Problem 3 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;4-clique threshold&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Prove that &amp;lt;math&amp;gt;p = n^{-2/3}&amp;lt;/math&amp;gt; is the threshold probability for the existence of a 4-clique. &lt;br /&gt;
        Formally, you are required to show that &lt;br /&gt;
        &amp;lt;ol type=&amp;quot;a&amp;quot;&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity) the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = \omega(n^{-2/3})&amp;lt;/math&amp;gt;; (Hint: use Chebyshev&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity) the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = o(n^{-2/3})&amp;lt;/math&amp;gt;. (Hint: use Markov&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;/ol&amp;gt;&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13039</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13039"/>
		<updated>2025-04-05T04:07:31Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up Problems) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; in this problem set refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be independent random variables, and suppose that &amp;lt;math&amp;gt;X_k&amp;lt;/math&amp;gt; is Bernoulli with parameter &amp;lt;math&amp;gt;p_k&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;Y= X_1 + X_2 + \dots + X_n&amp;lt;/math&amp;gt; has mean and variance given by &amp;lt;math&amp;gt;\mathbb E[Y] = \sum_{k=1}^n p_k, \mathrm{Var}(Y) = \sum_{k=1}^n p_k (1-p_k)&amp;lt;/math&amp;gt;. Show that, for &amp;lt;math&amp;gt;\mathbb E[Y]&amp;lt;/math&amp;gt; fixed, &amp;lt;math&amp;gt;\mathrm{Var}(Y)&amp;lt;/math&amp;gt; is maximized when &amp;lt;math&amp;gt;p_1 = p_2 = \dots = p_n&amp;lt;/math&amp;gt;. That is to say, the variation in the sum is greatest when individuals are most alike. Is this contrary to intuition?&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, and show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is the moment-generating function of a random variable, and write down the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; almost surely for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;a \neq 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
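The compound-sum identity in [Variance (IV)] above can also be checked by simulation. The following is an editorial sketch, not part of the problem set; the distributions of `N` (uniform on {1, 2, 3}) and of the Bernoulli summands are arbitrary example choices.

```python
import math
import random

# Monte Carlo check of Var[X] = Var[X1] E[N] + E[X1]^2 Var[N]
# for X = X_1 + ... + X_N with N independent of the X_i.
random.seed(1)
p = 0.3
trials = 300_000

xs = []
for _ in range(trials):
    n = random.randrange(1, 4)   # N uniform on {1, 2, 3}
    # int(u + p) is a Bernoulli(p) draw: it equals 1 with probability p.
    xs.append(sum(int(random.random() + p) for _ in range(n)))

mean = sum(xs) / trials
var = sum((x - mean) ** 2 for x in xs) / trials

e_n, var_n = 2.0, 2.0 / 3.0        # E[N] and Var[N] for uniform {1,2,3}
e_x1, var_x1 = p, p * (1 - p)      # moments of a Bernoulli(p) summand
assert math.isclose(var, var_x1 * e_n + e_x1 ** 2 * var_n, abs_tol=0.02)
```

Here the predicted variance is 0.21·2 + 0.09·(2/3) = 0.48, and the empirical variance agrees within the tolerance.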
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbb{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (I)]&amp;lt;/strong&amp;gt; Fix &amp;lt;math&amp;gt;0 &amp;lt; b \le a&amp;lt;/math&amp;gt;. Construct a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\mathbb{E}[X^2] = b^2&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;\mathbf{Pr}(|X| \ge a) = b^2/a^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (II)]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with &amp;lt;math&amp;gt;0 &amp;lt; \mathbb{E}[X^2] &amp;lt; \infty&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\lim_{a \to \infty} \frac{a^2 \mathbf{Pr}(|X| \ge a)}{ \mathbb{E}[X^2] } = 0&amp;lt;/math&amp;gt;. (Hint: Use the dominated convergence theorem. For discrete random variables, it can be formulated as follows: Let &amp;lt;math&amp;gt;Z,X, X_1,X_2,\ldots,X_n,\ldots&amp;lt;/math&amp;gt; be discrete random variables with finite second moments. If &amp;lt;math&amp;gt;|X_n| \le Z&amp;lt;/math&amp;gt; and for any &amp;lt;math&amp;gt;a \in \mathbb{R}&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X_n = a) \to \mathbf{Pr}(X = a)&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity, then &amp;lt;math&amp;gt;\mathbb{E}[X_n^2] \to \mathbb{E}[X^2]&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;4-clique threshold&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Prove that &amp;lt;math&amp;gt;p = n^{-2/3}&amp;lt;/math&amp;gt; is the threshold probability for the existence of a 4-clique. &lt;br /&gt;
        Formally, you are required to show that &lt;br /&gt;
        &amp;lt;ol type=&amp;quot;a&amp;quot;&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity) the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = \omega(n^{-2/3})&amp;lt;/math&amp;gt;; (Hint: use Chebyshev&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity) the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = o(n^{-2/3})&amp;lt;/math&amp;gt;. (Hint: use Markov&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;/ol&amp;gt;&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13038</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13038"/>
		<updated>2025-04-05T03:58:21Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up Problems) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; in this problem set refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_n&amp;lt;/math&amp;gt; be pairwise independent random variables. Show that &amp;lt;math&amp;gt;\textbf{Var}\left[\sum_{i=1}^n X_i\right] =\sum_{i=1}^n \textbf{Var} [X_i]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be an integer-valued, positive random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;. &lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, and show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is the moment-generating function of a random variable, and write down the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|= 1&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;X=aY+b&amp;lt;/math&amp;gt; almost surely for some real numbers &amp;lt;math&amp;gt;a,b&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;a \neq 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
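The point of [Variance (I)] above is that variances add under pairwise independence alone. The standard witness that pairwise independence is strictly weaker than mutual independence: two fair bits and their XOR. The following editorial sketch (not part of the problem set) verifies the identity exactly for that triple.

```python
from itertools import product

# X, Y fair bits and Z = X XOR Y are pairwise independent but not mutually
# independent; the variance of the sum still equals the sum of variances.
outcomes = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]  # each has prob 1/4
sums = [x + y + z for x, y, z in outcomes]

mean = sum(sums) / 4
var = sum((s - mean) ** 2 for s in sums) / 4

# Each coordinate is a fair bit with variance 1/4, so the sum has variance 3/4.
assert var == 0.75
```

Note that the triple is not mutually independent (Z is determined by X and Y), so the identity here is not explained by the classical independent-sum formula.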
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbb{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (I)]&amp;lt;/strong&amp;gt; Fix &amp;lt;math&amp;gt;0 &amp;lt; b \le a&amp;lt;/math&amp;gt;. Construct a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\mathbb{E}[X^2] = b^2&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;\mathbf{Pr}(|X| \ge a) = b^2/a^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (II)]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with &amp;lt;math&amp;gt;0 &amp;lt; \mathbb{E}[X^2] &amp;lt; \infty&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\lim_{a \to \infty} \frac{a^2 \mathbf{Pr}(|X| \ge a)}{ \mathbb{E}[X^2] } = 0&amp;lt;/math&amp;gt;. (Hint: Use the dominated convergence theorem. For discrete random variables, it can be formulated as follows: Let &amp;lt;math&amp;gt;Z,X, X_1,X_2,\ldots,X_n,\ldots&amp;lt;/math&amp;gt; be discrete random variables with finite second moments. If &amp;lt;math&amp;gt;|X_n| \le Z&amp;lt;/math&amp;gt; and for any &amp;lt;math&amp;gt;a \in \mathbb{R}&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X_n = a) \to \mathbf{Pr}(X = a)&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity, then &amp;lt;math&amp;gt;\mathbb{E}[X_n^2] \to \mathbb{E}[X^2]&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;4-clique threshold&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Prove that &amp;lt;math&amp;gt;p = n^{-2/3}&amp;lt;/math&amp;gt; is the threshold probability for the existence of a 4-clique. &lt;br /&gt;
        Formally, you are required to show that &lt;br /&gt;
        &amp;lt;ol type=&amp;quot;a&amp;quot;&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity) the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = \omega(n^{-2/3})&amp;lt;/math&amp;gt;; (Hint: use Chebyshev&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity) the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = o(n^{-2/3})&amp;lt;/math&amp;gt;. (Hint: use Markov&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;/ol&amp;gt;&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13037</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13037"/>
		<updated>2025-04-05T03:56:22Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up Problems) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on a probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X_1,X_2,\cdots, X_n&amp;lt;/math&amp;gt; be pairwise independent random variables. Show that &amp;lt;math&amp;gt;\textbf{Var}\left[\sum_{i=1}^n X_i\right] =\sum_{i=1}^n \textbf{Var} [X_i]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a positive, integer-valued random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;.&lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is the moment-generating function of a random variable, and write down the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|\leq 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
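The dice problem in Variance (II) lends itself to a quick sanity check by simulation. The sketch below is an illustrative Python snippet, not part of the assignment; the group size n = 6 and the trial count are arbitrary choices, and the Monte Carlo estimates can be compared against the closed forms derived in the exercise.

```python
import random

def group_score(n, rng):
    # each player rolls a fair six-sided die; one point per pair showing equal values
    rolls = [rng.randint(1, 6) for _ in range(n)]
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            if rolls[i] == rolls[j]:
                score += 1
    return score

rng = random.Random(1)
n, trials = 6, 20000
scores = [group_score(n, rng) for _ in range(trials)]
mean = sum(scores) / trials
var = sum((s - mean) ** 2 for s in scores) / trials
print(round(mean, 2), round(var, 2))  # compare with the closed forms you derive
```

With n = 6 the two estimates should land near the values predicted by linearity of expectation and the indicator decomposition over pairs of players.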
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbf{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (I)]&amp;lt;/strong&amp;gt; Fix &amp;lt;math&amp;gt;0 &amp;lt; b \le a&amp;lt;/math&amp;gt;. Construct a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\mathbb{E}[X^2] = b^2&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;\mathbf{Pr}(|X| \ge a) = b^2/a^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (II)]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with &amp;lt;math&amp;gt;0 &amp;lt; \mathbb{E}[X^2] &amp;lt; \infty&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\lim_{a \to \infty} \frac{a^2 \mathbf{Pr}(|X| \ge a)}{ \mathbb{E}[X^2] } = 0&amp;lt;/math&amp;gt;. (Hint: Use the dominated convergence theorem. For discrete random variables, it can be formulated as follows: Let &amp;lt;math&amp;gt;Z,X, X_1,X_2,\ldots,X_n,\ldots&amp;lt;/math&amp;gt; be discrete random variables with finite second moments. If &amp;lt;math&amp;gt;|X_n| \le Z&amp;lt;/math&amp;gt; and for any &amp;lt;math&amp;gt;a \in \mathbb{R}&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X_n = a) \to \mathbf{Pr}(X = a)&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity, then &amp;lt;math&amp;gt;\mathbb{E}[X_n^2] \to \mathbb{E}[X^2]&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
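For Chebyshev&#039;s inequality (I), one natural candidate is a three-point distribution supported on 0 and on the two points of absolute value a. The snippet below is an illustrative sketch only, and this is just one of several valid constructions; it verifies exactly, in rational arithmetic, that the construction attains the stated second moment and tail probability for the sample values a = 5, b = 3 (which satisfy the required condition that b is at most a).

```python
from fractions import Fraction

def tight_example(a, b):
    # X takes value 0 with prob 1 - b^2/a^2, and +a, -a each with prob b^2/(2 a^2)
    q = Fraction(b * b, a * a)
    pmf = {0: 1 - q, a: q / 2, -a: q / 2}
    second_moment = sum(p * x * x for x, p in pmf.items())
    tail = sum(p for x, p in pmf.items() if abs(x) >= a)
    return second_moment, tail

m2, tail = tight_example(5, 3)
print(m2)    # 9, i.e. b squared
print(tail)  # 9/25, i.e. b squared over a squared
```

Exact rationals avoid any floating-point doubt about whether the bound is attained with equality.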
&lt;br /&gt;
== Problem 3 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;4-clique threshold&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Prove that &amp;lt;math&amp;gt;p = n^{-2/3}&amp;lt;/math&amp;gt; is the threshold probability for the existence of a 4-clique.&lt;br /&gt;
        Formally, you are required to show that &lt;br /&gt;
        &amp;lt;ol type=&amp;quot;a&amp;quot;&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = \omega(n^{-2/3})&amp;lt;/math&amp;gt;; (Hint: use Chebyshev&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = o(n^{-2/3})&amp;lt;/math&amp;gt;. (Hint: use Markov&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;/ol&amp;gt;&lt;br /&gt;
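The threshold behaviour can be previewed empirically before proving it. The sketch below is illustrative only; a finite simulation proves nothing, and the constants 0.2 and 3.0, the size n = 40, and the trial count are arbitrary choices. It brute-forces the existence of a 4-clique in sampled G(n,p) graphs at p = c n^{-2/3} on both sides of the threshold scale.

```python
import itertools
import random

def has_4clique(n, p, rng):
    # sample G(n, p): each of the C(n,2) possible edges is present independently with prob p
    edge = {frozenset(e): (p > rng.random())
            for e in itertools.combinations(range(n), 2)}
    # brute-force: test every 4-subset of vertices for being a clique
    for quad in itertools.combinations(range(n), 4):
        if all(edge[frozenset(pair)] for pair in itertools.combinations(quad, 2)):
            return True
    return False

rng = random.Random(0)
n, trials = 40, 30
results = {}
for c in (0.2, 3.0):  # well below vs. well above the n**(-2/3) scale
    p = c * n ** (-2.0 / 3.0)
    results[c] = sum(has_4clique(n, p, rng) for _ in range(trials))
print(results)
```

The count of trials containing a 4-clique should jump from near 0 to near 30 between the two values of c, matching the two statements to be proven.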
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13036</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 3</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_3&amp;diff=13036"/>
		<updated>2025-04-05T03:56:03Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: Created page with &amp;quot;*每道题目的解答都要有完整的解题过程，中英文不限。  *我们推荐大家使用LaTeX, markdown等对作业进行排版。  == Assumption throughout Problem Set 3== &amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;  &amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;  &amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural l...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include a complete derivation; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your solutions with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 3==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on a probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up Problems) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X_1,X_2,\cdots, X_n&amp;lt;/math&amp;gt; be pairwise independent random variables. Show that &amp;lt;math&amp;gt;\textbf{Var}\left[\sum_{i=1}^n X_i\right] =\sum_{i=1}^n \textbf{Var} [X_i]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Each member of a group of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players rolls a (fair) 6-sided die. For any pair of players who throw the same number, the group scores &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point. Find the mean and variance of the total score of the group.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        An urn contains &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; balls numbered &amp;lt;math&amp;gt;1, 2, \ldots, n&amp;lt;/math&amp;gt;. We select &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; balls uniformly at random &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; and add up their numbers. Find the mean and variance of the sum.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Variance (IV)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
      Let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a positive, integer-valued random variable and let &amp;lt;math&amp;gt;\{X_i\}_{i=1}^{\infty}&amp;lt;/math&amp;gt; be independent and identically distributed random variables that are also independent of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;.&lt;br /&gt;
Precisely, for any finite subset &amp;lt;math&amp;gt;I \subseteq\mathbb{N}_+&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\{X_i\}_{i \in I}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; are mutually independent. Let &amp;lt;math&amp;gt;X = \sum_{i=1}^N X_i&amp;lt;/math&amp;gt;, show that &amp;lt;math&amp;gt;\textbf{Var}[X] = \textbf{Var}[X_1] \mathbb{E}[N] + \mathbb{E}[X_1]^2 \textbf{Var}[N]&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Moments (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Show that &amp;lt;math&amp;gt;G(t) = \frac{e^t}{4} + \frac{e^{-t}}{2} + \frac{1}{4}&amp;lt;/math&amp;gt; is the moment-generating function of a random variable, and write down the probability mass function of this random variable.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Geo}(p)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Moments (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X\sim \text{Pois}(\lambda)&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;\lambda &amp;gt;0 &amp;lt;/math&amp;gt;. Find &amp;lt;math&amp;gt;\mathbb{E}[X^3]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbb{E}[X^4]&amp;lt;/math&amp;gt;. &lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;|\rho|\leq 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
    Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;, variance &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt;, and correlation &amp;lt;math&amp;gt;\rho&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}(\max\{X^2,Y^2\})\leq 1+\sqrt{1-\rho^2}&amp;lt;/math&amp;gt;. (Hint: use the identity &amp;lt;math&amp;gt;\max\{a,b\} = \frac{1}{2}(a+b+|a-b|)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Covariance and correlation (III)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
   Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent Bernoulli random variables with parameter &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;X+Y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|X-Y|&amp;lt;/math&amp;gt; are dependent though uncorrelated.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Inequalities) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Reverse Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with bounded range &amp;lt;math&amp;gt;0 \le X \le U&amp;lt;/math&amp;gt; for some &amp;lt;math&amp;gt;U &amp;gt; 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbf{Pr}(X \le a) \le \frac{U-\mathbf{E}[X]}{U-a}&amp;lt;/math&amp;gt; for any &amp;lt;math&amp;gt;0 &amp;lt; a &amp;lt; U&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Markov&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable. Show that for all &amp;lt;math&amp;gt;\beta \geq 0&amp;lt;/math&amp;gt; and all &amp;lt;math&amp;gt;x &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X\geq x)\leq \mathbb{E}(e^{\beta X})e^{-\beta x}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
    &amp;lt;strong&amp;gt;[Cantelli&#039;s inequality]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a discrete random variable with mean &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; and variance &amp;lt;math&amp;gt;\sigma^2&amp;lt;/math&amp;gt;. Prove that for any &amp;lt;math&amp;gt;\lambda &amp;gt; 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2}{\lambda^2+\sigma^2}&amp;lt;/math&amp;gt;. (Hint: You may first show that &amp;lt;math&amp;gt;\mathbf{Pr}[X \ge \lambda] \le \frac{\sigma^2 + u^2}{(\lambda + u)^2}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;u &amp;gt; 0&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (I)]&amp;lt;/strong&amp;gt; Fix &amp;lt;math&amp;gt;0 &amp;lt; b \le a&amp;lt;/math&amp;gt;. Construct a random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\mathbb{E}[X^2] = b^2&amp;lt;/math&amp;gt; for which &amp;lt;math&amp;gt;\mathbf{Pr}(|X| \ge a) = b^2/a^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;[Chebyshev&#039;s inequality (II)]&amp;lt;/strong&amp;gt; Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with &amp;lt;math&amp;gt;0 &amp;lt; \mathbb{E}[X^2] &amp;lt; \infty&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\lim_{a \to \infty} \frac{a^2 \mathbf{Pr}(|X| \ge a)}{ \mathbb{E}[X^2] } = 0&amp;lt;/math&amp;gt;. (Hint: Use the dominated convergence theorem. For discrete random variables, it can be formulated as follows: Let &amp;lt;math&amp;gt;Z,X, X_1,X_2,\ldots,X_n,\ldots&amp;lt;/math&amp;gt; be discrete random variables with finite second moments. If &amp;lt;math&amp;gt;|X_n| \le Z&amp;lt;/math&amp;gt; and for any &amp;lt;math&amp;gt;a \in \mathbb{R}&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}(X_n = a) \to \mathbf{Pr}(X = a)&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity, then &amp;lt;math&amp;gt;\mathbb{E}[X_n^2] \to \mathbb{E}[X^2]&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;    &lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;4-clique threshold&amp;lt;/strong&amp;gt;]&lt;br /&gt;
        Prove that &amp;lt;math&amp;gt;p = n^{-2/3}&amp;lt;/math&amp;gt; is the threshold probability for the existence of a 4-clique.&lt;br /&gt;
        Formally, you are required to show that &lt;br /&gt;
        &amp;lt;ol type=&amp;quot;a&amp;quot;&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = \omega(n^{-2/3})&amp;lt;/math&amp;gt;; (Hint: use Chebyshev&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
            &amp;lt;li&amp;gt;&lt;br /&gt;
                with probability approaching &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; (as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity), the Erdős–Rényi random graph &amp;lt;math&amp;gt;\mathbf{G} = \mathbf{G}(n,p)&amp;lt;/math&amp;gt; contains a 4-clique when &amp;lt;math&amp;gt;p = o(n^{-2/3})&amp;lt;/math&amp;gt;. (Hint: use Markov&#039;s inequality.)&lt;br /&gt;
            &amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;/ol&amp;gt;&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=13035</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=13035"/>
		<updated>2025-04-05T03:49:54Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (odd weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Friday, 2pm-3pm &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** Monday: 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
** Wednesday (odd weeks): 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* Friday, 2pm-3pm, 计算机系 804 (尹一通)&lt;br /&gt;
:* Friday, 2pm-3pm, 计算机系 516 (刘景铖)&lt;br /&gt;
:* &#039;&#039;&#039;QQ group&#039;&#039;&#039;: 974742320 (provide your name, department, and student ID when applying to join)&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
The course consists of three major parts:&lt;br /&gt;
* &#039;&#039;&#039;Classical probability theory&#039;&#039;&#039;: probability spaces, random variables and their numerical characteristics, multivariate and continuous random variables, limit theorems, etc.&lt;br /&gt;
* &#039;&#039;&#039;Probability and computing&#039;&#039;&#039;: concentration of measure, the probabilistic method, and selected topics on discrete stochastic processes&lt;br /&gt;
* &#039;&#039;&#039;Mathematical statistics&#039;&#039;&#039;: parameter estimation, hypothesis testing, Bayesian estimation, linear regression, and other concepts of statistical inference&lt;br /&gt;
&lt;br /&gt;
For the first two parts, students are expected to have a firm grasp of the basic concepts, a deep understanding of the key phenomena and the principles behind them, and the ability to apply the methods flexibly to solve related problems. For the third part, students are expected to be familiar with the basic notions of mathematical statistics and with typical statistical models and inference problems.&lt;br /&gt;
&lt;br /&gt;
Through the training in this course, students should become fluent in the language of probability, learn to use probabilistic thinking to understand and model the real world, and command the mathematical tools of probability to analyze and solve problems in their own field.&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论（第2版·修订版），[美]伯特瑟卡斯（Dimitri P.Bertsekas）[美]齐齐克利斯（John N.Tsitsiklis）著，郑忠国 童行伟 译，人民邮电出版社（2022）。&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* &#039;&#039;&#039;Grading&#039;&#039;&#039;: there will be several homework assignments and one final exam. The final grade will be determined by combining the homework scores with the final exam score.&lt;br /&gt;
* &#039;&#039;&#039;Late submission&#039;&#039;&#039;: if, for special reasons, you cannot finish an assignment on time, contact the instructor in advance with a valid justification; otherwise, late homework will not be accepted.&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
Academic integrity is the most basic professional standard for every student and scholar engaged in academic work. This course will spare no effort to uphold it, and violations of this bottom line will not be tolerated.&lt;br /&gt;
&lt;br /&gt;
The principle for homework: any work bearing your name must be your own contribution. Discussion is allowed while working on assignments, provided that all participants in the discussion are at a comparable stage of completion. The execution of key ideas and the writing of the submitted text, however, must be done independently, and you must acknowledge everyone who took part in the discussion. Discussion and acknowledgement that follow these rules will not affect your score. No other form of collaboration is allowed, in particular no &amp;quot;discussion&amp;quot; with a student who has already finished the assignment.&lt;br /&gt;
&lt;br /&gt;
This course takes a zero-tolerance stance on plagiarism. Direct textual copying from others&#039; work (publications, Internet sources, other students&#039; homework, etc.), as well as copying of key ideas or key elements, counts as plagiarism in the sense of the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]. Plagiarists will have their grades revoked. If mutual copying is discovered, &amp;lt;font color=red&amp;gt; the grades of both the copier and the copied party will be revoked&amp;lt;/font&amp;gt;. Please therefore actively protect your own homework from being copied.&lt;br /&gt;
&lt;br /&gt;
Academic integrity shapes each student&#039;s character and underpins the proper functioning of the entire educational system. Committing academic misconduct for a few points not only turns you into a cheat, it also devalues the honest effort of others. Let us work together to maintain an environment of integrity.&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 1|Problem Set 1]]  Please submit to [mailto:pr_nju@163.com pr_nju@163.com] before class (10am UTC+8) on 2025/3/10 (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第一次作业提交名单|Submission list for Problem Set 1]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 2|Problem Set 2]]  Please submit to [mailto:pr_nju@163.com pr_nju@163.com] before class (10am UTC+8) on 2025/3/31 (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A2.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第二次作业提交名单|Submission list for Problem Set 2]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 3|Problem Set 3]]  Please submit to [mailto:pr_nju@163.com pr_nju@163.com] before class (10am UTC+8) on 2025/4/21 (file name: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A3.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Intro.pdf Course introduction]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/ProbSpace.pdf Probability space]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Entropy and volume of Hamming balls|Entropy and volume of Hamming balls]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/RandVar.pdf Random variables]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 2&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 2, Sections 3.1~3.5, 3.7&#039;&#039;&#039;&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[MU] Chapter 2&#039;&#039;&#039;&lt;br /&gt;
#* the [http://tcs.nju.edu.cn/slides/prob2024/discrete-pmf.nb &#039;&#039;&#039;Mathematica notebook file&#039;&#039;&#039;] for the PMFs of basic discrete distributions&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Average-case analysis of QuickSort|Average-case analysis of &#039;&#039;&#039;&#039;&#039;QuickSort&#039;&#039;&#039;&#039;&#039;]]&lt;br /&gt;
#* [https://www.bilibili.com/video/BV1ta411A7fp/ Galton board video] and its [https://en.wikipedia.org/wiki/Galton_board Wikipedia page]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Deviation.pdf Moments and deviations]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[MU] Chapter 3&#039;&#039;&#039;&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Sections 2.4, 4.2, 4.3, 5.1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Sections 3.3, 3.6, 7.3&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Two-point sampling|Two-point sampling]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Threshold of k-clique in random graph|Threshold of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-clique in random graph]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Weierstrass Approximation Theorem|Weierstrass approximation]]&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/%E7%AC%AC%E4%BA%8C%E6%AC%A1%E4%BD%9C%E4%B8%9A%E6%8F%90%E4%BA%A4%E5%90%8D%E5%8D%95&amp;diff=13018</id>
		<title>概率论与数理统计 (Spring 2025)/第二次作业提交名单</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/%E7%AC%AC%E4%BA%8C%E6%AC%A1%E4%BD%9C%E4%B8%9A%E6%8F%90%E4%BA%A4%E5%90%8D%E5%8D%95&amp;diff=13018"/>
		<updated>2025-03-31T07:23:44Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: Created page with &amp;quot; 如有错漏请邮件联系助教. &amp;lt;center&amp;gt; {| class=&amp;quot;wikitable&amp;quot; |- ! 学号 !! 姓名 |- | 231840166 || 赵少凡  |- | 231200035 || 葛翰飞  |- | 231240011 || 陈力琰  |- | 231840058 || 王润希  |- | 231880123 || 陆博文  |- | 231240005 || 王甚易  |- | 231880394 || 翟笑晨  |- | 231240047 || 姜淮仁  |- | 231880152 || 李佳骏  |- | 231240015 || 邱文宣  |- | 231240002 || 余孟凡  |- | 231098046 || 李方泽  |- | 231240060 || 侯佳松  |- | 23124...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; If you find any errors or omissions, please contact the TAs by email.&lt;br /&gt;
&amp;lt;center&amp;gt;&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Student ID !! Name&lt;br /&gt;
|-&lt;br /&gt;
| 231840166 || 赵少凡 &lt;br /&gt;
|-&lt;br /&gt;
| 231200035 || 葛翰飞 &lt;br /&gt;
|-&lt;br /&gt;
| 231240011 || 陈力琰 &lt;br /&gt;
|-&lt;br /&gt;
| 231840058 || 王润希 &lt;br /&gt;
|-&lt;br /&gt;
| 231880123 || 陆博文 &lt;br /&gt;
|-&lt;br /&gt;
| 231240005 || 王甚易 &lt;br /&gt;
|-&lt;br /&gt;
| 231880394 || 翟笑晨 &lt;br /&gt;
|-&lt;br /&gt;
| 231240047 || 姜淮仁 &lt;br /&gt;
|-&lt;br /&gt;
| 231880152 || 李佳骏 &lt;br /&gt;
|-&lt;br /&gt;
| 231240015 || 邱文宣 &lt;br /&gt;
|-&lt;br /&gt;
| 231240002 || 余孟凡 &lt;br /&gt;
|-&lt;br /&gt;
| 231098046 || 李方泽 &lt;br /&gt;
|-&lt;br /&gt;
| 231240060 || 侯佳松 &lt;br /&gt;
|-&lt;br /&gt;
| 231240038 || 肖金宇 &lt;br /&gt;
|-&lt;br /&gt;
| 231240018 || 渠翔凯 &lt;br /&gt;
|-&lt;br /&gt;
| 231220005 || 樊书岩 &lt;br /&gt;
|-&lt;br /&gt;
| 231230102 || 庄铸锴 &lt;br /&gt;
|-&lt;br /&gt;
| 231240055 || 魏全烨 &lt;br /&gt;
|-&lt;br /&gt;
| 231240029 || 朱非凡 &lt;br /&gt;
|-&lt;br /&gt;
| 231220001 || 陈梓睿 &lt;br /&gt;
|-&lt;br /&gt;
| 231240021 || 陈奕帆 &lt;br /&gt;
|-&lt;br /&gt;
| 231240056 || 靳濡搏 &lt;br /&gt;
|-&lt;br /&gt;
| 231240026 || 蒙航 &lt;br /&gt;
|-&lt;br /&gt;
| 231880140 || 桂天麟 &lt;br /&gt;
|-&lt;br /&gt;
| 231240058 || 陈小川 &lt;br /&gt;
|-&lt;br /&gt;
| 231880320 || 尹铁越 &lt;br /&gt;
|-&lt;br /&gt;
| 231240007 || 高亮节 &lt;br /&gt;
|-&lt;br /&gt;
| 231840242 || 李昊凌 &lt;br /&gt;
|-&lt;br /&gt;
| 231240013 || 南晨曦 &lt;br /&gt;
|-&lt;br /&gt;
| 231240051 || 王昱霏 &lt;br /&gt;
|-&lt;br /&gt;
| 231240046 || 陈昊奕 &lt;br /&gt;
|-&lt;br /&gt;
| 231240024 || 陈越 &lt;br /&gt;
|-&lt;br /&gt;
| 231240004 || 何梓杨 &lt;br /&gt;
|-&lt;br /&gt;
| 231240019 || 邵与乔 &lt;br /&gt;
|-&lt;br /&gt;
| 231240009 || 陈心怡 &lt;br /&gt;
|-&lt;br /&gt;
| 231220002 || 潘谟天 &lt;br /&gt;
|-&lt;br /&gt;
| 231240016 || 徐冰冰 &lt;br /&gt;
|-&lt;br /&gt;
| 231220090 || 邓淇 &lt;br /&gt;
|-&lt;br /&gt;
| 231240053 || 王艺文 &lt;br /&gt;
|-&lt;br /&gt;
| 231240027 || 彭浩楠 &lt;br /&gt;
|-&lt;br /&gt;
| 231240023 || 叶宸见 &lt;br /&gt;
|-&lt;br /&gt;
| 231098142 || 陈逸帆 &lt;br /&gt;
|-&lt;br /&gt;
| 231240025 || 李熠辰 &lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=13017</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=13017"/>
		<updated>2025-03-31T07:22:54Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = Office&lt;br /&gt;
|data4   = Room 804, Computer Science Building&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = Office&lt;br /&gt;
|data7   = Room 516, Computer Science Building&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (odd weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Friday, 2pm-3pm &amp;lt;br&amp;gt;Room 804, Computer Science Building (尹一通)&amp;lt;br&amp;gt;Room 516, Computer Science Building (刘景铖)&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039; (Introduction to Probability, 2nd revised edition)&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; translated by 郑忠国 and 童行伟; Posts &amp;amp; Telecom Press (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]: [mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;], Room 804, Computer Science Building &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]: [mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;], Room 516, Computer Science Building &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]: [mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;], Room 410, Computer Science Building&lt;br /&gt;
** 邹宗瑞: [mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;], Room 410, Computer Science Building&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** Monday: 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
** Wednesday (odd weeks): 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* Friday, 2pm-3pm, Room 804, Computer Science Building (尹一通)&lt;br /&gt;
:* Friday, 2pm-3pm, Room 516, Computer Science Building (刘景铖)&lt;br /&gt;
:* &#039;&#039;&#039;QQ group&#039;&#039;&#039;: 974742320 (to apply to join, provide your name, department, and student ID)&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
The course content consists of three parts:&lt;br /&gt;
* &#039;&#039;&#039;Classical probability theory&#039;&#039;&#039;: probability spaces, random variables and their numerical characteristics, multivariate and continuous random variables, limit theorems, etc.&lt;br /&gt;
* &#039;&#039;&#039;Probability and computing&#039;&#039;&#039;: concentration of measure, the probabilistic method, and selected topics on discrete stochastic processes&lt;br /&gt;
* &#039;&#039;&#039;Mathematical statistics&#039;&#039;&#039;: statistical inference concepts such as parameter estimation, hypothesis testing, Bayesian estimation, and linear regression&lt;br /&gt;
&lt;br /&gt;
For the first and second parts, students are expected to have a clear grasp of the basic concepts, a deep understanding of the key phenomena and laws and the principles behind them, and the ability to flexibly apply the methods learned to solve related problems. For the third part, students are expected to be familiar with several basic concepts of mathematical statistics, as well as with typical statistical models and statistical inference problems.&lt;br /&gt;
&lt;br /&gt;
Through the training of this course, students should become fluent in the language of probability, learn to use probabilistic thinking to understand and model the world, and master the mathematical tools of probability to analyze and solve problems in their field.&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论 (Introduction to Probability, 2nd revised edition), by Dimitri P. Bertsekas and John N. Tsitsiklis; translated by 郑忠国 and 童行伟; Posts &amp;amp; Telecom Press (2022).&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* Course grade: the course will have several homework assignments and one final exam. The final grade is determined by combining the homework scores and the final exam score.&lt;br /&gt;
* Late submissions: if for special reasons you cannot finish an assignment on time, contact the instructor in advance with a valid justification; otherwise, late assignments will not be accepted.&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
Academic integrity is the most basic professional and ethical baseline for every student and scholar engaged in academic work. This course will spare no effort to uphold the norms of academic integrity, and violations of this baseline will not be tolerated.&lt;br /&gt;
&lt;br /&gt;
Rules for completing assignments: the work bearing your name must be your own contribution. Discussion is allowed while working on assignments, provided that all participants are at a comparable stage of completion. However, the execution of key ideas and the writing of the submitted text must be done independently, and everyone who took part in the discussion must be acknowledged in the assignment. Discussions and acknowledgments that follow these rules will not affect your score. No other form of collaboration is allowed; in particular, do not &amp;quot;discuss&amp;quot; with classmates who have already finished the assignment.&lt;br /&gt;
&lt;br /&gt;
This course takes a zero-tolerance attitude toward plagiarism. In completing assignments, direct textual copying from the work of others (publications, internet materials, other students&#039; assignments, etc.), as well as copying of key ideas or key elements, is regarded as plagiarism, following the interpretation of the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]. Plagiarists will have their grades revoked. If mutual copying is discovered, &amp;lt;font color=red&amp;gt; the grades of both the copying and the copied parties will be revoked&amp;lt;/font&amp;gt;. Please therefore actively protect your own work from being copied by others.&lt;br /&gt;
&lt;br /&gt;
Academic integrity bears on each student&#039;s personal character and on the proper functioning of the whole educational system. Committing academic misconduct for a few points not only turns you into a cheater, it also devalues the honest efforts of others. Let us work together to maintain an environment of integrity.&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 1|Problem Set 1]]  Please submit to [mailto:pr_nju@163.com pr_nju@163.com] before class (10am UTC+8) on 2025/3/10 (filename: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第一次作业提交名单|Submission list for Problem Set 1]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 2|Problem Set 2]]  Please submit to [mailto:pr_nju@163.com pr_nju@163.com] before class (10am UTC+8) on 2025/3/31 (filename: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A2.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第二次作业提交名单|Submission list for Problem Set 2]]&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Intro.pdf Course introduction]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/ProbSpace.pdf Probability space]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Entropy and volume of Hamming balls|Entropy and volume of Hamming balls]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/RandVar.pdf Random variables]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 2&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 2, Sections 3.1~3.5, 3.7&#039;&#039;&#039;&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[MU] Chapter 2&#039;&#039;&#039;&lt;br /&gt;
#* the [http://tcs.nju.edu.cn/slides/prob2024/discrete-pmf.nb &#039;&#039;&#039;Mathematica notebook file&#039;&#039;&#039;] for the PMFs of basic discrete distributions&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Average-case analysis of QuickSort|Average-case analysis of &#039;&#039;&#039;&#039;&#039;QuickSort&#039;&#039;&#039;&#039;&#039;]]&lt;br /&gt;
#* [https://www.bilibili.com/video/BV1ta411A7fp/ Galton board video] and the [https://en.wikipedia.org/wiki/Galton_board Wikipedia page]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Deviation.pdf Moments and deviations]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[MU] Chapter 3&#039;&#039;&#039;&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Sections 2.4, 4.2, 4.3, 5.1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Sections 3.3, 3.6, 7.3&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12972</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12972"/>
		<updated>2025-03-16T13:49:45Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (odd weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Friday, 2pm-3pm &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** Monday: 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
** Wednesday (odd weeks): 10am-12pm, 仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* Friday, 2pm-3pm, 计算机系 804（尹一通）&lt;br /&gt;
:* Friday, 2pm-3pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ group&#039;&#039;&#039;: 974742320 (provide your name, department, and student ID when applying to join)&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
The course content consists of three parts:&lt;br /&gt;
* &#039;&#039;&#039;Classical probability theory&#039;&#039;&#039;: probability spaces, random variables and their numerical characteristics, multivariate and continuous random variables, limit theorems, etc.&lt;br /&gt;
* &#039;&#039;&#039;Probability and computing&#039;&#039;&#039;: concentration of measure, the probabilistic method, and selected topics on discrete stochastic processes&lt;br /&gt;
* &#039;&#039;&#039;Mathematical statistics&#039;&#039;&#039;: statistical inference topics such as parameter estimation, hypothesis testing, Bayesian estimation, and linear regression&lt;br /&gt;
&lt;br /&gt;
For the first two parts, students are required to master the basic concepts, to deeply understand the key phenomena and laws together with the principles behind them, and to apply the methods flexibly to solve related problems. For the third part, students are expected to be familiar with the fundamental concepts of mathematical statistics, as well as with typical statistical models and statistical inference problems.&lt;br /&gt;
&lt;br /&gt;
Through the training of this course, students should become fluent in the language of probability, learn to use probabilistic thinking to understand and model the world, and be able to wield the mathematical tools of probability to analyze and solve problems in their field.&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论（第2版·修订版），[美]伯特瑟卡斯（Dimitri P.Bertsekas）[美]齐齐克利斯（John N.Tsitsiklis）著，郑忠国 童行伟 译，人民邮电出版社（2022）。&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* Course grade: there will be several homework assignments and a final exam. The final grade will be determined by combining the homework scores with the final exam score.&lt;br /&gt;
* Late submissions: if a special circumstance prevents you from finishing an assignment on time, contact the instructor in advance with a proper justification; otherwise, late homework will not be accepted.&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
Academic integrity is the most basic professional and ethical standard for all students and scholars engaged in academic work. This course will spare no effort in upholding the norms of academic integrity, and violations will not be tolerated.&lt;br /&gt;
&lt;br /&gt;
Principle for completing assignments: work bearing your name must be your own contribution. Discussion is allowed while working on an assignment, provided that all participants in the discussion are at a comparable stage of completion. The execution of key ideas and the writing of the submitted text, however, must be done independently, and all discussion participants must be acknowledged in the submission. Discussion and acknowledgment that follow these rules will not affect your score. No other form of collaboration is allowed, in particular, &amp;quot;discussing&amp;quot; with students who have already finished the assignment.&lt;br /&gt;
&lt;br /&gt;
This course takes a zero-tolerance stance toward plagiarism. When completing assignments, direct textual copying from the work of others (publications, Internet sources, other students&#039; homework, etc.), as well as copying of key ideas or key elements, as interpreted by the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism], is considered plagiarism. Plagiarists will receive no credit. If mutual copying is discovered, &amp;lt;font color=red&amp;gt;both the copying and the copied parties will receive no credit&amp;lt;/font&amp;gt;. Please take care to prevent your homework from being copied by others.&lt;br /&gt;
&lt;br /&gt;
Academic integrity concerns both your personal character and the proper functioning of the entire educational system. Committing academic misconduct for a few points not only makes you a cheater, it also devalues the honest efforts of others. Let us work together to maintain an environment of integrity.&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 1|Problem Set 1]]  Please submit to [mailto:pr_nju@163.com pr_nju@163.com] before class (10am UTC+8) on 2025/3/10 (filename: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第一次作业提交名单|第一次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 2|Problem Set 2]]  Please submit to [mailto:pr_nju@163.com pr_nju@163.com] before class (10am UTC+8) on 2025/3/31 (filename: &#039;&amp;lt;font color=red &amp;gt;学号_姓名_A2.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Intro.pdf Course introduction]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/ProbSpace.pdf Probability space]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 1&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Entropy and volume of Hamming balls|Entropy and volume of Hamming balls]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/RandVar.pdf Random variables]&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[BT] Chapter 2&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 2, Sections 3.1~3.5, 3.7&#039;&#039;&#039;&lt;br /&gt;
#* Reading: &#039;&#039;&#039;[MU] Chapter 2&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12969</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12969"/>
		<updated>2025-03-14T08:10:13Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 16 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete reasoning; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 16 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Probability density function (PDF)&#039;&#039;&#039;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PDF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. Show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distribution. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) the marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the expected number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Discrete random variable, 14 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution with range &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
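The hypergeometric-to-binomial limit stated above can be sanity-checked numerically before proving it. A minimal sketch (not a proof; the helper names `hypergeom_pmf`, `binom_pmf`, and `max_gap` are ours, and the choice p = 0.3, n = 10 is arbitrary):

```python
# Numeric check of the hypergeometric -> binomial limit: fix n and p,
# let N grow with b = pN, and watch max_k |Pr(B = k) - Bin(n, p) pmf| shrink.
from math import comb

def hypergeom_pmf(N, b, n, k):
    # Pr(exactly k blue balls in a sample of n drawn without replacement)
    return comb(b, k) * comb(N - b, n - k) / comb(N, n)

def binom_pmf(n, p, k):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def max_gap(N, p=0.3, n=10):
    b = int(p * N)  # here b/N equals p exactly
    return max(abs(hypergeom_pmf(N, b, n, k) - binom_pmf(n, p, k))
               for k in range(n + 1))
```

By Vandermonde's identity the hypergeometric pmf sums to 1 over k, which gives a further deterministic check; the gap for N = 10^6 is orders of magnitude smaller than for N = 100, consistent with the claimed limit.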
&lt;br /&gt;
== Problem 3 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
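As the hint in the Expected Mex item says, a computer is needed for the numeric answer. One possible exact approach (a sketch, not the official solution; the function name `expected_mex` and the recursion are ours): by memorylessness of the geometric, conditioned on &amp;lt;math&amp;gt;X \ge 2&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X-1&amp;lt;/math&amp;gt; is again geometric, so the probability &amp;lt;math&amp;gt;h(t,k)&amp;lt;/math&amp;gt; that &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; samples cover &amp;lt;math&amp;gt;\{1,\ldots,t\}&amp;lt;/math&amp;gt; satisfies a binomial recursion, and &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}] = \sum_{t \ge 0}\mathbf{Pr}(\mathrm{mex} &amp;gt; t)&amp;lt;/math&amp;gt;.

```python
# Exact E[mex] of n iid Geom(p) samples via a memorylessness DP (a sketch).
# h[t][k] = Pr(k iid Geom(p) samples contain every value in {1, ..., t}).
# Conditioning on the j samples equal to 1 (j ~ Binomial(k, p), need j >= 1),
# the remaining k - j samples minus 1 are again iid Geom(p), so
#   h(t, k) = sum_{j=1}^{k} C(k, j) p^j (1-p)^(k-j) h(t-1, k-j).
from math import comb

def expected_mex(n=100, p=0.5):
    h = [[0.0] * (n + 1) for _ in range(n + 1)]
    h[0] = [1.0] * (n + 1)  # nothing to cover
    for t in range(1, n + 1):
        for k in range(1, n + 1):
            h[t][k] = sum(comb(k, j) * p**j * (1 - p)**(k - j) * h[t - 1][k - j]
                          for j in range(1, k + 1))
    # mex > t iff all of 1..t appear; with n samples, mex <= n + 1
    return sum(h[t][n] for t in range(n + 1))
```

As a sanity check under this sketch, &amp;lt;math&amp;gt;h(1,k) = 1-(1-p)^k&amp;lt;/math&amp;gt;, and for a single Geom(1/2) sample the expectation is &amp;lt;math&amp;gt;2\cdot\tfrac12 + 1\cdot\tfrac12 = 1.5&amp;lt;/math&amp;gt;.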
&lt;br /&gt;
== Problem 4 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph with no isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 5 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12968</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12968"/>
		<updated>2025-03-14T08:06:44Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 16 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete reasoning; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 16 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Probability density function (PDF)&#039;&#039;&#039;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, including the first and (possibly) second round of each coin.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distribution. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the expected number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Discrete random variable, 14 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution with range &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph with no isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 5 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12967</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12967"/>
		<updated>2025-03-13T15:49:20Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 16 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must show the complete derivation; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 16 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Probability density function (PDF)&#039;&#039;&#039;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possible) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. Show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the expected number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
&lt;br /&gt;
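The law-of-total-expectation warm-up above admits a quick numeric cross-check. This is only a sketch (the parameter <code>p = 0.3</code> and the sample size are arbitrary choices): conditioning on the first trial gives E[X] = p·1 + (1−p)(1 + E[X]), hence E[X] = 1/p, which a Monte Carlo mean should reproduce.

```python
import random

# Sanity check (not a proof): for X ~ Geom(p), conditioning on the
# outcome of the first trial gives E[X] = p*1 + (1-p)*(1 + E[X]),
# which solves to E[X] = 1/p.  Compare 1/p with a Monte Carlo mean.

def sample_geometric(p, rng):
    """Number of Bernoulli(p) trials up to and including the first success."""
    k = 1
    while rng.random() >= p:
        k += 1
    return k

rng = random.Random(0)
p = 0.3
n_samples = 200_000
mc_mean = sum(sample_geometric(p, rng) for _ in range(n_samples)) / n_samples
print(mc_mean, 1 / p)  # the two values should be close
```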
== Problem 2 (Discrete random variable, 14 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
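A simulation sketch for the coupon-collector item above (the value <code>c = 10</code> and the trial count are arbitrary): the wait between the &amp;lt;math&amp;gt;j&amp;lt;/math&amp;gt;-th and &amp;lt;math&amp;gt;(j+1)&amp;lt;/math&amp;gt;-th new type is geometric with success probability &amp;lt;math&amp;gt;(c-j)/c&amp;lt;/math&amp;gt;, so the total expected wait is &amp;lt;math&amp;gt;\sum_{j=0}^{c-1} c/(c-j) = c H_c&amp;lt;/math&amp;gt;.

```python
import random

# Monte Carlo check of the coupon-collector expectation c * H_c,
# where H_c is the c-th harmonic number.  c = 10 is an arbitrary choice.

def days_for_full_set(c, rng):
    """Days until all c types have been seen, buying one package per day."""
    seen = set()
    days = 0
    while len(seen) < c:
        days += 1
        seen.add(rng.randrange(c))
    return days

c = 10
expected = sum(c / (c - j) for j in range(c))  # = c * H_c
rng = random.Random(1)
trials = 20_000
mc = sum(days_for_full_set(c, rng) for _ in range(trials)) / trials
print(mc, expected)  # the two values should be close
```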
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution with range of values &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
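A numeric illustration of the hypergeometric-to-binomial limit above (not a proof; the parameters <code>p, n, k</code> and the values of <code>N</code> are arbitrary choices): the exact hypergeometric probabilities approach the binomial ones as &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; grows with &amp;lt;math&amp;gt;b/N \to p&amp;lt;/math&amp;gt;.

```python
from math import comb

# Exact PMFs via integer combinatorics; the gap should shrink as N grows.

def hypergeom_pmf(N, b, n, k):
    """Probability of k blue balls when drawing n without replacement."""
    return comb(b, k) * comb(N - b, n - k) / comb(N, n)

def binom_pmf(n, p, k):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p, n, k = 0.3, 5, 2
errors = []
for N in (50, 500, 5000):
    b = int(p * N)  # keep b/N exactly equal to p for these N
    errors.append(abs(hypergeom_pmf(N, b, n, k) - binom_pmf(n, p, k)))
print(errors)  # should decrease toward 0
```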
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
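A simulation sketch for the Poisson item above, checking only the two means of the known "Poisson thinning" phenomenon (the values <code>lam = 4.0</code> and <code>p = 0.25</code> are arbitrary): the head and tail counts turn out to have means &amp;lt;math&amp;gt;\lambda p&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\lambda(1-p)&amp;lt;/math&amp;gt;.

```python
import math
import random

# Simulate N ~ Pois(lambda) coins, each tossed once with head probability p,
# and compare the empirical mean head/tail counts with lambda*p, lambda*(1-p).

def sample_poisson(lam, rng):
    """Knuth's multiplication method; fine for small lambda."""
    L, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= rng.random()
        if prod < L:
            return k
        k += 1

rng = random.Random(2)
lam, p = 4.0, 0.25
trials = 100_000
heads = tails = 0
for _ in range(trials):
    n = sample_poisson(lam, rng)
    h = sum(rng.random() < p for _ in range(n))
    heads += h
    tails += n - h
print(heads / trials, tails / trials)  # near lam*p = 1.0 and lam*(1-p) = 3.0
```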
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
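A simulation sketch for the handshake item above (the value <code>n = 6</code> is an arbitrary choice): the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds amount to a uniform random perfect matching of the &amp;lt;math&amp;gt;2n&amp;lt;/math&amp;gt; hands, and the cycles are the connected components of the person-graph. The closed form &amp;lt;math&amp;gt;\sum_{i=1}^{n} 1/(2i-1)&amp;lt;/math&amp;gt; below is a candidate answer to check against your own derivation.

```python
import random

# Hands 2i and 2i+1 belong to person i; a shuffled list paired off in
# consecutive positions is a uniform perfect matching of the 2n hands.
# Cycles = connected components, counted with union-find over persons.

def count_cycles(n, rng):
    hands = list(range(2 * n))
    rng.shuffle(hands)  # pairs: hands[0]-hands[1], hands[2]-hands[3], ...
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    cycles = n
    for i in range(0, 2 * n, 2):
        a, b = find(hands[i] // 2), find(hands[i + 1] // 2)
        if a != b:
            parent[a] = b
            cycles -= 1  # joining two components loses one cycle
    return cycles

n = 6
candidate = sum(1 / (2 * i - 1) for i in range(1, n + 1))
rng = random.Random(3)
trials = 50_000
mc = sum(count_cycles(n, rng) for _ in range(trials)) / trials
print(mc, candidate)  # the two values should be close
```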
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
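A simulation sketch of the claimed identity above, on an arbitrary small digraph of my own choosing (it is not part of the problem): &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; counts the vertices, including &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; itself, from which &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable.

```python
import random

# Compare the average number of erase operations with sum_v 1/R_v.
edges = {0: [1, 2], 1: [3], 2: [3], 3: [], 4: [0]}
V = list(edges)

def reachable_from(s):
    """All vertices reachable from s, including s (DFS)."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for w in edges[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

reach = {v: reachable_from(v) for v in V}
R = {v: sum(v in reach[u] for u in V) for v in V}  # sources of v, incl. v
claimed = sum(1 / R[v] for v in V)

rng = random.Random(4)
trials = 100_000
total_ops = 0
for _ in range(trials):
    alive = set(V)
    while alive:
        u = rng.choice(sorted(alive))  # uniform over unerased vertices
        alive -= reach[u]              # erase u and everything reachable from it
        total_ops += 1
print(total_ops / trials, claimed)  # the two values should be close
```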
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
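As the hint above notes, programming is expected for the mex problem. The following is a Monte Carlo sketch only: far too noisy for the required &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt; accuracy, but useful as a rough cross-check of an exact computation.

```python
import random

# Rough Monte Carlo estimate of E[mex(X_1, ..., X_100)] for
# X_i ~ Geom(1/2) on {1, 2, 3, ...}.

def sample_geom_half(rng):
    """Number of fair-coin flips up to and including the first head."""
    k = 1
    while rng.random() >= 0.5:
        k += 1
    return k

def mex(values):
    """Smallest positive integer not appearing among the values."""
    present = set(values)
    m = 1
    while m in present:
        m += 1
    return m

rng = random.Random(5)
trials = 20_000
mc = sum(mex(sample_geom_half(rng) for _ in range(100)) for _ in range(trials)) / trials
print(mc)  # rough estimate only
```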
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
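A simulation sketch of the hint's procedure on an arbitrary small graph of my own choosing (a triangle, an edge, and an isolated vertex): vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; enters &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; exactly when it precedes all of its &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; neighbors in the permutation, which happens with probability &amp;lt;math&amp;gt;1/(d_v+1)&amp;lt;/math&amp;gt;.

```python
import random

# For this procedure the expected size of I equals sum_v 1/(d_v + 1) exactly,
# so the simulated average should match the bound.

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4}, 4: {3}, 5: set()}

def hint_independent_set(rng):
    order = list(adj)
    rng.shuffle(order)
    pos = {v: i for i, v in enumerate(order)}
    # keep v iff none of its neighbors appears earlier in the permutation
    return {v for v in adj if all(pos[v] < pos[u] for u in adj[v])}

bound = sum(1 / (len(adj[v]) + 1) for v in adj)  # = 3.0 for this graph
rng = random.Random(6)
trials = 20_000
avg = sum(len(hint_independent_set(rng)) for _ in range(trials)) / trials
print(avg, bound)  # the average size matches the bound for this procedure
```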
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
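A brute-force sketch of the averaging argument behind the tournament problem, for the arbitrary small case <code>n = 6</code>: in a uniformly random tournament a fixed ordering of the players is a Hamiltonian path with probability &amp;lt;math&amp;gt;2^{-(n-1)}&amp;lt;/math&amp;gt;, so the expected number of Hamiltonian paths is &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt;, and some tournament must reach that average.

```python
import random
from itertools import permutations
from math import factorial

# Average the exact Hamiltonian-path count of random 6-player tournaments
# and compare with n!/2^(n-1) = 720/32 = 22.5.

n = 6
players = range(n)

def count_hamiltonian_paths(beats):
    return sum(
        all(beats[(perm[i], perm[i + 1])] for i in range(n - 1))
        for perm in permutations(players)
    )

rng = random.Random(7)
trials = 1000
total = 0
for _ in range(trials):
    beats = {}
    for i in players:
        for j in players:
            if i < j:
                w = rng.random() < 0.5  # orient each edge by a fair coin
                beats[(i, j)] = w
                beats[(j, i)] = not w
    total += count_hamiltonian_paths(beats)
print(total / trials, factorial(n) / 2 ** (n - 1))
```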
&lt;br /&gt;
== Problem 5 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
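A simulation sketch of the range result above (the choices <code>n = 5000</code> and <code>p = 0.7</code> are arbitrary): for the asymmetric walk the fraction of fresh values &amp;lt;math&amp;gt;\mathbf{E}[R_n]/n&amp;lt;/math&amp;gt; should approach &amp;lt;math&amp;gt;|2p-1|&amp;lt;/math&amp;gt;.

```python
import random

# R_n = number of distinct values of S_0, ..., S_n; estimate E[R_n]/n.

def range_of_walk(n, p, rng):
    s, seen = 0, {0}
    for _ in range(n):
        s += 1 if rng.random() < p else -1
        seen.add(s)
    return len(seen)

rng = random.Random(8)
n, p = 5000, 0.7
trials = 200
avg_ratio = sum(range_of_walk(n, p, rng) for _ in range(trials)) / (trials * n)
print(avg_ratio, abs(2 * p - 1))  # both near 0.4
```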
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
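An exact-computation sketch for the symmetric case above: &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \sum_k |2k-n| \binom{n}{k} 2^{-n}&amp;lt;/math&amp;gt;, and the classical asymptotic &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] \sim \sqrt{2n/\pi}&amp;lt;/math&amp;gt; is consistent with the &amp;lt;math&amp;gt;\Theta(\sqrt{n})&amp;lt;/math&amp;gt; claim.

```python
from math import comb, pi, sqrt

# Exact E[|S_n|] for p = 1/2 via the binomial distribution of the number
# of successes; the ratio E[|S_n|]/sqrt(n) should approach sqrt(2/pi).

def exact_abs_mean(n):
    return sum(abs(2 * k - n) * comb(n, k) for k in range(n + 1)) / 2 ** n

ratios = [exact_abs_mean(n) / sqrt(n) for n in (100, 400, 1600)]
print(ratios, sqrt(2 / pi))  # ratios approach sqrt(2/pi) ≈ 0.7979
```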
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12966</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12966"/>
		<updated>2025-03-13T15:47:13Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 16 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must show the complete derivation; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 16 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Probability density function (PDF)&#039;&#039;&#039;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possible) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. Show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Discrete random variable, 14 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution with range of values &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
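The hypergeometric-to-binomial limit above can be sanity-checked numerically before attempting the proof. A minimal Python sketch (the instance n = 5 with b/N held at 0.3 is an arbitrary choice, not part of the problem):

```python
from math import comb

def hypergeom_pmf(N, b, n, k):
    # Pr(B = k): k blue balls in a sample of n drawn without replacement
    return comb(b, k) * comb(N - b, n - k) / comb(N, n)

def binom_pmf(n, p, k):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n = 5                                      # sample size (arbitrary small choice)
for N in (50, 500, 5000, 50000):
    b = (3 * N) // 10                      # keep b/N = 0.3 exactly
    gap = max(abs(hypergeom_pmf(N, b, n, k) - binom_pmf(n, b / N, k))
              for k in range(n + 1))
    print(N, gap)                          # the gap shrinks as N grows
```

The printed maximum pointwise gap decays roughly like 1/N, consistent with the stated limit.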
&lt;br /&gt;
== Problem 3 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for a given integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops or multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
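For the Expected Mex problem, one programmable route (a sketch of one possible method, not the required one) uses exponential generating functions: Pr(mex exceeds m) is the probability that every value 1..m appears among the 100 draws, which equals 100! times the coefficient of t^100 in the product over k = 1..m of (e^{t/2^k} - 1), multiplied by e^{t/2^m}. Tracking EGF coefficients and combining factors by binomial convolution keeps all quantities in [0, 1]:

```python
from math import comb

N_DRAWS = 100
# binomial coefficients as floats, for binomial convolutions of EGF coefficients
C = [[float(comb(i, j)) for j in range(N_DRAWS + 1)] for i in range(N_DRAWS + 1)]

def binom_conv(a, b):
    # if f = sum a_j t^j / j! and g = sum b_j t^j / j!, then the product fg
    # has EGF coefficients c_k = sum_j C(k, j) a_j b_{k-j}
    c = [0.0] * (N_DRAWS + 1)
    for i, ai in enumerate(a):
        if ai != 0.0:
            for j in range(N_DRAWS + 1 - i):
                c[i + j] += C[i + j][i] * ai * b[j]
    return c

def prob_all_appear(m):
    # Pr(each value 1..m appears among N_DRAWS iid Geo(1/2) samples):
    # N_DRAWS! [t^N] of prod over k of (e^{t/2^k} - 1), times e^{t/2^m}
    r = 2.0 ** -m
    poly = [r ** j for j in range(N_DRAWS + 1)]        # EGF coeffs of e^{rt}
    for k in range(1, m + 1):
        p = 2.0 ** -k
        poly = binom_conv([0.0] + [p ** j for j in range(1, N_DRAWS + 1)], poly)
    return poly[N_DRAWS]

# E[mex] = sum over m of Pr(mex exceeds m); Pr(mex exceeds m) is at most
# 100 / 2^m, so the truncated tail beyond m = 40 totals below 1e-9.
expected_mex = sum(prob_all_appear(m) for m in range(41))
print(expected_mex)
```

The tail bound comes from Pr(mex exceeds m) being at most Pr(value m appears), which is at most 100 / 2^m by a union bound, so the truncation is well within the stated 1e-6 tolerance.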
&lt;br /&gt;
== Problem 4 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
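The randomized procedure in the hint for Turán's theorem is easy to simulate. A sketch on the 5-cycle (an arbitrary test graph, where every degree is 2, so the claimed lower bound evaluates to 5/3):

```python
import random

# 5-cycle C_5: every vertex has degree 2, so the bound is 5 * (1/3) = 5/3
adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [0, 3]}

def hint_set(order):
    # keep v iff it precedes all of its neighbours in the permutation;
    # such a set is always independent
    pos = {v: i for i, v in enumerate(order)}
    return {v for v in adj
            if pos[v] == min([pos[v]] + [pos[u] for u in adj[v]])}

rng = random.Random(1)
trials = 20000
total = 0
for _ in range(trials):
    order = list(adj)
    rng.shuffle(order)
    total += len(hint_set(order))
print(total / trials)   # estimates E[|I|]; compare with the bound 5/3
```

The simulated average sits near 5/3, illustrating why the expectation argument in the hint yields the stated bound.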
&lt;br /&gt;
== Problem 5 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12965</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12965"/>
		<updated>2025-03-13T15:43:43Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 2 (Discrete random variable) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every problem's solution must include a complete derivation; answers may be written in Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work with the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables involved are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 16 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Probability density function (PDF)&#039;&#039;&#039;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and the (possible) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distribution. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) the marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the expected number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Discrete random variable, 14 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N - b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
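A simulation sketch for the coupon-collector-style problem above (Geometric distribution (I)); the instance c = 10, the seed, and the trial count are arbitrary choices, and the printed estimate can be checked against your closed-form answer to part 2:

```python
import random

def days_to_full_set(c, rng):
    # buy one package per day until all c object types have been seen
    seen, days = set(), 0
    while len(seen) != c:
        seen.add(rng.randrange(c))
        days += 1
    return days

rng = random.Random(0)
c, trials = 10, 20000
est = sum(days_to_full_set(c, rng) for _ in range(trials)) / trials
print(est)   # compare with your closed-form answer at c = 10
```

With 20000 trials the Monte Carlo error is roughly 0.1 days for c = 10, small enough to distinguish a correct formula from a wrong one.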
&lt;br /&gt;
== Problem 3 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for a given integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops or multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
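The handshake process in the Number of cycles problem can be simulated directly: picking two uniformly random idle hands in each round produces a uniform random perfect matching on the 2n hands, so shuffling the hands and pairing them consecutively gives the same distribution. A sketch (the instance n = 10 and the seed are arbitrary choices) that you can check a conjectured formula against:

```python
import random

def cycles_after_handshakes(n, rng):
    # shuffle the 2n hands and pair consecutive ones: a uniform perfect matching
    hands = list(range(2 * n))            # hands 2i and 2i+1 belong to person i
    rng.shuffle(hands)
    parent = list(range(n))               # union-find over the n people
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    components = n                        # each merge of two people loses one
    for i in range(n):
        a, b = find(hands[2 * i] // 2), find(hands[2 * i + 1] // 2)
        if a != b:
            parent[a] = b
            components -= 1
    return components                     # every component is a cycle

rng = random.Random(2)
n, trials = 10, 20000
est = sum(cycles_after_handshakes(n, rng) for _ in range(trials)) / trials
print(est)   # compare with your formula at n = 10
```

Since every person ends with both hands shaken, each connected component of the resulting multigraph is a cycle, so counting components counts cycles.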
&lt;br /&gt;
== Problem 4 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
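For the tournament problem, the expectation step of the probabilistic method can be verified by exhaustive enumeration at a small size. A sketch for n = 5, where all 2^10 orientations of the complete graph can be checked directly (n = 5 is an arbitrary small choice):

```python
from itertools import permutations, product

n = 5
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
perms = list(permutations(range(n)))

def count_ham_paths(wins):
    # wins: set of directed edges (u, v), meaning u beats v
    return sum(all((p[k], p[k + 1]) in wins for k in range(n - 1))
               for p in perms)

total, best = 0, 0
for orient in product((0, 1), repeat=len(pairs)):
    wins = {(i, j) if o else (j, i) for (i, j), o in zip(pairs, orient)}
    c = count_ham_paths(wins)
    total += c
    best = max(best, c)

print(total / 2 ** len(pairs), best)   # average over all tournaments, and the max
```

Each permutation is a Hamiltonian path with probability 2^(1-n) in a uniformly random tournament, so the printed average matches n!/2^(n-1), and some tournament must attain at least that many paths.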
&lt;br /&gt;
== Problem 5 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12964</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12964"/>
		<updated>2025-03-13T15:43:23Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 12 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every problem's solution must include a complete derivation; answers may be written in Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work with the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables involved are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 16 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Probability density function (PDF)&#039;&#039;&#039;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and the (possible) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt;, be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distribution. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) the marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the expected number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
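The law-of-total-expectation item above admits a quick numerical sanity check. Below is a minimal Python sketch (not part of the assignment; the value p = 0.3 is an arbitrary choice of this illustration), using the convention that Geom(p) is supported on the positive integers:

```python
# Sanity check (a sketch, not a solution): for X ~ Geom(p) on {1,2,...},
# conditioning on the outcome of the first trial gives
#   E[X] = p * 1 + (1 - p) * (1 + E[X]),
# hence E[X] = 1/p. Compare against a truncated direct sum of k * Pr[X = k].
p = 0.3  # arbitrary parameter for the check
direct = sum(k * p * (1 - p) ** (k - 1) for k in range(1, 2000))
print(direct, 1 / p)  # both are 1/p = 3.333...
```

The truncation at k = 2000 is harmless here because the geometric tail decays exponentially.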
&lt;br /&gt;
== Problem 2 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
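The Expected Mex item above explicitly invites the use of a computer. One possible computational route (an assumption of this sketch, not the required method) uses E[mex] as a sum of tail probabilities Pr[mex greater than m], combined with the memorylessness of the geometric distribution; the recursion and the truncation level m_max are choices of this illustration:

```python
from math import comb

# Sketch (one possible approach, not the official solution): E[mex] for n
# iid Geometric(1/2) samples on {1,2,...}. Writing g(n,m) = Pr[the values
# 1..m all appear among n samples], tail-sum gives
#   E[mex] = 1 + g(n,1) + g(n,2) + ...
# Conditioning on the Binomial(n,1/2) count c of samples equal to 1 (the
# remaining samples, each reduced by 1, are again iid Geometric(1/2)):
#   g(n,m) = sum_{c=1..n} C(n,c) 2^(-n) g(n-c, m-1),  g(n,0)=1, g(0,m)=0.
def expected_mex(n=100, m_max=60):
    g_prev = [1.0] * (n + 1)   # g(., 0) = 1
    total = 1.0                # the m = 0 term: Pr[mex is at least 1] = 1
    for _ in range(m_max):
        g_cur = [0.0] * (n + 1)
        for nn in range(1, n + 1):
            s = sum(comb(nn, c) * g_prev[nn - c] for c in range(1, nn + 1))
            g_cur[nn] = s / 2.0 ** nn
        total += g_cur[n]
        g_prev = g_cur
    return total

print(f"E[mex] ~ {expected_mex():.7f}")
```

Truncating at m_max = 60 is safe because Pr[value m appears among 100 samples] is at most 100 * 2^(-m), so the omitted tail is far below the 1e-6 tolerance. As a sanity check, a single Geo(1/2) sample has E[mex] = 1.5.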
&lt;br /&gt;
== Problem 4 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
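The random-permutation procedure in the Turán hint is easy to simulate. A hedged Python sketch follows; the function name, the path graph used as an example, and the trial count are all arbitrary choices of this illustration, not part of the assignment:

```python
import random

# Simulate the hint's procedure: shuffle the vertices, then keep v iff v
# comes before every one of its neighbors in the permutation. The kept set
# is independent (two adjacent kept vertices would each have to precede the
# other), and each v survives with probability 1/(d_v + 1).
def greedy_independent_set(adj, rng):
    order = list(adj)
    rng.shuffle(order)
    pos = {v: i for i, v in enumerate(order)}
    return [v for v in adj if all(pos[u] > pos[v] for u in adj[v])]

# Example: the path on vertices 0, 1, 2, where the Turán bound is
# 1/2 + 1/3 + 1/2 = 4/3, attained exactly in expectation here.
adj = {0: [1], 1: [0, 2], 2: [1]}
rng = random.Random(0)
trials = 20000
avg = sum(len(greedy_independent_set(adj, rng)) for _ in range(trials)) / trials
print(avg)  # close to 4/3
```

Averaging over many shuffles gives an empirical estimate of E[size of I], which by linearity of expectation equals the sum of 1/(d_v + 1); a maximum independent set must be at least as large.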
&lt;br /&gt;
== Problem 5 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12963</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12963"/>
		<updated>2025-03-13T15:43:03Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 12 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every problem requires a complete solution with full reasoning; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Probability mass function (PMF)&#039;&#039;&#039;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If a coin shows tails, it is not tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) second round of each coin.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt;, be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distribution. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) the marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the expected number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
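The hint in the Turán problem describes a concrete randomized procedure, so it can be run directly. A small sketch (the path graph, trial count, and seed are illustrative choices) that executes the permutation rule and compares the average output size with &lt;math&gt;\sum_{v} \frac{1}{d_v+1}&lt;/math&gt;:

```python
import random

def random_independent_set(adj, rng):
    """Keep v iff no neighbor of v precedes it in a uniform random order."""
    order = list(adj)
    rng.shuffle(order)
    pos = {v: i for i, v in enumerate(order)}
    return [v for v in adj if all(pos[u] > pos[v] for u in adj[v])]

# Path on three vertices: sum_v 1/(d_v + 1) = 1/2 + 1/3 + 1/2 = 4/3
adj = {0: [1], 1: [0, 2], 2: [1]}
rng = random.Random(0)
trials = 20_000
avg = sum(len(random_independent_set(adj, rng)) for _ in range(trials)) / trials
# avg should be close to 4/3; since |I| is always an independent set,
# alpha(G) >= E[|I|] = sum_v 1/(d_v + 1)
```

Note that the kept set is always independent: if two adjacent vertices were both kept, each would have to precede the other in the permutation.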
&lt;br /&gt;
== Problem 5 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt; for every &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12962</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12962"/>
		<updated>2025-03-13T15:42:43Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 2 (Discrete random variable) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every problem requires a complete solution with full working; you may write in Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work in the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Probability density function (PDF)&#039;&#039;&#039;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) second round of each coin.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution with range &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N - b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
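For Geometric distribution (I), the wait between the &lt;math&gt;j&lt;/math&gt;-th and &lt;math&gt;(j+1)&lt;/math&gt;-th new type is geometric with success probability &lt;math&gt;(c-j)/c&lt;/math&gt;, so the expected total wait is &lt;math&gt;c\left(1 + \tfrac12 + \cdots + \tfrac1c\right)&lt;/math&gt;. A quick simulation sketch (the choice &lt;math&gt;c = 4&lt;/math&gt;, where the formula gives &lt;math&gt;25/3 \approx 8.33&lt;/math&gt;, and the trial count are illustrative):

```python
import random

def days_to_full_set(c, rng):
    """Days of buying one package per day until all c types are seen."""
    seen, days = set(), 0
    while len(seen) < c:
        days += 1
        seen.add(rng.randrange(c))  # each type equally likely
    return days

rng = random.Random(0)
trials = 40_000
avg = sum(days_to_full_set(4, rng) for _ in range(trials)) / trials
# avg should be close to 4 * (1 + 1/2 + 1/3 + 1/4) = 25/3
```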
&lt;br /&gt;
== Problem 3 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
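The handshake process in Number of cycles above is equivalent to drawing a uniform perfect matching on the &lt;math&gt;2n&lt;/math&gt; hands, and linearity of expectation can be used to show the expected number of cycles is &lt;math&gt;\sum_{k=1}^{n} \frac{1}{2k-1}&lt;/math&gt;. A simulation sketch to check this (union-find over people, with a self-shake counting as a one-person cycle; &lt;math&gt;n = 3&lt;/math&gt;, where the sum is &lt;math&gt;23/15 \approx 1.53&lt;/math&gt;, and the trial count are illustrative):

```python
import random

def count_cycles(n, rng):
    """One run of the handshake process; returns the number of cycles."""
    hands = list(range(2 * n))  # person i owns hands 2i and 2i+1
    rng.shuffle(hands)          # uniform matching: pair consecutive entries
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    cycles = n
    for i in range(0, 2 * n, 2):
        a, b = find(hands[i] // 2), find(hands[i + 1] // 2)
        if a != b:              # joining two different chains merges components
            parent[a] = b
            cycles -= 1
    return cycles

rng = random.Random(0)
trials = 30_000
avg = sum(count_cycles(3, rng) for _ in range(trials)) / trials
# avg should be close to 1 + 1/3 + 1/5 = 23/15
```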
&lt;br /&gt;
== Problem 4 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
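The Tournament bound is the probabilistic method in miniature: in a uniformly random tournament each of the &lt;math&gt;n!&lt;/math&gt; vertex orderings is a Hamiltonian path with probability &lt;math&gt;2^{-(n-1)}&lt;/math&gt;, so the expected count is &lt;math&gt;n!/2^{n-1}&lt;/math&gt; and some tournament attains at least that many. A brute-force check of the &lt;math&gt;n = 4&lt;/math&gt; case (the average over all 64 tournaments should be exactly &lt;math&gt;4!/2^3 = 3&lt;/math&gt;):

```python
from itertools import permutations, product

PAIRS = [(i, j) for i in range(4) for j in range(i + 1, 4)]

def hamiltonian_paths(orient):
    """Count directed Hamiltonian paths; orient[(i,j)] is the pair's winner."""
    return sum(
        all(orient[min(a, b), max(a, b)] == a for a, b in zip(p, p[1:]))
        for p in permutations(range(4))
    )

counts = [
    hamiltonian_paths(dict(zip(PAIRS, winners)))
    for winners in product(*[[i, j] for (i, j) in PAIRS])
]
avg = sum(counts) / len(counts)  # equals 4!/2**3 = 3 exactly
# hence max(counts) >= 3: some 4-player tournament has >= 3 Hamiltonian paths
```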
&lt;br /&gt;
== Problem 5 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt; for every &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12961</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12961"/>
		<updated>2025-03-13T15:39:14Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every problem requires a complete solution with full working; you may write in Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work in the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Probability density function (PDF)&#039;&#039;&#039;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) second round of each coin.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution with range &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
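For the Expected Mex problem above, one way (a sketch, not the only method) to organize the computation is to write &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}] = \sum_{k \ge 0} \mathbf{Pr}(\mathrm{mex} &amp;gt; k)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathbf{Pr}(\mathrm{mex} &amp;gt; k)&amp;lt;/math&amp;gt; is the probability that each of the values &amp;lt;math&amp;gt;1,\ldots,k&amp;lt;/math&amp;gt; appears among the 100 draws; that probability satisfies a recursion obtained by conditioning on the Binomial number of draws equal to 1:

```python
from math import comb

n, K = 100, 40  # K: truncation depth; the neglected tail of E[mex] is < 200 * 2**-K

# f[k][m] = Pr(each of the values 1..k appears among m iid Geom(1/2) draws),
# with Pr(X = i) = 2**-i on the positive integers.  Conditioning on
# j = #{draws equal to 1} ~ Bin(m, 1/2): given j >= 1, the remaining m - j
# draws, conditioned on being >= 2, are distributed as 1 + Geom(1/2), so the
# requirement "2..k all appear" reduces to f[k-1][m-j] after shifting values
# down by one.
f = [[0.0] * (n + 1) for _ in range(K + 1)]
for m in range(n + 1):
    f[0][m] = 1.0
for k in range(1, K + 1):
    for m in range(1, n + 1):
        f[k][m] = sum(comb(m, j) * 0.5 ** m * f[k - 1][m - j]
                      for j in range(1, m + 1))

# E[mex] = sum_{k >= 0} Pr(mex > k) = sum_k f[k][n]
E = sum(f[k][n] for k in range(K + 1))
print(E)
```

The truncation at &amp;lt;math&amp;gt;K = 40&amp;lt;/math&amp;gt; is justified by the union bound &amp;lt;math&amp;gt;\mathbf{Pr}(\mathrm{mex} &amp;gt; k) \le 100 \cdot 2^{-k}&amp;lt;/math&amp;gt;, which keeps the neglected tail far below the required &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt; accuracy.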
&lt;br /&gt;
== Problem 4 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 5 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12960</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12960"/>
		<updated>2025-03-13T15:38:21Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 12 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every problem requires a complete solution; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Cumulative distribution function (CDF) (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Probability density function (PDF)&#039;&#039;&#039;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distribution. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) the marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the expected number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables, and let &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
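The CDF (II) item in this problem is the inverse-transform sampling method. A minimal Python illustration, using the Exp(1) CDF &amp;lt;math&amp;gt;F(y) = 1 - e^{-y}&amp;lt;/math&amp;gt; as one example choice of &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; (any continuous, strictly increasing CDF would do):

```python
import math
import random

random.seed(7)

# Inverse-transform sampling, as in the CDF (II) item: if X is uniform on
# [0,1] (its CDF is max(0, min(1, x))) and F is continuous and strictly
# increasing, then Y = F^{-1}(X) has CDF F.  Example with the Exp(1) CDF
# F(y) = 1 - e^{-y}, whose inverse is F^{-1}(x) = -ln(1 - x).
samples = [-math.log(1.0 - random.random()) for _ in range(200_000)]
mean = sum(samples) / len(samples)
print(mean)  # Exp(1) has mean 1, so this should be close to 1
```

This numeric check only illustrates the statement; the exercise asks for a proof via the distribution function of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.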
&lt;br /&gt;
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; by &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability density function (PDF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PDF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices.&lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt; for every &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12959</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12959"/>
		<updated>2025-03-13T15:37:55Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 12 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete working; answers may be written in Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; by &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability mass function (PMF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
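Because both parts above reduce to sums of geometric waiting times, a quick Monte Carlo run is a handy sanity check on a closed-form answer. The sketch below is our own illustration (not part of the assignment; `coupon_days` and its defaults are hypothetical names):

```python
import random

def coupon_days(c, trials=6000, seed=5):
    """Average number of days until all c object types have been collected,
    when each day's package contains a uniformly random type."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen, days = set(), 0
        while len(seen) < c:
            seen.add(rng.randrange(c))  # the type found in today's package
            days += 1
        total += days
    return total / trials
```

Comparing `coupon_days(c)` for a few small `c` against your formula from part 2 should agree to within sampling error.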
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution with range &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
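A closed-form answer here can be sanity-checked by simulation. The sketch below is our own illustration (function names are hypothetical): pairing a uniformly shuffled list of the 2n hands two at a time has the same law as the round-by-round process, and since every person ends with both hands used, each connected component of the handshake multigraph is exactly one cycle, so a union-find pass counts cycles.

```python
import random

def simulate_cycles(n, trials=2000, seed=1):
    """Monte Carlo estimate of the expected number of cycles when the 2n hands
    of n people are paired uniformly at random (self-pairings allowed)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        hands = [i for i in range(n) for _ in range(2)]  # owner of each hand
        rng.shuffle(hands)
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        components = n
        for a, b in zip(hands[::2], hands[1::2]):  # consecutive hands shake
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb
                components -= 1
        total += components  # all degrees are 2, so each component is one cycle
    return total / trials
```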
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
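The claimed identity can be checked numerically on small digraphs before proving it. This sketch is our own (the example graph and function names are hypothetical); it estimates the expected number of operations by simulation, to be compared against a direct evaluation of the sum over 1/R_v.

```python
import random

def reach_sets(n, edges):
    """reach_sets(n, edges)[u] = set of vertices reachable from u (u included)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    reach = []
    for u in range(n):
        seen, stack = {u}, [u]
        while stack:                      # depth-first search from u
            for y in adj[stack.pop()]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        reach.append(seen)
    return reach

def expected_operations(n, edges, trials=20000, seed=3):
    """Monte Carlo estimate of the expected number of erase operations."""
    rng = random.Random(seed)
    reach = reach_sets(n, edges)
    total = 0
    for _ in range(trials):
        alive, ops = set(range(n)), 0
        while alive:
            v = rng.choice(sorted(alive))  # uniform over unerased vertices
            alive -= reach[v]              # erase v and everything reachable from it
            ops += 1
        total += ops
    return total / trials
```

For the path 0 → 1 → 2, for example, R = (1, 2, 3), so the stated formula gives 1 + 1/2 + 1/3 = 11/6, and the simulated mean should match it to within sampling error.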
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
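As the hint says, a computer is needed here. One exact approach (our own sketch, not the official solution) writes E[mex] = 1 + Σ_{m≥1} Pr(every value in {1,…,m} appears among the n = 100 draws), and extracts each probability as n![x^n] of the exponential-generating-function product Π_{j=1}^m (e^{p_j x} − 1) · e^{rx}, with p_j = 2^{−j} and r = 2^{−m} the total mass of the values above m.

```python
def expected_mex(n=100, max_m=40):
    """E[mex(X_1..X_n)] for i.i.d. X_i ~ Geo(1/2) on {1, 2, ...}, via
    E[mex] = 1 + sum_{m>=1} Pr(every value in {1..m} appears in n draws)."""
    total = 1.0
    for m in range(1, max_m + 1):
        # dp[u] = [x^u] prod_{j processed} (e^{p_j x} - 1), truncated at degree n
        dp = [0.0] * (n + 1)
        dp[0] = 1.0
        for j in range(1, m + 1):
            p = 2.0 ** (-j)
            term = [0.0] * (n + 1)  # series of e^{p x} - 1: p^k x^k / k!
            t = 1.0
            for k in range(1, n + 1):
                t *= p / k
                term[k] = t
            new = [0.0] * (n + 1)
            for u in range(n):
                if dp[u]:
                    for k in range(1, n - u + 1):
                        new[u + k] += dp[u] * term[k]
            dp = new
        r = 2.0 ** (-m)      # mass of the values larger than m
        # Pr(all of 1..m appear) = n! [x^n] (product above) * e^{r x}
        f, c = 0.0, 1.0      # c = n!/(n-u)!, a falling factorial
        for u in range(n + 1):
            f += dp[u] * c * r ** (n - u)
            c *= n - u
        total += f
        if m > 10 and f < 1e-12:  # the tail is bounded by n * 2^{-m}
            break
    return total
```

Truncating the outer sum is safe because Pr(all of 1,…,m appear) ≤ n·2^{−m}, which is far below the 10^{−6} tolerance once the loop breaks.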
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices.&lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
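The hinted procedure is easy to run. The sketch below (our own illustration; names are hypothetical) builds the set from a random vertex order and compares the empirical average |I| with the claimed bound; a vertex is kept exactly when it precedes all of its neighbours in the permutation, so the average should line up with Σ 1/(d_v + 1).

```python
import random

def hinted_independent_set(n, edges, rng):
    """One run of the hint: keep v iff it precedes every one of its
    neighbours in a uniformly random permutation of the vertices."""
    nbrs = [set() for _ in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    order = list(range(n))
    rng.shuffle(order)
    pos = {v: i for i, v in enumerate(order)}
    # adjacent vertices can never both be kept, so the result is independent
    return [v for v in range(n) if all(pos[v] < pos[u] for u in nbrs[v])]

def check_turan_bound(n, edges, trials=5000, seed=11):
    """Compare the empirical E|I| with sum_v 1/(d_v + 1)."""
    rng = random.Random(seed)
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    bound = sum(1.0 / (d + 1) for d in deg)
    avg = sum(len(hinted_independent_set(n, edges, rng))
              for _ in range(trials)) / trials
    return avg, bound
```

On a triangle, for instance, exactly the first vertex of the permutation is kept, so the average is exactly 1, matching 3 · 1/3.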
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt; for every &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
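A simulation makes the square-root scaling visible before you prove it. This sketch (our own illustration) estimates E|S_n| for the symmetric walk; dividing the result by √n for several values of n should give a roughly constant ratio.

```python
import random

def mean_abs_position(n, trials=5000, seed=2):
    """Monte Carlo estimate of E|S_n| for the symmetric walk (p = 1/2)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # S_n is a sum of n independent +/-1 steps
        s = sum(1 if rng.random() < 0.5 else -1 for _ in range(n))
        total += abs(s)
    return total / trials
```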
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12958</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12958"/>
		<updated>2025-03-13T15:32:01Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 3 (Discrete random variable) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete working; answers may be written in Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
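The last warm-up identity (Wald's identity for a count &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; independent of the summands) can be checked exactly by brute-force enumeration on a toy example; the distributions below (N uniform on {0,1,2}, each X_i uniform on {1,2}) are our own illustrative choices, not part of the problem.

```python
from itertools import product
from fractions import Fraction

# Exact check of Wald's identity E[sum_{i=1}^N X_i] = E[N] E[X_1] on a toy
# example: N uniform on {0,1,2}, X_i uniform on {1,2}, all independent.
half, third = Fraction(1, 2), Fraction(1, 3)
lhs = Fraction(0)
for n in range(3):                        # value taken by N
    for xs in product((1, 2), repeat=n):  # values taken by X_1, ..., X_n
        lhs += third * half**n * sum(xs)
e_n = third * (0 + 1 + 2)
e_x = half * (1 + 2)
assert lhs == e_n * e_x                   # both sides equal 3/2
```

Exact rational arithmetic makes the check airtight on this small example; the general proof conditions on N.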
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; by &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability density function (PDF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possible) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
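The binomial-additivity claim above can be sanity-checked numerically by convolving the two PMFs exactly; the parameters n1 = 3, n2 = 4, p = 1/3 below are arbitrary illustrative choices, not from the problem.

```python
from fractions import Fraction
from math import comb

# Exact check that Bin(n1, p) + Bin(n2, p) has the Bin(n1 + n2, p) PMF,
# by convolving the two PMFs with rational arithmetic (n1=3, n2=4, p=1/3).
def binom_pmf(n, p):
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

p = Fraction(1, 3)
a, b = binom_pmf(3, p), binom_pmf(4, p)
conv = [Fraction(0)] * (len(a) + len(b) - 1)
for i, pa in enumerate(a):
    for j, pb in enumerate(b):
        conv[i + j] += pa * pb       # Pr(X + Y = i + j) accumulates here
assert conv == binom_pmf(7, p)       # matches Bin(7, 1/3) exactly
```

The term-by-term agreement is exactly the Vandermonde identity that a written proof would invoke.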
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
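For the Expected Mex problem, one possible computational route (a sketch, assuming Geom(1/2) is supported on {1, 2, ...} with Pr(X = k) = 2^(-k)) writes E[mex] = 1 + sum over m of Pr(all of 1..m appear) and evaluates each term by inclusion-exclusion over the set of missing values; the subset sums of {2^(-1), ..., 2^(-m)} are exactly the dyadic values j / 2^m.

```python
def p_all_appear(m, n=100):
    # Pr(each of 1..m appears among n i.i.d. Geom(1/2) draws), by
    # inclusion-exclusion over the set of missing values.  The subset sums
    # of {2**-1, ..., 2**-m} are exactly j / 2**m for j = 0 .. 2**m - 1,
    # and the parity of the subset equals the popcount of j.
    total = 0.0
    for j in range(2 ** m):
        sign = -1.0 if bin(j).count("1") % 2 else 1.0
        total += sign * (1 - j / 2 ** m) ** n
    return total

def expected_mex(n=100, cutoff=14):
    # E[mex] = sum_{m >= 0} Pr(mex exceeds m) = 1 + sum_{m >= 1} Pr(1..m appear).
    # The terms decay superexponentially in m, so a modest cutoff puts the
    # truncation error far below the required 1e-6.
    return 1.0 + sum(p_all_appear(m, n) for m in range(1, cutoff + 1))
```

As a sanity check, with a single draw expected_mex(1) evaluates to 1.5, matching the hand computation (mex is 2 when the draw is 1, and 1 otherwise).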
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
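The random-permutation procedure described in the Turán hint can be prototyped directly; the 4-cycle below is our own example graph (not from the problem), for which the bound sum of 1/(d_v + 1) equals 4/3.

```python
import random
from itertools import combinations

# The hinted procedure: scan a uniformly random vertex order and keep a
# vertex iff none of its neighbours precedes it.  Example graph: a 4-cycle,
# where every degree is 2 and the bound sum 1/(d_v + 1) is 4/3.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}

def permutation_independent_set(adj, rng):
    order = list(adj)
    rng.shuffle(order)
    kept, seen = [], set()
    for v in order:
        if adj[v].isdisjoint(seen):  # no predecessor is a neighbour of v
            kept.append(v)
        seen.add(v)
    return kept

rng = random.Random(0)
sizes = []
for _ in range(20000):
    ind = permutation_independent_set(adj, rng)
    # the kept set is always independent, by construction
    assert all(v not in adj[u] for u, v in combinations(ind, 2))
    sizes.append(len(ind))
bound = sum(1 / (len(adj[v]) + 1) for v in adj)
mean = sum(sizes) / len(sizes)
```

The empirical mean of |I| concentrates near 4/3, matching E[|I|] = sum over v of Pr(v comes before all of its neighbours) = sum over v of 1/(d_v + 1), which is the key step of the proof.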
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12957</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12957"/>
		<updated>2025-03-13T15:31:45Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 2 (Distribution of random variable, 8 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; by &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability density function (PDF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possible) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find out the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
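A possible numerical sanity check for the Poisson-distribution problem above (an editorial sketch, not part of the assignment; the parameter values lam = 3 and p = 0.4 are arbitrary): simulate the two-stage experiment and confirm that the empirical mean of X is close to lam times p, which is what the thinned joint PMF predicts.

```python
import random
from math import exp

def poisson_sample(lam, rng):
    # Knuth's method: multiply uniforms until the product drops to exp(-lam)
    threshold, k, prod = exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate(lam, p, trials, seed=0):
    # empirical mean of X, the number of heads among a Poisson(lam) number of coins
    rng = random.Random(seed)
    total_heads = 0
    for _ in range(trials):
        n = poisson_sample(lam, rng)
        total_heads += sum(1 for _ in range(n) if rng.random() > 1 - p)
    return total_heads / trials

mean_x = simulate(3.0, 0.4, 20000)
print(mean_x)  # should be close to lam * p = 1.2
```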
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
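The Expected Mex problem above does call for a short program. One possible approach (an illustrative editorial sketch, not the official solution) exploits the memorylessness of Geo(1/2): among the a variables known to be at least v, each equals v independently with probability 1/2, so the number of variables "surviving" past each level follows a small dynamic program. The mex equals v exactly on the paths where all of 1..v-1 appeared and the binomial count at level v is zero.

```python
from math import comb

n = 100
# f maps a -> Pr(all of 1..v-1 have appeared and exactly a variables are still >= v)
f = {n: 1.0}
expected_mex, total_mass, v = 0.0, 0.0, 1
while f:
    g = {}
    for a, p in f.items():
        # among the a variables that are >= v, each equals v w.p. 1/2
        # (memorylessness), so the count hitting v is Binomial(a, 1/2);
        # a count of 0 means value v is missing, i.e. the mex is v
        miss = p * 0.5**a
        expected_mex += v * miss
        total_mass += miss
        for b in range(1, a + 1):
            g[a - b] = g.get(a - b, 0.0) + p * comb(a, b) * 0.5**a
    f, v = g, v + 1

print(expected_mex)  # total_mass should equal 1 up to floating-point error
```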
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournament&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Prove that there is a [https://en.wikipedia.org/wiki/Tournament_(graph_theory) tournament] &amp;lt;math&amp;gt;T&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; players and at least &amp;lt;math&amp;gt;n!/2^{n-1}&amp;lt;/math&amp;gt; Hamiltonian paths.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
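The random-permutation procedure in the Turán hint above is easy to run directly. The sketch below (an editorial illustration on an arbitrary small graph, not part of the assignment) builds I as described and compares the average size of I against the quantity sum over v of 1/(d_v + 1); for this procedure the expectation matches that sum exactly, since v is kept precisely when it comes first among its closed neighborhood.

```python
import random

def hint_independent_set(adj, rng):
    # the hint's procedure: draw a uniform random permutation and keep v
    # iff every neighbor of v comes later in the permutation
    order = list(adj)
    rng.shuffle(order)
    pos = {v: i for i, v in enumerate(order)}
    return [v for v in adj if all(pos[u] > pos[v] for u in adj[v])]

# arbitrary test graph: a 5-cycle with one chord (0-2); degrees 3,2,3,2,2
adj = {0: [1, 4, 2], 1: [0, 2], 2: [1, 3, 0], 3: [2, 4], 4: [3, 0]}
bound = sum(1.0 / (len(adj[v]) + 1) for v in adj)   # equals 1.5 here

rng = random.Random(1)
avg = sum(len(hint_independent_set(adj, rng)) for _ in range(5000)) / 5000.0
print(avg, bound)  # for this procedure, E|I| equals the bound exactly
```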
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12955</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12955"/>
		<updated>2025-03-13T15:28:48Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 12 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) the marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] It is required to place in order &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; books &amp;lt;math&amp;gt;B_1, B_2, \cdots, B_n&amp;lt;/math&amp;gt; on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that a random reader requires book &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; with probability &amp;lt;math&amp;gt;p_i&amp;lt;/math&amp;gt;, find the ordering of the books which minimizes the number of titles examined by a random reader before discovery of the required book.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
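The last warm-up identity (Wald's equation) is easy to check numerically. Below is an editorial sketch with arbitrary choices, not part of the assignment: N is uniform on {0,...,9}, independent of the X_i, which are uniform die rolls.

```python
import random

def wald_demo(trials, seed=0):
    # empirical check of E[sum_{i=1}^N X_i] = E[N] * E[X_1], with
    # N uniform on {0,...,9} (independent of the X_i) and X_i uniform on {1,...,6}
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        n = rng.randrange(10)
        total += sum(rng.randint(1, 6) for _ in range(n))
    return total / trials

# E[N] = 4.5 and E[X_1] = 3.5, so the empirical mean should approach 15.75
wald_mean = wald_demo(20000)
print(wald_mean)
```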
&lt;br /&gt;
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; by &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability mass function (PMF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
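For the PMF problem above, a quick numerical check is possible: a coin produces a second-round head only by showing heads twice in a row, which happens independently per coin with probability p squared, so one expects X to be Bin(n, p^2). The sketch below (an editorial aside, not part of the assignment; n = 5 and p = 0.3 are arbitrary) compares a simulated PMF of X against that binomial PMF.

```python
import random
from math import comb

def bin_pmf(n, q, k):
    return comb(n, k) * q**k * (1 - q) ** (n - k)

def simulate_X(n, p, trials, seed=0):
    # X = number of heads in the second round; a coin is tossed a second time
    # only if its first toss showed heads
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(trials):
        x = 0
        for _ in range(n):
            first_heads = rng.random() > 1 - p
            if first_heads and rng.random() > 1 - p:
                x += 1
        counts[x] += 1
    return [c / trials for c in counts]

emp = simulate_X(5, 0.3, 40000)
theo = [bin_pmf(5, 0.09, k) for k in range(6)]   # conjectured Bin(5, 0.3**2)
print(emp)
print(theo)
```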
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; (无放回) from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find out the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Dominating set&amp;lt;/strong&amp;gt;]&lt;br /&gt;
A &#039;&#039;dominating set&#039;&#039; of vertices in an undirected graph &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; is a set &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; such that every vertex of&lt;br /&gt;
&amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; belongs to &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; or has a neighbor in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be an &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;-vertex graph with minimum degree &amp;lt;math&amp;gt;d &amp;gt; 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; has a dominating set with at most &amp;lt;math&amp;gt;\frac{n\left(1+\log(d+1)\right)}{d+1}&amp;lt;/math&amp;gt; vertices. (Hint: Consider a random vertex subset &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; by including each vertex independently with&lt;br /&gt;
probability &amp;lt;math&amp;gt;p := \log(d + 1)/(d + 1)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
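The dominating-set hint above translates directly into a randomized procedure: sample S by independent coin flips with the suggested probability p, then patch in every vertex that is neither in S nor adjacent to S. The sketch below (an editorial illustration, not part of the assignment; the 3-regular cube graph Q3 is an arbitrary example) runs this and checks the average size against the claimed bound.

```python
import random
from math import log

def random_dominating_set(adj, rng):
    # the hint's two-step construction: random S, then add all undominated vertices
    d = min(len(adj[v]) for v in adj)          # minimum degree
    p = log(d + 1) / (d + 1)
    S = {v for v in adj if rng.random() > 1 - p}
    undominated = [v for v in adj if v not in S and S.isdisjoint(adj[v])]
    return S.union(undominated)

# arbitrary 3-regular example: the cube graph Q3 on vertices 0..7,
# where u and v are adjacent iff they differ in exactly one bit
adj = {v: [v ^ 2**b for b in range(3)] for v in range(8)}
n, d = 8, 3
bound = n * (1 + log(d + 1)) / (d + 1)

rng = random.Random(0)
sizes = [len(random_dominating_set(adj, rng)) for _ in range(4000)]
avg = sum(sizes) / len(sizes)
print(avg, bound)  # the expected size stays below the bound (about 4.77 here)
```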
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12954</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12954"/>
		<updated>2025-03-13T15:21:47Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 12 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless otherwise specified, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless otherwise specified, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) the marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Sorted books&#039;&#039;&#039;]&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
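The last identity above is Wald's identity, and it is easy to sanity-check numerically before proving it. A minimal Monte Carlo sketch in Python; the concrete choices &amp;lt;math&amp;gt;N \sim \mathrm{Bin}(10, 0.3)&amp;lt;/math&amp;gt; and die-roll &amp;lt;math&amp;gt;X_i&amp;lt;/math&amp;gt; are illustrative assumptions, not part of the problem statement:

```python
import random

# Monte Carlo sanity check of Wald's identity
#   E[ sum_{i=1}^N X_i ] = E[N] * E[X_1]
# for N independent of the i.i.d. X_i.  The choices N ~ Bin(10, 0.3)
# and X_i uniform on {1,...,6} are illustrative, not from the problem.
random.seed(0)

def sample_sum():
    n = sum(random.random() < 0.3 for _ in range(10))   # N ~ Bin(10, 0.3)
    return sum(random.randint(1, 6) for _ in range(n))  # X_1 + ... + X_N

trials = 200_000
est = sum(sample_sum() for _ in range(trials)) / trials
print(est)  # should land close to E[N] * E[X_1] = 3 * 3.5 = 10.5
```

The simulation, of course, is no substitute for the proof by conditioning on &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt;.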
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; in terms of &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability mass function (PMF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find out the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
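The hypergeometric-to-binomial limit in Problem 3 can be observed numerically before attempting the proof. A small Python sketch; the parameters &amp;lt;math&amp;gt;p = 0.3&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n = 5&amp;lt;/math&amp;gt; are illustrative assumptions, not fixed by the problem:

```python
from math import comb

def hypergeom_pmf(N, b, n, k):
    # Pr(B = k): k blue balls in a sample of n drawn without replacement
    # from an urn of N balls, b of them blue.
    return comb(b, k) * comb(N - b, n - k) / comb(N, n)

def binom_pmf(n, p, k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

p, n = 0.3, 5  # illustrative parameters
for N in (20, 200, 2000, 20000):
    b = int(p * N)           # keep b/N = p exactly
    gap = max(abs(hypergeom_pmf(N, b, n, k) - binom_pmf(n, p, k))
              for k in range(n + 1))
    print(N, gap)            # the gap shrinks roughly like 1/N
```

This matches the intuition that, for &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; much larger than &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;, drawing without replacement barely changes the color proportions.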
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of operations&amp;lt;/strong&amp;gt;]&lt;br /&gt;
We have a directed graph &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; without self-loops and multi-edges.&lt;br /&gt;
&lt;br /&gt;
Until the graph becomes empty, repeat the following operation:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
Choose one (unerased) vertex uniformly at random (independently from the previous choices). Then, erase that vertex and all vertices that are reachable from the chosen vertex by traversing some edges. Erasing a vertex will also erase the edges incident to it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;R_v&amp;lt;/math&amp;gt; denote the number of vertices from which vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; is reachable. Prove that the expected number of operations equals &amp;lt;math&amp;gt;\sum_{v \in V} \frac{1}{R_v}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
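For the &quot;Expected Mex&quot; problem, one workable exact approach (our own sketch, not necessarily the intended method) conditions on the Binomial&amp;lt;math&amp;gt;(r, 1/2)&amp;lt;/math&amp;gt; number of samples equal to &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; and uses the self-similarity of the geometric distribution, since the remaining samples are again i.i.d. &amp;lt;math&amp;gt;\mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; shifted by one:

```python
from math import comb

def expected_mex(n=100):
    # E[mex(X_1,...,X_n)] for i.i.d. X_i ~ Geo(1/2) on {1,2,...}.
    # Let f(m, r) = Pr(each of 1..m appears among r samples).
    # Conditioning on the Binomial(r, 1/2) count of the value 1, the other
    # samples are i.i.d. Geo(1/2) shifted by one, giving the recursion
    #   f(m, r) = 2^{-r} * sum_{k=1}^{r} C(r, k) * f(m-1, r-k),
    # and E[mex] = sum_{m >= 0} Pr(mex > m) = sum_{m >= 0} f(m, n).
    prev = [1.0] * (n + 1)        # f(0, r) = 1 for every r
    total = 1.0                   # the m = 0 term
    for m in range(1, n + 1):
        cur = [0.0] * (n + 1)     # f(m, 0) = 0
        for r in range(1, n + 1):
            s = sum(comb(r, k) * prev[r - k] for k in range(1, r + 1))
            cur[r] = s / 2**r
        total += cur[n]
        if cur[n] < 1e-12:        # remaining terms are negligible
            break
        prev = cur
    return total

print(expected_mex())  # the requested expectation for n = 100
```

All terms in the recursion are non-negative, so floating point loses no precision to cancellation and the truncation error is far below the required &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;.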
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Dominating set&amp;lt;/strong&amp;gt;]&lt;br /&gt;
A &#039;&#039;dominating set&#039;&#039; of vertices in an undirected graph &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; is a set &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; such that every vertex of&lt;br /&gt;
&amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; belongs to &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; or has a neighbor in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be an &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;-vertex graph with minimum degree &amp;lt;math&amp;gt;d &amp;gt; 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; has a dominating set with at most &amp;lt;math&amp;gt;\frac{n\left(1+\log(d+1)\right)}{d+1}&amp;lt;/math&amp;gt; vertices. (Hint: Consider a random vertex subset &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; by including each vertex independently with&lt;br /&gt;
probability &amp;lt;math&amp;gt;p := \log(d + 1)/(d + 1)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12950</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12950"/>
		<updated>2025-03-13T15:09:45Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 12 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless otherwise specified, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless otherwise specified, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) the marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] Is it generally true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]? Is it ever true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]?&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; in terms of &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;, show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability mass function (PMF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If the coin shows tails, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possibly) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only memoryless discrete distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find out the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
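Before proving the binomial-sum claim in Problem 3, it can be checked numerically: the convolution of the Bin(n1, p) and Bin(n2, p) mass functions should agree with the Bin(n1+n2, p) mass function. The parameters below are arbitrary test values:

```python
from math import comb

def binom_pmf(n, p, k):
    # PMF of Bin(n, p) at the point k
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n1, n2, p = 4, 7, 0.3
# Convolution: Pr(X + Y = k) = sum_j Pr(X = j) * Pr(Y = k - j)
conv = [
    sum(binom_pmf(n1, p, j) * binom_pmf(n2, p, k - j)
        for j in range(max(0, k - n2), min(n1, k) + 1))
    for k in range(n1 + n2 + 1)
]
direct = [binom_pmf(n1 + n2, p, k) for k in range(n1 + n2 + 1)]
print(max(abs(a - b) for a, b in zip(conv, direct)))
```

The agreement reflects Vandermonde's identity, which is also the key step in the pencil-and-paper proof.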
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
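For the "Expected Mex" item, a crude Monte Carlo estimate is a useful sanity check for whatever exact method you devise, although by itself it cannot reach the required 10^-6 accuracy; the seed and sample size below are arbitrary:

```python
import random

def geo_half(rng):
    # Geo(1/2) on {1, 2, ...}: number of tosses up to and including the first success
    k = 1
    while rng.random() > 0.5:
        k += 1
    return k

def mex(values):
    # smallest positive integer not appearing among the values
    seen = set(values)
    m = 1
    while m in seen:
        m += 1
    return m

rng = random.Random(1)
trials = 20_000
est = sum(mex([geo_half(rng) for _ in range(100)]) for _ in range(trials)) / trials
print(est)
```

An exact method still has to be designed separately; this estimate only tells you roughly where the true answer lies.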
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph with no isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Dominating set&amp;lt;/strong&amp;gt;]&lt;br /&gt;
A &#039;&#039;dominating set&#039;&#039; of vertices in an undirected graph &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; is a set &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; such that every vertex of&lt;br /&gt;
&amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; belongs to &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; or has a neighbor in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be an &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;-vertex graph with minimum degree &amp;lt;math&amp;gt;d &amp;gt; 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; has a dominating set with at most &amp;lt;math&amp;gt;\frac{n\left(1+\log(d+1)\right)}{d+1}&amp;lt;/math&amp;gt; vertices. (Hint: Consider a random vertex subset &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; by including each vertex independently with&lt;br /&gt;
probability &amp;lt;math&amp;gt;p := \log(d + 1)/(d + 1)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
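The random-permutation procedure in the Turán hint is only a few lines of code; averaging the size of the returned set over many permutations should be close to the sum of 1/(d_v+1). The small example graph below is an arbitrary choice:

```python
import random

# Adjacency sets of a small fixed graph on vertices 0..5 (arbitrary example).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}

def random_independent_set(adj, rng):
    order = list(adj)
    rng.shuffle(order)
    chosen, processed = set(), set()
    for v in order:
        # add v iff none of its predecessors in the permutation are neighbors of v
        if adj[v].isdisjoint(processed):
            chosen.add(v)
        processed.add(v)
    return chosen

rng = random.Random(2)
trials = 20_000
avg = sum(len(random_independent_set(adj, rng)) for _ in range(trials)) / trials
bound = sum(1 / (len(adj[v]) + 1) for v in adj)
print(avg, bound)
```

The returned set is independent by construction, which is a useful invariant to assert while experimenting.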
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12949</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12949"/>
		<updated>2025-03-13T15:09:14Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 12 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete solving process; both Chinese and English are acceptable.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we assume that the expectation of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Express the distribution functions of [math]X^+ = \max\{0,X\}[/math], [math]X^- = -\min\{0,X\}[/math], [math]|X|=X^+ + X^-[/math], [math]-X[/math], in terms of the distribution function [math]F[/math] of the random variable [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt; be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distributions. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] Is it generally true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]? Is it ever true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]?&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
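The last identity above (Wald's identity) is easy to sanity-check by simulation; the specific choices of N uniform on {0,...,9} and exponential X_i below are arbitrary distributions satisfying the hypotheses:

```python
import random

rng = random.Random(3)
mu, trials = 2.5, 100_000   # X_i are Exp(mu), so E[X_1] = 1/mu

total = 0.0    # running sum of X_1 + ... + X_N over all trials
count_n = 0    # running sum of N over all trials
for _ in range(trials):
    N = rng.randrange(10)   # N uniform on {0, ..., 9}, independent of the X_i
    total += sum(rng.expovariate(mu) for _ in range(N))
    count_n += N

lhs = total / trials                 # estimates E[X_1 + ... + X_N]
rhs = (count_n / trials) * (1 / mu)  # estimates E[N] * E[X_1]
print(lhs, rhs)
```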
&lt;br /&gt;
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; by &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of a random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and let &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;. Show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability mass function (PMF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows head is tossed again. (If the coin shows tail, it won&#039;t be tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possible) second round of each toss.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
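The claim that FG is the CDF of max{X,Z} for independent X and Z can be checked empirically; the particular distributions for X and Z below are arbitrary:

```python
import random

rng = random.Random(4)
n = 100_000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]    # X: standard normal
zs = [rng.expovariate(1.0) for _ in range(n)]   # Z: Exp(1), independent of X
ms = [max(x, z) for x, z in zip(xs, zs)]        # max{X, Z}

def ecdf(samples, t):
    # empirical CDF: fraction of samples not exceeding t
    return sum(t >= s for s in samples) / len(samples)

results = {}
for t in (0.5, 1.0, 2.0):
    results[t] = (ecdf(xs, t) * ecdf(zs, t), ecdf(ms, t))
print(results)
```

At each test point the product of the two empirical CDFs should match the empirical CDF of the maximum up to sampling noise.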
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only memoryless discrete distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find out the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
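For the Poisson coin problem, simulating the setup can suggest what to conjecture about the marginal of X and about independence before any calculation; the values of λ and p, the seed, and the summary statistics below are illustrative choices:

```python
import math
import random

def poisson(lam, rng):
    # Knuth's multiplicative method for sampling Pois(lam)
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(5)
lam, p, trials = 3.0, 0.4, 100_000

sx = sy = sxy = sxx = 0
for _ in range(trials):
    n = poisson(lam, rng)
    heads = sum(rng.random() > 1 - p for _ in range(n))  # each coin heads w.p. p
    tails = n - heads
    sx += heads
    sy += tails
    sxy += heads * tails
    sxx += heads * heads

mean_x = sx / trials
mean_y = sy / trials
cov_xy = sxy / trials - mean_x * mean_y
var_x = sxx / trials - mean_x ** 2
print(mean_x, mean_y, cov_xy, var_x)
```

Comparing mean_x with var_x, and inspecting cov_xy, is a good way to form conjectures for both parts before computing the joint mass function.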
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for some integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph with no isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Dominating set&amp;lt;/strong&amp;gt;]&lt;br /&gt;
A &#039;&#039;dominating set&#039;&#039; of vertices in an undirected graph &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; is a set &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; such that every vertex of&lt;br /&gt;
&amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; belongs to &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; or has a neighbor in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be an &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;-vertex graph with minimum degree &amp;lt;math&amp;gt;d &amp;gt; 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; has a dominating set with at most &amp;lt;math&amp;gt;\frac{n\left(1+\log(d+1)\right)}{d+1}&amp;lt;/math&amp;gt; vertices. (Hint: Consider a random vertex subset &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; by including each vertex independently with&lt;br /&gt;
probability &amp;lt;math&amp;gt;p := \log(d + 1)/(d + 1)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12948</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 2</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_2&amp;diff=12948"/>
		<updated>2025-03-13T15:06:23Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 1 (Warm-up problems, 12 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Each solution must show complete working; you may write in Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 2==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Warm-up problems, 12 points) ==&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (I)&#039;&#039;&#039;] Let [math]X[/math] be a random variable with distribution function [math]\max(0,\min(1,x))[/math]. Let [math]F[/math] be a distribution function which is continuous and strictly increasing. Show that [math]Y=F^{-1}(X)[/math] is a random variable with distribution function [math]F[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Function of random variable (II)&#039;&#039;&#039;] Express the distribution function of [math]X^+ = \max\{0,X\}[/math] in terms of the distribution function of [math]X[/math].&lt;br /&gt;
* [&#039;&#039;&#039;Independence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;1\leq r\leq n&amp;lt;/math&amp;gt;, be independent random variables which are symmetric about &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt;; that is, &amp;lt;math&amp;gt;X_r&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-X_r&amp;lt;/math&amp;gt; have the same distribution. Show that, for all &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{Pr}[S_n \geq x] = \mathbf{Pr}[S_n \leq -x]&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;S_n = \sum_{r=1}^n X_r&amp;lt;/math&amp;gt;. Is the conclusion true without the assumption of independence?&lt;br /&gt;
* [&#039;&#039;&#039;Dependence&#039;&#039;&#039;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be discrete random variables with joint mass function &amp;lt;math&amp;gt;f(x,y) = \frac{C}{(x+y-1)(x+y)(x+y+1)}&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;x,y \in \mathbb{N}_+&amp;lt;/math&amp;gt; (in other words, &amp;lt;math&amp;gt;x,y = 1,2,3,\cdots&amp;lt;/math&amp;gt;). Find (1) the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;, (2) marginal mass function of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and (3) &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt;.&lt;br /&gt;
* [&#039;&#039;&#039;Expectation&#039;&#039;&#039;] Is it generally true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]? Is it ever true that [math]\mathbf{E}[1/X] = 1/\mathbf{E}[X][/math]?&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Entropy of discrete random variable&amp;lt;/strong&amp;gt;] Let [math]X[/math] be a discrete random variable with range of values [math][N] = \{1,2,\ldots,N\}[/math] and probability mass function [math]p[/math]. Define [math]H(X) = -\sum_{n \ge 1} p(n) \log p(n)[/math] with convention [math]0\log 0 = 0[/math]. Prove that [math]H(X) \le \log N[/math] using Jensen&#039;s inequality.&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Law of total expectation&amp;lt;/strong&amp;gt;] Let [math]X \sim \mathrm{Geom}(p)[/math] for some parameter [math]p \in (0,1)[/math]. Calculate [math]\mathbf{E}[X][/math] using the law of total expectation.&lt;br /&gt;
&lt;br /&gt;
* [&amp;lt;strong&amp;gt;Random number of random variables&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be identically distributed random variables and &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; be a random variable taking values in the non-negative integers and independent of the &amp;lt;math&amp;gt;X_n&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}\left[\sum_{i=1}^N X_i\right] = \mathbf{E}[N] \mathbf{E}[X_1]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
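The inverse-transform warm-up above can be checked numerically: if X is uniform on [0,1] and F is a continuous, strictly increasing CDF, then F^{-1}(X) has CDF F. A minimal Python sketch (an editor's illustration, not part of the assignment), assuming the hypothetical choice F = Exp(1), so that F^{-1}(u) = -log(1-u):

```python
import math
import random

random.seed(0)

def inv_exp_cdf(u, lam=1.0):
    # Inverse of the Exp(lam) CDF F(x) = 1 - exp(-lam*x); a hypothetical choice of F.
    return -math.log(1.0 - u) / lam

# X uniform on [0,1] has CDF max(0, min(1, x)); then Y = F^{-1}(X) should be Exp(1).
samples = [inv_exp_cdf(random.random()) for _ in range(100000)]
mean = sum(samples) / len(samples)
print(round(mean, 2))  # should be close to the Exp(1) mean, which is 1
```

The empirical mean landing near 1 is consistent with the claim that Y has distribution function F.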
== Problem 2 (Distribution of random variable, 8 points) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Cumulative distribution function (CDF)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with cumulative distribution function &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;Y = aX+b&amp;lt;/math&amp;gt; is a random variable, where &amp;lt;math&amp;gt;a&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; are real constants, and express the CDF of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; in terms of &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. (Hint: Try expressing the event [math]Y=aX+b\le y[/math] by countably many set operations on the events defined on [math]X[/math].)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; be the CDF of a random variable &amp;lt;math&amp;gt;Z:\Omega\rightarrow \mathbb{R}&amp;lt;/math&amp;gt; and let &amp;lt;math&amp;gt;0\leq \lambda \leq 1&amp;lt;/math&amp;gt;. Show that &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;math&amp;gt;\lambda F + (1-\lambda)G&amp;lt;/math&amp;gt; is a CDF. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;The product &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is a CDF, and if &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; are independent, then &amp;lt;math&amp;gt;FG&amp;lt;/math&amp;gt; is the CDF of &amp;lt;math&amp;gt;\max\{X,Z\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Probability mass function (PMF)&amp;lt;/strong&amp;gt;] We toss &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; coins, and each one shows heads with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;, independently of each of the others. Each coin which shows heads is tossed again. (If a coin shows tails, it is not tossed again.) Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of heads resulting from the &amp;lt;strong&amp;gt;second&amp;lt;/strong&amp;gt; round of tosses, and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the number of heads resulting from &amp;lt;strong&amp;gt;all&amp;lt;/strong&amp;gt; tosses, which includes the first and (possible) second round of each coin.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find &amp;lt;math&amp;gt;\mathbf{E}[X]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{E}[Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Let &amp;lt;math&amp;gt;p_X&amp;lt;/math&amp;gt; be the PMF of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;, show that [math]p_X(k-1)p_X(k+1)\leq [p_X(k)]^2[/math] for &amp;lt;math&amp;gt;1\leq k \leq n-1&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
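The claim above that FG is the CDF of max{X,Z} for independent X and Z admits a quick numerical sanity check. A hedged Python sketch (an editor's illustration, not part of the assignment), with the hypothetical choice of X and Z uniform on [0,1], where the product CDF at t is t*t:

```python
import random

random.seed(5)

# If X and Z are independent with CDFs F and G, then max(X, Z) has CDF F*G:
# P(max at most t) equals P(X at most t) times P(Z at most t).
t = 0.7
trials = 50000
hits = 0
for _ in range(trials):
    m = max(random.random(), random.random())
    hits += int(m == min(m, t))  # adds 1 exactly when m is at most t
freq = hits / trials
print(round(freq, 2))  # should be close to t*t = 0.49
```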
&lt;br /&gt;
== Problem 3 (Discrete random variable) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (I)&amp;lt;/strong&amp;gt;] Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are [math]c[/math] different types of object, and each package is equally likely to contain any given type. You buy one package each day.&lt;br /&gt;
    &amp;lt;ol&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse between the acquisitions of the [math]j[/math]-th new type of object and the [math](j + 1)[/math]-th new type.&amp;lt;/li&amp;gt;&lt;br /&gt;
        &amp;lt;li&amp;gt;Find the expected number of days which elapse before you have a full set of objects.&amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Geometric distribution (II)&amp;lt;/strong&amp;gt;] Prove that the geometric distribution is the only discrete memoryless distribution taking values in &amp;lt;math&amp;gt;\mathbb{N}_+&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;n_1,n_2 \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;0 \le p \le 1&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Bin}(n_1,p),Y \sim \mathrm{Bin}(n_2,p)&amp;lt;/math&amp;gt; be independent random variables. Prove that &amp;lt;math&amp;gt;X+Y \sim \mathrm{Bin}(n_1+n_2,p)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Negative binomial distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; follow the negative binomial distribution with parameters &amp;lt;math&amp;gt;r \in \mathbb{N}_+&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt;. Calculate &amp;lt;math&amp;gt;\mathbf{Var}[X] = \mathbf{E}[X^2] - \left(\mathbf{E}[X]\right)^2&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Hypergeometric distribution&amp;lt;/strong&amp;gt;] An urn contains [math]N[/math] balls, [math]b[/math] of which are blue and [math]r = N -b[/math] of which are red. A random sample of [math]n[/math] balls is drawn &amp;lt;strong&amp;gt;without replacement&amp;lt;/strong&amp;gt; from the urn. Let [math]B[/math] be the number of blue balls in this sample. Show that if [math]N, b[/math], and [math]r[/math] approach [math]+\infty[/math] in such a way that [math]b/N \rightarrow  p[/math] and [math]r/N \rightarrow 1 - p[/math], then [math]\mathbf{Pr}(B = k) \rightarrow {n\choose k}p^k(1-p)^{n-k}[/math] for [math]0\leq k \leq n[/math].&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Poisson distribution&amp;lt;/strong&amp;gt;] In your pocket is a random number &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; of coins, where &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; has the Poisson distribution with parameter &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt;. You toss each coin once, with heads showing with probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; each time. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the (random) number of heads outcomes and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be the (also random) number of tails.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find the joint mass function of &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Find PMF of the marginal distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; in &amp;lt;math&amp;gt;(X,Y)&amp;lt;/math&amp;gt;. Are &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; independent?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent &amp;lt;math&amp;gt;\text{Bin}(n, p)&amp;lt;/math&amp;gt; random variables, and let &amp;lt;math&amp;gt;Z = X + Y&amp;lt;/math&amp;gt;. Show that the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;Z = N&amp;lt;/math&amp;gt; is the hypergeometric distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Conditional distribution (II)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;\lambda,\mu &amp;gt; 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;n \in \mathbb{N}&amp;lt;/math&amp;gt; be parameters, and &amp;lt;math&amp;gt;X \sim \mathrm{Pois}(\lambda), Y \sim \mathrm{Pois}(\mu)&amp;lt;/math&amp;gt; be independent random variables. Find the conditional distribution of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; given &amp;lt;math&amp;gt;X+Y = n&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
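The coupon-collector structure of Geometric distribution (I) above can be sanity-checked by simulation: the classical answers are c/(c-j) days between the j-th and (j+1)-th new type, and c(1 + 1/2 + ... + 1/c) days for a full set. A Python sketch (an editor's illustration, not part of the assignment; the value c = 5 is an arbitrary choice):

```python
import random

random.seed(1)

def days_to_full_set(c):
    # Buy one package per day until all c object types have been collected.
    seen = set()
    days = 0
    while len(seen) != c:
        seen.add(random.randrange(c))
        days += 1
    return days

c = 5
trials = 20000
avg = sum(days_to_full_set(c) for _ in range(trials)) / trials
# Theory: E[days] = c * (1 + 1/2 + ... + 1/c); for c = 5 this is 137/12.
expected = c * sum(1.0 / k for k in range(1, c + 1))
print(round(expected, 2), round(avg, 2))
```

The simulated average and the harmonic-sum formula should agree to within Monte Carlo noise.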
&lt;br /&gt;
== Problem 4 (Linearity of Expectation, 12 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Streak&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Suppose we flip a fair coin &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times independently to obtain a sequence of flips &amp;lt;math&amp;gt;X_1, X_2, \ldots , X_n&amp;lt;/math&amp;gt;. A streak of flips is a consecutive subsequence of flips that are all the same. For example, if &amp;lt;math&amp;gt;X_3&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;X_4&amp;lt;/math&amp;gt;, and &amp;lt;math&amp;gt;X_5&amp;lt;/math&amp;gt; are all heads, there is a streak of length &amp;lt;math&amp;gt;3&amp;lt;/math&amp;gt; starting at the third flip. (If &amp;lt;math&amp;gt;X_6&amp;lt;/math&amp;gt; is also heads, then there is also a streak of length &amp;lt;math&amp;gt;4&amp;lt;/math&amp;gt; starting at the third flip.) Find the expected number of streaks of length &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; for a given integer &amp;lt;math&amp;gt;k \ge 1&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Number of cycles&amp;lt;/strong&amp;gt;] &lt;br /&gt;
At a banquet, there are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people who shake hands according to the following process: In each round, two idle hands are randomly selected and shaken (&amp;lt;strong&amp;gt;these two hands are no longer idle&amp;lt;/strong&amp;gt;). After &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds, there will be no idle hands left, and the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; people will form several cycles. For example, when &amp;lt;math&amp;gt;n=3&amp;lt;/math&amp;gt;, the following situation may occur: the left and right hands of the first person are held together, the left hand of the second person and the right hand of the third person are held together, and the right hand of the second person and the left hand of the third person are held together. In this case, three people form two cycles. How many cycles are expected to be formed after &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Expected Mex&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X_1,X_2,\ldots,X_{100} \sim \mathrm{Geo}(1/2)&amp;lt;/math&amp;gt; be independent random variables. Compute &amp;lt;math&amp;gt;\mathbf{E}[\mathrm{mex}(X_1,X_2,\ldots,X_{100})]&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\mathrm{mex}(a_1,a_2,\ldots,a_n)&amp;lt;/math&amp;gt; is the smallest positive integer that does not appear in &amp;lt;math&amp;gt;a_1,a_2,\ldots,a_n&amp;lt;/math&amp;gt;. Your answer is considered correct if the absolute error does not exceed &amp;lt;math&amp;gt;10^{-6}&amp;lt;/math&amp;gt;. (Hint: Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
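For the Expected Mex problem above, plain Monte Carlo cannot reach the demanded 10^{-6} accuracy, but it does locate the answer. A hedged Python sketch (an editor's illustration only; an exact method, e.g. inclusion-exclusion over which values appear, would be needed for full precision):

```python
import random

random.seed(2)

def geo_half():
    # Geo(1/2) on {1, 2, ...}: number of fair-coin flips up to and including the first head.
    k = 1
    while random.getrandbits(1) == 0:
        k += 1
    return k

def mex(values):
    # Smallest positive integer that does not appear among the values.
    present = set(values)
    m = 1
    while m in present:
        m += 1
    return m

trials = 20000
est = sum(mex([geo_half() for _ in range(100)]) for _ in range(trials)) / trials
print(round(est, 2))  # rough Monte Carlo estimate of E[mex], in the vicinity of 7
```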
&lt;br /&gt;
== Problem 5 (Probability meets graph theory) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random social networks&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph without isolated vertices. &lt;br /&gt;
Let &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; be the degree of vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be a uniformly chosen vertex, and &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; a uniformly chosen neighbor of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Show that &amp;lt;math&amp;gt;\mathbf{E}[d_Z] \geq \mathbf{E}[d_Y]&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Interpret this inequality in the context of social networks, in which the vertices represent people, and the edges represent friendship.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Turán&#039;s Theorem&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;G=(V,E)&amp;lt;/math&amp;gt; be a &amp;lt;strong&amp;gt;fixed&amp;lt;/strong&amp;gt; undirected graph, and write &amp;lt;math&amp;gt;d_v&amp;lt;/math&amp;gt; for the degree of the vertex &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt;. Use the probabilistic method to prove that &amp;lt;math&amp;gt;\alpha(G) \ge \sum_{v \in V} \frac{1}{d_v + 1}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;\alpha(G)&amp;lt;/math&amp;gt; is the size of a maximum independent set. (Hint: Consider the following random procedure for generating an independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; from a graph with vertex set &amp;lt;math&amp;gt;V&amp;lt;/math&amp;gt;: First, generate a random permutation of the vertices, denoted as &amp;lt;math&amp;gt;v_1,v_2,\ldots,v_n&amp;lt;/math&amp;gt;. Then, construct the independent set &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; as follows: For each vertex &amp;lt;math&amp;gt;v_i \in V&amp;lt;/math&amp;gt;, add &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;I&amp;lt;/math&amp;gt; if and only if none of its predecessors in the permutation, i.e., &amp;lt;math&amp;gt;v_1,\ldots,v_{i-1}&amp;lt;/math&amp;gt;, are neighbors of &amp;lt;math&amp;gt;v_i&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Dominating set&amp;lt;/strong&amp;gt;]&lt;br /&gt;
A &#039;&#039;dominating set&#039;&#039; of vertices in an undirected graph &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; is a set &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; such that every vertex of&lt;br /&gt;
&amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; belongs to &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; or has a neighbor in &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;G = (V, E)&amp;lt;/math&amp;gt; be an &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt;-vertex graph with minimum degree &amp;lt;math&amp;gt;d &amp;gt; 1&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;G&amp;lt;/math&amp;gt; has a dominating set with at most &amp;lt;math&amp;gt;\frac{n\left(1+\log(d+1)\right)}{d+1}&amp;lt;/math&amp;gt; vertices. (Hint: Consider a random vertex subset &amp;lt;math&amp;gt;S \subseteq V&amp;lt;/math&amp;gt; by including each vertex independently with&lt;br /&gt;
probability &amp;lt;math&amp;gt;p := \log(d + 1)/(d + 1)&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
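The random-permutation procedure in the hint for Turán's Theorem is straightforward to implement and test. A Python sketch (an editor's illustration, not part of the assignment; the 5-cycle test graph is an arbitrary choice) that keeps a vertex exactly when all of its neighbors come later in a uniformly random order:

```python
import random

random.seed(3)

def hint_independent_set(adj):
    # The hint's procedure: draw a uniformly random order of the vertices and
    # keep v iff every neighbor of v comes later in the order.  The kept set is
    # independent, and each v survives with probability 1/(d_v + 1).
    order = list(adj)
    random.shuffle(order)
    pos = {v: i for i, v in enumerate(order)}
    return {v for v in adj if all(min(pos[v], pos[u]) == pos[v] for u in adj[v])}

# Hypothetical test graph: a 5-cycle, where every degree is 2 and the bound is 5/3.
adj = {i: {(i + 1) % 5, (i + 4) % 5} for i in range(5)}
trials = 20000
avg = sum(len(hint_independent_set(adj)) for _ in range(trials)) / trials
bound = sum(1.0 / (len(adj[v]) + 1) for v in adj)
print(round(bound, 3), round(avg, 3))  # the average should be close to the bound 5/3
```

On the 5-cycle the expected size of the kept set equals the bound exactly, which is what the simulation should reproduce.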
&lt;br /&gt;
== Problem 6 (1D random walk, 8 points) ==&lt;br /&gt;
Let &amp;lt;math&amp;gt;p \in (0,1)&amp;lt;/math&amp;gt; be a constant, and &amp;lt;math&amp;gt;\{X_n\}_{n \ge 1}&amp;lt;/math&amp;gt; be independent Bernoulli trials with success probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;.&lt;br /&gt;
Define &amp;lt;math&amp;gt;S_n = 2\sum_{i=1}^n X_i - n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;S_0 = 0&amp;lt;/math&amp;gt;.&lt;br /&gt;
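As a numerical companion to the problems below (an editor's illustration in Python, not part of the assignment), the walk can be simulated for p = 1/2; the ratio E|S_n|/sqrt(n) should settle near sqrt(2/pi), about 0.798, consistent with E|S_n| = Theta(sqrt(n)):

```python
import math
import random

random.seed(4)

def abs_mean_symmetric(n, trials=10000):
    # Estimate E|S_n| for p = 1/2: each step of S is +1 or -1 with equal probability.
    total = 0
    for _ in range(trials):
        s = 0
        for _ in range(n):
            s += 2 * random.getrandbits(1) - 1
        total += abs(s)
    return total / trials

ratios = [abs_mean_symmetric(n) / math.sqrt(n) for n in (100, 400)]
print([round(r, 2) for r in ratios])  # both ratios should sit near 0.8
```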
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Range of random walk&amp;lt;/strong&amp;gt;] The range &amp;lt;math&amp;gt;R_n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;S_0, S_1, \ldots, S_n&amp;lt;/math&amp;gt; is defined as the number of distinct values taken by the sequence. Show that &amp;lt;math&amp;gt;\mathbf{Pr}\left(R_n = R_{n-1}+1\right) = \mathbf{Pr}\left(\forall 1 \le i \le n, S_i \neq 0\right)&amp;lt;/math&amp;gt;, and deduce that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n]\to \mathbf{Pr}(\forall i \ge 1, S_i \neq 0)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;. Hence show that &amp;lt;math&amp;gt;n^{-1} \mathbf{E}[R_n] \to |2p-1|&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Symmetric 1D random walk (III)&amp;lt;/strong&amp;gt;] Suppose &amp;lt;math&amp;gt;p = \frac{1}{2}&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;\mathbf{E}[|S_n|] = \Theta(\sqrt{n})&amp;lt;/math&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/%E7%AC%AC%E4%B8%80%E6%AC%A1%E4%BD%9C%E4%B8%9A%E6%8F%90%E4%BA%A4%E5%90%8D%E5%8D%95&amp;diff=12937</id>
		<title>概率论与数理统计 (Spring 2025)/第一次作业提交名单</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/%E7%AC%AC%E4%B8%80%E6%AC%A1%E4%BD%9C%E4%B8%9A%E6%8F%90%E4%BA%A4%E5%90%8D%E5%8D%95&amp;diff=12937"/>
		<updated>2025-03-10T12:09:00Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: Created page with &amp;quot;如有错漏邮件请及时联系助教。 &amp;lt;center&amp;gt; {| class=&amp;quot;wikitable&amp;quot; |- ! 学号 !! 姓名 |- | 231200035 || 葛翰飞  |- | 231880394 || 翟笑晨  |- | 231840166 || 赵少凡  |- | 231880123 || 陆博文  |- | 231840058 || 王润希  |- | 231240002 || 余孟凡  |- | 231240060 || 侯佳松  |- | 231240055 || 魏全烨  |- | 231098046 || 李方泽  |- | 231240015 || 邱文宣  |- | 231240011 || 陈力琰  |- | 231240038 || 肖金宇  |- | 231230102 || 庄铸锴  |-...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;If there are any errors or omissions, please contact the TA by email promptly.&lt;br /&gt;
&amp;lt;center&amp;gt;&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Student ID !! Name&lt;br /&gt;
|-&lt;br /&gt;
| 231200035 || 葛翰飞 &lt;br /&gt;
|-&lt;br /&gt;
| 231880394 || 翟笑晨 &lt;br /&gt;
|-&lt;br /&gt;
| 231840166 || 赵少凡 &lt;br /&gt;
|-&lt;br /&gt;
| 231880123 || 陆博文 &lt;br /&gt;
|-&lt;br /&gt;
| 231840058 || 王润希 &lt;br /&gt;
|-&lt;br /&gt;
| 231240002 || 余孟凡 &lt;br /&gt;
|-&lt;br /&gt;
| 231240060 || 侯佳松 &lt;br /&gt;
|-&lt;br /&gt;
| 231240055 || 魏全烨 &lt;br /&gt;
|-&lt;br /&gt;
| 231098046 || 李方泽 &lt;br /&gt;
|-&lt;br /&gt;
| 231240015 || 邱文宣 &lt;br /&gt;
|-&lt;br /&gt;
| 231240011 || 陈力琰 &lt;br /&gt;
|-&lt;br /&gt;
| 231240038 || 肖金宇 &lt;br /&gt;
|-&lt;br /&gt;
| 231230102 || 庄铸锴 &lt;br /&gt;
|-&lt;br /&gt;
| 231880140 || 桂天麟 &lt;br /&gt;
|-&lt;br /&gt;
| 231880152 || 李佳骏 &lt;br /&gt;
|-&lt;br /&gt;
| 231240047 || 姜淮仁 &lt;br /&gt;
|-&lt;br /&gt;
| 231240018 || 渠翔凯 &lt;br /&gt;
|-&lt;br /&gt;
| 231240029 || 朱非凡 &lt;br /&gt;
|-&lt;br /&gt;
| 231840242 || 李昊凌 &lt;br /&gt;
|-&lt;br /&gt;
| 231240056 || 靳濡搏 &lt;br /&gt;
|-&lt;br /&gt;
| 231240007 || 高亮节 &lt;br /&gt;
|-&lt;br /&gt;
| 231220001 || 陈梓睿 &lt;br /&gt;
|-&lt;br /&gt;
| 231240058 || 陈小川 &lt;br /&gt;
|-&lt;br /&gt;
| 231240019 || 邵与乔 &lt;br /&gt;
|-&lt;br /&gt;
| 231240025 || 李熠辰 &lt;br /&gt;
|-&lt;br /&gt;
| 231240004 || 何梓杨 &lt;br /&gt;
|-&lt;br /&gt;
| 231240051 || 王昱霏 &lt;br /&gt;
|-&lt;br /&gt;
| 231240023 || 叶宸见 &lt;br /&gt;
|-&lt;br /&gt;
| 231240045 || 杨俊炜 &lt;br /&gt;
|-&lt;br /&gt;
| 231240046 || 陈昊奕 &lt;br /&gt;
|-&lt;br /&gt;
| 231220005 || 樊书岩 &lt;br /&gt;
|-&lt;br /&gt;
| 231240013 || 南晨曦 &lt;br /&gt;
|-&lt;br /&gt;
| 231240024 || 陈越 &lt;br /&gt;
|-&lt;br /&gt;
| 231240021 || 陈奕帆 &lt;br /&gt;
|-&lt;br /&gt;
| 231240043 || 朱宇曦 &lt;br /&gt;
|-&lt;br /&gt;
| 231830123 || 刘逸杨 &lt;br /&gt;
|-&lt;br /&gt;
| 231240016 || 徐冰冰 &lt;br /&gt;
|-&lt;br /&gt;
| 231220002 || 潘谟天 &lt;br /&gt;
|-&lt;br /&gt;
| 231240027 || 彭浩楠 &lt;br /&gt;
|-&lt;br /&gt;
| 231240053 || 王艺文 &lt;br /&gt;
|-&lt;br /&gt;
| 231220090 || 邓淇 &lt;br /&gt;
|-&lt;br /&gt;
| 231240009 || 陈心怡 &lt;br /&gt;
|-&lt;br /&gt;
| 211250032 || 吕卉森 &lt;br /&gt;
|-&lt;br /&gt;
| 231240026 || 蒙航 &lt;br /&gt;
|-&lt;br /&gt;
| 211840320 || 华徐涛 &lt;br /&gt;
|-&lt;br /&gt;
| 231880432 || 向灿 &lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12936</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12936"/>
		<updated>2025-03-10T12:08:11Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = Office&lt;br /&gt;
|data4   = CS Building 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = Office&lt;br /&gt;
|data7   = CS Building 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (odd weeks), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Friday, 2pm-3pm &amp;lt;br&amp;gt;CS Building 804 (尹一通)&amp;lt;br&amp;gt;CS Building 516 (刘景铖)&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039; (Introduction to Probability, 2nd ed., revised)&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; translated by 郑忠国 and 童行伟; People&#039;s Posts and Telecommunications Press (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** 周一：10am-12pm，仙Ⅰ-204&lt;br /&gt;
** 周三（单）：10am-12pm，仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* 周五 2pm-3pm, 计算机系 804（尹一通）&lt;br /&gt;
:* 周五 2pm-3pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ群&#039;&#039;&#039;: 974742320（申请加入需提供姓名、院系、学号）&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
课程内容分为三大部分：&lt;br /&gt;
* &#039;&#039;&#039;经典概率论&#039;&#039;&#039;：概率空间、随机变量及其数字特征、多维与连续随机变量、极限定理等内容&lt;br /&gt;
* &#039;&#039;&#039;概率与计算&#039;&#039;&#039;：测度集中现象 (concentration of measure)、概率法 (the probabilistic method)、离散随机过程的相关专题&lt;br /&gt;
* &#039;&#039;&#039;数理统计&#039;&#039;&#039;：参数估计、假设检验、贝叶斯估计、线性回归等统计推断概念&lt;br /&gt;
&lt;br /&gt;
对于第一和第二部分，要求清楚掌握基本概念，深刻理解关键的现象与规律以及背后的原理，并可以灵活运用所学方法求解相关问题。对于第三部分，要求熟悉数理统计的若干基本概念，以及典型的统计模型和统计推断问题。&lt;br /&gt;
&lt;br /&gt;
经过本课程的训练，力求使学生能够熟悉掌握概率的语言，并会利用概率思维来理解客观世界并对其建模，以及驾驭概率的数学工具来分析和求解专业问题。&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论（第2版·修订版），[美]伯特瑟卡斯（Dimitri P. Bertsekas）[美]齐齐克利斯（John N. Tsitsiklis）著，郑忠国 童行伟 译，人民邮电出版社（2022）。&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* 课程成绩：本课程将会有若干次作业和一次期末考试。最终成绩将由平时作业成绩和期末考试成绩综合得出。&lt;br /&gt;
* 迟交：如果有特殊的理由，无法按时完成作业，请提前联系授课老师，给出正当理由。否则迟交的作业将不被接受。&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
学术诚信是所有从事学术活动的学生和学者最基本的职业道德底线，本课程将不遗余力地维护学术诚信规范，违反这一底线的行为将不会被容忍。&lt;br /&gt;
&lt;br /&gt;
作业完成的原则：署你名字的工作必须是你个人的贡献。在完成作业的过程中，允许讨论，前提是讨论的所有参与者均处于同等完成度。但关键想法的执行、以及作业文本的写作必须独立完成，并在作业中致谢（acknowledge）所有参与讨论的人。符合规则的讨论与致谢将不会影响得分。不允许其他任何形式的合作——尤其是与已经完成作业的同学“讨论”。&lt;br /&gt;
&lt;br /&gt;
本课程将对剽窃行为采取零容忍的态度。在完成作业过程中，对他人工作（出版物、互联网资料、其他人的作业等）直接的文本抄袭和对关键思想、关键元素的抄袭，按照 [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]的解释，都将视为剽窃。剽窃者成绩将被取消。如果发现互相抄袭行为，&amp;lt;font color=red&amp;gt; 抄袭和被抄袭双方的成绩都将被取消&amp;lt;/font&amp;gt;。因此请主动防止自己的作业被他人抄袭。&lt;br /&gt;
&lt;br /&gt;
学术诚信影响学生个人的品行，也关乎整个教育系统的正常运转。为了一点分数而做出学术不端的行为，不仅使自己沦为一个欺骗者，也使他人的诚实努力失去意义。让我们一起努力维护一个诚信的环境。&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 1|Problem Set 1]]  请在 2025/3/10 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2025)/第一次作业提交名单|第一次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Intro.pdf 课程简介]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/ProbSpace.pdf 概率空间]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第1章&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2025)/Entropy and volume of Hamming balls|Entropy and volume of Hamming balls]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/RandVar.pdf 随机变量]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第2章&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 2, Sections 3.1~3.5, 3.7&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 2&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
**[https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12911</id>
		<title>概率论与数理统计 (Spring 2025)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)&amp;diff=12911"/>
		<updated>2025-02-21T10:39:23Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (单), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Friday, 2pm-3pm &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2025. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** 周一：10am-12pm，仙Ⅰ-204&lt;br /&gt;
** 周三（单）：10am-12pm，仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* 周五 2pm-3pm, 计算机系 804（尹一通）&lt;br /&gt;
:* 周五 2pm-3pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ群&#039;&#039;&#039;: 974742320（申请加入需提供姓名、院系、学号）&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
课程内容分为三大部分：&lt;br /&gt;
* &#039;&#039;&#039;经典概率论&#039;&#039;&#039;：概率空间、随机变量及其数字特征、多维与连续随机变量、极限定理等内容&lt;br /&gt;
* &#039;&#039;&#039;概率与计算&#039;&#039;&#039;：测度集中现象 (concentration of measure)、概率法 (the probabilistic method)、离散随机过程的相关专题&lt;br /&gt;
* &#039;&#039;&#039;数理统计&#039;&#039;&#039;：参数估计、假设检验、贝叶斯估计、线性回归等统计推断概念&lt;br /&gt;
&lt;br /&gt;
对于第一和第二部分，要求清楚掌握基本概念，深刻理解关键的现象与规律以及背后的原理，并可以灵活运用所学方法求解相关问题。对于第三部分，要求熟悉数理统计的若干基本概念，以及典型的统计模型和统计推断问题。&lt;br /&gt;
&lt;br /&gt;
经过本课程的训练，力求使学生能够熟悉掌握概率的语言，并会利用概率思维来理解客观世界并对其建模，以及驾驭概率的数学工具来分析和求解专业问题。&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论（第2版·修订版），[美]伯特瑟卡斯（Dimitri P. Bertsekas）[美]齐齐克利斯（John N. Tsitsiklis）著，郑忠国 童行伟 译，人民邮电出版社（2022）。&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* 课程成绩：本课程将会有若干次作业和一次期末考试。最终成绩将由平时作业成绩和期末考试成绩综合得出。&lt;br /&gt;
* 迟交：如果有特殊的理由，无法按时完成作业，请提前联系授课老师，给出正当理由。否则迟交的作业将不被接受。&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
学术诚信是所有从事学术活动的学生和学者最基本的职业道德底线，本课程将不遗余力地维护学术诚信规范，违反这一底线的行为将不会被容忍。&lt;br /&gt;
&lt;br /&gt;
作业完成的原则：署你名字的工作必须是你个人的贡献。在完成作业的过程中，允许讨论，前提是讨论的所有参与者均处于同等完成度。但关键想法的执行、以及作业文本的写作必须独立完成，并在作业中致谢（acknowledge）所有参与讨论的人。符合规则的讨论与致谢将不会影响得分。不允许其他任何形式的合作——尤其是与已经完成作业的同学“讨论”。&lt;br /&gt;
&lt;br /&gt;
本课程将对剽窃行为采取零容忍的态度。在完成作业过程中，对他人工作（出版物、互联网资料、其他人的作业等）直接的文本抄袭和对关键思想、关键元素的抄袭，按照 [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]的解释，都将视为剽窃。剽窃者成绩将被取消。如果发现互相抄袭行为，&amp;lt;font color=red&amp;gt; 抄袭和被抄袭双方的成绩都将被取消&amp;lt;/font&amp;gt;。因此请主动防止自己的作业被他人抄袭。&lt;br /&gt;
&lt;br /&gt;
学术诚信影响学生个人的品行，也关乎整个教育系统的正常运转。为了一点分数而做出学术不端的行为，不仅使自己沦为一个欺骗者，也使他人的诚实努力失去意义。让我们一起努力维护一个诚信的环境。&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2025)/Problem Set 1|Problem Set 1]]  请在 2025/3/10 上课之前(10am UTC+8)提交到 [mailto:pr_nju@163.com pr_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/Intro.pdf 课程简介]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2025/ProbSpace.pdf 概率空间]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第1章&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;br /&gt;
* Concentration of measures:&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chernoff_bound Chernoff bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hoeffding%27s_inequality Hoeffding&#039;s bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale#McDiarmid&#039;s_inequality McDiarmid&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Doob_martingale Doob martingale] and [https://en.wikipedia.org/wiki/Azuma%27s_inequality Azuma&#039;s inequality]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sub-Gaussian_distribution Sub-Gaussian distribution]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_process Stochastic process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stopping_time Stopping time]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Martingale_(probability_theory) Martingale]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Optional_stopping_theorem Optional stopping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Wald%27s_equation Wald&#039;s equation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov_chain Markov chain]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Markov_property Markov property] and [https://en.wikipedia.org/wiki/Memorylessness memorylessness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stochastic_matrix Stochastic matrix]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Stationary_distribution Stationary distribution]&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12901</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12901"/>
		<updated>2025-02-19T12:37:15Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assumption throughout Problem Set 1 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability of a uniform random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; to be surjective (满射) is &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Coprime integers&amp;lt;/strong&amp;gt;] Given positive integers &amp;lt;math&amp;gt;n \ge 2&amp;lt;/math&amp;gt;, calculate the number of integer pairs &amp;lt;math&amp;gt;(x,y)&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;1 \le x &amp;lt; y \le n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathrm{gcd}(x,y) = 1&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Remark: This suggests that the probability that two randomly chosen integers from &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; are coprime tends to &amp;lt;math&amp;gt;\frac{6}{\pi^2}&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: This is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You can try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-1-classic-examples&amp;quot;&amp;gt;Problem 2 (Symmetric 1D random walk)&amp;lt;/h2&amp;gt;&lt;br /&gt;
A gambler plays a fair gambling game: at each round, he flips a fair coin, earns &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point if it lands HEADS, and loses &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point otherwise.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;A_i&amp;lt;/math&amp;gt; be the event that the gambler earns &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; points after playing &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds of the game, that is, the number of times the coin lands on heads is equal to the number of times it lands on tails. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i)&amp;lt;/math&amp;gt;. (Hint: You may use Stirling&#039;s approximation to estimate [math]\mathbf{Pr}(A_i)[/math] and derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (II)&amp;lt;/strong&amp;gt;] Suppose that the game ends once the gambler has lost all his &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; points. Let &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; be the event that the game ends within &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i})&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Hint: You may first consider [math] \displaystyle{m = 1}[/math] case. Let [math]\displaystyle{C_i}[/math] be the event that the game ends at the [math]\displaystyle{i}[/math]-th round. (i) Prove that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = \sum_{i=1}^{+\infty} (i-1) \mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;. (ii) Compute &amp;lt;math&amp;gt;\mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;, which is a special case of the ballot problem.  (iii) Finally, use Stirling&#039;s approximation to derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-probability-space&amp;quot;&amp;gt;Problem 3 (Probability space)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, that there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for any interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (Actually, such a probability measure does exist and is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;). &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt; and let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; such that either &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (the complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable; define &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field?&amp;lt;/strong&amp;gt;] A set &amp;lt;math&amp;gt;A \subseteq \mathbb{N}&amp;lt;/math&amp;gt; is said to have asymptotic density &amp;lt;math&amp;gt;\theta&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\lim_{n \to \infty} |A \cap \{1,2,\ldots,n\}| / n = \theta&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; be the collection of sets for which the asymptotic density exists. Is &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that events &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns, of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is chosen to be removed with equal probability). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each time uniformly and independently at random, without replacement (which means we do not put the chosen ball back after each drawing). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let&amp;amp;#39;s consider a series of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; outputs &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability  &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she told you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial while the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trial is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find such &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcomes most probable, namely you need to solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A [[wikipedia:Tournament_(graph_theory)|tournament]] can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once, and no draws occur. Given two players (vertices) &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; (and vice versa). We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. For example, a triangle &amp;lt;math&amp;gt;x\rightarrow y \rightarrow z \rightarrow x&amp;lt;/math&amp;gt; has property &amp;lt;math&amp;gt;S_1&amp;lt;/math&amp;gt; but not &amp;lt;math&amp;gt;S_2&amp;lt;/math&amp;gt;. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has the property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12900</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12900"/>
		<updated>2025-02-19T12:35:40Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assumption throughout Problem Set 1 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*每道题目的解答都要有完整的解题过程，中英文不限。&lt;br /&gt;
&lt;br /&gt;
*我们推荐大家使用LaTeX, markdown等对作业进行排版。&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability of a uniform random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; to be surjective (满射) is &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Coprime integers&amp;lt;/strong&amp;gt;] Given positive integers &amp;lt;math&amp;gt;n \ge 2&amp;lt;/math&amp;gt;, calculate the number of integer pairs &amp;lt;math&amp;gt;(x,y)&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;1 \le x &amp;lt; y \le n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathrm{gcd}(x,y) = 1&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Remark: This suggests that the probability that two randomly chosen integers from &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; are coprime tends to &amp;lt;math&amp;gt;\frac{6}{\pi^2}&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: This is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You can try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-1-classic-examples&amp;quot;&amp;gt;Problem 2 (Symmetric 1D random walk)&amp;lt;/h2&amp;gt;&lt;br /&gt;
A gambler plays a fair gambling game: at each round, he flips a fair coin, earns &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point if it lands HEADS, and loses &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point otherwise.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;A_i&amp;lt;/math&amp;gt; be the event that the gambler earns &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; points after playing &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds of the game, that is, the number of times the coin lands on heads is equal to the number of times it lands on tails. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i)&amp;lt;/math&amp;gt;. (Hint: You may use Stirling&#039;s approximation to estimate [math]\mathbf{Pr}(A_i)[/math] and derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (II)&amp;lt;/strong&amp;gt;] Suppose that the game ends once the gambler has lost all his &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; points. Let &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; be the event that the game ends within &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i})&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Hint: You may first consider [math] \displaystyle{m = 1}[/math] case. Let [math]\displaystyle{C_i}[/math] be the event that the game ends at the [math]\displaystyle{i}[/math]-th round. (i) Prove that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = \sum_{i=1}^{+\infty} (i-1) \mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;. (ii) Compute &amp;lt;math&amp;gt;\mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;, which is a special case of the ballot problem.  (iii) Finally, use Stirling&#039;s approximation to derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-probability-space&amp;quot;&amp;gt;Problem 3 (Probability space)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, that there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for any interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (Actually, such a probability measure does exist and is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;). &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt; and let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; such that either &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (the complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable; define &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field?&amp;lt;/strong&amp;gt;] A set &amp;lt;math&amp;gt;A \subseteq \mathbb{N}&amp;lt;/math&amp;gt; is said to have asymptotic density &amp;lt;math&amp;gt;\theta&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\lim_{n \to \infty} |A \cap \{1,2,\ldots,n\}| / n = \theta&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; be the collection of sets for which the asymptotic density exists. Is &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
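A concrete finite example can make the last three items less abstract. The sketch below is our own illustration (the names and the choice of Ω = {1, 2, 3} are ours, not part of the problem set): it builds the σ-fields generated by {1} and by {2} and checks by brute force that their union fails closure under union. On a finite Ω, checking complements and pairwise unions suffices.

```python
# Brute-force check on a tiny sample space (illustration only):
# the union of two sigma-fields need not be a sigma-field.

def is_sigma_field(omega, fam):
    """Check the sigma-field axioms over a finite sample space."""
    omega = frozenset(omega)
    fam = set(fam)
    if omega not in fam:
        return False
    if any(omega - a not in fam for a in fam):   # closed under complement
        return False
    # on a finite Omega, closure under pairwise union implies closure
    # under countable union
    return all(a | b in fam for a in fam for b in fam)

omega = {1, 2, 3}
F = {frozenset(), frozenset({1}), frozenset({2, 3}), frozenset(omega)}  # sigma({1})
G = {frozenset(), frozenset({2}), frozenset({1, 3}), frozenset(omega)}  # sigma({2})

assert is_sigma_field(omega, F) and is_sigma_field(omega, G)
# {1} and {2} both lie in F ∪ G, but their union {1, 2} does not:
assert not is_sigma_field(omega, F | G)
```

Here `F | G` is the set union of the two families; the axiom that fails is exactly closure under union, which is the point of the exercise.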
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that event &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is equally likely to be removed). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each draw uniformly at random from the balls remaining in the urn, without replacement (which means we do not put the drawn ball back). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&#039;&#039;&#039;Noisy channel&#039;&#039;&#039;] When coded messages are sent, there are sometimes errors in transmission. In particular, Morse code uses &amp;quot;dots&amp;quot; and &amp;quot;dashes&amp;quot;, which are known to occur in the proportion of 3:4. In other words, the probability of a &amp;quot;dot&amp;quot; being sent is &amp;lt;math&amp;gt;3/7&amp;lt;/math&amp;gt;, and the probability of a &amp;quot;dash&amp;quot; being sent is &amp;lt;math&amp;gt;4/7&amp;lt;/math&amp;gt;. Suppose due to interference on the transmission channel, with probability &amp;lt;math&amp;gt;1/8&amp;lt;/math&amp;gt;, a dot will be mistakenly received as a dash, and vice versa. If we receive a dot, what is the probability that the transmitted symbol is indeed a dot? &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
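The noisy-channel question is a direct application of Bayes' rule: Pr(dot sent | dot received) = Pr(dot) Pr(no flip) / [Pr(dot) Pr(no flip) + Pr(dash) Pr(flip)]. A short exact-arithmetic check (our own sketch, not a required solution method):

```python
from fractions import Fraction

# Exact Bayes computation for the Morse-code channel (illustration only).
p_dot = Fraction(3, 7)    # prior probability a "dot" is sent
p_dash = Fraction(4, 7)   # prior probability a "dash" is sent
p_flip = Fraction(1, 8)   # probability the channel flips the symbol

# Pr(dot sent | dot received) by Bayes' rule
posterior = (p_dot * (1 - p_flip)) / (p_dot * (1 - p_flip) + p_dash * p_flip)
assert posterior == Fraction(21, 25)
```

Using `Fraction` keeps the computation exact, so the posterior 21/25 comes out as a clean rational rather than a rounded float.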
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Consider the sequence of outcomes &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she told you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial while the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trial is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find the &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcomes most probable; that is, solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
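Following the hint in the first item (n = 2, p = 1/2), a standard construction takes A = {X₁ = 1}, B = {X₂ = 1}, and C = {X₁ = X₂}. The enumeration below is our own illustration of that construction (it does not replace the required proof): it confirms the three events are pairwise independent but not mutually independent.

```python
from itertools import product
from fractions import Fraction

# Enumerate two fair Bernoulli trials; every outcome has probability 1/4.
outcomes = list(product([0, 1], repeat=2))

def pr(event):
    """Exact probability of an event (a predicate on outcomes)."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

A = lambda w: w[0] == 1        # first trial succeeds
B = lambda w: w[1] == 1        # second trial succeeds
C = lambda w: w[0] == w[1]     # the two trials agree

# pairwise independent:
assert pr(lambda w: A(w) and B(w)) == pr(A) * pr(B)
assert pr(lambda w: A(w) and C(w)) == pr(A) * pr(C)
assert pr(lambda w: B(w) and C(w)) == pr(B) * pr(C)
# ...but not mutually independent: A and B together determine C
assert pr(lambda w: A(w) and B(w) and C(w)) != pr(A) * pr(B) * pr(C)
```

The failure of mutual independence is intuitive here: once A and B both occur, C occurs with certainty, so Pr(A ∩ B ∩ C) = 1/4 rather than 1/8.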
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A [[wikipedia:Tournament_(graph_theory)|tournament]] can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once, and no draws occur. Given two players (vertices) &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;. We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. For example, a triangle &amp;lt;math&amp;gt;x\rightarrow y \rightarrow z \rightarrow x&amp;lt;/math&amp;gt; has property &amp;lt;math&amp;gt;S_1&amp;lt;/math&amp;gt; but not &amp;lt;math&amp;gt;S_2&amp;lt;/math&amp;gt;. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has the property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12899</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12899"/>
		<updated>2025-02-19T12:34:35Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assumption throughout Problem Set 1 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must show the complete working; answers may be written in Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability of a uniform random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; to be surjective (满射) is &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Coprime integers&amp;lt;/strong&amp;gt;] Given positive integers &amp;lt;math&amp;gt;n \ge 2&amp;lt;/math&amp;gt;, calculate the number of integer pairs &amp;lt;math&amp;gt;(x,y)&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;1 \le x &amp;lt; y \le n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathrm{gcd}(x,y) = 1&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Remark: This suggests that the probability that two integers chosen uniformly at random from &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; are coprime tends to &amp;lt;math&amp;gt;\frac{6}{\pi^2}&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: This is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You can try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
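The surjection formula in Problem 1 is an inclusion-exclusion over the set of missed values in [n]. As a sanity check for small parameters (our own sketch, not a proof), the closed form can be compared against brute-force enumeration of all n^m functions:

```python
from itertools import product
from fractions import Fraction
from math import comb

def surj_prob_bruteforce(m, n):
    """Probability that a uniform random f: [m] -> [n] is surjective, by enumeration."""
    hits = sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)
    return Fraction(hits, n ** m)

def surj_prob_pie(m, n):
    """The inclusion-exclusion closed form from the problem statement."""
    return sum(Fraction((-1) ** (n - k) * comb(n, k) * k ** m, n ** m)
               for k in range(1, n + 1))

# the two computations agree on every small case we try
for m in range(1, 7):
    for n in range(1, m + 1):
        assert surj_prob_bruteforce(m, n) == surj_prob_pie(m, n)
```

For instance, with m = n = 2 exactly half of the four functions are surjective, matching the closed form's value of 1/2.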
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-1-classic-examples&amp;quot;&amp;gt;Problem 2 (Symmetric 1D random walk)&amp;lt;/h2&amp;gt;&lt;br /&gt;
A gambler plays a fair gambling game: at each round, he flips a fair coin, earns &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point if it comes up HEADS, and loses &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point otherwise.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;A_i&amp;lt;/math&amp;gt; be the event that the gambler earns &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; points after playing &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds of the game, that is, the number of times the coin lands on heads is equal to the number of times it lands on tails. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i)&amp;lt;/math&amp;gt;. (Hint: You may use Stirling&#039;s approximation to estimate &amp;lt;math&amp;gt;\mathbf{Pr}(A_i)&amp;lt;/math&amp;gt; and derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (II)&amp;lt;/strong&amp;gt;] Suppose that the game ends upon that the gambler loses all his &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; points. Let &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; be the event that the game ends within &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i})&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Hint: You may first consider the &amp;lt;math&amp;gt;m = 1&amp;lt;/math&amp;gt; case. Let &amp;lt;math&amp;gt;C_i&amp;lt;/math&amp;gt; be the event that the game ends at the &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt;-th round. (i) Prove that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = \sum_{i=1}^{+\infty} (i-1) \mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;. (ii) Compute &amp;lt;math&amp;gt;\mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;, which is a special case of the ballot problem. (iii) Finally, use Stirling&#039;s approximation to derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
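For part (I), Pr(A_i) = 0 for odd i, and Pr(A_{2n}) = C(2n, n)/4^n; Stirling's approximation estimates this as roughly 1/sqrt(pi*n), so the series diverges like the p-series with p = 1/2. The snippet below is a numeric illustration of that estimate (our own sketch, not a proof):

```python
from math import comb, pi, sqrt

def pr_back_at_zero(n):
    """Pr(A_{2n}): equal numbers of heads and tails after 2n fair flips."""
    return comb(2 * n, n) / 4 ** n

# Stirling's estimate 1/sqrt(pi*n) is already accurate for moderate n...
assert abs(pr_back_at_zero(100) - 1 / sqrt(pi * 100)) < 1e-4
# ...and the partial sums of Pr(A_{2n}) keep growing with n
partials = [sum(pr_back_at_zero(k) for k in range(1, n + 1))
            for n in (100, 400, 1600)]
assert partials[0] < partials[1] < partials[2]
```

Consistent with the 1/sqrt(pi*n) estimate, each partial sum roughly doubles when n quadruples, i.e. it grows like 2*sqrt(n/pi) with no finite limit.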
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-probability-space&amp;quot;&amp;gt;Problem 3 (Probability space)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for every interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (Actually, such a probability measure does exist and is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;.) &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt; and let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; such that either &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (the complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable; define &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field?&amp;lt;/strong&amp;gt;] A set &amp;lt;math&amp;gt;A \subseteq \mathbb{N}&amp;lt;/math&amp;gt; is said to have asymptotic density &amp;lt;math&amp;gt;\theta&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\lim_{n \to \infty} |A \cap \{1,2,\ldots,n\}| / n = \theta&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; be the collection of sets for which the asymptotic density exists. Is &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that event &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is equally likely to be removed). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each draw uniformly at random from the balls remaining in the urn, without replacement (which means we do not put the drawn ball back). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&#039;&#039;&#039;Noisy channel&#039;&#039;&#039;] When coded messages are sent, there are sometimes errors in transmission. In particular, Morse code uses &amp;quot;dots&amp;quot; and &amp;quot;dashes&amp;quot;, which are known to occur in the proportion of 3:4. In other words, the probability of a &amp;quot;dot&amp;quot; being sent is &amp;lt;math&amp;gt;3/7&amp;lt;/math&amp;gt;, and the probability of a &amp;quot;dash&amp;quot; being sent is &amp;lt;math&amp;gt;4/7&amp;lt;/math&amp;gt;. Suppose due to interference on the transmission channel, with probability &amp;lt;math&amp;gt;1/8&amp;lt;/math&amp;gt;, a dot will be mistakenly received as a dash, and vice versa. If we receive a dot, what is the probability that the transmitted symbol is indeed a dot? &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Consider the sequence of outcomes &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she told you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial while the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trial is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find the &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcomes most probable; that is, solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A [[wikipedia:Tournament_(graph_theory)|tournament]] can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once, and no draws occur. Given two players &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;. We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. For example, a triangle &amp;lt;math&amp;gt;x\rightarrow y \rightarrow z \rightarrow x&amp;lt;/math&amp;gt; has property &amp;lt;math&amp;gt;S_1&amp;lt;/math&amp;gt; but not &amp;lt;math&amp;gt;S_2&amp;lt;/math&amp;gt;. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there  is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has the property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12898</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12898"/>
		<updated>2025-02-19T12:34:22Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assumption throughout Problem Set 1 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must show the complete working; answers may be written in Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability of a uniform random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; to be surjective (满射) is &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Coprime integers&amp;lt;/strong&amp;gt;] Given positive integers &amp;lt;math&amp;gt;n \ge 2&amp;lt;/math&amp;gt;, calculate the number of integer pairs &amp;lt;math&amp;gt;(x,y)&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;1 \le x &amp;lt; y \le n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathrm{gcd}(x,y) = 1&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Remark: This suggests that the probability that two randomly chosen integers from &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; are coprime tends to &amp;lt;math&amp;gt;\frac{6}{\pi^2}&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: This is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You can try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
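The surjection formula above can be sanity-checked by brute force for small parameters. A minimal sketch in Python (the helper names are our own, not part of the problem set), comparing the PIE expression against direct enumeration in exact rational arithmetic:

```python
from fractions import Fraction
from itertools import product
from math import comb

def surjection_prob_formula(m, n):
    # PIE expression from the problem, evaluated exactly
    return sum((-1) ** (n - k) * comb(n, k) * Fraction(k, n) ** m
               for k in range(1, n + 1))

def surjection_prob_bruteforce(m, n):
    # enumerate all n**m functions f: [m] -> [n], count the surjective ones
    funcs = list(product(range(n), repeat=m))
    surjective = [f for f in funcs if len(set(f)) == n]
    return Fraction(len(surjective), len(funcs))

for m in range(2, 6):
    for n in range(2, m + 1):
        assert surjection_prob_formula(m, n) == surjection_prob_bruteforce(m, n)
```

Using exact Fractions avoids any floating-point tolerance in the comparison.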
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-1-classic-examples&amp;quot;&amp;gt;Problem 2 (Symmetric 1D random walk)&amp;lt;/h2&amp;gt;&lt;br /&gt;
A gambler plays a fair gambling game: at each round, he flips a fair coin, earning &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point if it lands HEADS and losing &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point otherwise.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;A_i&amp;lt;/math&amp;gt; be the event that the gambler earns &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; points after playing &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds of the game, that is, the number of times the coin lands on heads is equal to the number of times it lands on tails. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i)&amp;lt;/math&amp;gt;. (Hint: You may use Stirling&#039;s approximation to estimate &amp;lt;math&amp;gt;\mathbf{Pr}(A_i)&amp;lt;/math&amp;gt; and derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (II)&amp;lt;/strong&amp;gt;] Suppose that the game ends as soon as the gambler loses all his &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; points. Let &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; be the event that the game ends within &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i})&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Hint: You may first consider the case &amp;lt;math&amp;gt;m = 1&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;C_i&amp;lt;/math&amp;gt; be the event that the game ends at the &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt;-th round. (i) Prove that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = \sum_{i=1}^{+\infty} (i-1) \mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;. (ii) Compute &amp;lt;math&amp;gt;\mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;, which is a special case of the ballot problem. (iii) Finally, use Stirling&#039;s approximation to derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
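The hint in part (I) can be illustrated numerically. After an even number of rounds 2k, the event A_2k requires exactly k heads among 2k fair flips, so Pr(A_2k) = binom(2k, k)/4^k; the sketch below (our own helper, not part of the problem set) checks this against the Stirling estimate 1/sqrt(pi k) and watches the partial sums grow:

```python
from math import comb, pi, sqrt

def pr_equal_heads_tails(i):
    # Pr(A_i): exactly i/2 heads among i fair flips; zero when i is odd
    if i % 2 == 1:
        return 0.0
    k = i // 2
    return comb(2 * k, k) / 4 ** k

# Stirling gives Pr(A_{2k}) close to 1/sqrt(pi*k) for large k ...
approx_ratio = pr_equal_heads_tails(200) * sqrt(pi * 100)
assert round(approx_ratio, 2) == 1.0

# ... so the partial sums grow roughly like 2*sqrt(k/pi), without bound
partial = sum(pr_equal_heads_tails(i) for i in range(1, 2001))
assert min(partial, 30) == 30  # the partial sum already exceeds 30 by i = 2000
```

This is only numerical evidence for divergence; the problem asks for a proof via Stirling's approximation.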
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-probability-space&amp;quot;&amp;gt;Problem 3 (Probability space)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, that there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for every interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (Actually, such a probability measure does exist and is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;.) &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt; and let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (the complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable; define &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field?&amp;lt;/strong&amp;gt;] A set &amp;lt;math&amp;gt;A \subseteq \mathbb{N}&amp;lt;/math&amp;gt; is said to have asymptotic density &amp;lt;math&amp;gt;\theta&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\lim_{n \to \infty} |A \cap \{1,2,\ldots,n\}| / n = \theta&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; be the collection of sets for which the asymptotic density exists. Is &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that event &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is chosen to be removed with equal probability). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each time uniformly and independently at random, without replacement (which means we do not put the chosen ball back after each drawing). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&#039;&#039;&#039;Noisy channel&#039;&#039;&#039;] When coded messages are sent, there are sometimes errors in transmission. In particular, Morse code uses &amp;quot;dots&amp;quot; and &amp;quot;dashes&amp;quot;, which are known to occur in the proportion of 3:4. In other words, the probability of a &amp;quot;dot&amp;quot; being sent is &amp;lt;math&amp;gt;3/7&amp;lt;/math&amp;gt;, and the probability of a &amp;quot;dash&amp;quot; being sent is &amp;lt;math&amp;gt;4/7&amp;lt;/math&amp;gt;. Suppose due to interference on the transmission channel, with probability &amp;lt;math&amp;gt;1/8&amp;lt;/math&amp;gt;, a dot will be mistakenly received as a dash, and vice versa. If we receive a dot, what is the probability that the transmitted symbol is indeed a dot? &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let&amp;amp;#39;s consider a series of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; outputs &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability  &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she told you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial while the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trial is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find the &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcomes most probable; namely, you need to solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A [[wikipedia:Tournament_(graph_theory)|tournament]] can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once, and no draws occur. Given two players &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;. We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. For example, a triangle &amp;lt;math&amp;gt;x\rightarrow y \rightarrow z \rightarrow x&amp;lt;/math&amp;gt; has property &amp;lt;math&amp;gt;S_1&amp;lt;/math&amp;gt; but not &amp;lt;math&amp;gt;S_2&amp;lt;/math&amp;gt;. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12897</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12897"/>
		<updated>2025-02-19T12:33:56Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assumption throughout Problem Set 1 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete reasoning; either Chinese or English is acceptable.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability of a uniform random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; to be surjective (满射) is &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Coprime integers&amp;lt;/strong&amp;gt;] Given positive integers &amp;lt;math&amp;gt;n \ge 2&amp;lt;/math&amp;gt;, calculate the number of integer pairs &amp;lt;math&amp;gt;(x,y)&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;1 \le x &amp;lt; y \le n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathrm{gcd}(x,y) = 1&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Remark: This suggests that the probability that two randomly chosen integers from &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; are coprime tends to &amp;lt;math&amp;gt;\frac{6}{\pi^2}&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: This is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You can try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-1-classic-examples&amp;quot;&amp;gt;Problem 2 (Symmetric 1D random walk)&amp;lt;/h2&amp;gt;&lt;br /&gt;
A gambler plays a fair gambling game: at each round, he flips a fair coin, earning &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point if it lands HEADS and losing &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point otherwise.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;A_i&amp;lt;/math&amp;gt; be the event that the gambler earns &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; points after playing &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds of the game, that is, the number of times the coin lands on heads is equal to the number of times it lands on tails. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i)&amp;lt;/math&amp;gt;. (Hint: You may use Stirling&#039;s approximation to estimate &amp;lt;math&amp;gt;\mathbf{Pr}(A_i)&amp;lt;/math&amp;gt; and derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (II)&amp;lt;/strong&amp;gt;] Suppose that the game ends as soon as the gambler loses all his &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; points. Let &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; be the event that the game ends within &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i})&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Hint: You may first consider the case &amp;lt;math&amp;gt;m = 1&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;C_i&amp;lt;/math&amp;gt; be the event that the game ends at the &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt;-th round. (i) Prove that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = \sum_{i=1}^{+\infty} (i-1) \mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;. (ii) Compute &amp;lt;math&amp;gt;\mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;, which is a special case of the ballot problem. (iii) Finally, use Stirling&#039;s approximation to derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-probability-space&amp;quot;&amp;gt;Problem 3 (Probability space)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, that there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for every interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (Actually, such a probability measure does exist and is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;.) &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt; and let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (the complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable; define &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field?&amp;lt;/strong&amp;gt;] A set &amp;lt;math&amp;gt;A \subseteq \mathbb{N}&amp;lt;/math&amp;gt; is said to have asymptotic density &amp;lt;math&amp;gt;\theta&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\lim_{n \to \infty} |A \cap \{1,2,\ldots,n\}| / n = \theta&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; be the collection of sets for which the asymptotic density exists. Is &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that event &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is chosen to be removed with equal probability). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each time uniformly and independently at random, without replacement (which means we do not put the chosen ball back after each drawing). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&#039;&#039;&#039;Noisy channel&#039;&#039;&#039;] When coded messages are sent, there are sometimes errors in transmission. In particular, Morse code uses &amp;quot;dots&amp;quot; and &amp;quot;dashes&amp;quot;, which are known to occur in the proportion of 3:4. In other words, the probability of a &amp;quot;dot&amp;quot; being sent is &amp;lt;math&amp;gt;3/7&amp;lt;/math&amp;gt;, and the probability of a &amp;quot;dash&amp;quot; being sent is &amp;lt;math&amp;gt;4/7&amp;lt;/math&amp;gt;. Suppose due to interference on the transmission channel, with probability &amp;lt;math&amp;gt;1/8&amp;lt;/math&amp;gt;, a dot will be mistakenly received as a dash, and vice versa. If we receive a dot, what is the probability that the transmitted symbol is indeed a dot? &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let&amp;amp;#39;s consider a series of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; outputs &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability  &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she tells you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial, although the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trials is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find the &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcome most probable; namely, you need to solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A [[wikipedia:Tournament_(graph_theory)|tournament]] can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once, and no draws occur. Given two players &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; (we say “&amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;”). We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. For example, a triangle &amp;lt;math&amp;gt;x\rightarrow y \rightarrow z \rightarrow x&amp;lt;/math&amp;gt; has property &amp;lt;math&amp;gt;S_1&amp;lt;/math&amp;gt; but not &amp;lt;math&amp;gt;S_2&amp;lt;/math&amp;gt;. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12896</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12896"/>
		<updated>2025-02-19T12:30:00Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assumption throughout Problem Set 1 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must show the complete derivation; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability that a uniformly random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; is surjective equals &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Coprime integers&amp;lt;/strong&amp;gt;] Given positive integers &amp;lt;math&amp;gt;n \ge 2&amp;lt;/math&amp;gt;, calculate the number of integer pairs &amp;lt;math&amp;gt;(x,y)&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;1 \le x &amp;lt; y \le n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathrm{gcd}(x,y) = 1&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Remark: This suggests that the probability that two integers chosen uniformly at random from &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; are coprime tends to &amp;lt;math&amp;gt;\frac{6}{\pi^2}&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: The upper bound is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You may try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
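The inclusion-exclusion count behind the Surjection part can be sanity-checked numerically. The following Python sketch (illustrative only, not part of the problem set) compares the closed-form count of surjections against brute-force enumeration for small m and n; dividing either count by n to the power m gives the probability in question.

```python
from itertools import product
from math import comb

def surjections(m, n):
    # inclusion-exclusion: sum over k of (-1)^(n-k) * C(n, k) * k^m
    return sum((-1) ** (n - k) * comb(n, k) * k ** m for k in range(1, n + 1))

def brute_force(m, n):
    # enumerate all n^m functions [m] -> [n] and count the surjective ones
    return sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)

print(surjections(4, 3), brute_force(4, 3))  # both give 36
```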
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-1-classic-examples&amp;quot;&amp;gt;Problem 2 (Symmetric 1D random walk)&amp;lt;/h2&amp;gt;&lt;br /&gt;
A gambler plays a fair gambling game: at each round, he flips a fair coin, earning &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point if it lands HEADS and losing &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point otherwise.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;A_i&amp;lt;/math&amp;gt; be the event that the gambler earns &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; points after playing &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds of the game, that is, the number of times the coin lands on heads is equal to the number of times it lands on tails. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i)&amp;lt;/math&amp;gt;. (Hint: You may use Stirling&#039;s approximation to estimate &amp;lt;math&amp;gt;\mathbf{Pr}(A_i)&amp;lt;/math&amp;gt; and derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (II)&amp;lt;/strong&amp;gt;] Suppose that the game ends once the gambler loses all his &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; points. Let &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; be the event that the game ends within &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i})&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Hint: You may first consider the case &amp;lt;math&amp;gt;m = 1&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;C_i&amp;lt;/math&amp;gt; be the event that the game ends at the &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt;-th round. (i) Prove that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = \sum_{i=1}^{+\infty} (i-1) \mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;. (ii) Compute &amp;lt;math&amp;gt;\mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;, which is a special case of the ballot problem. (iii) Finally, use Stirling&#039;s approximation to derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
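As a numerical illustration of the hint in part (I) (a sketch, not part of the problem set): the probability of a balanced outcome after 2i rounds is C(2i, i) divided by 4 to the power i, and Stirling's approximation predicts it decays roughly like the reciprocal of sqrt(pi times i), which is slow enough for the series to diverge.

```python
from math import comb, sqrt, pi

def pr_balanced(i):
    # Pr(A_{2i}): exactly i heads among 2i fair coin flips
    return comb(2 * i, i) / 4 ** i

# compare the exact value with the Stirling estimate 1/sqrt(pi*i);
# the ratio approaches 1 as i grows
for i in (10, 100, 1000):
    print(i, pr_balanced(i), 1 / sqrt(pi * i))
```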
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-probability-space&amp;quot;&amp;gt;Problem 3 (Probability space)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, that there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for any interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (In fact, such a probability measure does exist and is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt; and let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (the complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable; define &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field?&amp;lt;/strong&amp;gt;] A set &amp;lt;math&amp;gt;A \subseteq \mathbb{N}&amp;lt;/math&amp;gt; is said to have asymptotic density &amp;lt;math&amp;gt;\theta&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\lim_{n \to \infty} |A \cap \{1,2,\ldots,n\}| / n = \theta&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; be the collection of sets for which the asymptotic density exists. Is &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that event &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is chosen for removal with equal probability). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each time uniformly and independently at random, without replacement (which means we do not put the chosen ball back after each drawing). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&#039;&#039;&#039;Noisy channel&#039;&#039;&#039;] When coded messages are sent, there are sometimes errors in transmission. In particular, Morse code uses &amp;quot;dots&amp;quot; and &amp;quot;dashes&amp;quot;, which are known to occur in the proportion 3:4. In other words, the probability of a &amp;quot;dot&amp;quot; being sent is &amp;lt;math&amp;gt;3/7&amp;lt;/math&amp;gt;, and the probability of a &amp;quot;dash&amp;quot; being sent is &amp;lt;math&amp;gt;4/7&amp;lt;/math&amp;gt;. Suppose that, due to interference on the transmission channel, a dot is mistakenly received as a dash with probability &amp;lt;math&amp;gt;1/8&amp;lt;/math&amp;gt;, and vice versa. If we receive a dot, what is the probability that the transmitted symbol is indeed a dot?&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
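The noisy-channel question is a direct application of Bayes' rule. This quick Python check (illustrative only, not part of the problem set) plugs in the stated numbers.

```python
# Bayes' rule with the numbers from the noisy-channel problem
p_dot = 3 / 7    # prior probability that a dot is sent
p_dash = 4 / 7   # prior probability that a dash is sent
p_flip = 1 / 8   # probability a symbol is received incorrectly

# total probability of receiving a dot (law of total probability)
p_recv_dot = p_dot * (1 - p_flip) + p_dash * p_flip
# posterior probability that a dot was sent, given that a dot was received
posterior = p_dot * (1 - p_flip) / p_recv_dot
print(posterior)  # 21/25 = 0.84
```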
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let&amp;amp;#39;s consider a sequence of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; outputs &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she tells you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial, although the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trials is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find the &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcome most probable; namely, you need to solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
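For the Product distribution part, the maximizer of the binomial likelihood can be checked numerically before deriving it analytically. This sketch (the values of n and k are illustrative assumptions, not from the problem set) does a grid search over candidate values of p-hat.

```python
from math import comb

n, k = 10, 3  # illustrative values: k successes observed out of n trials

def likelihood(p):
    # probability of seeing exactly k successes when each trial succeeds
    # with probability p
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# grid search over (0, 1); the maximizer matches k/n
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=likelihood)
print(best)  # 0.3, i.e. k/n
```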
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A [[wikipedia:Tournament_(graph_theory)|tournament]] can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once. Given two players &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; (we say “&amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;”), and we draw an arrow from &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;&lt;br /&gt;
if &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; (there are no ties). We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12895</id>
		<title>概率论与数理统计 (Spring 2025)/Problem Set 1</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2025)/Problem_Set_1&amp;diff=12895"/>
		<updated>2025-02-19T12:27:39Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assumption throughout Problem Set 1 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must show the complete derivation; answers may be written in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 1==&lt;br /&gt;
&amp;lt;p&amp;gt;Without further notice, we are working on probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-2-principle-of-inclusion-and-exclusion&amp;quot;&amp;gt;Problem 1 (Principle of Inclusion and Exclusion)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let &amp;lt;math&amp;gt;n\ge 1&amp;lt;/math&amp;gt; be a positive integer and &amp;lt;math&amp;gt;A_1,A_2,\ldots,A_n&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; events.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Union bound&amp;lt;/strong&amp;gt;] Prove &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^{n} A_i \right) \le \sum_{i=1}^{n} \mathbf{Pr}\left(A_i\right)&amp;lt;/math&amp;gt;&amp;lt;strong&amp;gt; using the definition of probability space&amp;lt;/strong&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Principle of Inclusion and Exclusion (PIE)&amp;lt;/strong&amp;gt;] Prove that &amp;lt;math&amp;gt;\mathbf{Pr}\left( \bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|-1} \mathbf{Pr}\left( \bigcap_{i \in S} A_i \right)&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;[n]=\{1,2,\ldots,n\}&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Surjection&amp;lt;/strong&amp;gt;] For positive integers &amp;lt;math&amp;gt;m\ge n&amp;lt;/math&amp;gt;, prove that the probability that a uniformly random function &amp;lt;math&amp;gt;f:[m]\to[n]&amp;lt;/math&amp;gt; is surjective equals &amp;lt;math&amp;gt;\sum_{k=1}^n(-1)^{n-k}{n\choose k}\left(\frac{k}{n}\right)^m&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Coprime integers&amp;lt;/strong&amp;gt;] Given positive integers &amp;lt;math&amp;gt;n \ge 2&amp;lt;/math&amp;gt;, calculate the number of integer pairs &amp;lt;math&amp;gt;(x,y)&amp;lt;/math&amp;gt; satisfying &amp;lt;math&amp;gt;1 \le x &amp;lt; y \le n&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathrm{gcd}(x,y) = 1&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Remark: This suggests that the probability that two integers chosen uniformly at random from &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; are coprime tends to &amp;lt;math&amp;gt;\frac{6}{\pi^2}&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; tends to infinity.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Bonferroni&amp;amp;#39;s inequality and Kounias&amp;amp;#39; inequality&amp;lt;/strong&amp;gt;] Prove that &lt;br /&gt;
&amp;lt;math&amp;gt;&lt;br /&gt;
\sum_{i=1}^n \mathbf{Pr}(A_i) - \sum_{1 \le i&amp;lt; j \le n} \mathbf{Pr}(A_i \cap A_j)\le \mathbf{Pr}\left(\bigcup_{i=1}^n A_i\right) \le \sum_{i=1}^n \mathbf{Pr} \left( A_i\right) - \sum_{i=2}^n \mathbf{Pr}(A_1 \cap A_i).&lt;br /&gt;
&amp;lt;/math&amp;gt; &lt;br /&gt;
(Hint: The upper bound is sometimes called Kounias&#039; inequality, which is weaker than Bonferroni&#039;s inequality. You may try using a Venn diagram to understand these inequalities.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-1-classic-examples&amp;quot;&amp;gt;Problem 2 (Symmetric 1D random walk)&amp;lt;/h2&amp;gt;&lt;br /&gt;
A gambler plays a fair gambling game: at each round, he flips a fair coin, earning &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point if it lands HEADS and losing &amp;lt;math&amp;gt;1&amp;lt;/math&amp;gt; point otherwise.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;A_i&amp;lt;/math&amp;gt; be the event that the gambler earns &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; points after playing &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds of the game, that is, the number of times the coin lands on heads is equal to the number of times it lands on tails. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i)&amp;lt;/math&amp;gt;. (Hint: You may use Stirling&#039;s approximation to estimate &amp;lt;math&amp;gt;\mathbf{Pr}(A_i)&amp;lt;/math&amp;gt; and derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(A_i) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Symmetric 1D random walk (II)&amp;lt;/strong&amp;gt;] Suppose that the game ends once the gambler loses all his &amp;lt;math&amp;gt;m&amp;lt;/math&amp;gt; points. Let &amp;lt;math&amp;gt;B_i&amp;lt;/math&amp;gt; be the event that the game ends within &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; rounds. Compute &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i})&amp;lt;/math&amp;gt;. &lt;br /&gt;
(Hint: You may first consider the case &amp;lt;math&amp;gt;m = 1&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;C_i&amp;lt;/math&amp;gt; be the event that the game ends at the &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt;-th round. (i) Prove that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = \sum_{i=1}^{+\infty} (i-1) \mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;. (ii) Compute &amp;lt;math&amp;gt;\mathbf{Pr}(C_i)&amp;lt;/math&amp;gt;, which is a special case of the ballot problem. (iii) Finally, use Stirling&#039;s approximation to derive that &amp;lt;math&amp;gt;\sum_{i=1}^{+\infty} \mathbf{Pr}(\overline{B_i}) = +\infty&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-3-probability-space&amp;quot;&amp;gt;Problem 3 (Probability space)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Nonexistence of probability space&amp;lt;/strong&amp;gt;] Prove that it is impossible to define a uniform probability law on natural numbers &amp;lt;math&amp;gt;\mathbb{N}&amp;lt;/math&amp;gt;. More precisely, prove that there does not exist a probability space &amp;lt;math&amp;gt;(\mathbb{N},2^{\mathbb{N}},\mathbf{Pr})&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;\mathbf{Pr}(\{i\}) = \mathbf{Pr}(\{j\})&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;i, j \in \mathbb{N}&amp;lt;/math&amp;gt;. &lt;br /&gt;
Please explain why the same argument fails to prove that there is no uniform probability law on the real interval &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;, that is, that there is no probability space &amp;lt;math&amp;gt;([0,1],\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt; such that for any interval &amp;lt;math&amp;gt;(l,r] \subseteq [0,1]&amp;lt;/math&amp;gt;, it holds that &amp;lt;math&amp;gt;(l,r] \in \mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Pr}( (l,r] ) = r-l&amp;lt;/math&amp;gt;. (In fact, such a probability measure does exist and is called the Lebesgue measure on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] For any subset &amp;lt;math&amp;gt;S \subseteq 2^\Omega&amp;lt;/math&amp;gt;, prove that the smallest &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;\bigcap_{\substack{S \subseteq \mathcal{F} \subseteq 2^\Omega\\ \mathcal{F} \text{ is a } \sigma\text{-field }} } \mathcal{F}&amp;lt;/math&amp;gt;. (Hint: You should show that it is indeed a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field and also it is the smallest one containing &amp;lt;math&amp;gt;S&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Union of &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathcal{G}&amp;lt;/math&amp;gt; be &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-fields of subsets of &amp;lt;math&amp;gt;\Omega&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathcal{F} \cup \mathcal{G}&amp;lt;/math&amp;gt; is not necessarily a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Probability space?&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;\Omega = \mathbb{R}&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;\mathcal{F}&amp;lt;/math&amp;gt; be the set of all subsets &amp;lt;math&amp;gt;A \subseteq \Omega&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt; (the complement of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;) is countable, and let &amp;lt;math&amp;gt;P(A) = 0&amp;lt;/math&amp;gt; in the first case and &amp;lt;math&amp;gt;P(A) = 1&amp;lt;/math&amp;gt; in the second. Is &amp;lt;math&amp;gt;(\Omega,\mathcal{F},P)&amp;lt;/math&amp;gt; a probability space? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-field?&amp;lt;/strong&amp;gt;] A set &amp;lt;math&amp;gt;A \subseteq \mathbb{N}&amp;lt;/math&amp;gt; is said to have asymptotic density &amp;lt;math&amp;gt;\theta&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\lim_{n \to \infty} |A \cap \{1,2,\ldots,n\}| / n = \theta&amp;lt;/math&amp;gt;. Let &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; be the collection of sets for which the asymptotic density exists. Is &amp;lt;math&amp;gt;\mathcal{A}&amp;lt;/math&amp;gt; a &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra? Please explain your answer.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
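The notion of asymptotic density used in the last question can be explored numerically. The sketch below (illustrative only; the predicates `evens` and `squares` are our own examples, not part of the problem) computes the prefix ratios &amp;lt;math&amp;gt;|A \cap \{1,\ldots,n\}| / n&amp;lt;/math&amp;gt; for two familiar sets:

```python
def density_prefix(A, n):
    """Compute |A ∩ {1, ..., n}| / n for a membership predicate A."""
    return sum(1 for k in range(1, n + 1) if A(k)) / n

evens = lambda k: k % 2 == 0                 # asymptotic density 1/2
squares = lambda k: int(k ** 0.5) ** 2 == k  # asymptotic density 0

for n in (10**3, 10**5):
    # The prefix ratio of the even numbers is within 1/n of 1/2.
    assert abs(density_prefix(evens, n) - 0.5) <= 1 / n

# Perfect squares become arbitrarily sparse: only ~sqrt(n) of them up to n.
assert density_prefix(squares, 10**5) < 0.01
```

The question itself asks something subtler: whether the collection of sets for which this limit exists is closed under the σ-algebra operations, which the numerics above do not settle.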
&lt;br /&gt;
&amp;lt;h2 id=&amp;quot;problem-4-conditional-probability&amp;quot;&amp;gt;Problem 4 (Conditional probability)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Positive correlation&amp;lt;/strong&amp;gt;] We say that event &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives &#039;&#039;positive information&#039;&#039; about event &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;\mathbf{Pr}(A|B) &amp;gt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;, that is, the occurrence of &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; makes the occurrence of &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; more likely. Now suppose that &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt; gives positive information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;.&lt;br /&gt;
# Does &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt; give positive information about &amp;lt;math&amp;gt;B&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give negative information about &amp;lt;math&amp;gt;A&amp;lt;/math&amp;gt;, that is, is it true that &amp;lt;math&amp;gt;\mathbf{Pr}(A|\overline{B}) &amp;lt; \mathbf{Pr}(A)&amp;lt;/math&amp;gt;?&lt;br /&gt;
# Does &amp;lt;math&amp;gt;\overline{B}&amp;lt;/math&amp;gt; give positive information or negative information about &amp;lt;math&amp;gt;\overline{A}&amp;lt;/math&amp;gt;?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (I)&amp;lt;/strong&amp;gt;] There are &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; urns of which the &amp;lt;math&amp;gt;r&amp;lt;/math&amp;gt;-th contains &amp;lt;math&amp;gt;r-1&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;n-r&amp;lt;/math&amp;gt; black balls. You pick an urn uniformly at random (here, &amp;quot;uniformly&amp;quot; means that each urn has equal probability of being chosen) and remove two balls from that urn, uniformly at random without replacement (which means that each of the &amp;lt;math&amp;gt;{n-1\choose 2}&amp;lt;/math&amp;gt; pairs of balls is chosen to be removed with equal probability). Find the following probabilities:&lt;br /&gt;
# the second ball is black;&lt;br /&gt;
# the second ball is black, given that the first is black.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Balls in urns (II)&amp;lt;/strong&amp;gt;] Suppose that an urn contains &amp;lt;math&amp;gt;w&amp;lt;/math&amp;gt; white balls and &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; black balls. The balls are drawn from the urn one by one, each time uniformly and independently at random, without replacement (which means we do not put the chosen ball back after each drawing). Find the probabilities of the events:&lt;br /&gt;
# the first white ball drawn is the &amp;lt;math&amp;gt;(k+1)&amp;lt;/math&amp;gt;th ball;&lt;br /&gt;
# the last ball drawn is white.&lt;br /&gt;
&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&#039;&#039;&#039;Noisy channel&#039;&#039;&#039;] When coded messages are sent, there are sometimes errors in transmission. In particular, Morse code uses &amp;quot;dots&amp;quot; and &amp;quot;dashes&amp;quot;, which are known to occur in the proportion of 3:4. In other words, the probability of a &amp;quot;dot&amp;quot; being sent is &amp;lt;math&amp;gt;3/7&amp;lt;/math&amp;gt;, and the probability of a &amp;quot;dash&amp;quot; being sent is &amp;lt;math&amp;gt;4/7&amp;lt;/math&amp;gt;. Suppose due to interference on the transmission channel, with probability &amp;lt;math&amp;gt;1/8&amp;lt;/math&amp;gt;, a dot will be mistakenly received as a dash, and vice versa. If we receive a dot, what is the probability that the transmitted symbol is indeed a dot? &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
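The noisy-channel question above is a direct application of Bayes' law. As a sanity check, the exact posterior can be compared against a Monte Carlo simulation of the channel (a minimal sketch; all variable names and the sample size are ours, not part of the problem, and the numerical answer is deliberately not stated here):

```python
import random

# Channel parameters from the problem statement:
# Pr[dot sent] = 3/7, Pr[dash sent] = 4/7, symbol-flip probability 1/8.
p_dot, flip = 3 / 7, 1 / 8

# Exact posterior Pr[dot sent | dot received] via Bayes' law.
posterior = (p_dot * (1 - flip)) / (p_dot * (1 - flip) + (1 - p_dot) * flip)

# Monte Carlo estimate of the same conditional probability.
random.seed(0)
sent_and_received_dot = received_dot = 0
for _ in range(200_000):
    sent_dot = random.random() < p_dot   # transmitted symbol is a dot
    flipped = random.random() < flip     # channel flips the symbol
    if sent_dot != flipped:              # a dot is received
        received_dot += 1
        sent_and_received_dot += sent_dot
estimate = sent_and_received_dot / received_dot

assert abs(estimate - posterior) < 0.01  # simulation agrees with Bayes' law
```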
&amp;lt;h2 id=&amp;quot;problem-5-independence&amp;quot;&amp;gt;Problem 5 (Independence)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Let&amp;amp;#39;s consider a series of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; outputs &amp;lt;math&amp;gt;(X_1, X_2, \cdots, X_n) \in \{0,1\}^n&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; independent Bernoulli trials, where each trial succeeds with the same probability  &amp;lt;math&amp;gt;0 &amp;lt; p &amp;lt; 1&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Limited independence&amp;lt;/strong&amp;gt;] Construct three events &amp;lt;math&amp;gt;A,B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; Bernoulli trials such that &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; are pairwise independent but are not (mutually) independent. You need to prove that the constructed events &amp;lt;math&amp;gt;A, B&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; satisfy this. (Hint: Consider the case where &amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;.)&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;[&amp;lt;strong&amp;gt;Product distribution&amp;lt;/strong&amp;gt;] Suppose someone has observed the output of the &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials, and she told you that precisely &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; out of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; trials succeeded for some &amp;lt;math&amp;gt;0&amp;lt; k&amp;lt; n&amp;lt;/math&amp;gt;. Now you want to predict the output of the &amp;lt;math&amp;gt;(n+1)&amp;lt;/math&amp;gt;-th trial while the parameter &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; of the Bernoulli trial is unknown. One way to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; is to find the value &amp;lt;math&amp;gt;\hat{p}&amp;lt;/math&amp;gt; that makes the observed outcomes most probable; namely, you need to solve &lt;br /&gt;
  &amp;lt;math&amp;gt;\arg \max_{\hat{p}\in(0,1)} \mathbf{Pr}_{\hat{p}} [k \text{ out of } n\text{ trials succeed}].&amp;lt;/math&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; by solving the above optimization problem.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If someone tells you exactly which &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; trials succeed (in addition to just telling you the number of successful trials, which is &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;), would it help you to estimate &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; more accurately? Why?&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
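For the limited-independence question, the hint (&amp;lt;math&amp;gt;n = 2&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;p = 1/2&amp;lt;/math&amp;gt;) points toward the well-known XOR construction. The sketch below verifies that construction by exhaustive enumeration; treat it as one worked illustration among many valid answers, not as the required written proof:

```python
from itertools import product
from fractions import Fraction

# Two fair Bernoulli trials (n = 2, p = 1/2): each of the four outcomes
# (x1, x2) in {0,1}^2 has probability exactly 1/4.
outcomes = list(product([0, 1], repeat=2))
pr = Fraction(1, 4)

def prob(event):
    """Probability of an event, given as a predicate on an outcome (x1, x2)."""
    return sum(pr for w in outcomes if event(w))

A = lambda w: w[0] == 1     # first trial succeeds
B = lambda w: w[1] == 1     # second trial succeeds
C = lambda w: w[0] != w[1]  # exactly one trial succeeds (XOR)

both = lambda e, f: (lambda w: e(w) and f(w))

# Pairwise independence: Pr[E ∩ F] = Pr[E] · Pr[F] for every pair.
for E, F in [(A, B), (A, C), (B, C)]:
    assert prob(both(E, F)) == prob(E) * prob(F)

# Not mutually independent: A ∩ B forces C to fail, so Pr[A ∩ B ∩ C] = 0,
# while Pr[A] · Pr[B] · Pr[C] = 1/8.
assert prob(lambda w: A(w) and B(w) and C(w)) == 0
assert prob(A) * prob(B) * prob(C) == Fraction(1, 8)
```

Using exact `Fraction` arithmetic avoids any floating-point ambiguity in the equality checks.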
&amp;lt;h2 id=&amp;quot;problem-6-probabilistic-method&amp;quot;&amp;gt;Problem 6 (Probabilistic method)&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Tournaments&amp;lt;/strong&amp;gt;] A &#039;&#039;tournament&#039;&#039; is a directed graph obtained by assigning a direction to every edge of a complete graph; it can be interpreted as the outcome of a round-robin tournament in which every player faces every other player exactly once. Given two players &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;, we draw an&lt;br /&gt;
arrow from &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; if &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; (we say “&amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt;”), and we draw an arrow from &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; to &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt;&lt;br /&gt;
if &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; beats &amp;lt;math&amp;gt;x&amp;lt;/math&amp;gt; (there are no ties). We say a tournament has property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt; if for every &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; players, there exists another player &amp;lt;math&amp;gt;v&amp;lt;/math&amp;gt; who defeats all of them. Prove that if &amp;lt;math&amp;gt;\binom n k(1-2^{-k})^{n-k}&amp;lt;1&amp;lt;/math&amp;gt;, then there is a tournament on &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; vertices that has the property &amp;lt;math&amp;gt;S_k&amp;lt;/math&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/%E7%AC%AC%E5%9B%9B%E6%AC%A1%E4%BD%9C%E4%B8%9A%E6%8F%90%E4%BA%A4%E5%90%8D%E5%8D%95&amp;diff=12526</id>
		<title>概率论与数理统计 (Spring 2024)/第四次作业提交名单</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/%E7%AC%AC%E5%9B%9B%E6%AC%A1%E4%BD%9C%E4%B8%9A%E6%8F%90%E4%BA%A4%E5%90%8D%E5%8D%95&amp;diff=12526"/>
		<updated>2024-06-12T05:53:07Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: Created page with &amp;quot;如有错漏邮件请联系助教. &amp;lt;center&amp;gt; {| class=&amp;quot;wikitable&amp;quot; |- ! 学号 !! 姓名 |- | 221240092 || 杨煜申  |- | 201850065 || 严以恒  |- | 211240020 || 朱睿骐  |- | 221240035 || 李想  |- | 221240009 || 冯雨桐  |- | 221240068 || 周凡淇  |- | 221240007 || 郭宇轩  |- | 221240027 || 唐诗博  |- | 221098038 || 张浩淼  |- | 221240093 || 陈力峥  |- | 221240066 || 张植翔  |- | 201240090 || 陈诺星  |- | 221240065 || 何俊渊  |- | 221180115...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;如有错漏邮件请联系助教.&lt;br /&gt;
&amp;lt;center&amp;gt;&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! 学号 !! 姓名&lt;br /&gt;
|-&lt;br /&gt;
| 221240092 || 杨煜申 &lt;br /&gt;
|-&lt;br /&gt;
| 201850065 || 严以恒 &lt;br /&gt;
|-&lt;br /&gt;
| 211240020 || 朱睿骐 &lt;br /&gt;
|-&lt;br /&gt;
| 221240035 || 李想 &lt;br /&gt;
|-&lt;br /&gt;
| 221240009 || 冯雨桐 &lt;br /&gt;
|-&lt;br /&gt;
| 221240068 || 周凡淇 &lt;br /&gt;
|-&lt;br /&gt;
| 221240007 || 郭宇轩 &lt;br /&gt;
|-&lt;br /&gt;
| 221240027 || 唐诗博 &lt;br /&gt;
|-&lt;br /&gt;
| 221098038 || 张浩淼 &lt;br /&gt;
|-&lt;br /&gt;
| 221240093 || 陈力峥 &lt;br /&gt;
|-&lt;br /&gt;
| 221240066 || 张植翔 &lt;br /&gt;
|-&lt;br /&gt;
| 201240090 || 陈诺星 &lt;br /&gt;
|-&lt;br /&gt;
| 221240065 || 何俊渊 &lt;br /&gt;
|-&lt;br /&gt;
| 221180115 || 黄文睿 &lt;br /&gt;
|-&lt;br /&gt;
| 221900059 || 王齐剑 &lt;br /&gt;
|-&lt;br /&gt;
| 221830206 || 李君羡 &lt;br /&gt;
|-&lt;br /&gt;
| 221240056 || 郭子良 &lt;br /&gt;
|-&lt;br /&gt;
| 221870066 || 陶心悦 &lt;br /&gt;
|-&lt;br /&gt;
| 221240098 || 游书阳 &lt;br /&gt;
|-&lt;br /&gt;
| 211240056 || 赵烜呈 &lt;br /&gt;
|-&lt;br /&gt;
| 221240082 || 李岳翔 &lt;br /&gt;
|-&lt;br /&gt;
| 221240041 || 周越洋 &lt;br /&gt;
|-&lt;br /&gt;
| 221240047 || 孙梓洋 &lt;br /&gt;
|-&lt;br /&gt;
| 221240095 || 王芷若 &lt;br /&gt;
|-&lt;br /&gt;
| 201850051 || 李轲楠 &lt;br /&gt;
|-&lt;br /&gt;
| 221240012 || 吴承迅 &lt;br /&gt;
|-&lt;br /&gt;
| 221900051 || 吴子奕 &lt;br /&gt;
|-&lt;br /&gt;
| 221240022 || 韩瑞 &lt;br /&gt;
|-&lt;br /&gt;
| 221220108 || 黄嘉祺 &lt;br /&gt;
|-&lt;br /&gt;
| 221870120 || 余明晖 &lt;br /&gt;
|-&lt;br /&gt;
| 221900156 || 韩加瑞 &lt;br /&gt;
|-&lt;br /&gt;
| 221240097 || 徐川 &lt;br /&gt;
|-&lt;br /&gt;
| 215220012 || 吕茄旋 &lt;br /&gt;
|-&lt;br /&gt;
| 221900332 || 王卫东 &lt;br /&gt;
|-&lt;br /&gt;
| 221240083 || 陈正佺 &lt;br /&gt;
|-&lt;br /&gt;
| 221240002 || 季悦宁 &lt;br /&gt;
|-&lt;br /&gt;
| 221240075 || 高歌 &lt;br /&gt;
|-&lt;br /&gt;
| 221240051 || 姜钦宸 &lt;br /&gt;
|-&lt;br /&gt;
| 221240032 || 蔡坤志 &lt;br /&gt;
|-&lt;br /&gt;
| 221240060 || 熊浚丞 &lt;br /&gt;
|-&lt;br /&gt;
| 221240008 || 胡佳昕 &lt;br /&gt;
|-&lt;br /&gt;
| 221240074 || 曹任飞 &lt;br /&gt;
|-&lt;br /&gt;
| 221240033 || 陈宣宇 &lt;br /&gt;
|-&lt;br /&gt;
| 221900500 || 李亦非 &lt;br /&gt;
|-&lt;br /&gt;
| 221900045 || 郑玥轲 &lt;br /&gt;
|-&lt;br /&gt;
| 221240001 || 王炳旭 &lt;br /&gt;
|-&lt;br /&gt;
| 221240090 || 冯浚哲 &lt;br /&gt;
|-&lt;br /&gt;
| 221240023 || 蒋耀瑾 &lt;br /&gt;
|-&lt;br /&gt;
| 221240058 || 熊峥杰 &lt;br /&gt;
|-&lt;br /&gt;
| 221240040 || 郑雯琪 &lt;br /&gt;
|-&lt;br /&gt;
| 221240073 || 李恒济 &lt;br /&gt;
|-&lt;br /&gt;
| 221240070 || 汪宗元 &lt;br /&gt;
|-&lt;br /&gt;
| 221900459 || 黄培耕 &lt;br /&gt;
|-&lt;br /&gt;
| 211250032 || 吕卉森 &lt;br /&gt;
|-&lt;br /&gt;
| 221850073 || 成华睿 &lt;br /&gt;
|-&lt;br /&gt;
| 221240013 || 晁宇豪 &lt;br /&gt;
|-&lt;br /&gt;
| 221240024 || 唐之尧 &lt;br /&gt;
|-&lt;br /&gt;
| 221240084 || 史嘉诚 &lt;br /&gt;
|-&lt;br /&gt;
| 221240026 || 刘俨东 &lt;br /&gt;
|-&lt;br /&gt;
| 221250191 || 吴觐宇 &lt;br /&gt;
|-&lt;br /&gt;
| 221240063 || 欧澄垚 &lt;br /&gt;
|-&lt;br /&gt;
| 221250161 || 翟志阳 &lt;br /&gt;
|-&lt;br /&gt;
| 221900133 || 赖昱萌 &lt;br /&gt;
|-&lt;br /&gt;
| 221900006 || 耿天成 &lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;/center&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)&amp;diff=12525</id>
		<title>概率论与数理统计 (Spring 2024)</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)&amp;diff=12525"/>
		<updated>2024-06-12T05:52:36Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Assignments */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Infobox&lt;br /&gt;
|name         = Infobox&lt;br /&gt;
|bodystyle    = &lt;br /&gt;
|title        = &amp;lt;font size=3&amp;gt;&#039;&#039;&#039;概率论与数理统计&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Probability Theory&#039;&#039;&#039; &amp;lt;br&amp;gt; &amp;amp; &#039;&#039;&#039;Mathematical Statistics&#039;&#039;&#039;&amp;lt;/font&amp;gt;&lt;br /&gt;
|titlestyle   = &lt;br /&gt;
&lt;br /&gt;
|image        = &lt;br /&gt;
|imagestyle   = &lt;br /&gt;
|caption      = &lt;br /&gt;
|captionstyle = &lt;br /&gt;
|headerstyle  = background:#ccf;&lt;br /&gt;
|labelstyle   = background:#ddf;&lt;br /&gt;
|datastyle    = &lt;br /&gt;
&lt;br /&gt;
|header1 =Instructor&lt;br /&gt;
|label1  = &lt;br /&gt;
|data1   = &lt;br /&gt;
|header2 = &lt;br /&gt;
|label2  = &lt;br /&gt;
|data2   = &#039;&#039;&#039;尹一通&#039;&#039;&#039;&lt;br /&gt;
|header3 = &lt;br /&gt;
|label3  = Email&lt;br /&gt;
|data3   = yinyt@nju.edu.cn  &lt;br /&gt;
|header4 =&lt;br /&gt;
|label4  = office&lt;br /&gt;
|data4   = 计算机系 804&lt;br /&gt;
|header5 = &lt;br /&gt;
|label5  = &lt;br /&gt;
|data5   = &#039;&#039;&#039;刘景铖&#039;&#039;&#039;&lt;br /&gt;
|header6 = &lt;br /&gt;
|label6  = Email&lt;br /&gt;
|data6   = liu@nju.edu.cn  &lt;br /&gt;
|header7 =&lt;br /&gt;
|label7  = office&lt;br /&gt;
|data7   = 计算机系 516&lt;br /&gt;
|header8 = Class&lt;br /&gt;
|label8  = &lt;br /&gt;
|data8   = &lt;br /&gt;
|header9 =&lt;br /&gt;
|label9  = Class meeting&lt;br /&gt;
|data9   = Monday, 2pm-4pm&amp;lt;br&amp;gt;&lt;br /&gt;
Wednesday (双), 10am-12pm&amp;lt;br&amp;gt;&lt;br /&gt;
仙Ⅰ-204&lt;br /&gt;
|header10=&lt;br /&gt;
|label10 = Office hour&lt;br /&gt;
|data10  = Monday, 4pm-5pm &amp;lt;br&amp;gt;计算机系 804（尹一通）&amp;lt;br&amp;gt;计算机系 516（刘景铖）&lt;br /&gt;
|header11= Textbook&lt;br /&gt;
|label11 = &lt;br /&gt;
|data11  = &lt;br /&gt;
|header12=&lt;br /&gt;
|label12 = &lt;br /&gt;
|data12  = [[File:概率导论.jpeg|border|100px]]&lt;br /&gt;
|header13=&lt;br /&gt;
|label13 = &lt;br /&gt;
|data13  = &#039;&#039;&#039;概率导论&#039;&#039;&#039;（第2版·修订版）&amp;lt;br&amp;gt; Dimitri P. Bertsekas and John N. Tsitsiklis&amp;lt;br&amp;gt; 郑忠国 童行伟 译；人民邮电出版社 (2022)&lt;br /&gt;
|header14=&lt;br /&gt;
|label14 = &lt;br /&gt;
|data14  = [[File:Grimmett_probability.jpg|border|100px]]&lt;br /&gt;
|header15=&lt;br /&gt;
|label15 = &lt;br /&gt;
|data15  = &#039;&#039;&#039;Probability and Random Processes&#039;&#039;&#039; (4E) &amp;lt;br&amp;gt; Geoffrey Grimmett and David Stirzaker &amp;lt;br&amp;gt;  Oxford University Press (2020)&lt;br /&gt;
|header16=&lt;br /&gt;
|label16 = &lt;br /&gt;
|data16  = [[File:Probability_and_Computing_2ed.jpg|border|100px]]&lt;br /&gt;
|header17=&lt;br /&gt;
|label17 = &lt;br /&gt;
|data17  = &#039;&#039;&#039;Probability and Computing&#039;&#039;&#039; (2E) &amp;lt;br&amp;gt; Michael Mitzenmacher and Eli Upfal &amp;lt;br&amp;gt;   Cambridge University Press (2017)&lt;br /&gt;
|belowstyle = background:#ddf;&lt;br /&gt;
|below = &lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
This is the webpage for the &#039;&#039;Probability Theory and Mathematical Statistics&#039;&#039; (概率论与数理统计) class of Spring 2024. Students who take this class should check this page periodically for content updates and new announcements. &lt;br /&gt;
&lt;br /&gt;
= Announcement =&lt;br /&gt;
* TBA&lt;br /&gt;
&lt;br /&gt;
= Course info =&lt;br /&gt;
* &#039;&#039;&#039;Instructor &#039;&#039;&#039;: &lt;br /&gt;
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]：[mailto:yinyt@nju.edu.cn &amp;lt;yinyt@nju.edu.cn&amp;gt;]，计算机系 804 &lt;br /&gt;
:* [https://liuexp.github.io 刘景铖]：[mailto:liu@nju.edu.cn &amp;lt;liu@nju.edu.cn&amp;gt;]，计算机系 516 &lt;br /&gt;
* &#039;&#039;&#039;Teaching assistant&#039;&#039;&#039;:&lt;br /&gt;
** [https://sites.google.com/view/xinyuanzhang 张昕渊]：[mailto:zhangxy@smail.nju.edu.cn &amp;lt;zhangxy@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
** 邹宗瑞：[mailto:zou.zongrui@smail.nju.edu.cn &amp;lt;zou.zongrui@smail.nju.edu.cn&amp;gt;]，计算机系 410&lt;br /&gt;
* &#039;&#039;&#039;Class meeting&#039;&#039;&#039;:&lt;br /&gt;
** 周一：2pm-4pm，仙Ⅰ-204&lt;br /&gt;
** 周三（双）：10am-12pm，仙Ⅰ-204&lt;br /&gt;
* &#039;&#039;&#039;Office hour&#039;&#039;&#039;: &lt;br /&gt;
:* Monday, 4pm-5pm, 计算机系 804（尹一通）&lt;br /&gt;
:* Tuesday, 3pm-4pm, 计算机系 516（刘景铖）&lt;br /&gt;
:* &#039;&#039;&#039;QQ群&#039;&#039;&#039;: 629856946（申请加入需提供姓名、院系、学号）&lt;br /&gt;
&lt;br /&gt;
= Syllabus =&lt;br /&gt;
课程内容分为三大部分：&lt;br /&gt;
* &#039;&#039;&#039;经典概率论&#039;&#039;&#039;：概率空间、随机变量及其数字特征、多维与连续随机变量、极限定理等内容&lt;br /&gt;
* &#039;&#039;&#039;概率与计算&#039;&#039;&#039;：测度集中现象 (concentration of measure)、概率法 (the probabilistic method)、离散随机过程的相关专题&lt;br /&gt;
* &#039;&#039;&#039;数理统计&#039;&#039;&#039;：参数估计、假设检验、贝叶斯估计、线性回归等统计推断概念&lt;br /&gt;
&lt;br /&gt;
对于第一和第二部分，要求清楚掌握基本概念，深刻理解关键的现象与规律以及背后的原理，并可以灵活运用所学方法求解相关问题。对于第三部分，要求熟悉数理统计的若干基本概念，以及典型的统计模型和统计推断问题。&lt;br /&gt;
&lt;br /&gt;
经过本课程的训练，力求使学生能够熟悉掌握概率的语言，并会利用概率思维来理解客观世界并对其建模，以及驾驭概率的数学工具来分析和求解专业问题。&lt;br /&gt;
&lt;br /&gt;
=== 教材与参考书 Course Materials ===&lt;br /&gt;
* &#039;&#039;&#039;[BT]&#039;&#039;&#039; 概率导论（第2版·修订版），[美]伯特瑟卡斯（Dimitri P.Bertsekas）[美]齐齐克利斯（John N.Tsitsiklis）著，郑忠国 童行伟 译，人民邮电出版社（2022）。&lt;br /&gt;
* &#039;&#039;&#039;[GS]&#039;&#039;&#039; &#039;&#039;Probability and Random Processes&#039;&#039;, by Geoffrey Grimmett and David Stirzaker; Oxford University Press; 4th edition (2020).&lt;br /&gt;
* &#039;&#039;&#039;[MU]&#039;&#039;&#039; &#039;&#039;Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis&#039;&#039;, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).&lt;br /&gt;
&lt;br /&gt;
=== 成绩 Grading Policy ===&lt;br /&gt;
* 课程成绩：本课程将会有若干次作业和一次期末考试。最终成绩将由平时作业成绩和期末考试成绩综合得出。&lt;br /&gt;
* 迟交：如果有特殊的理由，无法按时完成作业，请提前联系授课老师，给出正当理由。否则迟交的作业将不被接受。&lt;br /&gt;
&lt;br /&gt;
=== &amp;lt;font color=red&amp;gt; 学术诚信 Academic Integrity &amp;lt;/font&amp;gt;===&lt;br /&gt;
学术诚信是所有从事学术活动的学生和学者最基本的职业道德底线，本课程将不遗余力的维护学术诚信规范，违反这一底线的行为将不会被容忍。&lt;br /&gt;
&lt;br /&gt;
作业完成的原则：署你名字的工作必须是你个人的贡献。在完成作业的过程中，允许讨论，前提是讨论的所有参与者均处于同等完成度。但关键想法的执行、以及作业文本的写作必须独立完成，并在作业中致谢（acknowledge）所有参与讨论的人。符合规则的讨论与致谢将不会影响得分。不允许其他任何形式的合作——尤其是与已经完成作业的同学“讨论”。&lt;br /&gt;
&lt;br /&gt;
本课程将对剽窃行为采取零容忍的态度。在完成作业过程中，对他人工作（出版物、互联网资料、其他人的作业等）直接的文本抄袭和对关键思想、关键元素的抄袭，按照 [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]的解释，都将视为剽窃。剽窃者成绩将被取消。如果发现互相抄袭行为，&amp;lt;font color=red&amp;gt; 抄袭和被抄袭双方的成绩都将被取消&amp;lt;/font&amp;gt;。因此请主动防止自己的作业被他人抄袭。&lt;br /&gt;
&lt;br /&gt;
学术诚信影响学生个人的品行，也关乎整个教育系统的正常运转。为了一点分数而做出学术不端的行为，不仅使自己沦为一个欺骗者，也使他人的诚实努力失去意义。让我们一起努力维护一个诚信的环境。&lt;br /&gt;
&lt;br /&gt;
= Assignments =&lt;br /&gt;
*[[概率论与数理统计 (Spring 2024)/Problem Set 1|Problem Set 1]]  请在 &amp;lt;strike&amp;gt;2024/4/3&amp;lt;/strike&amp;gt;&amp;lt;font color=red&amp;gt;2024/4/8&amp;lt;/font&amp;gt; 上课之前(2pm UTC+8)提交到 [mailto:pr2024_nju@163.com pr2024_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A1.pdf&amp;lt;/font&amp;gt;&#039;). &lt;br /&gt;
** [[概率论与数理统计 (Spring 2024)/第一次作业提交名单|第一次作业提交名单]]&lt;br /&gt;
*[[概率论与数理统计 (Spring 2024)/Problem Set 2|Problem Set 2]]  请在 &amp;lt;strike&amp;gt;2024/4/29&amp;lt;/strike&amp;gt;&amp;lt;font color=red&amp;gt;2024/5/6&amp;lt;/font&amp;gt; 上课之前(2pm UTC+8)提交到 [mailto:pr2024_nju@163.com pr2024_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A2.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2024)/第二次作业提交名单|第二次作业提交名单]]&lt;br /&gt;
*[[概率论与数理统计 (Spring 2024)/Problem Set 3|Problem Set 3]]  请在 2024/5/20 上课之前(2pm UTC+8)提交到 [mailto:pr2024_nju@163.com pr2024_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A3.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2024)/第三次作业提交名单|第三次作业提交名单]]&lt;br /&gt;
*[[概率论与数理统计 (Spring 2024)/Problem Set 4|Problem Set 4]]  请在 2024/6/12 上课之前(10am UTC+8)提交到 [mailto:pr2024_nju@163.com pr2024_nju@163.com] (文件名为&#039;&amp;lt;font color=red &amp;gt;学号_姓名_A4.pdf&amp;lt;/font&amp;gt;&#039;).&lt;br /&gt;
** [[概率论与数理统计 (Spring 2024)/第四次作业提交名单|第四次作业提交名单]]&lt;br /&gt;
&lt;br /&gt;
= Lectures =&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Intro.pdf 课程简介]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/ProbSpace.pdf 概率空间]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第1章&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 1&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Karger&#039;s min-cut algorithm| Karger&#039;s min-cut algorithm]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/RandVar.pdf 随机变量]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第2章&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 2, Sections 3.1~3.5, 3.7&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 2&#039;&#039;&#039;&lt;br /&gt;
#* the [http://tcs.nju.edu.cn/slides/prob2024/discrete-pmf.nb &#039;&#039;&#039;Mathematica notebook file&#039;&#039;&#039;] for the PMFs of basic discrete distributions&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Average-case analysis of QuickSort|Average-case analysis of &#039;&#039;&#039;&#039;&#039;QuickSort&#039;&#039;&#039;&#039;&#039;]]&lt;br /&gt;
#* [https://www.bilibili.com/video/BV1ta411A7fp/ 高尔顿板（Galton board）视频] 和 [https://en.wikipedia.org/wiki/Galton_board 维基百科页面]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Deviation.pdf 矩与偏差]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 3&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 章节 2.4, 4.2, 4.3, 5.1&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Sections 3.3, 3.6, 7.3&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Two-point sampling|Two-point sampling]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Threshold of k-clique in random graph|Threshold of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-clique in random graph]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Weierstrass Approximation Theorem|Weierstrass approximation]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Continuous.pdf 连续分布]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第3章, 和4.1节&#039;&#039;&#039; 或 &#039;&#039;&#039;[GS] Chapter 4&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapters 8, 9&#039;&#039;&#039;&lt;br /&gt;
#* [https://measure.axler.net/MIRA.pdf Measure, Integration &amp;amp; Real Analysis] by Sheldon Axler&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Convergence.pdf 极限定理]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第5章&#039;&#039;&#039; &lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[GS] Sections 5.7~5.10, 7.1~7.5&#039;&#039;&#039;&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Concentration.pdf 测度集中]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 4&#039;&#039;&#039; and &#039;&#039;&#039;Sections 13.1, 13.4~13.5&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[GS] Sections 5.11, 12.1~12.3, 7.8~7.9&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Hoeffding&#039;s lemma|Hoeffding&#039;s lemma]]&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/Entropy and volume of Hamming balls|Entropy and volume of Hamming balls]]&lt;br /&gt;
# [http://tcs.nju.edu.cn/slides/prob2024/Process.pdf 随机过程]&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[BT] 第6章, 第7章&#039;&#039;&#039;&lt;br /&gt;
#* 阅读：&#039;&#039;&#039;[MU] Chapter 7, Sections 13.1~13.3&#039;&#039;&#039; or &#039;&#039;&#039;[GS] Chapter 6, Sections 12.4~12.5&#039;&#039;&#039;&lt;br /&gt;
#* [[概率论与数理统计 (Spring 2024)/OST and applications|OST and applications]]&lt;br /&gt;
#[[Media:Stat01-2024.pdf |统计、推断与点估计]]、 [[Media:Stat02-2024.pdf |假设检验]]&lt;br /&gt;
&lt;br /&gt;
= Concepts =&lt;br /&gt;
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]&lt;br /&gt;
* Example problems:&lt;br /&gt;
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann&#039;s Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/PageRank &#039;&#039;PageRank&#039;&#039;] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]&lt;br /&gt;
*[https://en.wikipedia.org/wiki/Probability_space Probability space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Sample_space Sample space]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt;-algebra]&lt;br /&gt;
** Kolmogorov&#039;s [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes&#039; law]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence] &lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Random_variable Random variable]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cumulative_distribution_function Cumulative distribution function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_mass_function Probability mass function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Multivariate_random_variable Random vector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Joint_probability_distribution Joint probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution Conditional probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Marginal_distribution Marginal distribution]&lt;br /&gt;
* Some &#039;&#039;&#039;discrete&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trial] and [https://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Discrete uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Binomial_distribution Binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Geometric_distribution Geometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Negative_binomial_distribution Negative binomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Hypergeometric_distribution Hypergeometric distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Poisson_distribution Poisson distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Discrete_distributions others]&lt;br /&gt;
* Balls into bins model&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Multinomial_distribution Multinomial distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Birthday_problem Birthday problem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Coupon_collector%27s_problem Coupon collector]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Balls_into_bins_problem Occupancy problem]&lt;br /&gt;
* Random graphs&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model Erdős–Rényi random graph model]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Galton%E2%80%93Watson_process Galton–Watson branching process]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Expected_value Expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician Law of the unconscious statistician, &#039;&#039;LOTUS&#039;&#039;]&lt;br /&gt;
** [https://dlsun.github.io/probability/linearity.html Linearity of expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_expectation Conditional expectation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Law_of_total_expectation Law of total expectation]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Markov%27s_inequality Markov&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Chebyshev%27s_inequality Chebyshev&#039;s inequality]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment_(mathematics) Moment] related&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Central_moment Central moment]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Variance Variance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Standard_deviation Standard deviation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Covariance Covariance]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation Correlation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Correlation does not imply causation]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skewness Skewness]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Kurtosis Kurtosis]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Moment_problem Moment problem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Measure_(mathematics) Measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Borel_set Borel set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lebesgue_integration Lebesgue integration]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Measurable_function Measurable function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cantor_set Cantor set]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Non-measurable_set Non-measurable set]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Probability_density_function Probability density function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Probability_distribution#Absolutely_continuous_probability_distribution Continuous probability distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Conditional_probability_distribution#Conditional_continuous_distributions Conditional continuous distributions]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convolution_of_probability_distributions Convolution of probability distributions] and [https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions List of convolutions of probability distributions]&lt;br /&gt;
* Some &#039;&#039;&#039;continuous&#039;&#039;&#039; probability distributions&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_uniform_distribution Continuous uniform distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Exponential_distribution Exponential distribution] and [https://en.wikipedia.org/wiki/Poisson_point_process Poisson point process]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Normal_distribution Normal (Gaussian) distribution]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Gaussian_function Gaussian function] and [https://en.wikipedia.org/wiki/Gaussian_integral Gaussian integral]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Standard_normal_table Standard normal table]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule 68–95–99.7 rule]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables Sum of normally distributed random variables]&lt;br /&gt;
*** [https://en.wikipedia.org/wiki/Multivariate_normal_distribution Multivariate normal distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Chi-squared_distribution Chi-squared distribution]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Gamma_distribution Gamma distribution] and [https://en.wikipedia.org/wiki/Gamma_function Gamma function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Beta_distribution Beta distribution] and [https://en.wikipedia.org/wiki/Beta_function Beta function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Cauchy_distribution Cauchy distribution]&lt;br /&gt;
** and [https://en.wikipedia.org/wiki/List_of_probability_distributions#Absolutely_continuous_distributions others]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Inverse_transform_sampling Inverse transform sampling]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Stochastic_dominance Stochastic dominance] and [https://en.wikipedia.org/wiki/Coupling_(probability) Coupling] &lt;br /&gt;
* [https://en.wikipedia.org/wiki/Moment-generating_function Moment-generating function]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Convergence_of_random_variables Convergence of random variables]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Pointwise_convergence Pointwise convergence]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_of_measures#Weak_convergence_of_measures Weak convergence of measures]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Convergence_in_measure Convergence in measure]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Skorokhod%27s_representation_theorem Skorokhod&#039;s representation theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Continuous_mapping_theorem Continuous mapping theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma Borel–Cantelli lemma] and [https://en.wikipedia.org/wiki/Zero%E2%80%93one_law zero-one laws]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Law_of_large_numbers Law of large numbers]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Central_limit_theorem Central limit theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem De Moivre–Laplace theorem]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Berry%E2%80%93Esseen_theorem Berry–Esseen theorem]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory) Characteristic function]&lt;br /&gt;
** [https://en.wikipedia.org/wiki/Lévy%27s_continuity_theorem Lévy&#039;s continuity theorem]&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
	<entry>
		<id>https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12443</id>
		<title>概率论与数理统计 (Spring 2024)/Problem Set 4</title>
		<link rel="alternate" type="text/html" href="https://tcs.nju.edu.cn/wiki/index.php?title=%E6%A6%82%E7%8E%87%E8%AE%BA%E4%B8%8E%E6%95%B0%E7%90%86%E7%BB%9F%E8%AE%A1_(Spring_2024)/Problem_Set_4&amp;diff=12443"/>
		<updated>2024-05-24T13:24:13Z</updated>

		<summary type="html">&lt;p&gt;Zouzongrui: /* Problem 2 (Concentration of measure, 10 points) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;*Every solution must include the complete derivation; you may write in either Chinese or English.&lt;br /&gt;
&lt;br /&gt;
*We recommend typesetting your homework with LaTeX, Markdown, or similar tools.&lt;br /&gt;
&lt;br /&gt;
*Bonus problems are optional extra-credit problems.&lt;br /&gt;
&lt;br /&gt;
== Assumption throughout Problem Set 4==&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we work on the probability space &amp;lt;math&amp;gt;(\Omega,\mathcal{F},\mathbf{Pr})&amp;lt;/math&amp;gt;.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;Unless stated otherwise, we assume that the expectations of all random variables are well-defined.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;The term &amp;lt;math&amp;gt;\log&amp;lt;/math&amp;gt; used in this context refers to the natural logarithm.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 1 (Continuous Random Variables, 30 points)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Density function&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Determine the value of &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; such that &amp;lt;math&amp;gt;f(x) = C\exp(-x-e^{-x}), x\in \mathbb{R}&amp;lt;/math&amp;gt; is a probability density function (PDF) for a continuous random variable.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Independence&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt; and probability density function (PDF) &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Find out the density functions of &amp;lt;math&amp;gt;V = \max\{X,Y\}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U = \min\{X,Y\}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Correlation&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be uniformly distributed on &amp;lt;math&amp;gt;(-1,1)&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_k = \cos(k \pi X)&amp;lt;/math&amp;gt; for &amp;lt;math&amp;gt;k=1,2,\ldots,n&amp;lt;/math&amp;gt;. Are the random variables &amp;lt;math&amp;gt;Y_1, Y_2, \ldots, Y_n&amp;lt;/math&amp;gt; correlated? Are they independent? Prove your claims rigorously.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Expectation of random variables (I)&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a continuous random variable with mean &amp;lt;math&amp;gt;\mu&amp;lt;/math&amp;gt; and cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\int_{0}^a F(x) dx = \int_{a}^{\infty} [1-F(x)] dx&amp;lt;/math&amp;gt; if and only if &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Suppose &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; has finite variance. Show that &amp;lt;math&amp;gt;g(a) = \mathbb{E}((X-a)^2)&amp;lt;/math&amp;gt; achieves the minimum when &amp;lt;math&amp;gt;a = \mu&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Expectation of random variables (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be two independent and identically distributed continuous random variables with cumulative distribution function (CDF) &amp;lt;math&amp;gt;F&amp;lt;/math&amp;gt;. Furthermore, &amp;lt;math&amp;gt;X,Y \ge 0&amp;lt;/math&amp;gt;. Show that &amp;lt;math&amp;gt;\mathbb{E}[|X-Y|] = 2 \left(\mathbb{E}[X] - \int_{0}^{\infty} (1-F(x))^2 dx\right)&amp;lt;/math&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Conditional distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; be two random variables. The joint density of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; is given by &amp;lt;math&amp;gt;f(x,y) = c(x^2 - y^2)e^{-x}&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;0\leq x &amp;lt;\infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;-x\leq y \leq x&amp;lt;/math&amp;gt;. Here, &amp;lt;math&amp;gt;c\in \mathbb{R}_+&amp;lt;/math&amp;gt; is a constant. Find out the conditional distribution of &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt;, given &amp;lt;math&amp;gt;X = x&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;P_i = (X_i,Y_i), 1\leq i\leq n&amp;lt;/math&amp;gt;, be independent, uniformly distributed points in the unit square &amp;lt;math&amp;gt;[0,1]^2&amp;lt;/math&amp;gt;. A point &amp;lt;math&amp;gt;P_i&amp;lt;/math&amp;gt; is called &amp;quot;peripheral&amp;quot; if, for all &amp;lt;math&amp;gt;r = 1,2,\cdots,n&amp;lt;/math&amp;gt;, either &amp;lt;math&amp;gt;X_r \leq X_i&amp;lt;/math&amp;gt; or &amp;lt;math&amp;gt;Y_r \leq Y_i&amp;lt;/math&amp;gt;, or both. Find out the expected number of peripheral points.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Uniform Distribution (II)&amp;lt;/strong&amp;gt;] Derive the moment generating function of the standard uniform distribution, i.e., uniform distribution on &amp;lt;math&amp;gt;(0,1)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Exponential distribution&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have an exponential distribution. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X&amp;gt;s+x|X&amp;gt;s] = \textbf{Pr}[X&amp;gt;x]&amp;lt;/math&amp;gt;, for &amp;lt;math&amp;gt;x,s\geq 0&amp;lt;/math&amp;gt;. This is the memoryless property. Show that the exponential distribution is the only continuous distribution with this property.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X,Y\sim N(0,1)&amp;lt;/math&amp;gt; be two independent and identically distributed normal random variables. Let &amp;lt;math&amp;gt;Z = X-Y&amp;lt;/math&amp;gt;. Find the density functions of &amp;lt;math&amp;gt;Z&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;|Z|&amp;lt;/math&amp;gt; respectively.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Normal distribution (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; have the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution and let &amp;lt;math&amp;gt;a&amp;gt;0&amp;lt;/math&amp;gt;. Show that the random variable &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; given by&lt;br /&gt;
&amp;lt;math&amp;gt;\begin{equation*}&lt;br /&gt;
Y = \begin{cases}&lt;br /&gt;
X, &amp;amp; |X|&amp;lt; a \\&lt;br /&gt;
-X, &amp;amp; |X|\geq a&lt;br /&gt;
\end{cases}&lt;br /&gt;
\end{equation*}&amp;lt;/math&amp;gt;&lt;br /&gt;
has the &amp;lt;math&amp;gt;N(0,1)&amp;lt;/math&amp;gt; distribution, and find an expression for &amp;lt;math&amp;gt;\rho(a) = \textbf{Cov}(X,Y)&amp;lt;/math&amp;gt; in terms of the density function &amp;lt;math&amp;gt;\phi&amp;lt;/math&amp;gt; of &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;Random process&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Given a real number &amp;lt;math&amp;gt;U&amp;lt;1&amp;lt;/math&amp;gt; as input to the following process, find the expected return value.&lt;br /&gt;
{{Theorem|&#039;&#039;Process&#039;&#039;|&lt;br /&gt;
:&#039;&#039;&#039;Input:&#039;&#039;&#039;  a real number &amp;lt;math&amp;gt;U &amp;lt; 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
----&lt;br /&gt;
:initialize &amp;lt;math&amp;gt;x = 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = 0&amp;lt;/math&amp;gt;;&lt;br /&gt;
:while &amp;lt;math&amp;gt; x &amp;lt; U &amp;lt;/math&amp;gt; do&lt;br /&gt;
:* choose &amp;lt;math&amp;gt;y \in (0,1)&amp;lt;/math&amp;gt; uniformly at random;&lt;br /&gt;
:* update &amp;lt;math&amp;gt;x = x + y&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;count = count + 1&amp;lt;/math&amp;gt;;&lt;br /&gt;
:return &amp;lt;math&amp;gt;count&amp;lt;/math&amp;gt;;&lt;br /&gt;
}}&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Random semicircle&amp;lt;/strong&amp;gt;] We sample &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; points within a circle &amp;lt;math&amp;gt;C=\{(x,y) \in \mathbb{R}^2 \mid x^2+y^2 \le 1\}&amp;lt;/math&amp;gt; independently and uniformly at random (i.e., the density function &amp;lt;math&amp;gt;f(x,y) \propto 1_{(x,y) \in C}&amp;lt;/math&amp;gt;). Find out the probability that they all lie within some semicircle of the original circle &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt;. (Hint: you may apply the technique of change of variables, see [https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables function of random variables] or Chapter 4.7 in [GS])&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Stochastic domination&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X, Y&amp;lt;/math&amp;gt; be continuous random variables. Show that &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; dominates &amp;lt;math&amp;gt;Y&amp;lt;/math&amp;gt; stochastically if and only if &amp;lt;math&amp;gt;\mathbb{E}[f(X)]\geq \mathbb{E}[f(Y)]&amp;lt;/math&amp;gt; for any non-decreasing function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; for which the expectations exist.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
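The expected return value of the &#039;&#039;Process&#039;&#039; above has a classical closed form: e^U for U between 0 and 1 (and 0 for nonpositive U, since the loop never runs). The sketch below, offered only as a sanity check and not as part of the assignment (the helper names are mine), transcribes the pseudocode directly and compares the empirical mean against that closed form:

```python
import math
import random

def process(u):
    # Direct transcription of the pseudocode: keep adding independent
    # Uniform(0,1) draws to x until x reaches u, then return the
    # number of draws used.
    x, count = 0.0, 0
    while u > x:
        x += random.random()
        count += 1
    return count

def mean_count(u, trials=100_000):
    # Monte Carlo estimate of the expected return value.
    return sum(process(u) for _ in range(trials)) / trials

# The empirical mean should be close to exp(u) for u in (0,1).
for u in (0.25, 0.5, 0.9):
    print(u, round(mean_count(u), 3), round(math.exp(u), 3))
```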
&lt;br /&gt;
== Problem 2 (Concentration of measure, 10 points) ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Tossing coins&amp;lt;/strong&amp;gt;] We repeatedly toss a fair coin (with an equal probability of heads and tails). Let the random variable &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of throws required to obtain a total of &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; heads. Show that &amp;lt;math&amp;gt;\textbf{Pr}[X &amp;gt; 2n + 2\sqrt{n\log n}]\leq O(1/n)&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Chernoff vs Chebyshev&amp;lt;/strong&amp;gt;] We have a standard six-sided die. Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be the number of times a 6 occurs in &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; throws of the die. Compare the best upper bounds on &amp;lt;math&amp;gt;\textbf{Pr}[X\geq n/4]&amp;lt;/math&amp;gt; that you can obtain using Chebyshev&#039;s inequality and Chernoff bounds.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt; be a random variable with expectation &amp;lt;math&amp;gt;0&amp;lt;/math&amp;gt; whose moment generating function &amp;lt;math&amp;gt;\mathbf{E}[\exp(t|X|)]&amp;lt;/math&amp;gt; is finite for some &amp;lt;math&amp;gt; t &amp;gt; 0 &amp;lt;/math&amp;gt;. We can use the following two kinds of tail inequalities for &amp;lt;math&amp;gt; X &amp;lt;/math&amp;gt;: &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;nowiki&amp;gt;     &amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&#039;&#039;Chernoff Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;&amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-Moment Bound&#039;&#039;&#039;&#039;&#039;&lt;br /&gt;
:&amp;lt;math&amp;gt;&lt;br /&gt;
\begin{align}&lt;br /&gt;
\mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k}&lt;br /&gt;
\end{align}&lt;br /&gt;
&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Show that for each &amp;lt;math&amp;gt;\delta&amp;lt;/math&amp;gt;, there exists a choice of &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; such that the &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;th-moment bound is no weaker than the Chernoff bound. (Hint: Use the probabilistic method.)&lt;br /&gt;
# Why would we still prefer the Chernoff bound to the (seemingly) stronger &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt;-th moment bound?&lt;br /&gt;
&lt;br /&gt;
* [&#039;&#039;&#039;Cut size in random graph&#039;&#039;&#039;] Show that with probability at least &amp;lt;math&amp;gt;2/3&amp;lt;/math&amp;gt;, the size of the max-cut in [[wikipedia:Erdős–Rényi_model#Definition|Erdős–Rényi random graph]] &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; is at most &amp;lt;math&amp;gt;n^2/8 + O(n^{1.5})&amp;lt;/math&amp;gt;. In the &amp;lt;math&amp;gt;G(n,1/2)&amp;lt;/math&amp;gt; model, each edge is included in the graph with probability &amp;lt;math&amp;gt;1/2&amp;lt;/math&amp;gt;, independently of every other edge.&lt;br /&gt;
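For the &#039;&#039;&#039;Chernoff vs Chebyshev&#039;&#039;&#039; item above, both bounds admit closed forms: with success probability p = 1/6 and threshold a = 1/4, Chebyshev gives Var(X)/((a-p)n)^2, while carrying out the minimization over t in the Chernoff bound yields exp(-n KL(a || p)), where KL is the binomial relative entropy. The sketch below evaluates both numerically; the function names are mine and this standard computation is not a substitute for the required derivation:

```python
import math

def chebyshev_bound(n, p=1/6, a=1/4):
    # Pr[X >= a*n] is at most Var(X) / ((a-p)*n)^2 by Chebyshev,
    # since reaching a*n means deviating by (a-p)*n from the mean p*n.
    return p * (1 - p) * n / ((a - p) * n) ** 2

def chernoff_bound(n, p=1/6, a=1/4):
    # Optimized Chernoff bound for a binomial tail: exp(-n * KL(a || p)),
    # valid when a > p.
    kl = a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))
    return math.exp(-n * kl)

# Chebyshev decays like 1/n; Chernoff decays exponentially in n.
for n in (100, 1000):
    print(n, chebyshev_bound(n), chernoff_bound(n))
```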
&lt;br /&gt;
== Problem 3 (Modes of Convergence, 15 points) (&amp;lt;strong&amp;gt;Bonus problem&amp;lt;/strong&amp;gt;)==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Connection of convergence modes (I)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, (Y_n)_{n \ge 1}, X, Y&amp;lt;/math&amp;gt; be random variables and &amp;lt;math&amp;gt;c\in\mathbb{R}&amp;lt;/math&amp;gt; be a real number. &lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Suppose &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} c&amp;lt;/math&amp;gt;. Prove that &amp;lt;math&amp;gt;X_nY_n \overset{D}{\to} cX&amp;lt;/math&amp;gt;. &lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Construct an example such that &amp;lt;math&amp;gt;X_n \overset{D}{\to} X&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;Y_n \overset{D}{\to} Y&amp;lt;/math&amp;gt; but &amp;lt;math&amp;gt;X_nY_n&amp;lt;/math&amp;gt; does not converge to &amp;lt;math&amp;gt;XY&amp;lt;/math&amp;gt; in distribution.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; [&amp;lt;strong&amp;gt;Connection of convergence modes (II)&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;(X_n)_{n \ge 1}, X&amp;lt;/math&amp;gt; be random variables. Prove that &amp;lt;math&amp;gt;X_n \overset{P}{\to} X&amp;lt;/math&amp;gt; if and only if for every subsequence &amp;lt;math&amp;gt;X_{n(m)}&amp;lt;/math&amp;gt;, there exists a further subsequence &amp;lt;math&amp;gt;Y_k = X_{n(m_k)}&amp;lt;/math&amp;gt; that converges almost surely to &amp;lt;math&amp;gt;X&amp;lt;/math&amp;gt;. (Hint: you may use the first Borel-Cantelli lemma.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Extension of Borel-Cantelli Lemma&amp;lt;/strong&amp;gt;]&lt;br /&gt;
Let &amp;lt;math&amp;gt;(A_n)_{n \ge 1}&amp;lt;/math&amp;gt; be events. Suppose &amp;lt;math&amp;gt;\sum_{n \ge 1} \mathbf{Pr}(A_n)=+\infty&amp;lt;/math&amp;gt;. Show that &lt;br /&gt;
&amp;lt;math&amp;gt;\mathbf{Pr}(A_n \text{ i.o.}) \ge \limsup_{n \to \infty} \frac{ \left(\sum_{k=1}^n\mathbf{Pr}(A_k)\right)^2 }{\sum_{1\le j,k \le n} \mathbf{Pr}(A_j \cap A_k)}&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Problem 4 (LLN and CLT, 15 points + 5 points) ==&lt;br /&gt;
&amp;lt;strong&amp;gt;In this problem, you may apply the results of Laws of Large Numbers (LLN) and the Central Limit Theorem (CLT) to solve the problems.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[&amp;lt;strong&amp;gt;St. Petersburg paradox&amp;lt;/strong&amp;gt;] Consider the well-known game involving a fair coin. In this game, if it takes &amp;lt;math&amp;gt;k&amp;lt;/math&amp;gt; tosses to obtain a head, you will win &amp;lt;math&amp;gt;2^k&amp;lt;/math&amp;gt; dollars as the reward. Despite the game&#039;s expected reward being infinite, people tend to offer relatively modest amounts to participate. The following provides a mathematical explanation for this phenomenon.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
For each &amp;lt;math&amp;gt;n \ge 1&amp;lt;/math&amp;gt;, let &amp;lt;math&amp;gt;X_{n,1}, X_{n,2},\ldots, X_{n,n}&amp;lt;/math&amp;gt; be independent random variables. Furthermore, let &amp;lt;math&amp;gt;b_n &amp;gt; 0&amp;lt;/math&amp;gt; be real numbers with &amp;lt;math&amp;gt;b_n \to \infty&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\widetilde{X}_{n,k} = X_{n,k} \mathbf{1}_{|X_{n,k}| \le b_n}&amp;lt;/math&amp;gt; for all &amp;lt;math&amp;gt;1 \le k \le n&amp;lt;/math&amp;gt;. Show that if &amp;lt;math&amp;gt;\sum_{k=1}^n \mathbf{Pr}(|X_{n,k}| &amp;gt; b_n) \to 0&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;b_n^{-2} \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}^2] \to 0&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, then &amp;lt;math&amp;gt;(S_n-a_n)/b_n \overset{P}{\to} 0 &amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{k=1}^n X_{n,k}&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;a_n = \sum_{k=1}^n \mathbf{E}[\widetilde{X}_{n,k}]&amp;lt;/math&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt;\frac{S_n}{n \log_2 n} \overset{P}{\to} 1&amp;lt;/math&amp;gt;. (Therefore, a fair price to play this game &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; times is roughly &amp;lt;math&amp;gt;n \log_2 n&amp;lt;/math&amp;gt; dollars)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt; (&amp;lt;strong&amp;gt;Bonus problem, 5 points&amp;lt;/strong&amp;gt;)&lt;br /&gt;
Let &amp;lt;math&amp;gt;S_n&amp;lt;/math&amp;gt; be the total winnings after playing &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; rounds of the game. Prove that &amp;lt;math&amp;gt; \limsup_{n \to \infty} \frac{S_n}{n \log_2 n} = \infty&amp;lt;/math&amp;gt; almost surely. (Hint: You may use Borel-Cantelli lemmas)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Monte Carlo Integration&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; be a continuous function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;U_1,U_2,\ldots,U_n \sim U(0,1)&amp;lt;/math&amp;gt; be independent random variables. Show that &amp;lt;math&amp;gt;I = \frac{1}{n}\sum_{i=1}^n f(U_i) \to \int_0^1 f(x) \mathrm{d}x&amp;lt;/math&amp;gt; in probability. (Remark: This holds as long as &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; is a measurable function on &amp;lt;math&amp;gt;[0,1]&amp;lt;/math&amp;gt; with &amp;lt;math&amp;gt;\int_0^1 |f(x)| \mathrm{d}x &amp;lt; \infty&amp;lt;/math&amp;gt;.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
[&amp;lt;strong&amp;gt;Square root&amp;lt;/strong&amp;gt;] Let &amp;lt;math&amp;gt;X_1,X_2,\ldots&amp;lt;/math&amp;gt; be i.i.d. random variables with &amp;lt;math&amp;gt;X_i \ge 0&amp;lt;/math&amp;gt;, &amp;lt;math&amp;gt;\mathbf{E}[X_1] = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\mathbf{Var}[X_1] = \sigma^2 \in (0,+\infty)&amp;lt;/math&amp;gt;. Show &amp;lt;math&amp;gt;\sqrt{S_n} - \sqrt{n} \overset{D}{\to} N(0,\sigma^2/4)&amp;lt;/math&amp;gt; as &amp;lt;math&amp;gt;n \to \infty&amp;lt;/math&amp;gt;, where &amp;lt;math&amp;gt;S_n = \sum_{i=1}^n X_i&amp;lt;/math&amp;gt;. (Hint: You may use the statement in Problem 3.)&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Zouzongrui</name></author>
	</entry>
</feed>