数据科学基础 (Fall 2024): Difference between revisions

From TCS Wiki
 
(6 intermediate revisions by 2 users not shown)
Line 31: Line 31:
|header6 =  
|header6 =  
|label6  = Email
|label6  = Email
|data6  = TBA
|data6  = lmm@nju.edu.cn
|header7 =
|header7 =
|label7  = office
|label7  = office
Line 40: Line 40:
|header9 =
|header9 =
|label9  = Class meeting
|label9  = Class meeting
|data9  = Friday, 9am-12pm <br/>南雍-西311
|data9  = Friday, 9am-12pm <br/>南雍-西110
|header10=
|header10=
|label10 = Office hour
|label10 = Office hour
Line 72: Line 72:


= Announcement =
= Announcement =
* TBA
* Because enrollment exceeds the capacity of the original classroom, class meets <strong style="color:red;">in 西110</strong> starting September 13.


= Course info =
= Course info =
* '''Instructor ''':  
* '''Instructor ''':  
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]:[mailto:yinyt@nju.edu.cn <yinyt@nju.edu.cn>],计算机学院 804  
:* [http://tcs.nju.edu.cn/yinyt/ 尹一通]:[mailto:yinyt@nju.edu.cn <yinyt@nju.edu.cn>],计算机学院 804  
:* [https://liumingmou.github.io 刘明谋]:[mailto:TBA <TBA>],南雍-东207
:* [https://liumingmou.github.io 刘明谋]:[mailto:lmm@nju.edu.cn <lmm@nju.edu.cn>],南雍-东207
* '''Teaching assistant''':
* '''Teaching assistant''':
** TBA
** TBA
* '''Class meeting''':
* '''Class meeting''':
** Friday: 9am-12pm, 南雍-西311
** Friday: 9am-12pm, 南雍-<s>西311</s> <strong>西110</strong>
* '''Office hour''':  
* '''Office hour''':  
:* Thursday: 6pm-8pm, 南雍-东207 (刘明谋)
:* Thursday: 6pm-8pm, 南雍-东207 (刘明谋)
Line 108: Line 108:


= Assignments =
= Assignments =
*TBA
*[[数据科学基础 (Fall 2024)/Problem Set 1|Problem Set 1]]: submit to [mailto:pr2024_nju@163.com pr2024_nju@163.com] before class on 2024/10/11 (9:00 UTC+8). Name the file '<font color=red >学号_姓名_FA1.pdf</font>', where 学号 is your student ID and 姓名 is your name.


= Lectures =
= Lectures =
Line 114: Line 114:
# [http://tcs.nju.edu.cn/slides/FDS2024/ProbSpace.pdf Probability space]
# [http://tcs.nju.edu.cn/slides/FDS2024/ProbSpace.pdf Probability space]
#* Reading: '''[BT] Chapter 1'''
#* Reading: '''[BT] Chapter 1'''
# [http://tcs.nju.edu.cn/slides/FDS2024/RandVar.pdf Random variables]
#* Reading: '''[BT] Chapter 2'''
#* Reading: '''[MU] Chapter 2'''
#* [[数据科学基础 (Fall 2024)/Volume of Hamming balls|Volume of Hamming balls]]


= Concepts =
= Concepts =
TBA
* [https://plato.stanford.edu/entries/probability-interpret/ Interpretations of probability]
* [https://en.wikipedia.org/wiki/History_of_probability History of probability]
* Example problems:
** [https://dornsifecms.usc.edu/assets/sites/520/docs/VonNeumann-ams12p36-38.pdf von Neumann's Bernoulli factory] and other [https://peteroupc.github.io/bernoulli.html Bernoulli factory algorithms]
** [https://en.wikipedia.org/wiki/Boy_or_Girl_paradox Boy or Girl paradox]
** [https://en.wikipedia.org/wiki/Monty_Hall_problem Monty Hall problem]
** [https://en.wikipedia.org/wiki/Bertrand_paradox_(probability) Bertrand paradox]
** [https://en.wikipedia.org/wiki/Hard_spheres Hard spheres model] and [https://en.wikipedia.org/wiki/Ising_model Ising model]
** [https://en.wikipedia.org/wiki/PageRank ''PageRank''] and stationary [https://en.wikipedia.org/wiki/Random_walk random walk]
** [https://en.wikipedia.org/wiki/Diffusion_process Diffusion process] and [https://en.wikipedia.org/wiki/Diffusion_model diffusion model]
*[https://en.wikipedia.org/wiki/Probability_space Probability space]
** [https://en.wikipedia.org/wiki/Sample_space Sample space]
** [https://en.wikipedia.org/wiki/Event_(probability_theory) Event] and [https://en.wikipedia.org/wiki/Σ-algebra <math>\sigma</math>-algebra]
** Kolmogorov's [https://en.wikipedia.org/wiki/Probability_axioms axioms of probability]
* [https://en.wikipedia.org/wiki/Discrete_uniform_distribution Classical] and [https://en.wikipedia.org/wiki/Geometric_probability geometric probability]
* [https://en.wikipedia.org/wiki/Boole%27s_inequality Union bound]
** [https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle Inclusion-Exclusion principle]
** [https://en.wikipedia.org/wiki/Boole%27s_inequality#Bonferroni_inequalities Bonferroni inequalities]
* [https://en.wikipedia.org/wiki/Conditional_probability Conditional probability]
** [https://en.wikipedia.org/wiki/Chain_rule_(probability) Chain rule]
** [https://en.wikipedia.org/wiki/Law_of_total_probability Law of total probability]
** [https://en.wikipedia.org/wiki/Bayes%27_theorem Bayes' law]
* [https://en.wikipedia.org/wiki/Independence_(probability_theory) Independence]
** [https://en.wikipedia.org/wiki/Pairwise_independence Pairwise independence]

Latest revision as of 07:38, 22 September 2024

数据科学基础
Foundations of Data Science
Instructor
尹一通
Email yinyt@nju.edu.cn
office 计算机学院 804
刘明谋
Email lmm@nju.edu.cn
office 南雍-东207
Class
Class meeting Friday, 9am-12pm
南雍-西110
Office hour Thursday, 6pm-8pm
南雍-东207
Textbook
概率导论(第2版·修订版)
Dimitri P. Bertsekas and John N. Tsitsiklis
郑忠国 童行伟 译;人民邮电出版社 (2022)
Probability and Computing (2E)
Michael Mitzenmacher and Eli Upfal
Cambridge University Press (2017)
Foundations of Data Science
Avrim Blum, John Hopcroft, Ravi Kannan
Cambridge University Press (2020)

This is the webpage for the Foundations of Data Science (数据科学基础) class of Fall 2024. Students who take this class should check this page periodically for content updates and new announcements.

Announcement

  • Because enrollment exceeds the capacity of the original classroom, class meets in 西110 starting September 13.

Course info

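  • Instructor:
    • 尹一通: yinyt@nju.edu.cn, 计算机学院 804
    • 刘明谋: lmm@nju.edu.cn, 南雍-东207
  • Teaching assistant:
    • TBA
  • Class meeting:
    • Friday: 9am-12pm, 南雍-西110 (moved from 西311)
  • Office hour:
    • Thursday: 6pm-8pm, 南雍-东207 (刘明谋)
  • QQ group: 302709730 (requests to join must include your name, department, and student ID)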

Syllabus

TBA

教材与参考书 Course Materials

  • [BT] 概率导论(第2版·修订版) (Introduction to Probability, 2nd edition, revised), by Dimitri P. Bertsekas and John N. Tsitsiklis; translated by 郑忠国 and 童行伟; 人民邮电出版社 (Posts & Telecom Press), 2022.
  • [MU] Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis, by Michael Mitzenmacher, Eli Upfal; Cambridge University Press; 2nd edition (2017).
  • [BHK] Foundations of Data Science, by Avrim Blum, John Hopcroft, and Ravindran Kannan; Cambridge University Press (2020).

成绩 Grading Policy

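  • Course grade: there will be several homework assignments and one final exam. The final grade combines the homework scores and the final-exam score.
  • Late submission: if special circumstances prevent you from finishing an assignment on time, contact the instructor in advance with a valid reason. Otherwise, late homework will not be accepted.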

学术诚信 Academic Integrity

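Academic integrity is the most basic standard of professional ethics for every student and scholar engaged in academic work. This course will spare no effort to uphold it, and violations will not be tolerated.

Principle for completing homework: work submitted under your name must be your own contribution. Discussion is allowed while working on an assignment, provided that all participants are at the same stage of completion. However, you must carry out the key ideas and write up the solution independently, and you must acknowledge in your submission everyone who took part in the discussion. Discussion and acknowledgment that follow these rules will not affect your score. No other form of collaboration is allowed, especially "discussing" with classmates who have already finished the assignment.

This course takes a zero-tolerance stance on plagiarism. When completing an assignment, directly copying text from others' work (publications, Internet material, other people's homework, etc.) and copying key ideas or key elements are both treated as plagiarism, as interpreted by the ACM Policy on Plagiarism. Plagiarists will have their grades cancelled. If mutual copying is discovered, the grades of both the copier and the person copied from will be cancelled. Please therefore take active steps to keep your homework from being copied.

Academic integrity shapes a student's personal character and also bears on the proper functioning of the whole education system. Committing academic misconduct for a few points not only makes you a cheater but also renders the honest efforts of others meaningless. Let us work together to maintain an environment of integrity.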

Assignments

Lectures

  1. Course introduction
  2. Probability space
    • Reading: [BT] Chapter 1
  3. Random variables

Concepts