高级算法 (Fall 2021)/Problem Set 1 and Assignment 1, Fall 2021: Difference between pages

From TCS Wiki
(Difference between pages)
Jump to navigation Jump to search
imported>Etone
No edit summary
 
imported>Etone
(Created page with "==Question #1== 请使用多年(>20年)的NCEP/NCAR月平均再分析资料,画出各季节(至少画出冬夏两季)纬向平均温度场、纬向平均纬向风...")
 
Line 1: Line 1:
*每道题目的解答都要有<font color="red" size=5>完整的解题过程</font>。中英文不限。
==Question #1==


== Problem 1 ==
请使用多年(>20年)的NCEP/NCAR月平均再分析资料,画出各季节(至少画出冬夏两季)纬向平均温度场、纬向平均纬向风场的高度-纬度剖面分布,并简述其分布特征和季节变化特征。
Recall that in class we show by the probabilistic method how to deduce a <math>\frac{n(n-1)}{2}</math> upper bound on the number of distinct min-cuts in any multigraph <math>G</math> with <math>n</math> vertices from the <math>\frac{2}{n(n-1)}</math> lower bound for success probability of Karger's min-cut algorithm.


Also recall that the <math>FastCut</math> algorithm taught in class guarantees to return a min-cut with probability at least <math>\Omega(1/\log n)</math>. Does this imply a much tighter <math>O(\log n)</math> upper bound on the number of distinct min-cuts in any multigraph <math>G</math> with <math>n</math> vertices? Prove your improved upper bound if your answer is "yes", and give a satisfactory explanation if your answer is "no".
==Question #2==


== Problem 2 ==
请使用多年(>20年)的NCEP/NCAR月平均再分析资料,画出各季节(至少画出冬夏两季)温度场、纬向风场在各高度(850、500、100 hPa, 对于温度场请再画出1000hPa)上的水平分布,并简述其分布特征和季节变化特征。


Consider the function <math>f:\mathbb{R}^n\to\mathbb{R}</math> defined as
==On the netcdf file==


:<math>f(\vec x)=f(x_1,x_2,\dots,x_n)=\prod_{i=1}^{n}(a_ix_i-b_i)</math>,
Netcdf (Network Common Data Form) is a set of self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. The project homepage is hosted by the [https://www.unidata.ucar.edu/software/netcdf Unidata program] at the University Corporation for Atmospheric Research (UCAR). They are also the chief source of netCDF software, standards development, updates etc.。
[http://www.esrl.noaa.gov/psd/data/gridded/tools.html 读取netcdf文件的几种方法]


where <math>\{a_i\}_{1\le i\le n}</math> and <math>\{b_i\}_{1\le i\le n}</math> are '''unknown''' coefficients satisfy that <math>a_i, b_i\in \mathbb{Z}</math> and <math>0\le a_i, b_i \le n</math> for all <math>1\le i\le n</math>.
==Read .nc using Matlab==
<font color="red"> Matlab 2008b </font> 开始,Matlab 开始提供自带的读写.nc文件的函数。以下为读写.nc的一个样本matlab文件。对于图片,Matlab提供多种形式的存储。可在弹出图片的窗口左上方的下拉菜单中选择图片的存储方式,或者在命令行后用“print”命令存储图片(可以输入 help print 获得帮助)。关于Matlab的更多信息,可以去[http://www.mathworks.com/help Matlab网站]。


Let <math>p>n</math> be the smallest prime strictly greater than <math>n</math>. The function <math>g:\mathbb{Z}_p^n\to\mathbb{Z}_p</math> is defined as
==下载区==
本次作业所用的数据资料来自NCEP-DOE Reanalysis 2 的月平均资料,时间跨度为1979/1-2009/12。关于本资料的详细内容,请[http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis2.pressure.html 参见网站]。 以下为完成本次作业所需的数据文件和matlab文件。


:<math>g(\vec x)=g(x_1,x_2,\dots,x_n)=\prod_{i=1}^{n}(a_ix_i-b_i)</math>,
NCEP-DOE 月平均温度场资料 (1979/1-2009/12): [http://tcs.nju.edu.cn/yzhang/air.mon.mean.nc nc file],[http://tcs.nju.edu.cn/yzhang/air_readme.txt 说明文件]


where <math>+</math> and <math>\cdot</math> are defined over the finite field <math>\mathbb{Z}_p</math>.
NCEP-DOE 月平均纬向风(U)资料 (1979/1-2009/12):  [http://tcs.nju.edu.cn/yzhang/uwnd.mon.mean.nc nc file],[http://tcs.nju.edu.cn/yzhang/uwnd_readme.txt 说明文件]


By the properties of finite field, for any value <math>\vec r\in\mathbb{Z}_p^n</math>, it holds that <math>g(\vec r)=f(\vec r)\bmod p</math>.
NCEP-land mask(用于在题目2中画出具体的海陆分布): [http://tcs.nju.edu.cn/yzhang/land.nc nc文件]


Since the coefficients <math>\{a_i\}_{1\le i\le n}</math> and <math>\{b_i\}_{1\le i\le n}</math> are unknown, you can't calculate <math>f(\vec x)</math> directly. However, there exists an oracle <math>O</math>, each time <math>O</math> gets an input <math>\vec x</math>, it immediately outputs the value of <math>g(\vec x)</math>.
题目1和2的Matlab 样本文件: [http://tcs.nju.edu.cn/yzhang/sample_for_Q1.m 第一题m文件], [http://tcs.nju.edu.cn/yzhang/sample_for_Q2.m 第二题m文件]
 
1. Prove that <math>f\not\equiv 0 \Rightarrow g\not\equiv 0</math>.
 
2. Use the oracle <math>O</math> to design an algorithm to determine whether <math>f\equiv 0</math>, with error probability at most <math>\epsilon</math>, where <math>\epsilon\in (0,1)</math> is a constant.
 
== Problem 3 ==
Let <math>X_1,X_2,\ldots,X_n</math> be <math>n</math> random variables, where each <math>X_i \in \{0, 1\}</math> follows the distribution <math>\mu_i</math>. For each <math>1\leq i \leq n</math>, let <math>\rho_i = \mathbb{E}[X_i]</math> and assume <math>\rho_i \geq \frac{1}{2}</math>. Consider the problem of estimating the value of
:<math>Z = \prod_{i = 1}^n \rho_i</math>.
For each <math>1\leq  i \leq n</math>, the algorithm draws <math>s</math> random samples <math>X_i^{(1)},X_i^{(2)},\ldots,X_i^{(s)}</math> independently from the distribution <math>\mu_i</math>, and computes
:<math>\widehat{\rho}_{i}=\frac{1}{s}\sum_{j=1}^s X_i^{(j)}</math>.
Finally, the algorithm outputs the product of all <math>\widehat{Z}_{i}</math>:
:<math>\widehat{Z}=\prod_{i= 1}^n\widehat{\rho}_i</math>.
Express <math>s</math> as a function of <math>n,\varepsilon,\delta</math> so that the output <math>\widehat{Z}</math> satisfies
:<math>\Pr\left[(1 - \varepsilon) Z \leq \widehat{Z} \leq (1 + \varepsilon)Z\right] \geq 1- \delta</math>.
Try to make <math>s</math> as small as possible.
 
== Problem 4 ==
Suppose there is a coin <math> C </math>.
During each query, it outputs HEAD with probability <math>p</math> and TAIL with probability <math>1-p</math>, where <math> p \in (0, 1) </math> is a real number.
* Let <math> q \in (0, 1) </math> be another real number. Design an algorithm that outputs HEAD with probability <math>q</math> and TAIL with probability <math>1-q</math>. There is no other random sources for your algorithm except the coin <math>C</math>. Make sure that your algorithm halts with probability <math>1</math>.
* What is the expected number of queries for the coin <math>C</math> that your algorithm use before it halts?

Latest revision as of 05:33, 22 September 2021

Question #1

请使用多年(>20年)的NCEP/NCAR月平均再分析资料,画出各季节(至少画出冬夏两季)纬向平均温度场、纬向平均纬向风场的高度-纬度剖面分布,并简述其分布特征和季节变化特征。

Question #2

请使用多年(>20年)的NCEP/NCAR月平均再分析资料,画出各季节(至少画出冬夏两季)温度场、纬向风场在各高度(850、500、100 hPa, 对于温度场请再画出1000hPa)上的水平分布,并简述其分布特征和季节变化特征。

On the netcdf file

Netcdf (Network Common Data Form) is a set of self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. The project homepage is hosted by the Unidata program at the University Corporation for Atmospheric Research (UCAR). They are also the chief source of netCDF software, standards development, updates etc.。 读取netcdf文件的几种方法

Read .nc using Matlab

Matlab 2008b 开始,Matlab 开始提供自带的读写.nc文件的函数。以下为读写.nc的一个样本matlab文件。对于图片,Matlab提供多种形式的存储。可在弹出图片的窗口左上方的下拉菜单中选择图片的存储方式,或者在命令行后用“print”命令存储图片(可以输入 help print 获得帮助)。关于Matlab的更多信息,可以去Matlab网站

下载区

本次作业所用的数据资料来自NCEP-DOE Reanalysis 2 的月平均资料,时间跨度为1979/1-2009/12。关于本资料的详细内容,请参见网站。 以下为完成本次作业所需的数据文件和matlab文件。

NCEP-DOE 月平均温度场资料 (1979/1-2009/12): nc file说明文件

NCEP-DOE 月平均纬向风(U)资料 (1979/1-2009/12): nc file说明文件

NCEP-land mask(用于在题目2中画出具体的海陆分布): nc文件

题目1和2的Matlab 样本文件: 第一题m文件, 第二题m文件