随机算法 (Fall 2011)/Problem set 3 and 高级算法 (Fall 2018): Difference between pages

From TCS Wiki
==Problem 1==
{{Infobox
A '''boolean code''' is a mapping <math>C:\{0,1\}^k\rightarrow\{0,1\}^n</math>. Each <math>x\in\{0,1\}^k</math> is called a '''message''' and <math>y=C(x)</math> is called a '''codeword'''. The '''code rate''' <math>r</math> of a code <math>C</math> is <math>r=\frac{k}{n}</math>. A boolean code <math>C:\{0,1\}^k\rightarrow\{0,1\}^n</math> is a '''linear code''' if it is a linear transformation, i.e. there is a matrix <math>A\in\{0,1\}^{n\times k}</math> such that <math>C(x)=Ax</math> for any <math>x\in\{0,1\}^k</math>, where additions and multiplications are over the finite field of order two, <math>(\{0,1\},+_{\bmod 2},\times_{\bmod 2})</math>.
|name        = Infobox
|bodystyle    =  
|title        = <font size=3>高级算法
<br>Advanced Algorithms</font>
|titlestyle  =  


The '''distance''' between two codewords <math>y_1</math> and <math>y_2</math>, denoted by <math>d(y_1,y_2)</math>, is defined as the Hamming distance between them. Formally, <math>d(y_1,y_2)=\|y_1-y_2\|_1=\sum_{i=1}^n|y_1(i)-y_2(i)|</math>. The distance of a code <math>C</math> is the minimum distance between any two distinct codewords. Formally, <math>d=\min_{x_1,x_2\in \{0,1\}^k\atop x_1\neq x_2}d(C(x_1),C(x_2))</math>.
|image        =
|imagestyle  =  
|caption      =
|captionstyle =  
|headerstyle  = background:#ccf;
|labelstyle  = background:#ddf;
|datastyle    =  


Usually we want to make both the code rate <math>r</math> and the code distance <math>d</math> as large as possible, because a larger rate means that the amount of actual message per transmitted bit is high, and a larger distance allows for more error correction and detection.
|header1 =Instructor
|label1  =
|data1  =
|header2 =
|label2  =
|data2  = 尹一通<br>郑朝栋
|header3 =
|label3  = Email
|data3  = yinyt@nju.edu.cn chaodong@nju.edu.cn 
|header4 =
|label4= office
|data4= Department of Computer Science, Room 804
|header5 = Class
|label5  =
|data5  =
|header6 =
|label6  = Class meetings
|data6  = Wednesday, 8am-10am <br> 仙I-319
|header7 =
|label7  = Place
|data7  =
|header8 =
|label8  = Office hours
|data8  = Wednesday, 10am-12pm <br>Department of Computer Science, Room 804 (尹一通), Room 302 (郑朝栋)
|header9 = Textbooks
|label9  =
|data9  =
|header10 =
|label10  =
|data10  = [[File:MR-randomized-algorithms.png|border|100px]]
|header11 =
|label11  =
|data11  = Motwani and Raghavan. <br>''Randomized Algorithms''.<br> Cambridge Univ Press, 1995.
|header12 =
|label12  =
|data12  = [[File:Approximation_Algorithms.jpg|border|100px]]
|header13 =
|label13  =
|data13  =  Vazirani. <br>''Approximation Algorithms''. <br> Springer-Verlag, 2001.
|belowstyle = background:#ddf;
|below =
}}


* Prove that there exists a boolean code <math>C:\{0,1\}^k\rightarrow\{0,1\}^n</math> of code rate <math>r</math> and distance <math>\left(\frac{1}{2}-\Theta\left(\sqrt{r}\right)\right)n</math>. Try to optimize the constant in <math>\Theta(\cdot)</math>.
This is the webpage for the ''Advanced Algorithms'' class of fall 2018. Students who take this class should check this page periodically for content updates and new announcements.  
* Prove a similar result for linear boolean codes.
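
The definitions in Problem 1 can be checked on tiny examples by brute force. The following is a minimal sketch, not part of the problem set: it assumes the matrix <math>A</math> is stored with <math>n</math> rows and <math>k</math> columns so that <math>Ax</math> is defined, and the names <code>encode</code>, <code>hamming_distance</code>, <code>code_distance</code>, <code>REPETITION</code> and <code>PARITY</code> are hypothetical. It enumerates all <math>2^k</math> messages, so it is only usable for very small <math>k</math>.

```python
from itertools import product


def encode(A, x):
    """Encode message x (length k) with an n-by-k binary matrix A over GF(2)."""
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in A)


def hamming_distance(y1, y2):
    """Number of coordinates in which two equal-length codewords differ."""
    return sum(b1 != b2 for b1, b2 in zip(y1, y2))


def code_distance(A):
    """Minimum distance over all pairs of distinct messages (brute force)."""
    k = len(A[0])
    messages = list(product((0, 1), repeat=k))
    return min(
        hamming_distance(encode(A, x1), encode(A, x2))
        for i, x1 in enumerate(messages)
        for x2 in messages[i + 1:]
    )


# Hypothetical example: 3-fold repetition code (k = 1, n = 3, rate 1/3).
REPETITION = [[1], [1], [1]]
# Hypothetical example: single parity-check code (k = 2, n = 3, rate 2/3).
PARITY = [[1, 0], [0, 1], [1, 1]]
```

Calling <code>code_distance(REPETITION)</code> and <code>code_distance(PARITY)</code> illustrates the trade-off described above: the repetition code has the lower rate and the larger distance.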


== Problem 2 ==
= Announcement =
Given a binary string, define a '''run''' as a <font color=red>maximal</font> sequence of contiguous 1s; for example, the following string
* (2018/9/5) First class meeting of the new semester.
:<math>\underbrace{111}_{3}00\underbrace{11}_{2}00\underbrace{111111}_{6}0\underbrace{1}_{1}0\underbrace{11}_{2}</math>
contains 5 runs, of length 3, 2, 6, 1, and 2.


Let <math>S</math> be a binary string of length <math>n</math>, generated uniformly at random. Let <math>X_k</math> be the number of runs in <math>S</math> of length <math>k</math> or more.
= Course info =
* '''Instructor''': 尹一通, 郑朝栋
:*email: yinyt@nju.edu.cn, chaodong@nju.edu.cn
* '''Class meeting''': Wednesday 8am-10am, 仙I-319.
* '''Office hour''': Wednesday 10am-12pm, Department of Computer Science, Room 804.


*Compute the exact value of <math>\mathbb{E}[X_k]</math> as a function of <math>n</math> and <math>k</math>.
= Syllabus =
*Give the best concentration bound you can for <math>|X_k -\mathbb{E}[X_k]|</math>.
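
To experiment with the definition of a run in Problem 2, the sketch below counts runs in a string and computes <math>\mathbb{E}[X_k]</math> exactly for small <math>n</math> by enumerating all <math>2^n</math> strings. It is only a sanity-check tool for a conjectured closed form, not a solution; the function names and the constant <code>EXAMPLE</code> are hypothetical.

```python
from fractions import Fraction
from itertools import groupby, product


def run_lengths(s):
    """Lengths of the maximal blocks of contiguous '1' characters in s."""
    return [len(list(group)) for bit, group in groupby(s) if bit == "1"]


def count_long_runs(s, k):
    """X_k for a fixed string: the number of runs of length k or more."""
    return sum(1 for length in run_lengths(s) if length >= k)


def expected_long_runs(n, k):
    """Exact E[X_k] for a uniform random string of length n (small n only)."""
    total = sum(
        count_long_runs("".join(bits), k) for bits in product("01", repeat=n)
    )
    return Fraction(total, 2 ** n)


# The example string from the problem statement, written block by block.
EXAMPLE = "111" + "00" + "11" + "00" + "111111" + "0" + "1" + "0" + "11"
```

Here <code>run_lengths(EXAMPLE)</code> recovers the five run lengths listed in the problem statement, and <code>expected_long_runs(n, k)</code> returns an exact fraction that can be compared against a derived formula.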


== Problem 3==
=== Prerequisites ===
;The maximum directed cut problem (MAX-DICUT).
* Required: discrete mathematics, probability theory, linear algebra.
We are given as input a directed graph <math>G=(V,E)</math>, with each directed edge <math>(u,v)\in E</math> having a nonnegative weight <math>w_{uv}\ge 0</math>. The goal is to partition <math>V</math> into two sets <math>S\,</math> and <math>\bar{S}=V\setminus S</math> so as to maximize the value of <math>\sum_{(u,v)\in E\atop u\in S,v\not\in S}w_{uv}</math>, that is, the total weight of the edges going from <math>S\,</math> to <math>\bar{S}</math>.
* Recommended: design and analysis of algorithms.


* Give a randomized <math>\frac{1}{4}</math>-approximation algorithm based on random sampling.
=== Course materials ===
* Prove that the following is an integer program for the problem:
* [[高级算法 (Fall 2018) / Course materials|<font size=3>Textbooks and references</font>]]
:<math>
\begin{align}
\text{maximize} && \sum_{(i,j)\in E}w_{ij}z_{ij}\\
\text{subject to} && z_{ij} &\le x_i, & \forall (i,j)&\in E,\\
&& z_{ij} &\le 1-x_j, & \forall (i,j)&\in E,\\
&& x_i &\in\{0,1\}, & \forall i&\in V,\\
&& 0 \le z_{ij}&\le 1, & \forall (i,j)&\in E.
\end{align}
</math>
* Consider a randomized rounding algorithm that solves an LP relaxation of the above integer program and puts vertex <math>i</math> in <math>S</math> with probability <math>f(x_i^*)</math>. We may assume that <math>f(x)</math> is a linear function of the form <math>f(x)=ax+b</math>, with constants <math>a</math> and <math>b</math> to be fixed. Try to find good <math>a</math> and <math>b</math> so that the randomized rounding algorithm has a good approximation ratio.
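
For the random-sampling part of Problem 3, one natural candidate (an assumption here, not necessarily the intended answer) is to put each vertex in <math>S</math> independently with probability <math>\frac{1}{2}</math>. An edge <math>(u,v)</math> with <math>u\neq v</math> then goes from <math>S</math> to <math>\bar{S}</math> with probability <math>\frac{1}{4}</math>, so the expected cut weight is a quarter of the total edge weight. The sketch below uses hypothetical names and a hypothetical toy graph <code>EDGES</code>; it represents edges as <code>(u, v, w)</code> triples.

```python
import random
from itertools import combinations


def dicut_weight(edges, S):
    """Total weight of edges (u, v, w) with u in S and v not in S."""
    return sum(w for u, v, w in edges if u in S and v not in S)


def random_dicut(vertices, edges, rng):
    """Put each vertex in S independently with probability 1/2."""
    S = {v for v in vertices if rng.random() < 0.5}
    return S, dicut_weight(edges, S)


def best_of(vertices, edges, trials, seed=0):
    """Repeat the sampling and keep the heaviest cut found."""
    rng = random.Random(seed)
    samples = (random_dicut(vertices, edges, rng) for _ in range(trials))
    return max(samples, key=lambda result: result[1])


def average_dicut_weight(vertices, edges):
    """Exact expectation of the sampled cut weight, by enumerating all S."""
    vs = list(vertices)
    subsets = [set(c) for r in range(len(vs) + 1) for c in combinations(vs, r)]
    return sum(dicut_weight(edges, S) for S in subsets) / len(subsets)


# Hypothetical toy instance: a directed triangle with total weight 6.
EDGES = [(0, 1, 2.0), (1, 2, 3.0), (2, 0, 1.0)]
```

On the toy instance, <code>average_dicut_weight(range(3), EDGES)</code> equals one quarter of the total weight, matching the per-edge probability argument.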


==Problem 4 ==
=== Grades ===
The set cover problem is defined as follows:
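* Course grade: there will be several homework assignments and one final exam. The final grade is determined by combining the homework and final exam scores.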
*Let <math>U=\{u_1,u_2,\ldots,u_n\}</math> be a set of <math>n</math> elements, and let <math>\mathcal{S}=\{S_1,S_2,\ldots,S_m\}</math> be a family of subsets of <math>U</math>. For each <math>u_i\in U</math>, let <math>w_i</math> be a nonnegative weight of <math>u_i</math>. The goal is to find a subset <math>V\subseteq U</math> of minimum total weight <math>\sum_{u_i\in V}w_i</math> that intersects every <math>S_j\in\mathcal{S}</math>.
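* Late submission: if you have a special reason for being unable to finish an assignment on time, contact the instructor in advance with a valid justification. Otherwise late assignments will not be accepted.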


This problem is '''NP-hard'''.
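=== <font color=red> Academic Integrity </font>===
Academic integrity is the most basic professional and ethical baseline for all students and scholars engaged in academic work. This course will spare no effort in upholding academic integrity standards, and violations of this baseline will not be tolerated.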


('''Remark''': There are two equivalent definitions of the set cover problem. We take the '''hitting set''' version.)
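Principle for completing assignments: work bearing your name must be done by you. Discussion is allowed, but assignments must be completed independently, and you must list everyone who took part in the discussion in your assignment. No other form of collaboration is allowed, especially "discussion" with classmates who have already finished the assignment.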


Questions:
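This course takes a zero-tolerance attitude toward plagiarism. In completing assignments, both direct textual copying of others' work (publications, Internet materials, other people's assignments, etc.) and copying of key ideas or key elements are considered plagiarism, as interpreted by the [http://www.acm.org/publications/policies/plagiarism_policy ACM Policy on Plagiarism]. Plagiarists will have their grades canceled. If mutual copying is discovered, <font color=red>the grades of both the copier and the person copied from will be canceled</font>. Therefore, please actively prevent your own work from being copied by others.
* Prove that the following is an integer program for the problem: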
 
:<math>
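Academic integrity affects a student's personal character and also bears on the proper functioning of the whole education system. Committing academic misconduct for a few points not only makes oneself a cheater but also renders others' honest efforts meaningless. Let us work together to maintain an environment of integrity.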
\begin{align}
 
\text{minimize} &&  \sum_{i=1}^{n}w_{i}x_{i}\\
= Assignments =
\text{subject to} && \sum_{i:u_i\in S_j}x_i &\ge 1, &1\le j\le m,\\
* TBA
&& x_i &\in\{0,1\}, & 1\le i\le n.
 
\end{align}
= Lecture Notes =
</math>
# [[高级算法 (Fall 2018)/Min-Cut and Max-Cut|Min-Cut and Max-Cut]]
* Give a randomized rounding algorithm which returns an <math>O(\log m)</math>-approximate solution with probability at least <math>\frac{1}{2}</math>. (Hint: you may repeat the randomized rounding process if some subsets remain uncovered after one round of randomized rounding.)
#:  [[高级算法 (Fall 2018)/Probability Basics|Probability basics]]
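
The hint in Problem 4 suggests repeating the rounding. The sketch below illustrates one possible shape of such a procedure, under stated assumptions: elements are indexed <code>0..n-1</code>, a fractional LP solution <code>x_frac</code> is supplied by the caller (no LP solver is invoked here), and the default number of rounds, about <math>\ln(4m)</math>, is an assumed choice rather than something fixed by the problem. All names are hypothetical. The result is not guaranteed to be feasible, so the caller should check it with <code>is_hitting_set</code>.

```python
import math
import random


def is_hitting_set(chosen, subsets):
    """True if the chosen elements intersect every subset in the family."""
    return all(any(i in chosen for i in S) for S in subsets)


def round_once(x_frac, rng):
    """Pick element i independently with probability x_frac[i]."""
    return {i for i, xi in enumerate(x_frac) if rng.random() < xi}


def rounded_hitting_set(x_frac, subsets, rounds=None, seed=0):
    """Union of several independent roundings of a fractional solution."""
    rng = random.Random(seed)
    if rounds is None:
        # Assumed choice: roughly ln(4m) rounds, at least one.
        rounds = max(1, math.ceil(math.log(4 * max(len(subsets), 1))))
    chosen = set()
    for _ in range(rounds):
        chosen |= round_once(x_frac, rng)
    return chosen
```

Taking the union over rounds means each repetition can only reduce the number of subsets left unhit, while the expected weight grows by at most the LP value per round.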

Revision as of 14:15, 4 September 2018
