Theory@Nanjing 2019


Revision as of 23:15, 25 April 2019

General Information

  • Saturday, May 11, 2019: 9 am--5 pm.
  • Venue: Department of Computer Science and Technology, Nanjing University (Changzhou Building, Xianlin Campus, Nanjing University), Lecture Hall 111. Online map

List of Speakers (in alphabetical order)

Local speakers

Guests

  • Some students from the ACM Class at Shanghai Jiao Tong University, together with Dr. Chihao Zhang.
  • Some contestants from the Jiangsu Provincial Youth Olympiads in Informatics and Mathematics.
  • Dr. Bingkai Lin of the National Institute of Informatics, Tokyo, who will soon join the faculty of Nanjing University.

Program

Workshop Program
9:00 am -- 9:50 am Yu Cheng (程宇)

Duke University

Title: Robustness and Strategic Concerns in Machine Learning
Abstract: Most people interact with machine learning systems on a daily basis. Such interactions often happen in strategic environments where people have incentives to manipulate the outcome of the learning algorithms. As machine learning plays a more prominent role in our society, it is important to understand how people's incentives can affect the learning problems, and to design reliable algorithms that perform well in these strategic environments.
In this talk, I will explore the challenges that arise at the interface of machine learning and game theory: selfish agents may interact with machine learning algorithms strategically rather than truthfully, and many existing algorithms are vulnerable to adversarial attacks. I will discuss two lines of my work that address these challenges: designing fast algorithms that are provably robust for several fundamental problems in machine learning and statistics, when the input data is modified by an adversary; and finding optimal policies for collecting data from selfish agents.
Coffee Break
10:15 am -- 11:05 am Jingcheng Liu (刘景铖)

University of California, Berkeley

Title: Private Selection from Private Candidates
Abstract: Differentially private algorithms often need to select the best amongst many candidate options. Classical works on this selection problem require that the candidates' goodness, measured as a real-valued score function, does not change by much when one person's data changes. In many applications, such as hyperparameter optimization, this stability assumption is much too strong. In this work, we consider the selection problem under a much weaker stability assumption on the candidates, namely that the score functions are differentially private. Under this assumption, we present algorithms that are near-optimal along the three relevant dimensions: privacy, utility and computational efficiency.
Our result can be seen as a generalization of the exponential mechanism and its existing generalizations. We also develop an online version of our algorithm, which can be seen as a generalization of the sparse vector technique to this weaker stability assumption. We show how our results imply better algorithms for hyperparameter selection in differentially private machine learning, as well as for adaptive data analysis.
Joint work with Kunal Talwar.
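For background, the classical exponential mechanism that this work generalizes can be sketched as follows. This is only a minimal illustration of the baseline mechanism, not the algorithm from the talk; the candidate names and scores are made up for the example.

```python
import math
import random

def exponential_mechanism(candidates, score, sensitivity, epsilon):
    # Sample a candidate with probability proportional to
    # exp(epsilon * score(c) / (2 * sensitivity)) -- the classical
    # exponential mechanism, which assumes score has low sensitivity.
    weights = [math.exp(epsilon * score(c) / (2.0 * sensitivity))
               for c in candidates]
    r = random.uniform(0, sum(weights))
    acc = 0.0
    for c, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return c
    return candidates[-1]

# Hypothetical scores: candidate "b" is clearly the best option.
scores = {"a": 0.0, "b": 1.0, "c": 0.2}
choice = exponential_mechanism(list(scores), scores.get, 1.0, epsilon=2.0)
```

For large epsilon the mechanism concentrates on the highest-scoring candidate. The setting of the talk replaces the low-sensitivity assumption on the score function with the weaker assumption that each candidate's score is itself computed by a differentially private algorithm.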
11:10 am -- 12:00 pm Penghui Yao (姚鹏晖)

Nanjing University

Title:
Abstract:
Lunch Break (12 noon -- 1:30 pm)
1:30 pm -- 2:20 pm Lingxiao Huang (黄棱潇)

École polytechnique fédérale de Lausanne (EPFL)

Title: <math>\epsilon</math>-Coresets for Clustering (with Outliers) in Doubling Metrics
Abstract: We study the problem of constructing coresets for the <math>(k, z)</math>-clustering problem in a doubling metric. A coreset is a weighted subset that can be used as a proxy for the full data set: one can apply the same clustering algorithm on the coreset, and the result on the coreset approximates that on the full data set.
We present an efficient algorithm that constructs an <math>\epsilon</math>-coreset for the <math>(k, z)</math>-clustering problem in a doubling metric <math>M(X,d)</math>, where the size of the coreset only depends on the parameters <math>k, z, \epsilon</math> and the doubling dimension. To the best of our knowledge, this is the first efficient coreset construction of size independent of <math>|X|</math> for general clustering problems in doubling metrics. We also study robust coresets and centroid sets in doubling metrics. Our robust coreset construction leads to new results in clustering and property testing, and the centroid sets can be used to accelerate local search algorithms for clustering problems.
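To make the coreset idea concrete, here is a toy sketch. It is emphatically not the paper's construction: it uses a naive uniform sample with each point reweighted by n/m, which carries no worst-case guarantee, and serves only to show how a weighted subset stands in for the full data set in the (k, z) cost.

```python
import random

def kz_cost(points, weights, centers, z):
    # Weighted (k, z)-clustering cost: sum over points p of
    # w(p) * (distance from p to its nearest center) ** z.
    return sum(w * min(abs(p - c) for c in centers) ** z
               for p, w in zip(points, weights))

random.seed(0)
# Toy 1-D data set: two well-separated Gaussian clusters.
points = ([random.gauss(0, 1) for _ in range(1000)]
          + [random.gauss(10, 1) for _ in range(1000)])

# Naive "coreset": a uniform sample of m points, each given weight
# n/m so the total weight matches the full data set. (The paper builds
# a far more careful weighted sample, with size independent of n.)
m = 200
sample = random.sample(points, m)
sample_weights = [len(points) / m] * m

centers = [0.0, 10.0]
full_cost = kz_cost(points, [1.0] * len(points), centers, z=2)
core_cost = kz_cost(sample, sample_weights, centers, z=2)
# core_cost approximates full_cost for this candidate center set.
```

A true <math>\epsilon</math>-coreset would guarantee that the two costs agree up to a factor of <math>1 \pm \epsilon</math> simultaneously for every candidate set of <math>k</math> centers, not just the one evaluated here.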
2:25 pm -- 3:15 pm Jiapeng Zhang (张家鹏)

University of California, San Diego

Title: Sunflowers - a bridge between complexity, combinatorics and pseudorandomness
Abstract: The Erdős-Rado sunflower conjecture is one of the tantalizing open problems in combinatorics. It has many profound connections with theoretical computer science. In this talk, I will focus on the connection between the sunflower conjecture and DNF sparsification. Specifically, I will cover the following two parts:
  1. An improved lower approximation for the DNF sparsification lemma.
  2. A connection between upper approximations in DNF sparsification and sunflower structures.
These two results provide some clues for improving the upper bound on sunflower structures.
The talk is based on joint works with Xin Li, Shachar Lovett and Noam Solomon.
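For readers unfamiliar with the central definition: a family of sets is a sunflower if every pairwise intersection equals the common core, i.e. the intersection of all the sets. A minimal sketch of this check (the example families are made up):

```python
def is_sunflower(family):
    # A family of sets is a sunflower if the intersection of every
    # pair of distinct sets equals the common core: the intersection
    # of all sets in the family. The sets minus the core are "petals".
    sets = [frozenset(s) for s in family]
    core = frozenset.intersection(*sets)
    return all(a & b == core
               for i, a in enumerate(sets)
               for b in sets[i + 1:])

# Three petals around the core {1, 2}:
print(is_sunflower([{1, 2, 3}, {1, 2, 4}, {1, 2, 5}]))   # True
# Not a sunflower: {1, 2, 3} and {1, 3, 4} meet in {1, 3},
# which is strictly larger than the core {1}.
print(is_sunflower([{1, 2, 3}, {1, 3, 4}, {1, 4, 5}]))   # False
```

Note that a family of pairwise disjoint sets is a sunflower with an empty core; the Erdős-Rado conjecture asks how large a family of w-element sets can be before it must contain a large sunflower.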


3:20 pm -- 4:10 pm Weiming Feng (凤维明)

Nanjing University

Title:
Abstract:
Coffee Break
4:30 pm -- 5:00 pm

Panel Discussion

Short Bios


Yu Cheng (程宇) will join the Department of Mathematics, Statistics, and Computer Science (MSCS) at the University of Illinois at Chicago as an Assistant Professor in Fall 2019. He is currently a postdoctoral researcher in the Department of Computer Science at Duke University, hosted by Vincent Conitzer, Rong Ge, Kamesh Munagala, and Debmalya Panigrahi. He obtained his Ph.D. in Computer Science from the University of Southern California in 2017, advised by Shang-Hua Teng. His research interests include machine learning, game theory, and optimization.


Jingcheng Liu (刘景铖) is in the final year of his PhD in the CS theory group at UC Berkeley, with an anticipated completion in summer 2019, under the supervision of Professor Alistair Sinclair. He is broadly interested in theoretical computer science. His current research focuses on the interplay between phase transitions in statistical physics, the locations of zeros of graph polynomials, and algorithmic questions such as the tractable boundaries of approximate counting, sampling and inference. Before attending UC Berkeley, he completed his undergraduate studies in the ACM Honor Class of 2010 at Shanghai Jiao Tong University.


Penghui Yao (姚鹏晖).
Lingxiao Huang (黄棱潇) is a postdoc in computer science at EPFL, where he is advised by Nisheeth Vishnoi. He joined EPFL in 2017, after receiving his Ph.D. from IIIS, Tsinghua University. His current research interests are algorithm design and computational social choice. He is passionate about creating novel algorithms motivated by existing practical challenges.
Jiapeng Zhang (张家鹏) is a PhD candidate at UC San Diego, under the supervision of Shachar Lovett. He works on Boolean function analysis, computational complexity, and the foundations of cryptography.
Weiming Feng (凤维明).