- This page lists theory and related seminars at EPFL.
**Mailing List.**To subscribe to this list, send an email (can be empty) to: theory-announce-subscribe@listes.epfl.ch**.**- If you are hosting a related seminar and would like to be listed here, please send an email.

### December 14th, 2017, 11:15h INF211

**Eric Balkanski****, Harvard University**, *“The Adaptive Complexity of Maximizing a Submodular Function”*

*Abstract: *In this paper we study the adaptive complexity of submodular optimization. Informally, the adaptive complexity of a problem is the minimal number of sequential rounds required to achieve a constant factor approximation when polynomially-many queries can be executed in parallel at each round. Adaptivity is a fundamental concept that is heavily studied in computer science, largely due to the need for parallelizing computation. Somewhat surprisingly, very little is known about adaptivity in submodular optimization. For the canonical problem of maximizing a monotone submodular function under a cardinality constraint, to the best of our knowledge, all that is known to date is that the adaptive complexity is between 1 and Ω(n). Our main result in this paper is a tight characterization showing that the adaptive complexity of maximizing a monotone submodular function under a cardinality constraint is, up to lower order terms, θ(log n):

– We describe an algorithm which requires O(log n) sequential rounds and achieves an approximation that is arbitrarily close to 1/3;

– We show that no algorithm can achieve an approximation better than O(1 / log n) with fewer than O(log n / log log n) rounds.

Thus, when allowing for parallelization, our algorithm achieves a constant factor approximation exponentially faster than any known existing algorithm for submodular maximization.

Importantly, the approximation algorithm is achieved via adaptive sampling and complements a recent line of work on optimization of functions learned from data. In many cases we do not know the functions we optimize and learn them from labeled samples. Recent results show that no algorithm can obtain a constant factor approximation guarantee using polynomially-many labeled samples as in the PAC and PMAC models, drawn from any distribution. Since learning with non-adaptive samples over any distribution results in a sharp impossibility, we consider learning with adaptive samples where the learner obtains poly(n) samples drawn from a distribution of her choice in every round. Our result implies that in the realizable case, where there is a true underlying function generating the data, θ(log n) batches, up to lower order terms, of adaptive samples are necessary and sufficient to approximately “learn to optimize” a monotone submodular function under a cardinality constraint.

Joint work with Yaron Singer

### October 23rd, 2017, 11:15h BC420

**Silvio Lattanzi, Google Research Zurich**, *“Algorithms for l_p Low-Rank Approximation and non-negative l_1 Column Subset Selection”*

*Abstract: *The problem of low-rank approximation and column subset selection of a matrix are usually studied for the Frobenius norm. Nevertheless in practice it is often interesting to study these problems also for l_1 or l_∞. Here we introduce new algorithms for l_p Low-Rank Approximation and non-negative l_1 Column Subset Selection.

We start by considering the problem of approximating a given matrix by a low-rank matrix so as to minimize the entry-wise l_p-approximation error, for any p ≥ 1; the case p = 2 is the classical SVD problem. We obtain the first provably good approximation algorithms for this version of low-rank approximation that work for every value of p ≥ 1, including p = ∞. Our algorithms are simple, easy to implement, work well in practice, and illustrate interesting tradeoffs between the approximation quality, the running time, and the rank of the approximating matrix.

Then we consider the problems of sparse regression and column subset selection under l_1 error. For both problems, we show that in the non-negative setting it is possible to obtain tight and efficient approximations, without any additional structural assumptions. We then use this technique to obtain an efficient algorithm for column subset selection under l_1 error for non-negative matrices.

Joint work with Flavio Chierichetti, Sreenivas Gollapudi, Ravi Kumar, Rina Panigrahy, David P. Woodruff and Aditya Bhaskara.

### July 3rd, 2017, 11:15h INF211

**Myriam Preissmann****, CNRS**, “*New results on variants of the Majority problem*“

*Abstract:* The variant of the Majority problem we are considering is the following. A colorblind player is given a set $\mathcal B$ of $N$ colored balls. He knows that each ball is colored either red or green, and that there are less green than red balls, but he cannot distinguish the two colors. For any two balls he can ask whether they are colored the same. His goal is to determine the color of each of the balls, asking as few questions as possible. In the case where there are at most $p$ (respectively exactly $p$) green balls the minimum number of questions that guarantees the determination of the colors is denoted by $Q(N,p,\le)$ (respectively $Q(N,p,=)$). We extend results of Aigner on exact values of $Q(N,p,\le)$, and we provide lower and upper bounds as well as exact values for Q(N,p,=), depending on the values of $N$ and $p$. Our results lead to several new questions.

Joint work with Paul-Elliot Anglès d’Auriac, Francis and Vivien Maisonneuve and Emmanuel Preissmann.

### June 13th, 2017, 13:15h INF211

**Arturs Backurs, MIT**, “*Improving Viterbi is Hard: Better Runtimes Imply Faster Clique Algorithms*“

*Abstract:* The classic algorithm of Viterbi computes the most likely path in a Hidden Markov Model (HMM) that results in a given sequence of observations. It runs in time O(Tn^2) given a sequence of T observations from a HMM with n states. The algorithm has found many applications in communication, speech recognition, machine learning, and other areas. Despite significant interest in the problem and prolonged effort by different communities, no known algorithm achieves more than a poly-logarithmic speedup.

In this paper, we explain this difficulty by providing matching conditional lower bounds. We show that the Viterbi algorithm runtime is optimal up to subpolynomial factors even when the number of distinct observations is small. Our lower bounds are based on assumptions that the best known algorithms for the All-Pairs Shortest Paths problem (APSP) and for the Max-Weight k-Clique problem in edge-weighted graphs are essentially tight.

Finally, using a recent algorithm by Green Larsen and Williams for online Boolean matrix-vector multiplication, we get a 2^{Omega(sqrt{log n})} speedup for the Viterbi algorithm when there are few distinct transition probabilities in the HMM.

Joint work with Christos Tzamos. To appear in ICML 2017

### Feb 7, 2017, 14:15h BC 329

**Nikhil Bansal, TU Eindhoven**, “*A fast polynomial space algorithm for Subset Sum”*

*Abstract:* I will describe an algorithm for the subset sum problem that runs in 2^{0.86n} time and uses polynomial space. Previously, all algorithms with running time less than 2^n used exponential space, and obtaining such a guarantee was open. Our algorithm is based on Floyd’s space efficient technique for cycle finding and some additive combinatorics of subset sums.

Based on joint work with Shashwat Garg, Jesper Nederlof and Nikhil Vyas

### Nov 8th, 2016, 15:15 BC 420

**Gregory Valiant, Stanford**, “*When your big data seems too small: accurate inferences beyond the empirical distribution*“

*Abstract:* We discuss two problems related to the general challenge of making accurate inferences about a complex distribution, in the regime in which the amount of data (i.e the sample size) is too small for the empirical distribution of the samples to be an accurate representation of the underlying distribution. The first problem we consider is the following basic learning task: given independent draws from an unknown distribution over a discrete support, output an approximation of the distribution that is as accurate as possible in L1 distance (ie total variation distance). Perhaps surprisingly, it is often possible to “de-noise” the empirical distribution of the samples to return an approximation of the true distribution that is significantly more accurate than the empirical distribution, without relying on any prior assumptions on the distribution. We present an instance optimal learning algorithm which optimally performs this de-noising for every distribution for which such a de-noising is possible. One curious implication of our techniques is an algorithm for accurately estimating the number of new domain elements that would be seen given a new larger sample, of size up to n*log n. (Extrapolation beyond this sample size is provable information theoretically impossible, without additional assumptions on the distribution.) While these results are applicable generally, we highlight an adaptation of this general approach to some problems in genomics (e.g. quantifying the number of unobserved protein coding variants).

The second problem we consider is the task of accurately estimating the eigenvalues of the covariance matrix of a (high dimensional real-valued) distribution–the “population spectrum”. These eigenvalues contain basic information about the distribution, including the presence or lack of low-dimensional structure in the distribution and the applicability of many higher-level machine learning and multivariate statistical tools. As we show, even in the regime where the sample size is linear or sublinear in the dimensionality of the distribution, and hence the eigenvalues and eigenvectors of the empirical covariance matrix are misleading, accurate approximations to the true population spectrum are possible.

This talk is based on three papers, which are joint works with Paul Valiant, with Paul Valiant and James Zou, and with Weihao Kong.

### October 26th, 2016, 15:15 INF 211

**Hang Zhou, MPI**, “*Network Inference: Graph Reconstruction and Verification*“

*Abstract:* How efficiently can we find an unknown graph using shortest path queries between its vertices? This is a natural theoretical question from the standpoint of recovery of hidden information. This question is related to discovering the topology of Internet networks, which is a crucial step for building accurate network models and designing efficient algorithms for Internet applications.

In this talk, I will introduce the problems of graph reconstruction and verification via oracles. I will investigate randomized algorithms based on a Voronoi cell decomposition. I will also analyze greedy algorithms, and prove that they are near-optimal.

The talk is based on joint work with Claire Mathieu and Sampath Kannan.

### October 6th, 2016, 14:15 INF 211

**Klaus Jansen, Kiel**, “*Closing the Gap for Makespan Scheduling via Sparsification Techniques*“

*Abstract:* Makespan scheduling on identical machines is one of the most basic and fundamental packing problems studied in the discrete optimization literature. It asks for an assignment of $n$ jobs to a set of $m$ identical machines that minimizes the makespan. The problem is strongly NP-hard, and thus we do not expect a $(1+\epsilon)$-approximation algorithm run on a time that depends polynomially on $1/\epsilon$. Furthermore, Chen, Jansen and Zhang recently showed that a running time of $2^{(1/\epsilon)^{1-\delta}}+\

A long sequence of algorithms have been developed that try to obtain low dependencies on $1/\epsilon$, the better of which achieves a running time of $2^{\tilde{O}(1/\epsilon^2)}+O

Our main technical contribution is a new structural result on the configuration-ILP. More precisely, we show the existence of a highly symmetric and sparse optimal solution, in which all but a constant number of machines are assigned a configuration with small support. This structure can then be exploited by integer programming techniques and enumeration. We believe that our structural result is of independent interest and should find applications to other settings. In particular, we show how the structure can be applied to the minimum makespan problem on uniform machines and to a larger class of objective functions on identical machines. For all these cases we obtain an efficient PTAS with running time $2^{\tilde{O}(1/\epsilon)} + poly(n)$.

This is joint work with Kim-Manuel Klein (Kiel) und Jose Verschae (Santiago).

### July 28th, 2016, 14:15 BC 420

**David Steurer, Cornell University**, *“*Fast spectral algorithms from sum-of-squares*“*

*Abstract:* We consider two problems that arise in machine learning applications: the problem of recovering a planted sparse vector in a random linear subspace and the problem of decomposing a random overcomplete 3-tensor. For both problems, the best known guarantees for polynomial-time algorithms are based on the sum-of-squares method. We develop new kinds of spectral algorithms inspired by analyses of the sum-of-squares method. These algorithms achieve the same or similar guarantees as sum-of-squares for these problems but the running time is significantly faster—linear or close to linear in the input size.

This work demonstrates for the first time that the sum-of-squares method can lead to practical algorithms for large-scale problems.

Joint work with Samuel B. Hopkins, Tselil Schramm, and Jonathan Shi.

### July 26th, 2016, 14:15 INF 211

**Raghu Meka, UCLA**, *“Communication lower bounds from query lowerbounds”*

*Abstract:* “Lifting” query lower bounds to communication problems is one of the main techniques for proving lower bounds in communication complexity and has seen several remarkable successes recently (e.g., settling the communication complexity of the clique vs independent set problem). In this talk I will briefly survey the recent progress and present a new approach based on decomposing product distributions that leads to lifting theorems for many often-used communication measures.

Along with simplifying some existing theorems, we show that linear programming relaxations need sub-exponential size to beat trivial random guessing for approximately satisfying constraint satisfaction problems (CSPs). Previously, only super-polynomial (~ n^(Omega(log n)) bounds were known for any CSP [CLRS13].

The core of our results is a new decomposition theorem for “high-entropy rectangles” into structurally nice distributions that may be of independent interest in communication complexity.

Based on joint work with Pravesh Kothari and Prasad Raghavendra and joint work with Mika Goos, Shachar Lovett, Thomas Watson, and David Zuckerman.

### July 20th, 2016, 11:15 BC 420

**Jelani Nelson, Harvard University**, “*Heavy hitters via cluster-preserving clustering*“

*Abstract:* Previously it was known that all l2 eps-heavy hitters could be found in the turnstile model in optimal O((1/eps^2) lg n) words of space with O(lg n) update time and very slow O(n lg n) query time with 1/poly(n) failure probability, via the CountSketch of Charikar, Chen and Farach-Colton. The query time could be improved drastically using the “dyadic trick” to O((1/eps^2) poly(lg n)), but the space and update time worsened to log^2 n to achieve such small failure probability. We show that the best of all worlds is possible: we give a new algorithm, ExpanderSketch, achieving O((1/eps^2) lg n) space, O(lg n) update time, and O((1/eps^2) poly(lg n)) query with 1/poly(n) failure probability. This is accomplished via a new reduction from the heavy hitters problem to a graph clustering problem of independent interest, which we solve using a new clustering algorithm CutCloseGrab which is guaranteed to find every single cluster regardless of the number of clusters or the comparative sizes between the cluster and the entire graph.

Joint work with Kasper Green Larsen, Huy L. Nguyen, and Mikkel Thorup.

### June 23rd, 2016, 11:15 BC 01

**Piotr Indyk, MIT**, “*Fast Algorithms for Structured Sparsity*“

*Abstract:* Sparse representations of signals (i.e., representations that have only few non-zero or large coefficients) have emerged as powerful tools in signal processing theory, algorithms, machine learning and other applications. However, real-world signals often exhibit rich structure beyond mere sparsity. For example, a natural image, once represented in the wavelet domain, often has the property that its large coefficients occupy a subtree of the wavelet hierarchy, as opposed to arbitrary positions. A general approach to capturing this type of additional structure is to model the support of the signal of interest (i.e., the set of indices of large coefficients) as belonging to a particular family of sets. Computing a sparse representation of the signal then corresponds to the problem of finding the support from the family that maximizes the sum of the squares of the selected coefficients. Such a modeling approach has proved to be beneficial in a number of applications including compression, de-noising, compressive sensing and machine problem is often computationally difficult or intractable, which is undesirable in many applications where large signals and datasets are commonplace.

In this talk, I will outline some of the past and more recent algorithms for finding structured sparse representations of signals, including piecewise constant approximations, tree-sparse approximations and graph-sparse approximations. The algorithms borrow several techniques from combinatorial optimization (e.g., dynamic programming), graph theory, and approximation algorithms. For many problems the algorithms run in (nearly) linear time, which makes them applicable to very large datasets.

Joint work with Chinmay Hegde and Ludwig Schmidt.

### May 11, 2016, 11:15 BC 420

**Sanjeev Khanna, University of Pennsylvania**, “*On the Single-Pass Streaming Complexity of the Set Cover Problem*“

*Abstract:* In the set cover problem, we are given a collection of m subsets over a universe of n elements, and the goal is to find a sub-collection of sets whose union covers the universe. The set cover problem is a fundamental optimization problem with many applications in computer science and related disciplines. In this talk, we investigate the set cover problem in the streaming model of computation whereby the sets are presented one by one in a stream, and the goal is to solve the set cover problem using a space-efficient algorithm.

We show that to compute an \alpha-approximate set cover (for any \alpha= o(\sqrt{n})) via a single-pass streaming algorithm, \Theta(mn/\alpha) space is both necessary and sufficient (up to an O(\log{n}) factor). We further study the problem of estimating the size of a minimum set cover (as opposed to finding the actual sets), and show that this turns out to be a distinctly easier problem. Specifically, we prove that \Theta(mn/\alpha^2) space is both sufficient and necessary (up to logarithmic factors) for estimating the size of a minimum set cover to within a factor of \alpha. Our algorithm in fact works for the more general problem of estimating the optimal value of a covering integer program. These results provide a tight resolution of the space-approximation tradeoff for single-pass streaming algorithms for the set cover problem.

This is based on a joint work with Sepehr Assadi (Penn) and Yang Li (Penn).

### Mar 22, 2016, 16:15 INF 211

**Rico Zenklusen, ETH Zurich**, “*Firefighting on Trees Beyond Integrality Gaps*“

*Abstract:* The Firefighter problem (FFP) and a variant of it, known as Resource Minimization for Fire Containment (RMFC), are natural models for optimal inhibition of harmful spreading processes. Despite considerable progress on several fronts, large gaps remain in our understanding of the approximability of these problems. This is the case even when the underlying graph is a tree, which is one of the most-studied graph structures in this context and the focus of this talk. In their simplest version, a fire spreads from one fixed vertex step by step from burning to adjacent non-burning vertices, and at each time step, B many non-burning vertices can be protected from catching fire. FFP asks, for a given B, to maximize the number of vertices that will not catch fire, whereas RMFC (on a tree) asks to find the smallest B that allows for saving all leaves of the tree. Prior to this work, the best known approximation ratios were an O(1)-approximation for FFP and an O(log*n)-approximation for RMFC, both being LP-based and essentially matching the integrality gaps of two natural LP relaxations.

We present a PTAS for FFP and an O(1)-approximation for RMFC, both qualitatively matching known hardness results. Our results are obtained through a combination of the known LPs with several new techniques.

This is joint work with David Adjiashvili and Andrea Baggio.

### Mar 4, 2016, 15:00 INF 211

**Samuel Fiorini, Universite libre de Bruxelles**, “*A 7/3-Approximation Algorithm for Cluster Vertex Deletion*“

*Abstract:* The cluster vertex deletion problem is to delete a minimum cost set of vertices from a given vertex-weighted graph in such a way that the resulting graph has no induced path on three vertices. This problem is a special case of the vertex cover problem on a 3-uniform hypergraph, and thus has a straightforward 3-approximation algorithm. Very recently, You, Wang, and Cao (2015) described an efficient 5/2-approximation algorithm for the unweighted version of the problem. Our main result is a 7/3-approximation algorithm for arbitrary weights, using the local ratio technique. We further conjecture that the problem admits a 2-approximation algorithm and give some support for the conjecture. This is in sharp constrast with the fact that the similar problem of deleting vertices to eliminate all triangles in a graph is known to be UGC-hard to approximate to within a ratio better than 3, as proved by Guruswami and Lee (2015).

This is joint work with Gwenaël Joret (Brussels) and Oliver Schaudt (Cologne).

### Feb 29, 2016, 16:15 BC 420

**David Woodruff, IBM** **Almaden**,* “**Beating CountSketch for Heavy Hitters in Insertion Streams**“*

*Abstract:* We consider the problem of finding the most frequent items in a stream of items from a universe of size n. Namely, we consider returning all l_2-heavy hitters, i.e., those items j for which f_j >= eps sqrt{F_2}, where f_j is the number of occurrences of item j, and F_2 = sum_i f_i^2 is the second moment of the stream. In 2002, Charikar, Chen, and Farach-Colton suggested the CountSketch data structure, which solves this using log^2 n bits of space (for constant eps). The only known lower bound is log n bits. Using Gaussian processes, we show it is possible to achieve O(log n log log n) bits of space.

### Feb 2, 2016, 16:15 BC 420

**Monika Henginzer, University of Vienna**, “*Using the Primal-Dual Technique in Dynamic Graph Algorithms*“

*Abstract:* The number of Facebook users has been growing exponentially in the last decade reaching 1.2 billion monthly users in August 2015. According to the Pew Research Center about half of the Facebook users add at least one new friend per week. (No statistics are published about how many people are unfriended each week.) Thus, the Facebook graph is a huge graph that changes frequently. To maintain any non-trivial property of this graph a dynamic graph algorithm is needed: A dynamic graph algorithm is a data structure that maintains a property of a graph while it is modified by edge insertions and deletions.

I will present a hierarchical graph decomposition based on the primal-dual method which lead to a polylogarithmic algorithm for (2+epsilon)-approximate vertex cover and (3+epsilon)-approximate matching.

### Nov 3, 2015, 16:15 BC 329

**Rico Zenklusen**, ETHZ,* “An O(1)-Approximation for Minimum Spanning Tree Interdiction (And More About Interdiction)”*

*Abstract:* Network interdiction studies the maximum impact that the removal of a limited number of edges or vertices can have on a graph optimization problem. Most interdiction problems are NP-hard, and only little is known about their approximability. In this talk, we first give a general introduction to interdiction and then focus on one of the oldest and most-studied interdiction problems, namely minimum spanning tree (MST) interdiction. Here, an MST problem is given together with positive edge costs and a budget B. The goal is to remove edges of total cost at most B such that an MST in the resulting graph is as heavy as possible. So far, the best approximation algorithm, presented by Frederickson and Solis-Oba (SODA 1996), achieves a factor of O(log(m)). We show that MST interdiction admits an O(1)-approximation. Moreover, this implies an O(1)-approximation for a natural interdiction version of the symmetric traveling salesman problem and the path version thereof.

### Oct 27, 2015, 16:15h INF 211

**Immanuel Trummer, EPFL**, “*Multiple Query Optimization on the D-Wave 2X Adiabatic Quantum Computer*“

*Abstract:* We are exploring the potential of quantum computing for hard combinatorial optimization problems that arise in the context of database systems. We obtained access to a D-Wave 2X adiabatic quantum annealer, located at NASA Ames Research Center in California. This device is claimed to exploit the laws of quantum physics to solve hard optimization problems. In my talk, I will outline how we used that device to solve a classical database-related optimization problem: the problem of multiple query optimization (MQO). I will show how we transform MQO problem instances into magnetic fields that are imposed on and between the qubits of the quantum annealer. This transformation enables us to solve the resulting problems using a process called quantum annealing. I analyze the complexity of our transformation in terms of how the number of required qubits grows as a function of the MQO problem dimensions. I also present first experimental results. Our results show that, while the range of problems that can be mapped to the quantum annealer is still limited, the speedup compared to commodity PCs can reach up to three orders of magnitude.

### Oct 2, 2015, 14:15h INF 211

**Georgios Piliouras, Singapore** **University of Technology and Design**, “*Natural Selection, Game Theory and Genetic Diversity*“

*Abstract:* In a recent series of papers a strong connection has been established between standard models of sexual evolution in mathematical biology and Multiplicative Weights Updates Algorithm, a ubiquitous model of online learning and optimization. These papers show that mathematical models of biological evolution are tantamount to applying discrete replicator dynamics, a close variant of MWUA on (asymmetric) partnership games. We show that in the case of partnership games, under minimal genericity assumptions, discrete replicator dynamics converge to pure Nash equilibria for all but a zero measure of initial conditions. This result holds despite the fact that mixed Nash equilibria can be exponentially (or even uncountably) many, completely dominating in number the set of pure Nash equilibria. Thus, in haploid organisms the long term preservation of genetic diversity needs to be safeguarded by other evolutionary mechanisms, such as mutation and speciation.

This is joint work with Ruta Mehta and Ioannis Panageas.

### Sep 29, 2015, 16:15h INF 211

**Emtiyaz Khan, EPFL**, “*Approximation Bayesian Inference: Bringing statistics, optimization, and machine learning together*“

*Abstract:* Machine learning relies on data to teach machines to learn, but dealing with noisy, unreliable, heterogeneous, and high-dimensional data is a big challenge in itself. Bayesian methods are promising in this context but they do not scale well to large data. In this talk, I will discuss my work on approximating the Bayesian integral by an optimization problem. There are two natural questions to ask: when is such a method reasonably accurate and how fast is it? In this talk, I will summarize a few results of my research. Specifically, I will show that the application of fancy optimization methods can result in reasonably fast and accurate methods. Overall, this talk will demonstrate a successful application of statistics, optimization, and machine learning for scalable and accurate Bayesian inference.

### Sep 24, 2015, 16:15h INF 211

**Nick Harvey, UBC**, “*An Algorithmic Proof of the Lovasz Local Lemma via Resampling Oracles*“

*Abstract:* The Lovasz Local Lemma (LLL) is a seminal result in probabilistic combinatorics. It gives a sufficient condition on a probability space and a collection of events for the existence of an outcome that simultaneously avoids all of those events. Finding such an outcome by an efficient algorithm has been an active research topic for decades. Breakthrough work of Moser and Tardos (2009) presented an efficient algorithm for a general setting primarily characterized by a product structure on the probability space.

In this work we present an efficient algorithm for a much more general setting. Our main assumption is that there exist certain functions, called resampling oracles, that can be invoked to address the undesired occurrence of the events. We show that, in all scenarios to which the original LLL applies, there exist resampling oracles; and for essentially all known applications of the LLL we have designed efficient resampling oracles. As applications of these techniques, we present new results for packings of Latin transversals, rainbow matchings and rainbow spanning trees.

Joint work with Jan Vondrak.

### Sep 2, 2015, 16:15h INF 211

**Amir Shpilka, Tel Aviv University**, “*Efficiently decoding Reed-Muller codes from random errors*“

*Abstract:* Reed-Muller codes encode an m-variate polynomial of degree r by evaluating it on all points in {0,1}^m. The form one of the most studied family of codes that is important both in the coding theory community and in theoretical computer science.

In the talk we shall describe a novel efficient algorithm (in the block length n = 2^m) for decoding random errors in Reed-Muller codes far beyond the minimal distance. Specifically, for low rate codes (of degree r = o(√m)) we can correct a random set of n/2 -o(n) errors with high probability. Such result was known only for degrees up to log(m). For high rate codes (of degree roughly m−r for r = o(√m)), we can correct roughly m^r/2 errors.

The algorithm is based on solving a carefully defined set of linear equations and thus it is significantly different than other algorithms for decoding Reed-Muller codes that are based on the recursive structure of the code. It also bares some similarities with the famous Berlekamp-Welch algorithm for decoding Reed-Solomon codes.

Based on joint work with Ramprasad Saptharishi and Ben lee Volk

### Aug 24, 2015, 16:15h INF 211

**He Sun, University of Bristol**, “*Heat kernels in graphs: A journey from random walks to geometry, and back*“

*Abstract:* Heat kernels are one of the most fundamental concepts in physics and mathematics. In physics, the heat kernel is a fundamental solution of the heat equation and connects the Laplacian operator to the rate of heat dissipation. In spectral geometry, many fundamental techniques are based on heat kernels. In finite Markov chain theory, heat kernels correspond to continuous-time random walks and constitute one of the most powerful techniques in estimating the mixing time.

In this talk, we will briefly discuss this line of research and its relation to heat kernels in graphs. In particular, we will see how heat kernels are used to design the first nearly-linear time algorithm for finding clusters in real-world graphs. Some interesting open questions will be addressed as well.

This is based on the joint work with Richard Peng (MIT), and Luca Zanetti (University of Bristol). Parts of the results of this talk appeared in COLT 2015.

### Jul 10, 2015, 16:00h INF 211

**Joydeep Dutta, IIT Kanpur**, “*Gap Function and Error Bounds for Convex Vector Optimization Problems*“

*Abstract:* In this article our main aim is to develop gap function and error bounds for a (non-smooth) convex vector optimization problem. We show that by focussing on convexity we are able to quite efficiently compute the gap functions and try to gain insight about the structure of set of weak Pareto minimizers by viewing its graph. We discuss the several properties of gap functions develop error bounds when the data is strongly convex. We also compare our results with some recent results on weak vector variational inequalities with set-valued maps, and also argue as to why we focus on the convex case.

### Jul 10, 2015, 14:00h BC 420

**Katrina Ligett, Caltech**, “*Differential Privacy – A Toolkit for Stability, Robustness, and Statistical Validity*“

*Abstract:* In this talk, I’ll give an introduction to differential privacy with an emphasis on its relationship to machine learning, and its usefulness outside of privacy. Along the way, I’ll give a taste for the mathematical tools that can be used to achieve differential privacy. My thesis is that anyone who cares about data should care about the tools that the differential privacy literature offers.

### Jul 8, 2015, 16:15h INF 211

**Sushant Sachdeva, Yale University**, “*Algorithms for Lipschitz Learning on Graphs*“

*Abstract:* In this talk, we’ll discuss some interpolation problems on graphs. Given real-valued observations on some of the vertices, our goal is to compute a smooth (Lipschitz) extension to all vertices. Our focus will be the absolutely minimal Lipschitz extension, which is the limit of p-Laplacian regularization for large p.

We’ll present algorithmic results for computing these extensions efficiently — a minimal Lipschitz extension in expected linear time, and an absolutely minimal Lipschitz extension in expected O(mn) time. We have implemented variants of the latter algorithm that run on graphs with millions of edges in a few minutes. Our algorithms naturally extend to directed graphs. We’ll present some experimental results for detecting spam webpages using our algorithms.

Finally, we’ll discuss how these extensions are particularly suited for regularization and outlier removal, which is surprising since outlier removal is NP-hard for other natural extensions.

This is joint work with Rasmus Kyng, Anup Rao, and Daniel Spielman.

### Apr 28, 2015, 16:15h INF 211

**Divesh Aggarwal, EPFL**, “*Faster Algorithm for the Shortest Vector Problem via Discrete Gaussian Sampling*“

*Abstract:* We give a randomized 2^{n+o(n)}-time and space algorithm for solving the Shortest Vector Problem (SVP) on n-dimensional Euclidean lattices. This improves on the previous fastest algorithm: the deterministic Õ(4^{n})-time and Õ(2^{n})-space algorithm of Micciancio and Voulgaris (STOC 2010).

In fact, we give a conceptually simple algorithm that solves the (in our opinion, even more interesting) problem of discrete Gaussian sampling(DGS). More specifically, we show how to sample 2^{n/2} vectors from the discrete Gaussian distribution at *any parameter* in 2^{n+o(n)} time and space. (Prior work only solved DGS for very large parameters.) Our SVP algorithm then follows from a natural reduction from SVP to DGS.

This is joint work with Daniel Dadush, Oded Regev, and Noah Stephens-Davidowitz.

### Apr 2, 2015, 16:00h INJ 114

**[Virtual Seminar] Allan Sly, ****UC Berkeley**, “*Proof of the satisfiability conjecture for large k*“

*Abstract:* We establish the satisfiability threshold for random k-SAT for all k >= k_0. That is, there exists a limiting density alpha_s(k) such that a random k-SAT formula of clause density alpha is with high probability satisfiable for alpha < alpha_s(k), and unsatisfiable for alpha > alpha_s(k). The satisfiability threshold alpha_s(k) is given explicitly by the one-step replica symmetry breaking prediction from statistical physics. We believe that our methods may apply to a range of random CSPs in the 1RSB universality class.

### Mar 10, 2015, 14:00h INJ 114

**Samuel Fiorini, Université libre de Bruxelles, ***“No Small Linear Program Approximates Vertex Cover within a Factor 2-epsilon”*

*Abstract:* The vertex cover problem is one of the most important and intensively studied

combinatorial optimization problems. Khot and Regev (2003) proved that the problem is

NP-hard to approximate within a factor 2-epsilon, assuming the Unique Games Conjecture

(UGC). This is tight because the problem has an easy 2-approximation algorithm. Without

resorting to the UGC, the best inapproximability result for the problem is due to Dinur and

Safra (2002): vertex cover is NP-hard to approximate within a factor 1.3606.

We prove the following unconditional result about linear programming (LP) relaxations of

the problem: every LP relaxation that approximates vertex cover within a factor of 2-epsilon

has super-polynomially many inequalities. As a direct consequence of our methods, we also

establish that LP relaxations that approximate the independent set problem within any constant

factor have super-polynomially many inequalities.

This is joint work with Abbas Bazzi, Sebastian Pokutta and Ola Svensson

### Mar 9, 2015, 10:15h BC 420

**Alexandr Andoni, ****UC Berkeley**, “*Algorithmic design via efficient data representations*“

*Abstract:* Modern datasets are at scales that would be unthinkable just a decade ago. This demands rethinking the principles of algorithmic design used to handle such datasets. In this talk, I will describe how such principles emerge from the methods of efficient data representations.

The first illustration will be the Nearest Neighbor Search (NNS) problem — an ubiquitous massive datasets problem that is of key importance in machine learning and other areas. A central technique for tackling this problem is Locality Sensitive Hashing (LSH), a data representation method that has seen a lot of success in both theory and practice. I will present the best possible LSH-based algorithm for NNS under the Euclidean distance. Then, I will show a new method that, for the first time, provably outperforms the LSH-based algorithms.

Taking a broader perspective, I will also describe how the lens of “efficient data representation” leads to new fast algorithms for other longstanding questions. Specifically, I will discuss a classic dynamic programming problem (edit distance) and a Linear Programming problem (earth-mover distance). Finally, I will briefly mention how the above ideas enable an algorithmic framework for parallel computation systems such as MapReduce.

### Mar 2, 2015, 10:15h BC 420

**Ruta Mehta, College of Computing, Georgia Tech, ***“Games, Equilibria, and Evolution”*

*Abstract:* The tremendous growth of online markets, ad auctions, and social network communities, where agents interact to achieve their own goals, often selfish, has created a need to apply game theoretic solution concepts more than ever before. In this talk I will discuss one of the most important solution concept in game theory, namely Nash equilibrium, its computation and applications. Recently a remarkable connection was discovered between evolution under sexual reproduction and coordination games. Proceeding along these lines I will show some new insights on genetic diversity.

Towards efficient computation, finding Nash equilibrium in two-player normal form game (2-Nash) is one of the most extensively studied problem. Such a game can be represented by two payoff matrices A and B, one for each player. 2-Nash is PPAD-complete in general, while if the game is zero-sum (B=-A) then it reduces to LP and hence is in P.

Extending this notion, in 2005, Kannan and Theobald defined rank of game (A, B) as rank(A+B), e.g., rank-0 are zero-sum games. They asked for an efficient algorithm for constant rank games, where the primary difficulty as disconnected solution sets, even in rank-1 games. I will answer this question affirmatively for rank-1 games, and negatively for games with rank three or more (unless PPAD=P); the status of rank-2 games remains unresolved. In the process I obtain a number of other results, including a simpler proof of PPAD-hardness for 2-Nash.

### Feb 23, 2015, 10:15h BC 420

**Michael Kapralov, ****IBM T. J. Watson Research Center**, “*Sublinear Algorithms for Modern Graph Analysis*“

*Abstract:* Graphs are a common abstraction for representing large social and information networks, and a powerful set of algorithmic primitives has been developed for their analysis. As the sizes of modern datasets grow, however, many classical polynomial time (and sometimes even linear time) solutions become prohibitively expensive. This calls for sublinear algorithms, i.e. algorithms whose resource requirements are substantially smaller than the size of the input that they operate on.

In this talk, we describe a new approach to the problem of approximating the size of maximum matching in a large graph given as a stream of edge updates using sublinear space. The matching problem is one of the most well-studied questions in combinatorial optimization, and has important applications in modern big data analysis (e.g. online advertising). We obtain a polylogarithmic approximation to maximum matching size using space sublinear in the number of *vertices* in the graph and exponentially smaller than previously known. This is the first algorithm for a graph problem that achieves truly sublinear space in the streaming setting, suggesting the possibility of a new class of more efficient graph analysis primitives.

### Feb 17, 2015, 16:00h BC 420

**Sangxia Huang, KTH Royal Institute of Technology**, “*Improved NP-inapproximability for 2-variable Linear Equations*“

*Abstract:* An instance of the E2Lin(2) problem is a system of equations of the form “x_i + x_j = b (mod 2)”. Given such a system in which it is possible to satisfy all but an epsilon fraction of the equations, we would like to find an assignment that violates as few equations as possible. In this paper, we show that it is NPhard to satisfy all but a C*epsilon fraction of the equations, for any C < 11/8 and 0 < epsilon <= 1/8. Our result holds also for the special case of MaxCut. The previous best NPhardness result, standing for over 15 years, had 5/4 in place of 11/8.

Our proof is by a modified gadget reduction from a predicate that supports a pairwise independent distribution. We also show an inherent limitation to this type of gadget reduction. In particular, we show that no such reduction can establish a hardness factor C greater than ~2.54.

Joint work with Johan Håstad, Rajsekar Manokaran, Ryan O’Donnell, John Wright.

### Feb 10, 2015, 16:00h INF 211

**Shayan Oveis Gharan, University of Washington**, “*Effective Resistance Reducing Flows, Spectrally Thin Trees and Asymmetric TSP*“

*Abstract:* We show that the integrality gap of the natural LP relaxation of the Asymmetric Traveling Salesman Problem (ATSP) is at most polyloglog(n). In other words, there is a polynomial time algorithm that approximates the value of the optimum tour within a factor of polyloglog(n).

This is the first significant improvement over the classical O~(log n) approximation factor for ATSP. Our proof builds on recent developments in operator theory, in particular recent resolution of the Kadison Singer conjecture by Marcus, Spielman and Srivastava. In this talk, I will highlight the main ideas of our proof.

This is a joint work with Nima Anari.

### Feb 3, 2015, 16:00h BC 420

**Yin Tat Lee, MIT,** “*A Faster Algorithm for Linear Programming and the Maximum Flow Problem *“

*Abstract: *In this talk, I will present a new algorithm for solving linear programs. Given a linear program with n variables, m > n constraints, and bit complexity L, our algorithm runs in Õ(sqrt(n) L) iterations each consisting of solving Õ(1) linear systems and additional nearly linear time computation. Our method improves upon the convergence rate of previous state-of-the-art linear programming methods which required solving Õ(sqrt(m)L) linear systems. As a corollary, we achieve a running time of Õ(|E| sqrt(|V|)) for solving the maximum flow problem on a directed graph with |E| edges, |V| vertices, thereby improving upon the previous fastest running time Õ(|E|^(3/2)) and Õ(|E| |V|^(2/3)).

This talk reflects joint work with Aaron Sidford (See http://arxiv.org/abs/1312.6677 and http://arxiv.org/abs/1312.6713).

### Jan 27, 2015, 16:00h INJ 114

**[Virtual Seminar] Zeev Dvir, Princeton**, “*2-Server PIR with sub-polynomial communication*“

*Abstract:* A 2-server Private Information Retrieval (PIR) scheme allows a user to retrieve the i-th bit of an n-bit database replicated among two servers (which do not communicate) while not revealing any information about i to either server. In this work we construct a 1-round 2-server PIR with total communication cost O(n^sqrt(log log n/log n)). This improves over the currently known 2-server protocols which require O(n^(1/3)) communication and matches the communication cost of known 3-server PIR schemes. Our improvement comes from reducing the number of servers in existing protocols, based on Matching Vector Codes, from 3 or 4 servers to 2. This is achieved by viewing these protocols in an algebraic way (using polynomial interpolation) and extending them using partial derivatives.

Joint work with Sivakanth Gopi (Princeton University). [arXiV abs/1407.6692]

To be streamed from https://www.youtube.com/watch?v=iXBncrIlMco

### Jan 20, 2015, 16:00h BC 420

**Robert Krauthgamer, Weizmann Institute,** “*The Sketching Complexity of Graph Cuts*“

*Abstract: *We study the problem of sketching an input graph $G$ on $n$ vertices, so that given (only) the sketch, one can estimate the weight (capacity) of any cut in the graph within a small approximation factor $1+\epsilon$. The classic cut-sparsifier construction of Benczur and Karger (STOC 1996) implies a sketch of size $\tilde O(n/\epsilon^2)$ [this notation hides logarithmic factors].

We show that the dependence on $\epsilon$ can be brought down to linear, at the price of achieving a slightly relaxed guarantee. Specifically, we design a randomized scheme that produces from $G$ a sketch of size $\tilde O(n/\epsilon)$ bits, from which the weight of any cut $(S,\bar S)$ can be reported, with high probability, within factor $1+\epsilon$. We also demonstrate some applications where this “for each” guarantee is indeed useful.

We further prove that that our relaxation is necessary. Specifically, a sketch that can $(1+\epsilon)$-approximate the weight of all cuts in the graph simultaneously (a so-called “for all” guarantee instead of “for each”), must be of size $\Omega(n/\epsilon^2)$ bits.

Joint work with Alexandr Andoni and David Woodruff.

### Jan 16, 2015, 15:00h BC 420

**Scott Aaronoson, MIT,** “*Forrelation: **A problem that optimally separates quantum from classical computing*“

*Abstract: *We achieve essentially the largest possible separation between quantum and classical query complexities. We do so using a property-testing problem called Forrelation, where one needs to decide whether one Boolean function is highly correlated with the Fourier transform of a second function. This problem can be solved using 1 quantum query, yet we show that any randomized algorithm needs Omega(sqrt(N) / log(N)) queries (improving an Omega(N^{1/4}) lower bound of Aaronson). Conversely, we show that this 1 versus ~Omega(sqrt(N)) separation is optimal: indeed, any t-query quantum algorithm whatsoever can be simulated by an O(N^{1-1/2t})-query randomized algorithm. Thus, resolving an open question of Buhrman et al. from 2002, there is no partial Boolean function whose quantum query complexity is constant and whose randomized query complexity is linear. We conjecture that a natural generalization of Forrelation achieves the optimal t versus ~Omega(N^{1-1/2t}) separation for all t. As a bonus, we show that this generalization is BQP-complete. This yields what’s arguably the simplest BQP-complete problem yet known, and gives a second sense in which Forrelation “captures the maximum power of quantum computation.”

Joint work with Andris Ambainis. No quantum computing background is needed for this talk.

### Jan 16, 2015, 14:00h BC 420

**Dana Moshkovitz, MIT,** “*Parallel Repetition From Fortification*“

*Abstract: *The Parallel Repetition Theorem upper-bounds the value of a repeated (tensored) two prover game in terms of the value of the base game and the number of repetitions.

In this work we give a simple transformation on games — “fortification” — and show that for fortified games, the value of the repeated game decreases perfectly exponentially with the number of repetitions, up to an arbitrarily small additive error. Our proof is combinatorial and short.As corollaries, we obtain simple combinatorial proofs of state-of-the-art PCP Theorems.

### Dec 16, 2014, 15:00h BC 420

**David Steurer, Cornell** University, “*Lower bounds on the size of semidefinite programming relaxations*“

*Abstract:** *We introduce a method for proving lower bounds on the efficacy of semidefinite programming (SDP) relaxations for combinatorial problems. In particular, we show that the cut, TSP, and stable set polytopes on n-vertex graphs are not the linear image of the feasible region of any SDP (i.e., any spectrahedron) of dimension less than 2nδ, for some constant δ>0. This result yields the first super-polynomial lower bounds on the semidefinite extension complexity of any explicit family of polytopes.

Our results follow from a general technique for proving lower bounds on the positive semidefinite rank of a matrix. To this end, we establish a close connection between arbitrary SDPs and those arising from the sum-of-squares SDP hierarchy. For approximating maximum constraint satisfaction problems, we prove that SDPs of polynomial-size are equivalent in power to those arising from degree-O(1) sum-of-squares relaxations. This result implies, for instance, that no family of polynomial-size SDP relaxations can achieve better than a 7/8-approximation for MAX 3 SAT.

Joint work with James R. Lee, Prasad Raghavendra.

### Nov 27, 2014, 10:00h-12:00h INM 200

**Abhradeep Guha Thakurta, Yahoo! Labs**, “*Differential Privacy to Machine Learning, and Back – Part II*“

*Abstract: *In this tutorial, I will outline some of the recent developments in the area of differentially private machine learning, which exhibit strong connections between statistical data privacy and robust (stable) machine learning. The main theme of this tutorial would be to demonstrate techniques to achieve one from the other.

In the first part, I will first review some of the high-profile privacy breaches (and also outline some the underlying principles behind them) to motivate the need for a rigorous notion of statistical data privacy. Subsequently, I will move on to introduce the notion of differential privacy, and present some of the basic concepts commonly used in the design of differentially private algorithms. I will conclude part one by showing that differential privacy is a strong form of regularization (a common technique used in machine learning to control prediction error), and analyze the Follow-the-perturbed-leader (FTPL) by Kalai and Vempala’ 2005 in this context.

In the second part, I will analyze the robustness (stability) properties of the some of the commonly used machine learning algorithms (e.g., gradient descent, LASSO, and, Frank-Wolfe algorithm) and bootstrap the robustness properties into obtaining optimal differentially private algorithms for empirical risk minimization, and model selection. Some of the robustness analyses are new, and can be of independent interest irrespective of privacy.

### Nov 25, 2014, 10:00h-12:00h BC 01

**Abhradeep Guha Thakurta, Yahoo! Labs**, “*Differential Privacy to Machine Learning, and Back – Part I*“

*Abstract: *In this tutorial, I will outline some of the recent developments in the area of differentially private machine learning, which exhibit strong connections between statistical data privacy and robust (stable) machine learning. The main theme of this tutorial would be to demonstrate techniques to achieve one from the other.

In the first part, I will first review some of the high-profile privacy breaches (and also outline some the underlying principles behind them) to motivate the need for a rigorous notion of statistical data privacy. Subsequently, I will move on to introduce the notion of differential privacy, and present some of the basic concepts commonly used in the design of differentially private algorithms. I will conclude part one by showing that differential privacy is a strong form of regularization (a common technique used in machine learning to control prediction error), and analyze the Follow-the-perturbed-leader (FTPL) by Kalai and Vempala’ 2005 in this context.

In the second part, I will analyze the robustness (stability) properties of the some of the commonly used machine learning algorithms (e.g., gradient descent, LASSO, and, Frank-Wolfe algorithm) and bootstrap the robustness properties into obtaining optimal differentially private algorithms for empirical risk minimization, and model selection. Some of the robustness analyses are new, and can be of independent interest irrespective of privacy.

### Oct 30, 2014, 10:00h INJ 114

**László Végh, London School of Economics**, “*A polynomial projection-type algorithm for linear programming*“

*Abstract: *The talk exhibits a new polynomial-time algorithm for linear programming feasibility, which can be considered as a polynomial-time implementation of the relaxation method, and is based on previous ideas by Chubanov.

The core of the algorithm is a strongly polynomial subroutine, which for a system of the form Ax=b, 0<=x<=u, either finds a feasible solution, or identifies a coordinate i such that x_i <=u_i/2 holds in every basic feasible solution. This subroutine is repeated, with the matrix rescaled after each iteration. I will also discuss three other recent algorithms using analogous approaches: a classical method made polynomial via repeated rescalings. (Joint work with Giacomo Zambelli.)

### Oct 28, 2014, 14:00h INF 211

**László Végh, London School of Economics**, “*The cutting plane method is polynomial for perfect matchings*“

*Abstract: *We exhibit a cutting plane algorithm that converges in polynomial time for the minimum-cost perfect matching problem. We start from the relaxation consisting of the degree constraints only, and add some of Edmonds’s blossom inequalities as cutting planes. In every intermediate linear program we have a linear number of constraints and the optimal solution is half-integral. This is a joint work with Karthik Chandrasekaran and Santosh Vempala.

### Oct 14, 2014, 14:00h BC 420

**Vincenzo Bonifaci, Institute of Systems Analysis and Informatics, Italian National Research Council (IASI-CNR), Rome**, “*Physarum can compute shortest paths*“

*Abstract: *Natural processes are often capable of exhibiting remarkable information processing abilities. Physarum polycephalum — the ‘many headed’ slime mold — is a single-cell, multiple nuclei organism that is apparently able to solve rather complex tasks, such as finding the shortest path between two points in a maze while foraging food. Mathematical models have been proposed by biomathematicians to describe the feedback mechanism used by Physarum to adapt its tubular channels while foraging two food sources s0 and s1. We prove that, under one such model, the mass of the mold will eventually converge to the shortest s0-s1 path of the network that the mold lies on, independently of the structure of the network or of the initial mass distribution. This matches the experimental observations by the biologists and can be seen as an example of a “natural algorithm”, that is, an algorithm developed by evolution over millions of years.

Natural algorithms provide an information processing perspective on biological processes, as well as inspiration for innovative solutions in computer science. We discuss how to turn the Physarum dynamics into a discrete algorithm that is suitable for implementation on modern computers. We also illustrate some of the many open questions concerning Physarum’s computational capabilities.

### Aug 20, 2014, 16:15h INF 211

**Rani Izsak, Weizmann Institute**, “*Welfare Maximization and the Supermodular Degree*“

*Abstract: *Given a set of items and a collection of players, each with a nonnegative monotone valuation set function over the items, the welfare maximization problem requires that every item be allocated to exactly one player, and one wishes to maximize the sum of values obtained by the players, as computed by applying the respective valuation function to the bundle of items allocated to the player. This problem in its full generality is $\NP$-hard, and moreover, at least as hard to approximate as set packing. Better approximation guarantees are known for restricted classes of valuation functions. In this work we introduce a new parameter, the {\em supermodular degree} of a valuation function, which is a measure for the extent to which the function exhibits supermodular behavior. We design an approximation algorithm for the welfare maximization problem whose approximation guarantee is linear in the supermodular degree of the underlying valuation functions.

Joint work with Uriel Feige.

### Jul 25, 2014, 15:00h BC 420

**Karthekeyan Chandrasekaran, Harvard**, “*Finding a Most Biased Coin with Fewest Flips”*

*Abstract: *The multi-armed bandit problem is a well-studied problem with applications in bioinformatics, clinical trials, etc. A variation of the problem is to find the most rewarding arm in the fewest possible number of steps. When the rewards are Bernoulli, this is equivalent to the problem of finding the most biased coin among a set of coins by tossing them adaptively. The goal here is to devise a strategy that minimizes the expected number of tosses until there exists a coin whose posterior probability of being most biased is at least 1-delta, for a given delta. In this talk, I will present an optimal adaptive strategy for a particular probabilistic setting of the problem. I will show that the strategy performs the best possible action in each step by employing tools from the field of Markov games.

Based on joint work with Richard Karp.

### Jul 24, 2014, 11:00h BC 420

**Lorenzo Orecchia, MIT**, “*A Survey of First-Order Iterative Methods in the Design of Fast Graph Algorithms: from Multiplicative Weight Updates to Nesterov’s Algorithm*”

*Abstract:* Fast iterative methods from Convex Optimization play a crucial role in a number of recent breakthroughs in the design of nearly-linear-time algorithms for fundamental graph problems, such as maximum flow and graph partitioning. Multiplicative Weight Updates, Euclidean and Non-Euclidean Gradient Descent, and Nesterov’s Method have become a mainstay in the construction and analysis of fast algorithms.

However, until recently, the relation between these different methods and the reason for their success have been somewhat murky. What is the exact relation between Multiplicative Weight Updates and Gradient Descent? Why do Multiplicative Weight Updates show up in so many settings? What is the intuition behind Nesterov’s iterative algorithm that achieves the asymptotically optimal iteration count for smooth convex functions (hint: it isn’t just an algebraic trick)? The answer to these questions was not clear.

In this survey, we will provide answers by presenting a unified framework that reveals the power and limitations of each method and provides much needed geometric intuition. Among other insights, we will explain how Multiplicative Weight Updates are a particular instance of a dual iterative method, known as Mirror Descent, and how Nesterov’s algorithm can be naturally derived as a combination of Mirror and Gradient Descent.

### Jul 18, 2014, 15:15h INF 211

**Rico Zenklusen, ETH Zurich**,* “Multi-Budgeted Matchings via the Ham Sandwich Theorem“*

*Abstract: *In many applications, one has to deal with multiple, partially conflicting constraints. In this talk, we consider a multi-objective variant of the maximum weight matching problem, which is a classical combinatorial optimization problem with numerous applications. A natural way to deal with several objectives is to turn all of the objectives but one into budget constraints. This leads to the multi-budgeted matching problem, which asks to find a maximum weight matching subject to k linear constraints with nonnegative coefficients. Whereas this problem can easily be shown to be NP hard even for k=1, I will present in this talk a polynomial-time approximation scheme that works for any constant k. Our algorithm is based on rounding an optimal solution x* of an LP relaxation. Starting with a convex decomposition of x* into few matchings, as guaranteed by Caratheodory’s theorem, we reduce the problem of rounding x* to an arguably simpler problem of successively merging two matchings in the convex decomposition of x*.

To prove that our algorithm is correct, we leverage two beautiful non-constructive mathematical theorems. More precisely, the Jordan Curve Theorem gives a concise and intuitive proof why our algorithm works for k=1, and a result of Stromquist and Woodall that follows from the Ham Sandwich Theorem allows for showing correctness for any constant k.

Part of this work is joint with Fabrizio Grandoni, R. Ravi and Mohit Singh

### Jul 14, 2014, 11:00h BC 420

**Vineet Goyal, Columbia**,* “How to Model Buyer Preferences to Maximize Revenues?”*

*Abstract:* Consider a seller with n (substitutable) products with fixed prices and unlimited supply. Each buyer has a random preference (a permutation over products) and selects the most preferred product among the ones offered by the seller. The seller needs to decide on the subset of products to offer such that expected revenue over random preferences is maximized. This problem is referred to as the assortment planning problem in the literature and arises in many applications including retailing and airlines.

The two fundamental challenges in the assortment planning problem are: i) constructing a model for buyer preferences from historical data that reveals only selections and not preferences, and ii) solving the resulting assortment optimization problem efficiently. In this paper, we present a new paradigm for modeling buyer preferences that gives a simultaneous approximation for a large family of parametric preference models. Moreover, the resulting assortment optimization problem is poly-time solvable. In this talk, I will discuss the properties of this new preference model and conclude with several open challenges in this area. The talk should be accessible to a broad audience.

### Jul 2, 2014, 11:00h BC 420

**James Zou, MIT and MSR**, “*Biology for Algorithms and Algorithms for Biology*“

*Abstract:* I will try to illustrate the fruitful and fun interplay between algorithms and biology through two recent vignettes. In the first part, I will describe how we can use slime mold to solve puzzles, including linear programming problems, as an example of biologically inspired algorithm. In the second, I will discuss an algorithm for deciphering the genomic ancestry of both parents from the DNA of the offspring. This is a basic problem of personal genomics. No biological background will be assumed for this talk.

### Past Seminars

For seminars before June 30, 2014 please see here.