Testing High-Dimensional Distributions: Subcube Conditioning, Random Restrictions, and Mean Testing

Seminar Date(s)
Seminar Location
Computer Science and Engineering (CSE) Building - Room 4258
Seminar Speaker
Clément Canonne - IBM Research, Almaden
Photo
Abstract

Given a distribution p​​ on {-1,1}^d​​, we want to test whether p​​ is uniform. If p is assumed to be a product distribution, this can be done with Theta(sqrt{d}) samples; without this assumption, then things get exponentially worse and Theta(2^{d/2}) are necessary and sufficient. Assume however we can condition on arbitrary bits: does the problem become easier? If so, how much?

This is the "subcube conditional sampling model", first considered in Bhattacharyya and Chakraborty (2018). We provide a nearly optimal ~O(sqrt{d})-algorithm for testing uniformity in this model. The key
component of our proof is a natural notion of random restriction for distributions on {-1,1}^d, and a quantitative analysis of how such a restriction affects the mean vector of the distribution. Along the way, we provide an optimal upper bound on the (unconditional) sample complexity of a natural problem, "mean testing."

Joint work with Xi Chen, Gautam Kamath, Amit Levi, and Erik Waingarten.

Seminar Speaker Bio
Clément Canonne is an IBM Herman Goldstine Postdoctoral fellow at IBM Research, Almaden. His research focuses on property and distribution testing, learning theory, and theoretical aspects of machine learning and data privacy. He received his Ph.D. in Computer Science from Columbia University in 2017, and spent two years as a Motwani Postdoctoral fellow at Stanford University before joining IBM Research in 2019.
Seminar Contact
Prof. Alon Orlitsky <aorlitsky@ucsd.edu>