Distribution Approximation Techniques for Security, Differential Privacy, and Learning


Jacobs Hall, Room 2512, Jacobs School of Engineering, 9500 Gilman Dr, La Jolla, CA 92093

Sponsored By:
Professor Young-Han Kim

Paul Cuff, Ph.D.
Dept. of Electrical Engineering
Princeton University


In a variety of settings, the performance of a system can be assured or characterized by relating it to a distribution that closely approximates its behavior. The approximating distribution may be simpler to analyze than the exact behavior, or ideal in some sense. In this talk we show several settings where we have developed and used approximation techniques to obtain new results.

First, we discuss a wiretap channel setting (with channel state known at the encoder) for which we have recently obtained a secrecy rate that improves on the best known in the literature. Our encoder is both simple and unusual. In information-theoretic security, the analysis of even simple encoders can be a challenge. For this result, we benefited greatly from our recently established technique of distribution approximation: we construct a proxy distribution for which the security of the communication is simple to prove, and we show that this proxy distribution is close in total variation to the true system behavior.

Next, we discuss differential privacy, which has become a popular notion of database privacy. The definition requires that certain distributions be close to each other under a particular metric. We develop new bounds establishing that differential privacy is equivalent to a mutual information constraint. In fact, it is conditional mutual information that must be used to establish this equivalence; this is the feature that was overlooked in previous attempts in the literature to make this sort of connection. The new viewpoint brings simple intuition and understanding to the concept.

Finally, a classic statistics question that has gathered a great deal of interest and activity in recent years is that of estimating key features of the distribution from which data was generated, without making any assumptions---in particular, estimating how many possible values the data can take (e.g., how many species were not sampled?) or estimating the entropy.
Results of the past few years have shown that these quantities can be estimated accurately from far fewer samples than one might expect (even when the space is logarithmically undersampled); unfortunately, however, the data alone gives you no way of knowing that you have taken sufficient samples for an accurate estimate. We show that by modifying the objective, allowing the estimate to be accurate for an approximating distribution, you can overcome this obstacle and know from the data alone that you have an accurate estimate.
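For readers unfamiliar with the estimation problem above, the difficulty can be seen in the naive baseline that the recent estimators improve upon: the plug-in (empirical) entropy estimator. The sketch below is illustrative only and is not the talk's estimator; the example data are hypothetical.

```python
import math
from collections import Counter

def plugin_entropy(samples):
    """Naive plug-in entropy estimate (in bits) from empirical frequencies.
    Known to be badly biased when the sample size is small relative to the
    support size -- exactly the undersampled regime discussed above."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical example: 100 draws that happen to cover a 4-symbol
# alphabet uniformly give the uniform entropy, log2(4) = 2 bits.
print(plugin_entropy([0, 1, 2, 3] * 25))
```

Note that any symbols never observed contribute nothing to this estimate, which is why undersampling silently biases it low and why the data alone cannot certify accuracy.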
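Returning to the proxy-distribution technique from the first part of the abstract: closeness there is measured in total variation, which for discrete distributions is half the L1 distance between the probability vectors. A minimal sketch, with made-up example distributions standing in for the "true" and "proxy" system behaviors:

```python
import numpy as np

def total_variation(p, q):
    """Total variation distance between two discrete distributions,
    i.e. half the L1 distance between their probability vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

# Hypothetical example: a "true" system distribution and a proxy
# that approximates it; the distance bounds how much any event's
# probability can differ between the two.
p_true = [0.5, 0.3, 0.2]
p_proxy = [0.45, 0.35, 0.2]
print(total_variation(p_true, p_proxy))
```

The operational meaning is what makes the technique work: if a security property holds under the proxy distribution, it can fail under the true distribution only on events whose probability differs by at most the total variation distance.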
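The "particular metric" in the differential-privacy discussion is a worst-case likelihood ratio: a mechanism is epsilon-differentially private exactly when its output distributions on any two neighboring databases differ by at most e^epsilon pointwise. A minimal sketch (the mechanism and numbers below are hypothetical, chosen only to illustrate the definition, not the talk's mutual-information equivalence):

```python
import math

def max_privacy_loss(mech_p, mech_q):
    """Worst-case absolute log-likelihood ratio between a mechanism's
    output distributions on two neighboring databases.  The mechanism is
    epsilon-DP iff this value is at most epsilon for every neighboring
    pair.  Assumes both distributions share the same support."""
    return max(abs(math.log(p / q)) for p, q in zip(mech_p, mech_q))

# Hypothetical binary randomized response that flips the answer with
# probability 0.25; neighboring databases induce these two outputs.
p = [0.75, 0.25]  # output distribution when the true answer is 0
q = [0.25, 0.75]  # output distribution when the true answer is 1
print(max_privacy_loss(p, q))  # achieves epsilon = log(3)
```

This pointwise closeness condition is what the talk relates to a conditional mutual information constraint.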

Speaker Bio:
Paul Cuff received the B.S. degree in electrical engineering from Brigham Young University, Provo, UT, in 2004, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University in 2006 and 2009, respectively. His Ph.D. research advisor was Thomas Cover. Since 2009 he has been an Assistant Professor of Electrical Engineering at Princeton University. Over the years Dr. Cuff has interacted with industry in both the technology and financial sectors, spending summers at Google, Microsoft Research, and elsewhere, and giving talks at a number of hedge funds. In 2005, while in graduate school, he co-founded a tech startup, Adaptive Hearing Solutions, with Bernard Widrow, centered around signal processing technology. This venture began by winning the Stanford business plan competition. As a graduate student, Dr. Cuff was awarded the ISIT 2008 Student Paper Award for his work titled "Communication Requirements for Generating Correlated Random Variables." This work has led to fruitful and unexpected avenues of research in secure source coding. As faculty, he received the NSF CAREER Award in 2014 and the AFOSR Young Investigator Program Award in 2015. He is a member of APS and AVS.

Cheryl Wills (clwills@ece.ucsd.edu)