
Private information retrieval protocols allow a user to retrieve a data item from a database without revealing any information about the identity of the item being retrieved. Specifically, in information-theoretic k-server PIR, the database is replicated among k non-communicating servers, and each server learns nothing about the item retrieved by the user. The effectiveness of PIR protocols is usually measured in terms of their communication complexity, which is the total number of bits exchanged between the user and the servers. However, another important cost parameter is the storage overhead, which is the ratio between the total number of bits stored on all the servers and the number of bits in the database. Since single-server information-theoretic PIR is impossible, the storage overhead of all existing PIR protocols is at least 2 (or k, in the case of k-server PIR).
In this work, we show that information-theoretic PIR can be achieved with storage overhead arbitrarily close to the optimal value of 1, without sacrificing the communication complexity. Specifically, we prove that all known k-server PIR protocols can be efficiently emulated, while preserving both privacy and communication complexity but significantly reducing the storage overhead. To this end, we distribute the n bits of the database among s+r servers, each storing n/s coded bits (rather than replicas). Notably, our coding scheme remains the same, regardless of the specific k-server PIR protocol being emulated. For every fixed k, the resulting storage overhead (s+r)/s approaches 1 as s grows; explicitly we have r < k \sqrt{s}(1 + o(1)). Moreover, in the special case k = 2, the storage overhead is only 1 + (1/s).
In order to achieve these results, we introduce and study a new kind of binary linear codes. We call them k-server PIR codes, although they could be also called "availability codes". We then show how such codes can be constructed from Steiner systems, from one-step majority-logic decodable codes, from constant-weight codes, and from certain locally recoverable codes. We also establish several bounds on the parameters of k-server PIR codes, and tabulate the results for all s \le 32 and k \le 16.
His research interests include error-correcting codes, algebraic and iterative decoding algorithms, lattices and sphere packings, coding for digital media, cryptography and computational complexity theory, and fun math problems.
He received an IBM Invention Achievement Award in 1993, and NSF Research Initiation and CAREER awards in 1994 and 1995. In 1996, he was appointed Fellow in the Center for Advanced Study at the University of Illinois, and received the Xerox Award for faculty research. In the same year, he became a Fellow of the Packard Foundation. He received the IEEE Information Theory Society Paper Award (jointly with Ralf Koetter) for the year 2004. In 2005, he received the Fulbright Senior Scholar Fellowship, and the Best Paper Award at the IEEE Symposium on Foundations of Computer Science (FOCS). During 1995–1998, he was an Associate Editor for Coding Theory and during 1998–2001, he was the Editor-in-Chief of the IEEE Transactions on Information Theory. From 2003 to 2009, he was an Editor for the SIAM Journal on Discrete Mathematics. He is currently serving on the Executive Editorial Board for the IEEE Transactions on Information Theory. He has been a member of the Board of Governors of the IEEE Information Theory Society during 1998-2006, and again starting in 2011.