Learning Sums of Independent Integer Random Variables
Let S = X 1 +···+X n be a sum of n independent integer random variables X i , where each X i is supported on {0, 1, ..., k - 1} but otherwise may have an arbitrary distribution (in particular the Xi's need not be identically distributed). How many samples are required to learn the distribution...
        Saved in:
      
    
          | Published in | 2013 IEEE 54th Annual Symposium on Foundations of Computer Science pp. 217 - 226 | 
|---|---|
| Main Authors | , , , , | 
| Format | Conference Proceeding | 
| Language | English | 
| Published | 
            IEEE
    
        01.10.2013
     | 
| Subjects | |
| Online Access | Get full text | 
| ISSN | 0272-5428 | 
| DOI | 10.1109/FOCS.2013.31 | 
Cover
| Summary: | Let S = X 1 +···+X n be a sum of n independent integer random variables X i , where each X i is supported on {0, 1, ..., k - 1} but otherwise may have an arbitrary distribution (in particular the Xi's need not be identically distributed). How many samples are required to learn the distribution S to high accuracy? In this paper we show that the answer is completely independent of n, and moreover we give a computationally efficient algorithm which achieves this low sample complexity. More precisely, our algorithm learns any such S to ε-accuracy (with respect to the total variation distance between distributions) using poly(k, 1/ε) samples, independent of n. Its running time is poly(k, 1/ε) in the standard word RAM model. Thus we give a broad generalization of the main result of [DDS12b] which gave a similar learning result for the special case k = 2 (when the distribution S is a Poisson Binomial Distribution). Prior to this work, no nontrivial results were known for learning these distributions even in the case k = 3. A key difficulty is that, in contrast to the case of k = 2, sums of independent {0, 1, 2}-valued random variables may behave very differently from (discretized) normal distributions, and in fact may be rather complicated - they are not log-concave, they can be Θ(n)-modal, there is no relationship between Kolmogorov distance and total variation distance for the class, etc. Nevertheless, the heart of our learning result is a new limit theorem which characterizes what the sum of an arbitrary number of arbitrary independent {0, 1, ... , k-1}-valued random variables may look like. Previous limit theorems in this setting made strong assumptions on the "shift invariance" of the random variables Xi in order to force a discretized normal limit. We believe that our new limit theorem, as the first result for truly arbitrary sums of independent {0, 1, ... , k-1}-valued random variables, is of independent interest. | 
|---|---|
| ISSN: | 0272-5428 | 
| DOI: | 10.1109/FOCS.2013.31 |