Maximum Entropy Distribution Estimation with Generalized Regularization

Bibliographic Details
Published in: Learning Theory, pp. 123–138
Main Authors: Dudík, Miroslav; Schapire, Robert E.
Format: Book Chapter; Conference Proceeding
Language: English
Published: Berlin, Heidelberg: Springer Berlin Heidelberg, 2006
Series: Lecture Notes in Computer Science
ISBN: 3540352945; 9783540352945
ISSN: 0302-9743; 1611-3349
DOI: 10.1007/11776420_12

More Information
Summary: We present a unified and complete account of maximum entropy distribution estimation subject to constraints represented by convex potential functions or, alternatively, by convex regularization. We provide fully general performance guarantees and an algorithm with a complete convergence proof. As special cases, we can easily derive performance guarantees for many known regularization types, including $\ell_1$, $\ell_2$, $\ell_2^2$, and $\ell_1 + \ell_2^2$ style regularization. Furthermore, our general approach enables us to use information about the structure of the feature space or about sample selection bias to derive entirely new regularization functions with superior guarantees. We propose an algorithm solving a large and general subclass of generalized maxent problems, including all discussed in the paper, and prove its convergence. Our approach generalizes techniques based on information geometry and Bregman divergences as well as those based more directly on compactness.
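To make the regularized-maxent setting concrete, below is a minimal sketch of one special case mentioned in the abstract: $\ell_1$-regularized maxent on a finite toy domain, solved by proximal gradient descent on the dual, i.e., the $\ell_1$-penalized negative log-likelihood of a Gibbs distribution $q_\lambda(x) \propto \exp(\lambda \cdot f(x))$. The feature matrix, regularization weight, and iteration count here are illustrative assumptions, not taken from the paper; the paper's own algorithm covers a much broader class of convex potentials.

```python
import numpy as np

# Minimal sketch (illustrative, not the paper's algorithm): l1-regularized
# maxent on a finite toy domain, solved by proximal gradient descent (ISTA)
# on the dual -- the l1-penalized negative log-likelihood of a Gibbs
# distribution q_lambda(x) proportional to exp(lambda . f(x)).

rng = np.random.default_rng(0)
n_points, n_features = 50, 5
F = rng.normal(size=(n_points, n_features))  # hypothetical features f_j(x)
emp = F[:10].mean(axis=0)                    # empirical feature averages from a "sample"
beta = 0.1                                   # l1 regularization weight (assumed)

def gibbs(lam):
    """Gibbs distribution q_lambda over the finite domain, computed stably."""
    z = F @ lam
    z -= z.max()                             # avoid overflow in the exponential
    p = np.exp(z)
    return p / p.sum()

lam = np.zeros(n_features)
# Safe step size: the Hessian of log Z(lambda) is a feature covariance,
# whose spectral norm is at most max_x ||f(x)||^2.
step = 1.0 / np.max(np.sum(F ** 2, axis=1))

for _ in range(1000):
    p = gibbs(lam)
    grad = F.T @ p - emp                     # gradient of log Z(lam) - lam . emp
    lam = lam - step * grad
    # Proximal (soft-thresholding) step for the beta * ||lam||_1 penalty;
    # this is what drives individual feature weights exactly to zero.
    lam = np.sign(lam) * np.maximum(np.abs(lam) - step * beta, 0.0)

p = gibbs(lam)
print("max |E_q[f] - empirical|:", np.abs(F.T @ p - emp).max())
print("feature weights:", np.round(lam, 3))
```

At the optimum, the learned distribution matches each empirical feature average only to within a tolerance governed by beta; this relaxed-constraint reading of $\ell_1$ regularization is the kind of special case that the abstract's general convex potentials subsume.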