Formalizing Complex Prior Information to Quantify Subjective Interestingness of Frequent Pattern Sets

Bibliographic Details
Published in	Advances in Intelligent Data Analysis XI, pp. 161-171
Main Authors	Kontonasios, Kleanthis-Nikolaos; De Bie, Tijl
Format	Book Chapter
Language	English
Published	Berlin, Heidelberg: Springer Berlin Heidelberg, 2012
Series	Lecture Notes in Computer Science
Subjects	Itemset Frequency; Markov Network; MaxEnt Model; Maximum Entropy Principle; Prior Knowledge
ISBN	9783642341557 3642341551
ISSN	0302-9743 1611-3349
DOI	10.1007/978-3-642-34156-4_16

Summary:	In this paper, we are concerned with the problem of modelling prior information of a data miner about the data, with the purpose of quantifying subjective interestingness of patterns. Recent results have achieved this for the specific case of prior expectations on the row and column marginals, based on the Maximum Entropy principle [2,9]. In the current paper, we extend these ideas to make them applicable to more general prior information, such as knowledge of frequencies of itemsets, a cluster structure in the data, or the presence of dense areas in the database. As in [2,9], we show how information theory can be used to quantify subjective interestingness against this model, in particular the subjective interestingness of tile patterns [3]. Our method presents an efficient, flexible, and rigorous alternative to the randomization approach presented in [5]. We demonstrate our method by searching for interesting patterns in real-life data with respect to various realistic types of prior information.
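
To make the row-and-column-marginal case mentioned in the summary concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard result that a Maximum Entropy distribution over a binary database, constrained only on expected row and column sums, factorizes into independent Bernoulli cells with probability sigmoid(lam[i] + mu[j]). The function names, the toy database, the plain gradient-ascent fit, and the use of the negative log-probability of an all-ones tile as its information content are all illustrative assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_maxent(row_sums, col_sums, steps=2000, lr=0.1):
    """Fit cell probabilities p[i][j] = sigmoid(lam[i] + mu[j]) so that the
    expected row and column sums match the given ones (hypothetical helper)."""
    n, m = len(row_sums), len(col_sums)
    lam, mu = [0.0] * n, [0.0] * m
    for _ in range(steps):
        p = [[sigmoid(lam[i] + mu[j]) for j in range(m)] for i in range(n)]
        # Gradient ascent on the concave dual: the gradient for each
        # multiplier is (target marginal - expected marginal).
        for i in range(n):
            lam[i] += lr * (row_sums[i] - sum(p[i]))
        for j in range(m):
            mu[j] += lr * (col_sums[j] - sum(p[i][j] for i in range(n)))
    return [[sigmoid(lam[i] + mu[j]) for j in range(m)] for i in range(n)]

def tile_information(p, rows, cols):
    """Information content in bits of observing all ones in the tile
    rows x cols, assuming independent cells under the fitted model."""
    return -sum(math.log2(p[i][j]) for i in rows for j in cols)

# Toy binary database (made up for illustration).
D = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
row_sums = [sum(r) for r in D]
col_sums = [sum(c) for c in zip(*D)]

p = fit_maxent(row_sums, col_sums)
info = tile_information(p, rows=[0, 1], cols=[0, 1])
print(f"information content of the 2x2 tile: {info:.3f} bits")
```

A tile that is improbable under the prior model carries more bits and would therefore be more surprising to a data miner holding only that prior. The richer kinds of prior information the summary describes (itemset frequencies, cluster structure, dense areas) would add further constraints and are not covered by this sketch.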