Discovering Subsumption Hierarchies of Ontology Concepts from Text Corpora

Bibliographic Details
Published in	Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence, pp. 402-408
Main Authors	Zavitsanos, Elias; Paliouras, Georgios; Vouros, George A.; Petridis, Sergios
Format	Conference Proceeding
Language	English
Published	Washington, DC, USA: IEEE Computer Society, 02.11.2007
Series	ACM Conferences
Subjects	Applied computing > Document management and text processing; Applied computing > Document management and text processing > Document capture > Document analysis; Computing methodologies > Machine learning; Information systems > Information retrieval; Information systems > Information systems applications > Data mining; Theory of computation > Models of computation > Probabilistic computation
ISBN	0769530265 9780769530260
DOI	10.1109/WI.2007.47

More Information
Summary:	This paper proposes a method for learning ontologies given a corpus of text documents. The method identifies concepts in documents and organizes them into a subsumption hierarchy, without presupposing the existence of a seed ontology. The method uncovers latent topics in terms of which document text is being generated. These topics form the concepts of the new ontology. This is done in a language neutral way, using probabilistic space reduction techniques over the original term space of the corpus. Given multiple sets of concepts (latent topics) being discovered, the proposed method constructs a subsumption hierarchy by performing conditional independence tests among pairs of latent topics, given a third one. The paper provides experimental results over the GENIA corpus from the domain of biomedicine.