Discovering Subsumption Hierarchies of Ontology Concepts from Text Corpora

Bibliographic Details
Published in Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence, pp. 402-408
Main Authors Zavitsanos, Elias; Paliouras, Georgios; Vouros, George A.; Petridis, Sergios
Format Conference Proceeding
Language English
Published Washington, DC, USA: IEEE Computer Society, 02.11.2007
Series ACM Conferences
ISBN 0769530265, 9780769530260
DOI 10.1109/WI.2007.47

Summary: This paper proposes a method for learning ontologies from a corpus of text documents. The method identifies concepts in the documents and organizes them into a subsumption hierarchy, without presupposing the existence of a seed ontology. It uncovers the latent topics from which the document text is assumed to be generated, and these topics form the concepts of the new ontology. This is done in a language-neutral way, using probabilistic space reduction techniques over the original term space of the corpus. Given the multiple sets of concepts (latent topics) discovered, the method constructs a subsumption hierarchy by performing conditional independence tests among pairs of latent topics, given a third one. The paper provides experimental results over the GENIA corpus from the domain of biomedicine.