Discovering Subsumption Hierarchies of Ontology Concepts from Text Corpora
| Published in | Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence, pp. 402-408 |
|---|---|
| Main Authors | , , , |
| Format | Conference Proceeding |
| Language | English |
| Published | Washington, DC, USA: IEEE Computer Society, 02.11.2007 |
| Series | ACM Conferences |
| ISBN | 0769530265 9780769530260 |
| DOI | 10.1109/WI.2007.47 |
| Summary: | This paper proposes a method for learning ontologies from a corpus of text documents. The method identifies concepts in the documents and organizes them into a subsumption hierarchy, without presupposing the existence of a seed ontology. It uncovers the latent topics from which the document text is assumed to be generated, and these topics form the concepts of the new ontology. This is done in a language-neutral way, using probabilistic space reduction techniques over the original term space of the corpus. Given multiple sets of discovered concepts (latent topics), the method constructs a subsumption hierarchy by performing conditional independence tests among pairs of latent topics, given a third one. The paper provides experimental results on the GENIA corpus from the biomedical domain. |
|---|---|
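
The summary mentions conditional independence tests among pairs of latent topics given a third one, but does not say which test statistic or threshold the paper uses. As a rough illustration only, the sketch below estimates conditional mutual information from per-document topic indicators and treats a value below a cutoff as conditional independence. The function names, the binary topic-presence encoding, and the `threshold` value are all assumptions made for this example and are not taken from the paper.

```python
from collections import Counter
from math import log


def conditional_mutual_information(a, b, c):
    """Estimate I(A; B | C) in nats from three equal-length sequences.

    Each sequence holds one discrete value per document, e.g. 1 if a
    (hypothetical) latent topic is present in that document, else 0.
    """
    n = len(a)
    if n == 0 or len(b) != n or len(c) != n:
        raise ValueError("a, b and c must be non-empty and of equal length")

    n_abc = Counter(zip(a, b, c))  # joint counts of (A, B, C)
    n_ac = Counter(zip(a, c))      # joint counts of (A, C)
    n_bc = Counter(zip(b, c))      # joint counts of (B, C)
    n_c = Counter(c)               # marginal counts of C

    cmi = 0.0
    for (x, y, z), count in n_abc.items():
        # p(x,y|z) / (p(x|z) * p(y|z)) expressed with raw counts
        ratio = count * n_c[z] / (n_ac[(x, z)] * n_bc[(y, z)])
        cmi += (count / n) * log(ratio)
    return cmi


def conditionally_independent(a, b, c, threshold=0.01):
    """Assumed decision rule: A and B are independent given C if CMI is small."""
    return conditional_mutual_information(a, b, c) < threshold


# Toy data: topics A and B are fully explained by topic C.
topic_c = [0, 0, 0, 0, 1, 1, 1, 1]
print(conditionally_independent(topic_c, topic_c, topic_c))  # True

# Toy data: C is constant, so it cannot explain the A-B dependence.
topic_a = [0, 1, 0, 1, 0, 1, 0, 1]
constant = [0] * 8
print(conditionally_independent(topic_a, topic_a, constant))  # False
```

How the outcome of such a test is mapped to edges of the subsumption hierarchy is not stated in the summary, so the sketch stops at the pairwise decision.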