An Ontological Framework for Information Extraction From Diverse Scientific Sources
Automatic information extraction from online published scientific documents is useful in various applications such as tagging, web indexing and search engine optimization. As a result, automatic information extraction has become among the hottest areas of research in text mining. Although various in...
Saved in:
Published in | IEEE access Vol. 9; pp. 42111 - 42124 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published |
Piscataway
IEEE
2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
Online Access | Get full text |
ISSN | 2169-3536 2169-3536 |
DOI | 10.1109/ACCESS.2021.3063181 |
Cover
Summary: | Automatic information extraction from online published scientific documents is useful in various applications such as tagging, web indexing and search engine optimization. As a result, automatic information extraction has become among the hottest areas of research in text mining. Although various information extraction techniques have been proposed in the literature, their efficiency demands domain specific documents with static and well-defined format. Furthermore, their accuracy is challenged with a slight modification in the format. To overcome these issues, a novel ontological framework for information extraction (OFIE) using fuzzy rule-base (FRB) and word sense disambiguation (WSD) is proposed. The proposed approach is validated with a significantly wider document domains sourced from well-known publishing services such as IEEE, ACM, Elsevier, and Springer. We have also compared the proposed information extraction approach against state-of-the-art techniques. The results of the experiment show that the proposed approach is less sensitive to changes in the document format and has a significantly better average accuracy of 89.14% and F-score as 89%. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
ISSN: | 2169-3536 2169-3536 |
DOI: | 10.1109/ACCESS.2021.3063181 |