Named Entity Recognition in Indonesian History Textbook Using BERT Model
Published in | Cogito smart journal Vol. 11; no. 1; pp. 140 - 151 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | 30.06.2025 |
ISSN | 2541-2221 2477-8079 |
DOI | 10.31154/cogito.v11i1.880.140-151 |
Summary: | History is no longer taught as an explicit subject in some primary and secondary education institutions. This raises concern about the younger generation's knowledge of their nation's history. Although history textbooks are available in digital form and contain much information, their presentation remains unstructured and difficult to understand. This research aims to develop a model for extracting historical entities from textbooks using a Named Entity Recognition (NER) approach based on BERT (Bidirectional Encoder Representations from Transformers). The text data is drawn from the history chapter of the 8th-grade Social Science textbook published by the Ministry of Education. The research stages are data extraction, preprocessing, IOB labeling, entity identification with the BERT model, and performance evaluation. Preprocessing reduced irrelevant words and improved analysis efficiency. The BERT model achieved a precision of 88.68%, a recall of 74.60%, and an F1-score of 81.03%. Training time fluctuated between epochs, influenced by entity variation and sentence complexity. Overall, this research shows that the model can extract historical entities automatically and accurately, potentially enriching historical understanding for students and society through the use of Natural Language Processing technology. |
---|---|
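
The summary mentions IOB labeling and reports precision, recall, and F1-score. As a minimal sketch, assuming nothing about the authors' actual code, the Python below shows how IOB tags can be grouped into entity spans and how the reported F1-score follows from the reported precision and recall as their harmonic mean. The function names, the example sentence, and the `PER`/`LOC` tag names are hypothetical and do not come from the article.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def iob_to_spans(tokens, tags):
    """Group IOB-tagged tokens into (entity_type, text) pairs.

    "B-X" opens an entity of type X, "I-X" continues an open entity of
    the same type, and anything else (including a stray "I-X") closes
    the open entity and is treated as outside.
    """
    spans = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


# Hypothetical example sentence and tag set, for illustration only.
tokens = ["Sultan", "Agung", "menyerang", "Batavia"]
tags = ["B-PER", "I-PER", "O", "B-LOC"]
spans = iob_to_spans(tokens, tags)

# Precision and recall as reported in the summary, expressed as fractions.
f1_percent = f1_score(0.8868, 0.7460) * 100
print(spans)
print(f"{f1_percent:.2f}%")
```

With the reported precision of 88.68% and recall of 74.60%, the harmonic mean comes to roughly 81.03%, which agrees with the F1-score stated in the summary.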