A tree-based corpus annotated with Cyber-Syndrome, symptoms, and acupoints
Prolonged and over-excessive interaction with cyberspace poses a threat to people’s health and leads to the occurrence of Cyber-Syndrome, which covers not only physiological but also psychological disorders. This paper aims to create a tree-shaped gold-standard corpus that annotates the Cyber-Syndro...
Saved in:
Published in | Scientific data Vol. 11; no. 1; pp. 482 - 12 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published |
London
Nature Publishing Group UK
10.05.2024
Nature Publishing Group Nature Portfolio |
Subjects | |
Online Access | Get full text |
ISSN | 2052-4463 2052-4463 |
DOI | 10.1038/s41597-024-03321-0 |
Cover
Summary: | Prolonged and over-excessive interaction with cyberspace poses a threat to people’s health and leads to the occurrence of Cyber-Syndrome, which covers not only physiological but also psychological disorders. This paper aims to create a tree-shaped gold-standard corpus that annotates the Cyber-Syndrome, clinical manifestations, and acupoints that can alleviate their symptoms or signs, designating this corpus as CS-A. In the CS-A corpus, this paper defines six entities and relations subject to annotation. There are 448 texts to annotate in total manually. After three rounds of updating the annotation guidelines, the inter-annotator agreement (IAA) improved significantly, resulting in a higher IAA score of 86.05%. The purpose of constructing CS-A corpus is to increase the popularity of Cyber-Syndrome and draw attention to its subtle impact on people’s health. Meanwhile, annotated corpus promotes the development of natural language processing technology. Some model experiments can be implemented based on this corpus, such as optimizing and improving models for discontinuous entity recognition, nested entity recognition, etc. The CS-A corpus has been uploaded to
figshare
. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23 ObjectType-Undefined-3 |
ISSN: | 2052-4463 2052-4463 |
DOI: | 10.1038/s41597-024-03321-0 |