A tree-based corpus annotated with Cyber-Syndrome, symptoms, and acupoints

Prolonged and over-excessive interaction with cyberspace poses a threat to people’s health and leads to the occurrence of Cyber-Syndrome, which covers not only physiological but also psychological disorders. This paper aims to create a tree-shaped gold-standard corpus that annotates the Cyber-Syndro...

Full description

Saved in:
Bibliographic Details
Published inScientific data Vol. 11; no. 1; pp. 482 - 12
Main Authors Wang, Wenxi, Zhao, Zhan, Ning, Huansheng
Format Journal Article
LanguageEnglish
Published London Nature Publishing Group UK 10.05.2024
Nature Publishing Group
Nature Portfolio
Subjects
Online AccessGet full text
ISSN2052-4463
2052-4463
DOI10.1038/s41597-024-03321-0

Cover

More Information
Summary:Prolonged and over-excessive interaction with cyberspace poses a threat to people’s health and leads to the occurrence of Cyber-Syndrome, which covers not only physiological but also psychological disorders. This paper aims to create a tree-shaped gold-standard corpus that annotates the Cyber-Syndrome, clinical manifestations, and acupoints that can alleviate their symptoms or signs, designating this corpus as CS-A. In the CS-A corpus, this paper defines six entities and relations subject to annotation. There are 448 texts to annotate in total manually. After three rounds of updating the annotation guidelines, the inter-annotator agreement (IAA) improved significantly, resulting in a higher IAA score of 86.05%. The purpose of constructing CS-A corpus is to increase the popularity of Cyber-Syndrome and draw attention to its subtle impact on people’s health. Meanwhile, annotated corpus promotes the development of natural language processing technology. Some model experiments can be implemented based on this corpus, such as optimizing and improving models for discontinuous entity recognition, nested entity recognition, etc. The CS-A corpus has been uploaded to figshare .
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ObjectType-Undefined-3
ISSN:2052-4463
2052-4463
DOI:10.1038/s41597-024-03321-0