A Study of BERT-Based Classification Performance of Text-Based Health Counseling Data

Bibliographic Details
Published in Computer Modeling in Engineering & Sciences, Vol. 135, no. 1, pp. 795-808
Main Authors Yeol Woo Sung, Dae Seung Park, Cheong Ghil Kim
Format Journal Article
Language English
Published Henderson: Tech Science Press, 2023
ISSN 1526-1492 (print), 1526-1506 (online)
DOI 10.32604/cmes.2022.022465

Abstract The shift to a hyper-connected society has made communication over social networking services (SNS) commonplace. Research that analyzes the big data accumulated on SNS and extracts meaningful information is therefore under way in many fields. In particular, recent advances in deep learning are rapidly improving performance in Natural Language Processing (NLP), a language-understanding technology for obtaining accurate contextual information. When a chatbot system is applied to the healthcare domain for counseling about diseases, the performance of NLP combined with machine learning becomes important for accurately classifying medical subjects from text-based health counseling data. In this paper, the performance of Bidirectional Encoder Representations from Transformers (BERT) is compared with that of CNN, RNN, LSTM, and GRU. For this purpose, health counseling data from the Naver Q&A service were crawled to build a dataset. KoBERT was used to classify medical subjects according to symptoms, and the accuracy of the classification results was measured. The simulation results show that the KoBERT model outperformed the other models by margins ranging from more than 5% to nearly 18%.
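The abstract compares models by classification accuracy and reports the margin between the best model and the others. As a minimal sketch of how such a comparison could be computed, the Python below measures accuracy per model and the percentage-point gap to a reference model. The labels, the model names `model_a` and `model_b`, and the helper functions `accuracy` and `accuracy_gaps` are hypothetical illustrations; they are not the paper's data, results, or code.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def accuracy_gaps(scores, reference):
    """Return the percentage-point gap between the reference model and each other model."""
    ref = scores[reference]
    return {
        name: round((ref - acc) * 100, 2)
        for name, acc in scores.items()
        if name != reference
    }


# Hypothetical toy labels (medical subjects); NOT data from the paper.
y_true = ["dermatology", "orthopedics", "dermatology", "neurology", "orthopedics"]

# Hypothetical predictions from two placeholder models.
predictions = {
    "model_a": ["dermatology", "orthopedics", "dermatology", "neurology", "neurology"],
    "model_b": ["dermatology", "neurology", "orthopedics", "neurology", "neurology"],
}

scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
gaps = accuracy_gaps(scores, reference="model_a")

print(scores)  # accuracy per model
print(gaps)    # percentage-point lead of model_a over each other model
```

In a real evaluation, the predictions would come from each trained classifier on a held-out test split, and the same two helpers would report how far the reference model leads each baseline.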
Copyright 2023. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Subjects Algorithms; Big Data; Classification; Counseling; Deep learning; Machine learning; Natural language processing