Topic Mining-Based Knowledge Discovery of User Health Information Needs

Bibliographic Details
Published in Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol. 7; no. 4; pp. 641-653
Main Authors Khoiriyah Harahap, Dayana, Ditha Tania, Ken, Eka Sevtiyuni, Putri
Format Journal Article
Language English
Published 20.10.2025
ISSN 2656-8624
DOI 10.35882/ijeeemi.v7i4.270

Abstract Understanding users’ health information needs has become increasingly important as the use of digital health services continues to grow. However, the unstructured nature of user-generated questions makes these needs difficult to capture and analyze accurately. This study contributes to SDG 3 (Good Health and Well-being) by applying topic mining-based knowledge discovery to identify the primary topics in user questions submitted through the “Tanya Dokter” feature on the Alodokter platform. A total of 8,550 questions were obtained through web scraping between July 2024 and June 2025. The collected data were preprocessed and then analyzed using seven topic modeling approaches: Latent Dirichlet Allocation (LDA), Correlated Topic Model (CTM), Latent Semantic Analysis (LSA), Non-negative Matrix Factorization (NMF), BERTopic, Top2Vec, and ProdLDA. Model performance was compared using the c_v coherence metric. NMF achieved the best result, with the highest coherence score of 0.67 and six well-defined topics. The six primary areas of concern are: pregnancy; menstruation and contraceptive management; general health and minor ailments; infant care; dermatological conditions; and musculoskeletal and other physical complaints. General health-related issues occurred most frequently, particularly during seasonal transitions, while menstruation and contraceptive management received the least attention, even though menstruation contributes to women’s health risks and contraceptive use helps reduce maternal mortality in Indonesia. These findings offer insights for digital health platforms like Alodokter to improve information delivery and health literacy, ultimately improving online health services and supporting the achievement of SDG 3.
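
The abstract names NMF as the best-performing topic model but does not describe the implementation. The following is a minimal sketch of the general technique only, not the authors' pipeline: it builds a small TF-IDF matrix and factorizes it with the classic multiplicative-update rule for NMF, using only the Python standard library. The toy documents, the function names (`tfidf_matrix`, `nmf`, `top_terms`), and the parameter values are all illustrative assumptions; the study itself used 8,550 scraped questions and six topics. Scoring with c_v coherence is not shown here; in practice it would probably be done with a library such as gensim.

```python
import math
import random
from collections import Counter


def tfidf_matrix(docs):
    """Build a dense TF-IDF matrix (documents x vocabulary) from tokenized docs."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    rows = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF: log((1 + N) / (1 + df)) + 1, multiplied by raw term count.
        rows.append([tf[t] * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t in vocab])
    return rows, vocab


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH, the reconstruction error NMF tries to reduce."""
    wh = matmul(w, h)
    return math.sqrt(sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)))


def nmf(v, k, n_iter=200, seed=0, eps=1e-9):
    """Factorize V (n x m) into non-negative W (n x k) and H (k x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h


def top_terms(h, vocab, n=3):
    """Return the n highest-weighted vocabulary terms for each topic row of H."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]


# Invented English stand-ins for preprocessed user questions (not the study's data).
docs = [
    ["pregnancy", "nausea", "trimester"],
    ["pregnancy", "ultrasound", "trimester"],
    ["skin", "rash", "itchy"],
    ["skin", "acne", "rash"],
]
V, vocab = tfidf_matrix(docs)
W, H = nmf(V, k=2)
topics = top_terms(H, vocab)
```

Each row of `H` is a topic expressed as term weights, and each row of `W` gives a document's loading on those topics. Counting which topic dominates each row of `W` is one simple way to estimate how often a topic occurs.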
License https://creativecommons.org/licenses/by-sa/4.0