Performance Comparison of Knowledge Source Types and Embedding Models in a RAG-Based Educational Chatbot for Radiation Therapy

Bibliographic Details
Published in Journal of Radiological Science and Technology, Vol. 48, no. 4, pp. 405-416
Authors 정재홍(Jae-Hong Jung), 이경배(Kyung-Bae Lee), 김대건(Daegun Kim), 이영진(Youngjin Lee)
Format Journal Article
Language Korean
Published 대한방사선과학회 (formerly 대한방사선기술학회), 31.08.2025
KOREAN SOCIETY OF RADIOLOGICAL TECHNOLOGY
대한방사선과학회
ISSN 2288-3509 (print)
2384-1168 (online)

Abstract This study evaluated how knowledge source type (textbook, report, or summary) and embedding model (general vs. domain-specific) affect performance in a Retrieval-Augmented Generation (RAG) based radiation therapy education system. The language generation model was OpenAI's GPT-3.5-turbo, and seven embedding models were compared (OpenAI, MiniLM, BGE, E5, KoSBERT, BioClinicalBERT, and PubMedBERT). Performance was evaluated with seven metrics: BLEU, ROUGE-L, Sentence-BERT (SBERT), BERTScore, semantic similarity, accuracy, and time (s). The mean and standard deviation (SD) were calculated from 10 repeated measurements, and independent-samples t-tests and two-way ANOVA were used for statistical analysis. The mean semantic similarity for the summary knowledge source was 0.898 (89.8%), significantly higher than for the textbook and report sources (p < 0.001). For the summary source, the general embedding models outperformed the domain-specific models overall, whereas for the textbook and report sources only some metrics differed significantly. Overall, the general embedding models performed better than the domain-specific models, with a statistically significant difference. These findings characterize the impact of knowledge source configuration and embedding model choice on system performance when developing a RAG system for radiation therapy education, and are expected to provide a useful basis for optimizing the design and development of artificial intelligence (AI) chatbots for medical education.
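The abstract reports a semantic similarity score summarized as a mean and SD over 10 repeated measurements. The following is a minimal sketch of that kind of summary, not the authors' implementation: it assumes semantic similarity is computed as cosine similarity between embedding vectors (the record does not say how it was computed), and every vector and score below is a made-up toy value, not data from the study.

```python
import math
import statistics


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def summarize(scores):
    """Return (mean, sample standard deviation) over repeated runs."""
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical toy embeddings of a reference answer and a generated answer.
reference = [0.2, 0.7, 0.1]
generated = [0.3, 0.6, 0.2]
sim = cosine_similarity(reference, generated)

# Hypothetical similarity scores from 10 repeated runs (illustration only).
runs = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 0.53, 0.47, 0.50, 0.50]
mean, sd = summarize(runs)
print(f"similarity={sim:.3f}, mean={mean:.3f}, SD={sd:.3f}")
```

A comparison between two knowledge source types would then apply an independent-samples t-test to two such lists of repeated scores.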
KCI Citation Count: 0
ContentType Journal Article
Copyright COPYRIGHT(C) KYOBO BOOK CENTRE ALL RIGHTS RESERVED
DEWEY 616.0757
DatabaseName Kyobo Scholar Journals
Scholar(스콜라)
KoreaScience
Korean Citation Index
Discipline Medicine
DocumentTitleAlternate Performance Comparison of Knowledge Sources Types and Embedding Models in a RAG-based Educational Chatbots for Radiation Therapy
EISSN 2384-1168
EndPage 416
ExternalDocumentID oai_kci_go_kr_ARTI_10749333
JAKO202526239676902
4010071572506
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Issue 4
Keywords Retrieval-augmented generation (RAG)
Large language model (LLM)
Education
Chatbot
Radiation therapy
Notes KISTI1.1003/JNL.JAKO202526239676902
http://journal.iksrs.or.kr/index.php
OpenAccessLink http://click.ndsl.kr/servlet/LinkingDetailView?cn=JAKO202526239676902&dbt=JAKO&org_code=O481&site_code=SS1481&service_code=01
PageCount 12
PublicationDate 2025-08-31
StartPage 405
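SubjectTerms Radiological science
TableOfContents Ⅰ. Introduction Ⅱ. Materials and Methods Ⅲ. Results Ⅳ. Discussion Ⅴ. Conclusion References
Title Performance Comparison of Knowledge Source Types and Embedding Models in a RAG-Based Educational Chatbot for Radiation Therapy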
URI https://scholar.kyobobook.co.kr/article/detail/4010071572506
http://click.ndsl.kr/servlet/LinkingDetailView?cn=JAKO202526239676902&dbt=JAKO&org_code=O481&site_code=SS1481&service_code=01
https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART003237070
Volume 48