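Performance Comparison of Knowledge Sources Types and Embedding Models in a RAG-based Educational Chatbots for Radiation Therapy (RAG 기반 방사선치료학 교육용 챗봇의 지식소스 유형 및 임베딩 모델 성능 비교)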

Bibliographic Details
Published in	Journal of Radiological Science and Technology, Vol. 48, no. 4, pp. 405-416
Authors	정재홍 (Jae-Hong Jung), 이경배 (Kyung-Bae Lee), 김대건 (Daegun Kim), 이영진 (Youngjin Lee)
Format	Journal Article
Language	Korean
Published	대한방사선과학회 (formerly 대한방사선기술학회); KOREAN SOCIETY OF RADIOLOGICAL TECHNOLOGY; 31.08.2025
Subjects	Radiological science; Retrieval-augmented generation (RAG); Large language model (LLM); Education; Chatbot; Radiation therapy
ISSN	2288-3509; EISSN 2384-1168

Abstract	This study evaluated performance differences between knowledge source types (textbook, report, and summary) and embedding models (general vs. domain-specific) in a Retrieval-Augmented Generation (RAG) based radiation therapy education system. The language generation model was OpenAI's GPT-3.5-turbo, and seven embedding models were compared (OpenAI, MiniLM, BGE, E5, KoSBERT, BioClinicalBERT, and PubMedBERT). Performance was evaluated with seven metrics: BLEU, ROUGE-L, Sentence BERT (SBERT), BERTScore, semantic similarity, accuracy, and time (s). The mean and standard deviation (SD) were calculated from 10 repeated measurements. Independent-samples t-tests and two-way ANOVA were used for statistical analysis. The mean semantic similarity of the summary knowledge source type was 0.898 (89.8%), significantly higher than that of the textbook and report types (p < 0.001). Overall, the general embedding models outperformed the domain-specific models for the summary knowledge source type, while only some metrics differed significantly for the textbook and report types. The general embedding models performed better than the domain-specific models, with a statistically significant difference. These findings characterize the impact of knowledge source configuration and embedding model on system performance when developing a RAG system for radiation therapy education. They are expected to provide a useful basis for optimizing the design and development of artificial intelligence (AI) chatbots for medical education.
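The abstract names semantic similarity, mean and SD over 10 repeated measurements, and an independent-samples t-test, but this record does not say how they were computed. The following is a minimal sketch of how such quantities could be calculated, assuming semantic similarity means cosine similarity between embedding vectors and assuming a pooled-variance t statistic. All function names and numeric values are hypothetical illustrations and are not data or code from the study.

```python
import math
import statistics


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def mean_sd(scores):
    """Mean and sample standard deviation of repeated measurements."""
    return statistics.mean(scores), statistics.stdev(scores)


def pooled_t_statistic(a, b):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


if __name__ == "__main__":
    # Hypothetical embeddings of a reference answer and a generated answer.
    reference_vec = [0.20, 0.70, 0.10]
    answer_vec = [0.25, 0.65, 0.05]
    print("similarity:", cosine_similarity(reference_vec, answer_vec))

    # Hypothetical similarity scores from 10 repeated runs per model group.
    general_runs = [0.90, 0.89, 0.91, 0.90, 0.88, 0.90, 0.91, 0.89, 0.90, 0.90]
    domain_runs = [0.84, 0.86, 0.85, 0.83, 0.85, 0.86, 0.84, 0.85, 0.84, 0.85]
    print("general mean, SD:", mean_sd(general_runs))
    print("t, df:", pooled_t_statistic(general_runs, domain_runs))
```

The p-value would then be read from a t distribution with the returned degrees of freedom; a statistics package would normally handle that step along with the two-way ANOVA.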
KCI Citation Count	0
BackLink	https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART003237070 (Access content in National Research Foundation of Korea (NRF))
ContentType	Journal Article
Copyright	COPYRIGHT(C) KYOBO BOOK CENTRE ALL RIGHTS RESERVED
DBID	P5Y SSSTE JDI ACYCR
DEWEY	616.0757
DatabaseName	Kyobo Scholar Journals; Scholar(스콜라); KoreaScience; Korean Citation Index
DatabaseTitleList
DeliveryMethod	fulltext_linktorsrc
Discipline	Medicine
DocumentTitleAlternate	Performance Comparison of Knowledge Sources Types and Embedding Models in a RAG-based Educational Chatbots for Radiation Therapy
EISSN	2384-1168
EndPage	416
ExternalDocumentID	oai_kci_go_kr_ARTI_10749333 JAKO202526239676902 4010071572506
GroupedDBID	.UV ALMA_UNASSIGNED_HOLDINGS P5Y SSSTE JDI ACYCR
ID	FETCH-LOGICAL-k953-cf48f02575b66be502fb819a437855e75bf040e875f58ec94e901a21890ca3543
ISSN	2288-3509
IngestDate	Tue Sep 02 03:16:41 EDT 2025 Thu Oct 02 04:38:06 EDT 2025 Wed Oct 01 06:54:30 EDT 2025
IsOpenAccess	true
IsPeerReviewed	false
IsScholarly	false
Issue	4
Keywords	Retrieval-augmented generation (RAG); Large language model (LLM); Education; Chatbot; Radiation therapy
Language	Korean
LinkModel	OpenURL
MergedId	FETCHMERGED-LOGICAL-k953-cf48f02575b66be502fb819a437855e75bf040e875f58ec94e901a21890ca3543
Notes	KISTI1.1003/JNL.JAKO202526239676902 http://journal.iksrs.or.kr/index.php
OpenAccessLink	http://click.ndsl.kr/servlet/LinkingDetailView?cn=JAKO202526239676902&dbt=JAKO&org_code=O481&site_code=SS1481&service_code=01
PageCount	12
ParticipantIDs	nrf_kci_oai_kci_go_kr_ARTI_10749333 kisti_ndsl_JAKO202526239676902 kyobo_bookcenter_4010071572506
PublicationCentury	2000
PublicationDate	2025-08-31
PublicationDateYYYYMMDD	2025-08-31
PublicationDate_xml	– month: 08 year: 2025 text: 2025-08-31 day: 31
PublicationDecade	2020
PublicationTitle	Journal of Radiological Science and Technology, 48(4)
PublicationTitleAlternate	Journal of radiological science and technology
PublicationYear	2025
Publisher	대한방사선과학회 (formerly 대한방사선기술학회); KOREAN SOCIETY OF RADIOLOGICAL TECHNOLOGY
SSID	ssib036278625 ssib053376989 ssib030194475 ssib023718648
Score	1.933429
SourceID	nrf kisti kyobo
SourceType	Open Website Open Access Repository Publisher
StartPage	405
SubjectTerms	Radiological science
TableOfContents	Ⅰ. Introduction; Ⅱ. Materials and Methods; Ⅲ. Results; Ⅳ. Discussion; Ⅴ. Conclusion; References
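Title	Performance Comparison of Knowledge Sources Types and Embedding Models in a RAG-based Educational Chatbots for Radiation Therapy (RAG 기반 방사선치료학 교육용 챗봇의 지식소스 유형 및 임베딩 모델 성능 비교)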
URI	https://scholar.kyobobook.co.kr/article/detail/4010071572506 http://click.ndsl.kr/servlet/LinkingDetailView?cn=JAKO202526239676902&dbt=JAKO&org_code=O481&site_code=SS1481&service_code=01 https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART003237070
Volume	48
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
ispartofPNX	Journal of Radiological Science and Technology, 2025, 48(4), pp. 405-416
journalDatabaseRights	– providerCode: PRVHPJ databaseName: ROAD: Directory of Open Access Scholarly Resources customDbUrl: eissn: 2384-1168 dateEnd: 99991231 omitProxy: true ssIdentifier: ssib023718648 issn: 2288-3509 databaseCode: M~E dateStart: 20150101 isFulltext: true titleUrlDefault: https://road.issn.org providerName: ISSN International Centre
linkProvider	ISSN International Centre