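Performance Comparison of Knowledge Source Types and Embedding Models in a RAG-Based Educational Chatbot for Radiation Therapy (RAG 기반 방사선치료학 교육용 챗봇의 지식소스 유형 및 임베딩 모델 성능 비교)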
| Published in | Journal of Radiological Science and Technology, Vol. 48, no. 4, pp. 405-416 |
|---|---|
| Main Author | |
| Format | Journal Article |
| Language | Korean |
| Published | Korean Society of Radiological Technology (대한방사선과학회, formerly 대한방사선기술학회), 31.08.2025 |
| ISSN | 2288-3509 2384-1168 |
| Abstract | This study evaluated how knowledge source type (textbook, report, and summary) and embedding model (general vs. domain-specific) affect performance in a Retrieval-Augmented Generation (RAG) based radiation therapy education system. The language generation model was OpenAI's GPT-3.5-turbo, and seven embedding models were compared (OpenAI, MiniLM, BGE, E5, KoSBERT, BioClinicalBERT, and PubMedBERT). Performance was evaluated with seven metrics: BLEU, ROUGE-L, Sentence-BERT (SBERT), BERTScore, semantic similarity, accuracy, and time (s). The mean and standard deviation (SD) were calculated from 10 repeated measurements. Independent-samples t-tests and two-way ANOVA were used for statistical analysis. The average semantic similarity of the summary knowledge source type was 0.898 (89.8%), statistically superior to the textbook and report types (p < 0.001). Overall, the general embedding models outperformed the domain-specific models for the summary knowledge source type, while only some metrics differed significantly for the textbook and report types. The general embedding models performed better than the domain-specific models, with statistically significant differences. This study evaluated the impact of knowledge source configuration and embedding model on system performance when developing a RAG system for radiation therapy education. It is expected to provide a useful basis for optimization in the design and development of artificial intelligence (AI) chatbots for medical education. |
| KCI Citation Count | 0 |
| Author | 정재홍(Jae-Hong Jung), 이경배(Kyung-Bae Lee), 김대건(Daegun Kim), 이영진(Youngjin Lee) |
| BackLink | https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART003237070$$DAccess content in National Research Foundation of Korea (NRF) |
| ContentType | Journal Article |
| Copyright | COPYRIGHT(C) KYOBO BOOK CENTRE ALL RIGHTS RESERVED |
| DEWEY | 616.0757 |
| DatabaseName | Kyobo Scholar Journals Scholar(스콜라) KoreaScience Korean Citation Index |
| Discipline | Medicine |
| DocumentTitleAlternate | Performance Comparison of Knowledge Sources Types and Embedding Models in a RAG-based Educational Chatbots for Radiation Therapy |
| EISSN | 2384-1168 |
| EndPage | 416 |
| ExternalDocumentID | oai_kci_go_kr_ARTI_10749333 JAKO202526239676902 4010071572506 |
| ISSN | 2288-3509 |
| IngestDate | Tue Sep 02 03:16:41 EDT 2025 Thu Oct 02 04:38:06 EDT 2025 Wed Oct 01 06:54:30 EDT 2025 |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | false |
| Issue | 4 |
| Keywords | Retrieval-augmented generation (검색 증강 생성); Large language model (대형언어모델); Education (교육); Chatbot (챗봇); Radiation therapy (방사선치료); RAG; LLM |
| Language | Korean |
| LinkModel | OpenURL |
| Notes | KISTI1.1003/JNL.JAKO202526239676902 http://journal.iksrs.or.kr/index.php |
| OpenAccessLink | http://click.ndsl.kr/servlet/LinkingDetailView?cn=JAKO202526239676902&dbt=JAKO&org_code=O481&site_code=SS1481&service_code=01 |
| PageCount | 12 |
| ParticipantIDs | nrf_kci_oai_kci_go_kr_ARTI_10749333 kisti_ndsl_JAKO202526239676902 kyobo_bookcenter_4010071572506 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-08-31 |
| PublicationDateYYYYMMDD | 2025-08-31 |
| PublicationDate_xml | – month: 08 year: 2025 text: 2025-08-31 day: 31 |
| PublicationDecade | 2020 |
| PublicationTitle | Journal of Radiological Science and Technology, 48(4) |
| PublicationTitleAlternate | Journal of radiological science and technology |
| PublicationYear | 2025 |
| Publisher | Korean Society of Radiological Technology (대한방사선과학회, formerly 대한방사선기술학회) |
| SourceID | nrf kisti kyobo |
| SourceType | Open Website Open Access Repository Publisher |
| StartPage | 405 |
| SubjectTerms | Radiological science (방사선과학) |
| TableOfContents | Ⅰ. Introduction; Ⅱ. Materials and Methods; Ⅲ. Results; Ⅳ. Discussion; Ⅴ. Conclusion; References |
| Title | RAG 기반 방사선치료학 교육용 챗봇의 지식소스 유형 및 임베딩 모델 성능 비교 |
| URI | https://scholar.kyobobook.co.kr/article/detail/4010071572506 http://click.ndsl.kr/servlet/LinkingDetailView?cn=JAKO202526239676902&dbt=JAKO&org_code=O481&site_code=SS1481&service_code=01 https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART003237070 |
| Volume | 48 |
| hasFullText | 1 |
| inHoldings | 1 |
| ispartofPNX | Journal of Radiological Science and Technology, 2025, 48(4), pp. 405-416 |
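
The abstract describes scoring answers by semantic similarity, summarizing 10 repeated measurements as mean and SD, and comparing groups with an independent-samples t-test. The following is a minimal sketch of those three steps, assuming cosine similarity over embedding vectors and a pooled-variance (equal-variance) t statistic; the record does not state which variant the authors used. The function names and the score lists are hypothetical placeholders for illustration, not data from the study.

```python
import math
from statistics import mean, stdev


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def summarize(scores):
    """Return (mean, sample SD) of repeated measurements."""
    return mean(scores), stdev(scores)


def student_t(a, b):
    """Independent two-sample t statistic with pooled variance.

    Assumes equal variances in both groups (an assumption of this sketch).
    Returns the t statistic and the degrees of freedom.
    """
    na, nb = len(a), len(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical placeholder scores for 10 repeated runs per group (not study data).
general_scores = [0.90, 0.91, 0.89, 0.90, 0.92, 0.88, 0.90, 0.91, 0.89, 0.90]
domain_scores = [0.85, 0.86, 0.84, 0.87, 0.85, 0.83, 0.86, 0.85, 0.84, 0.86]

print("general:", summarize(general_scores))
print("domain-specific:", summarize(domain_scores))
print("t, df:", student_t(general_scores, domain_scores))
```

The p-value would then be read from a t distribution with the returned degrees of freedom; a statistics package would normally handle that step, and a two-way ANOVA over source type and embedding model would need such a package as well.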