Application of K-means Clustering Model to XRD Experimental Data in the Korea Plateau (한국대지 XRD 실험자료 대상 k-평균 군집화 모델 적용성 분석)
| Published in | 자원환경지질 Vol. 57; no. 5; pp. 529 - 537 |
|---|---|
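| Main Authors | 박주영(Ju Young Park), 박선영(Sun Young Park), 최지영(Jiyoung Choi), 김성일(Sungil Kim), 김유리(Yuri Kim), 이보연(Bo Yeon Yi), 이경북(Kyungbook Lee) |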
| Format | Journal Article |
| Language | Korean |
| Published | Korea Society Of Economic&Environmental Geology (대한자원환경지질학회), 31.10.2024 |
| ISSN | 1225-7281 2288-7962 |
| DOI | 10.9719/EEG.2024.57.5.529 |
| Abstract | Mineral composition, which is used to identify the sedimentary environment, can be obtained through X-ray diffraction (XRD) analysis. Because analyzing a large number of samples is time-consuming, a machine learning-based mineral composition analysis model was developed. This model showed reasonable reliability for samples with usual compositions but performed poorly on samples with unusual compositions. Consequently, a clustering model was recently developed to separate out the unusual samples so that experts can analyze them. The purpose of this study is to examine whether the clustering model, developed in a previous study using XRD data from the Ulleung Basin, is applicable to samples from a different region. The research data consist of XRD intensity profiles and expert mineral composition analyses for a total of 54 sediment samples from the Korea Plateau, located northwest of the Ulleung Basin. Because each Korea Plateau intensity profile comprises 7,420 values (3.005-64.996°), whereas each Ulleung Basin profile comprises 3,100 values (3.01-64.99°), linear interpolation was used to align the input features. Min-max scaling was then applied to each sample's intensity profile to preserve the trend and peak ratios of the intensity. Applying the clustering model to the 54 preprocessed intensity profiles assigned 35 samples to the expert group and 19 samples to the machine learning group. In the machine learning group, there were no false positives (FP) among the 19 samples. An FP is a sample that has an unusual composition and requires expert analysis but is assigned to the machine learning group, so fewer FPs imply higher reliability when the machine learning model is applied. The absence of FPs means that no unusual sample fell into the machine learning group, and therefore the clustering model can increase the reliability of mineral compositions obtained from the machine learning model. In the expert group, 31 of the 35 samples were false negatives (FN), that is, samples that the machine learning model is deemed able to analyze properly but that were assigned to the expert group; these account for 57% of all samples. However, when these FN samples were analyzed with the machine learning-based composition analysis model, a high mean absolute error of 2.94% was observed. It is therefore reasonable that these samples were assigned to the expert group. |
| KCI Citation Count | 0 |
| Author | 김유리(Yuri Kim) 박선영(Sun Young Park) 박주영(Ju Young Park) 최지영(Jiyoung Choi) 김성일(Sungil Kim) 이보연(Bo Yeon Yi) 이경북(Kyungbook Lee) |
| Author_xml | – sequence: 1 fullname: 박주영(Ju Young Park) – sequence: 2 fullname: 박선영(Sun Young Park) – sequence: 3 fullname: 최지영(Jiyoung Choi) – sequence: 4 fullname: 김성일(Sungil Kim) – sequence: 5 fullname: 김유리(Yuri Kim) – sequence: 6 fullname: 이보연(Bo Yeon Yi) – sequence: 7 fullname: 이경북(Kyungbook Lee) |
| BackLink | https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART003130637$$DAccess content in National Research Foundation of Korea (NRF) |
| ContentType | Journal Article |
| Copyright | COPYRIGHT(C) KYOBO BOOK CENTRE ALL RIGHTS RESERVED |
| Copyright_xml | – notice: COPYRIGHT(C) KYOBO BOOK CENTRE ALL RIGHTS RESERVED |
| DBID | P5Y SSSTE JDI ACYCR |
| DEWEY | 553 |
| DOI | 10.9719/EEG.2024.57.5.529 |
| DatabaseName | 교보문고 스콜라 Scholar Scholar(스콜라) KoreaScience Korean Citation Index |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Geology |
| DocumentTitleAlternate | Application of K-means Clustering Model to XRD Experimental Data in the Korea Plateau |
| EISSN | 2288-7962 |
| EndPage | 537 |
| ExternalDocumentID | oai_kci_go_kr_ARTI_10639392 JAKO202433157613120 4010070113371 |
| GroupedDBID | P5Y SSSTE .UV JDI ACYCR |
| ISSN | 1225-7281 |
| IngestDate | Sun Aug 03 03:11:19 EDT 2025 Thu Dec 05 13:20:23 EST 2024 Tue Feb 18 14:55:52 EST 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Issue | 5 |
| Keywords | 혼동행렬 (confusion matrix); 머신러닝 (machine learning); k-평균 군집화 (K-means clustering); X-선 회절 (X-ray diffraction, XRD); 한국대지 (Korea Plateau) |
| Language | Korean |
| LinkModel | OpenURL |
| Notes | KISTI1.1003/JNL.JAKO202433157613120 |
| OpenAccessLink | http://click.ndsl.kr/servlet/LinkingDetailView?cn=JAKO202433157613120&dbt=JAKO&org_code=O481&site_code=SS1481&service_code=01 |
| PageCount | 9 |
| ParticipantIDs | nrf_kci_oai_kci_go_kr_ARTI_10639392 kisti_ndsl_JAKO202433157613120 kyobo_bookcenter_4010070113371 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-10-31 |
| PublicationDateYYYYMMDD | 2024-10-31 |
| PublicationDate_xml | – month: 10 year: 2024 text: 2024-10-31 day: 31 |
| PublicationDecade | 2020 |
| PublicationTitle | 자원환경지질 |
| PublicationTitleAlternate | Economic and environmental geology |
| PublicationYear | 2024 |
| Publisher | Korea Society Of Economic&Environmental Geology 대한자원환경지질학회 |
| Publisher_xml | – name: Korea Society Of Economic&Environmental Geology – name: 대한자원환경지질학회 |
| SourceID | nrf kisti kyobo |
| SourceType | Open Website Open Access Repository Publisher |
| StartPage | 529 |
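| SubjectTerms | Geology (지질학) |
| TableOfContents | 1. Introduction 2. Methods 3. Results 4. Conclusions Acknowledgements References |
| Title | 한국대지 XRD 실험자료 대상 k-평균 군집화 모델 적용성 분석 (Application of K-means Clustering Model to XRD Experimental Data in the Korea Plateau) |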
| URI | https://scholar.kyobobook.co.kr/article/detail/4010070113371 http://click.ndsl.kr/servlet/LinkingDetailView?cn=JAKO202433157613120&dbt=JAKO&org_code=O481&site_code=SS1481&service_code=01 https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART003130637 |
| Volume | 57 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| ispartofPNX | 자원환경지질, 2024, 57(5), 288, pp.529-537 |
| journalDatabaseRights | – providerCode: PRVHPJ databaseName: ROAD: Directory of Open Access Scholarly Resources customDbUrl: eissn: 2288-7962 dateEnd: 99991231 omitProxy: true ssIdentifier: ssib008451539 issn: 1225-7281 databaseCode: M~E dateStart: 20140101 isFulltext: true titleUrlDefault: https://road.issn.org providerName: ISSN International Centre |
| linkProvider | ISSN International Centre |
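
The preprocessing described in the abstract (linear interpolation of each 7,420-point Korea Plateau profile onto the 3,100-point Ulleung Basin grid, followed by per-sample min-max scaling) can be illustrated with the minimal sketch below. It is not the authors' code: the function name `align_and_scale`, the assumption that both 2-theta grids are evenly spaced, and the random stand-in profile are all hypothetical and serve only to show the two steps.

```python
import numpy as np


def align_and_scale(two_theta_src, intensity_src, two_theta_ref):
    """Resample one XRD intensity profile onto a reference grid, then min-max scale it.

    Hypothetical helper for illustration; not taken from the published study.
    """
    # Step 1: linear interpolation onto the reference 2-theta grid.
    # np.interp requires the source x-coordinates to be increasing.
    resampled = np.interp(two_theta_ref, two_theta_src, intensity_src)

    # Step 2: min-max scaling of this single sample to [0, 1].
    # Scaling per sample keeps the shape of the profile and the peak ratios.
    lo, hi = resampled.min(), resampled.max()
    if hi == lo:
        # A flat profile has no peaks to preserve; return zeros to avoid 0/0.
        return np.zeros_like(resampled)
    return (resampled - lo) / (hi - lo)


# Assumed evenly spaced grids matching the point counts and ranges in the abstract.
src_grid = np.linspace(3.005, 64.996, 7420)   # Korea Plateau measurement grid
ref_grid = np.linspace(3.01, 64.99, 3100)     # Ulleung Basin (model input) grid

# Stand-in intensity profile; a real profile would come from the XRD instrument.
rng = np.random.default_rng(0)
profile = rng.random(7420) * 1000.0

features = align_and_scale(src_grid, profile, ref_grid)
```

Because the reference range lies inside the measured range, every reference point is interpolated rather than extrapolated. The resulting 3,100-value vector has the same length as the input the clustering model was trained on, which is the stated purpose of the alignment step.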