A System Fault Diagnosis Method with a Reclustering Algorithm
The log analysis-based system fault diagnosis method can help engineers analyze the fault events generated by the system. The K-means algorithm can perform log analysis well and does not require a lot of prior knowledge, but the K-means-based system fault diagnosis method needs to be improved in bot...
| Published in | Scientific programming Vol. 2021; pp. 1 - 8 |
|---|---|
| Main Authors | Yang, Zhe; Ying, Shi; Wang, Bingming; Li, Yiyao; Dong, Bo; Geng, Jiangyi; Zhang, Ting |
| Format | Journal Article |
| Language | English |
| Published | New York: Hindawi; John Wiley & Sons, Inc; 09.03.2021 |
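| Subjects | Accuracy; Algorithms; Clustering; Failure; Fault diagnosis; Machine learning; Neural networks; Semantics; Software; Supercomputers |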
| ISSN | 1058-9244 1875-919X |
| DOI | 10.1155/2021/6617882 |
| Abstract | Log-analysis-based system fault diagnosis helps engineers analyze the fault events that a system generates. The K-means algorithm performs log analysis well without requiring much prior knowledge, but K-means-based fault diagnosis still needs improvement in both efficiency and accuracy. To address this, we propose a system fault diagnosis method based on a reclustering algorithm. First, we propose a log vectorization method based on the PV-DM language model that produces low-dimensional log vectors, which provide effective data support for the subsequent fault diagnosis. Next, we improve the K-means algorithm to improve the effect of K-means-based log clustering. Finally, we propose a reclustering method based on keyword extraction to improve the accuracy of fault diagnosis. We verify our method on system log data generated by two supercomputers. The experimental results show that, compared with the traditional K-means method, our method improves the accuracy of fault diagnosis while maintaining its efficiency. |
|---|---|
| Author | Zhang, Ting; Ying, Shi; Wang, Bingming; Li, Yiyao; Yang, Zhe; Geng, Jiangyi; Dong, Bo |
| Author_xml | 1. Yang, Zhe (School of Computer Science, Wuhan University, Wuhan, China; whu.edu.cn); 2. Ying, Shi (ORCID 0000-0002-0471-0021; School of Computer Science, Wuhan University, Wuhan, China; whu.edu.cn); 3. Wang, Bingming (ORCID 0000-0002-8723-0970; School of Computer Science, Wuhan University, Wuhan, China; whu.edu.cn); 4. Li, Yiyao (School of Software Engineering, Tongji University, Shanghai, China; tongji.edu.cn); 5. Dong, Bo (School of Computer Science, Wuhan University, Wuhan, China; whu.edu.cn); 6. Geng, Jiangyi (School of Computer Science, Wuhan University, Wuhan, China; whu.edu.cn); 7. Zhang, Ting (School of Computer Science, Wuhan University, Wuhan, China; whu.edu.cn) |
| CitedBy_id | crossref_primary_10_1111_coin_12646 crossref_primary_10_1360_SST_2022_0194 crossref_primary_10_1109_ACCESS_2021_3128283 crossref_primary_10_1007_s11219_024_09672_6 |
| ContentType | Journal Article |
| Copyright | Copyright © 2021 Zhe Yang et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0 |
| DOI | 10.1155/2021/6617882 |
| DatabaseName | Hindawi Publishing Complete Hindawi Publishing Subscription Journals Hindawi Publishing Open Access CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional Unpaywall for CDI: Periodical Content Unpaywall |
| DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | CrossRef Technology Research Database |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISSN | 1875-919X |
| Editor | Wang, Pengwei |
| Editor_xml | – sequence: 1 givenname: Pengwei surname: Wang fullname: Wang, Pengwei |
| EndPage | 8 |
| ExternalDocumentID | 10.1155/2021/6617882 10_1155_2021_6617882 |
| GrantInformation_xml | – fundername: National Natural Science Foundation of China grantid: 62072342; 61672392 |
| ISSN | 1058-9244 1875-919X |
| IngestDate | Sun Oct 26 04:16:59 EDT 2025 Fri Jul 25 09:32:37 EDT 2025 Wed Oct 01 03:30:17 EDT 2025 Thu Apr 24 23:05:21 EDT 2025 Sun Jun 02 19:18:04 EDT 2024 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| License | This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. https://creativecommons.org/licenses/by/4.0 cc-by |
| LinkModel | DirectLink |
| ORCID | 0000-0002-0471-0021 0000-0002-8723-0970 |
| OpenAccessLink | https://dx.doi.org/10.1155/2021/6617882 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-03-09 |
| PublicationDateYYYYMMDD | 2021-03-09 |
| PublicationDate_xml | – month: 03 year: 2021 text: 2021-03-09 day: 09 |
| PublicationDecade | 2020 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York |
| PublicationTitle | Scientific programming |
| PublicationYear | 2021 |
| Publisher | Hindawi John Wiley & Sons, Inc |
| Publisher_xml | – name: Hindawi – name: John Wiley & Sons, Inc |
| SSID | ssj0018100 |
| Score | 2.293915 |
| SourceID | unpaywall proquest crossref hindawi |
| SourceType | Open Access Repository Aggregation Database Enrichment Source Index Database Publisher |
| StartPage | 1 |
| SubjectTerms | Accuracy; Algorithms; Clustering; Failure; Fault diagnosis; Machine learning; Neural networks; Semantics; Software; Supercomputers |
| Title | A System Fault Diagnosis Method with a Reclustering Algorithm |
| URI | https://dx.doi.org/10.1155/2021/6617882 https://www.proquest.com/docview/2503352193 https://downloads.hindawi.com/journals/sp/2021/6617882.pdf |
| UnpaywallVersion | publishedVersion |
| Volume | 2021 |
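
The abstract outlines a three-stage pipeline: vectorize the logs, cluster them with K-means, then recluster using extracted keywords. The record does not give the details of any stage, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. Bag-of-words counts stand in for the PV-DM vectors, the K-means is plain Lloyd's iteration without the paper's improvements, and the reclustering rule (reassign each log to the cluster whose most frequent tokens it shares most) is a hypothetical stand-in. All function names are illustrative.

```python
import math
from collections import Counter


def bow_vectors(logs):
    """Assumption: bag-of-words counts stand in for PV-DM log vectors."""
    vocab = sorted({tok for line in logs for tok in line.lower().split()})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for line in logs:
        vec = [0.0] * len(vocab)
        for tok in line.lower().split():
            vec[index[tok]] += 1.0
        vectors.append(vec)
    return vectors


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's K-means; the first k vectors are the initial centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_keywords(logs, labels, k, top_n=3):
    """Hypothetical keyword extraction: the top_n most frequent tokens per cluster."""
    keywords = []
    for c in range(k):
        counts = Counter(
            tok
            for line, lab in zip(logs, labels)
            if lab == c
            for tok in line.lower().split()
        )
        keywords.append({tok for tok, _ in counts.most_common(top_n)})
    return keywords


def recluster(logs, labels, k, top_n=3):
    """Hypothetical reclustering: move each log to the cluster whose keywords
    it overlaps most, keeping its current label when overlaps tie."""
    keywords = cluster_keywords(logs, labels, k, top_n)
    new_labels = []
    for line, lab in zip(logs, labels):
        tokens = set(line.lower().split())
        best = max(range(k), key=lambda c: (len(tokens & keywords[c]), c == lab))
        new_labels.append(best)
    return new_labels
```

A call such as `recluster(logs, kmeans(bow_vectors(logs), k), k)` chains the three stages. The second pass is kept separate from K-means so that logs the distance-based step misplaces can be corrected from token evidence alone, which is one plausible reading of how reclustering could raise accuracy without rerunning the clustering.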