Application of explainable artificial intelligence in the identification of Squamous Cell Carcinoma biomarkers
      
    
| Published in | Computers in Biology and Medicine, Vol. 146, p. 105505 |
|---|---|
| Format | Journal Article | 
| Language | English | 
| Published | United States, Elsevier Ltd, 01.07.2022 |
| ISSN | 0010-4825 1879-0534 |
| DOI | 10.1016/j.compbiomed.2022.105505 | 
| Summary: | Non-melanoma skin cancers (NMSCs) are the fifth most common type of cancer worldwide, affecting both men and women. More than a million new cases of NMSC are estimated to occur each year, with Squamous Cell Carcinoma (SCC) representing approximately 20% of all skin malignancies. The purpose of this study was to find potential diagnostic biomarkers for SCC by applying eXplainable Artificial Intelligence (XAI) to XGBoost machine learning (ML) models trained on binary classification datasets comprising the expression data of 40 SCC, 38 actinic keratosis (AK), and 46 normal healthy skin samples. After SHAP values were successfully incorporated into the ML models, 23 significant genes were identified and found to be associated with the progression of SCC. These genes may serve as diagnostic and prognostic biomarkers in patients with SCC. Highlights: • The purpose of the study is to identify SCC biomarker genes using explainable AI. • XGBoost ML models were used to identify key genes involved in SCC progression. • The XAI analysis of the trained XGBoost models was performed using the Python SHAP (SHapley Additive exPlanations) package. • Genes with the highest average SHAP values were then used to train new XGBoost models. • Accuracy was measured to confirm that ML models can be explained without jeopardising their performance. |
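
The gene-selection step described in the summary (keeping the genes with the highest average SHAP value before retraining) can be illustrated with a minimal sketch. It is not the authors' code: the function name, the gene labels, and the SHAP numbers below are hypothetical, and it assumes that "average SHAP value" means the mean absolute SHAP value per gene across samples, which is a common convention but is not stated in the record. In practice the SHAP matrix would come from the SHAP package applied to a trained XGBoost model rather than from hand-written numbers.

```python
from statistics import mean


def rank_genes_by_mean_abs_shap(shap_rows, gene_names, top_k):
    """Return the top_k (gene, score) pairs, highest mean |SHAP| first.

    shap_rows: one list of per-gene SHAP values for each sample.
    gene_names: gene labels, in the same column order as shap_rows.
    """
    scores = {}
    for column, gene in enumerate(gene_names):
        # Assumption: a gene's importance is its mean absolute SHAP value.
        scores[gene] = mean(abs(row[column]) for row in shap_rows)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Hypothetical toy data: 3 samples (rows) by 4 genes (columns).
gene_names = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
shap_rows = [
    [0.50, -0.10, 0.00, 0.20],
    [-0.40, 0.05, 0.01, -0.30],
    [0.60, -0.15, 0.00, 0.10],
]

top_genes = rank_genes_by_mean_abs_shap(shap_rows, gene_names, top_k=2)
print([gene for gene, _ in top_genes])
```

The selected genes would then form the reduced feature set for a new model, whose accuracy can be compared with that of the model trained on all genes.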