Identification of potential feature genes in non-alcoholic fatty liver disease using bioinformatics analysis and machine learning strategies
| Published in | Computers in biology and medicine Vol. 157; p. 106724 |
|---|---|
| Main Authors | , , , |
| Format | Journal Article |
| Language | English |
| Published | United States; Elsevier Ltd; 01.05.2023; Elsevier Limited |
| ISSN | 0010-4825, 1879-0534 |
| DOI | 10.1016/j.compbiomed.2023.106724 |
| Summary: | The prevalence of non-alcoholic fatty liver disease (NAFLD) and NAFLD-associated hepatocellular carcinoma (HCC) has increased continuously in recent years. Machine learning is an effective method for screening the feature genes of a disease for prediction, prevention and personalized treatment. Here, we used the “limma” package and weighted gene co-expression network analysis (WGCNA) to screen 219 NAFLD-related genes and found that they were mainly enriched in inflammation-related pathways. Four feature genes (AXUD1, FOSB, GADD45B, and SOCS2) were selected by the LASSO regression and support vector machine-recursive feature elimination (SVM-RFE) machine learning algorithms. A clinical diagnostic model with an area under the curve (AUC) value of 0.994 was then constructed, which was superior to other indicators of NAFLD. Feature gene expression correlated significantly with steatohepatitis histology or clinical variables. These findings were also validated in external datasets and a mouse model. Finally, we found that feature gene expression was significantly decreased in NAFLD-associated HCC and that SOCS2 may be a prognostic biomarker. Our findings may provide new insights into the diagnosis, prevention and treatment targets of NAFLD and NAFLD-associated HCC. Highlights: • Machine learning identified AXUD1, FOSB, GADD45B and SOCS2 as candidate biomarkers for NAFLD. • AXUD1, FOSB, GADD45B, and SOCS2 showed a significant negative correlation with the histology grade of steatohepatitis. • The ROC curve and the nomogram were constructed for clinical use. • Low expression of SOCS2 in HCC predicts a worse survival rate, and SOCS2 may be a prognostic marker for NAFLD-associated HCC. |
|---|---|
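
The summary describes selecting feature genes by combining LASSO regression with SVM-RFE. A minimal sketch of that idea in Python with scikit-learn follows. It is not the authors' pipeline: the synthetic data, the `gene_i` names, the helper `select_feature_genes`, and the parameters `alpha` and `n_rfe` are all assumptions made for illustration, and a plain linear LASSO fitted to 0/1 labels stands in for whatever penalised model the paper used.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def select_feature_genes(X, y, gene_names, n_rfe=10, alpha=0.05):
    """Return the genes kept by both LASSO and SVM-RFE (hypothetical helper)."""
    # Put all genes on the same scale so that coefficients are comparable.
    X_std = StandardScaler().fit_transform(X)

    # LASSO: the L1 penalty shrinks most coefficients to exactly zero;
    # genes with a non-zero coefficient are retained.
    lasso = Lasso(alpha=alpha).fit(X_std, y)
    lasso_genes = {g for g, c in zip(gene_names, lasso.coef_) if c != 0.0}

    # SVM-RFE: fit a linear SVM, drop the lowest-weighted gene, and repeat
    # until n_rfe genes remain.
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=n_rfe, step=1)
    rfe.fit(X_std, y)
    rfe_genes = {g for g, keep in zip(gene_names, rfe.support_) if keep}

    # Feature genes are those selected by both methods.
    return sorted(lasso_genes & rfe_genes)


# Synthetic stand-in for an expression matrix (samples x genes); assumption only.
X, y = make_classification(
    n_samples=80, n_features=30, n_informative=5, n_redundant=0, random_state=0
)
gene_names = [f"gene_{i}" for i in range(X.shape[1])]
selected = select_feature_genes(X, y, gene_names)
print(selected)
```

Taking the intersection of the two selections is one common way to combine them. In a real analysis, `alpha` and `n_rfe` would typically be chosen by cross-validation rather than fixed as here.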