Identification of potential feature genes in non-alcoholic fatty liver disease using bioinformatics analysis and machine learning strategies

Bibliographic Details
Published in Computers in Biology and Medicine, Vol. 157, p. 106724
Main Authors Zhang, Zhaohui; Wang, Shihao; Zhu, Zhengwen; Nie, Biao
Format Journal Article
Language English
Published United States Elsevier Ltd 01.05.2023
Elsevier Limited
ISSN 0010-4825, 1879-0534
DOI 10.1016/j.compbiomed.2023.106724

Summary: The prevalence of non-alcoholic fatty liver disease (NAFLD) and NAFLD-associated hepatocellular carcinoma (HCC) has increased continuously in recent years. Machine learning is an effective method for screening the feature genes of a disease for prediction, prevention and personalized treatment. Here, we used the “limma” package and weighted gene co-expression network analysis (WGCNA) to screen 219 NAFLD-related genes and found that they were mainly enriched in inflammation-related pathways. Four feature genes (AXUD1, FOSB, GADD45B, and SOCS2) were selected by two machine learning algorithms, LASSO regression and support vector machine-recursive feature elimination (SVM-RFE). A clinical diagnostic model constructed from these genes achieved an area under the curve (AUC) of 0.994, which was superior to other indicators of NAFLD. Feature gene expression correlated significantly with steatohepatitis histology and clinical variables. These findings were also validated in external datasets and a mouse model. Finally, we found that feature gene expression was significantly decreased in NAFLD-associated HCC and that SOCS2 may be a prognostic biomarker. Our findings may provide new insights into the diagnosis, prevention and therapeutic targets of NAFLD and NAFLD-associated HCC.
• Machine learning identified AXUD1, FOSB, GADD45B and SOCS2 as candidate biomarkers for NAFLD.
• AXUD1, FOSB, GADD45B, and SOCS2 showed a significant negative correlation with the histology grade of steatohepatitis.
• A ROC curve and a nomogram were constructed for clinical use.
• Low expression of SOCS2 in HCC predicts a worse survival rate, and SOCS2 may be a prognostic marker for NAFLD-associated HCC.
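The two-algorithm selection step described in the summary can be illustrated with a minimal Python sketch. This is not the authors' pipeline: it assumes scikit-learn, runs on synthetic data rather than expression profiles, and the helper names (`select_features`, `cross_validated_auc`) and the `n_rfe` setting are hypothetical. It keeps the features retained by both LASSO and SVM-RFE, then scores a simple classifier on them by cross-validated AUC.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def select_features(X, y, n_rfe=8, seed=0):
    """Return indices kept by both LASSO and SVM-RFE (hypothetical helper)."""
    X_std = StandardScaler().fit_transform(X)

    # LASSO: keep features whose coefficient is not shrunk to zero.
    lasso = LassoCV(cv=5, random_state=seed).fit(X_std, y)
    lasso_idx = set(np.flatnonzero(lasso.coef_))

    # SVM-RFE: repeatedly drop the lowest-weight feature of a linear SVM
    # until n_rfe features remain (n_rfe is an assumed setting).
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=n_rfe, step=1)
    rfe.fit(X_std, y)
    rfe_idx = set(np.flatnonzero(rfe.support_))

    return sorted(int(i) for i in lasso_idx & rfe_idx)


def cross_validated_auc(X, y, features):
    """AUC of a logistic model on the chosen features, from 5-fold predictions."""
    model = LogisticRegression(max_iter=1000)
    proba = cross_val_predict(
        model, X[:, features], y, cv=5, method="predict_proba"
    )[:, 1]
    return roc_auc_score(y, proba)


# Synthetic stand-in for an expression matrix: 120 samples, 50 "genes",
# of which the first 4 columns carry signal (shuffle=False keeps them first).
X, y = make_classification(
    n_samples=120, n_features=50, n_informative=4, n_redundant=0,
    class_sep=2.0, shuffle=False, random_state=0,
)

selected = select_features(X, y)
auc = cross_validated_auc(X, y, selected) if selected else None
print("features kept by both methods:", selected)
print("cross-validated AUC:", auc)
```

Because the features are selected on the full dataset before cross-validation, the AUC printed by this sketch is optimistic. An independent validation set, such as the external datasets mentioned in the summary, gives a fairer estimate.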