A fuzzy granular sparse learning model for identifying antigenic variants of influenza viruses

Bibliographic Details
Published in Applied soft computing Vol. 109; p. 107573
Main Authors Chen, Yumin; Cai, Zhiwen; Shi, Lei; Li, Wei
Format Journal Article
Language English
Published Elsevier B.V. 01.09.2021
ISSN 1568-4946, 1872-9681
DOI 10.1016/j.asoc.2021.107573

Summary: Sparse learning has significant applications in statistics, big data, bioinformatics and machine learning. In big data systems, large amounts of redundant, missing and noisy data cause sparsity, and rapid changes in information result in uncertainty. Since traditional sparse learning models have difficulty dealing with uncertain data, we propose a Fuzzy Granular Sparse Learning (FGSL) model for identifying antigenic variants of influenza viruses. Firstly, fuzzy set theory is introduced to measure and granulate the influenza viruses. Fuzzy granules are induced by single-feature fuzzy granulation. Then, a fuzzy granular vector is constructed from these fuzzy granules, and fuzzy granular regression is presented. Constraint norms for granules and granular vectors are proposed: two granule norms and four granular vector norms. The FGSL model is then constructed from granular regression and these constraint norms, and it includes granular ridge and lasso regressions under different constraint norms. Furthermore, we prove the derivative forms of two granular regression functions, guaranteeing the convergence of the FGSL model. The optimization problem of the FGSL model is discussed, and two gradient descent algorithms for the FGSL model are designed. Finally, we apply the FGSL model to serologic data and hemagglutinin sequences to learn antigenicity-associated mutations and infer antigenic variants. The experimental results confirm advantages of the FGSL model: fast convergence, low RMSE and strong feature selection ability. We successfully identify antigenic variants of influenza viruses with the FGSL model.

• We present some fuzzy granular vectors for designing classifiers.
• Some fuzzy granular operators on these granular vectors are defined.
• We propose an FGSL model for identifying antigenic variants of influenza viruses.
• We further design two gradient descent algorithms for the FGSL model.
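
The summary names the ingredients of the approach (single-feature fuzzy granulation, granule norms, ridge and lasso penalties, gradient descent) but gives no formulas. The following Python sketch is only an illustration under stated assumptions, not the authors' FGSL algorithm: it assumes a similarity of 1 - |difference| on min-max scaled features, collapses each granule to its mean membership instead of regressing on granular vectors, and fits the result with plain (sub)gradient descent. All function names (`fuzzy_granulate`, `granule_features`, `fit`, `rmse`) are hypothetical.

```python
def fuzzy_granulate(X):
    """Assumed granulation: G[i][a][j] = 1 - |x_ia - x_ja| on features scaled to [0, 1]."""
    n, d = len(X), len(X[0])
    cols = list(zip(*X))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]  # constant feature: avoid dividing by 0
    Xn = [[(row[a] - lo[a]) / span[a] for a in range(d)] for row in X]
    return [[[1.0 - abs(Xn[i][a] - Xn[j][a]) for j in range(n)]
             for a in range(d)] for i in range(n)]


def granule_features(G):
    """Simplification: reduce each granule to its mean membership (a scaled 1-norm)."""
    return [[sum(g) / len(g) for g in sample] for sample in G]


def fit(Z, y, penalty="ridge", lam=0.01, lr=0.1, n_iter=2000):
    """Linear model on Z fitted by gradient descent with a ridge or lasso penalty."""
    if penalty not in ("ridge", "lasso"):
        raise ValueError("penalty must be 'ridge' or 'lasso'")
    n, d = len(Z), len(Z[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        # Residuals are computed once per step, so all weights update simultaneously.
        resid = [sum(wa * za for wa, za in zip(w, z)) + b - yi for z, yi in zip(Z, y)]
        for a in range(d):
            grad = sum(r * z[a] for r, z in zip(resid, Z)) / n
            if penalty == "ridge":
                grad += lam * w[a]                       # gradient of (lam / 2) * w_a^2
            else:
                grad += lam * ((w[a] > 0) - (w[a] < 0))  # subgradient of lam * |w_a|
            w[a] -= lr * grad
        b -= lr * sum(resid) / n                         # bias is not penalized
    return w, b


def rmse(pred, y):
    """Root mean squared error between predictions and targets."""
    return (sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)) ** 0.5
```

A call such as `fit(granule_features(fuzzy_granulate(X)), y, penalty="lasso")` returns a weight per feature and a bias. The lasso penalty tends to push the weights of uninformative features toward zero, which is the general mechanism behind the feature selection ability the summary mentions.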