A fuzzy granular sparse learning model for identifying antigenic variants of influenza viruses
| Published in | Applied soft computing Vol. 109; p. 107573 |
|---|---|
| Main Authors | , , , |
| Format | Journal Article |
| Language | English |
| Published | Elsevier B.V., 01.09.2021 |
| Subjects | |
| ISSN | 1568-4946 1872-9681 |
| DOI | 10.1016/j.asoc.2021.107573 |
| Summary: | Sparse learning has significant applications in statistics, big data, bioinformatics and machine learning. In big data systems, large amounts of redundant, missing and noisy data cause sparsity, and rapid changes in information result in uncertainty. Because traditional sparse learning models struggle with uncertain data, we propose a Fuzzy Granular Sparse Learning (FGSL) model for identifying antigenic variants of influenza viruses. First, fuzzy set theory is introduced to measure and granulate the influenza viruses. Fuzzy granules are induced by single-feature fuzzy granulation. A fuzzy granular vector is then constructed from these fuzzy granules, and fuzzy granular regression is presented. Constraint norms are proposed for granules and granular vectors: two granule norms and four granular vector norms. The FGSL model is constructed from granular regression and these constraint norms, and it includes granular ridge and lasso regressions under different constraint norms. Furthermore, we prove the derivative forms of two granular regression functions, which guarantees the convergence of the FGSL model. The optimization problem of the FGSL model is discussed, and two gradient descent algorithms for the FGSL model are designed. Finally, we apply the FGSL model to serologic data and hemagglutinin sequences to learn antigenicity-associated mutations and infer antigenic variants. The experimental results confirm advantages of the FGSL model: fast convergence, low RMSE and strong feature selection ability. We successfully identify antigenic variants of influenza viruses with the FGSL model. Highlights: • We present some fuzzy granular vectors for designing classifiers. • Some fuzzy granular operators on these granular vectors are defined. • We propose an FGSL model for identifying antigenic variants of influenza viruses. • We further design two gradient descent algorithms for the FGSL model. |
|---|---|
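
The summary names three ingredients: single-feature fuzzy granulation, ridge and lasso penalties, and gradient descent. The sketch below is a minimal, non-authoritative illustration of those ideas and is not the FGSL model itself. The record does not give the paper's granule operators, granule norms or granular vector norms, so nothing here reproduces them. Everything in the code is an assumption made for illustration: the similarity `1 - |x_i - x_j|` on features scaled to [0, 1], the function names `fuzzy_granule`, `soft_threshold` and `fit`, the use of ordinary (non-granular) penalized least squares, and the proximal step used for the lasso penalty.

```python
def fuzzy_granule(column, i):
    """Hypothetical single-feature granulation: the granule of sample i is its
    fuzzy similarity to every sample on one feature.
    Assumes the feature is scaled to [0, 1]; the similarity 1 - |x_i - x_j|
    is an illustrative choice, not taken from the paper."""
    return [1.0 - abs(column[i] - v) for v in column]


def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink toward zero, clip small values to 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit(X, y, penalty="ridge", lam=0.1, lr=0.1, epochs=2000):
    """Plain (non-granular) penalized least squares by gradient descent.

    ridge: minimizes (1/2n)||Xw - y||^2 + (lam/2)||w||^2
    lasso: minimizes (1/2n)||Xw - y||^2 + lam*||w||_1 (proximal gradient step)
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        # Residuals of the current linear prediction.
        residuals = [
            sum(wk * xk for wk, xk in zip(w, row)) - yi
            for row, yi in zip(X, y)
        ]
        # Gradient of the squared-error term with respect to each weight.
        grad = [
            sum(r * row[k] for r, row in zip(residuals, X)) / n
            for k in range(d)
        ]
        if penalty == "ridge":
            w = [wk - lr * (g + lam * wk) for wk, g in zip(w, grad)]
        else:
            w = [soft_threshold(wk - lr * g, lr * lam) for wk, g in zip(w, grad)]
    return w


if __name__ == "__main__":
    # Toy data (made up): the target depends only on the first feature.
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    y = [2.0, 0.0, 2.0, 0.0]
    print("granule:", fuzzy_granule([0.0, 0.5, 1.0], 1))
    print("ridge weights:", fit(X, y, penalty="ridge"))
    print("lasso weights:", fit(X, y, penalty="lasso"))
```

On this toy data the lasso fit drives the weight of the irrelevant second feature to zero, while the ridge fit only shrinks it. This is the kind of feature selection behaviour the summary attributes to sparse learning, shown here only for the plain, non-granular case.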