Complex diseases SNP selection and classification by hybrid Association Rule Mining and Artificial Neural Network—based Evolutionary Algorithms

Bibliographic Details
Published in: Engineering Applications of Artificial Intelligence, Vol. 51, pp. 58-70
Main Authors: Boutorh, Aicha; Guessoum, Ahmed
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.05.2016
ISSN: 0952-1976, 1873-6769
DOI: 10.1016/j.engappai.2016.01.004

Summary: Various techniques have recently been applied to classify Single Nucleotide Polymorphism (SNP) data, since SNPs have been shown to be implicated in various human diseases. One of the major problems with SNP datasets is the "large p, small n" problem: the number of features is high and the number of samples is small, which makes the classification task complex. This paper proposes a new hybrid intelligent technique, based on Association Rule Mining (ARM) and Neural Networks (NN) and driven by Evolutionary Algorithms (EA), to deal with the dimensionality problem. On the one hand, ARM optimized by Grammatical Evolution (GE) selects the most informative features and reduces the dimensionality by extracting, in parallel, associations between SNPs in two separate datasets of case and control samples. On the other hand, to complement this step, an NN is used for efficient classification. A Genetic Algorithm (GA) sets the parameters of the two combined techniques. The proposed GA-NN-GEARM approach was applied to four different SNP datasets obtained from the NCBI Gene Expression Omnibus (GEO) website. The resulting model achieved high classification accuracy, in some cases 100%, and outperformed several feature selection techniques combined with different classifiers.
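
The idea of mining the case and control samples separately and keeping the SNPs that behave differently in the two groups can be illustrated with a minimal sketch. This is not the authors' GE-optimized ARM: it is a simplified, frequency-based stand-in that only compares the support of single (SNP, genotype) items. The function names `item_support` and `select_snps`, the 0/1/2 genotype coding, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def item_support(samples):
    """Return the fraction of samples that contain each (snp_index, genotype) item."""
    counts = Counter()
    for genotypes in samples:
        # enumerate() pairs each genotype with its SNP column index
        counts.update(enumerate(genotypes))
    n = len(samples)
    return {item: c / n for item, c in counts.items()}


def select_snps(cases, controls, top_k):
    """Rank SNPs by their largest case/control support gap and keep the top_k."""
    case_sup = item_support(cases)
    ctrl_sup = item_support(controls)
    gap = {}
    for item in set(case_sup) | set(ctrl_sup):
        snp = item[0]
        diff = abs(case_sup.get(item, 0.0) - ctrl_sup.get(item, 0.0))
        gap[snp] = max(gap.get(snp, 0.0), diff)
    # Sort by descending gap; ties are broken by SNP index for determinism
    ranked = sorted(gap, key=lambda s: (-gap[s], s))
    return ranked[:top_k]


# Toy data (invented): rows are samples, columns are SNPs,
# values are hypothetical genotype codes 0/1/2.
cases = [[2, 0, 1], [2, 1, 1], [2, 0, 0]]
controls = [[0, 0, 1], [0, 1, 1], [1, 0, 0]]
selected = select_snps(cases, controls, top_k=1)
```

In this toy example only the first SNP column differs between the two groups, so it is the one retained. In the approach the summary describes, the retained features would then be passed to a neural network classifier, with a genetic algorithm tuning the parameters of both stages; neither of those steps is sketched here.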