A Parallel Software Pipeline for DMET Microarray Genotyping Data Analysis

Bibliographic Details
Published in	Biotech (Basel) Vol. 7; no. 2; p. 17
Main Authors	Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario
Format	Journal Article
Language	English
Published	Switzerland: MDPI AG, 14.06.2018
Subjects	Computer programs; Data analysis; Data mining; Data processing; Datasets; Drug development; Gene polymorphism; Genome-wide association studies; Genomes; Genomics; Genotyping; Linux; Mass spectroscopy; Medical treatment; Metabolism; Next-generation sequencing; Operating systems; Pharmacogenomics; Phenotypes; Precision medicine; Single-nucleotide polymorphism; Software; Software utilities; Statistical analysis; User interface; overall survival curves; pharmacogenomics; single nucleotide polymorphisms; statistical analysis; multiple analysis pipeline; data mining
ISSN	2571-5135, 2673-6284
DOI	10.3390/ht7020017

Summary:	Personalized medicine is the aspect of P4 medicine (predictive, preventive, personalized and participatory) concerned with customizing medical care to the characteristics of each subject. In personalized medicine, the development of medical treatments and drugs is tailored to the individual characteristics and needs of each subject, based on the study of diseases at different scales, from genotype to phenotype. Achieving this goal requires high-throughput methodologies such as Next Generation Sequencing (NGS), Genome-Wide Association Studies (GWAS), mass spectrometry, or microarrays, which can investigate a single disease from a broader perspective. A side effect of high-throughput methodologies is the massive amount of data produced by each experiment, which poses several challenges (e.g., high execution time and memory requirements) to bioinformatics software. A main requirement of modern bioinformatics software is therefore the use of sound software engineering methods and efficient programming techniques, including parallel programming and efficient, compact data structures, to meet those challenges. This paper presents the design and experimental evaluation of a comprehensive software pipeline, named microPipe, for the preprocessing, annotation and analysis of microarray-based Single Nucleotide Polymorphism (SNP) genotyping data. A use case in pharmacogenomics is presented. The main advantages of microPipe are: fewer errors of the kind that arise when making data compatible among different tools; the ability to analyze huge datasets in parallel; and easy annotation and integration of data. microPipe is available under a Creative Commons license and is freely downloadable for academic and not-for-profit institutions.