Development of Computer Algorithm for Editing of Next Generation Sequencing Metagenome Data

Bibliographic Details
Published in	Journal of computational biology Vol. 24; no. 9; p. 882
Main Authors	Khanna, Radhika; Mittal, Sangeeta; Mohanty, Sujata
Format	Journal Article
Language	English
Published	United States 01.09.2017
Subjects	Animals; Drosophila - genetics; High-Throughput Nucleotide Sequencing - methods; High-Throughput Nucleotide Sequencing - standards; Metagenome; Sequence Analysis, DNA - methods; Sequence Analysis, DNA - standards; Software; MATLAB; next generation sequencing; computer algorithm; sequence editing; Python
ISSN	1557-8666
DOI	10.1089/cmb.2016.0179

Summary:	The successful adoption of next generation sequencing (NGS) has motivated scientists from diverse fields of biological research, especially genomics and transcriptomics, to generate large genomic data sets that make their analyses more robust and support stronger inferences. However, exploiting these huge data sets is a challenge for molecular biologists. To address this problem, computational software and hardware are being developed in parallel and have become an integral part of life science. While executing the "Genomics project of Indian Drosophila species," we found strings of Ns in the whole genome sequences generated on the Illumina platform. The present article aims to develop a computer algorithm (MATLAB and Python based) for editing raw sequences, mainly by eliminating bad residues, before submission to a publicly accessible sequence repository. These algorithms will help life scientists analyze large amounts of biological data in a short span of time.
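The summary does not describe the article's algorithm in detail, so the following is only a minimal Python sketch of the general idea it names: removing strings of Ns from sequences before submission. It is not the authors' implementation. The function names (`parse_fasta`, `remove_n_runs`, `edit_records`), the `min_run` parameter, and the choice of FASTA-formatted input are all assumptions made for illustration.

```python
import re
from typing import Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def remove_n_runs(sequence: str, min_run: int = 1) -> str:
    """Delete every run of at least `min_run` consecutive N (or n) residues.

    `min_run` is a hypothetical knob; the source does not state a threshold.
    """
    if min_run < 1:
        raise ValueError("min_run must be at least 1")
    pattern = re.compile(r"[Nn]{%d,}" % min_run)
    return pattern.sub("", sequence)


def edit_records(text: str, min_run: int = 1) -> List[Tuple[str, str]]:
    """Clean every record and drop records that become empty."""
    cleaned = []
    for header, sequence in parse_fasta(text):
        edited = remove_n_runs(sequence, min_run)
        if edited:
            cleaned.append((header, edited))
    return cleaned


if __name__ == "__main__":
    example = ">read1\nACGTNNNN\nACGTNAC\n>read2\nNNNNNN\n"
    for header, sequence in edit_records(example):
        print(">" + header)
        print(sequence)
```

Note that deleting Ns joins the flanking residues, which shifts downstream coordinates and may connect bases that are not adjacent in the genome. Whether the article deletes such runs outright or splits sequences at them is not stated in this record, so treat the deletion here as one possible design choice.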