Development of Computer Algorithm for Editing of Next Generation Sequencing Metagenome Data


Bibliographic Details
Published in	Journal of computational biology Vol. 24; no. 9; p. 882
Main Authors	Khanna, Radhika; Mittal, Sangeeta; Mohanty, Sujata
Format	Journal Article
Language	English
Published	United States 01.09.2017
Subjects	Animals; Drosophila - genetics; High-Throughput Nucleotide Sequencing - methods; High-Throughput Nucleotide Sequencing - standards; Metagenome; Sequence Analysis, DNA - methods; Sequence Analysis, DNA - standards; Software; MATLAB; next generation sequencing; computer algorithm; sequence editing; Python
ISSN	1557-8666
DOI	10.1089/cmb.2016.0179

Abstract	The successful implementation of next generation sequencing (NGS), an advanced sequencing technology, motivates scientists from diverse fields of biological research, especially genomics and transcriptomics, to generate large genomic data sets that make their analyses more robust and support stronger inferences. However, exploiting these huge data sets is a challenge for molecular biologists. To address this problem, computational software and hardware are being developed in parallel and have become an integral part of life science. While executing the "Genomics project of Indian Drosophila species," we found strings of Ns in the whole genome sequences generated on the Illumina platform. The present article aims to develop computer algorithms (MATLAB and Python based) for editing raw sequences, mainly by eliminating bad residues, before submission to a publicly accessible sequence repository. These algorithms will help life scientists analyze large amounts of biological data in a short span of time.
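The abstract does not spell out the editing procedure, so the following is only a minimal Python sketch of the general idea it describes: removing strings of Ns (the IUPAC code for an undetermined base) from sequences before submission. The function names `strip_n_runs` and `clean_fasta`, the `min_run` threshold, and the assumption that the input is FASTA-formatted are all illustrative choices, not details taken from the article.

```python
import re


def strip_n_runs(sequence, min_run=1):
    """Remove every run of at least `min_run` consecutive N/n characters.

    `min_run` is a hypothetical parameter: 1 removes all Ns, larger values
    keep short, isolated ambiguous calls and drop only long runs.
    """
    if min_run < 1:
        raise ValueError("min_run must be at least 1")
    return re.sub(r"[Nn]{%d,}" % min_run, "", sequence)


def clean_fasta(lines, min_run=1):
    """Yield (header, cleaned_sequence) pairs from FASTA-formatted lines.

    Sequence lines of one record are joined before cleaning, so a run of Ns
    that is wrapped across several lines is still treated as a single run.
    Sequence lines appearing before the first header are ignored.
    """
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, strip_n_runs("".join(chunks), min_run)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, strip_n_runs("".join(chunks), min_run)
```

Under these assumptions, `clean_fasta(open("reads.fasta"))` would iterate over cleaned records one at a time (the file name is a placeholder). Note that deleting residues shifts downstream coordinates, so whether to delete, mask, or split at N runs depends on the requirements of the target repository.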
Author affiliations	1. Khanna, Radhika: Department of Biotechnology, Jaypee Institute of Information Technology, Noida, India; 2. Mittal, Sangeeta: Department of Computer Science and Information Technology, Jaypee Institute of Information Technology, Noida, India; 3. Mohanty, Sujata: Department of Biotechnology, Jaypee Institute of Information Technology, Noida, India
BackLink	https://www.ncbi.nlm.nih.gov/pubmed/28632436
Discipline	Biology Mathematics
EISSN	1557-8666
ExternalDocumentID	28632436
Genre	Journal Article
IsPeerReviewed	true
IsScholarly	true
Issue	9
Keywords	MATLAB; next generation sequencing; computer algorithm; sequence editing; Python
Language	English
PMID	28632436
PublicationDate	2017-Sep
PublicationDateYYYYMMDD	2017-09-01
PublicationTitle	Journal of computational biology
PublicationTitleAlternate	J Comput Biol
PublicationYear	2017
StartPage	882
SubjectTerms	Animals; Drosophila - genetics; High-Throughput Nucleotide Sequencing - methods; High-Throughput Nucleotide Sequencing - standards; Metagenome; Sequence Analysis, DNA - methods; Sequence Analysis, DNA - standards; Software
Title	Development of Computer Algorithm for Editing of Next Generation Sequencing Metagenome Data
URI	https://www.ncbi.nlm.nih.gov/pubmed/28632436
Volume	24