Development of Computer Algorithm for Editing of Next Generation Sequencing Metagenome Data

Bibliographic Details
Published in Journal of Computational Biology, Vol. 24, no. 9, p. 882
Main Authors Khanna, Radhika; Mittal, Sangeeta; Mohanty, Sujata
Format Journal Article
Language English
Published United States, 01.09.2017
ISSN 1557-8666
DOI 10.1089/cmb.2016.0179

Abstract The successful implementation of next generation sequencing (NGS), an advanced sequencing technology, has motivated scientists from diverse fields of biological research, especially genomics and transcriptomics, to generate large genomic data sets that make their analyses more robust and support stronger inferences. However, exploiting these huge genomic data sets is a challenge for molecular biologists. To address this problem, computational software and hardware are being developed in parallel and have become an integral part of life science. While executing the "Genomics project of Indian Drosophila species," we found strings of Ns in the whole genome sequences generated on the Illumina platform. The present article aims to develop a computer algorithm (MATLAB and Python based) for editing raw sequences, mainly by eliminating bad residues, before they are submitted to a publicly accessible sequence repository. These algorithms will help life scientists analyze large amounts of biological data in a short span of time.
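The abstract does not give the algorithm itself, so the following is only a minimal Python sketch of the general idea it describes: removing runs of N (uncalled bases) from raw sequences. It assumes FASTA-formatted input, and the names `strip_n_runs`, `read_fasta`, `clean_records`, and the `min_run` parameter are hypothetical, chosen for illustration rather than taken from the published MATLAB or Python code.

```python
import re

# Matches one or more consecutive N characters (upper or lower case).
N_RUN = re.compile(r"[Nn]+")


def strip_n_runs(sequence, min_run=1):
    """Delete every run of at least `min_run` consecutive N characters.

    `min_run` is an illustrative knob, not a parameter from the article.
    Shorter runs are left in place.
    """
    def replace(match):
        run = match.group()
        return "" if len(run) >= min_run else run

    return N_RUN.sub(replace, sequence)


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    Sequence lines of one record are joined first, so a run of Ns that
    spans a line break is treated as a single run.
    """
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def clean_records(lines, min_run=1):
    """Yield (header, cleaned_sequence) pairs with N runs removed."""
    for header, sequence in read_fasta(lines):
        yield header, strip_n_runs(sequence, min_run)
```

With an open file handle, `for header, seq in clean_records(handle): ...` would stream the cleaned records one at a time. Note that deleting Ns shifts downstream coordinates, so a real pipeline might instead split a sequence at long N runs or record the removed positions; which choice the article makes is not stated in the abstract.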
Author_xml – sequence: 1
  givenname: Radhika
  surname: Khanna
  fullname: Khanna, Radhika
  organization: 1 Department of Biotechnology, Jaypee Institute of Information Technology, Noida, India
– sequence: 2
  givenname: Sangeeta
  surname: Mittal
  fullname: Mittal, Sangeeta
  organization: 2 Department of Computer Science and Information Technology, Jaypee Institute of Information Technology, Noida, India
– sequence: 3
  givenname: Sujata
  surname: Mohanty
  fullname: Mohanty, Sujata
  organization: 1 Department of Biotechnology, Jaypee Institute of Information Technology, Noida, India
Discipline Biology
Mathematics
EISSN 1557-8666
IsPeerReviewed true
IsScholarly true
Keywords MATLAB
next generation sequencing
computer algorithm
sequence editing
Python
PMID 28632436
PublicationTitleAlternate J Comput Biol
SubjectTerms Animals
Drosophila - genetics
High-Throughput Nucleotide Sequencing - methods
High-Throughput Nucleotide Sequencing - standards
Metagenome
Sequence Analysis, DNA - methods
Sequence Analysis, DNA - standards
Software
URI https://www.ncbi.nlm.nih.gov/pubmed/28632436