Routine performance and errors of 454 HLA exon sequencing in diagnostics

Bibliographic Details
Published in	BMC bioinformatics Vol. 14; no. 1; p. 176
Main Authors	Niklas, Norbert; Pröll, Johannes; Danzer, Martin; Stabentheiner, Stephanie; Hofer, Katja; Gabriel, Christian
Format	Journal Article
Language	English
Published	London: BioMed Central, 03.06.2013 (BioMed Central Ltd; Springer Nature B.V.)
Subjects	Algorithms; Bioinformatics; Biomedical and Life Sciences; Computational Biology/Bioinformatics; Computer Appl. in Life Sciences; DNA sequencing; Exons; Genetic aspects; High-Throughput Nucleotide Sequencing - methods; Histocompatibility antigens; Histocompatibility Testing; HLA Antigens - genetics; HLA histocompatibility antigens; Humans; Life Sciences; Methods; Microarrays; Nucleotide sequencing; Physiological aspects; Research Article; Results and data; Sequence Analysis, DNA - methods; Austria; Germany; Next-generation sequencing; Error characteristics; Quality control; Human leukocyte antigen typing
ISSN	1471-2105
DOI	10.1186/1471-2105-14-176

Summary:	Background: Next-generation sequencing (NGS) has changed genomics significantly, and a growing number of applications rely on sequencing with different platforms. Now, in 2012, after a decade of development and evolution, NGS has been accepted in a variety of research fields. Determining sequencing errors is essential for taking next-generation sequencing beyond research use only. This study describes the overall performance of the 454 system across multiple GS Junior runs with an in-house established and validated diagnostic assay for human leukocyte antigen (HLA) exon sequencing. Based on these data, we extracted, evaluated and characterized errors and variants of 60 HLA loci per run with respect to their adjacencies. Results: We determined an overall error rate of 0.18% in a total of 118,484,408 bases. 31.3% of all reads analyzed (n=349,503) contain one or more errors. Deletions form the largest group and account for 50% of the errors. Incorrect bases are not distributed equally along sequences and tend to be more frequent at sequence ends. Certain sequence positions in the middle or at the beginning of the read also accumulate errors. Typically, the quality score at the actual error position is lower than the adjacent scores. Conclusions: Here we present the first error assessment of a human next-generation sequencing diagnostic assay using an amplicon sequencing approach. The improvements in sequence quality and error rate made over the years are evident, and both have now reached a level where diagnostic applications become feasible. The data presented here are better than previously published error rates, and we can confirm and quantify the often-described relationship between homopolymers and errors. Nevertheless, a certain depth of coverage is needed, particularly in challenging areas of the sequencing target. Furthermore, the use of error-correcting tools is not essential but might contribute to the capacity and efficiency of a sequencing run.
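The percentages in the summary can be turned into approximate absolute counts. The following is a minimal sketch, not taken from the article: it assumes that n=349,503 is the total number of reads analyzed, and because the published percentages are rounded, every derived count is only an estimate. The function name `implied_counts` is hypothetical.

```python
def implied_counts(total_bases, total_reads, error_rate, reads_with_error, deletion_share):
    """Derive approximate absolute counts from the rounded figures in the summary.

    All inputs are fractions (0.0018 for 0.18%), not percentages.
    Assumption: total_reads is the number of all reads analyzed.
    """
    erroneous_bases = total_bases * error_rate
    erroneous_reads = total_reads * reads_with_error
    return {
        # Estimated number of base-level errors across all runs
        "erroneous_bases": round(erroneous_bases),
        # Estimated number of reads carrying at least one error
        "erroneous_reads": round(erroneous_reads),
        # Estimated number of errors that are deletions
        "deletions": round(erroneous_bases * deletion_share),
        # Average read length implied by the base and read totals
        "mean_read_length": round(total_bases / total_reads, 1),
        # Average number of errors in a read that has any error
        "errors_per_erroneous_read": round(erroneous_bases / erroneous_reads, 2),
    }


summary = implied_counts(
    total_bases=118_484_408,
    total_reads=349_503,
    error_rate=0.0018,        # 0.18% overall error rate
    reads_with_error=0.313,   # 31.3% of reads contain one or more errors
    deletion_share=0.50,      # deletions account for 50% of the errors
)

for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these assumptions the figures imply roughly 213,000 erroneous bases spread over about 109,000 reads, or close to two errors per affected read, with a mean read length of about 339 bases.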