KATK: Fast genotyping of rare variants directly from unmapped sequencing reads
Published in | Human mutation Vol. 42; no. 6; pp. 777 - 786 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | United States: John Wiley & Sons, Inc, 01.06.2021 |
Subjects | |
ISSN | 1059-7794, 1098-1004 |
DOI | 10.1002/humu.24197 |
Summary: | KATK is a fast and accurate software tool for calling variants directly from raw next‐generation sequencing reads. It uses predefined k‐mers to retrieve only the reads of interest from the FASTQ file and calls genotypes by aligning retrieved reads locally. KATK does not use data about known polymorphisms and has NC (no call) as the default genotype. The reference or variant allele is called only if there is sufficient evidence for their presence in data. Thus it is not biased against rare variants or de‐novo mutations. With simulated datasets, we achieved a false‐negative rate of 0.23% (sensitivity 99.77%) and a false discovery rate of 0.19%. Calling all human exonic regions with KATK requires 1–2 h, depending on sequencing coverage.
The computation time of KATK. The computation time of both indexing and calling grows linearly with the size of the FASTQ file. The tests were conducted on a CentOS 5.10 Linux server with 32 cores (2.27 GHz) and 512 GiB (gibibyte, 2^30 bytes) RAM, and the maximum number of threads was set to 24. |
---|---|
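
The summary describes two ideas: retrieving only the reads that contain predefined k-mers, and treating NC (no call) as the default genotype. The Python sketch below illustrates both under stated assumptions. It is not KATK's implementation. The function names, the exact-match k-mer lookup, and the `min_depth` and `min_fraction` thresholds are hypothetical choices made for illustration, and the local alignment step mentioned in the summary is omitted.

```python
# Illustrative sketch only; names and thresholds are assumptions, not KATK's code.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_kmer_index(targets, k):
    """Collect k-mers from target regions on both strands."""
    index = set()
    for seq in targets:
        index |= kmers(seq, k)
        index |= kmers(reverse_complement(seq), k)
    return index


def retrieve_reads(reads, index, k):
    """Keep only reads that share at least one k-mer with the index."""
    return [read for read in reads if not kmers(read, k).isdisjoint(index)]


def call_genotype(ref_count, alt_count, min_depth=4, min_fraction=0.2):
    """Return 'NC' unless the read counts support a call.

    An allele is reported only if its share of the supporting reads
    reaches min_fraction (hypothetical threshold).
    """
    depth = ref_count + alt_count
    if depth < min_depth:
        return "NC"  # default: not enough evidence for either allele
    has_ref = ref_count / depth >= min_fraction
    has_alt = alt_count / depth >= min_fraction
    if has_ref and has_alt:
        return "0/1"
    if has_alt:
        return "1/1"
    if has_ref:
        return "0/0"
    return "NC"


index = build_kmer_index(["ACGTACGTAC"], k=5)
selected = retrieve_reads(["TTACGTACGG", "GGGGGGGGGG"], index, k=5)
```

In this toy run only the first read shares a 5-mer with the target, so it is the only one retained. Because the genotype function starts from NC, a site with too few supporting reads is left uncalled rather than assumed to match the reference.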