Sensitive and specific post-call filtering of genetic variants in xenograft and primary tumors

Bibliographic Details
Published in Bioinformatics Vol. 34; no. 10; pp. 1713-1718
Main Authors Mannakee, Brian K, Balaji, Uthra, Witkiewicz, Agnieszka K, Gutenkunst, Ryan N, Knudsen, Erik S
Format Journal Article
Language English
Published England Oxford University Press 15.05.2018
ISSN 1367-4803
1367-4811
1460-2059
DOI 10.1093/bioinformatics/bty010

Summary: Abstract
Motivation: Tumor genome sequencing offers great promise for guiding research and therapy, but spurious variant calls can arise from multiple sources. Mouse contamination can generate many spurious calls when sequencing patient-derived xenografts. Paralogous genome sequences can also generate spurious calls when sequencing any tumor. We developed a BLAST-based algorithm, Mouse And Paralog EXterminator (MAPEX), to identify and filter out spurious calls from both these sources.
Results: When calling variants from xenografts, MAPEX has similar sensitivity and specificity to more complex algorithms. When applied to any tumor, MAPEX also automatically flags calls that potentially arise from paralogous sequences. Our implementation, mapexr, runs quickly and easily on a desktop computer. MAPEX is thus a useful addition to almost any pipeline for calling genetic variants in tumors.
Availability and implementation: The mapexr package for R is available at https://github.com/bmannakee/mapexr under the MIT license.
Supplementary information: Supplementary data are available at Bioinformatics online.
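
The summary describes filtering calls whose support comes from mouse reads or from paralogous human sequence. The following is a minimal illustrative sketch of that general idea in Python. It is not the mapexr implementation and not its API. The `BestHit` record, the function names, the labels, and the 0.5 threshold are all hypothetical, and the sketch assumes each variant-supporting read has already been aligned with BLAST and reduced to its single best hit.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BestHit:
    """Best BLAST hit for one variant-supporting read (hypothetical record)."""
    genome: str  # "human" or "mouse"
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive


def classify_read(hit: BestHit, chrom: str, pos: int) -> str:
    """Label one read by where its best hit falls relative to the call site."""
    if hit.genome == "mouse":
        return "mouse"
    if hit.chrom == chrom and hit.start <= pos <= hit.end:
        return "on_target"
    # Best hit is human but elsewhere in the genome: possible paralog.
    return "off_target"


def classify_call(hits: Iterable[BestHit], chrom: str, pos: int,
                  min_on_target: float = 0.5) -> str:
    """Keep a call if enough supporting reads map back to the call site.

    The threshold is an assumed value chosen for illustration only.
    """
    labels = [classify_read(h, chrom, pos) for h in hits]
    if not labels:
        return "no_evidence"
    if labels.count("on_target") / len(labels) >= min_on_target:
        return "keep"
    # Otherwise report the dominant source of off-site support.
    if labels.count("mouse") >= labels.count("off_target"):
        return "mouse"
    return "paralog"
```

For example, a call at chr12:150 supported by three reads whose best hits span that site and one read whose best hit is in the mouse genome would be kept under this assumed threshold, while a call supported mostly by reads that align best elsewhere in the human genome would be labeled a possible paralog artifact.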