Indel variant analysis of short-read sequencing data with Scalpel

Bibliographic Details
Published in Nature protocols Vol. 11; no. 12; pp. 2529-2548
Main Authors Fang, Han; Bergmann, Ewa A; Arora, Kanika; Vacic, Vladimir; Zody, Michael C; Iossifov, Ivan; O'Rawe, Jason A; Wu, Yiyang; Jimenez Barron, Laura T; Rosenbaum, Julie; Ronemus, Michael; Lee, Yoon-ha; Wang, Zihua; Dikoglu, Esra; Jobanputra, Vaidehi; Lyon, Gholson J; Wigler, Michael; Schatz, Michael C; Narzisi, Giuseppe
Format Journal Article
Language English
Published London Nature Publishing Group UK 01.12.2016
Nature Publishing Group
ISSN 1754-2189
1750-2799
DOI 10.1038/nprot.2016.150

Summary: Fang et al. describe a computational protocol to accurately call indels from whole-genome and whole-exome sequencing data using Scalpel. Important issues for indel identification, such as short repeat regions and varying sequencing coverage, are discussed. As the second most common type of variation in the human genome, insertions and deletions (indels) have been linked to many diseases, but the discovery of indels of more than a few bases in size from short-read sequencing data remains challenging. Scalpel (http://scalpel.sourceforge.net) is open-source software for reliable indel detection based on the microassembly technique. It has been successfully used to discover mutations in novel candidate genes for autism, and it is extensively used in other large-scale studies of human diseases. This protocol gives an overview of the algorithm and describes how to use Scalpel to perform highly accurate indel calling from whole-genome and whole-exome sequencing data. We provide detailed instructions for an exemplary family-based de novo study, but we also characterize the other two supported modes of operation: single-sample and somatic analysis. Indel normalization, visualization and annotation of the mutations are also illustrated. Using a standard server, indel discovery and characterization in the exonic regions of the example sequencing data can be completed in ∼5 h after read mapping.
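
The summary mentions indel normalization and the difficulty of short repeat regions. As a minimal illustrative sketch (not Scalpel's own code and not a step of the published protocol), the Python function below left-aligns a single indel against a reference string, so that equivalent representations of the same event inside a repeat collapse to one canonical form. The function name `left_normalize`, the toy reference sequence, and the use of 1-based positions (as in VCF) are assumptions made for this example only.

```python
def left_normalize(reference, pos, ref, alt):
    """Left-align and trim one variant (hypothetical helper, for illustration).

    reference: reference sequence as a string
    pos:       1-based position of the first base of `ref` in `reference`
    ref, alt:  non-empty reference and alternate alleles
    Returns the normalized (pos, ref, alt).
    """
    # Shift left while both alleles end in the same base.
    while ref[-1] == alt[-1]:
        if min(len(ref), len(alt)) == 1:
            if pos == 1:
                break  # already at the start of the reference
            # Prepend the preceding reference base so neither allele
            # becomes empty after the rightmost base is dropped.
            pos -= 1
            base = reference[pos - 1]
            ref, alt = base + ref, base + alt
        ref, alt = ref[:-1], alt[:-1]

    # Trim shared leading bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Toy reference with a CA repeat at positions 3-8 (1-based).
toy_reference = "GGCACACATT"

# Two different ways of writing the same 2-bp deletion in the repeat.
print(left_normalize(toy_reference, 6, "ACA", "A"))
print(left_normalize(toy_reference, 7, "CAT", "T"))
```

Both calls describe the same deletion of one CA unit, and both normalize to the same leftmost record. This is the reason normalization matters when comparing indel calls across samples or tools.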