A Comprehensive Workflow for Read Depth-Based Identification of Copy-Number Variation from Whole-Genome Sequence Data

Bibliographic Details
Published in American journal of human genetics Vol. 102; no. 1; pp. 142-155
Main Authors Trost, Brett; Walker, Susan; Wang, Zhuozhi; Thiruvahindrapuram, Bhooma; MacDonald, Jeffrey R.; Sung, Wilson W.L.; Pereira, Sergio L.; Whitney, Joe; Chan, Ada J.S.; Pellecchia, Giovanna; Reuter, Miriam S.; Lok, Si; Yuen, Ryan K.C.; Marshall, Christian R.; Merico, Daniele; Scherer, Stephen W.
Format Journal Article
Language English
Published United States Elsevier Inc 04.01.2018
Elsevier
ISSN 0002-9297
1537-6605
DOI 10.1016/j.ajhg.2017.12.007

Summary: A remaining hurdle to whole-genome sequencing (WGS) becoming a first-tier genetic test has been accurate detection of copy-number variations (CNVs). Here, we used several datasets to empirically develop a detailed workflow for identifying germline CNVs >1 kb from short-read WGS data using read depth-based algorithms. Our workflow is comprehensive in that it addresses all stages of the CNV-detection process, including DNA library preparation, sequencing, quality control, reference mapping, and computational CNV identification. We used our workflow to detect rare, genic CNVs in individuals with autism spectrum disorder (ASD), and 120/120 such CNVs tested using orthogonal methods were successfully confirmed. We also identified 71 putative genic de novo CNVs in this cohort, which had a confirmation rate of 70%; the remainder were incorrectly identified as de novo due to false positives in the proband (7%) or parental false negatives (23%). In individuals with an ASD diagnosis in which both microarray and WGS experiments were performed, our workflow detected all clinically relevant CNVs identified by microarrays, as well as additional potentially pathogenic CNVs <20 kb. Thus, CNVs of clinical relevance can be discovered from WGS with a detection rate exceeding microarrays, positioning WGS as a single assay for genetic variation detection.
These authors contributed equally to this work