Aggregate population genomics data from large cohorts are vital for assessing germline variant pathogenicity. However, there are no specifications on how sequencing quality metrics should be considered, or on whether exome-derived and genome-derived allele frequencies should be considered in isolation. Germline genome sequence data were simulated at nine read-depths to identify a minimum acceptable read-depth for detecting variants. gnomAD exome-derived and genome-derived datasets were assessed for read-depth across six key cancer genes selected for variant curation by ClinGen expert panels. The non-Finnish European allele frequency and filter allele frequency of coding variants in these genes, assigned to frequency bins using modified ACMG-AMP criteria, were compared between the exome-derived and genome-derived datasets. A 30X read-depth achieved acceptable precision and recall for detection of substitutions, but poor recall for small insertions/deletions. Exome-derived and genome-derived datasets exhibited low read-depth for different gene exons. Individual variants were mostly assigned to the same allele frequency bin (>95%) or filter allele frequency bin (>97%). Two major bin divergences were resolved by applying the minimum acceptable read-depth threshold. These findings show the importance of assessing read-depth separately for population datasets sourced from different short-read sequencing technologies before assigning a frequency-based ACMG-AMP classification code for variant interpretation.
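
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. The 30X threshold is taken from the abstract, but the bin boundaries, the labels, and the names `Observation`, `assign_bin`, `usable_bin`, and `compare` are hypothetical, because the abstract does not state the modified ACMG-AMP cut-offs. The sketch also assumes a single mean read-depth per variant site and applies only to substitutions, since the abstract reports poor recall for small insertions/deletions at 30X.

```python
from typing import NamedTuple, Optional

# Read-depth the abstract reports as acceptable for detecting substitutions.
MIN_DEPTH = 30

# HYPOTHETICAL bin boundaries (lower bound, label), highest first.
# The paper's modified ACMG-AMP cut-offs are not given in the abstract.
BINS = [(0.05, "common"), (0.01, "intermediate"), (0.0001, "rare")]


class Observation(NamedTuple):
    """Allele frequency and mean read-depth for one variant in one dataset."""
    allele_freq: float
    mean_depth: float


def assign_bin(allele_freq: float) -> str:
    """Return the label of the first bin whose lower bound is met."""
    for lower_bound, label in BINS:
        if allele_freq >= lower_bound:
            return label
    return "very_rare"


def usable_bin(obs: Optional[Observation]) -> Optional[str]:
    """Bin the observation, or return None if it is missing or below MIN_DEPTH."""
    if obs is None or obs.mean_depth < MIN_DEPTH:
        return None
    return assign_bin(obs.allele_freq)


def compare(exome: Optional[Observation], genome: Optional[Observation]) -> str:
    """Compare exome- and genome-derived bins after the read-depth filter."""
    exome_bin, genome_bin = usable_bin(exome), usable_bin(genome)
    if exome_bin is None and genome_bin is None:
        return "no_reliable_data"
    if exome_bin is None:
        return f"genome_only:{genome_bin}"
    if genome_bin is None:
        return f"exome_only:{exome_bin}"
    if exome_bin == genome_bin:
        return f"concordant:{exome_bin}"
    return f"divergent:{exome_bin}/{genome_bin}"


if __name__ == "__main__":
    # Exome site is below 30X, so only the genome-derived bin is kept.
    print(compare(Observation(0.06, 12), Observation(0.0005, 35)))
```

In this sketch an apparent divergence between datasets disappears when the low-depth observation is excluded, which mirrors how the abstract describes the two major bin divergences being resolved by the read-depth threshold.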
Authors | Davidson, Aimee L; Leonard, Conrad; Koufariotis, Lambros T; Parsons, Michael T; Hollway, Georgina E; Pearson, John V; Newell, Felicity; Waddell, Nicola; Spurdle, Amanda B |
---|---|
Journal | HUMAN MUTATION |
Pages | 530-536 |
Volume | 42 |
Date | 1/01/2021 |
URL | http://www.ncbi.nlm.nih.gov/pubmed/?term=10.1002/humu.24183 |