Advertisement

Assessing Limit of Detection in Clinical Sequencing

  • Elizabeth R. Starks
    Correspondence
    Address correspondence to Elizabeth R. Starks, M.Sc., Invitae, 1400 16th St., San Francisco, CA 94103; or Aly Karsan, M.D., Canada’s Michael Smith Genome Sciences Centre, BC Cancer, 570 W 7th Ave., Vancouver, BC V5Z 4S6, Canada.
    Affiliations
    Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
    Search for articles by this author
  • Lucas Swanson
    Affiliations
    Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
    Search for articles by this author
  • T. Roderick Docking
    Affiliations
    Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
    Search for articles by this author
  • Ian Bosdet
    Affiliations
    Department of Pathology & Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
    Search for articles by this author
  • Sarah Munro
    Affiliations
    Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
    Search for articles by this author
  • Richard A. Moore
    Affiliations
    Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada
    Search for articles by this author
  • Aly Karsan
    Correspondence
    Address correspondence to Elizabeth R. Starks, M.Sc., Invitae, 1400 16th St., San Francisco, CA 94103; or Aly Karsan, M.D., Canada’s Michael Smith Genome Sciences Centre, BC Cancer, 570 W 7th Ave., Vancouver, BC V5Z 4S6, Canada.
    Affiliations
    Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada

    Department of Pathology & Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
    Search for articles by this author
Published:January 21, 2021DOI:https://doi.org/10.1016/j.jmoldx.2020.12.010
      Clinical reporting of solid tumor sequencing requires reliable assessment of the accuracy and reproducibility of each assay. Somatic mutation variant allele fractions may be below 10% in many samples due to sample heterogeneity, tumor clonality, and/or sample degradation in fixatives such as formalin. The toolkits available to the clinical sequencing community for correlating assay design parameters with assay sensitivity remain limited, and large-scale empirical assessments are often relied upon due to the lack of clear theoretical grounding. To address this uncertainty, a theoretical model was developed for predicting the expected variant calling sensitivity for a given library complexity and sequencing depth. Binomial models were found to be appropriate when assay sensitivity was only limited by library complexity or sequencing depth, but functional scaling for library complexity was necessary when both library complexity and sequencing depth were co-limiting. This model was empirically validated with sequencing experiments by using a series of DNA input amounts and sequencing depths. Based on these findings, a workflow is proposed for determining the limiting factors to sensitivity in different assay designs, and the formulas for these scenarios are presented. The approach described here provides designers of clinical assays with the methods to theoretically predict assay design outcomes a priori, potentially reducing burden in clinical tumor assay design and validation efforts.
      Next-generation sequencing is now a core component of diagnosis and treatment stratification for many cancer patients, and clinical laboratories require published tools and guidelines to benchmark and validate these assays. Somatic mutations may occur at a low variant allele fraction (VAF) that challenges the detection limits of clinical assays. Low VAF may be due to a low percentage of tumor cells in the extracted tissues, or to multiple cell clones within the tumor population. There is a high demand for clinical tests that are sensitive to these low VAF somatic mutations, and clinical testing panels often promote their efficacy at low VAFs by reporting sensitivity statistics down to VAFs of 5%, or in some cases even lower.
      • Hiatt J.B.
      • Pritchard C.C.
      • Salipante S.J.
      • O’Roak B.J.
      • Shendure J.
      Single molecule molecular inversion probes for targeted, high accuracy detection of low-frequency variation.
      • Misyura M.
      • Zhang T.
      • Sukhai M.A.
      • Thomas M.
      • Garg S.
      • Kamel-Reid S.
      • Stockley T.L.
      Comparison of next-generation sequencing panels and platforms for detection and verification of somatic tumor variants for clinical diagnostics.
      • Kuderer N.M.
      • Burton K.A.
      • Blau S.
      • Rose A.L.
      • Parker S.
      • Lyman G.H.
      • Blau C.A.
      Comparison of 2 commercially available next-generation sequencing platforms in oncology.
      Genetic testing results must be sufficiently sensitive and reproducible to be applied in a clinical setting.
      • Rehm H.L.
      • Bale S.J.
      • Bayrak-Toydemir P.
      • Berg J.S.
      • Brown K.K.
      • Deignan J.L.
      • Friez M.J.
      • Funke B.H.
      • Hegde M.R.
      • Lyon E.
      Working Group of the American College of Medical Genetics and Genomics Laboratory Quality Assurance Committee
      ACMG clinical laboratory standards for next-generation sequencing.
      Correlation studies have revealed that reproducibility of genetic testing results may be below 90% between assays when methodologies and/or detection limits differ, but reproducibility may approach 100% for VAFs above 5% when comparing deeply sequenced next-generation sequencing panels on solid tumors.
      • Misyura M.
      • Zhang T.
      • Sukhai M.A.
      • Thomas M.
      • Garg S.
      • Kamel-Reid S.
      • Stockley T.L.
      Comparison of next-generation sequencing panels and platforms for detection and verification of somatic tumor variants for clinical diagnostics.
      ,
      • Kuderer N.M.
      • Burton K.A.
      • Blau S.
      • Rose A.L.
      • Parker S.
      • Lyman G.H.
      • Blau C.A.
      Comparison of 2 commercially available next-generation sequencing platforms in oncology.
      National Comprehensive Cancer Network guidelines for many cancer indications require confirmed negative results for particular mutations to stratify prognosis and treatment
      • O’Donnell M.R.
      Risk stratification and emerging treatment strategies in acute myeloid leukemia.
      ; however, with low reproducibility, negative results from many assays cannot be interpreted as true-negative findings. Accurate calculation of sensitivity and detection limits is therefore crucial to the clinical genetic panel design process, but guidelines and utilities for sensitive panel design are sparse. Jennings et al
      • Jennings L.J.
      • Arcila M.E.
      • Corless C.
      • Kamel-Reid S.
      • Lubin I.M.
      • Pfeifer J.
      • Temple-Smolkin R.L.
      • Voelkerding K.V.
      • Nikiforova M.N.
      Guidelines for validation of next-generation sequencing–based oncology panels: a joint consensus recommendation of the Association for Molecular Pathology and College of American Pathologists.
      recommend read depths of 30 for germline testing and from 500 to 1000 for somatic variant testing; however, these recommendations assume high DNA inputs, high library complexity (low read duplicate rate), and effective read de-duplication. These assumptions do not apply universally, and there are no clear models for exceptions to this best-case clinical testing scenario.
      We developed targeted clinical amplicon panels with dropletized targeted PCR and experimentally evaluated the relationships between our assay design and its variant detection performance. Dropletized PCR occurs in an emulsion to prevent primer pair interactions, and it reduces both PCR bias and stochastic amplification in library construction.
      • Tewhey R.
      • Warner J.B.
      • Nakano M.
      • Libby B.
      • Medkova M.
      • David P.H.
      • Kotsopoulos S.K.
      • Samuels M.L.
      • Hutchison J.B.
      • Larson J.W.
      • Topol E.J.
      • Weiner M.P.
      • Harismendy O.
      • Olson J.
      • Link D.R.
      • Frazer K.A.
      Microdroplet-based PCR enrichment for large-scale targeted sequencing.
      ,
      • Hori M.
      • Fukano H.
      • Suzuki Y.
      Uniform amplification of multiple DNAs by emulsion PCR.
      PCR is the differential amplification between differing genomic targets and stochastic amplification as the differential amplification of alleles from the same genomic target. In the absence of PCR bias and stochastic amplification, it is predicted that a binomial model would be the optimal model for estimating theoretical read depth requirements to meet sensitivity requirements with particular variant calling thresholds. Although library complexity and read depth are known to have an interactive effect on final assay sensitivity, no commonly applied practice that accounted for both factors simultaneously was observed. Many published recommendations for assay design and minimal read depth selection are based exclusively on simple binomial models
      • Spencer D.H.
      • Tyagi M.
      • Vallania F.
      • Bredemeyer A.J.
      • Pfeifer J.D.
      • Mitra R.D.
      • Duncavage E.J.
      Performance of common analysis methods for detecting low-frequency single nucleotide variants in targeted next-generation sequence data.
      or stochastic branching models.
      • Heinrich V.
      • Stange J.
      • Dickhaus T.
      • Imkeller P.
      • Krüger U.
      • Bauer S.
      • Mundlos S.
      • Robinson P.N.
      • Hecht J.
      • Krawitz P.M.
      The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process.
      Perhaps due to the wide diversity of targeted panel platforms, guidelines about which model to select, as well as how to tailor models to different platforms, are generally lacking. Non-binomial models are often regarded as intractable due to their complexity. Panel designers often therefore choose to apply a theta term, or corrective constant, to a binomial model after exhaustive experimentation.
      • Jennings L.J.
      • Arcila M.E.
      • Corless C.
      • Kamel-Reid S.
      • Lubin I.M.
      • Pfeifer J.
      • Temple-Smolkin R.L.
      • Voelkerding K.V.
      • Nikiforova M.N.
      Guidelines for validation of next-generation sequencing–based oncology panels: a joint consensus recommendation of the Association for Molecular Pathology and College of American Pathologists.
      Derivation of a theta term has no theoretical grounding; it applies only to the specific set of parameters that was empirically tested and therefore cannot be predicted except through brute force testing. Theta estimates can require large experimental designs at great cost to a testing center before a panel’s sensitivity can be estimated and deemed acceptable. This approach often requires multiple testing iterations, which increase the required time for assay development in addition to the cost.
      Given the disparate empirical findings and process recommendations across publications, including binomial, stochastic branching, and other modifications of binomial and Poisson distributions,
      • Jennings L.J.
      • Arcila M.E.
      • Corless C.
      • Kamel-Reid S.
      • Lubin I.M.
      • Pfeifer J.
      • Temple-Smolkin R.L.
      • Voelkerding K.V.
      • Nikiforova M.N.
      Guidelines for validation of next-generation sequencing–based oncology panels: a joint consensus recommendation of the Association for Molecular Pathology and College of American Pathologists.
      ,
      • Spencer D.H.
      • Tyagi M.
      • Vallania F.
      • Bredemeyer A.J.
      • Pfeifer J.D.
      • Mitra R.D.
      • Duncavage E.J.
      Performance of common analysis methods for detecting low-frequency single nucleotide variants in targeted next-generation sequence data.
      • Heinrich V.
      • Stange J.
      • Dickhaus T.
      • Imkeller P.
      • Krüger U.
      • Bauer S.
      • Mundlos S.
      • Robinson P.N.
      • Hecht J.
      • Krawitz P.M.
      The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process.
      • Xu C.
      A review of somatic single nucleotide variant calling algorithms for next-generation sequencing data.
      we hypothesized that the appropriate choice of model for a clinical assay was not universal and depended on four factors: DNA input quantity, sequencing depth, PCR competition, and sequence de-duplication. Here, we refer to PCR competition as the inconsistent amplification rates between different fragments during PCR and sequence de-duplication as any informatics postprocessing to eliminate PCR duplicates from a sequence library. In addition, we hypothesized that a calculation based on these factors would allow reliable estimates of assay sensitivity before any empirical testing, which could reduce the cost and effort required to optimize and validate panels.
      In the current study, we developed an approach to calculate the interactive effect of library complexity and read depth in the absence of PCR bias and stochastic amplification, which is referred to here as the “nested binomial.” The primary advantage of this approach over empirical estimation of a corrective theta term is that library complexity can be predicted a priori, and thus the sensitivity of an assay can be predicted before any laboratory testing has been performed with those parameter adjustments. To facilitate targeted sequencing panel design planning in our laboratory, R scripts was created in a library, LoD (https://github.com/lizardstarks/LoD, last accessed December 30, 2020), which assist the user in appropriately estimating assay sensitivity for a targeted sequencing panel. This package allows for two different conceptual models of assay limitations (binomial or nested binomial). User selections will determine the appropriate model for a given assay. The incorporation of these models within the tool framework allows for in silico experimentation with assay design, so that optimal arrangements within the unique design constraints of a particular assay may be determined.

      Materials and Methods

       Library Preparation and Sequencing Protocol

      Genomic DNA from peripheral blood, bone marrow, and formalin-fixed, paraffin-embedded blocks was extracted by using standard protocols. To evaluate the effects of DNA quantity on assay sensitivity to low VAFs, 20% dilutions of cancer cell lines KASUMI-6 (ATCC, Manassas, VA), OCI-AML3, and MOLM-13 (DSMZ, Braunschweig, Germany) were prepared into a healthy reference sample NA12878 as 500 ng and 1 μg DNA starting amounts. Test outcomes for a random subset of 34 formalin-fixed clinical solid tumor samples comprising 250 ng of submitted DNA were also evaluated.
      Libraries were sheared to a target size of 3 Kb using a Covaris (Woburn, MA) ultrasonicator. After shearing, the amount of sheared DNA input to amplification was quantified using fluorometry (Qubit, Thermo Fisher Scientific, Waltham, MA). In all calculations in this article, this amplification input amount was used as the “DNA input” for the LoD model estimates (Table 1).
      Table 1DNA Starting Amounts and Resulting Amplification DNA Inputs after Shearing
      MaterialDNA starting amount (ng)Amplification input (ng)
      Fresh1000670
      Fresh500338
      FFPE250125
      FFPE, formalin-fixed, paraffin-embedded.
      Samples were processed by using the RainDance ThunderStorm (RainDance Technologies, Inc., Lexington, MA) instrument with primer libraries targeting 70 to 250 bp amplicons. To minimize biased amplification rates due to sequence length, the majority (>75%) of amplicon designs were designed to only range between 200 and 250 bp in size. Sequencing was performed by using paired-end 250 base reads on a MiSeq or HiSeq 2500 (Illumina, San Diego, CA). The RainDance droplet-based PCR system partitions very small quantities of DNA (<1 target-bearing fragment per droplet on average, estimated by the vendor) with individual primer pair droplets or plexes of three to five primers, depending on the design requested (RDT 1000; RainDance Technologies, Inc.). Single-plex droplets were used for the fresh tissue protocols, and plexes of 3 to 5 primers were used for the formalin-fixed, paraffin-embedded tissue protocols. Plexes were optimized by the vendor to reduce chances of competition in cases in which the amplicon coordinates overlap and to reduce the incidence of primer cross-dimerization, a primer interaction artifact that generates large quantities of short artifact reads. Because this dropletization strategy partitions DNA fragments with small numbers of primer pairs per droplet rather than the full primer set, it is possible to isolate target-bearing fragments in droplets where they cannot be amplified. To account for loss of target-bearing fragments due to primer plexing, the amplification DNA input was divided by the number of primer pairs to estimate the number of target-bearing fragments partitioned with each primer pair. This step is only necessary for assays with partitioned amplification.

       Sequencing and Read Downsampling Protocol

      The adapters and primers from the Illumina reads were trimmed with the Python library cutadapt-1.10.
      • Martin M.
      Cutadapt removes adapter sequences from high-throughput sequencing reads.
      Reads were aligned to GRCh37 with bwa mem version 0.7.5a.
      • Li H.
      Aligning sequence reads, clone sequences and assembly contigs with bwa-mem. ArXiv.
      Variants were called with VarScan version 2.3.6,
      • Koboldt D.C.
      • Zhang Q.
      • Larson D.E.
      • Shen D.
      • McLellan M.D.
      • Lin L.
      • Miller C.A.
      • Mardis E.R.
      • Ding L.
      • Wilson R.K.
      VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing.
      with a 1% minimum alternate allele fraction threshold, a minimum average base quality of Phred = 20, and a minimum alignment quality of MAPQ = 1. Variants were then filtered to higher VAFs with custom scripts by using the VarScan VAF field.
      To evaluate the effect of sequencing depth, randomized downsampling trials were performed of variant locations called by VarScan in patient libraries in 500 replicates.
      • Bosdet I.E.
      • Docking T.R.
      • Butterfield Y.S.
      • Mungall A.J.
      • Zeng T.
      • Coope R.J.
      • Yorida E.
      • Chow K.
      • Bala M.
      • Young S.S.
      • Hirst M.
      • Birol I.
      • Moore R.A.
      • Jones S.J.
      • Marra M.A.
      • Holt R.
      • Karsan A.
      A clinically validated diagnostic second-generation sequencing assay for detection of hereditary BRCA1 and BRCA2 mutations.
      A range of paired read depths from 10 to 140 was generated by using seeds from 1 to 500 in SAMtools version 0.1.18.
      • Li H.
      • Handsaker B.
      • Wysoker A.
      • Fennell T.
      • Ruan J.
      • Homer N.
      • Marth G.
      • Abecasis G.
      • Durbin R.
      1000 Genomes Project Data Processing Subgroup
      The Sequence alignment/Map format and SAMtools.
      Variant calling was re-performed with VarScan on the downsampled BAMs and the sensitivity was quantified as a fraction of successful variant calls from the total downsampling trials. VAF distributions were analyzed by using the VAFs reported by VarScan. All downsampling and VAF distribution experiments were restricted to single nucleotide variants, with the exception of the distinctly noted FLT3 tandem duplication experiments.

       Bioinformatics Methods

      In subsequent sections, a process flow for panel sensitivity model selection was empirically validated by testing the applicability of the binomial and nested binomial models to three probability scenarios: i) an assay limited by DNA input, ii) an assay limited by total read depth, and iii) an assay co-limited by both DNA input and read depth. All empirical calculations were based on clinical sequencing panel results from the Center for Clinical Genomics (BC Cancer, Vancouver, BC, Canada). First, it was empirically validated that simplified binomial calculations (Formula A) were sufficient for assays solely limited by DNA input or read depth. The nested binomial calculations in experiments co-limited by both DNA input and read depth were then empirically tested (Formula C). Because stochastic branching models are not applicable to dropletized PCR assays due to the elimination of PCR competition, that approach was not validated.
      • Heinrich V.
      • Stange J.
      • Dickhaus T.
      • Imkeller P.
      • Krüger U.
      • Bauer S.
      • Mundlos S.
      • Robinson P.N.
      • Hecht J.
      • Krawitz P.M.
      The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process.
      Formula A: S = 1 – pbinom(D∗T, D, V)
      where S = sensitivity, D = mean unique target-bearing fragments per target base, T = variant calling VAF threshold, and V = expected VAF in the sample (50% for heterozygous germline variants, also known as the probability term).
      Formula A for tests of the VarScan2 caller was tuned to reflect the variant caller’s minimal variant calling evidence requirements (two reads). The first pbinom term, D∗T, was thresholded to be ≥2.
      Formula B is the calculation for determining the proportion of starting target-bearing fragments that will be sequenced in a given read depth with unbiased amplification.
      Formula B: 1 – (Y1Y)ˆR
      where Y is a term for the estimated genomic equivalents in the sequencing input, and R is the read depth. This calculation assumes that random sampling of any genomic fragment is equally likely, which is only valid for sequenced libraries not subject to variable amplification rates due to PCR competition, stochastic amplification, or allelic bias due to sequence content.
      • Heinrich V.
      • Stange J.
      • Dickhaus T.
      • Imkeller P.
      • Krüger U.
      • Bauer S.
      • Mundlos S.
      • Robinson P.N.
      • Hecht J.
      • Krawitz P.M.
      The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process.
      Formula B was nested within a simple binomial (Formula A) to create Formula C, which is referred to as the “nested binomial”:
      Formula C: S = 1 – (pbinom(B∗R∗T, B∗R, V)
      where S = sensitivity, R = reads per target base, T = variant calling VAF threshold, and V = expected VAF in the sample (50% for heterozygous germline variants). The result of Formula B is applied as a library complexity correction within the binomial.

      Results

       Empirical Validation of Simplified Binomial Model of Amplification Reaction Input versus Sensitivity (DNA Input Utility)

      Although guidelines for assay sensitivity often emphasize read depth,
      • Spencer D.H.
      • Tyagi M.
      • Vallania F.
      • Bredemeyer A.J.
      • Pfeifer J.D.
      • Mitra R.D.
      • Duncavage E.J.
      Performance of common analysis methods for detecting low-frequency single nucleotide variants in targeted next-generation sequence data.
      ,
      • Heinrich V.
      • Stange J.
      • Dickhaus T.
      • Imkeller P.
      • Krüger U.
      • Bauer S.
      • Mundlos S.
      • Robinson P.N.
      • Hecht J.
      • Krawitz P.M.
      The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process.
      DNA input to PCR and PCR efficiency can be crucial limiting factors to the sensitivity of an assay before sequencing. After the number of sequenced reads exceeds the number of unique genomic copies in the template, assay sensitivity improvements will asymptotically approach the sensitivity limit imposed by the number of unique genomic copies available. This means that if DNA inputs are sufficiently limiting, a simplified binomial model can be applied to evaluate the effect of changes to DNA input amounts on the unique number of template molecules available for analysis.
      Simplified binomial models (Formula A) are appropriate for assays with emulsion PCR, such as the dropletized PCR on the RainDance platform. Dropletized PCR reactions isolate genomic fragments before amplification, which prevents competitive loss of unique fragments during PCR and reduces differences in amplification efficiency between amplicons.
      Figure 1 illustrates the outcome of these calculations for a Myeloid Panel with 44 genes, which uses dropletized PCR on the RainDance platform, from initial testing (v1) and optimized DNA starting amount (v2) assay designs. The calculation presented in Figure 1 is the maximum possible assay sensitivity because it excludes any downstream limitations such as read depth. The parameter C (human DNA copies) was estimated based on an assumption of 650 g/Mol of bp. N, the DNA input amount, is the amount of DNA that remained from the initial submissions of 500 ng and 1000 ng after shearing with the Covaris instrument. C and N were multiplied to generate term I, the approximate number of genomic copies entering the targeted amplification process. P, the total plexes, represents the number of sequestered reaction volumes for targeted amplification, thus accounting for any reductions in assay sensitivity due to the division of the original DNA quantity. In this design, each droplet only contained one primer pair, which proportionally reduced the chances that a target fragment would co-occur with the correct primer pair to be amplified. If a design does not sequester primers and target fragments in this fashion, this correction will not be necessary, and term P may be set to 1. Term A, the mean target amplicon coverage, is only applicable if the target panel designs have inconsistent redundancy across particular regions of the design and accounts for the average expected coverage across all target bases. During the course of these experiments, DNA amounts submitted for sequencing were doubled from 500 ng to 1000 ng, which nearly doubled the mean unique genomic copies from 118 to 234 copies (indicated as v1 and v2 in Figure 1).
      Figure thumbnail gr1
      Figure 1Schematic for amplifiable template calculations. Example values are calculated from a subset of Myeloid Panel samples. The final term D represents the predicted mean unique copies arising from template DNA per target base.

       Theoretical Input Calculations

      To estimate the minimum necessary DNA starting amount to achieve the study’s sensitivity goals, binomial theoretical calculations were performed. Table 2 shows the relevant analysis parameters included in the theoretical model. An expected VAF of 10% was selected as the lower detection limit goal for this assay, because lower allelic fractions are expected in somatic samples than in germline tests. Although it was not the primary focus of the current evaluation, the minimum VAF threshold of the variant caller must be set suitably below the expected VAF to account for sampling error, and the effect of adjusting this parameter can be assessed by using the presented formulas. The template loss term applies a linear correction for DNA lost during stages upstream of amplification and sequencing, and it should only be used if there is an empirically verified cause of template loss during sample preparation.
      Table 2Analysis Parameters for Theoretical Assay Sensitivity
      ParameterValue
      Somatic sensitivity goal99%
      Somatic detection target VAF10%
      Somatic VAF threshold3%
      Myeloid PCR plexes1220
      Myeloid template loss termNA
      NA, not applicable; VAF, variant allele fraction.
      Figure 2 shows an extrapolation of a panel of myeloid cancer genes (Myeloid Panel) model for somatic variants present at 10% VAF, using a variant calling threshold of 3%. Given the theoretical output, a minimum Myeloid Panel DNA input amount of approximately 600 ng was selected, which is close to the theoretically derived amount of 670 ng required to achieve the desired sensitivity of 99.9% to variants with a VAF of 10%. With the minimum input, the sensitivity of the assay to 10% VAF variants will be 99.68%. This assumes that no downstream processes limit sensitivity.
      Figure thumbnail gr2
      Figure 2Myeloid Panel 10% variants with 3% variant calling threshold. Solid line = minimum Myeloid Panel DNA input of 600 ng. Dashed vertical line = theoretical recommended input of 670 ng to achieve sensitivity of 99.9% to 10% variant allele fraction variants.

       Simulation of Binomial and Stochastic Models

      To first confirm that the binomial model was distinct from and more appropriate than the stochastic branching model for our assay, the two models were simulated for a range of 20 to 100 starting target fragments per droplet (Supplemental Figure S1). The stochastic branching model was not applicable when the mean number of alternate alleles was small (<10). In this simulation experiment, the stochastic distribution overestimated sensitivity when events were very rare (≤3 alternate copies) and underestimated sensitivity at >3 alternate copies. To explain this outcome, the combined effects of a broader VAF distribution and deviations from a normal distribution when events are extremely rare must be considered. If an event is sufficiently rare that it only ever appears below a configured variant calling threshold, it will never be called with a deterministic method. However, stochastic error would sometimes oversample the variant, causing the extremely rare event to sometimes exceed the detection threshold. According to the central limit theorem, binomial events approach a normal distribution as the mean event rate increases, but binomial events will deviate substantially from normality when events are rare. The stochastic model assumes a normal distribution and is not intended to be applied to distributions deviating from normality. The stochastic distribution asymptotically approached the binomial distribution as copies continued to increase. From this comparison, we concluded that the stochastic model was unreliable for evaluation of rare events in targeted sequencing data, and the stochastic model was excluded from any subsequent comparisons.

       Empirical Validation of Binomial DNA Input Calculations

      To empirically validate the binomial model, sensitivity estimates from the sensitivity projection tool were compared with empirical observations of variants detected by using the Myeloid Panel. If read depths sufficiently exceed value D in the model, assay sensitivity will be limited exclusively by DNA input quantity.
      Variant calling sensitivity goals were defined to be able to reliably detect somatic heterozygous variants present in any clinical acute myeloid leukemia case, so that both positive and negative results would be clinically meaningful. A blast cell percentage of at least 20% is required for a diagnosis of acute myeloid leukemia. There is a direct correlation between blast count and VAF, with mean VAF corresponding to approximately 50% of blast count.
      • Welch J.S.
      • Ley T.J.
      • Link D.C.
      • Miller C.A.
      • Larson D.E.
      • Koboldt D.C.
      • et al.
      The origin and evolution of mutations in acute myeloid leukemia.
      For the clinical Myeloid Panel, the goal was set to detect VAFs of 10% with 99.9% sensitivity to avoid false-negative findings; a minimum variant calling threshold of 3% VAF was used to avoid false-positive findings due to low-frequency sequencing errors and polymerase artifacts.
      During preliminary panel testing and optimization, standard DNA starting amounts of 500 ng DNA were received from fresh marrow samples, which yielded approximately 338 ng of product after shearing to 3 Kb (Table 1). Based on the theoretical binomial calculations, it was anticipated that a 1-μg DNA starting amount was necessary to yield at least 670 ng of targeted PCR DNA input to meet our sensitivity goals (Figure 2). To empirically validate the assumptions of the model, two experiments were performed comparing the initial targeted PCR input of 338 ng (Myeloid v1) and the optimized 670-ng input (Myeloid v2) recommended by the model. To simulate the lower bound of tumor content in clinical submissions, cell line mixtures with 20% tumor content with expected heterozygous VAFs of 10% were used. Formula B was solved to determine a read count at which at least 99% of the estimated mean unique genomic fragments per targeted base would be present, eliminating the effect of read count from the experiment. Variant positions with read counts in excess of 1000 were filtered for both experiments and the expected sequenced proportion of the available unique genomic targets were noted in the applications of Formula B:
      338 ng Input (Myeloid v1): 1 – (80180)ˆ1000 = 0.9999966 (>79 unique genomic fragments of 80)
      670 ng Input (Myeloid v2): 1 – (1591159)ˆ1000 = 0.9981805 (>158 unique genomic fragments of 159)
      The calculations presented here represent regions only included in a single PCR plex (P). The theoretical projections were compared with the empirical myeloid cell line dilution test results. Simulated binomial distributions were not significantly different from the empirical observations for either DNA input (Kolmogorov-Smirnov test for equality of distributions, P > 0.1). The curve in Figure 3A includes 18 variants from six experimental dilutions, and the curve in Figure 3B includes 42 variants from 14 experimental dilutions.
      Figure thumbnail gr3
      Figure 3Comparison of the variant allele fraction (VAF) densities for DNA inputs of 338 ng and 670 ng. A and B: No statistically significant difference is noted between empirical (observed) and theoretical (simulated) observations, confirming the validity of the model (Kolmogorov-Smirnov test, P > 0.05). C: Comparison of the 338 ng and 670 ng empirical VAF distributions, for which a significant expected improvement in the assay’s limit of detection is confirmed. A narrowing of the VAF distribution is observed after increasing the DNA input from 338 ng to 670 ng. The distributions are significantly different (Kolmogorov-Smirnov test, P = 0.014). The red region highlights the proportion of variant calls that are below the VAF threshold of 3% for the assay with 338 ng but not with 670 ng.
      The negative skew in the myeloid cell line VAF curves in Figure 3 is due to the proximity of VAFs to zero, which causes the distributions to be non-normal with a slight bias toward zero. Because the broader variance causes more observations to approach zero, this effect is more pronounced as DNA input decreases.
      With the optimized DNA input of 670 ng, variants present at 10% in the original sample would have final observed VAFs below 3% for 0.024% of assayed variants. With the original DNA input of 338 ng, VAFs for 10% variants would be below 3% for 1.9% of assayed variants according to the theoretical model. Empirically, 3.7% of assayed 10% variants had VAFs below 3% with a 338-ng input (Figure 3C). The VAF distribution was significantly narrower with the higher DNA input to amplification, based on a Kolmogorov-Smirnov test (P = 0.014). These results validated the need to increase DNA starting amounts from 500 ng to 1 μg to support the detection goals. Blood or marrow extraction quantities of 1 μg were required to allow input of 670 ng of DNA into PCR amplification. This submission requirement did not increase the invasiveness to the patient, the requirement for additional patient sample, or the cost of the assay, and it significantly improved the reliability of the clinical results in our theoretical model and empirical tests.

       Empirical Validation of Read Depth Requirements versus Sensitivity (Read Depth Utility)

      If PCR de-duplication protocols effectively eliminate PCR duplicates from the final read library, it is possible to evaluate read depth requirements in isolation to establish guidelines for an informatics pipeline. This would be the case for pipelines with unique molecular identifiers or capture-based panels with sequence-based de-duplication. Due to the lack of unique molecular identifiers in the Myeloid Panel assay, and the amplicon-based nature of the assay, de-duplication based on unique molecular identifiers or sequence was not an option. Because of these factors, a nested binomial is the most optimal theoretical approach for Myeloid Panel data. However, if one factor is substantially more limiting in a nested binomial model, the model will asymptotically approach the simplified binomial described only by the limiting factor.
      With a variant calling threshold of 3%, a minimum of two variant supporting reads, and a mean detectable VAF of 10%, a ≥99% variant calling sensitivity was predicted with >100 reads using a binomial model. Over a range of read pair depths from 10 to 140, the 95% CI of a linear model for the empirical downsampling experiment overlapped the linear model generated from theoretical downsampling (Figure 4A).
      Figure thumbnail gr4
      Figure 4Variant calling sensitivity and read depth. A: Variant calling sensitivity at read depths from 1 to 140. The dashed reference lines intersect at >99% sensitivity with a read depth >100 in the downsampling experiments. Solid line indicates empirical sensitivity; dotted line indicates binomial simulation sensitivity. B: Simulated and empirical variant allele fraction (VAF) distributions. Solid line indicates variant call VAFs at read depth of 100 (n = 89); dotted line indicates random binomial simulation of VAFs (n = 500). Kolmogorov-Smirnov test for difference in theoretical and empirical distributions, P = 0.19. The vertical reference line is at the 3% variant calling threshold.
      Given the calculation in Figure 4A, ≥100 unique read pairs would be sufficient for a variant to be detectable >99% of the time (ie, present in three or more read pairs). The VarScan variant caller tested in this experiment required an absolute minimum of two supporting reads. It is best to adjust this minimum read and minimum VAF requirement to achieve a reasonably low false-positive rate from sequence errors. Because these sequence errors may also reduce sensitivity, the median base errors per position were calculated and this value was subtracted from the read depth when predicting read depth requirements. It was predicted that if restricted to a sufficiently small number of read pairs per target (≤100), and a sufficiently large number of amplified DNA copies from the target region (approximately 800), the Myeloid Panel data would conform to a simple binomial model that does not require accounting for the initial DNA input; this would be due to both the high surplus of available DNA noted in the “DNA Input Requirements” section and the uniformity of amplification efficiency in dropletized PCR. The schematic calculation for dropletized PCR predicted 159 average genomic fragments per amplicon with 670-ng DNA input in the Myeloid Panel assay (Figure 1).
      Formula B was converted to calculate the percentage of unique reads with one amplicon and five amplicons, showing that five or more unique amplicons are necessary for ≥94% of 100 reads to originate from unique template copies.
      Unique reads with 1 amplicon and 100 read pairs:
      1(1591159)100(159100)=0.74
      (1)


      Unique reads with five amplicons and 100 read pairs:
      (1(7951795))100(795100)=0.94
      (2)


      To increase the sensitivity of the dropletized amplicon design at specific targets in this experiment, the number of overlapping amplicons were increased. In particular, we placed five amplicons were placed to overlap a region where tandem duplications occur in the FLT3 gene, which translates to a mean of 795 genomic fragments (159∗5) in this region. Based on Formula B, >94% of Myeloid Panel reads in the FLT3 region should represent unique genomic fragments if read depths are ≤100. If these assumptions are met, most or all reads will originate from unique genomic fragments at read depths between 10 and 100, and the observed distribution of VAFs will not be significantly different from a simple binomial model. Therefore, the distributions of VAFs will fit a simplified binomial model wherein each read may be treated as a unique genomic copy. Random downsampling experiments were performed to a mean depth of 100 reads, as previously described in the Materials and Methods.
      To thoroughly evaluate the effects of low read depth on variant calling sensitivity, a heterozygous 52 bp tandem duplication in the FLT3 gene was selected from an experimental 20% tumor content dilution into a normal cell line sample for an expected VAF of 10%. To confirm that this variant was not prone to allelic bias during amplification, n > 60 repeats of this dilution were assessed and it was confirmed that the mean VAF was 10%, consistent with the dilution factor. The variants were downsampled to a mean high-quality (HQ) paired read depth of 100, or 200 total reads, to generate n = 89 distinct observations, as previously described in the Materials and Methods.

       VAF Distributions

      If the model assumptions are correct, the sensitivity of the tested Myeloid Panel assay with exhaustive DNA input will be limited by read depth, and any limitation from DNA amount will be sufficiently negligible to be insignificant. If only read depths are limiting, it is appropriate to perform a simple binomial calculation that considers each read as a unique genomic copy to project VAF distributions at given read depths. To test this assumption, a random binomial simulation was compared and the data downsampled in Figure 4B.
      No significant difference was found between the distribution of VAFs in the empirical data and the random binomial simulation (P = 0.19) (Figure 4B). The randomly downsampled HQ depth range extended from 95 to 105 HQ paired read depth with a mean of approximately 100 HQ paired read depth (n = 89). For a statistical comparison with similar sample size, 500 random binomial simulations of 100 read pairs were generated for a heterozygous variant (dotted line). Due to the lack of significance, it was concluded that our model assumptions are correct and that reads at depths of ≤100 can be theoretically regarded as originating from unique genomic fragments for targets enriched by five or more amplicons on the Myeloid Panel assay.
      This read downsampling experiment shows that sufficiently low read depths (100) with sufficiently high DNA inputs (approximately 800 unique genomic copies) were quantitatively indistinguishable from a simple binomial model. The capture experiment with sequence-based de-duplication performed by Spencer et al
      • Spencer D.H.
      • Tyagi M.
      • Vallania F.
      • Bredemeyer A.J.
      • Pfeifer J.D.
      • Mitra R.D.
      • Duncavage E.J.
      Performance of common analysis methods for detecting low-frequency single nucleotide variants in targeted next-generation sequence data.
      also conformed to a simple binomial model. Although both use cases are quantitatively appropriate for a simplified binomial model, it is critical to empirically verify any assumptions about DNA inputs if no methods are used to de-duplicate reads.

       Empirical Validation of Nested Binomial Model

      As shown by the high read depth sequencing experiment, as the number of sequenced reads approaches infinity, the limit of detection asymptotically approaches the number of genomic equivalents. In contrast, as read depth and genomic equivalents approach similar values, library complexity becomes a limiting factor, and many sequenced reads will be duplicates originating from the same genomic copy.
      • Jennings L.J.
      • Arcila M.E.
      • Corless C.
      • Kamel-Reid S.
      • Lubin I.M.
      • Pfeifer J.
      • Temple-Smolkin R.L.
      • Voelkerding K.V.
      • Nikiforova M.N.
      Guidelines for validation of next-generation sequencing–based oncology panels: a joint consensus recommendation of the Association for Molecular Pathology and College of American Pathologists.
      In cases in which the read depth and the number of genomic equivalents are similar, we cannot use the simple binomial forms in which only one factor is limiting. Although many sequencing strategies allow de-duplication of reads, it is still necessary to estimate and account for the actual library complexity a priori to strategize changes to a panel design, or to determine how many total sequencing reads are required per sample. To date, some guidelines have stated that this problem can only be solved empirically through a series of sequencing experiments,
      • Jennings L.J.
      • Arcila M.E.
      • Corless C.
      • Kamel-Reid S.
      • Lubin I.M.
      • Pfeifer J.
      • Temple-Smolkin R.L.
      • Voelkerding K.V.
      • Nikiforova M.N.
      Guidelines for validation of next-generation sequencing–based oncology panels: a joint consensus recommendation of the Association for Molecular Pathology and College of American Pathologists.
      whereas others have attempted to solve the problem for capture panels with PCR bias
      • Heinrich V.
      • Stange J.
      • Dickhaus T.
      • Imkeller P.
      • Krüger U.
      • Bauer S.
      • Mundlos S.
      • Robinson P.N.
      • Hecht J.
      • Krawitz P.M.
      The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process.
      ; unbiased amplification, however, has remained an open problem. It was found that Formula B, a common mathematical solution for sampling from a population with duplicates, matched our theoretical assumptions about the outcome of low library complexity in an unbiased amplification reaction.
      To empirically validate application of this formula for library complexity, a 10% frequency single nucleotide variant from the Myeloid cell line dilution experiments was randomly downsampled and n > 95 times were downsampled at a series of paired read depth intervals from 10 to 70.
      To validate these assumptions, the empirical data versus two theoretical scenarios were compared, one in which the corrective library complexity term had been applied within the binomial calculation, and one in which it was assumed that all paired reads represented unique genomic equivalents. The variant position was expected to have a maximum of 159 unique genomic fragments based on the DNA input, because the variant position was only targeted in one amplicon plex (value G in Figure 1). Due to the lack of molecular barcoding in the test data, it was necessary to assess the fit of the model by comparing an observed distribution of heterozygous germline variants to the theoretical distribution. Figure 5A shows the outcome of the read downsampling experiment with n > 95 variants per depth (blue points), a nested binomial model of the unique copies expected from the paired reads (dashed line), and a simple binomial model if all reads were from unique copies (solid line). Vertical lines intersect the predicted depth at which sensitivity exceeds 99% for each model. Although a read depth of 100 would be sufficient according to the simple binomial model, the nested binomial model predicted that approximately 275 reads would be necessary to achieve 99% sensitivity. Kolmogorov-Smirnov tests were performed of the VAF distributions at a read depth of 70 to compare the cumulative distributions against the theoretical distributions (Figure 5B). The empirical distribution was not significantly different from the calculation with the library complexity correction, which we also refer to as the “nested binomial” (P = 0.22). As expected, the 70 paired reads were not unique, and the Kolmogorov-Smirnov test for identity indicated that the empirical data did not originate from a distribution consistent with 70 unique copies (P = 0.02).
      Figure thumbnail gr5
      Figure 5Variant calling sensitivity and nested binomial variant allele fraction (VAF) distributions for a variant present at 10%. A: Comparison of Myeloid Panel variant calling sensitivity with low complexity (nested binomial) and simple binomial distributions. Individual lines represent the nested binomial model of unique copies expected from the paired reads (dashed line), and a simple binomial model assuming all reads were from unique copies (solid line). The empirical variant calling sensitivity is represented as blue points at each downsampling interval, with n > 90 trials per point. B: Comparison of Myeloid Panel VAF distributions with low complexity (nested binomial) and simple binomial distributions. Individual lines represent the read downsampling experiment with n = 95 trials at a paired read depth of 70 (blue dotted line), the nested binomial model (dashed line), and the simple binomial model (solid line).
      To evaluate the applicability of this model to different variant calling algorithms, MuTect2, an assembly-based method, was also tested and the outcomes were compared versus those of VarScan2 (Supplemental Figure S2). The loss of information caused by assembly and consensus substantially reduced sensitivity to variants, such that sensitivity to events was approximately halved at read depths exceeding 50. The algorithms of MuTect2 are optimized for tiled reads rather than amplicons. Sequence artifacts in individual reads were misidentified as distinct alleles, which caused a broader VAF distribution, occasional failure to include the expected variant allele in the final assembly, and an ultimately lower variant calling sensitivity. The observed discrepancy in performance between VarScan2 and MuTect2 also resembles the discrepancy in the sensitivity curves of the binomial and stochastic simulations, supporting the conclusion that the MuTect2 assembly outcomes are stochastic depending on sequence errors present in each distinct read downsampling trial (Supplemental Figure S1). Given the finding that assembly-based callers are non-binomial, the application of the binomial and nested binomial models to variant callers is recommended after empirical confirmation that the variant caller behavior is in fact binomial. It is also recommended that users of variant callers carefully consider the impact of their variant calling approaches on variant calling sensitivity. Overall, VarScan2 exhibited superior performance to MuTect2 due to its deterministic behavior and binomial distribution.

       Effects of Fixative Degradation

      To examine the functional effect of sample fixative DNA degradation on our model, the same assumptions were evaluated on fresh patient samples versus patient samples fixed in formalin to verify that a corrective loss term could account for the template loss due to DNA fragmentation in the fixative. Due to the lack of known expected VAFs for the patient somatic variants, this experiment was performed on the VAF distributions of the presumed germline heterozygous single-nucleotide polymorphisms by using the Single Nucleotide Polymorphism Database v141 global population rate of ≥1%. Based on laboratory quantification of amplification yields, it was determined that only approximately 30% of the FFPE DNA extract was in the amplifiable size range (Supplemental Figure S3). An amplifiable fraction term L = 0.3 was therefore applied. The full end-to-end calculation including the amplifiable fraction term is presented in Figure 6.
      Figure thumbnail gr6
      Figure 6Nested binomial assay sensitivity estimation parameters. The first limiting factor included in the model is DNA input to the PCR amplification reaction. Subsequent to PCR amplification, downstream processes may introduce inefficiencies that further limit assay sensitivity. Read depth is the final limiting factor included in the model. Because DNA input is independent but interactive, the final nested binomial calculation was performed with the R (reads per base) and Y (final target-bearing fragments per base) terms as inputs.
      For the fresh samples, germline heterozygous variants were randomly downsampled from five samples from patients with a myeloid malignancy to generate 348 variant calls in a dataset with Y = 380 genomic equivalents and R = 150 paired reads. The fraction of genomic equivalents yielded by Formula B would be 0.326, and thus the final estimated genomic equivalents in the sequencing output would be 124. Assuming that reads were randomly distributed between fragments, it is valid to perform the binomial calculation using this sequencing output parameter as the sampling population. Kolmogorov-Smirnov tests were performed to compare the cumulative distributions against the theoretical distributions (Supplemental Figure S4A). The empirical distribution was not significantly different from the calculation with the library complexity correction, which is also referred to as the “nested binomial” (P = 0.211). As expected, the 150 paired reads were not unique, and the Kolmogorov-Smirnov test for identity indicates that the empirical data did not originate from a distribution consistent with 150 unique copies (P = 0.045). The significance was marginal and the visual difference in the distributions was slight due to the small effect size of the difference between 124 versus 150 copies.
      Because tumor blocks from FFPE material can often be limiting for sample input, for this experiment 250 ng of submitted DNA that sheared to 125-ng input DNA available for targeted PCR was used and, divided across 214 plexes. The mean number of estimated genomic copies per position was only 50 with such a low input. At this input, >95% of those 50 original copies (48 copies) would be represented in 150 paired reads (Formula C). Three distributions were calculated in Supplemental Figure S4B: the calculated DNA copies per position (dashed line), the observed variation per position (dotted line), and the read depth per position (solid line). This comparison was performed with the same assumptions as presented in Supplemental Figure S4A, and it was confirmed that the scaling was predictable as a linear scaling of 30% of the targeted PCR input (Kolmogorov-Smirnov test, P = 0.08). The Kolmogorov-Smirnov test for identity indicates that the empirical data did not originate from a distribution consistent with 150 unique copies (P < 0.005). Given the linear scaling of this corrective term, this meant that approximately three times as much input DNA was required to achieve a similar limit of detection with a formalin sample relative to a fresh sample.

       Model Selection Process Flow

      As we hypothesized, our emulsion PCR amplicon panels could fit either a “simple” or a “nested” binomial model, depending on whether the DNA input and sequencing capacity were both limiting sensitivity, or only one of the two factors was limiting sensitivity. Based on these findings, we propose the concept depicted in Figure 7 to anticipate which model is most appropriate for a panel design. It was found that appropriate application of the correct model resulted in reliable estimation of library complexity, or the number of DNA copies represented, for a set of experiments ranging from 50 to 234 estimated DNA copies (Figure 8). The observed VAF distributions at these differing DNA quantities illustrate the importance of sufficient library complexity for adequate sensitivity and reliable VAF estimation.
      Figure thumbnail gr7
      Figure 7Model selection flow for minimum panel read depth estimation. In this decision flow schematic, the terminal nodes are the appropriate theoretical models for an assay. “Simplified binomial” models are models that are limited solely by DNA input or read depth, rather than being limited by both factors. “Nested binomial” models are models that account for interacting limitations from DNA input and read depth. The “Stochastic branching” model is consistent with the model published by Heinrich et al
      • Heinrich V.
      • Stange J.
      • Dickhaus T.
      • Imkeller P.
      • Krüger U.
      • Bauer S.
      • Mundlos S.
      • Robinson P.N.
      • Hecht J.
      • Krawitz P.M.
      The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process.
      and accounts for DNA losses during competitive PCR (dashed line, not evaluated in this experiment). UMI = unique molecular identifier.
      Figure thumbnail gr8
      Figure 8Variant allele fraction (VAF) distributions relative to DNA copies. The VAF distributions for each experiment evaluated in this article, ranging from 50 to 234 DNA copies, were compared. The distributions are presented separately for heterozygous germline variants present at 50% and somatic variants present at 10%, with reference lines at both VAFs. FFPE = formalin-fixed, paraffin-embedded.
      The conceptual diagram in Figure 7 summarizes the proposed decision-making process required for correct determination of which of the three discussed models (binomial, nested binomial, or stochastic branching) is most appropriate for calculating the sensitivity of a given assay. Because the resulting calculations of the three theoretical models presented in Figure 7 can be disparate, it is crucial to make the appropriate model selection so that informed decisions regarding assay design and assay sensitivity can be reached.
      Mathematically, it would still be appropriate to use the nested model in every scenario in which the “simplified” binomial solution applies, and we recommend this as the most prudent option. However, we recognize that the “simple” binomial can be more straightforward and convenient when a term required by the “nested” model is not limiting sensitivity and also not known.

      Discussion

      We applied existing sensitivity models to targeted gene panels developed at our clinical sequencing center to accurately predict the sensitivity of our assay and to accurately predict how sensitivity would change with DNA input or sequencing effort without performing large costly experiments. It was found that the binomial model was only applicable in scenarios with a single limiting factor, stochastic branching models were conceptually inappropriate for our dropletized PCR strategy, and library complexity correction must be formulaic and cannot be addressed by correction with a static constant. Furthermore, it was found that after accounting for library complexity, variant calling algorithms could alter the final distribution of outcomes. The discrepancies observed in these various experimental conditions illustrate how essential careful model selection is for decision-making during assay design. By use of a thorough model of the limiting factors in next-generation sequencing assays, the outcomes of changes to our panel designs could accurately be predicted and DNA input and read depth requirements adjusted accordingly for optimal reporting reliability.
      The effects of stochastic amplification and PCR bias were not empirically evaluated in this article due to mitigation of both effects from our own library procedures but we propose that a stochastic branching model is likely best in these scenarios (Figure 7). Stochastic branching models are appropriate for assays with stochastic amplification, such as hybridization capture panels that are not dropletized and lack a de-duplication step. In assays with stochastic amplification, many genomic fragments may be lost stochastically over several PCR cycles. Heinrich et al
      • Heinrich V.
      • Stange J.
      • Dickhaus T.
      • Imkeller P.
      • Krüger U.
      • Bauer S.
      • Mundlos S.
      • Robinson P.N.
      • Hecht J.
      • Krawitz P.M.
      The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process.
      observed an asymptotic limit to sensitivity at a read depth of 30 in a hybridization capture experiment; although that assay had a large starting DNA input amount of approximately 1450 genomic copies (equivalent to approximately 5 ng of DNA), the library lost substantial DNA complexity due to stochastic losses of genomic fragments during PCR cycles. As shown by our variant caller comparison, stochastic losses may also be modeled informatically if the variant caller uses branching algorithms (Supplemental Figure S2). Application of a nested binomial as the input to a stochastic branching function to model the complexity of such a process is recommended after estimation of the appropriate model terms.
      Although these model outcomes were evaluated in a specific panel design type with dropletized targeted PCR, the conceptual framework outlined in Figure 7 has much broader applicability. Following the guidelines proposed in Figure 7, read depth requirements could be estimated with a simple binomial calculation for de-duplicated capture or whole-genome sequencing data.
      • Griffith M.
      • Miller C.A.
      • Griffith O.L.
      • Krysiak K.
      • Skidmore Z.L.
      • Ramu A.
      • Walker J.R.
      • Dang H.X.
      • Trani L.
      • Larson D.E.
      • Demeter R.T.
      • Wendl M.C.
      • McMichael J.F.
      • Austin R.E.
      • Magrini V.
      • McGrath S.D.
      • Ly A.
      • Kulkarni S.
      • Cordes M.G.
      • Fronick C.C.
      • Fulton R.S.
      • Maher C.A.
      • Ding L.
      • Klco J.M.
      • Mardis E.R.
      • Ley T.J.
      • Wilson R.K.
      Optimizing cancer genome sequencing and analysis.
      Other published experiments already report the conformity of de-duplicated capture read to a simple binomial distribution,
      • Spencer D.H.
      • Tyagi M.
      • Vallania F.
      • Bredemeyer A.J.
      • Pfeifer J.D.
      • Mitra R.D.
      • Duncavage E.J.
      Performance of common analysis methods for detecting low-frequency single nucleotide variants in targeted next-generation sequence data.
      and this practice is so widely accepted that it is now a recommendation in clinical practice guidelines.
      • Jennings L.J.
      • Arcila M.E.
      • Corless C.
      • Kamel-Reid S.
      • Lubin I.M.
      • Pfeifer J.
      • Temple-Smolkin R.L.
      • Voelkerding K.V.
      • Nikiforova M.N.
      Guidelines for validation of next-generation sequencing–based oncology panels: a joint consensus recommendation of the Association for Molecular Pathology and College of American Pathologists.
      PCR-free assays may require no correction for library complexity, because library complexity should be 100%, and the simple binomial approach would be applicable. However, many protocols still include a minimum of one round of amplification that can introduce duplicates, and some sequencing strategies will introduce high rates of optical duplicates.
      In conclusion, the approach outlined here provides clinical laboratories with a tool to evaluate the co-limiting elements of DNA and sequencing depth in silico during assay validation, and it provides some practical examples of the final effects of variant calling algorithms on assay sensitivity. These practices will allow laboratories to limit potentially harmful false-negative results.

      Author Contributions

      E.R.S. designed experiments, analyzed data, and wrote the manuscript; coordinated project and contributed to study design, L.S., T.R.D., and R.A.M. designed and coordinated experiments; I.B. designed the study; S.M. analyzed data; A.K. designed and coordinated experiments and wrote the manuscript.

      Supplemental Data

      Figure thumbnail figs1
      Supplemental Figure S1Stochastic and binomial sensitivity curves. We simulated the two models for a range of 20 to 100 starting target fragments per droplet, with N = 10,000 trials per starting fragment count.
      Figure thumbnail figs2
      Supplemental Figure S2MuTect2 and VarScan2 sensitivity comparison. This figure is a repeat of the experiment in . MuTect2 sensitivity differed markedly from that of VarScan2 for the same downsampling experiment. The vertical dashed lines indicate the number of reads required to achieve 99% (horizontal dashed line) sensitivity to a 10% variant for a simple binomial or nested binomial distribution.
      Figure thumbnail figs3
      Supplemental Figure S3Amplicon yield quantifications for n = 700 extractions from 500-ng DNA starting amounts. Vertical reference lines intersect the median yield for formalin-fixed, paraffin-embedded (FFPE) samples and samples fixed in optimal cutting temperature compound (OCT) or frozen fresh. The yield reduction due to FFPE was approximately 70%. The negative shift in FFPE yields was significant, based on a Wilcoxon rank sum test with continuity correction (P < 5 × 10−16).
      Figure thumbnail figs4
      Supplemental Figure S4Nested binomial variant allele fraction (VAF) distributions for fresh and fixed tissues. A: Comparison of Myeloid Panel VAF distributions with low complexity (nested binomial) and simple binomial distributions. Individual lines represent a read downsampling experiment with n = 348 variants that have approximately 150 paired reads of coverage (dotted line), a random binomial simulation of the 124 unique copies expected from the 150 paired reads (dashed line), and a random binomial simulation of 150 paired reads if all reads were from unique copies (solid line). The empirical VAF distribution is not significantly different from random simulations of 124 reads but significantly differs from 150 reads (Kolmogorov-Smirnov test, P = 0.045). B: Formalin-fixed, paraffin-embedded low DNA input comparison of VAF distributions with low complexity (nested binomial) and simple binomial distributions. Individual lines represent the read downsampling experiment with n = 75 that have approximately 150 paired reads of coverage (dotted line), a random binomial simulation of the 48 unique copies expected from 150 paired reads (dashed line), and a random binomial simulation of 150 reads if all reads were from unique copies (solid line). There is a substantial difference in the anticipated distribution even in the ideal case presented here for a pure tumor extraction with a heterozygous variant (Kolmogorov-Smirnov test, P < 0.005).

      References

        • Hiatt J.B.
        • Pritchard C.C.
        • Salipante S.J.
        • O’Roak B.J.
        • Shendure J.
        Single molecule molecular inversion probes for targeted, high accuracy detection of low-frequency variation.
        Genome Res. 2013; 23: 843-854
        • Misyura M.
        • Zhang T.
        • Sukhai M.A.
        • Thomas M.
        • Garg S.
        • Kamel-Reid S.
        • Stockley T.L.
        Comparison of next-generation sequencing panels and platforms for detection and verification of somatic tumor variants for clinical diagnostics.
        J Mol Diagn. 2016; 18: 842-850
        • Kuderer N.M.
        • Burton K.A.
        • Blau S.
        • Rose A.L.
        • Parker S.
        • Lyman G.H.
        • Blau C.A.
        Comparison of 2 commercially available next-generation sequencing platforms in oncology.
        JAMA Oncol. 2017; 3: 996-998
        • Rehm H.L.
        • Bale S.J.
        • Bayrak-Toydemir P.
        • Berg J.S.
        • Brown K.K.
        • Deignan J.L.
        • Friez M.J.
        • Funke B.H.
        • Hegde M.R.
        • Lyon E.
        • Working Group of the American College of Medical Genetics and Genomics Laboratory Quality Assurance Committee
        ACMG clinical laboratory standards for next-generation sequencing.
        Genet Med. 2013; 15: 733-747
        • O’Donnell M.R.
        Risk stratification and emerging treatment strategies in acute myeloid leukemia.
        J Natl Compr Cancer Netw. 2017; 11 Suppl 5: 667-669
        • Jennings L.J.
        • Arcila M.E.
        • Corless C.
        • Kamel-Reid S.
        • Lubin I.M.
        • Pfeifer J.
        • Temple-Smolkin R.L.
        • Voelkerding K.V.
        • Nikiforova M.N.
        Guidelines for validation of next-generation sequencing–based oncology panels: a joint consensus recommendation of the Association for Molecular Pathology and College of American Pathologists.
        J Mol Diagn. 2017; 19: 341-365
        • Tewhey R.
        • Warner J.B.
        • Nakano M.
        • Libby B.
        • Medkova M.
        • David P.H.
        • Kotsopoulos S.K.
        • Samuels M.L.
        • Hutchison J.B.
        • Larson J.W.
        • Topol E.J.
        • Weiner M.P.
        • Harismendy O.
        • Olson J.
        • Link D.R.
        • Frazer K.A.
        Microdroplet-based PCR enrichment for large-scale targeted sequencing.
        Nat Biotechnol. 2009; 27: 1025-1031
        • Hori M.
        • Fukano H.
        • Suzuki Y.
        Uniform amplification of multiple DNAs by emulsion PCR.
        Biochem Biophys Res Commun. 2007; 352: 323-328
        • Spencer D.H.
        • Tyagi M.
        • Vallania F.
        • Bredemeyer A.J.
        • Pfeifer J.D.
        • Mitra R.D.
        • Duncavage E.J.
        Performance of common analysis methods for detecting low-frequency single nucleotide variants in targeted next-generation sequence data.
        J Mol Diagn. 2014; 16: 75-88
        • Heinrich V.
        • Stange J.
        • Dickhaus T.
        • Imkeller P.
        • Krüger U.
        • Bauer S.
        • Mundlos S.
        • Robinson P.N.
        • Hecht J.
        • Krawitz P.M.
        The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process.
        Nucleic Acids Res. 2012; 40: 2426-2431
        • Xu C.
        A review of somatic single nucleotide variant calling algorithms for next-generation sequencing data.
        Comput Struct Biotechnol J. 2018; 16: 15-24
        • Martin M.
        Cutadapt removes adapter sequences from high-throughput sequencing reads.
        EMBnet.J. 2011; 17: 10-12
        • Li H.
        Aligning sequence reads, clone sequences and assembly contigs with bwa-mem. ArXiv.
        2013: 1303
        • Koboldt D.C.
        • Zhang Q.
        • Larson D.E.
        • Shen D.
        • McLellan M.D.
        • Lin L.
        • Miller C.A.
        • Mardis E.R.
        • Ding L.
        • Wilson R.K.
        VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing.
        Genome Res. 2012; 22: 568-576
        • Bosdet I.E.
        • Docking T.R.
        • Butterfield Y.S.
        • Mungall A.J.
        • Zeng T.
        • Coope R.J.
        • Yorida E.
        • Chow K.
        • Bala M.
        • Young S.S.
        • Hirst M.
        • Birol I.
        • Moore R.A.
        • Jones S.J.
        • Marra M.A.
        • Holt R.
        • Karsan A.
        A clinically validated diagnostic second-generation sequencing assay for detection of hereditary BRCA1 and BRCA2 mutations.
        J Mol Diagn. 2013; 15: 796-809
        • Li H.
        • Handsaker B.
        • Wysoker A.
        • Fennell T.
        • Ruan J.
        • Homer N.
        • Marth G.
        • Abecasis G.
        • Durbin R.
        • 1000 Genomes Project Data Processing Subgroup
        The Sequence alignment/Map format and SAMtools.
        Bioinformatics. 2009; 25: 2078-2079
        • Welch J.S.
        • Ley T.J.
        • Link D.C.
        • Miller C.A.
        • Larson D.E.
        • Koboldt D.C.
        • et al.
        The origin and evolution of mutations in acute myeloid leukemia.
        Cell. 2012; 150: 264-278
        • Griffith M.
        • Miller C.A.
        • Griffith O.L.
        • Krysiak K.
        • Skidmore Z.L.
        • Ramu A.
        • Walker J.R.
        • Dang H.X.
        • Trani L.
        • Larson D.E.
        • Demeter R.T.
        • Wendl M.C.
        • McMichael J.F.
        • Austin R.E.
        • Magrini V.
        • McGrath S.D.
        • Ly A.
        • Kulkarni S.
        • Cordes M.G.
        • Fronick C.C.
        • Fulton R.S.
        • Maher C.A.
        • Ding L.
        • Klco J.M.
        • Mardis E.R.
        • Ley T.J.
        • Wilson R.K.
        Optimizing cancer genome sequencing and analysis.
        Cell Syst. 2015; 1: 210-223