A Novel and Reliable Method to Detect Microsatellite Instability in Colorectal Cancer by Next-Generation Sequencing

  • Lizhen Zhu
    Affiliations
    Department of Medical Oncology, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China

    Cancer Institute Key Laboratory of Cancer Prevention and Intervention, Chinese National Ministry of Education, Key Laboratory of Molecular Biology in Medical Sciences, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
  • Yanqin Huang
    Affiliations
    Cancer Institute Key Laboratory of Cancer Prevention and Intervention, Chinese National Ministry of Education, Key Laboratory of Molecular Biology in Medical Sciences, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
  • Xuefeng Fang
    Affiliations
    Department of Medical Oncology, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China

    Cancer Institute Key Laboratory of Cancer Prevention and Intervention, Chinese National Ministry of Education, Key Laboratory of Molecular Biology in Medical Sciences, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
  • Chenglin Liu
    Affiliations
    Department of Bioinformatics, Burning Rock Biotech, Guangzhou, China
  • Wanglong Deng
    Affiliations
    Department of Research and Development, Burning Rock Biotech, Guangzhou, China
  • Chenhan Zhong
    Affiliations
    Department of Medical Oncology, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China

    Cancer Institute Key Laboratory of Cancer Prevention and Intervention, Chinese National Ministry of Education, Key Laboratory of Molecular Biology in Medical Sciences, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
  • Jinghong Xu
    Affiliations
    Department of Pathology, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
  • Dong Xu
    Affiliations
    Department of Surgical Oncology, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
  • Ying Yuan
    Correspondence
    Address correspondence to Ying Yuan, M.D., Department of Medical Oncology, The Second Affiliated Hospital, Zhejiang University School of Medicine, 88 Jie-fang Road, Hangzhou, Zhejiang 310009, China.
    Affiliations
    Department of Medical Oncology, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China

    Cancer Institute Key Laboratory of Cancer Prevention and Intervention, Chinese National Ministry of Education, Key Laboratory of Molecular Biology in Medical Sciences, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
Open Archive. Published: December 19, 2017. DOI: https://doi.org/10.1016/j.jmoldx.2017.11.007
      Two types of molecular tests have been established to assess the deficiency of the DNA mismatch repair (MMR) system: microsatellite instability (MSI) and immunohistochemical (IHC) analysis. We have developed a reliable method to analyze the MSI status by next-generation sequencing (NGS) based on read-count distribution. A total of 91 patients with primary colorectal cancer were recruited. These patients included 54 cases with loss of expression of any MMR protein in IHC, suggesting deficient MMR (dMMR), and 37 cases of colorectal cancer with staining of all four MMR proteins in IHC, suggesting proficient MMR in the sample after surgery. DNA was extracted from paired tumor–normal tissue for MSI detection by both the ColonCore NGS panel and PCR. The sequencing data from the NGS panel was processed using various MSI detection pipelines for a comparison with the ColonCore panel. Using the MSI-PCR test as the gold standard, MSI-ColonCore achieved 97.9% sensitivity (47 of 48) and 100% specificity (37 of 37) for the detection of MSI status. MSI-ColonCore also showed more efficient and robust performance compared with other NGS-based MSI detection algorithms. The concordance rate was 92.3% between MSI-ColonCore and IHC testing, and 93.4% between MSI-PCR and IHC testing. These results suggest that MSI-ColonCore is a reliable and robust method for MSI status detection by NGS based on read-count distribution.
      Microsatellites are tandem DNA repeats with one to six bases in coding and noncoding regions throughout the genome. The polymerase slippage during DNA synthesis leads to accumulation of mutations in microsatellites, and the two main types of errors are base–base mismatches and insertion–deletion. These errors are usually detected and corrected by the DNA mismatch repair (MMR) system. Deficient MMR (dMMR) activity caused by germline mutations or hypermethylation of MMR genes can lead to a hypermutable phenotype at the genomic level, named microsatellite instability (MSI).
      • Duval A.
      • Hamelin R.
      Mutations at coding repeat sequences in mismatch repair-deficient human cancers: toward a new concept of target genes for instability.
      Therefore, the MMR function can be detected by MSI analysis or immunohistochemical (IHC) loss of expression of any MMR proteins.
      Testing of DNA MMR function is used not only for the initial molecular screening for Lynch syndrome, a major type of hereditary colorectal cancer (CRC) characterized by germline mutations in MMR genes, but also for selecting patients suitable for immunotherapy. Anti–programmed death-1 (anti–PD-1) therapies have achieved significant success in various MSI-high (MSI-H)/dMMR cancers, and pembrolizumab (an anti–PD-1 therapy) was recently approved by the Food and Drug Administration for the treatment of patients with unresectable or metastatic solid tumors identified as MSI-H or dMMR.
      • Moreira L.
      • Balaguer F.
      • Lindor N.
      • de la Chapelle A.
      • Hampel H.
      • Aaltonen L.A.
      • Hopper J.L.
      • Le Marchand L.
      • Gallinger S.
      • Newcomb P.A.
      • Haile R.
      • Thibodeau S.N.
      • Gunawardena S.
      • Jenkins M.A.
      • Buchanan D.D.
      • Potter J.D.
      • Baron J.A.
      • Ahnen D.J.
      • Moreno V.
      • Andreu M.
      • Ponz de Leon M.
      • Rustgi A.K.
      • Castells A.
      EPICOLON Consortium
      Identification of Lynch syndrome among patients with colorectal cancer.
      • Le D.T.
      • Uram J.N.
      • Wang H.
      • Bartlett B.R.
      • Kemberling H.
      • Eyring A.D.
      • Skora A.D.
      • Luber B.S.
      • Azad N.S.
      • Laheru D.
      • Biedrzycki B.
      • Donehower R.C.
      • Zaheer A.
      • Fisher G.A.
      • Crocenzi T.S.
      • Lee J.J.
      • Duffy S.M.
      • Goldberg R.M.
      • de la Chapelle A.
      • Koshiji M.
      • Bhaijee F.
      • Huebner T.
      • Hruban R.H.
      • Wood L.D.
      • Cuka N.
      • Pardoll D.M.
      • Papadopoulos N.
      • Kinzler K.W.
      • Zhou S.
      • Cornish T.C.
      • Taube J.M.
      • Anders R.A.
      • Eshleman J.R.
      • Vogelstein B.
      • Diaz Jr., L.A.
      PD-1 blockade in tumors with mismatch-repair deficiency.
      Before the era of massively parallel DNA sequencing, MSI was detected by PCR-based methods at specific microsatellite markers, and CRCs were classified as MSI-high (MSI-H), MSI-low (MSI-L), or microsatellite stable (MSS) according to the proportion of unstable markers.
      • Boland C.R.
      • Thibodeau S.N.
      • Hamilton S.R.
      • Sidransky D.
      • Eshleman J.R.
      • Burt R.W.
      • Meltzer S.J.
      • Rodriguez-Bigas M.A.
      • Fodde R.
      • Ranzani G.N.
      • Srivastava S.
      A National Cancer Institute Workshop on Microsatellite Instability for cancer detection and familial predisposition: development of international criteria for the determination of microsatellite instability in colorectal cancer.
      • Suraweera N.
      • Duval A.
      • Reperant M.
      • Vaury C.
      • Furlan D.
      • Leroy K.
      • Seruca R.
      • Iacopetta B.
      • Hamelin R.
      Evaluation of tumor microsatellite instability using five quasimonomorphic mono-nucleotide repeats and pentaplex PCR.
      As next-generation sequencing (NGS) is increasingly applied to detect tumor gene mutations, combining MSI status and mutation detection in the same sequencing run would substantially reduce the amount of tissue required and increase testing efficiency.
      Currently, two types of methods have been proposed for the detection of MSI status by NGS. The first type tries to postulate MSI status from mutation burden, which is usually detected by whole-exome sequencing, and demonstrates a significant correlation between total mutation burden and MSI status.
      • Huang M.N.
      • McPherson J.R.
      • Cutcutache I.
      • Teh B.T.
      • Tan P.
      • Rozen S.G.
      MSIseq: software for assessing microsatellite instability from catalogs of somatic mutations.
      • Nowak J.A.
      • Yurgelun M.B.
      • Bruce J.L.
      • Rojas-Rudilla V.
      • Hall D.L.
      • Shivdasani P.
      • Garcia E.P.
      • Agoston A.T.
      • Srivastava A.
      • Ogino S.
      • Kuo F.C.
      • Lindeman N.I.
      • Dong F.
      Detection of mismatch repair deficiency and microsatellite instability in colorectal adenocarcinoma by targeted next-generation sequencing.
      • Stadler Z.K.
      • Battaglin F.
      • Middha S.
      • Hechtman J.F.
      • Tran C.
      • Cercek A.
      • Yaeger R.
      • Segal N.H.
      • Varghese A.M.
      • Reidy-Lagunes D.L.
      • Kemeny N.E.
      • Salo-Mullen E.E.
      • Ashraf A.
      • Weiser M.R.
      • Garcia-Aguilar J.
      • Robson M.E.
      • Offit K.
      • Arcila M.E.
      • Berger M.F.
      • Shia J.
      • Solit D.B.
      • Saltz L.B.
      Reliable detection of mismatch repair deficiency in colorectal cancers using mutational load in next-generation sequencing panels.
      The other type of method directly measures the level of MSI by the read-count distribution of a selected set of loci with different repeat lengths. Current approaches based on read-count distribution include MSIsensor and mSINGS.
      • Niu B.
      • Ye K.
      • Zhang Q.
      • Lu C.
      • Xie M.
      • McLellan M.D.
      • Wendl M.C.
      • Ding L.
      MSIsensor: microsatellite instability detection using paired tumor-normal sequence data.
      • Salipante S.J.
      • Scroggins S.M.
      • Hampel H.L.
      • Turner E.H.
      • Pritchard C.C.
      Microsatellite instability detection by next generation sequencing.
      MSIsensor requires paired tumor and normal samples, and compares the histograms of read counts covering different repeat lengths of each locus using a standard χ² test. A locus is considered length-instable if the adjusted P-value is less than a predetermined threshold. The percentage of length-instable loci is used to determine the MSI status. mSINGS does not require normal controls for MSI status detection. It calls a locus length-instable if the number of distinct repeat lengths observed is larger than [reference mean + 3 × SD].
      The differences in length-instable loci percentages are statistically significant between MSI-H and MSS samples for both methods.
      Here, we have developed a new reliable algorithm to analyze MSI by NGS read-count distribution, and compare its performance with MSIsensor and mSINGS. This algorithm, combined with the ColonCore panel, has also been validated against conventional PCR-MSI tests in a pool of samples with known IHC status of four major MMR proteins: MLH1, MSH2, MSH6, and PMS2.

      Materials and Methods

      With the approval of the Ethics Committee of the Second Affiliated Hospital of Zhejiang University School of Medicine and the informed consent of all patients or their relatives, a total of 91 patients with primary CRC were recruited from January 2015 to January 2017. Among these, 54 cases were randomly selected from CRCs with IHC loss of expression of any of the four MMR proteins (MLH1, MSH2, MSH6, and PMS2), and 37 cases were randomly recruited from those with intact expression of all four MMR proteins. The cohort was therefore not a random sample of all CRCs. For each tumor, formalin-fixed, paraffin-embedded tumor tissue with a necrotic area of ≤50% was obtained after surgery, together with normal tissue from the negative surgical margin. The percent tumor cellularity was checked: 80.2% of samples (73 of 91) had ≥50% tumor cells, 19.8% (18 of 91) ranged from 30% to 50%, and none was lower than 30%. DNA was extracted from paired tumor-normal tissue for MSI detection by NGS and PCR. Each group of researchers interpreting the MSI-ColonCore status (C.L. and W.D.), the IHC results (J.X. and other pathologists routinely reading IHC results in clinical practice), and the MSI-PCR results (L.Z. and Y.H.) was blinded to the results of the other two tests.

       MSI Detection by ColonCore Panel

      The ColonCore panel (Burning Rock, Guangzhou, China) is designed for simultaneous detection of MSI status and mutations in 36 CRC-related genes, including KRAS, NRAS, BRAF, hereditary CRC genes, and other genes related to carcinogenesis and tumor development (Supplemental Table S1). The MSI phenotype detection method of MSI-ColonCore is a read-count–distribution-based approach. It uses the coverage ratio of a specific set of repeat lengths as the main characteristic of each microsatellite locus, and categorizes a locus as unstable if the coverage ratio is less than a given threshold. The MSI status of a sample is determined by the percentage of unstable loci in the given sample. The details of the method are described below. The raw reads are deposited in the NCBI Sequence Read Archive (SRA; https://trace.ncbi.nlm.nih.gov/Traces/sra; accession number SRP119517).

       Data Preprocessing

      To determine the MSI status, the microsatellite loci were first scanned from the reference genome, and the number of reads aligned to the loci of different repeat lengths was calculated in a training set of samples of known MSI status. The scanned loci were restricted to mononucleotide repeats because those were reported as the most sensitive and specific for MSI detection.
      • Umar A.
      • Boland C.R.
      • Terdiman J.P.
      • Syngal S.
      • de la Chapelle A.
      • Rüschoff J.
      • Fishel R.
      • Lindor N.M.
      • Burgart L.J.
      • Hamelin R.
      • Hamilton S.R.
      • Hiatt R.A.
      • Jass J.
      • Lindblom A.
      • Lynch H.T.
      • Peltomaki P.
      • Ramsey S.D.
      • Rodriguez-Bigas M.A.
      • Vasen H.F.
      • Hawk E.T.
      • Barrett J.C.
      • Freedman A.N.
      • Srivastava S.
      Revised Bethesda Guidelines for hereditary nonpolyposis colorectal cancer (Lynch syndrome) and microsatellite instability.
      Sequencing reads were aligned by Burrows-Wheeler Aligner software version 0.7.10 against the reference genome (hg19/GRCh37).
      • Li H.
      • Durbin R.
      Fast and accurate short read alignment with Burrows-Wheeler transform.
      Reads aligned to the loci at every possible repeat length were counted respectively using the same strategy as proposed by MSIsensor.
      • Niu B.
      • Ye K.
      • Zhang Q.
      • Lu C.
      • Xie M.
      • McLellan M.D.
      • Wendl M.C.
      • Ding L.
      MSIsensor: microsatellite instability detection using paired tumor-normal sequence data.
      To obtain the read counts with highest specificity and also allow for sequencing errors, especially for loci with low coverage, two versions of read-count statistics were generated by applying perfect match restriction and allowing 1-bp mismatch, respectively, in the alignment.
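The counting step can be pictured with a minimal Python sketch. It is an illustration only, not the authors' or MSIsensor's implementation: the function names are hypothetical, the reads are toy strings, only 5-bp flanks are used, and only the perfect-match version is shown (the 1-bp-mismatch variant is not implemented here).

```python
import re
from collections import Counter

def repeat_length(read, left_flank, unit, right_flank):
    """Return the number of repeat units found between the two flanks, or None.

    Perfect-match only: the read must contain left flank, a run of `unit`,
    and right flank with no mismatches (an assumption of this sketch).
    """
    pattern = (re.escape(left_flank)
               + "((?:" + re.escape(unit) + ")+)"
               + re.escape(right_flank))
    match = re.search(pattern, read)
    if match is None:
        return None
    return len(match.group(1)) // len(unit)

def length_histogram(reads, left_flank, unit, right_flank):
    """Count reads per observed repeat length at one locus."""
    lengths = (repeat_length(r, left_flank, unit, right_flank) for r in reads)
    return Counter(n for n in lengths if n is not None)

# Toy reads around a poly-T locus with flanks ATTCC / GCTTT (hypothetical data).
reads = [
    "GGATTCC" + "T" * 14 + "GCTTTAA",
    "GGATTCC" + "T" * 14 + "GCTTTAA",
    "GGATTCC" + "T" * 13 + "GCTTTAA",
    "ACGTACGTACGT",  # does not span the locus, so it is ignored
]
histogram = length_histogram(reads, "ATTCC", "T", "GCTTT")
```

The resulting histogram maps each repeat length to its read count, which is the input used by the baseline and classification steps.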

       Loci Characterization and Baseline Construction

      Among the loci scanned from the reference genome, those that showed high consistency between the stability of their repeat lengths and the MSI status of the corresponding sample were selected as marker microsatellite loci. In microsatellite stable samples, the lengths of homopolymers at the marker microsatellite loci were relatively stable; in other words, reads aligned to only a few repeat lengths. For each microsatellite locus, the repeat length covered by the largest number of reads was called the peak length, and its read count was called the peak count. Repeat lengths covered by no less than 75% of the peak count were recorded for each normal sample in the training set. This length set was called the reference length set, which was then used for baseline construction. At the baseline construction stage, the ratio of the read count covering the reference length set to the total read count covering all possible lengths of the locus was calculated for each normal control. The mean coverage ratio and its SD across all normal samples in the training set were then calculated, and the threshold of [mean − 3 × SD] was set as the lower limit for a length-stable locus.
      A locus with a coverage ratio less than the threshold was considered a length-instable one.
      The final set of marker microsatellite loci was then selected using the training sample set with the following criteria: the locus was length-instable in more than 75% of MSI samples, was length-stable in more than 75% of MSS samples, and had an average Spearman correlation higher than 0.8 between the ratios of reads covering each repeat length of the locus for each pair of normal samples.
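The baseline construction described above can be sketched in Python as follows. This is a hedged illustration rather than the published pipeline: the function names and read counts are hypothetical, the per-sample length sets are combined by union (the text does not state how they are merged), and the sample standard deviation is used (the text does not say whether sample or population SD was applied).

```python
from statistics import mean, stdev

def peak_reference_lengths(histogram, fraction=0.75):
    """Repeat lengths whose read count is at least `fraction` of the peak count."""
    peak_count = max(histogram.values())
    return {length for length, count in histogram.items()
            if count >= fraction * peak_count}

def coverage_ratio(histogram, reference_lengths):
    """Reads covering the reference length set divided by all reads at the locus."""
    total = sum(histogram.values())
    covered = sum(count for length, count in histogram.items()
                  if length in reference_lengths)
    return covered / total

# Hypothetical read-count histograms (repeat length -> reads) for one locus
# in three normal training samples.
normals = [
    {14: 80, 13: 15, 15: 5},
    {14: 70, 13: 20, 15: 10},
    {14: 75, 13: 15, 15: 10},
]

# Assumption: the reference length set is the union of the per-sample sets.
reference_lengths = set().union(*(peak_reference_lengths(h) for h in normals))

ratios = [coverage_ratio(h, reference_lengths) for h in normals]
threshold = mean(ratios) - 3 * stdev(ratios)  # lower limit for a stable locus

# A hypothetical tumor histogram with reads spread over shorter lengths.
tumor = {14: 40, 13: 25, 12: 20, 11: 15}
tumor_is_unstable = coverage_ratio(tumor, reference_lengths) < threshold
```

In this toy case the reference length set contains only the peak length, and the tumor locus falls below the threshold because its reads are spread over several shorter repeat lengths.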

       MSI Status Determination for Samples

      After the selection of marker microsatellite loci and the establishment of the ratio reference, the MSI status of a tumor sample could be determined based on the percentage of length-instable loci, without a paired normal control. For each marker locus, the read-count histogram was constructed, and the coverage ratio of the reference length set was calculated and compared with the reference threshold. A locus with a coverage ratio less than [mean − 3 × SD] of the reference ratio was considered a length-instable locus. A tumor sample was considered MSI-H if more than 40% of the marker loci were length-instable, MSS if the percentage of length-instable loci was <15%, or MSI-L if the percentage was between 15% and 40%.
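A short Python sketch of this decision rule follows. The function names and tumor ratios are hypothetical; the baseline means and SDs are taken from three rows of Table 1, whereas the real panel uses 22 marker loci. Assigning exactly 15% or exactly 40% to MSI-L is an assumption, since the text does not specify how the boundaries are handled.

```python
def unstable_fraction(tumor_ratios, thresholds):
    """Fraction of marker loci whose coverage ratio is below the locus threshold."""
    unstable = sum(1 for locus, ratio in tumor_ratios.items()
                   if ratio < thresholds[locus])
    return unstable / len(tumor_ratios)

def classify_msi(fraction, high_cutoff=0.40, low_cutoff=0.15):
    """Map the fraction of length-instable loci to an MSI status."""
    if fraction > high_cutoff:
        return "MSI-H"
    if fraction < low_cutoff:
        return "MSS"
    return "MSI-L"  # assumption: boundary values fall into MSI-L

# Baseline (mean, SD) for three loci as listed in Table 1.
baseline = {
    "MS-BR1": (0.661, 0.039),
    "MS-BR2": (0.913, 0.019),
    "MS-BR4": (0.974, 0.011),
}
thresholds = {locus: m - 3 * sd for locus, (m, sd) in baseline.items()}

# Hypothetical coverage ratios observed in one tumor sample.
tumor_ratios = {"MS-BR1": 0.30, "MS-BR2": 0.90, "MS-BR4": 0.50}

status = classify_msi(unstable_fraction(tumor_ratios, thresholds))
```

Because the thresholds come from the baseline alone, no paired normal sample is needed at classification time.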

       Performance Evaluation Index for the NGS-Based Approaches

      MSI status reported by PCR was set as the ground truth, with MSI-H (PCR) samples as positive and MSS (PCR) samples as negative. Four widely used measurements were adopted for performance evaluation, including sensitivity (SN), specificity (SP), accuracy (ACC), and Matthews correlation coefficient, illustrated as follows:
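      $$\mathrm{SN} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \tag{1}$$

      $$\mathrm{SP} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}} \tag{2}$$

      $$\mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}} \tag{3}$$

      $$\mathrm{MCC} = \frac{\mathrm{TP} \times \mathrm{TN} - \mathrm{FP} \times \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}} \tag{4}$$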


      where TP, TN, FP, and FN denoted the true positive, true negative, false positive, and false negative, respectively.

       MSI Detection by PCR

      MSI-PCR testing was performed as the gold standard of MSI status. Genomic DNA extracted from all paired tumor-normal samples was tested by Beijing Microread Genetics Co. Ltd. using the MSI detection kit (patent number ZL 201110152226.X; Microread Genetics Co. Ltd., Beijing, China) on the ABI 3730xl Genetic Analyzer (Applied Biosystems, Foster City, CA). The panel used for the MSI analysis consisted of nine markers, including six mononucleotide repeat sequences (NR-21, BAT-26, NR-27, BAT-25, NR-24, and MONO-27), two pentanucleotide repeat sites (Penta C and Penta D), and a sex locus (Amelogenin). Penta C, Penta D, and Amelogenin were used for sample contamination control only. Data were collected and analyzed with the GeneMapper software version 4.0 (Applied Biosystems). A marker was defined as instable when the fluorescence profile of the amplified microsatellite DNA from tumor tissue contained peaks that were absent from the corresponding profile of the paired normal tissue. Samples were categorized as MSI-H (≥2 mononucleotide markers instable), MSI-L (one mononucleotide marker instable), or MSS (no mononucleotide marker instable). The resulting MSI-H cutoff was 33.3% (two of six markers), within the range of 30% to 40%.
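The PCR categorization rule is simple enough to state as a small Python function. The function name is hypothetical, and the sketch covers only the counting rule, not the peak calling performed in GeneMapper.

```python
def classify_pcr(unstable_markers, total_markers=6):
    """Categorize a sample from the number of instable mononucleotide markers."""
    if not 0 <= unstable_markers <= total_markers:
        raise ValueError("unstable_markers must lie between 0 and total_markers")
    if unstable_markers >= 2:
        return "MSI-H"
    if unstable_markers == 1:
        return "MSI-L"
    return "MSS"

# Two of six markers is the smallest MSI-H result, i.e. a 33.3% cutoff.
cutoff_percent = round(2 / 6 * 100, 1)
```

Only the six mononucleotide markers enter this count; Penta C, Penta D, and Amelogenin serve as contamination controls.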

       MMR Analysis by IHC Staining

      Immunohistochemistry staining of CRCs was performed to examine the expression of four MMR proteins, MLH1, MSH2, MSH6, and PMS2, on formalin-fixed, paraffin-embedded tissue. Primary monoclonal antibodies against MLH1 (clone ES05, dilution 1:50; Dako Cytomation, Carpinteria, CA), MSH2 (clone FE11, dilution 1:50; Oncogene Research Products, Boston, MA), MSH6 (clone EP49, dilution 1:150; Dako Cytomation), and PMS2 (clone EP51, dilution 1:50; Dako Cytomation) were used with external controls. dMMR was interpreted when any of these MMR proteins were totally absent in the nuclear staining of tumor tissue while present in nuclear staining of adjacent benign tissue, and any convincing nuclear staining of all of these four proteins was considered proficient MMR. The IHC results were assessed by two specialized pathologists (J.X. and other pathologists routinely reading the IHC results in clinical practice), and only concordant samples were included in the present study.

       Statistical Analysis

      Categorical variables were analyzed using the χ² test, and continuous variables were analyzed using the unpaired t-test. Two-sided P < 0.05 was considered statistically significant.

      Results

       Marker Microsatellites Loci of ColonCore Panel

      Across all target regions of the ColonCore panel, which covers 36 CRC-related genes, 90 microsatellite loci with homopolymers no less than 8 bp long were identified. The read-count ratio covering the reference length set was calculated for each locus using the normal controls of the training sample set (20 tumor-normal sample pairs with previously determined MSI phenotype), and the lower limit for a stable locus was set as [mean − 3 × SD].
      Twenty-two marker microsatellite loci were selected as the final set according to criteria described in the Materials and Methods. The marker loci and the baseline statistics are listed in Table 1.
      Table 1. Colorectal Cancer–Specific Marker Microsatellite Loci and Baseline Statistics

      Locus identity | Chr | Position  | Homopolymer | Left-mer* | Right-mer† | Ratio, mean (SD) | Mismatch, n
      MS-BR1         | 1   | 161332091 | 14[T]       | ATTCC     | GCTTT      | 0.661 (0.039)    | 0
      MS-BR2         | 2   | 47635523  | 13[T]       | TGTAC     | AAGGA      | 0.913 (0.019)    | 0
      MS-BR3‡        | 2   | 47641559  | 27[A]       | CAGGT     | GGGTT      | 0.738 (0.032)    | 1
      MS-BR4         | 2   | 48032740  | 13[T]       | TGTGA     | AAGGT      | 0.974 (0.011)    | 0
      MS-BR5         | 2   | 48033890  | 18[T]       | AAAAC     | AATTT      | 0.873 (0.055)    | 0
      MS-BR6‡        | 2   | 95849361  | 23[T]       | TCCTA     | GTGAG      | 0.747 (0.058)    | 0
      MS-BR7‡        | 4   | 55598211  | 25[T]       | TTTGA     | GAGAA      | 0.440 (0.037)    | 0
      MS-BR8         | 7   | 6037057   | 17[A]       | AACTG     | TTCAC      | 0.895 (0.042)    | 0
      MS-BR9         | 7   | 116381121 | 16[T]       | TGGTG     | GGTTT      | 0.794 (0.052)    | 0
      MS-BR10        | 7   | 116409675 | 15[T]       | CAACC     | CCTTT      | 0.875 (0.034)    | 0
      MS-BR11        | 11  | 108114661 | 15[T]       | AATAA     | AAGAA      | 0.750 (0.043)    | 0
      MS-BR12        | 11  | 108121410 | 15[T]       | TATCC     | AGGCT      | 0.784 (0.064)    | 0
      MS-BR13        | 11  | 108141955 | 15[T]       | TGAAC     | ACCAC      | 0.638 (0.031)    | 0
      MS-BR14        | 11  | 108188266 | 13[T]       | CTTGA     | GCCTC      | 0.853 (0.057)    | 0
      MS-BR15        | 11  | 108195976 | 19[T]       | CATAG     | CATTT      | 0.733 (0.072)    | 0
      MS-BR16‡       | 11  | 125490765 | 21[T]       | GAAGA     | AATAT      | 0.832 (0.046)    | 0
      MS-BR17        | 12  | 133237753 | 14[A]       | ACCTG     | GGCAA      | 0.724 (0.038)    | 0
      MS-BR18        | 13  | 32905219  | 12[T]       | TTTGA     | GAGGT      | 0.913 (0.021)    | 0
      MS-BR19        | 13  | 32907535  | 11[T]       | CTGTC     | GTAAA      | 0.913 (0.019)    | 0
      MS-BR20‡       | 14  | 23652346  | 21[A]       | TTGCT     | GGCCA      | 0.792 (0.089)    | 1
      MS-BR21        | 15  | 91303325  | 12[T]       | AAGAC     | CCCTC      | 0.816 (0.031)    | 0
      MS-BR22        | 18  | 48584855  | 16[T]       | GGCTA     | GGTAG      | 0.776 (0.059)    | 1

      The homopolymer is described using the repeat length and the repeat unit. For example, MS-BR1 is the locus on chromosome 1 with a homopolymer of 14 Ts. The flanking sequences on the left and right sides of the homopolymer are ATTCC and GCTTT, respectively. Mismatch describes the maximum number of mismatches allowed when counting the reads aligned to the loci of different repeat lengths. Three marker loci allowed one mismatch during the alignment.
      * The left side of the homopolymer.
      † The right side of the homopolymer.
      ‡ These loci were also used in the PCR method.

       Validation of the MSI-ColonCore MSI Method versus Conventional PCR-MSI

      MSI-ColonCore achieved 97.9% sensitivity (47 of 48) and 100% specificity (37 of 37) for the detection of MSI when MSI-PCR testing was performed as the gold standard, with one PCR MSI-H sample labeled as MSI-L (Table 2).
      Table 2. Correlation of MSI-ColonCore, MSI-PCR, and IHC

      IHC  | MSI-ColonCore, n: MSI-H | MSI-ColonCore, n: MSI-L/MSS | MSI-PCR, n: MSI-H | MSI-PCR, n: MSI-L/MSS
      dMMR | 47                      | 7                           | 48                | 6
      pMMR | 0                       | 37                          | 0                 | 37

      dMMR, deficient mismatch repair; pMMR, proficient mismatch repair.

       Correlation between MSI and IHC Status

      According to MSI-ColonCore, all 37 IHC proficient MMR cases were identified as MSS, whereas 47 of 54 IHC dMMR cases were MSI-H, with the remaining 7 interpreted as MSS/MSI-L. The MSI-PCR results were almost identical to those of MSI-ColonCore, with two exceptions: one case interpreted as MSI-L by MSI-ColonCore was MSI-H by MSI-PCR, and another case interpreted as MSI-L by MSI-ColonCore was MSS by MSI-PCR. Therefore, the concordance rate was 92.3% between MSI-ColonCore and IHC testing, and 93.4% between MSI-PCR and IHC testing (Table 2).
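The two concordance rates follow directly from the counts in Table 2, as the following Python check shows. The function name is hypothetical. Agreement is taken to mean dMMR with MSI-H or proficient MMR with MSI-L/MSS, which is the reading that reproduces the reported percentages.

```python
def concordance(dmmr_msi_h, dmmr_other, pmmr_msi_h, pmmr_other):
    """Share of cases where IHC and MSI agree (dMMR/MSI-H or pMMR/non-MSI-H)."""
    agree = dmmr_msi_h + pmmr_other
    total = dmmr_msi_h + dmmr_other + pmmr_msi_h + pmmr_other
    return agree / total

# Counts from Table 2: (dMMR MSI-H, dMMR MSI-L/MSS, pMMR MSI-H, pMMR MSI-L/MSS).
coloncore_vs_ihc = concordance(47, 7, 0, 37)  # 84 of 91 cases agree
pcr_vs_ihc = concordance(48, 6, 0, 37)        # 85 of 91 cases agree

coloncore_percent = round(coloncore_vs_ihc * 100, 1)
pcr_percent = round(pcr_vs_ihc * 100, 1)
```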

       Comparison of MSI Status Detection Ability among Different NGS-Based Methods

      Here, we compared the performance of our approach with that of two previously published read-count–distribution-based methods: MSIsensor software version 0.2 and mSINGS software version 2.0. The MSI status of 79 samples was first determined by the PCR method as the ground truth. One sample was reported as MSI-L by the PCR method and was excluded from further performance evaluation. The 90 microsatellite loci with homopolymers no less than 8 bp long were scanned out of the target region and used as marker loci for MSIsensor. The threshold for the percentage of length-instable loci was set at 30% for MSIsensor, the value at which it achieved its best performance. For mSINGS, loci that may cause artifacts were excluded from the baseline according to the recommendation of the software, leaving 44 marker loci. The threshold for the percentage of length-instable loci was set at the default value of 10% recommended by mSINGS. The performance indexes of the three methods are shown in Table 3; the ColonCore panel achieved the best performance. In addition, because the algorithm does not depend on a paired normal sample, it is more practical in clinical applications.
      Table 3. Performance of MSI Status Detection for 91 Samples

      Method                | Accuracy, % | Sensitivity, % | Precision, % | Matthews correlation coefficient
      ColonCore panel       | 98.90       | 97.92          | 100          | 0.978
      MSIsensor version 0.2 | 96.70       | 97.92          | 95.35        | 0.934
      mSINGS version 2.0    | 97.80       | 95.83          | 100          | 0.957

       ColonCore: A More Robust MSI Status Detection Method versus mSINGS

      Because the percentage of length-instable loci is the key index for distinguishing the MSI status of a sample, the distributions of this percentage in MSS and MSI-H samples were compared among the three methods. The two groups were most clearly separated by the ColonCore panel, which demonstrates its robustness in MSI status detection (Figure 1).
      Figure 1. The percentages of length-instable loci in MSS and MSI-H samples for the ColonCore panel, MSIsensor, and mSINGS. Each dot represents one sample. The percentages are more readily distinguished in the ColonCore panel than in MSIsensor and mSINGS.

       Mutation Burden Comparison between MSI-H and MSS Samples

      A high correlation between MSI status and mutation burden has been demonstrated by recent reports based on whole-exome sequencing. The mutation burden per Mb of MSI-H and MSS samples reported by MSI-ColonCore is presented as two violin plots (Figure 2). Although MSI-H samples tended to have a higher mutation burden, the two types of samples were not as clearly separated as previously reported.
      Figure 2. Mutation burden of MSS and MSI-H samples reported by the PCR method. Each dot represents one sample. The mutation burden is not highly distinguishable between these two types of samples.

      Discussion

      The deficiency of DNA mismatch repair system can be assessed through approaches at two different levels: the genomic level (MSI analysis, PCR or NGS based) and protein level (IHC tests of MMR proteins). It has been reported that MSI analysis and IHC testing are highly related, with a concordance rate ranging from 84.5% to 98.6%.
      • Moreira L.
      • Balaguer F.
      • Lindor N.
      • de la Chapelle A.
      • Hampel H.
      • Aaltonen L.A.
      • Hopper J.L.
      • Le Marchand L.
      • Gallinger S.
      • Newcomb P.A.
      • Haile R.
      • Thibodeau S.N.
      • Gunawardena S.
      • Jenkins M.A.
      • Buchanan D.D.
      • Potter J.D.
      • Baron J.A.
      • Ahnen D.J.
      • Moreno V.
      • Andreu M.
      • Ponz de Leon M.
      • Rustgi A.K.
      • Castells A.
      EPICOLON Consortium
      Identification of Lynch syndrome among patients with colorectal cancer.
      • Yan W.Y.
      • Hu J.
      • Xie L.
      • Cheng L.
      • Yang M.
      • Li L.
      • Shi J.
      • Liu B.R.
      • Qian X.P.
      Prediction of biological behavior and prognosis of colorectal cancer patients by tumor MSI/MMR in the Chinese population.
      • Yuan L.
      • Chi Y.
      • Chen W.
      • Chen X.
      • Wei P.
      • Sheng W.
      • Zhou X.
      • Shi D.
      Immunohistochemistry and microsatellite instability analysis in molecular subtyping of colorectal carcinoma based on mismatch repair competency.
      • Hampel H.
      • Frankel W.L.
      • Martin E.
      • Arnold M.
      • Khanduja K.
      • Kuebler P.
      • Clendenning M.
      • Sotamaa K.
      • Prior T.
      • Westman J.A.
      • Panescu J.
      • Fix D.
      • Lockman J.
      • LaJeunesse J.
      • Comeras I.
      • de la Chapelle A.
      Feasibility of screening for Lynch syndrome among patients with colorectal cancer.
      Here, the concordance rate was 92.3% between MSI-ColonCore and IHC testing, and 93.4% between MSI-PCR and IHC testing.
      At the genomic level, PCR-based approaches have been the gold standard for MSI analysis. With the development of NGS technology, NGS-based MSI analysis has been increasingly adopted for its two major advantages. First, NGS sequencing panels, when properly designed, can simultaneously capture the mutation spectrum and the MSI status in CRC patients, reducing the amount of tissue sample required and simplifying the testing process. Second, after NGS-based MSI analysis properly constructs a baseline reference set, as in the ColonCore panel, the need for normal control samples is eliminated, which will benefit patients without surgery, especially for most metastatic cancers.
      As described earlier, NGS-based MSI analyses fall into two categories: the mutation burden approach and the read-count distribution approach. The application of the mutation burden approach in clinical practice is limited by the need for large and costly sequencing panels of hundreds of genes or even whole-exome analysis, because mutation burden calculated from small panels tends to deviate from the real value. For example, in the present study, the mutation burden calculated from our panel of 36 hotspot genes is clearly overestimated compared with that from large panels, and because of the small number of genes, even adjusted mutation burdens will be biased. Another technical challenge for mutation-burden–based MSI analysis is that cutoff values between MSI-H, MSI-L, and MSS samples have to be defined for each specific sequencing panel.
      By contrast, read-count–distribution-based MSI analysis not only shows high consistency with the gold standard but also suits clinical applications because it is compatible with smaller, cheaper, and more efficient sequencing panels such as the ColonCore panel in this study. It is also more versatile because the cutoff values are easy to define for any given panel using logic similar to that of PCR-based approaches: the percentage of unstable microsatellite loci. Although all read-count–distribution-based NGS methods achieved similar performance in this experiment, MSI-ColonCore was more robust than MSIsensor and mSINGS because the percentages of length-unstable loci were more readily distinguished between the MSI-H and MSS samples. In addition, once a baseline reference set has been properly constructed, MSI-ColonCore does not need the normal controls that MSIsensor requires.
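      The cutoff logic based on the percentage of unstable loci can be sketched as follows. This is a minimal illustration, not the MSI-ColonCore implementation: the per-locus stability calls are taken as given, and the two cutoff values (`high_cutoff`, `low_cutoff`) are hypothetical placeholders rather than thresholds reported in the study.

```python
from dataclasses import dataclass


@dataclass
class LocusCall:
    name: str
    unstable: bool  # True if the locus was judged length-unstable


def fraction_unstable(calls):
    """Return the fraction of evaluable loci that were called unstable."""
    if not calls:
        raise ValueError("no evaluable loci")
    return sum(1 for c in calls if c.unstable) / len(calls)


def classify_msi(calls, high_cutoff=0.4, low_cutoff=0.15):
    """Classify a sample from its fraction of unstable loci.

    Both cutoffs are illustrative assumptions; in practice they would
    be calibrated for each panel against a reference method.
    """
    frac = fraction_unstable(calls)
    if frac >= high_cutoff:
        return "MSI-H"
    if frac >= low_cutoff:
        return "MSI-L"
    return "MSS"
```

      With these placeholder cutoffs, a sample with 5 of 10 loci unstable would be labeled MSI-H, one with 2 of 10 MSI-L, and one with none MSS. The wider the gap between the unstable fractions of MSI-H and MSS samples, the less the result depends on exactly where the cutoff is placed.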
      Although MSI-PCR and IHC, two methods for detecting MMR system function, are well established and relatively inexpensive, they have limited capability to multiplex. By contrast, NGS allows for large-scale parallel sequencing and has proved to be a cost-effective and accurate tool for the parallel profiling of different forms of genetic abnormalities, including mutations, fusions, and amplifications, across a large number of genes. This information cannot be provided by MSI-PCR or IHC but is very important in clinical practice. In addition, previous studies have discussed the power of NGS in clinical practice.
      • Kurian A.W.
      • Hare E.E.
      • Mills M.A.
      • Kingham K.E.
      • McPherson L.
      • Whittemore A.S.
      • McGuire V.
      • Ladabaum U.
      • Kobayashi Y.
      • Lincoln S.E.
      • Cargill M.
      • Ford J.M.
      Clinical evaluation of a multiple-gene sequencing panel for hereditary cancer risk assessment.
      • Shen T.
      • Pajaro-Van de Stadt S.H.
      • Yeat N.C.
      • Lin J.C.
      Clinical applications of next generation sequencing in cancer: from panels, to exomes, to genomes.
      Furthermore, the performance of MSI-ColonCore is comparable with that of the gold standard, MSI-PCR. Therefore, the NGS-based ColonCore panel is cost-effective and promising for clinical practice.
      Our study is to some extent limited by the relatively small number of cases, because only 15% of CRCs are driven by MMR deficiency.
      • Grady W.M.
      • Carethers J.M.
      Genomic and epigenetic instability in colorectal cancer pathogenesis.
      More cases will be recruited to further validate our findings, and the capability of MSI-ColonCore will also be tested in other types of cancers with high MSI, such as endometrial cancer and gastric cancer.
      In summary, MSI-ColonCore detects MSI accurately and is more robust than current NGS methods based on read-count distribution.

      Acknowledgments

      We thank Haiyan Xu and Han Han-Zhang for medical editorial assistance and Zhou Zhang for data analysis.

      References

        • Duval A.
        • Hamelin R.
        Mutations at coding repeat sequences in mismatch repair-deficient human cancers: toward a new concept of target genes for instability.
        Cancer Res. 2002; 62: 2447-2454
        • Moreira L.
        • Balaguer F.
        • Lindor N.
        • de la Chapelle A.
        • Hampel H.
        • Aaltonen L.A.
        • Hopper J.L.
        • Le Marchand L.
        • Gallinger S.
        • Newcomb P.A.
        • Haile R.
        • Thibodeau S.N.
        • Gunawardena S.
        • Jenkins M.A.
        • Buchanan D.D.
        • Potter J.D.
        • Baron J.A.
        • Ahnen D.J.
        • Moreno V.
        • Andreu M.
        • Ponz de Leon M.
        • Rustgi A.K.
        • Castells A.
        • EPICOLON Consortium
        Identification of Lynch syndrome among patients with colorectal cancer.
        JAMA. 2012; 308: 1555-1565
        • Le D.T.
        • Uram J.N.
        • Wang H.
        • Bartlett B.R.
        • Kemberling H.
        • Eyring A.D.
        • Skora A.D.
        • Luber B.S.
        • Azad N.S.
        • Laheru D.
        • Biedrzycki B.
        • Donehower R.C.
        • Zaheer A.
        • Fisher G.A.
        • Crocenzi T.S.
        • Lee J.J.
        • Duffy S.M.
        • Goldberg R.M.
        • de la Chapelle A.
        • Koshiji M.
        • Bhaijee F.
        • Huebner T.
        • Hruban R.H.
        • Wood L.D.
        • Cuka N.
        • Pardoll D.M.
        • Papadopoulos N.
        • Kinzler K.W.
        • Zhou S.
        • Cornish T.C.
        • Taube J.M.
        • Anders R.A.
        • Eshleman J.R.
        • Vogelstein B.
        • Diaz Jr., L.A.
        PD-1 blockade in tumors with mismatch-repair deficiency.
        N Engl J Med. 2015; 372: 2509-2520
        • Boland C.R.
        • Thibodeau S.N.
        • Hamilton S.R.
        • Sidransky D.
        • Eshleman J.R.
        • Burt R.W.
        • Meltzer S.J.
        • Rodriguez-Bigas M.A.
        • Fodde R.
        • Ranzani G.N.
        • Srivastava S.
        A National Cancer Institute Workshop on Microsatellite Instability for cancer detection and familial predisposition: development of international criteria for the determination of microsatellite instability in colorectal cancer.
        Cancer Res. 1998; 58: 5248-5257
        • Suraweera N.
        • Duval A.
        • Reperant M.
        • Vaury C.
        • Furlan D.
        • Leroy K.
        • Seruca R.
        • Iacopetta B.
        • Hamelin R.
        Evaluation of tumor microsatellite instability using five quasimonomorphic mono-nucleotide repeats and pentaplex PCR.
        Gastroenterology. 2002; 123: 1804-1811
        • Huang M.N.
        • McPherson J.R.
        • Cutcutache I.
        • Teh B.T.
        • Tan P.
        • Rozen S.G.
        MSIseq: software for assessing microsatellite instability from catalogs of somatic mutations.
        Sci Rep. 2015; 5: 13321
        • Nowak J.A.
        • Yurgelun M.B.
        • Bruce J.L.
        • Rojas-Rudilla V.
        • Hall D.L.
        • Shivdasani P.
        • Garcia E.P.
        • Agoston A.T.
        • Srivastava A.
        • Ogino S.
        • Kuo F.C.
        • Lindeman N.I.
        • Dong F.
        Detection of mismatch repair deficiency and microsatellite instability in colorectal adenocarcinoma by targeted next-generation sequencing.
        J Mol Diagn. 2017; 19: 84-91
        • Stadler Z.K.
        • Battaglin F.
        • Middha S.
        • Hechtman J.F.
        • Tran C.
        • Cercek A.
        • Yaeger R.
        • Segal N.H.
        • Varghese A.M.
        • Reidy-Lagunes D.L.
        • Kemeny N.E.
        • Salo-Mullen E.E.
        • Ashraf A.
        • Weiser M.R.
        • Garcia-Aguilar J.
        • Robson M.E.
        • Offit K.
        • Arcila M.E.
        • Berger M.F.
        • Shia J.
        • Solit D.B.
        • Saltz L.B.
        Reliable detection of mismatch repair deficiency in colorectal cancers using mutational load in next-generation sequencing panels.
        J Clin Oncol. 2016; 34: 2141-2147
        • Niu B.
        • Ye K.
        • Zhang Q.
        • Lu C.
        • Xie M.
        • McLellan M.D.
        • Wendl M.C.
        • Ding L.
        MSIsensor: microsatellite instability detection using paired tumor-normal sequence data.
        Bioinformatics. 2014; 30: 1015-1016
        • Salipante S.J.
        • Scroggins S.M.
        • Hampel H.L.
        • Turner E.H.
        • Pritchard C.C.
        Microsatellite instability detection by next generation sequencing.
        Clin Chem. 2014; 60: 1192-1199
        • Umar A.
        • Boland C.R.
        • Terdiman J.P.
        • Syngal S.
        • de la Chapelle A.
        • Rüschoff J.
        • Fishel R.
        • Lindor N.M.
        • Burgart L.J.
        • Hamelin R.
        • Hamilton S.R.
        • Hiatt R.A.
        • Jass J.
        • Lindblom A.
        • Lynch H.T.
        • Peltomaki P.
        • Ramsey S.D.
        • Rodriguez-Bigas M.A.
        • Vasen H.F.
        • Hawk E.T.
        • Barrett J.C.
        • Freedman A.N.
        • Srivastava S.
        Revised Bethesda Guidelines for hereditary nonpolyposis colorectal cancer (Lynch syndrome) and microsatellite instability.
        J Natl Cancer Inst. 2004; 96: 261-268
        • Li H.
        • Durbin R.
        Fast and accurate short read alignment with Burrows-Wheeler transform.
        Bioinformatics. 2009; 25: 1754-1760
        • Yan W.Y.
        • Hu J.
        • Xie L.
        • Cheng L.
        • Yang M.
        • Li L.
        • Shi J.
        • Liu B.R.
        • Qian X.P.
        Prediction of biological behavior and prognosis of colorectal cancer patients by tumor MSI/MMR in the Chinese population.
        Onco Targets Ther. 2016; 9: 7415-7424
        • Yuan L.
        • Chi Y.
        • Chen W.
        • Chen X.
        • Wei P.
        • Sheng W.
        • Zhou X.
        • Shi D.
        Immunohistochemistry and microsatellite instability analysis in molecular subtyping of colorectal carcinoma based on mismatch repair competency.
        Int J Clin Exp Med. 2015; 8: 20988-21000
        • Hampel H.
        • Frankel W.L.
        • Martin E.
        • Arnold M.
        • Khanduja K.
        • Kuebler P.
        • Clendenning M.
        • Sotamaa K.
        • Prior T.
        • Westman J.A.
        • Panescu J.
        • Fix D.
        • Lockman J.
        • LaJeunesse J.
        • Comeras I.
        • de la Chapelle A.
        Feasibility of screening for Lynch syndrome among patients with colorectal cancer.
        J Clin Oncol. 2008; 26: 5783-5788
        • Kurian A.W.
        • Hare E.E.
        • Mills M.A.
        • Kingham K.E.
        • McPherson L.
        • Whittemore A.S.
        • McGuire V.
        • Ladabaum U.
        • Kobayashi Y.
        • Lincoln S.E.
        • Cargill M.
        • Ford J.M.
        Clinical evaluation of a multiple-gene sequencing panel for hereditary cancer risk assessment.
        J Clin Oncol. 2014; 32: 2001-2009
        • Shen T.
        • Pajaro-Van de Stadt S.H.
        • Yeat N.C.
        • Lin J.C.
        Clinical applications of next generation sequencing in cancer: from panels, to exomes, to genomes.
        Front Genet. 2015; 6: 215
        • Grady W.M.
        • Carethers J.M.
        Genomic and epigenetic instability in colorectal cancer pathogenesis.
        Gastroenterology. 2008; 135: 1079-1099