Identification of somatic and germline mutations using whole exome sequencing of congenital acute lymphoblastic leukemia
© Chang et al.; licensee BioMed Central Ltd. 2013
Received: 21 September 2012
Accepted: 30 January 2013
Published: 4 February 2013
Acute lymphoblastic leukemia (ALL) diagnosed within the first month of life is classified as congenital ALL and has a significantly worse outcome than ALL diagnosed in older children. This suggests that congenital ALL is a biologically different disease, and thus may be caused by a distinct set of mutations. To understand the somatic and germline mutations contributing to congenital ALL, the protein-coding regions in the genome were captured and whole-exome sequencing was employed for the identification of single-nucleotide variants and small insertion and deletions in the germlines as well as the primary tumors of four patients with congenital ALL.
Exome sequencing was performed on Illumina GAIIx or HiSeq 2000 (Illumina, San Diego, California). Reads were aligned to the human reference genome and the Genome Analysis Toolkit was used for variant calling. An in-house developed Ensembl-based variant annotator was used to richly annotate each variant.
There were 1–3 somatic, protein-damaging mutations per ALL, including a novel mutation in Sonic Hedgehog. Additionally, there were many germline mutations in genes known to be associated with cancer predisposition, as well as genes involved in DNA repair.
This study is the first to comprehensively characterize the germline and somatic mutational profile of all protein-coding genes in patients with congenital ALL. These findings identify potentially important therapeutic targets and provide insight into possible cancer predisposition genes.
Keywords: Pediatric leukemia; Congenital acute lymphoblastic leukemia; Exome sequencing
Acute lymphoblastic leukemia (ALL) is the most common type of cancer diagnosed in children. Congenital ALL is a rare and aggressive subtype of ALL defined as diagnosis within the first month of life. A recent study of 30 patients with congenital ALL treated on the Interfant-99 protocol reported a 2-year event-free survival of 20% despite intensive chemotherapy. This is significantly worse than the 5-year event-free survival of older children with ALL, which approaches 80%. Although the 11q23 rearrangement is the most common cytogenetic abnormality in congenital and infant ALL, studies demonstrate that this rearrangement is not sufficient for leukemogenesis [4, 5] and does not entirely explain the aggressiveness of ALL in this population of patients [6–8].
These data suggest that congenital ALL is a biologically different disease, and therefore may be caused by a distinct set of mutations in ALL blast cells that differ from those in blasts from older patients. Whole-exome sequencing can be used to characterize the majority of amino acid encoding base positions of the genome. When applied to cancer, this method can identify somatic mutations that may contribute to leukemogenesis, as well as germline mutations that may reveal cancer predisposition genes [9–12]. In this paper, we report whole-exome sequencing on four paired tumor-normal samples from patients with congenital ALL (cALL) and characterize the germline and somatic mutations. In addition, the healthy parents of one patient were sequenced to verify any inherited germline mutations. Our results demonstrate that there are very few somatic mutations in cALL and that there are potentially druggable targets that may provide new therapeutic options to improve outcomes.
The UCLA Institutional Review Board approved this study, which was carried out in compliance with the Helsinki Declaration, and all participants, or parents of participants, provided written informed consent before samples were collected.
[Table not recovered; surviving column header: % peripheral blasts at diagnosis]
DNA extraction and sequencing
Tumor genomic DNA was extracted from peripheral blood at diagnosis, and normal genomic DNA was extracted from remission bone marrow, using the QIAamp DNA Mini Kit (Qiagen, Valencia, California). Genomic DNA was enriched for coding exons using the SureSelect Human All Exon kit for sample 1 and the Human All Exon 50Mb kit for samples 2–4 (Agilent, Santa Clara, California). Sample 1 was sequenced on one full lane of the Illumina Genome Analyzer IIx as 76x76 base paired-end reads and on one full lane of the HiSeq 2000 as 50x50 base paired-end reads (Illumina, San Diego, California); the reads from both runs were merged for downstream analysis. Leukemia samples 2 through 4 and the parents of sample 1 were sequenced on one full lane of the HiSeq 2000 as 100x100 base paired-end reads, while the germlines of samples 2–4 were sequenced on one full lane of the HiSeq 2000 as 50x50 base paired-end reads.
Variant calling and filtration
Sequence reads were aligned to the human reference genome build 37 using Novoalign (novocraft.com). Post-processing of reads was performed using Samtools (samtools.sf.net) and Picard (picard.sf.net) for removal of PCR duplicates, merging, and indexing.
The Genome Analysis Toolkit (GATK) was used for recalibration of base quality, variant calling, filtration and evaluation [14, 15]. Quality scores generated by the sequencer were recalibrated by analyzing the covariation among reported quality score, position within the read, dinucleotide, and probability of a reference mismatch. Local realignment around small insertions and deletions (indels) was performed using GATK's indel realigner to minimize the number of mismatching bases across all reads. Statistically significant non-reference variants, both single nucleotide substitutions (SNSs) and small indels, were identified using the GATK UnifiedGenotyper. The GATK VariantAnnotator annotated each variant with various statistics, including allele balance, depth of coverage, strand balance, and multiple quality metrics. These statistics were then used in an adaptive error model to identify likely false positive SNSs, using the GATK VariantQualityScoreRecalibrator (VQSR). Single nucleotide substitutions with a low VQSR score were filtered out, leaving a set of likely true variants. Hard filtering was applied to indels and only passing indels were used for subsequent analyses.
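The idea behind base quality recalibration can be illustrated with a minimal sketch. It is not the GATK implementation: it simply bins bases by the covariates named above (reported quality, position within the read, dinucleotide), counts reference mismatches per bin, and converts the smoothed mismatch rate into a Phred-scaled empirical quality. The function names, the input tuple layout, and the add-one smoothing are assumptions made for illustration.

```python
import math
from collections import defaultdict


def phred(error_rate):
    """Convert an error probability to a Phred-scaled quality score."""
    return -10.0 * math.log10(error_rate)


def recalibration_table(observations):
    """Build an empirical quality table keyed by covariate bin.

    observations: iterable of (reported_q, cycle, dinucleotide, is_mismatch),
    a hypothetical layout chosen for this sketch.
    Returns {(reported_q, cycle, dinucleotide): empirical_q}.
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [mismatches, total bases]
    for reported_q, cycle, dinuc, is_mismatch in observations:
        bin_counts = counts[(reported_q, cycle, dinuc)]
        bin_counts[0] += int(is_mismatch)
        bin_counts[1] += 1

    table = {}
    for key, (mismatches, total) in counts.items():
        # Add-one smoothing (an assumption) keeps the rate above zero
        # for bins in which no mismatch was observed.
        rate = (mismatches + 1) / (total + 2)
        table[key] = phred(rate)
    return table


# Toy input: 98 bases reported as Q30 at cycle 5 after "AC", one of them a mismatch.
toy = [(30, 5, "AC", False)] * 97 + [(30, 5, "AC", True)]
table = recalibration_table(toy)
empirical_q = table[(30, 5, "AC")]
print(round(empirical_q, 2))
```

In this toy bin the smoothed mismatch rate is 2/100, so the empirical quality is far below the reported Q30, which is the kind of discrepancy recalibration is meant to correct.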
An in-house program based on the Ensembl database (http://www.ensembl.org) was used to further annotate variants with gene, transcript, and protein identifiers, conservation, tissue-specific expression, reference and alternate allele frequencies based on 1000Genomes (http://www.1000genomes.org/data), dbSNP132 (http://www.ncbi.nlm.nih.gov/projects/SNP), NHLBI (http://evs.gs.washington.edu/EVS) and NIEHS (http://evs.gs.washington.edu/niehsExome), among additional annotations.
Variants were filtered out if they were in non-coding regions, resulted in synonymous amino acid changes, or were predicted to have a benign effect on protein function by PolyPhen (http://genetics.bwh.harvard.edu/pph) or SIFT (http://sift.jcvi.org). Variants were classified as rare if their alternate allele frequencies were less than 1%.
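The filtering rules above can be sketched in a few lines. This is an illustrative reading, not the in-house annotator: the `Variant` fields, the "damaging"/"benign" labels, and the gene names are hypothetical, and two choices are assumptions, namely that a benign call from either predictor removes a variant and that a variant absent from the frequency databases counts as rare.

```python
from dataclasses import dataclass
from typing import Optional

RARE_AF_THRESHOLD = 0.01  # alternate allele frequency below 1% counts as rare


@dataclass
class Variant:
    gene: str
    is_coding: bool
    is_synonymous: bool
    polyphen: str  # hypothetical label: "damaging" or "benign"
    sift: str      # hypothetical label: "damaging" or "benign"
    alt_allele_freq: Optional[float]  # None when absent from frequency databases


def is_protein_damaging(v: Variant) -> bool:
    """Keep coding, nonsynonymous variants that neither tool predicts as benign."""
    if not v.is_coding or v.is_synonymous:
        return False
    return v.polyphen != "benign" and v.sift != "benign"


def is_rare(v: Variant) -> bool:
    """Treat unseen variants as rare (assumption); otherwise require AF < 1%."""
    return v.alt_allele_freq is None or v.alt_allele_freq < RARE_AF_THRESHOLD


def filter_variants(variants):
    return [v for v in variants if is_protein_damaging(v) and is_rare(v)]


# Placeholder records; gene names are not from the study.
example = [
    Variant("GENE_A", True, False, "damaging", "damaging", 0.002),
    Variant("GENE_B", True, True, "benign", "benign", 0.001),    # synonymous
    Variant("GENE_C", True, False, "damaging", "benign", None),  # benign by one tool
    Variant("GENE_D", True, False, "damaging", "damaging", 0.15),  # common
    Variant("GENE_E", False, False, "damaging", "damaging", None),  # non-coding
]
kept = [v.gene for v in filter_variants(example)]
print(kept)
```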
Nonsynonymous, protein-damaging, and rare germline variants were intersected with known germline mutations that predispose to cancer syndromes, found in Cosmic. Germline variants were also intersected with known DNA repair genes. Germline variants in sample 1 were cross-checked with the parents’ sequence data to distinguish inherited from de novo mutations. All germline and somatic variants remaining at the last step of filtering were manually visualized using the Integrative Genomics Viewer.
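The two cross-checks described above reduce to set operations, as the following sketch shows. It assumes variants are keyed by a (chromosome, position, reference allele, alternate allele) tuple and that the gene lists are plain collections of gene symbols; all names and coordinates below are placeholders, not data from the study.

```python
def overlap_with_gene_list(variant_genes, gene_list):
    """Return the sorted genes that carry a filtered variant and appear in gene_list."""
    return sorted(set(variant_genes) & set(gene_list))


def classify_inheritance(child_variants, mother_variants, father_variants):
    """Split a child's variants into inherited and de novo sets.

    Each variant is a (chrom, pos, ref, alt) tuple; a variant seen in
    either parent is called inherited, otherwise de novo.
    """
    parental = set(mother_variants) | set(father_variants)
    child = set(child_variants)
    return child & parental, child - parental


# Placeholder inputs for illustration only.
germline_genes = ["GENE_A", "GENE_B", "GENE_C"]
predisposition_genes = ["GENE_B", "GENE_X"]
repair_genes = ["GENE_C", "GENE_A", "GENE_Y"]

child = [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")]
mother = [("chr1", 100, "A", "G")]
father = [("chr2", 200, "C", "T"), ("chr9", 900, "T", "C")]

inherited, de_novo = classify_inheritance(child, mother, father)
print(overlap_with_gene_list(germline_genes, predisposition_genes))
print(overlap_with_gene_list(germline_genes, repair_genes))
print(sorted(inherited), sorted(de_novo))
```

A call made this way is only as reliable as the parental coverage at each site, which is one reason the manual visualization step remains useful.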
Mutations were classified as somatic if they were rare and found in the tumor sample only, with no evidence in the germline data. Fisher’s exact test was performed on the reference and non-reference read counts, and a p-value < 1 × 10⁻⁶ was used as the cut-off for significance. Somatic mutations found in sample 1 were cross-checked with the parents’ sequence data to ensure they were indeed somatic and not alleles missed in the germline. Three somatic variants were excluded because they were present as non-reference reads in one or both parents.
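A minimal sketch of this somatic test follows. It builds a 2x2 table of reference and non-reference read counts in tumor versus germline and computes a two-sided Fisher's exact p-value from the hypergeometric distribution using only the standard library. Whether the original analysis used a one-sided or two-sided test is not stated, so the two-sided form is an assumption, as are the function names and the read counts in the example.

```python
from math import comb

P_VALUE_CUTOFF = 1e-6  # significance cut-off stated in the text


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


def is_somatic(tumor_ref, tumor_alt, germline_ref, germline_alt):
    """Somatic call: alt reads in tumor, none in germline, and p below the cut-off."""
    if tumor_alt == 0 or germline_alt > 0:
        return False
    p = fisher_exact_two_sided(tumor_ref, tumor_alt, germline_ref, germline_alt)
    return p < P_VALUE_CUTOFF


# Hypothetical read counts: a clear heterozygous tumor variant vs. a weak one.
strong = is_somatic(tumor_ref=50, tumor_alt=50, germline_ref=100, germline_alt=0)
weak = is_somatic(tumor_ref=95, tumor_alt=5, germline_ref=100, germline_alt=0)
print(strong, weak)
```

With a cut-off this strict, a variant supported by only a handful of tumor reads does not pass even when the germline shows no alternate reads, so the threshold favors specificity over sensitivity.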
Polymerase chain reaction and capillary sequencing
The SHH mutation in Sample 1, FLT3 mutation in Sample 3, and DMBT-1 mutation in Sample 4 were validated using PCR and capillary sequencing. All primers for mutations were designed using Primer3Plus (http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi) and ordered from Integrated DNA Technologies (Coralville, IA). Capillary sequencing was performed on Biosystems 3730 Capillary DNA Analyzer (Life Technologies, Carlsbad, CA). Raw and analyzed sequence results were visualized on Sequence Scanner v1.0 (Life Technologies, Carlsbad, CA). There was not sufficient DNA for Sample 2 to validate variants with PCR and capillary sequencing.
Alignment and coverage statistics
Table: Alignment and coverage statistics by sample. Surviving column headers: % covered at 20x; % PCR duplicates. Rows: Sample 1 tumor, Sample 1 germline, Sample 1 mother, Sample 1 father, Sample 2 tumor, Sample 2 germline, Sample 3 tumor, Sample 3 germline, Sample 4 tumor, Sample 4 germline. [Table values not recovered]
Table: Comparison of mean overlap with Cosmic germline genes and DNA repair genes in patients with cALL and children without cancer. Columns: mean overlap with Cosmic; mean overlap with DNA repair genes. Rows: 4 cALL patients; 28 control exomes. [Table values not recovered]
Table: Nonsynonymous, protein-damaging, rare somatic mutations with p-value < 1 × 10⁻⁶. Surviving column headers: amino acid change; prediction (Sift, Polyphen). [Table values not recovered]
Although there has been significant progress in overall survival for children with ALL, newborns with congenital ALL continue to have poor prognoses despite intensive therapy. There is a need to identify new therapeutic targets in congenital ALL to rationally design treatment regimens that will produce sustained remissions with less toxicity. Additionally, understanding the molecular basis for congenital ALL may lead to novel insights into leukemogenesis and new cancer predisposition syndromes. This study is the first to comprehensively characterize the somatic and germline mutational profile of all protein-coding genes in four tumor-normal paired samples from patients born with congenital ALL.
Sample 1 had a somatic mutation in SHH, which has not previously been reported in ALL. The Hedgehog pathway is known to have a role in normal B-lymphocyte development, and use of Hedgehog pathway inhibitors leads to decreased self-renewal potential. The G143S mutation found in Sample 1 lies in a critical signaling region of the SHH protein that interacts with the SHH receptor, Patched (PTCH). Association of SHH with PTCH releases the inhibitory effect of PTCH on Smoothened (SMO), which allows for the propagation of SHH signals to activate transcription factors including GLI-1, 2, and 3. It is possible that this mutation has an activating effect on SHH that leads to dysregulation of downstream target genes.
Two of the four samples had somatic mutations in FLT3. Point mutations and internal tandem duplications in FLT3 are known to be driver mutations in acute myelogenous leukemia (AML) but are also enriched in infant ALL. Multiple oral FLT3 inhibitors have been tested in Phase 1 and 2 trials as single agents, as well as in combination with other chemotherapy agents, for treatment of AML [22–25] with promising results. This study found single nucleotide substitutions in FLT3 in two of four congenital ALL samples, which suggests that they are recurrent and that infants with ALL might benefit from treatment with FLT3 inhibitors.
This is the first study to perform exome sequencing on paired tumor and normal samples from patients with congenital ALL. Three of the four tumor samples had somatic mutations in genes that are druggable targets. Germline analyses did not reveal any clear set of cancer predisposition genes but a larger number of samples will need to be sequenced in order to delineate the role of DNA repair genes and known germline cancer predisposition genes, as well as to identify novel cancer predisposition genes.
As the cost of next-generation sequencing continues to decrease, patients and physicians will routinely encounter opportunities to supplement traditional morphology, flow cytometry, and cytogenetics tests with a base-pair level resolution of all variants in the exome as well as whole genome. High-throughput functional assays to validate the effect of all candidate driver mutations will be needed to fully take advantage of this level of mutational profiling. Additionally, inherited or de novo mutations in patients’ germlines will continue to expand currently known cancer predisposition syndromes and may eventually lead to approaches for earlier cancer detection and even cancer prevention.
Kathleen M Sakamoto and Stanley F Nelson are co-senior authors.
We would like to express our deepest appreciation to our patients and their families. We also thank Rongqing Guo and Traci Toy for excellent technical assistance. This project was supported by Parents against Leukemia, Evelyn Grace Foundation, and the National Center for Research Resources, Grant UL1RR033176, which is now at the National Center for Advancing Translational Sciences, Grant UL1TR000124. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. V.C. was supported on the T32 Developmental Hematology grant (T32HL086345), K12 Child Health Research Career Development Award (2K12HD034610-16) and by the American Society of Hematology.
1. van der Linden MH, Valsecchi MG, De Lorenzo P, Moricke A, Janka G, Leblanc TM, Felice M, Biondi A, Campbell M, Hann I, Rubnitz JE, Stary J, Szczepanski T, Vora A, Ferster A, Hovi L, Silverman LB, Pieters R: Outcome of congenital acute lymphoblastic leukemia treated on the Interfant-99 protocol. Blood. 2009, 114 (18): 3764-3768. 10.1182/blood-2009-02-204214.
2. Gaynon PS, Angiolillo AL, Carroll WL, Nachman JB, Trigg ME, Sather HN, Hunger SP, Devidas M: Long-term results of the children's cancer group studies for childhood acute lymphoblastic leukemia 1983–2002: a Children's Oncology Group Report. Leukemia. 2010, 24 (2): 285-297. 10.1038/leu.2009.262.
3. Rubnitz JE, Link MP, Shuster JJ, Carroll AJ, Hakami N, Frankel LS, Pullen DJ, Cleary ML: Frequency and prognostic significance of HRX rearrangements in infant acute lymphoblastic leukemia: a Pediatric Oncology Group study. Blood. 1994, 84 (2): 570-573.
4. Montes R, Ayllon V, Gutierrez-Aranda I, Prat I, Hernandez-Lamas MC, Ponce L, Bresolin S, Te Kronnie G, Greaves M, Bueno C, Menendez P: Enforced expression of MLL-AF4 fusion in cord blood CD34+ cells enhances the hematopoietic repopulating cell function and clonogenic potential but is not sufficient to initiate leukemia. Blood. 2011, 117 (18): 4746-4758. 10.1182/blood-2010-12-322230.
5. Bursen A, Schwabe K, Ruster B, Henschler R, Ruthardt M, Dingermann T, Marschalek R: The AF4.MLL fusion protein is capable of inducing ALL in mice without requirement of MLL.AF4. Blood. 2010, 115 (17): 3570-3579. 10.1182/blood-2009-06-229542.
6. Pui CH, Frankel LS, Carroll AJ, Raimondi SC, Shuster JJ, Head DR, Crist WM, Land VJ, Pullen DJ, Steuber CP, et al: Clinical characteristics and treatment outcome of childhood acute lymphoblastic leukemia with the t(4;11)(q21;q23): a collaborative study of 40 cases. Blood. 1991, 77 (3): 440-447.
7. Satake N, Maseki N, Nishiyama M, Kobayashi H, Sakurai M, Inaba H, Katano N, Horikoshi Y, Eguchi H, Miyake M, Seto M, Kaneko Y: Chromosome abnormalities and MLL rearrangements in acute myeloid leukemia of infants. Leukemia. 1999, 13 (7): 1013-1017. 10.1038/sj.leu.2401439.
8. Pui CH, Raimondi SC, Srivastava DK, Tong X, Behm FG, Razzouk B, Rubnitz JE, Sandlund JT, Evans WE, Ribeiro R: Prognostic factors in infants with acute myeloid leukemia. Leukemia. 2000, 14 (4): 684-687. 10.1038/sj.leu.2401725.
9. Yan XJ, Xu J, Gu ZH, Pan CM, Lu G, Shen Y, Shi JY, Zhu YM, Tang L, Zhang XW, Liang WX, Mi JQ, Song HD, Li KQ, Chen Z, Chen SJ: Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia. Nat Genet. 2011, 43 (4): 309-315. 10.1038/ng.788.
10. Grossmann V, Tiacci E, Holmes AB, Kohlmann A, Martelli MP, Kern W, Spanhol-Rosseto A, Klein HU, Dugas M, Schindela S, Trifonov V, Schnittger S, Haferlach C, Bassan R, Wells VA, Spinelli O, Chan J, Rossi R, Baldoni S, De Carolis L, Goetze K, Serve H, Peceny R, Kreuzer KA, Oruzio D, Specchia G, Di Raimondo F, Fabbiano F, Sborgia M, Liso A, Farinelli L, Rambaldi A, Pasqualucci L, Rabadan R, Haferlach T, Falini B: Whole-exome sequencing identifies somatic mutations of BCOR in acute myeloid leukemia with normal karyotype. Blood. 2011, 118 (23): 6153-6163. 10.1182/blood-2011-07-365320.
11. Comino-Mendez I, Gracia-Aznarez FJ, Schiavi F, Landa I, Leandro-Garcia LJ, Leton R, Honrado E, Ramos-Medina R, Caronia D, Pita G, Gomez-Grana A, de Cubas AA, Inglada-Perez L, Maliszewska A, Taschin E, Bobisse S, Pica G, Loli P, Hernandez-Lavado R, Diaz JA, Gomez-Morales M, Gonzalez-Neira A, Roncador G, Rodriguez-Antona C, Benitez J, Mannelli M, Opocher G, Robledo M, Cascon A: Exome sequencing identifies MAX mutations as a cause of hereditary pheochromocytoma. Nat Genet. 2011, 43 (7): 663-667. 10.1038/ng.861.
12. Saarinen S, Aavikko M, Aittomaki K, Launonen V, Lehtonen R, Franssila K, Lehtonen HJ, Kaasinen E, Broderick P, Tarkkanen J, Bain BJ, Bauduer F, Unal A, Swerdlow AJ, Cooke R, Makinen MJ, Houlston R, Vahteristo P, Aaltonen LA: Exome sequencing reveals germline NPAT mutation as a candidate risk factor for Hodgkin lymphoma. Blood. 2011, 118 (3): 493-498. 10.1182/blood-2011-03-341560.
13. Homer N, Nelson SF: Improved variant discovery through local re-alignment of short-read next-generation sequencing data using SRMA. Genome Biol. 2010, 11 (10): R99. 10.1186/gb-2010-11-10-r99.
14. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA: The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010, 20 (9): 1297-1303. 10.1101/gr.107524.110.
15. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, McKenna A, Fennell TJ, Kernytsky AM, Sivachenko AY, Cibulskis K, Gabriel SB, Altshuler D, Daly MJ: A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011, 43 (5): 491-498. 10.1038/ng.806.
16. Bamford S, Dawson E, Forbes S, Clements J, Pettett R, Dogan A, Flanagan A, Teague J, Futreal PA, Stratton MR, Wooster R: The COSMIC (Catalogue of Somatic Mutations in Cancer) database and website. Br J Cancer. 2004, 91 (2): 355-358.
17. Wood RD, Mitchell M, Lindahl T: Human DNA repair genes, 2005. Mutat Res. 2005, 577 (1–2): 275-283.
18. Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP: Integrative genomics viewer. Nat Biotechnol. 2011, 29 (1): 24-26. 10.1038/nbt.1754.
19. Lin TL, Wang QH, Brown P, Peacock C, Merchant AA, Brennan S, Jones E, McGovern K, Watkins DN, Sakamoto KM, Matsui W: Self-renewal of acute lymphocytic leukemia cells is limited by the Hedgehog pathway inhibitors cyclopamine and IPI-926. PLoS One. 2010, 5 (12): e15262. 10.1371/journal.pone.0015262.
20. Taipale J, Cooper MK, Maiti T, Beachy PA: Patched acts catalytically to suppress the activity of Smoothened. Nature. 2002, 418 (6900): 892-897. 10.1038/nature00989.
21. Taketani T, Taki T, Sugita K, Furuichi Y, Ishii E, Hanada R, Tsuchida M, Ida K, Hayashi Y: FLT3 mutations in the activation loop of tyrosine kinase domain are frequently found in infant ALL with MLL rearrangements and pediatric ALL with hyperdiploidy. Blood. 2004, 103 (3): 1085-1088.
22. Smith BD, Levis M, Beran M, Giles F, Kantarjian H, Berg K, Murphy KM, Dauses T, Allebach J, Small D: Single-agent CEP-701, a novel FLT3 inhibitor, shows biologic and clinical activity in patients with relapsed or refractory acute myeloid leukemia. Blood. 2004, 103 (10): 3669-3676. 10.1182/blood-2003-11-3775.
23. DeAngelo DJ, Stone RM, Heaney ML, Nimer SD, Paquette RL, Klisovic RB, Caligiuri MA, Cooper MR, Lecerf JM, Karol MD, Sheng S, Holford N, Curtin PT, Druker BJ, Heinrich MC: Phase 1 clinical results with tandutinib (MLN518), a novel FLT3 antagonist, in patients with acute myelogenous leukemia or high-risk myelodysplastic syndrome: safety, pharmacokinetics, and pharmacodynamics. Blood. 2006, 108 (12): 3674-3681. 10.1182/blood-2006-02-005702.
24. Stone RM, Fischer T, Paquette R, Schiller G, Schiffer CA, Ehninger G, Cortes J, Kantarjian HM, Deangelo DJ, Huntsman-Labed A, Dutreix C, Del Corral A, Giles F: Phase IB study of the FLT3 kinase inhibitor midostaurin with chemotherapy in younger newly diagnosed adult patients with acute myeloid leukemia. Leukemia. 2012, 26 (9): 2061-2068. 10.1038/leu.2012.115.
25. Stone RM, DeAngelo DJ, Klimek V, Galinsky I, Estey E, Nimer SD, Grandin W, Lebwohl D, Wang Y, Cohen P, Fox EA, Neuberg D, Clark J, Gilliland DG, Griffin JD: Patients with acute myeloid leukemia and an activating mutation in FLT3 respond to a small-molecule FLT3 tyrosine kinase inhibitor, PKC412. Blood. 2005, 105 (1): 54-60. 10.1182/blood-2004-03-0891.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2407/13/55/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.