- Research article
- Open Access
- Open Peer Review
Shifting patterns of genomic variation in the somatic evolution of papillary thyroid carcinoma
BMC Cancervolume 16, Article number: 646 (2016)
Cancer is increasingly understood to arise in the context of dynamically evolving genomes with continuously generated variants subject to selective pressures. Diverse mutations have been identified in papillary thyroid carcinoma (PTC), but unifying theories underlying genomic change are lacking. Applying a framework of somatic evolution, we sought to broaden understanding of the PTC genome through identification of global trends that help explain risk of tumorigenesis.
Exome sequencing was performed on 53 PTC and matched adjacent non-tumor thyroid tissues (ANT). Single nucleotide substitution (SNS) signatures from each sample pair were divided into three subsets based on their presence in tumor, non-tumor thyroid, or both. Nine matched blood samples were sequenced and SNS signatures intersected with these three subsets. The intersected genomic signatures were used to define branch-points in the evolution of the tumor genome, distinguishing variants present in the tissues’ common ancestor cells from those unique to each tissue type and therefore acquired after genomic divergence of the tumor, non-tumor, and blood samples.
Single nucleotide substitutions shared by the tumor and the non-tumor thyroid were dominated by C-to-T transitions, whereas those unique to either tissue type were enriched for C-to-A transversions encoding non-synonymous, predicted-deleterious variants. On average, SNSs of matched blood samples were 81 % identical to those shared by tumor and non-tumor thyroid, but only 12.5 % identical to those unique to either tissue. Older age and BRAF mutation were associated with increased SNS burden.
The current study demonstrates novel patterns of genomic change in PTC, supporting a theory of somatic evolution in which the zygote’s germline genome undergoes continuous remodeling to produce progressively differentiated, tissue-specific signatures. Late somatic events in thyroid tissue demonstrate shifted mutational spectra compared to earlier polymorphisms. These late events are enriched for predicted-deleterious variants, suggesting a mechanism of genomic instability in PTC tumorigenesis.
Papillary Thyroid Carcinoma (PTC) is the most common type of thyroid cancer, accounting for more than 75 % of all thyroid malignancies. The incidence of PTC is increasing more rapidly than any other cancer . Previous studies have noted that PTC carries a low overall burden of genomic insult relative to anaplastic thyroid cancer and other malignancies [2, 3]. Several mutually exclusive, recurrent genomic events have been identified, including BRAF and RAS mutations and RET- and NTRK1- gene fusions which are found in up to 70 % of tumors [4, 5]. Mutations in the phosphoinositide 3-kinase (PI3K) pathway genes PTEN, PIK3CA, and AKT1 have been reported at lower frequencies as well as alterations in the EIF1AX, PPM1D, and CHEK2 genes [2, 6, 7]. This increasingly comprehensive catalog of recurrent genomic events allows for more specific tumor sub-classification and may provide clues to molecular mechanisms driving tumorigenesis. Despite such a diversity of genomic alterations, PTC does not demonstrate great clinical heterogeneity, suggesting that a unifying concept of the genomic contribution to malignant transformation continues to be elusive. In an evolutionary view of cancer, tumorigenesis can be seen in terms of genomic instability causing stochastic accumulation of variants. Those acquired variants providing a selective advantage at the cellular level will be propagated, resulting in stepwise accumulation of diverse genomic alterations, and ultimately tumors with highly individualized genotypes .
The availability of genome-wide data at single-nucleotide resolution has spurred interest in the somatic evolution model, with clonal expansions of cells arising in the context of continuous variant generation and natural selection [9–12]. A recent study correlating tissue-specific cancer risk with the number of stem cell divisions highlighted the importance of the accumulation of stochastic change introduced during successive cycles of DNA replication . Takeda et al. used a mouse model to investigate heterogeneity resulting from an evolving genome, searching for drivers appearing early in tumors’ phylogenetic trees, thereby making them more widely distributed within the tumors . Sequencing of distinct regions within individual lung tumors demonstrated spatial separation of drivers, showing evolutionary divergence of clones with concomitant shifts in mutational spectra . An accompanying study found that primary lung tumors with larger fractions of sub-clonal mutations have increased likelihood of relapse . Finally, studying myeloproliferative neoplasms, Ortmann et al. demonstrated that the order in which mutations occur during clonal evolution plays a critical role in tumor behavior .
The growing number of studies exploring an evolutionary model of cancer provides a framework to explain the sequential acquisition of tumor behaviors (eg. evasion of immune surveillance, metastatic ability, resistance to therapy) and highlights some of the challenges faced in attempting to design tailored therapies [11, 12, 14]. Unlike highly aggressive tumors which demonstrate rapid growth and require multiple chemotherapeutic (and frequently surgical) interventions, PTC’s indolent nature and relatively low mutational burden makes it an ideal candidate for the study of the sequential acquisitions of genomic variations.
This study explores the hypothesis that a rudimentary phylogenetic tree can be constructed for each tumor by exploring three subsets of single nucleotide substitutions (SNS): those present in tumor alone, those present in adjacent non-tumor tissue alone, or those present in both and therefore presumed to reflect the germline or variants accrued early in embryogenesis. Paired blood samples from a subset of patients were also examined to strengthen the hypothesis that SNS signatures of multiple tissues from a single patient, while sharing a great deal of homology derived from their common germline, also demonstrate tissue specific patterns of genomic variation. A more critical examination of the SNSs in each group yields insight into the topography of the PTC mutational signature.
Patient information and tissue samples
DNA from 53 fresh-frozen papillary thyroid tumors and paired non-tumor thyroid tissues resected between 2000 and 2011, for which sufficient quantities of tissue were available, were identified in the local biobank. Diagnosis was confirmed by a pathologist and TNM stages were determined using World Health Organization criteria. DNA was extracted using AllPrep DNA Mini Kit and following the protocol provided by the manufacturer (Qiagen, MD, USA). DNA from nine available matched blood samples was extracted using QIAamp DNA Mini Kit per manufacturer protocol (Qiagen, MD, USA). Patient demographic information is listed in Additional file 1: Table S1.
Genomic DNA from fresh-frozen matched pairs of tumor, corresponding adjacent non-tumor thyroid tissue, and blood samples was sequenced at the Yale Center for Genome Analysis following an internal protocol using NimbleGen v2.0 exome capture reagent (Roche) and sequenced through Illumina HiSeq 2000, 75 base-paired end reads. Reads were subsequently mapped to reference genome hg19 using ELANDv2, single nucleotide variants assigned a quality score (QS) using SAMtools, filtered with bcfTools varFilter using default parameters. Sequencing yielded and average of 95 million and 198 million reads per lane, for non-tumor and tumor tissue, respectively. 20x coverage was achieved for 92 % of non-tumor bases and 95 % of tumor sample bases. A minimum of 8x coverage was required for inclusion and standard quality score cut-offs werre employed. Samples were then annotated using ANNOVAR (www.openbioinformatics.org/annovar) and filtered to remove variants present in dbSNP, build 129 . Additional filters were included for alternate analyses, including dbSNP build 135, and the 1000 Genomes SNP filter. Additionally, 53 race-matched samples that had undergone exome sequencing on the Illumina platform were chosen at random from the 1000 Genomes project and processed identically . Computational prediction of the likelihood that each SNS would result in a deleterious change to the protein product was performed using 5 independent algorithms (PolyPhen-2, MutationTaster, SIFT, radialSvm, lr) as implemented in the ANNOVAR package.
SNS signatures from each PTC were intersected with those from their matched ANT to define subsets of SNSs shared between the tissues (Common subset), and those present in one tissue-type but not the other (Unique-to-PTC, Unique-to-ANT). The nine matched blood samples were intersected with each corresponding PTC/ANT subset individually. ConsensusPathDB [20, 21] pathway enrichment analysis was performed for each subset using all genes containing at least one non-synonymous, stopgain, or stoploss variant. MAPK pathway genes were downloaded from the KEGG database (www.genome.jp/kegg/kegg1.html). RET-PTC fusions were detected with RT-PCR. RNA was isolated using AllPrep DNA/RNA/Protein Mini Kit, cDNA created with iScript cDNA Synthesis Kit, and PCR amplification performed (primers: RET/PTC1 forward primer 5-ATT GTC ATC TCG CCG TTC-3 (H4 domain), RET/PTC1 reverse primer 5-TGC TTC AGG ACG TTG AAC-3, RET/PTC3 forward primer 5-TGGAGA AGA GAG GCT GTA TC-3 (RFG domain), RET/PTC3 reverse primer 5- CGT TGC CTT GAC TTT TC-3). RET/PTC translocations were identified as 306 and 268 base pair amplicons for RET/PTC1 and RET/PTC3, respectively. TERT promoters underwent Sanger sequencing at the Keck DNA Sequencing Facility. All in house computational analyses utilized the R Project for Statistical Computing with statistical testing via Welch two sample t-test and ANOVA.
Results and discussion
Study cohort and sequencing summary
The study cohort included 53 PTC patients with a mean age at the time of surgery of 48.9 years (range 16-97) and a female to male ratio of 2.5 (see Additional file 1: Table S1 for demographic and clinical characteristics). Exome sequencing of genomic DNA from 53 matched PTCs and matched adjacent non-tumor thyroid samples (ANTs) yielded a mean 198- and 95-fold coverage, respectively, with greater than 8x coverage in over 96 % of bases. Tumor purity ranged from 41.1 to 69.5 %.
Analysis of the 53 PTC specimens confirmed the presence of previously identified recurrent alterations, including BRAF (41.5 %), TERT (5.7 %), and RET-PTC fusions (3.8 %). PTCs with BRAF mutations had greater SNS counts than those without (mean 6996 vs. 5976, p < 0.001). Additionally, patients older than 45 years of age had more SNSs than those under forty-five (mean 6705 vs. 6014, p < 0.01). There was no significant difference in SNS count on the basis of gender, stage, or histological subtype (Additional file 1: Table S1).
Intersected SNS signatures define late somatic changes
Global SNS signatures for each paired PTC and ANT are shown in Fig. 1. Across the cohort, between 57 and 88 % of PTC SNSs also exist in the paired ANT. This shared portion (designated hereafter as the Common subset) represents SNSs present in the sample pair’s most recent common cellular ancestor, i.e. those present in the patient’s true germline (the zygote), together with those accrued during embryogenesis and thyroid differentiation, up to the point of divergence of the malignant from the non-malignant thyrocytes. The areas on either flank in the plot are SNSs accrued by the malignant and non-malignant sub-clones following divergence (designated hereafter as the Unique-to-PTC and Unique-to-ANT subsets).
The accumulation of SNSs throughout the lifetime of the thyroid gland should be driven in large part by stochastic change introduced during sequential cycles of DNA replication and cell division within a greater framework mutational susceptibility . Each thyroid gland (as well as every other tissue type) is therefore expected to follow a unique evolutionary pathway, with SNSs accrued at varying points in differentiation, resulting in large degrees of heterogeneity among tumors. Calculating the degree of SNS overlap of each PTC/ANT sample pair with every other sample confirms that on average only 28.0 % of each Common subset’s and 30.6 % of each blood sample’s SNSs are shared with another sample. Those SNSs Unique-to-PTC or Unique-to-ANT share only 6.0 % and 4.9 % of SNSs, respectively (Fig. 2a). For the nine patients with available blood samples, there was significantly greater overlap between the matched Common subset and blood SNSs than between the Unique-to-ANT or Unique-to-PTC subsets and the blood (mean 80.9 % vs. 10.8 % and 14.3 %, respectively; p < 0.0001), strengthening the argument that the Common subset SNSs consist mainly of germline variants plus those accrued earlier in embryogenesis, prior to divergence of hematopoietic from thyrocyte lineages (Fig. 2b).
Shifting mutational spectra increase risk for deleterious change
SNSs are divided into transitions (those exchanging a purine for purine or pyrimidine for pyrimidine) and transversions (those exchanging a purine for pyrimidine or pyrimidine for purine). Transitions are less likely than transversions to result in non-synonymous change in the final amino acid sequence and are more prevalent than transversions, partly due to frequent spontaneous deamination of 5-methylcytosine to thymine [22, 23]. Comparison of transition:transversion ratios (Tr:Tv) demonstrates that the Common subset consistently has the highest ratio, with a mean 3.4- and 3.2-fold increase over the Unique-to-ANT and Unique-to-PTC subsets, respectively (Additional file 2: Figure S1). Fig. 3 demonstrates the distribution of each transition and transversion for the Unique-to-ANT, Common, and Unique-to-PTC subsets as well as the matched blood samples and an independent cohort of blood samples from the 1000 Genomes Project. Those evolutionarily older SNSs (the Common subset) are dominated by C-to-T transitions, while those accrued after divergence of the thyroid-specific subsets (Unique-to-ANT and Unique-to-PTC) demonstrate a dramatic shift toward C-to-A transversions (p < 0.0001), suggesting that evolutionary forces at work on the genome of thyroid progenitor cells in the course of organogenesis differ from those seen later in the course of the gland’s development and during natural thyrocyte turnover. To ensure that the observed trend was not introduced through a filtering bias, the analysis was repeated with varying levels of SNS filtering. The shift in mutational spectra persisted regardless of the level of filtering (Additional file 3: Figure S2).
There are 64 possible codons in the triplet human genetic code, but only 20 amino acids and 3 stop codons to be encoded. This redundancy is the mechanism through which some SNSs result in synonymous changes, whereas others lead to a change in the amino acid sequence. The frequency with which potential synonymous and non-synonymous changes occur differs among the different possible SNSs. For example, 34.4 % of all potential C-to-T SNSs in the genetic code result in synonymous changes, compared to only 19.8 % of all potential C-to-A substitutions (Additional file 4: Figure S3). As therefore expected, the observed shift in PTC mutational spectra at different points in the evolution of its genome is associated with increased non-synonymous change in both the Unique-to-ANT and Unique-to-PTC subsets (p < 0.0001) (Additional file 5: Figure S4). Computational assessment of the likelihood that these changes will impact protein function demonstrated increased numbers of predicted-deleterious variants in the Unique-to-ANT and Unique-to-PTC subsets relative to the Common subset, the blood samples, and the 1000 Genomes cohort (p < 0.0001) (Fig. 4, Additional file 6: Figure S5) . Interestingly, both the malignant (Unique-to-PTC) and the non-malignant (Unique-to-ANT) subsets share this increased propensity to accumulate predicted-deleterious changes, suggesting that: 1) the global shift in mutational pattern is characteristic of the thyroid genome and does not reflect an evolutionary pressure introduced by the tumor; and 2) thyroid tissue appears more tolerant of mutation-prone SNSs at a more divergent point in its phylogenetic tree than earlier in its development.
MAPK pathway genes demonstrate accumulation of non-synonymous variants
To identify potential functional impacts of the observed shift in mutational spectra, pathway-based enrichment analysis was performed on the sets of genes containing non-synonymous SNSs in the Unique-to-ANT, Common, and Unique-to-PTC subsets. Although their FDR-corrected q-values did not reach statistical significance, four pathways with p-values < 0.01 were related to MAPK (Additional file 7: Table S2). Specific examination of MAPK pathway genes in the primary sequencing data revealed several recurrently altered genes in the cohort [25, 26]. Many of the MAPK SNSs appeared in the Common subset, signifying their presence prior to divergence of malignant from non-malignant clones (Fig. 5, Additional file 8: Table S3). Thirty-eight samples (70.4 %) contained a MAP3K4 variant, seventeen of which occurred in the Common subset. This is in contrast to the twenty-three BRAF variants, all of which were in the Unique-to-PTC subset. Interestingly, each BRAF-mutated sample harbored non-synonymous MAPK pathway SNSs in the Common subset, frequently in MAP3K4, CACNA1B, and PAK2. This observation raises the question as to whether BRAF mutation represents a damaging second hit to an already vulnerable pathway. On average, each sample carried 5.9, 5.7, and 3.4 non-synonymous SNSs among MAPK pathway genes in the Unique-to-ANT, Common, and Unique-to-PTC subsets, respectively. Taken together, these findings suggest that non-synonymous SNSs in the MAPK pathway are frequent events, and that a subset of genes within the pathway is preferentially altered in the thyroid of PTC patients.
The genome is a dynamic entity, constantly accruing changes that are subject to selective forces from diverse micro-environments. The true germline should be thought of as that present in the zygote, with subsequent cell divisions introducing change via imperfect DNA replication. Cells following variable differentiation pathways will therefore share a set of variants that derives from their most recent common ancestor cell. The accumulated evidence for this evolutionary framework is substantial and application of its principles will be critical in advancing understanding of the genome in health and disease. The current study, comparing thyroid tumor tissue, non-tumor thyroid tissue, and blood from single individuals demonstrates a great deal of heterogeneity within the genomic variants characterizing any given patient’s tissue types and provides further support for the somatic evolution model. These findings have potentially far-reaching implications regarding the common practice of using blood as the non-tumor control in attempting to identify tumor-specific mutations. The degree of diversity introduced through somatic evolution at every branch-point in differentiation results in vast tissue-specific variability, helping to explain a high incidence of false positives among proposed tumor driver candidates.
Previous analysis of mutational spectra in multiple tumor types demonstrated a predominance of C-to-T transitions in thyroid cancer . Taken in their entirety, the current PTC cohort confirms this mutational pattern but demonstrates an evolutionary time-line wherein later changes are dominated by an alternative, potentially damaging, signature. This variation in SNS type at different points in thyroid development suggests that although individual SNSs may occur stochastically, there exist unifying forces likely informed by the micro-environment of differentiated thyroid as a whole or in subsections of the gland. For instance, binding of triiodothyronine to thyroid hormone receptor B in cell culture and in mice has been shown to increase levels of reactive oxygen species (ROS). ROS generated via normal aerobic metabolism, inflammation, and exposure to ionizing radiation are also known to cause DNA damage, preferentially at guanine residues resulting in 8-oxoguanine (8-oxoG) and thymine glycol . Indeed, thyrocytes are uniquely sensitive to ionizing radiation, an established risk factor for PTC . Chromosomal regions with high 8-oxoG density co-localize with high rates of single nucleotide polymorphism [29–31]. Furthermore, it has long been known that 8-oxoG pairs with adenine, resulting in C-to-A transversions . Accumulation of 8-oxoG occurs in Mth1/Ogg1/Muthy triple knockout mice which develop various types of tumors that are heavily enriched for C-to-A transversions . This series of observations provides a plausible mechanism through which the micro-environment of the thyroid could cause the observed shift in mutational spectra. Interestingly, multifocality is present in 15-25 % of patients, with tumor foci that seem to occur independently . This clinical observation is consistent with the evolutionary model of PTC, in which independent multifocal tumors may arise in the background of genetic predisposition acquired during thyroidogenesis.
The current study represents a first step toward elucidating broader organizing principles that may underlie the individual genomic events previously associated with subsets of PTCs. The observed global shift in genomic variants causes increased likelihood of deleterious change to the encoded proteins in both malignant and non-malignant portions of the thyroid, leading to the postulation that stochastic factors (exact SNS location, order in which variants occur, micro-environment at the time of inception) may influence the potential for any given change to represent a true tumor driver. Further interrogation of the PTC genome should focus on identifying variants that cause evolutionary branch-points, as well as on elucidating the order of variant accrual. Such could be achieved through multiple sampling of individual PTCs, repeat sampling at various time-points, and via computational modeling, all integrated with longitudinal clinical data. Experience in multiple cancer types has shown that therapies targeting single tumor drivers will almost inevitably result in the selection and propagation of clones driven by alternative pathways. By coupling an expanded knowledge of the global trends underlying genomic evolution with a comprehensive catalog of driver variants placed in their evolutionary context, one could instead envision a therapeutic strategy aimed at modulating a dynamic system in order to minimize progression and focus on chronic management.
Adjacent Non-tumor Thyroid
Papillary Thyroid Cancer
Reactive Oxygen Species
Single Nucleotide Substitution
Carling T, Udelsman R. Thyroid cancer. Annu Rev Med. 2014;65:125–37.
Network CGAR. Integrated genomic characterization of papillary thyroid carcinoma. Cell. 2014;159(3):676–90.
Kunstman JW, Juhlin CC, Goh G, Brown TC, Stenman A, Healy JM, Rubinstein JC, Choi M, Kiss N, Nelson-Williams C, et al. Characterization of the mutational landscape of anaplastic thyroid cancer via whole-exome sequencing. Hum Mol Genet. 2015;24(8):2318–29.
Kimura ET, Nikiforova MN, Zhu Z, Knauf JA, Nikiforov YE, Fagin JA. High prevalence of BRAF mutations in thyroid cancer: genetic evidence for constitutive activation of the RET/PTC-RAS-BRAF signaling pathway in papillary thyroid carcinoma. Cancer Res. 2003;63(7):1454–7.
Kondo T, Ezzat S, Asa SL. Pathogenetic mechanisms in thyroid follicular-cell neoplasia. Nat Rev Cancer. 2006;6(4):292–306.
Liu Z, Hou P, Ji M, Guan H, Studeman K, Jensen K, Vasko V, El-Naggar AK, Xing M. Highly prevalent genetic alterations in receptor tyrosine kinases and phosphatidylinositol 3-kinase/akt and mitogen-activated protein kinase pathways in anaplastic and follicular thyroid cancers. J Clin Endocrinol Metab. 2008;93(8):3106–16.
Abubaker J, Jehan Z, Bavi P, Sultana M, Al-Harbi S, Ibrahim M, Al-Nuaim A, Ahmed M, Amin T, Al-Fehaily M, et al. Clinicopathological analysis of papillary thyroid cancer with PIK3CA alterations in a Middle Eastern population. J Clin Endocrinol Metab. 2008;93(2):611–8.
Nowell PC. The clonal evolution of tumor cell populations. Science. 1976;194(4260):23–8.
Anderson K, Lutz C, van Delft FW, Bateman CM, Guo Y, Colman SM, Kempski H, Moorman AV, Titley I, Swansbury J, et al. Genetic variegation of clonal architecture and propagating cells in leukaemia. Nature. 2011;469(7330):356–61.
Gerlinger M, Rowan AJ, Horswell S, Larkin J, Endesfelder D, Gronroos E, Martinez P, Matthews N, Stewart A, Tarpey P, et al. Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N Engl J Med. 2012;366(10):883–92.
Nik-Zainal S, Van Loo P, Wedge DC, Alexandrov LB, Greenman CD, Lau KW, Raine K, Jones D, Marshall J, Ramakrishna M, et al. The life history of 21 breast cancers. Cell. 2012;149(5):994–1007.
Yates LR, Campbell PJ. Evolution of the cancer genome. Nat Rev Genet. 2012;13(11):795–806.
Tomasetti C, Vogelstein B. Cancer etiology. Variation in cancer risk among tissues can be explained by the number of stem cell divisions. Science. 2015;347(6217):78–81.
Takeda H, Wei Z, Koso H, Rust AG, Yew CC, Mann MB, Ward JM, Adams DJ, Copeland NG, Jenkins NA. Transposon mutagenesis identifies genes and evolutionary forces driving gastrointestinal tract tumor progression. Nat Genet. 2015;47(2):142–50.
Zhang J, Fujimoto J, Wedge DC, Song X, Seth S, Chow CW, Cao Y, Gumbs C, Gold KA, Kalhor N, et al. Intratumor heterogeneity in localized lung adenocarcinomas delineated by multiregion sequencing. Science. 2014;346(6206):256–9.
de Bruin EC, McGranahan N, Mitter R, Salm M, Wedge DC, Yates L, Jamal-Hanjani M, Shafi S, Murugaesu N, Rowan AJ, et al. Spatial and temporal diversity in genomic instability processes defines lung cancer evolution. Science. 2014;346(6206):251–6.
Ortmann CA, Kent DG, Nangalia J, Silber Y, Wedge DC, Grinfeld J, Baxter EJ, Massie CE, Papaemmanuil E, Menon S, et al. Effect of mutation order on myeloproliferative neoplasms. N Engl J Med. 2015;372(7):601–12.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, Subgroup GPDP. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.
Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA, Consortium GP. An integrated map of genetic variation from 1092 human genomes. Nature. 2012;491(7422):56–65.
Kamburov A, Wierling C, Lehrach H, Herwig R. ConsensusPathDB--a database for integrating human functional interaction networks. Nucleic Acids Res. 2009;37(Database issue):D623–628.
Kamburov A, Pentchev K, Galicka H, Wierling C, Lehrach H, Herwig R. ConsensusPathDB: toward a more complete picture of cell biology. Nucleic Acids Res. 2011;39(Database issue):D712–717.
Vogel F, Kopun M. Higher frequencies of transitions among point mutations. J Mol Evol. 1977;9(2):159–80.
Shen JC, Rideout WM, Jones PA. The rate of hydrolytic deamination of 5-methylcytosine in double-stranded DNA. Nucleic Acids Res. 1994;22(6):972–6.
Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7(4):248–9.
Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28(1):27–30.
Kanehisa M, Goto S, Sato Y, Kawashima M, Furumichi M, Tanabe M. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res. 2014;42(Database issue):D199–205.
Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A, Carter SL, Stewart C, Mermel CH, Roberts SA, et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature. 2013;499(7457):214–8.
Sarasin A, Bounacer A, Lepage F, Schlumberger M, Suarez HG. Mechanisms of mutagenesis in mammalian cells. Application to human thyroid tumours. C R Acad Sci III. 1999;322(2-3):143–9.
Barnes DE, Lindahl T. Repair and genetic consequences of endogenous DNA base damage in mammalian cells. Annu Rev Genet. 2004;38:445–76.
Kasai H, Nishimura S. Hydroxylation of deoxyguanosine at the C-8 position by ascorbic acid and other reducing agents. Nucleic Acids Res. 1984;12(4):2137–45.
Ohno M, Miura T, Furuichi M, Tominaga Y, Tsuchimoto D, Sakumi K, Nakabeppu Y. A genome-wide distribution of 8-oxoguanine correlates with the preferred regions for recombination and single nucleotide polymorphism in the human genome. Genome Res. 2006;16(5):567–75.
Shibutani S, Takeshita M, Grollman AP. Insertion of specific bases during DNA synthesis past the oxidation-damaged base 8-oxodG. Nature. 1991;349(6308):431–4.
Ohno M, Sakumi K, Fukumura R, Furuichi M, Iwasaki Y, Hokama M, Ikemura T, Tsuzuki T, Gondo Y, Nakabeppu Y. 8-oxoguanine causes spontaneous de novo germline mutations in mice. Sci Rep. 2014;4:4689.
Shattuck TM, Westra WH, Ladenson PW, Arnold A. Independent clonal origins of distinct tumor foci in multifocal papillary thyroid carcinoma. N Engl J Med. 2005;352(23):2406–12.
The authors are indebted to the financial support provided by the Damon Runyon Cancer Research Foundation (T.C.); the Stockholm County Council (C.C.J.); and the Howard Hughes Medical Institute (R.P.L.). The study was also supported by the Yale University-Gilead, Inc. collaboration, the Ohse Research Award from the Yale Department of Surgery (J.C.R), and the American Cancer Society (RSGM-10-038-01-CCE).
The authors are indebted to the financial support provided by the Damon Runyon Cancer Research Foundation (T.C.); the Stockholm County Council (C.C.J.); and the Howard Hughes Medical Institute (R.P.L.). The study was also supported by the Yale University-Gilead, Inc. collaboration, the Ohse Research Award from the Yale Department of Surgery (J.C.R), and the American Cancer Society (RSGM-10-038-01-CCE). None of these funding sources participated in any way in the design of the study or the collection, analysis, and interpretation of data or in writing the manuscript.
Availability of data and materials
The data set supporting the results of this article has been submitted in full to the Sequence Read Archive at the NCBI (http://www.ncbi.nlm.nih.gov/sra; SRA Accession number: SRP077944; BioProject: PRJNA327846).
JCR conceived of the study, carried out computational analyses, drafted the manuscript. TCB carried out molecular genetic studies and participated in drafting the manuscript. ECL contributed to experiment design, performed statistical analysis and participated in drafting the manuscript. YZ performed molecular genetic studies. JWK performed molecular genetic studies and participated in sequence alignment. CCJ participated in experiment design, analysis, and drafting of the manuscript. CNW performed molecular genetic studies. GG performed sequence alignment. CEQ participated in sample acquisition and analysis and drafting the manuscript. GGC participated in sample acquisition and analysis and drafting the manuscript. RU participated in sample acquisition and experiment design and drafting of the manuscript. RPL participated in study design and analysis. RK conceived of the study, performed molecular genetics studies, and participated in drafting of the manuscript. TC conceived of the study, participated in sample acquisition, oversaw all molecular studies, and participated in drafting the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
Approval for this study was obtained by the Yale University Institutional Review Board. Informed written consent for the use of tissue and clinical information was obtained from all patients who underwent thyroid resection at Yale-New Haven Hospital and are included in the study.
Shows Patient Demographic and Clinical Information. (XLSX 12 kb)
Shows the Transition-to-Transversion ratios. The Common Subset demonstrates a higher Tr:Tv than both the Unique-to-ANT and Unique-to-PTC subsets. (TIFF 1993 kb)
Shows the mutational spectra using alternate SNS filtering methods demonstrate that the shift in predominant SNS from C-to-T to C-to-A persists regardless of the level of SNS filtering: a) unfiltered; b) filtered to remove variants contained in dbSNP, build 135; and c) filtered to remove both dbSNP build 135 and 1000 Genomes variants. (TIF 115 kb)
Describes mutations at the level of the Genetic Code. Plotting the functional consequence of each base change upon translation of the sixty-four triplet codons in the genetic code shows that 34.4 % of potential C-to-T transitions result in synonymous change compared to only 19.8 % of C-to-A transversions. (TIFF 1550 kb)
Demonstrates the functional consequence of each SNS. Relative to the Common subset and the 1000 Genomes cohort, the increased frequency of C-to-A transversions in the Unique-to-PTC and Unique-to-ANT subsets show an accompanying increase in the frequency of non-synonymous change (p < 0.0001). (TIFF 2025 kb)
Demonstrates predicted impact of SNS on protein function. Based upon computational prediction with MutationTaster, RadialSVM, SIFT, LR. The observed shift toward increased numbers of C-to-A transversions is accompanied by an increased likelihood of accruing damaging variants in both the Unique-to-PTC and Unique-to-ANT Subsets across multiple algorithms (all p-values < 0.0001). (TIFF 3828 kb)
Shows ConsensusPathDB Pathway analysis. (XLSX 23 kb)
Shows Single Nucleotide Substitutions in the MAPK Pathway. (XLSX 43 kb)