Molecular profile of KRAS G12C-mutant colorectal and non-small-cell lung cancer

Background KRAS is the most frequently mutated oncogene in cancer, however efforts to develop targeted therapies have been largely unsuccessful. Recently, two small-molecule inhibitors, AMG 510 and MRTX849, have shown promising activity in KRAS G12C-mutant solid tumors. The current study aims to assess the molecular profile of KRAS G12C in colorectal (CRC) and non-small-cell lung cancer (NSCLC) tested in a clinical certified laboratory. Methods CRC and NSCLC samples submitted for KRAS testing between 2017 and 2019 were reviewed. CRC samples were tested for KRAS and NRAS by pyrosequencing, while NSCLC samples were submitted to next generation sequencing of KRAS, NRAS, EGFR, and BRAF. Results The dataset comprised 4897 CRC and 4686 NSCLC samples. Among CRC samples, KRAS was mutated in 2354 (48.1%). Most frequent codon 12 mutations were G12D in 731 samples (14.9%) and G12V in 522 (10.7%), followed by G12C in 167 (3.4%). KRAS mutations were more frequent in females than males (p = 0.003), however this difference was exclusive of non-G12C mutants (p < 0.001). KRAS mutation frequency was lower in the South and North regions (p = 0.003), but again KRAS G12C did not differ significantly (p = 0.80). In NSCLC, KRAS mutations were found in 1004 samples (21.4%). As opposed to CRC samples, G12C was the most common mutation in KRAS, in 346 cases (7.4%). The frequency of KRAS G12C was higher in the South and Southeast regions (p = 0.012), and lower in patients younger than 50 years (p < 0.001). KRAS G12C mutations were largely mutually exclusive with other driver mutations; only 11 NSCLC (3.2%) and 1 CRC (0.6%) cases had relevant co-mutations. Conclusions KRAS G12C presents in frequencies higher than several other driver mutations, and may represent a large volume of patients in absolute numbers. KRAS testing should be considered in all CRC and NSCLC patients, independently of clinical or demographic characteristics. Supplementary Information The online version contains supplementary material available at 10.1186/s12885-021-07884-8.


Introduction
The field of precision oncology has revolutionized the way cancer is approached in the past two decades [1][2][3]. Throughout the years, numerous compounds have been developed targeting proteins responsible for the development and maintenance of cancer cells, with relevant implications in the clinic. In parallel, a better understanding of cancer biology and genetics enabled researchers to define predictive biomarkers that help oncologists select the most suitable targeted therapies for patients [4].
RAS genes were identified as the first oncogenes in 1982; specifically a single missense mutation in HRAS codon 12 was found in a bladder carcinoma cell line [5,6]. Subsequently, the somatic origin of RAS mutations was confirmed, three isoforms were described and hotspots were found at codons 12, 13, and 61. KRAS is the most frequently mutated oncogene in cancers, with mutation rates of up to 96% in pancreatic cancers [7,8], 54% in colorectal cancers (CRC) [9,10], and 39% in lung adenocarcinomas [11]. RAS proteins belong to a superfamily of small GTPases that regulate fundamental intracellular signaling pathways involved in cell growth and survival [12,13]. RAS missense mutations stabilize the protein in its active, GTP-bound state, which leads to sustained transduction of pathways including the MAPK [14][15][16].
In CRC, it has been demonstrated that RAS mutations are associated with lack of response to monoclonal antibodies against the epidermal growth factor receptor (EGFR), an upstream target related to RAS [17]. RAS testing is currently required in metastatic CRC in order to rule out the presence of missense mutations, and select patients for anti-EGFR antibodies [17]. In non-small cell lung cancer (NSCLC), RAS mutations have also been related to limited benefit from EGFR tyrosine kinase inhibitors [18]. Subsequently, EGFR mutations per se were shown to determine sensitivity to these inhibitors, and are mostly mutually exclusive with RAS [2]. Several other genomic alterations have been shown to determine sensitivity to specific targeted therapies in NSCLC [19]. Routine testing of gene mutations or fusions involving EGFR, ALK, RET, ROS1, MET, BRAF, ERBB2, and NTRK are recommended to select patients for targeted therapy [20].
KRAS G12C mutant has been exploited as a potential target for novel therapies. Indeed, the codon 12 cysteine residue may serve as a binding pocket for covalent inhibitors, stabilizing KRAS in an inactive, GDP-bound form [21]. Two novel compounds have been developed to act against KRAS G12C and are under clinical development, sotorasib (AMG 510) [21] and adagrasib (MRTX849) [22]. Understanding the frequency of KRAS G12C and co-occurring mutations is crucial to define strategies for cancer control. Herein we present a large review of CRC and NSCLC tumor samples submitted for KRAS testing in a clinical certified laboratory.

Study design
We retrospectively analyzed deidentified data from CRC and NSCLC tumor samples submitted to KRAS testing in a clinical certified laboratory (Progenética, Grupo Pardini -Brazil) between 2017 and 2019. Available clinical data were reviewed and annotated. No patient had duplicated tumor samples analyzed. Samples were previously acquired for clinical purpose, included either biopsy or resection, and were all preserved in formalin-fixed paraffin-embedded (FFPE) tissue blocks. Following the laboratory routine, tumor slides were examined to confirm sample diagnosis and adequacy for sequencing. At least two 4-6 micra tumor slides were required, with tumor content of 5% or more. DNA was extracted using an automated protocol (QIAsymphony DSP DNA Mini Kit, Qiagen; Hilden, Germany), and quantified using the Qubit 2.0 fluorometer (Thermo Fisher Scientific, Waltham, Massachusetts, US). A minimum of 10 ng of genomic DNA was required for targeted sequencing.
CRC samples were tested for KRAS and NRAS mutations using pyrosequencing (Therascreen KRAS Pyro Kit and Therascreen RAS Extension Pyro Kit, Qiagen; Hilden, Germany), while NSCLC samples were tested for KRAS, NRAS, BRAF, and EGFR by next generation sequencing (NGS; Thermo Fisher Technology; Waltham, Massachusetts, United States). Co-occurring mutations were confirmed by an orthogonal method. The pyrosequencing was performed according to handbooks of Kits' manufacturer (Therascreen KRAS Pyro® Handbook and Therascreen RAS Extension Pyro® V2 Kit Handbook). For NGS analysis, raw sequence data were mapped to the hg19 human reference genome using Torrent Mapping Alignment Program aligner. Torrent Coverage Analysis Plugin implemented in v5.0 of the Torrent Suite software (Thermo Fisher Scientific) was used to perform initial quality control and used to assess amplicon coverage for regions of interest. Regions were obtained after filtering of uniformity (> 80%), on-target reads (> 50%) and minimum mapped reads of 25.000. Variant calling was carried out using Ion Reporter v5.0 (Thermo Fisher Scientific) with the default setting of somatic parameters (minimum variant quality of 10, minimum coverage of 5×, maximum strand bias of 0.95, and minimum variant score of 6). The lowest limit of detection for low-frequency variants was 5%. Variants were annotated using the following databases: 5000Exomes V.

Statistical analysis
Clinical and molecular data (age, gender, geographic region, histology, and mutation status) are described in absolute numbers and frequencies. Age is also provided as mean and standard error (SE). Categorical variables were compared using Pearson's Chi-square test. KRAS status was classified as G12C mutant, non-G12C mutant ("KRAS Others"), and wild type. All comparisons included two-tailed tests, with level of significance set at 5%. All analyses were performed using IBM SPSS software (version 26.0; IBM Corporation, Armonk, NY, USA). All methods were carried out in accordance with relevant guidelines and regulations.

Ethical considerations
The protocol was reviewed and approved by local Ethics Committee (Research Ethics Committee of the Minas Gerais Social Security Institute -IPSEMG). A waiver for the informed consent was also approved by the Research Ethics Committee of the Minas Gerais Social Security Institute since all patients had previously signed an authorization for testing, and data were collected retrospectively through molecular reports review. No information capable of identifying patients was used.

Patient and sample characteristics
The dataset comprised 4897 CRC and 4686 NSCLC samples (Table 1). Mean age was 59.8 years (SE 0.186) and 65.6 years (SE 0.170) among CRC and NSCLC patients, respectively. Patient cases were well balanced according to gender in both cohorts (51.8 and 45.7% males, respectively). Brazilian Southeast and South regions were the most frequently represented, however all regions were present in the complete dataset. Adenocarcinoma was the most frequent histological subtype in both cases (87.8 and 81.4%, respectively). All KRAS G12C mutations were caused by a c.34G > T missense mutation.
Among 4686 NSCLC samples, KRAS mutations were found in 1004 (21.4%), mostly at codon 12 in 875 (18.7%) and codon 13 in 64 (1.4%). The distribution of KRAS G12 mutations was significantly distinct from CRC data (p < 0.001; Figs. 1 and 2). As opposed to CRC samples, G12C was the most common mutation in KRAS, in 346 cases (7.4%), followed by G12V in 206 (4.4%) and G12D in 188 (4.0%). Mean age of KRAS G12C patients was 67.0 years (SE 0.486), and half were males (50.0%). In contrast to CRC data, gender was not associated with KRAS mutation status (p = 0.229). On the other hand, significant associations were described between KRAS status, region (p = 0.003), and age (p < 0.001). The frequency of KRAS G12C was higher in the South and Southeast regions (8.2 and 8.1%, respectively), as opposed to 5.1 and 3.6% in the Northeast and North,  Fig. 4 and Supplemental Table 2).

Discussion
KRAS G12C has become a promising target for novel directed therapies in solid tumors [23], adding up a new role for routine KRAS testing. Given the lack of comprehensive and up to date information on the frequency of KRAS G12C and its characteristics of presentation, our group reviewed data from a large cohort of CRC and NSCLC samples (almost 10 thousand samples overall) submitted to KRAS testing in a clinical certified laboratory. Our data provide relevant input to guide future efforts in the field.
We described a frequency of KRAS G12C of 3.4% in CRC and 7.4% in NSCLC, which is higher than rates of several other driver mutations currently in clinical use to guide targeted therapies [24]. In absolute numbers, these figures represent a large volume of patients that may benefit from novel directed therapies. The percentages described herein are in line with COSMIC database, where a frequency of 2.6 and 6.9% were reported in colon and lung adenocarcinomas, respectively [16,25]. In Brazil, a comprehensive review of 8234 metastatic CRC patients found KRAS mutations in 31.9% [26]. G12C was present in 7.9%, but ranged from 7.1% in the Southeast to 12.2% in the North region [26]. The frequency of KRAS G12C was slightly higher in females  than in males (8.6 and 7.1%, respectively) [26]. No association was observed between age, gender, region, and KRAS G12C status in the current CRC cohort. KRAS mutational data were described in a review of 513 lung adenocarcinomas from Brazilian patients profiled by NGS [27]. KRAS G12V and G12C were the most common mutations, in 6.9 and 6.7%, respectively. Analysis of clinical and demographic characteristics was not provided. In a review of 5738 NSCLC samples from Latin America, KRAS mutations were found in 14% of samples, however no details were offered on KRAS G12C [28]. The current NSCLC data adds up by providing detailed KRAS analysis, and comparisons to clinical and demographic information. Our group found that KRAS G12C was significantly more frequent in older patients, as well as those from South and Southeast regions. A plausible explanation is the higher proportion of tobacco-related diseases at older age, hence linking this risk factor to KRAS G12C-related NSCLC. Also, South and Southeast are the regions with highest consumption of tobacco in the country [29], reinforcing the relationship between tobacco and KRAS G12C-mutant NSCLC. South and Southeast are also regions with the highest income in the country; molecular assessment may be more appropriate in these areas.
In agreement with COSMIC [25], KRAS G12C was the most frequent mutation in NSCLC, while KRAS G12D and G12V were the most common in CRC. The clinical observation that KRAS mutations differ in position and type of substitution according to cancer type is an intriguing, and yet unexplained phenomenon [16]. It has been suggested that distinct carcinogens may act differently to cause specific KRAS mutations and create selective pressures that will guide the process of tumor initiation in each tissue type [16]. For instance, KRAS G12C mutation is universally caused by a single nucleotide substitution at position 34, G > T. This transversion is commonly found in mutational signatures associated with tobacco-related carcinogens in lung cancer [30]. On the other hand, KRAS G12D is caused by a G > A transition characteristic of CRC mutational signatures [30].
KRAS mutations are classic driver oncogenes, and rarely co-occur with other targetable mutations. In our dataset, 21 KRAS G12C-mutant NSCLC cases (6.1%) presented reportable co-mutations, however only 11 (3.2%) were canonical nucleotide variations with an allelic frequency of 5% or more. In a large database including 1078 KRAS-mutant NSCLC samples [31], EGFR mutations were found in 1.2%, BRAF in 1.4%, and NRAS in 0.5%. KRAS G12C was associated with a higher frequency of ERBB2 amplification and ERBB4 mutations, while PTEN and BRAF mutations were less common than in the total cohort [31]. The finding that 2 or more gene mutations may co-occur within the same tumor likely reflects a multiclonal presentation that leads to tumor heterogeneity [31]. KRAS mutations may also emerge as a mechanism of secondary resistance to targeted therapies [32]. KRAS mutations commonly present alongside mutations in TP53, STK11, and KEAP1 and these co-mutations define distinct signature with potential clinical implications [33]. These genes were not assessed in our database.
The current study was based on laboratory reports; therefore limited clinical data were available. For instance, information on smoking status, disease stage, treatment, and outcomes were not provided. On the other hand, a large sample size was reached in a short and contemporary timeframe, which supports the reproducibility of our dataset. In addition, data analyzed was representative of the 5 geographic regions in the country. Brazil spans a large territorial extension and there may exist some degree of genetic background heterogeneity in the population. Differences in risk factors and disease presentation are expected, further supporting the strength of our results. It should be noted that North and Northeast regions were underrepresented in the current study, which may have caused an imbalance in the analysis. Future studies should focus on these regions to increase the power to confirm our findings. Lastly, modern sequencing platforms were applied in the current analysis, including pyrosequencing and NGS. The greater accuracy of these methods makes results more reliable than past publications in the field.