Exploring the prognostic significance of PKCε variants in cervical cancer

Background Protein Kinase C-epsilon (PKCε) is a member of the novel subfamily of PKCs (nPKCs) that plays a role in cancer development. Studies have revealed that its elevated expression levels are associated with cervical cancer. Previously, we identified pathogenic variations in its different domains through various bioinformatics tools and molecular dynamic simulation. In the present study, the aim was to find the association of its variants rs1553369874 and rs1345511001 with cervical cancer and to determine the influence of these variants on the protein-protein interactions of PKCε, which can lead towards cancer development and poor survival rates. Methods The association of the variants with cervical cancer and its clinicopathological features was determined through genotyping analysis. Odds ratio and relative risk along with Fisher exact test were calculated to evaluate variants significance and disease risk. Protein-protein docking was performed and docked complexes were subjected to molecular dynamics simulation to gauge the variants impact on PKCε’s molecular interactions. Results This study revealed that genetic variants rs1553369874 and rs1345511001 were associated with cervical cancer. Smad3 interacts with PKCε and this interaction promotes cervical cancer angiogenesis; therefore, Smad3 was selected for protein-protein docking. The analysis revealed PKCε variants promoted aberrant interactions with Smad3 that might lead to the activation of oncogenic pathways. The data obtained from this study suggested the prognostic significance of PRKCE gene variants rs1553369874 and rs1345511001. Conclusion Through further in vitro and in vivo validation, these variants can be used at the clinical level as novel prognostic markers and therapeutic targets against cervical cancer.


Background
Cervical cancer has the fourth highest incidence rate and remains the fourth highest cause of cancer-related deaths in females around the world, with 342,000 deaths and 604,000 new cases reported globally in 2020.According to WHO cancer statistics in 2020 cervical cancer is the one of the most prevalent cancer in Pakistani female population with third highest mortality rate and 3,197 estimated deaths [1,2].Chronic infections with high-risk human papillomavirus (HPV) are responsible for 99.7% of cases of cervical cancer [3].However, progression to cervical cancer is a gradual process and involves three stages of pre-cancer lesions (i.e., cervical intraepithelial neoplasia: CIN-I, CIN-II, CIN-III) prior to the development of invasive cervical cancer [4,5].Factors other than HPV infection, including reproductive and sexual factors, oral contraceptives, family history, tobacco smoking, and socioeconomic status.Moreover, various studies have proved that genetic factors of hosts such as SNPs, also contribute to the development of cervical lesions and cervical cancer [6,7].
PKCs belong to AGC-kinase family of serine/threonine kinases, and they are known to act as intersection points in various cellular processes and signal-transduction pathways [8,9].PKCε (alternative name KPCE) is an isoform in the calcium-independent novel PKC sub-family [10].It is a multi-functional protein involved in various cellular processes, encoded by PRKCE gene [11,12].Studies have shown that it is a transforming oncogene and its elevated levels regulate various signal transducers in oncogenic processes in various malignancies including cervical cancer [13][14][15][16][17]. Various studies have reported that expression of PKC isoforms is elevated in cervical cancer [18,19].
Single nucleotide polymorphisms (SNPs) are the most common form of genetic variants occurring in human [20].SNPs in the coding sequence of genes might have deleterious effects on the structure and function of the protein that may lead to or contribute to oncogenesis [21,22]; as such, SNPs can be regarded as the genetic basis and as potential genetic markers for cancer susceptibility of an individual [23].There are various SNPs identified in genes IL10, TP53, XRCC1, PAX8, CLPTM1L, HLA-A, and TGFBR2 which are associated with cervical carcinoma in different populations globally [4,24,25].Genetic variants in PRKCE alter the structure and functions of the protein, ultimately leading towards cancer [26].A few SNPs in PKC isoforms (rs454006 of PRKCG; rs546950 and rs4955720 of PRKCI; rs9907521 of PRKCA) have been reported to be associated with various cancers [27][28][29][30][31]. Recently, Khan et al. reported that several missense variants present in different domains of PKCε are responsible for altering the normal functioning of the protein and might play roles in disease contribution.The C2-like regulatory domain of PKCε is involved in regulating interactions with proteins, which are essential for controlling PKCε activation.Variants present in the regulatory domain tend to affect PKCε`s flexibility, leading to deviation in the ability of PKCε to make protein-protein interactions that might be associated with disease progression [32].In a recent study it was indicated that the PRKCE variant rs1553369874 was associated with HCVinduced HCC [33].However, so far, no published data has reported the association of these variants with the risk of cervical cancer in in vivo or in vitro studies.
In this study, two missense variants rs1553369874 and rs1345511001 of PRKCE, resulting in amino acid substitutions E14K and D39H, respectively, in the C2-like regulatory domain of the protein were selected.Their genotype frequencies were determined in cervical cancer patients and healthy control groups to reveal their possible association with the risk of cervical cancer in Pakistani population.Additionally, the findings of this study can be extended beyond the Pakistani population to the global context in the future, as these PKCε variants might be associated with the risk of cervical cancer in other populations as well, hence holding the potential to be used as a novel biomarker for the prognosis of cervical cancer.
Protein interactions studies have proven to be helpful in the identification of cancer biomarkers by providing the knowledge about the physiological functioning of these biomolecules and their interactions occurring in an organism [34,35].Therefore, to evaluate the functional impact of these variants on the protein-protein interactions of PKCε, molecular docking was performed using HADDOCK [36] with Smad3, an essential component in TGF-β signaling pathway known to be regulated by PKCε [16,37].Smad3 is also known to mediate carcinogenic processes, particularly in cervical cancer [38,39].Lastly, the predicted docked structures were evaluated through molecular-dynamic simulations for further analysis and to attain improved insight to the molecular interactions of PKCε.

Sample collection and processing
Prior to the initiation of current study, ethical approval from institutional review board (IRB No. 10-2021-01/01) was obtained from parent department Atta-ur-Rahman School of Applied Biosciences (ASAB) of National University of Sciences and Technology (see online source 1).Written and oral informed consent were attained from each participant before the collection of blood samples (see online source 2).All methods were carried out in accordance with relevant guidelines and regulations.
In this study, 95 female cervical cancer patients were ascertained from combined military hospital (CMH) after a thorough process of histological diagnosis.An equal number of control samples were also randomly collected.Patients suffering from HPV-induced cervical cancer were included for this study, while patients suffering with co-morbidity leading to cervical cancer and HIV co-infections were excluded from this study.Blood samples were obtained from the participants and stored in anticoagulating EDTA tubes for genotype analysis.

Genotype analysis
DNA from whole blood samples of study subjects was extracted through organic (phenol-chloroform) extraction protocol [40].The primers for variants analysis were designed using Primer 1 software [41].Four primers were designed for genotyping of each genetic variant.Two outer primers and two internal primers were designed in such a manner that they must amplify that region of the gene which carrying the targeted variants.Primers were also validated by UCSC In silico PCR [42].The genotyping analysis for variants rs1553369874 (E14K) and rs1345511001 ( D39H) was performed through Tetra-ARMS PCR in Veriti™ 96-Well Thermal Cycler of Applied Biosystems™.Reaction mixture of 25 µl was prepared for every sample using Solis BioDyne FIREPol® Master Mix with 7.5 mM MgCl 2 .The primer sequences and parameters for PCR-reaction are given in Table 1.The amplified product was analyzed by agarose electrophoresis (2% W/V) under UV-transilluminator.

Statistical analysis
Statistical analysis on obtained genotyping data was performed using GraphPad prism 9.0 software (California, USA).Fisher exact test was applied on both cervical cancer and control samples.The analysis of Odds ratios and relative risk was also performed and their respective confidence-intervals were also identified.The P-value ≤ 0.05 was regarded as statistically significant.

Predicting the effect of variants on secondary structures of mRNA
To predict and study the effect of genetic variants on the secondary structure of PKCε mRNA an online mRNA structure prediction bioinformatics tool RNA-fold was used (http://rna.tbi.univie.ac.at/) [43,44].The minimum free energy (MFE) for wildtype and variants mRNA was predicted.MFE and secondary structures of wild-type and variants mRNAs were compared and analysed for the structural stability.

Protein-protein interaction analysis
To determine the impact of amino acid variants in PKCε on its interactions with target proteins, molecular docking was performed.The 3D structure of Smad3 used in molecular docking simulation was predicted from I-TASSER https://zhanglab.ccmb.med.umich.edu/I-TASSER/that use fold recognition/threading approach for 3D modeling of protein structures [45].3D structure of Smad3 was selected based on the highest C-score value depicting the global accuracy of the predicted model as well as low RMSD and local error values.The protein structures were further validated using Ramachandran analysis using PROCHECK [46].Smad3 was chosen after the verification from literature that it is a target protein of PKCε and it is involved in the regulation of cancer cell metabolism and proliferation [16,37,47].The variant structures of PKCε were attained through In silico mutagenesis using PyMol v4.0.4 [48].For first variant glutamic acid at residue position 14 was substituted with lysine and aspartic acid at residue 39 was mutated to histidine in the second variant structure and they were subsequently used for molecular docking and moleculardynamic simulations.Three Molecular docking simulations (WT-Smad3, E14K-Smad3, D39H-Smad3) were performed through HADDOCK SERVER 2.4 [36].The cluster with lowest Z-value was selected from each docking simulation and docked structure from each cluster was chosen for further analysis.

Interaction dynamic analysis
The dynamics of molecular interactions between native PKCε and its variants with Smad3 were studied through MD simulations using GROMACS 2016 [49], with OPLS-AA force field [50] that was used for the simulation of wildtype and variants PKCε/Smad3 complexes.A cubic box was formed around each complex for solvation by adding SPC216 water molecules, and subsequent neutralization by incorporating Na + / Cl − ions.Initial energy minimization of MD-simulations was performed for 50,000 steps, followed by NVT and NPT equilibration for 100ps, through steepest descent.From the same random seed, trajectories of MD complexes were initiated.MD-simulations for wildtype and variants complexes were performed for the production run of 10ns.Dynamic trajectories were constructed using in-built programs of GROMACS 2016 (gmx_ trjconv).The visualization of docked structure was performed through LIGPLOT [51] and contact analysis of MD simulations was mostly performed using VMD [52].Several structural analyses of complexes were also performed.Calculation of root mean square deviation (RMSD for protein backbone) was performed using command gmx_ rms, radius of gyration (Rg for protein, backbone) was calculating using gmx_ gyrate command, analysis of solvent accessibility surface area (SASA) number of hydrogen bonds was performed using gmx_ sasa and gmx_ hbond commands respectively.All MD analyses were plotted as scatter line plots.

Pathway construction
In order to construct a molecular pathway illustrating the impact of amino acid variations in PKCε and its subsequent effect on cellular signalling, Kyoto Encyclopedia of Genes and Genomes (KEGG: hsa:351) [53] and Gen-eMANIA [54] were employed.Furthermore, STRING database [55] was used to predict and analyze genetic interactions while functional annotation of genes and the genetic pathway was attained through DAVID database [56].

Clinical features of cervical cancer in patients
The current study involved 95 cervical cancer patients.The general features of the patients are presented in Table 2.The ages of cervical cancer patients range from 35 years to 66 years, whereas their median age was 51 years.Furthermore, the median age of the control group was 41 years, with the range of ages from 30 years to 64 years.

Association of prkce genetic variants with cervical cancer and control
DNA extracted from the samples was genotyped for the presence of missense variants rs1553369874 (g.45,652,140 G > A) and rs1345511001 (g.45652215G > C) in the PRKCE gene; these variants result in the substitution of Glutamic acid (E) to Lysine (K) at residue 14 and Aspartic acid (D) to Histidine (H) at position 39.Tetra ARMS-PCR was used for the process of genotyping.This technique utilizes four primers that amplify the targeted gene sequence and result in an outer control band and inner genotype bands.
The frequency distributions for genotypes of both PKCε genetic variants rs1553369874, and rs1345511001 for control and cervical cancer samples are presented in Table 3.It was found that for variant rs1553369874 genotype GG was associated with elevated risk for cervical cancer (Odds Ratio (OR) = 2.571, Relative Risk (RR) = 1.609,P = 0.0022); whereas, genotype AA was found to have protective role in this regard (OR = 0.1566, RR = 0.3080, P < 0.0001).For variant rs1345511001, genotype GG was found to be significanty (associated with increased risk of cervical cancer (OR = 2.363, RR = 1.548,P < 0.0056); whereas, genotype CC was statistically associated with a protective role in cervical cancer (OR = 0.5128, RR = 0.7031, P = 0.0456).
Allele frequencies of both PKCε variants are described in Table 3. Analysis revealed that the frequency of the G-allele in variant rs1553369874 was significantly higher in the cervical cancer group compared to the control group, (OR = 0.3153, RR = 0.5914, P = 0.0004).The A-allele was also found to be significantly different between the two groups, and the OR and RR indicated that a pathogenic role (OR = 3.171, RR = 1.691,P-value = 0.0004).Similarly, the frequency of the G-allele in variant rs1345511001 was higher in the cervical cancer group compared to the control group, and the G allele associated with a protective role (OR = 0.4487, RR = 0.6809, P-value = 0.0116).The frequency of the C-allele in this variant was increased in the control samples and has role in elevated risk of cervical cancer (OR = 2.228, RR = 1.469,P-value = 0.01).

Association of prkce genetic variants with clinical characteristics of cervical cancer patients
Frequency distribution of genotypes for both variants in different age groups are given in Table 4. Genotypes GG and GA of variant rs1553369874 were not statistically significant in either age group; however, genotype AA of the variant showed statistical significance for subjects of both age groups (≤ 50 years: OR = 3.596, RR = 1.439,P = 0.0118; >50 years: OR = 26.11,RR = 5.185, P = 0.0016).The CC-genotype of variant rs1345511001 showed significance for the age group of ≤ 50 years (OR = 2.368, Genotype distribution frequencies among non-metastatic and metastatic patients were also determined in Table 5.In variant rs1553369874, genotypes AA and GA were shown to be statistically correlated with cervical cancer metastasis; genotype AA has a damaging role whereas genotype GA was shown to have a protective  role.None of the genotypes in variant rs1345511001 showed any statistical correlation with metastasis of cervical cancer.Lastly, the distribution of genotypes among the cervical cancer patients of stage I-II and stage III-IV were determined (Table 5).It has been reported here that only genotype AA of variant rs1553369874 showed significant correlation as a risk factor in patients with stage-III or stage-IV of cervical cancer.Genotype distribution of variants with cervical cancer patient`s clinical features is given (see online source 3).

Impact of genetic variants rs1553369874 and rs1345511001 on prkce mRNA structure
In silico analysis was performed to predict the secondary structures of mRNAs of wildtype PKCε and its selected variants.Minimal free energy (MFE) for the wildtype and variants was also determined and analyzed.The secondary structures of mRNA for rs1553369874 and rs1345511001 showed a dramatic change when compared to their wildtype counterparts, demonstrating the significant effect of variant alleles on the overall structure of mRNAs (Fig. 1).MFE calculations for alleles of the variant rs1553369874 predicted that MFE for the reference G allele was − 5.1 Kcal/mol, whereas MFE was elevated for variant allele A with the value of -4.2 Kcal/mol.The variant rs1345511001 showed a decrease in MFE value for the wildtype allele G (MFE= -3.3 Kcal/mol) compared to the variant allele C with MFE value − 0.7 Kcal/mol (see online source 4).The decreased values of MFE for the reference alleles in both variants depicts increased structural stability [57].Whereas the altered alleles for both variants had increased MFE values compared to their wildtype counterparts and thus were structurally less stable.The altered alleles in both variants are associated with cervical cancer risk (see Table 3) which might be the result of decreased mRNA structural stability.

Analysing intramolecular interactions between PKCε and Smad3 complexes
Molecular interactions of PKCε with Smad3 were also determined and molecular docking was performed between PKCε (wildtype, E14K, D39H) and Smad3 to predict and compare the nature of interactions of PKCε prior to and after the occurrence of missense variations.
3-dimensional structures of Smad3 and PKCε were predicted through I-TASSER and the models with highest C-score values were selected which indicated the quality of the predicted 3D models.Three docking simulations were carried out between wildtype PKCε and Smad3, E14K and Smad3, and D39H and Smad3 using HAD-DOCK SERVER 2.4 and clusters of docked models were generated.The cluster with lowest value of Z-score was selected from each docking simulation, which indicates that those clusters have lowest values of standard deviation, and the best protein interaction model from each cluster was selected that corresponds to the highest score structure [58,59].It was particularly noteworthy that in the computational structures obtained from docking, PKCε (wildtype, E14K, D39H) approached Smad3 mostly through its catalytic domain.This finding is exactly in accordance with the previous findings that Smad3 is a phosphorylation substrate of PKCε [16,37].While analyzing the intramolecular interactions between native PKCε and its variants with Smad3, it was determined that variants E14K and D39H interacted with Smad3 via catalytic domain residues by forming increased numbers of hydrogen bonds and hydrophobic interactions.Whereas native PKCε interacts with Smad3 by making fewer hydrogen bonds and hydrophobic interactions.It was also observed that the wildtype PKCε and its variants are involved in strong interactions with the following residues of Smad3: Ser213, Pro214, Ala215, His216, Asn217, Asn218, Asp220.The visualization of intermolecular interactions between wildtype and variants PKCε with Smad3 in the docked structures was performed through LigPlot + v.2.2.4 as shown in online source 5. To further validate the molecular interactions of the protein-protein interaction complexes, PDBsum server was used which presents the interaction-statistics as well as depicts the protein interacting sites and interacting residues [60].Furthermore, this tool also provides the information about the types of the molecular interactions present in protein-protein interaction complexes which includes hydrogen binds, salt-bridges, disulfide bonds, and non-bonded interactions.It was revealed that the linker domain residues of Smad3 made interactions via salt bridges, hydrogen-bonds, and non-bonding interactions with the kinase domain of PKCε that resulted in Smad3 activation (Fig. 2).

Interactions dynamic analysis of wildtype and variant PKCε -Smad3 complexes
In order to understand the consequences of amino acid variants on the structure of PKCε and dynamics of its interactions with Smad3, MD-simulations were performed that predict the molecular motion and behavior of protein complexes in four dimensions [61].GROMACS tool was used to carry out simulations between wildtype PKCε and its variants with Smad3 for a time period of 10 ns and data for structural analysis of all three complexes was obtained (see online source 6).RMSD (root mean square deviation) analysis was performed to analyze the average residual deviations in the protein structures over the course of simulation [58].RMSD values for PKCε and its variants are given in the figure (Fig. 3a).RMSD values increased to 0.3 nm for all three structures during the 0 ns -1 ns period of simulation.This value was further increased to 0.5 nm for all structures at 4 ns.After that, RMSD became more or less stable for the wildtype PKCε structure till the end of simulation.However, RMSD for both variants started to increase from the time interval of 5 ns and reached the value of 0.63 nm till the end of simulations, though during this period RMSD values for both variants remained constant.
To evaluate the variation in accessibility of surface area of the protein complex imparted by amino acid variants, SASA (solvent accessibility surface area) analysis was carried out.The data presented in Fig. 2 demonstrates that values of SASA were significantly increased from 540 nm 2 to 620 nm 2 during 0 ns -1 ns of simulations (Fig. 3b).SASA values were further increased to 640 nm 2 for the native PKCε till 2 ns; however, after that SASA values gradually decreased and at 10 ns this value was 620 nm 2 for wildtype PKCε.SASA values for both variants showed dramatic fluctuations in their values.Particularly, during the time frame of 6 ns -8 ns SASA for E14K was significantly reduced to 590 nm 2 , whereas SASA for D39H was recorded to be 633 nm 2 .However, during the interval of 8 ns -10 ns SASA for all three complex remained stable around 610 nm 2 .
To determine the effect of variants on the compactness of the protein complexes, the Rg (radius of gyration) was evaluated for MD-complexes.Data from the simulations is presented (Fig. 3c).The variant E14K has Rg value of 3.6 nm at the beginning of the simulations that reached value of 3.7 nm at 2 ns.This value remained somewhat constant till 4.8 ns.However, Rg value was slightly decreased to 3.65 nm during the time frame of 5 ns -9 ns, but by the end of simulation, its value was approximately 3.7 nm.Rg for wildtype has initial value of 3.25 nm whereas for D39H its value is 3.35 nm.During 0 ns -2 ns, Rg for D39H remained higher compared to the wildtype but this value was approximately equal to 3.4 nm during the time period of 2 ns -8 ns; however, its value decreased once more to approximately 3.36 nm during the last 2 ns.
Hydrogen bonds plays a critical role in the molecular interactions and maintenance of 3D structures of proteins.Number of hydrogen bonds that are formed in protein structures during the course of simulations are determined to analyze the effect of variants on protein interactions.It can be seen that during the period of 0 ns -2 ns, the number of hydrogen bonds decreased from 700 to 600, but after that interval the number of hydrogen bonds remained stable throughout the course of simulations (Fig. 3d).Interactions dynamic analysis that the number of hydrogen bonds formed in E14K and D39H complexes are slightly higher compared to that in the wildtype PKCε, showing that the molecular interactions which are formed between variants with Smad3 are more potent and stable compared to that of the wildtype.
The distances between interacting residues of PKCε (wildtype, E14K, D39H) and residues of Smad3 were also determined.The residues selected for this purpose were involved in intramolecular interactions by forming hydrogen bonds.Three pairs of such interacting residues were randomly selected and their distances were measured at three different time intervals (0 ns, 5 ns, 10 ns) of MD simulations.In the wildtype PKCε -Smad3 complex, the distance between the interacting residues were determined at 0 ns and 10 ns.It was shown that at 10ns the distance between Asp451 of PKCε and His216 of Smad3 was decreased to 13.51Å which was 15.14Å at 0ns.The distance between His525 of PKCε and Pro211 of Smad3 was also significantly decreased to 5.68Å at the end of simulations, which was 13.63Å at 0ns.While the distance between Gly561 and Asp193 was increased to 12.92Å from 8.06Å at the end of simulations (Fig. 4a).Similarly, distances between the interacting residues of E14K-Smad3 depicted that the bond length between the residues Asp699 (PKCε) and Asn197 (Smad3) as well as Tyr468 (PKCε) and Asp220 (Smad3) were decreased to 14.57Å and 7.08Å at 10ns compared to 17.85Å and 12.76Å at 0ns, respectively.However, the bond length between the interacting pair Lys437 (PKCε) and Pro209 (Smad3) was increased to 14.09Å from 6.68Å at the end of simulations (Fig. 4b).Lastly, bond distances between the interacting residues of D39H and Smad3 were also determined which also exhibited similar trends in their bond lengths.The distance between interacting resides Thr566 (PKCε) and Tyr226 (Smad3) was 4.94Å at 10ns,  at 0ns the bond distance was recorded to be 5.65Å.Similarly, the bond distance between Gly557 (PKCε) and His216 (Smad3) was reduced to 14.86Å at 10ns from 16.38Å recorded at the start of the MD simulations (Fig. 4c).Distances between the interacting residue pairs for all three complexes are described in Table 6.

Discussion
Several studies have demonstrated that genetic variants in PKCε and its isoforms are associated with several diseases, particularly cancers [62-65] [66].The present study has explored the role of missense variants in PRKCE in association with cervical cancer in Pakistani female population.Studies have demonstrated that PRKCE non-synonymous variants rs1553369874 and rs1345511001 might have a damaging role leading to cancer susceptibility [32].The results of this study have suggested that variants rs1553369874 and rs1345511001 in PRKCE are associated with risk of cervical cancer susceptibility in Pakistani population.
It was suggested from the tetra ARMS-PCR data that genotypes GG of rs1553369874 and GG of rs1345511001 were associated with increased risk of cervical cancer, whereas their counterpart genotypes AA and CC were most prevalent in healthy controls and imparted protective effect in Pakistani female population.This result was consistent with reported studies where variant genotypes were not associated with disease progression, rather they were imparting a protective effect in this regard [67][68][69][70].Though allele A of rs1553369874 and allele C of rs1345511001 had high distribution frequency in control samples, their ORs, and RRs were higher compared to that of reference alleles of both variants, respectively.This discrepancy in data might have occurred because of small cohort size.
The cervical cancer incidence rates are highest in woman of age 35-50 years, whereas they are low for females of age 51-65 years [71].It was found that genotype AA of rs1553369874 was acting as a risk factor in women of both age groups, while only genotype CC for the variant rs1345511001 was found to have detrimental role in females of age ≤ 50 years.Association of genetic variants with tumor metastasis state and tumor stage was also determined and genotype AA of rs1553369874 was shown to be correlated with advanced stage cervical cancer and secondary site metastasis.
The findings of this research have the potential to be implicated in cervical cancer studies beyond the studied population.These PRKCE variants may contribute towards the risk of cervical cancer in various population, as similar SNPs have been corelated with the risk of cancers in previous studies.For instance, previously studies have identified genetic variants rs546950, rs4955720 in PRKCI gene which are associated with the risk of prostate cancer in Han Chinese [30] and Iranian populations [67].Similarly variant rs1801270 in CDKN1A gene are associated with development of cervical cancer in Han Chinese [69], Brazilian [72], and Iranian populations [73].Nonetheless, further research is required in diverse population with large cohort size to validate the finding of this study as well as to assess the global significance of these variants with cervical cancer.
Secondary structures of mRNAs of both variants and their corresponding wildtype were predicted and the difference of mRNA structures and MFE values between wildtype and variants revealed that the structural stability of wildtype mRNA was higher compared to the variant mRNAs, which could lead to the onset of diseases [57,74].This analysis could be verified from the fact that altered alleles A and C of variants rs1553369874 rs1345511001 were associated with increased cervical cancer risk.Furthermore, secondary structures of mRNAs are responsible for the modulation of translation rates and any changes in mRNA structures can lead to protein misfolding and structural instability [75][76][77].that can lead to severe illnesses including cancer.
Such variants that cause protein structure instability can lead to cancer progression.Destabilizing variants in E-cadherin have been reported to be associated with the development of gastric cancer at early ages compared to the variants that do not alter the protein stability [78].Similarly, missense variants in c-src destabilized the structure of the protein and caused oncogenic transformation [76,79].Moreover, such structural destabilizations can also alter the protein-protein interactions [80].Cervical cancer associated variants in PKCε may also effect its interactions with other proteins, leading towards oncogenic progression.
The effect of PKCε`s non-synonymous variants (E14K, D39H) on functions of the protein was studied along with the effect of amino acid variants on interactions of PKCε with Smad3 that are responsible for tumor cell growth [66] [81].3D structure of PKCε and Smad3 was predicted through I-TASSER that uses threading approach for prediction of protein structures [45].In silico mutagenesis was performed on the predicted 3D structure of PKCε and the two variant proteins were attained.The variants E14K and D39H were shown to be pathogenic in previous studies [32] and this study evaluated their impact on PPI of PKCε with Smad3.
Several studies have reported the role of protein-protein interactions in identification of tumor biomarkers [34].The integration of PPIs network analysis with computational bioinformatics techniques can be considered valuable for getting a deep insight into the intricated biochemical interactions occurring among the proteins present in a physiological system and which can give valuable information for designing potential prognostic biomarkers for cervical cancer [35,82].
PKCε and its variants were than docked with Smad3 to determine the alteration in protein-protein interactions resulting from the single amino acid variation.The interaction parameters included score of desolvation, electrostatic and hydrophobic interactions, and hydrogen bonding [83] which were increased for the variants than in wild type.These results clearly indicate that interactions of E14K and D39H with Smad3 were stronger than that of wild type (online source 7).
The dynamics of molecular interactions of PKCε and its variants with Smad3 were analyzed through MD simulations.The number of hydrogen bonds that are formed within the complex over the course of MD simulations were also slightly higher for variants compared to wild type.This shows that intramolecular interactions between PKCε variants and Smad3 are stronger as hydrogen bonds plays a crucial role in stabilizing protein structure and its activity [84,85].The distance between interacting residues of all three complexes was also analyzed at three random sites.It was determined that those interacting residues of native PKCε/variants and Smad3 were linked through hydrogen bonds.During the course of simulations, the distance between those interacting residues showed alterations.The molecular distances for some residues were increased during the middle of simulations, however, their distances showed shrinkage at the end of simulations.Interactions between variant PKCε-Smad3 complexes were stronger due to extensive hydrogen bonding and shorter length of hydrogen bonds which are critical for the maintenance of stable protein interactions [86].
In this study analysis regarding the impact of nonsynonymous PKCε variants on the crosstalk with Smad3 and the downstream molecular actors was performed.For this purpose, several genetic and protein-interactions databases were used.The information regarding gene-functional annotation and genetic linkage was attained through an extensive literature review and this information was also obtained from several databases including KEGG, STRING, geneMANIA, and DAVID.It was revealed from the studies that PKCε phosphorylates Smad3 at its linker region serine residue resulting in activation and nuclear translocation in a Tgf-β independent manner [81,87].Inside the nucleus activated Smad3 binds the promoter regions of several target genes and activates their transcription.It has been revealed through docking simulations and interaction dynamics analysis that variant PKCε proteins interact with Smad3 in an aberrant manner compared to their native counterpart.The binding affinity between the variant PKCε proteins and Smad3 is significantly higher compared to that in the wildtype-Smad3 complex.Since the binding interactions between both variants and Smad3 are much stronger, it will increase the binding energy of the kinase and Smad3 complex which in turn will lead towards the lowering of activation energy.This lowering in activation energy will assist in the elevated speed of the phosphorylation reaction.When the rates of Smad3 phosphorylation are increased, its activation levels will be aberrantly increased.These abnormally high levels of phosphorylated Smad3 will then mobilize to nucleus where they will frequently bind the promoter regions of target genes including HKII, MCT4, and Hif-1α, resulting in their increased rates of transcription.Elevated levels of these genes are known to regulate key oncogenic processes including Warburg's effect.Moreover, Smad3-dependent increased activation of Hif-1α will further lead to the activation of several other oncogenes, especially VEGF which is a key activator of angiogenesis [88].Activation of both these carcinogenic processes result in rapid proliferation of cancer cells, enhanced tumour survival and poor prognosis in patients (Fig. 5a).Contrarily, when wildtype PKCε forms a complex with Smad3, it does not bind with a high affinity and the binding energy between the native PKCε -Smad3 complex is substantially low.This will result in higher activation energy to accomplish the phosphorylation reaction.Hence, the PKCε induced activation of Smad3 will not increase aberrantly and typical levels of activated Smad3 will localize to the nucleus to activate gene transcription in a normal fashion, which is essential for the regulation of biological processes in the normal cervix-uterine environment including vascularization of cervix-uteri after menstruation and normal biochemical processes which are essential for the generation of energy by cervix epithelial cells (Fig. 5b).

Conclusion
Genotype analysis of PRKCE non-synonymous variants was performed which indicated that the altered genotype-AA of variant E14K with P-value 0.0001, and altered genotype-CC of variant D39H with P-value 0.045, were associated significantly with the risk of the development of cervical cancer.The wet lab analysis also revealed that the variant genotype-AA of E14K is associated with the risk of advanced stage cervical cancer with a P-value of 0.05 and Odds ratio of 7.538.The In silico molecular docking and interaction dynamics analysis of wildtype and variants PKCε suggested the potential role of variants in altered structural stability as well as protein-protein interactions that can have oncogenic implications and can serve as potential biomarkers for cancer.Therefore, these variants hold the potential to be further explored via in vitro and in vivo experimentations.Lastly, functional studies of these variants on large cohort-size should be performed to attain in-depth information regarding the molecular functions of these, so that these variants could be used as new therapeutic targets for the treatment of cervical cancer as well as novel biomarkers for better prognosis and early diagnosis of cervical cancer.

Fig. 1
Fig. 1 Predicted structures of mRNAs using RNAfold (a) rs1553369874 effect on mRNA structure due to the substitution of reference allele G (left) with variant allele A (right) (b) rs1345511001 effect on mRNA structure due to the substitution of reference allele G (left) with variant allele A (right) Darker colors represents high probability for base-pairing whereas lighter colors indicates that these sequences not form base-pair at their alignment position

Fig. 2 Fig. 4
Fig. 2 PDBsum illustration depicting list of interacting amino acid residues of chain A representing PKCε and chain B representing Smad3 (a) List of interacting residues in wildtype PKCε-Smad3 complex (b) List of interacting residues in E14K-Smad3 complex (c) List of interacting residues in D39H-Smad3 complex

Table 1
Sequences and parameters of primers used for genotyping of

Table 2
Clinical features of cervical cancer patients and control RR = 1.334,P = 0.0368).All other genotypes of this variant were non-significant in both age groups.

Table 3
Genotype and allele distribution and association of PRKCE variants rs1553369874 and rs1345511001 with cervical cancer

Table 4
Genotype distribution and association of PRKCE variants rs1553369874 and rs1345511001 with cervical cancer in age groups of ≤ 50 and > 50

Table 5
Genotype distribution and association of PRKCE variants rs1553369874 and rs1345511001 with secondary site metastatic and stages of cervical cancer

Table 6
Bond distance between PKCε and Smad3 interacting residues