Transcriptomics & copy number variation of radiotoxicity points to altered mitochondrial and DNA repair mechanisms

Proctitis is an inflammation of the rectum and may be induced by radiation treatment for cancer. We investigated proctitis as a radiotoxic endpoint in prostate cancer patients who received radiotherapy (n=222). We analyzed the copy number variation and SNP-derived transcriptomic profiles of whole-blood and prostate tissue associated with proctitis. The SNP and copy number data were genotyped on the Affymetrix® Genome-wide Human SNP Array 6.0. Following QC measures, the genotypes were used to obtain gene expression by leveraging GTEx, a reference dataset for gene expression association based on genotype and RNA-seq information for prostate (n=132) and whole-blood tissue (n=369). In prostate tissue, 62 genes were significantly associated with proctitis, and 98 genes in whole-blood tissue. Six genes (CABLES2, ATP6AP1L, IFIT5, ATRIP, TELO2, and PARD6G) were common to both tissues. The copy number analysis identified seven regions associated with proctitis, one of which (ALG1L2) was also associated with proctitis based on transcriptomic profiles in the whole-blood tissue. The genes identified via transcriptomics and copy number variation association were further investigated for enriched pathways and gene ontology. Some of the enriched processes were DNA repair, mitochondrial apoptosis regulation, cell-to-cell signaling interaction processes for renal and urological system, and organismal injury.

Background
Prostate cancer is one of the most prevalent diseases in older men, with 66 years being the average age at the time of cancer diagnosis [1]. According to the SEER cancer statistics of 2019, approximately 3 million men have been previously diagnosed with prostate cancer and are still alive today. This feat can be credited to the advancement in cancer treatment, which has contributed to the 5-year relative survival rate of 90% in prostate cancer survivors [2]. Radiation therapy is one of the primary forms of treatment for prostate cancer, delivered as external beam radiotherapy (EBRT) or brachytherapy.
While the dose and precision of radiation delivery to the tumor tissue have improved over the years, surrounding normal tissues still get irradiated, leading to clinical side effects [3]. Proctitis is the inflammation of the rectum, which can result from receiving radiation therapy around the pelvic region, such as during prostate cancer treatment [4]. The inflammation of the rectum can be either acute or chronic. Acute proctitis appears within 3 months of receiving radiation therapy, and progression of rectal inflammation after 3 months of completing radiation therapy is identified as chronic proctitis [5]. The development of radiation-induced chronic proctitis affects 5-20% of cancer survivors and is relatively more common [6] than acute proctitis, which affects approximately 13% of the cancer population [5]. As of 2016, the population of cancer survivors in the US was estimated to be approximately 15 million, and by the year 2026 it is expected to reach 20 million individuals [7]. Given the prevalence of chronic proctitis among cancer survivors (5-20%), we can deduce that approximately 1-4 million cancer survivors experience proctitis from receiving radiotherapy. The goal of radiation therapy in treating cancer is to damage the DNA of cancer tissue by creating double-strand breaks (DSBs). While the cellular system is capable of repairing breaks in the DNA, strands with DSBs are difficult to restore, leading to activation of apoptotic signals, which ultimately kills cancer cells. Unfortunately, the normal tissue around the targeted region is also affected by DNA damage from radiation, and must rely on DNA repair mechanisms for rehabilitation of cellular functions [8]. A recent twin study has reported that certain SNPs and their transcriptomic influence are associated with individual radiation sensitivity, with a heritability estimate of 66% [9].
Therefore, it is vital to understand the genetics underlying the molecular mechanisms involved in adverse effects of radiotherapy, and the individual genetic variations that may induce radiation sensitivity. Genome-wide studies have been conducted to identify gene variants that may contribute towards developing radiotoxic side effects. These studies have identified genetic variations involved in DNA repair pathways to be associated with overall radiotoxicity [3,10]. However, the role of altered gene expression [9] from aggregated single nucleotide polymorphisms (SNPs) remains elusive in radiotoxicity phenotypes (e.g., proctitis).
Regulatory variants are SNPs within coding regions which contribute towards tissue-specific gene expression alterations leading to variations in the phenotypic spectrum.
Estimating the contribution of SNP aggregates to gene expression can be carried out using correlation weights derived from reference datasets which contain both SNP and RNA-seq information, as modelled in PrediXcan [11]. One such dataset is the GTEx project, an NIH-funded initiative that stores genotype and RNA-seq data of 53 tissues from 620 donors (v7). The majority of the donors in the GTEx dataset are Caucasian, and more than 50% of the donors are over the age of 50 years [12]. These characteristics make GTEx an excellent reference dataset to derive gene expression values from individual-level SNP profiles of prostate cancer patients who have received radiation treatment.
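The weighting idea described above can be illustrated with a minimal sketch. It assumes that the genetically predicted expression of a gene is a weighted sum of an individual's allele dosages, which is the general form used by PrediXcan-style models. The SNP identifiers, weights and genotypes below are hypothetical and chosen only for illustration; real weights come from models trained on a reference dataset such as GTEx.

```python
# Hypothetical per-SNP weights for one gene in one tissue (illustration only;
# real weights are taken from models trained on reference data such as GTEx).
weights = {"rs_a": 0.30, "rs_b": -0.12, "rs_c": 0.05}


def predict_expression(dosages, weights):
    """Return the weighted sum of allele dosages (0, 1 or 2 copies) for one gene."""
    # In this sketch, a SNP that is absent from the genotype contributes nothing.
    return sum(w * dosages.get(snp, 0.0) for snp, w in weights.items())


# Hypothetical genotypes, coded as the number of effect-allele copies per SNP.
individuals = {
    "ind1": {"rs_a": 2, "rs_b": 0, "rs_c": 1},
    "ind2": {"rs_a": 0, "rs_b": 2, "rs_c": 1},
}

predicted = {name: predict_expression(d, weights) for name, d in individuals.items()}
for name, value in predicted.items():
    print(name, round(value, 2))
```

The predicted values capture only the genetically regulated component of expression; they can then be tested for association with a phenotype such as proctitis.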
Beyond SNPs, genetic discordance from gene dosage and structural effects can be attributed to copy number variation (CNV). CNVs are segments of duplicated DNA that are greater than 1 kb with differences in size between the two copies [13]. In a clinical setting, testing of CNVs is relatively more common than other genetic tests [14] due to the majority of phenotypic changes being associated with these variations in segment size. CNVs associated with radiotoxicity phenotypes (e.g., proctitis) have not been investigated extensively and may prove to be significant contributors to phenotype risk. We hypothesize that genotype-derived gene expression profiles and variations in copy number will identify genetic alterations associated with a spectrum of DNA repair functions. Here, we investigate both CNV and tissue-specific (prostate and whole-blood) transcriptomic profiles derived from individual-level SNPs that are associated with radiation-induced proctitis in prostate cancer patients.

Material & Methods
The overall methodology is visually summarized in Fig S1.
Data access to study subjects. The Gene-PARE study was approved by the Institutional Review Board of the Icahn School of Medicine at Mount Sinai and Florida Radiation Oncology Group (Kerns et al. [3,15]). All patients provided informed consent under the parent study, Gene-PARE, at the Icahn School of Medicine at Mount Sinai and Florida Radiation Oncology Group (Kerns et al. [3,15]). We obtained access to anonymized individual-level genotype data from Genetic Predictors of Adverse Radiotherapy Effects (Gene-PARE) (phs000772.v1.p1) via dbGaP's authorized application, under the approval of North Texas Regional IRB protocol 2016-090. The study described here was performed under the North Texas Regional IRB (formerly the University of North Texas Health Science Center IRB), and was given "EXEMPT" status based on the criteria that our study involved data available from a public repository, i.e. dbGaP, and did not require approval for obtaining informed consent. We analyzed prostate cancer individuals from the discovery set (N=367), who were genotyped for 934,940 SNPs on the Affymetrix® Genome-wide Human SNP Array 6.0. The dataset contains phenotypic information on prostate cancer patients who have received radiation treatment either via EBRT or brachytherapy. Out of three radiotoxicity phenotypes (erectile dysfunction, proctitis and urinary morbidity (IPSS/AUASS)), we focused our investigation on proctitis because (1) it is also prevalent in other pelvic region cancers [16] and (2) the dataset for proctitis was complete for all individuals.
SNP-QC. We extracted SNP data from the *.CEL files using Affymetrix® Genotyping Console with the BIRDSEED v2 algorithm, a genotype call rate of 95%, and default settings for the array, leaving 905,280 markers and a total of 355 individuals. The files were then exported to PLINK [17] format to perform QC measures as suggested by Anderson et al. [18]. At the individual-level filtering, we removed 5 individuals for either failing heterozygosity or having greater than 9% missing genotypes. At the IBD filter of [...]
Gene Expression imputation and GSEA. Tissue-specific gene expression prediction using each individual's genotype profile was performed using PrediXcan [11]. The PrediXcan models, accessible at http://predictdb.org/, provide SNP weights for tissue-specific gene expression that were trained using lasso regression on the GTEx (v7) [12] reference datasets. The predicted gene expression is related to the component of gene [...] 3113 genes were predicted and 5954 genes for whole-blood tissue. Following association tests, significant genes (identified as p-value <0.05) were investigated further by constructing tissue-specific protein-protein interaction (PPI) networks between query (significant) genes and interacting genes using the DifferentialNet [20] database and NetworkAnalyst 3.0 [21]. The network was filtered on a betweenness centrality of 4.0 in order to reduce isolated neighboring nodes (each gene is a node). All the genes in the network were subsequently analyzed for gene set enrichment using clusterProfiler [22] for gene ontology and visualized in GOPlot [23].
We identified the study by van Oorschot et al. [24] as an external cohort for replication of our transcriptomic findings. Their study recruited 200 patients who received EBRT for prostate cancer. Prior to treatment, whole blood was drawn and lymphocytes were cultured; the cultured cells were split into two halves, irradiated with 0 Gy and 2 Gy of gamma rays respectively, and analyzed by genome-wide microarray. The data were deposited in Gene Expression Omnibus under accession GSE85570 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE85570). Further details of the study have been described elsewhere [24]. We analyzed differentially expressed genes between groups receiving 0 Gy and 2 Gy of radiation using BART [25] and NCBI's GEO2R. We tested the significance of the overlap between our results from prostate and whole-blood tissue and the genes identified in the replication dataset using GeneOverlap (https://github.com/shenlab-sinai/geneoverlap), which applies Fisher's exact test.
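As a rough illustration of the overlap test described above, the sketch below computes a one-sided Fisher's exact test (the upper tail of the hypergeometric distribution) for the number of genes shared by two lists. It is not the GeneOverlap implementation; the function name, gene labels and background size are hypothetical, and the one-sided alternative is an assumption of this sketch.

```python
from math import comb


def overlap_p_value(genes_a, genes_b, n_background):
    """One-sided Fisher's exact test for the overlap of two gene lists.

    Returns the probability of observing at least the actual number of shared
    genes if list B were drawn at random from a background of n_background genes.
    """
    a, b = set(genes_a), set(genes_b)
    n_a, n_b, n_overlap = len(a), len(b), len(a & b)
    total = comb(n_background, n_b)  # all ways of drawing list B
    # Sum the hypergeometric probabilities for overlaps >= the observed one.
    tail = sum(
        comb(n_a, k) * comb(n_background - n_a, n_b - k)
        for k in range(n_overlap, min(n_a, n_b) + 1)
    )
    return tail / total


# Toy example with hypothetical gene labels and a hypothetical background of 10 genes.
list_a = {"G1", "G2", "G3"}
list_b = {"G2", "G3", "G4"}
p_toy = overlap_p_value(list_a, list_b, n_background=10)
print(p_toy)
```

The result depends strongly on the chosen background size, so the background should reflect the set of genes that could have been detected in both analyses.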

CNV association and GSEA. The *.CEL files of the 222 individuals retained after the above QC protocol were extracted for copy number analysis using Affymetrix® Genotyping Console. Copy number segments were filtered to regions (minimum genomic size of 2 kb) with 10 markers per segment [26]. The copy number data were exported as a tab-delimited file for copy number association in CNVRuler [27].
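The segment filter described above can be sketched as follows. The sketch assumes that both criteria are minimum thresholds (at least 2 kb in size and at least 10 markers per segment); the Segment record, its field names and the example segments are hypothetical and do not reflect the export format of Genotyping Console or the input format of CNVRuler.

```python
from dataclasses import dataclass

MIN_SIZE_BP = 2_000  # minimum genomic size of a segment (2 kb)
MIN_MARKERS = 10     # assumed minimum number of markers per segment


@dataclass
class Segment:
    chrom: str
    start: int      # start position in bp
    end: int        # end position in bp
    n_markers: int  # number of array markers supporting the segment
    state: str      # "gain" or "loss"


def keep_segment(seg):
    """Return True if the segment passes both the size and the marker filter."""
    return (seg.end - seg.start) >= MIN_SIZE_BP and seg.n_markers >= MIN_MARKERS


# Hypothetical segments for illustration only.
segments = [
    Segment("chr1", 1_000, 5_000, 12, "gain"),    # 4 kb, 12 markers: kept
    Segment("chr3", 10_000, 11_500, 15, "loss"),  # 1.5 kb: too short
    Segment("chr11", 20_000, 30_000, 6, "gain"),  # only 6 markers: dropped
]
kept = [s for s in segments if keep_segment(s)]
print([s.chrom for s in kept])
```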

Results
Genes identified in prostate tissue. In the association analysis between prostate cancer individuals who developed proctitis (cases) and those who did not (controls), we found a total of 62 differentially expressed genes to be significantly associated. Based on z-score direction, 28 genes were downregulated, and 34 genes were upregulated in the prostate tissue (Table S1). We mapped the genes to the tissue-specific protein-protein interaction (PPI) network to understand combined functional effects of [...] (Table S2).
Genes identified in whole-blood tissue. We found a total of 98 genes to be associated with proctitis in whole-blood tissue: 49 genes were upregulated, and 49 genes were downregulated (Table S3). Integrating PPI network information with the significant genes highlighted DNA repair processes (Fig 2) such as DNA replication (TERF2, EGFR, [...] (Table S4).
Replication dataset for transcriptomic findings. In order to replicate our SNP-derived transcriptomic genes associated with radiation toxicity, we identified the study of van Oorschot et al., whose data were deposited in Gene Expression Omnibus (GEO) [24]. Their study isolated and cultured lymphocytes from individuals who received radiation treatment for their prostate cancer and were assessed for radiotoxicity for a period of 2 years. In their study design, they irradiated lymphocytes (collected prior to radiation treatment) with 0 Gy and 2 Gy of gamma rays, followed by microarray analysis. We [...] Table S5).
CNV association and GSEA of mapped genes. We found 7 CNV regions associated with proctitis on chromosomes 1, 3, 4, 11, 12 and 15 (Fig 4; Table S6). We identified genes within CNV regions using the UCSC browser (hg19) (Table S7). Interestingly, of the two significant regions on chromosome 11, we observed a high number of TRIM family genes in one (chr11:89487937-89909274 bp). The mapped genes from copy number regions were investigated for gene interactions using the knowledge base of Ingenuity Pathway Analysis®. The pathway with the highest number of query genes (Fig 5) was further analyzed for enriched disease and functional categories (Table S8). Cell-to-cell signaling interaction processes for renal and urological system, connective tissue development and function, and organismal injury were significantly associated processes, and their functional categorization included synthesis, proliferation, apoptosis and transmembrane transport. It is interesting to note that most of these processes were dominated by TRH and TRIM-family genes. Furthermore, we also observed that the ALG1L2 gene, which was one of the significantly downregulated genes in whole-blood tissue, was also mapped to a significant CNV region on chr3:129690192-129896364 bp, which shows both gain and loss of copy and is therefore referred to as a mixed region.

Discussion
Genetic susceptibility towards developing radiotoxic phenotypes is an emerging research interest of significant clinical impact to improve the quality of life of cancer survivors [8]. Previously, GWAS have been conducted to identify genetic loci associated with overall toxicity, decreased urinary stream, and erectile dysfunction [10,29] in prostate cancer individuals who received radiotherapy [30]. While these findings have shed much-needed light on SNP loci associated with susceptibility towards radiotoxicity, cumulative effects of exonic SNPs on gene expression and other genetic alterations such as copy number differences have not been previously studied. Here, we integrated genotyping data to identify genetic risk associated with proctitis by (1) employing genetic variant-derived gene expression of both prostate and whole-blood tissue, and (2) identifying associated genomic CNVs. The transcriptomic analyses point to several novel genes that play a role in DNA repair processes. In addition, we identified variable copy number regions, containing multiple members of the TRIM family of genes, to be associated with proctitis. Along with novel genes identified through the analysis, the incorporation of the PPI map reveals convergence of the implicated gene sets on known DNA repair, mitochondrial, and telomeric regulation processes, highlighting their involvement with radiotoxic phenotypes (e.g., proctitis).
Six genes from both prostate and whole-blood tissue were associated with proctitis. CABLES2, ATP6AP1L and IFIT5 were under-expressed, and ATRIP and TELO2 were upregulated in both tissues; however, PARD6G was over-expressed in prostate tissue and under-expressed in whole-blood tissue. CABLES2 (Cdk5 And Abl Enzyme Substrate 2), which is involved in regulation of the cell cycle, was also reported to be under-expressed in lymphocytes of occupational workers who were exposed to ionizing radiation [31]. ATP6AP1L (ATPase H+ Transporting Accessory Protein 1 Like) is critical for proton transportation and ATP synthase activity in the mitochondria; it has a paralog, ATP6AP1, which is involved in secretory granules and regulating neuroendocrine responses [32]. IFIT5 (IFN-induced protein with tetratricopeptide repeats) has been reported to act as an enhancer in immune responses, with partial containment in mitochondria [33]. A recent study has reported that elevated IFIT5 gene expression was correlated with interferon-γ levels in prostate cancer individuals after radiation, and demonstrated that IFN-γ stimulated epithelial-to-mesenchymal transition through the activation of the JAK-STAT pathway [34]. ATRIP [TREX1] is an ATR-interacting protein and DNA exonuclease [35] that is known to initiate DNA repair pathways [36]. ATRIP has also been reported to provide telomere protection by recruiting ATM, a key player in regulating cellular damage responses, to telomeric and DNA damage sites unaided by ATR kinase activity [37]. Additionally, ATR responds to UV damage via the downregulation of Pin1, demonstrating anti-apoptotic activity in mitochondria [38]. ATRIP also interacts with ATM to stimulate cell cycle arrest in response to radiation-induced double-strand breaks [41] via AKT activation [42].
High TELO2 expression activity has been identified to be correlated with cell protection when exposed to high radiation doses [42]; conversely, TELO2 overexpression can trigger inflammation by influencing PIKKs (via mTORC1 binding [43]) while responding to DNA damage [44]. Based on gene ontology annotations, ALG1L2 functions as a mannosyl transferase in protein glycosylation [46]. Another gene within the chromosome 3 CNV region is TRH (Thyrotropin releasing hormone); its function includes carbohydrate and amino acid metabolism, and it has been involved in endocrine system disorders, metabolic disease, and organismal injury (Table S7). TRH has been shown to mobilize calcium from the endoplasmic reticulum and mitochondria [47], and plays a role in mitochondrial endoxidation via mitochondrial complex I and IV enzyme activity in skin samples [48], which is aligned with TRH's known involvement in hair and skin development (as indicated in our IPA results, [...] [49]. In the pathway analysis (Fig 5), we observed that FAM86HP has an indirect interaction with TGM2 (Transglutaminase 2), a stress-response gene [50] involved in mRNA metabolism [51] which has been reported to be upregulated during inflammation [50]. It is interesting to note that TGM2 was over-expressed in individuals receiving chemotherapy and radiation therapy, suggesting its involvement in sensitivity to radiation [52,53]. Both TGM2 and NUPR1 (Fig 5) have been reported to be implicated in inflammation driven primarily via the JAK/STAT and IL-17A signaling pathways [54]. In the pathway we observe, NUPR1 has a direct interaction with FAM72C/FAM27D and LOC100133315, both of which were identified from the copy number variation association. NUPR1 is known to repair double-strand DNA breaks [55] and regulate cell cycle progression [56] following damage induced by gamma irradiation [55].
The under-expression of NUPR1 has been reported to result in increased ROS production, thus creating a deficit in mitochondrial membrane potential. This alteration in OXPHOS activity has been associated with ER stress and triggers programmed necrosis in cancer cells [57]. During angiogenesis in cancer cells, NUPR1 was reported to be upregulated in association with triiodothyronine thyroid hormone receptor [58], which is regulated by TRH (Thyrotropin releasing hormone) [59]. It is interesting to note that NUPR1 has been reported to play different roles before cancer development, during cancer progression and in response to cancer treatment. [...] tissue injury (normal and tumor) appears to trigger ROS-induced mitochondrial oxidation [62], which exacerbates local inflammation in nearby tissues and propagates this DNA repair-inflammation stress cycle. Our findings report several novel genes that have been observed to be associated with the known BRCA1-ATM-RAD50 damage response complex [63], which is activated in response to radiation, thus extending our understanding of these new players and their multifactorial roles associated with proctitis.
Our study has several limitations. The sample size is small, and our findings should be replicated and functionally validated in future studies using an animal model for [...]
There are also many strengths to our study. Leveraging SNPs for reference transcriptomic data and copy number association, we identified several novel genes associated with proctitis, an inflammation of the rectum resulting from radiation therapy received for prostate cancer. The integration of the tissue-specific PPI network aided in understanding the biological interactions between known and reported genes.
We were able to replicate 27 of our genes, a significant overlap, in an external cohort of prostate cancer patients receiving radiation treatment. Analysis of copy number variation identified several genes in the reported regions, and their pathway analysis highlighted two primary genes that have distinctive roles before cancer development and in response to cancer treatment.
In conclusion, this investigation highlights genes primarily involved in DNA repair processes and mitochondrial malfunction threaded via inflammation. The field of radiogenomics (work investigating the role of genetics in developing radiation toxicity) calls for investigation of genetic risk that can help inform dose management of radiation treatment and toxicity monitoring during treatment [8]. We anticipate that understanding genetic data from both CNV and SNPs would contribute towards

Funding
We would also like to acknowledge the NIH Neurobiology of Aging T32 grant AG020494 for supporting this research. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author contributions
GAP conceptualized the study design, carried out analysis and drafted the manuscript. NRP supervised the study to its completion, contributed to manuscript revisions and gave final approval for publication.

Competing interests
The authors have no competing interests to declare.

Consent for publication
All authors on the manuscript have provided consent for publication.

Ethics approval and consent to participate
The study described here was performed under the North Texas Regional IRB (formerly the University of North Texas Health Science Center IRB), under the North Texas Regional IRB protocol 2016-090 for data access from a public repository, NCBI's Database of Genotype and Phenotype (dbGaP; https://www.ncbi.nlm.nih.gov/gap/).

All data generated or analyzed during this study are included in this article (and its Supplementary Information files). For more information, please contact the first author (G.A.P.) or the corresponding author (N.R.P.) on the manuscript.

[...] Table S2 in Supplementary file for more details.
[...] Table S4 in Supplementary file for more details.
In the Venn diagram, we highlight the number of significant genes overlapping between our transcriptomic associations of proctitis (radiotoxicity phenotype) for prostate and whole-blood tissue and an external dataset with a similar phenotype assessing differential expression between groups receiving 0 Gy vs 2 Gy of radiation. The overlap of 27 genes between whole-blood and the external dataset is significant. 14/27 genes had the same direction of expression in both datasets and were further analyzed for GSEA.
The chord plot shows the representative function categories (Database: Name) on the right side of the circle; genes present in these categories are on the left, connected with strings. The up/downregulation of genes is reported via their z-scores. See Supplementary file for details.