A Systems Biology Approach to Identify Novel Biomarkers in Progression from Crohn’s Disease to Colorectal Cancer

Objective: This study aimed to find the key genes and miRNAs as potential biomarkers related to the progression of colorectal cancer (CRC) from Crohn’s disease (CD). Background: CD is widely accepted as one of the main risk factors leading to CRC. So, Identifying the novel molecular pathways involved in the development of CRC from CD can provide potential solutions for therapeutic interventions. Methods: By implementing a systematic approach, we have analyzed mRNA and miRNA datasets containing CRC and CD samples to determine differentially expressed genes (DEGs) and miRNAs (DEmiRNA). Then by selecting common genes involved in the progression from CD to CRC, different downstream analyses including mRNA-miRNA network, functional enrichment analysis, gene set enrichment analysis, and survival analysis were performed. Finally, quantitative real-time PCR (RT-PCR) analysis of tissue samples obtained from Normal/CRC samples was used to confirm the differential expression of selected genes and miRNA. Results: There were 10 DE miRNA and 181 genes DEGs common between progression from CD to CRC. The genes obtained for each of the 10 miRNAs were considered as the final target for downstream analyzes. In addition, analysis of RT-PCR indicated that miR-195-5p, PHLPP2, and LITAF were downregulated in the cancer group compared to the control group. Conclusion: This study showed that PHLPP2, LITAF, and miR-195-5p may have key roles in the tumorigenesis of CRC and they can be used as therapeutic targets and diagnostic biomarkers after further in-vitro and in-vivo evaluation.


A Systems Biology Approach to Identify Novel Biomarkers in Progression from Crohn's Disease to Colorectal Cancer
factors (Larabi et al., 2020;Liu et al., 2015). Results of genome-wide association studies indicated that several genes have a role in inheritable vulnerability to CD. The protein encoded with these genes is associated with several homeostatic mechanisms including epithelial barrier integrity maintenance, Th17-lymphocyte differentiation, innate pattern recognition receptors, and the secondary immune response balance (Van Limbergen et al., 2009).
MiRNAs are small non-coding RNAs that control several biological functions such as proliferation, differentiation, apoptosis, metastasis, and invasion through suppressing the target mRNAs. (Afshar et al., 2020;Mahmoudi et al., 2022;Schee et al., 2010). MiRNAs play an important role in the pathogenesis of several human diseases, such as cancers and IBD. Based on the type and activity of the protein to which the miRNA binds to its target mRNA and changes its expression, they are called tumor-suppressor miRNAs or oncogenic miRNAs (oncomiRs) (DalalKwon, 2010;Zhang et al., 2015). In summary, miRNAs play essential roles in the pathogenesis of sporadic CRC, CRC which initiates in IBD, and IBDs, in particular CD (Grillo et al., 2021;Hnatyszyn et al., 2019b). In this regard, integrated analysis of omics data on miRNA-mRNA interactions has helped to identify key therapeutic targets in CRC and IBD diseases. Many studies are currently being conducted to extend our understanding of miRNA functions and regulatory mechanisms in the tumorigenic process of CRC (Vera et al., 2013).
In this study, by evaluating the core miRNA-mRNA interactions related to CD and CRC, we search for the most important molecular factors related to the progression from CD to CRC. Moreover, with the identification of novel molecular factors, it will be possible to perform additional evaluations to prove the biomarker capability of these biomarkers and their importance as potential therapeutic targets. Finally, to confirm the results of systems biology approaches, the expression analysis of selected biomarkers was evaluated by the Real-Time PCR method on CRC tissue samples and adjacent normal tissue. An overview of the steps in this study is represented in Figure 1.

Data collection and unsupervised analysis
Raw expression data of miRNA and mRNA of tissue samples from CRC patients were obtained from Gene Expression Omnibus (GEO) databases (Table 1). For pre-processing, we used R software and related Bioconductor packages.

Differentially expressed miRNAs and mRNAs analysis
In order to identify miRNAs and mRNAs by differentiating between control and CRC samples, the LIMMA package under R software was used. In this study, significant DEGs were selected for subsequent analysis with criteria of adjusted P-value <0.05 and |log2 (FC)| > 1, while for DEmiRNAs, the cutoff criteria of adjusted P-value <0.05 and | log2 (FC) | > 0.5 was considered. We also utilized the violin plot to demonstrate the expression patterns of DEGs and DE miRNAs in CD and CRC. The analysis was performed using the R package "ggplot2". The violin plot was chosen as a useful visualization tool to show the distribution of expression values for each gene and miRNA, where the width of the plot indicates the frequency of expression values at different levels.

Prediction of miRNA-mRNA interactions
To identify miRNAs and their targets that show differential expression in control and disease samples, we used two miRNA target forecast software including miRTarBase (https://mirtarbase.cuhk.edu.cn) and miRWalk (http://mirwalk.umm.uni-heidelberg.de ). The mRNA-miRNA interactions present in both databases were considered as the final interactions for further analysis.

Analysis of KEGG pathway and Gene Ontology
To analyze the biological pathways in which common DEGs are involved, functional enrichment analysis for The Kyoto Encyclopedia Gene and Genomes (KEGG) pathway and tree gene ontology (GO) terms including molecular function (MF), biological process (BP), cellular mean. The t-test was used to evaluate the average values differences between groups using SPSS 16.0 and P-value less than 0.05 was considered statistically significant.

Identification of DEGs and DE miRNAs
Evaluation of mRNA expression data indicated that in the low-grade adenomas, high-grade adenomas, and Adenocarcinoma samples 2,190, 2,772, and 2,677 DEGs were detected compared with healthy control respectively. For miRNA expression data 61, 48, and 55 DE miRNAs were detected in the low-grade adenomas, high-grade adenomas, and Adenocarcinoma samples respectively. The number of common DEGs and DE miRNAs between low-grade adenomas, high-grade adenomas, and Adenocarcinoma are 1,176 and 39 respectively. Regarding CD, the number of common DEGs and DE miRNAs between low and high-stage CD were 454 and 26. Finally, 181 DEGs and 10 DE miRNAs were found common between CRC and CD ( Figure 2).
The violin plot analysis also revealed distinct expression patterns of DEGs and DE miRNAs in CD and CRC. Notably, DEGs and DE miRNAs showed significant component (CC) was performed using the R packages ClusterProfiler. KEGG pathways and GO terms with FDR <0.05 were considered statistically significant.

Analysis of mRNA-miRNA network
Finally, network analysis was performed to further investigate the hub genes involved in the pathogenesis of CD and CRC. The STRING online tool was used to construct The Protein-protein interaction (PPI) network (https://string-db.org/cgi/input.pl/) considering the required score = 0.4. The constructed network and bipartite miRNA-mRNA network was visualized and evaluated with Cytoscape software (V:3.7.2).

Survival analysis
The TCGA expression data was used to investigate the relationship between hub genes and the survival rate of CRC patients. The survival analysis and Kaplan-Meier plot depiction were performed with TCGA biolinks, and survminer packages in r software.

Investigation of gene expression in CRC samples
To validate the expression of selected miRNA and its target genes,12 CRC and 12 adjacent normal tissues were used (Table 3). All patients participating in this study accepted Informed consent. This study was approved by the Ethics Committee of the Hamadan University of Medical Sciences (IR.UMSHA.REC.1399.561.) In this regard, total RNA was isolated using the RNX-plus kit (Sinaclon, Iran) conferring to the manufacturer's protocol. Afterward cDNA synthesis for miRNA and mRNA was performed with 1st strand cDNA Synthesis Kit (Takara, Japan) and miRNA stem-loop cDNA synthesis (Anacell, Iran), respectively. The expression level of miRNA-195-5p and U6 as an internal control was evaluated using specific primer pair (Anacell, Iran). The expression level of LITAF, PHLPP2, and GAPDH as an internal control was evaluated with specific primer pairs designed using AlleleId 6 ( Table 2). Roche LightCycler® 96 system (Roche, Germany) was used for performing the real-time PCR analysis. Moreover, based on the microarray data of CD and CRC, stage plots were plotted using ggstatsplot package in R.

Statistical analysis
Data are presented as the mean ± standard error   up-regulation or down-regulation of each stage of CD and CRC in comparison to normal, indicating their potential as biomarkers and therapeutic targets for these diseases (Figure 3).

Bipartite miRNA-mRNA interaction analysis
The top 10 miRNA targets were searched in the miRTarBase and miRWalk databases. For CRC 111 DEGs and for CD 36 DEGs were found as a target of selected miRNAs respectively. Evaluation of the miRNA-mRNA network indicated that miR-195 as a hub miRNA due to the best scores in both CD and CRC networks. More ever, evaluation of the network indicated that LITAF, PHLPP2, and CNKSR3 were identified as common miR-195-5p target genes between colorectal and Crohn's cancer (Figure 4). Due to the lack of correlation between the CNKSR3 gene and other nodes in the network, only the PHLPP2, LITAF, and miR-195 were selected for further experimental validation.

KEGG pathway and Gene Ontology enrichment analysis
In this study, we analyzed the GO terms and KEGG pathways for the common differentially expressed genes (DEGs) using the clusterProfiler package in R. Our analysis revealed the top 10 enriched terms, which are illustrated in Figure 5. Notably, the largest terms identified were related to biological processes such as immune system process, and cellular components such as extracellular region and extracellular space. Additionally, immune receptor activity and calcium ion binding were also among the most prominent terms.
Furthermore, the top 20 significantly enriched KEGG pathways were revealed for the common DEGs. These included important signaling pathways such as Staphylococcus aureus infection, Rheumatoid arthritis, Phagosome, and Cell adhesion molecules. Importantly, the inflammatory bowel disease pathway was also significantly enriched and highlighted the role of our final set of DEGs in inflammation-related cancer progression. These results are depicted in Figure 5, underscoring the significance of our findings.

Find the prognostic index of hub genes by survival analysis
As shown in Figure 6A, B the expression level of PHLPP2 and LITAF was associated with the prognosis of patients. The results of the survival analysis indicated  that the expression levels of PHLPP2 and LITAF were significantly associated with the overall survival of CRC patients (p-values of 0.0042 and 0.0038, respectively).

Experimental validation of PHLPP2 and LITAF and miR-195
The results of the study of gene expression between the tumor sample groups and adjacent normal tissue groups in terms of ∆CT showed a significant difference for PHLPP2 LITAF and miR-195. The ∆CT has an inverse correlation with the expression level of a gene, so PHLPP2 LITAF and miR-195 are downregulated in CRC samples compared with adjacent normal tissues ( Figure 6C, D). Moreover, using microarray expression datasets (Table 1), stage plots of final hub genes were also depicted for more validation of our experimental work ( Figure 6E to 6H).

Discussion
In recent years, the discovery of molecular identifiers which are involved in cancer progression has created an interesting challenging research field. Nevertheless, there are some issues such as the absence of in-depth studies to compare healthy tissues, different stages of cancer, and precancerous stages. For instance, in a comprehensive integrated analysis, Samadi et al. have demonstrated a signature comprising DEGs, DE-miRNAs, and DE-lncRNAs (long non-coding RNAs) biomarkers related to the diagnosis, prognosis, and therapy of CRC (Samadi et al., 2022). Therefore, finding biomarkers based on diseases that promote cancer risk, such as inflammatory diseases, can be more effective in the treatment of cancer in the early stages and improve the patients' prognosis (Liu et al., 2021;Sheng et al., 2020).
In the present study, in order to detect mRNA and miRNA biomarkers involved in the progression of Crohn's disease to various stages of CRC, separate miRNA and mRNA expression data were evaluated. Subsequently, the most significant mRNAs and miRNAs that are involved in the progression from inflammation to adenocarcinoma were used for further evaluation. Furthermore, the results of the current study indicated that miR-195-5p was significantly downregulated in CRC tissues. Recent studies have examined the central role of this miRNA in controlling CRC growth and proliferation. Various other studies have also shown that miR-195 through targeting Bcl-2 (Liu et al., 2010), FGF2, Wnt / β-catenin (Zhang et al., 2016), and GDPD5 pathways (Feng et al., 2018) can induce apoptosis and suppress the growth and proliferation of CRC cells and. It has also been shown that miR-195-5p can play an important role in response to chemotherapy and radiotherapy (Jin et al., 2018;Zheng et al., 2017). Results of recent studies indicated that miR-195 -5p has an essential role in the pathogenesis of IBD by regulating inflammation and intestinal tight junctions (Scalavino et al., 2022;Tian et al., 2020). Chen et al. in their study showed that miR-195 has a role in regulation of UC development and response to glucocorticoid in UC by targeting the SMAD7 and regulating the TGFβ signaling pathway (Chen et al., 2015).
Results of the current study indicated that PHLPP and LITAF upregulated in colorectal tumor tissues. PHLPP2 which was a tumor-suppressor gene in several cancers regulates tumor cell apoptosis and survival. More ever, this gene has a regulatory role in metastasis, invasion, and chemoresistance of CRC through the stemness regulation of colorectal tumor cells (Yongfu et al., 2022). Downregulation of PHLPP2 in CRC patients was common and its expression level was associated with CRC patients' prognosis .  in their study indicated that PHLPP2 was downregulated in UC and associated with immune cell infiltration. PHLPP2 has an essential role in intestinal epithelial cell pyroptosis via the NF-κB signaling pathway . LITAF acts as a tumor suppressor and reduces properties in various types of cancer and its expression is controlled by p53. LITAF which is dysregulated in IBD may play a role in inflammatory disease (Zou et al., 2015). Zhou et al., (2017) in their study showed that protein and mRNA expression levels of LITAF were downregulated in pancreatic cancer and associated with patients' prognosis. Stucchi et al., (2006) in their study showed that the expression level of LITAF was significantly upregulated in CD compared with healthy control and inflammatory colon area compared with the normal colorectal area. Similarly, in UC patients, the expression level of LITAF was significantly increased compared with healthy control.
Results of the current study indicated that the expression level of mir-195 had not a reverse correlation with LITAF and PHLPP2 which were bioinformatically determined as target genes of miR-195. Usually, miRNAs suppress the expression level of target genes and miRNA expression level has a reverse correlation with its target genes (Afshar et al., 2020). Ritchie et al., (2009) in their study showed that the expression level of miR-92 and miR-32 has a positive correlation with PCNA as a target gene. The positive correlation between PCNA and miR-92 and miR-32 is due to the intermediated regulation. More ever, miR-92 and miR-32 repress the RFX1 and this intermediate gene represses the PCNA. So we can hypothesize that miR-195 may have a positive regulator role for LITAF and PHLPP2 through the intermediate gene.
Taken together, miR-195, LITAF, and PHLPP2 may have critical roles in the tumorigenesis of CRC and the progression of Crohn's disease to adenocarcinoma. More ever, miR-195, LITAF, and PHLPP2 could be used as therapeutic targets and diagnostic biomarkers after further in-vitro and in-vivo evaluation. The positive correlation between miR-195 and bioinformatically determined target genes also should be evaluated through more precise studies.

Acknowledgements
In memory of Prof. Massoud Saidijam, who is no longer with us, but continues to inspire his students over the course of his career.

Funding
This work is supported by a grant from Hamadan University of Medical Sciences, Hamadan, Iran (No. 9907295200).

Ethical approval
Ethical issues (Including plagiarism, informed consent, misconduct, data fabrication and/or falsification, double publication and/or submission, redundancy, etc.) have been completely observed by the authors. The ethical protocol of this study was approved by the Ethics Committee of Hamadan University of Medical Sciences. (Ethical code: IR.UMSHA.REC.1399.561.) and written informed consent was obtained from all patients to participate in the study.

Data Availability Statement
The datasets generated and/or analyzed during the current study are available in the [https://www.ncbi. nlm.nih.gov/geo/database]. The original contributions presented in the study are included in the article/ Supplementary Material, further inquiries are available from the corresponding author on reasonable request.