Identification of Key Candidate Genes and Pathways for Relationship between Ovarian Cancer and Diabetes Mellitus Using Bioinformatical Analysis

Ovarian cancer is one of the three major gynecologic cancers worldwide. The aim of this study was to explore the relationship between ovarian cancer and diabetes mellitus using gene screening techniques. Using the GEO database and related online analysis tools, we analyzed 185 ovarian cancer samples and 10 control samples from GSE26712 and identified a total of 379 differentially expressed genes (DEGs), including 104 up-regulated and 275 down-regulated genes. The up-regulated genes were mainly enriched in biological processes including cell adhesion, nucleic acid transcription and biosynthesis, and negative regulation of cell metabolism. The down-regulated genes were enriched in cell proliferation, migration, angiogenesis and macromolecular metabolism. Protein-protein interactions were analyzed by network construction and module analysis. The top ten hub genes (CDC20, H2AFX, ENO1, ACTB, ISG15, KAT2B, HNRNPD, YWHAE, GJA1 and CAV1) were identified; they play important roles in critical signaling pathways that regulate oxidation-reduction processes and carboxylic acid metabolism. CTD analysis showed that the hub genes were involved in 1,128 distinct diseases (Bonferroni-corrected P<0.05). Kaplan-Meier survival analysis indicated that CDC20 and ISG15 were statistically significant (P<0.05). In conclusion, glycometabolism was related to ovarian cancer, and genes and proteins involved in glycometabolism could serve as potential targets in ovarian cancer treatment.


Introduction
Ovarian cancer (OV) is one of the most common malignant tumors of the female genital organs. Its morbidity ranks after that of cervical cancer and uterine body carcinoma, making it the third most common gynecologic malignancy. Globally, about 200,000 people are diagnosed with ovarian cancer and 125,000 die from the disease each year (Torre et al., 2012). Because the symptoms of ovarian cancer are vague and effective screening and early diagnosis methods are lacking, 70%-80% of patients are diagnosed with advanced disease, which contributes to poor treatment outcomes, with a 5-year survival rate of less than 20% (Chou et al., 2010; Ledermann et al., 2016). From a medical point of view, the embryonic development, histological anatomy and endocrine function of the ovary are rather complex, and the early symptoms are not typical; as a result, it is difficult to identify the origin and degree of malignancy before surgery.
Epidemiological studies have demonstrated that the risk of OV is associated with obesity, diabetes, and metabolic syndrome. Further evidence shows that persons with type I diabetes mellitus (DM) have an approximately 20-25% higher cancer incidence than persons without diabetes. For instance, women with diabetes have been shown to have an elevated incidence of liver, pancreatic, kidney, and endometrial cancer.
Recently, a meta-analysis of epidemiological studies on the correlation between diabetes and OV indicated that diabetes was associated with an increased risk of OV, with a risk estimate of approximately 1.17 (95% confidence interval (CI), 1.02-1.33) (Donal and Lamkin, 2009). In addition, a variety of proteins involved in glycometabolism have attracted increasing attention as potential targets for OV treatment (Birrer et al., 2011). Although the clinical results are still controversial, they suggest that targeting glycometabolism could be a novel approach to treating OV.
High-throughput platforms for genetic analysis are becoming more and more important, and the proper application of microarrays is one of the most promising technologies in medical oncology. Gene microarray technology is widely used in cancer gene research, cancer gene expression profiling, molecular classification of tumors and detection of tumor gene mutations because of its great advantages in tumor biology. In the past few years, gene microarray technology has been used in the study of malignant OV to analyze differences in gene expression, detect molecular markers for early diagnosis, characterize drug resistance genes, judge sensitivity to chemotherapy, and predict the early tendency of recurrence. All of these are critical methods for studying the gene expression profile of OV.
In our research, the initial data were downloaded from the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo). GSE26712 gene expression data from patients with OV were compared with those from normal controls, and GO and pathway enrichment analyses were then performed. We aimed to integrate publicly available microarray data, analyze the biological functions of genes and pathways, and explore whether diabetes is a risk factor for OV, in order to provide meaningful clues about the relationship between diabetes and OV.

Microarray data information and DEGs identification
GSE26712 gene expression profiles were downloaded from the GEO database. GSE26712 was submitted by Michael Birrer and is based on the GPL96 platform ([HG-U133A] Affymetrix Human Genome U133A Array). The dataset consisted of 195 samples in total, including 185 tissues from ovarian cancer patients and 10 from normal controls. We downloaded the raw data in text form. After analyzing the top 250 genes with the GEO online tool GEO2R, we obtained all differentially expressed genes. The results were exported to Excel spreadsheets and filtered with the criteria |log FC| ≥ 2 and P-value < 0.05. Finally, 379 eligible differentially expressed genes (DEGs) were found. The results were statistically significant.
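The filtering step described above can be sketched in a few lines of Python. This is only an illustration of the stated cut-offs (|log FC| ≥ 2, P-value < 0.05) and not the spreadsheet procedure actually used; the row layout, the function name `select_degs` and the placeholder gene names and values are assumptions made for the example.

```python
# Hypothetical GEO2R-style rows: placeholder gene names and made-up values,
# used only to exercise the cut-offs stated in the text.
rows = [
    {"gene": "GENE_A", "logFC": 2.4, "p_value": 1e-6},   # passes, up-regulated
    {"gene": "GENE_B", "logFC": -3.1, "p_value": 2e-8},  # passes, down-regulated
    {"gene": "GENE_C", "logFC": 0.7, "p_value": 1e-4},   # fold change too small
    {"gene": "GENE_D", "logFC": -2.2, "p_value": 0.20},  # not significant
]


def select_degs(rows, min_abs_logfc=2.0, max_p=0.05):
    """Split rows into up- and down-regulated genes using |logFC| >= 2 and P < 0.05."""
    up, down = [], []
    for row in rows:
        if row["p_value"] >= max_p or abs(row["logFC"]) < min_abs_logfc:
            continue  # fails at least one cut-off
        # The sign of logFC decides the direction of regulation.
        (up if row["logFC"] > 0 else down).append(row["gene"])
    return up, down


up_genes, down_genes = select_degs(rows)
```

Applied to a full GEO2R export, the lengths of the two returned lists would correspond to the up-regulated and down-regulated DEG counts.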

Gene ontology and pathway enrichment analysis of differential genes
A variety of online databases were used to analyze the functions of candidate DEGs and assess enrichment. Gene Ontology (GO) is one of a variety of biological ontology languages; it provides a three-tier system of definitions for describing the function of gene products and is suitable for qualifying and describing gene and protein function across all kinds of species (Gene Ontology Consortium, 2006). KEGG (http://www.kegg.jp/) is a database for understanding the high-level functions of biological systems (such as cells, organisms and ecosystems) from molecular information, especially large data sets generated by genome sequencing and other high-throughput experimental technologies (Ashburner et al., 2000). DAVID (https://david.ncifcrf.gov/) provides a complete set of functional annotation tools for investigators to understand the biological significance behind large lists of genes (Kanehisa and Goto, 2000). GO and KEGG analyses of the selected DEGs were performed using the DAVID online tools. The biological process, molecular function and cellular component GO terms were obtained by annotation of the altered genes, and the associated KEGG pathways were analyzed. Enrichment results from DAVID with P-value < 0.05 were considered statistically significant.
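To make the idea of term enrichment concrete, the following minimal sketch computes a one-sided hypergeometric P-value for the overlap between a gene list and an annotation term. It is an assumption-laden illustration: DAVID reports its own modified Fisher exact statistic (the EASE score), which is not reproduced here, and the function name and arguments are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_selected, n_overlap):
    """Return P(X >= n_overlap) for a hypergeometric draw.

    n_background: number of genes in the background set
    n_term:       background genes annotated with the term
    n_selected:   size of the submitted gene list (e.g. the DEGs)
    n_overlap:    submitted genes annotated with the term
    """
    total = comb(n_background, n_selected)
    upper = min(n_term, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Small worked case: 10 background genes, 4 annotated, 3 selected, 2 overlapping.
p_small = hypergeom_enrichment_p(10, 4, 3, 2)
```

A term would be called enriched under the criterion in the text when this P-value falls below 0.05; in practice the P-values are usually also corrected for the number of terms tested.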

The network diagram and module synthesis analysis of protein-protein interaction
The Search Tool for the Retrieval of Interacting Genes (STRING) is an online tool used to identify protein-protein interaction (PPI) information.
STRING (version 10.5) contains 9,643,763 proteins from 2,031 species and 1,380,838,440 interactions. To evaluate the interactions among the DEGs, we mapped all DEGs into STRING and used a score > 0.4 as the cut-off criterion. The PPI network was then constructed using Cytoscape, the top three modules were selected with the Cytoscape plug-in MCODE, and module analysis was carried out. The proteins corresponding to the central nodes might be core proteins and key candidate genes with significant physiological regulatory functions.
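The score cut-off and the notion of central nodes can be illustrated with a short Python sketch. It assumes the interactions are available as (protein, protein, score) tuples and ranks nodes by degree, which is one common way to pick hub candidates; the text does not state which centrality measure was used, and the protein names and scores below are placeholders.

```python
from collections import Counter

# Hypothetical interaction list: (protein_a, protein_b, combined_score).
edges = [
    ("P1", "P2", 0.92),
    ("P1", "P3", 0.55),
    ("P2", "P3", 0.35),  # below the cut-off, dropped
    ("P1", "P4", 0.41),
    ("P4", "P5", 0.40),  # not strictly greater than 0.4, dropped
]


def rank_by_degree(edges, min_score=0.4):
    """Keep edges with score > min_score and rank nodes by number of partners."""
    degree = Counter()
    for a, b, score in edges:
        if score > min_score:
            degree[a] += 1
            degree[b] += 1
    # Highest degree first; ties broken alphabetically for a stable order.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_by_degree(edges)
```

Taking the first ten entries of such a ranking would give a degree-based list of hub candidates; MCODE module detection is a separate clustering step that is not sketched here.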

Disease predictions of candidate genes
The Comparative Toxicogenomics Database (CTD, http://ctdbase.org/) describes the relationships between chemicals, genes and diseases. We entered the ten hub genes into the CTD website and used the CTD tools to examine the association of these ten hub genes with OV and to analyze their relationships with metabolic and cancer-related diseases.

Analysis of ROC for candidate genes in ovarian cancer
Receiver operating characteristic (ROC) curves for the hub genes were analyzed using web-based tools (http://www.proteinatlas.org). OncoLnc is a website that links RNA expression data and patient clinical data from TCGA and provides survival analysis. We entered the ten hub genes into this site, analyzed their association with survival in OV, and used the resulting survival curves to evaluate the diagnostic effectiveness of abnormal expression of the ten hub genes in OV.
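For readers unfamiliar with how survival curves of this kind are built, the sketch below implements the standard Kaplan-Meier product-limit estimator in plain Python. It is a generic illustration, not the OncoLnc implementation; the function name and the toy follow-up data are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times:  follow-up time for each patient
    events: 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= removed
    return curve


# Toy data: five patients, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Comparing a high-expression group with a low-expression group amounts to computing one such curve per group and then testing the difference, typically with a log-rank test, which is not shown here.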

Identification of differential genes in ovarian cancer
A total of 185 cases of ovarian cancer and 10 control samples were analyzed. The series of microarray data were analyzed with the GEO2R tool, and the DEGs were obtained with P-value < 0.05 and |log FC| ≥ 2.0 as the cut-off criteria. In the analysis of GSE26712, a total of 379 DEGs were identified, including 104 up-regulated genes and 275 down-regulated genes. The DEGs included IGLC1, KLK6, S100A11, DEFB1, INS-IGF2, IGF2, HMGA1, ISG15, FOLR1, GSTP1, TAGLN, S100A2, H2BFS, CALR, COL1A1, CKS2, CP, IGHA2, IGHA1, IGH, ATP5H and H2AFX, among others. A heat map of differential gene expression (the top 50 up-regulated and down-regulated genes) is presented in Figure 1 (red, up-regulation; blue, down-regulation).

The ontology analysis of differential genes in ovarian cancer
The results of GO analysis indicated that the top 20 DEGs were mainly associated with cell migration, proliferation and adhesion. The identified DEGs also participated in cellular and macromolecular metabolic processes, the response to lipid, and organ and tissue development. The up-regulated genes were mainly concentrated in biological processes including cell adhesion, nucleic acid transcription and biosynthesis, and negative regulation of cell metabolism. The down-regulated genes were predominantly concentrated in cell proliferation, migration, angiogenesis, and macromolecular metabolism. The molecular function enrichment analysis showed that the up-regulated genes were mainly involved in enzyme binding, RNA binding and protein dimerization activity, and the down-regulated genes were predominantly involved in receptor binding, molecular function regulation, enzyme binding and enzyme regulator activity. In addition, GO cellular component analysis indicated that the up-regulated genes were mainly concentrated in remote body, extracellular region and membrane binding domain, and the down-regulated genes were predominantly concentrated in the extracellular region, membrane-bounded vesicles, extracellular exosomes and cell junctions (Figure 2 and Table 2). The signaling pathways identified by KEGG pathway enrichment analysis were mainly the Rap1, Ras, Wnt and PI3K-Akt signaling pathways, together with retinol and amino acid metabolism.
We used the MCODE plug-in to further analyze the PPI network of 218 nodes and 434 edges, and the top three most significant modules were selected. We also analyzed the functional annotation of the genes in each module. The enrichment analysis showed that the genes of these three modules were mainly related to the positive regulation of biological process, regulation of metabolic process, oxidation-reduction process and carboxylic acid metabolic process (Tables 5-7).
Ten hub genes related to cancers and metabolic diseases were analyzed by CTD. The results indicated 1,128 distinct diseases with statistical significance (Bonferroni-corrected P<0.05), involving 185 kinds of cancer, including ovarian cancer. In addition, we found that the ten hub genes were linked to many ovarian diseases and disorders, for example, ovarian cysts, ovarian neoplasms, ovarian epithelial cancer, and primary ovarian insufficiency. Interestingly, we also found that these genes were closely related to diabetes, weight change, hypertriglyceridemia and other metabolic disorders, particularly type I and II diabetes. The results for the relationship between OV and diabetes are displayed in Table 8.
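The Bonferroni correction mentioned above multiplies each raw P-value by the number of tests and caps the result at 1. A minimal sketch follows, assuming a plain list of raw P-values; the function name and the example values are hypothetical, and CTD's internal procedure is not reproduced.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted_p, is_significant) for each raw P-value."""
    m = len(p_values)  # number of tests performed
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]


# Three made-up raw P-values for three hypothetical disease associations.
results = bonferroni([0.001, 0.02, 0.5])
```

With three tests, a raw P-value of 0.02 becomes 0.06 after correction and is no longer significant at 0.05, which shows why the corrected threshold is stricter than the raw one.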

The clinical significance of ten abnormally expressed hub genes
Further analysis was conducted on the ten abnormally expressed hub genes. The Kaplan-Meier survival curves showed that the expression of CDC20, H2AFX, HNRNPD and ISG15 was significantly correlated with the survival time of patients with OV (P < 0.05), as shown in Figure 4. Preliminary results showed that high expression of CDC20, H2AFX, HNRNPD and ISG15 had an obvious effect on ovarian cancer prognosis compared with lower expression.

Discussion
It is extremely important to explore the risk factors and prognostic factors for the early diagnosis of OV. We obtained 185 OV samples and 10 normal samples from GSE26712 in the GEO database and identified 379 DEGs, comprising 104 up-regulated and 275 down-regulated genes. The results of GO analysis showed that the DEGs were mainly related to cell migration, cell proliferation and cell adhesion, cellular and macromolecular metabolism, and lipid response. The results of KEGG pathway analysis showed that the DEGs were mainly related to signaling pathways such as Rap1, Ras, Wnt and PI3K-Akt, as well as the metabolism of retinol and amino acids. We constructed a PPI network of the DEGs and obtained the most significant modules. Moreover, the top ten hub genes were identified from the DEGs: CDC20, H2AFX, ENO1, ACTB, ISG15, KAT2B, HNRNPD, YWHAE, GJA1 and CAV1.
Module analysis of the PPI network demonstrated that Module A was focused on positive regulation of biological process, regulation of metabolic process and cellular metabolic process; Module B was mainly enriched in the oxidation-reduction process and carboxylic acid metabolic process; and Module C was mostly concentrated in the oxidation-reduction process, generation of precursor metabolites and energy, and nucleoside metabolic process. All three modules were statistically significant, with MCODE score > 10 and FDR < 0.05.
The functions of each hub gene were different.
CDC20: CDC20 is expressed in both normal ovarian tissue and ovarian cancer tissue. High expression of CDC20 was associated with high tumor grade in ovarian cancer (P = 0.044). A study using weighted gene co-expression network analysis (WGCNA) of serous ovarian cancer (SOC) identified one stage-associated module and one grade-associated module, and CDC20 was found to be one of the top hub genes related to grade. In addition, glucose and the activated Ras2 (Val19) protein synergistically inhibited APC/C function via the cAMP/PKA pathway in yeast, and CDC20 was involved in APC/C regulation by the cAMP/PKA pathway (Bolte et al., 2003; Gayyed et al., 2016; Sun et al., 2017).
H2AFX: In the treatment of ovarian cancer, the p38/H2AX axis was one of the molecular mechanisms of drug resistance. In addition, H2AX and SEI1 co-localize in the nucleus, and high expression of SEI1 plays a major role in the deterioration of ovarian cancer. Histone H2AX phosphorylation (gamma-H2AX foci) was also found to be significantly increased in patients with type 1 diabetes, especially in women (Giovannini et al., 2014; Mo et al., 2016; You et al., 2017).
ENO1: ENO1 interferes with follicles in ovarian granulosa cells by inducing the mRNA expression of follicle-stimulating hormone receptor (FSHR) and reducing the mRNA expression of luteinizing hormone receptor (LHR). ENO1 is also a glycolysis enzyme that can reduce glycolysis in gastric cancer cells, and its expression level increases as survival time shortens in patients with gastric cancer. ENO1 is closely linked to the glucose metabolism enzyme PGK1 and was up-regulated in a study of glucose uptake (Kim et al., 2013; Zhonghua et al., 2015; Zhao Y et al., 2016).
ACTB: ACTB is dysregulated in various cancers, including liver cancer, melanoma, lung cancer, breast cancer, prostate cancer and ovarian cancer, and its expression rises in most tumor cells and tissues. Furthermore, abnormal expression of ACTB is associated with the invasive and metastatic potential of cancer (Guo et al., 2013).
ISG15: In a tissue microarray analysis of 128 patients with ovarian serous carcinoma treated with surgery and chemotherapy, ISG15 protein expression was significantly elevated in relapsed carcinomas compared with primary tumors (P = 0.027), and ISG15-positive carcinomas had a significantly longer overall survival in univariate analysis (P = 0.002). The molecular basis for ubiquitin and ISG15 cross-reactivity in viral ovarian tumor domains has also been described. Using type I IFN receptor-deficient (IFNAR1-/-) NOD mice, it was found that ISG15 expression was significantly increased at one week of age and peaked after 3-4 weeks. These results suggested that ISG15 is closely related to pancreatic function in young mice (Akutsu et al., 2011; Darb-Esfahani et al., 2014).
KAT2B: A whole-genome analysis of Han and ethnic minority populations in China with high uric acid, type 2 diabetes and obesity found that KAT2B was closely related to HbA1c. KAT2B and WDR5 were also found to stimulate gluconeogenesis through a self-reinforcing cycle, and small-molecule inhibitors of KAT2B decreased blood sugar levels, indicating that KAT2B is an effective target in diabetes treatment. However, no evidence has yet shown that KAT2B is involved in ovarian carcinoma (Ravnskjaer et al., 2013).
HNRNPD: Research in transgenic mice found that AUF1/HNRNPD over-expression could lead to tumor occurrence. Furthermore, in Akita mice with type I diabetes, insulin affected Nrf2 and angiotensinogen (Agt) gene expression in the kidneys and up-regulated heterogeneous nuclear ribonucleoproteins F and K (hnRNP F and hnRNP K). Insulin curbed Nrf2 promoter activity via a specific DNA-responsive element that binds hnRNP F/K, and hnRNP F/K over-expression curtailed Nrf2 promoter activity. These findings identified hnRNP F/K and Nrf2 as potential therapeutic targets in diabetes (Gouble et al., 2002; Abdo et al., 2013; Lo et al., 2015; Singh, 2016; Ghosh et al., 2017; Lo et al., 2017).
YWHAE: One study evaluated the expression of approximately 21,000 genes using DNA microarray screening of paired tumor samples taken before and after chemotherapy (CT) from 6 patients with predominantly advanced-stage, high-grade epithelial ovarian cancer. Up-regulated genes in post-CT tumors included a substantial number of genes previously implicated in mechanisms of tumorigenesis (L'Espérance et al., 2006). There is no research on the relationship between YWHAE and diabetes or metabolic disorders.
GJA1: TGF-β was shown to up-regulate GJA1/Cx43 in two human ovarian cancer cell lines, SKOV3 and OVCAR4. In studies of diabetes, intercellular coupling via gap junctions decreased after insulin administration in diabetic and non-diabetic mice. This decrease in coupling was associated with a concomitant increase in the phosphorylation of GJA1/Cx43 at serine 368. Insulin regulated both gap junction-mediated intercellular communication and injury propagation in the type I diabetic mouse heart. GJA1/Cx43 expression and cell-to-cell communication increased in response to elevated glucose and may protect the collecting duct from renal damage associated with established diabetic nephropathy (Hills et al., 2006; Qiu et al., 2015; Liu et al., 2015; Palatinus et al., 2015; Qiu et al., 2016).
CAV1: In SKOV3 and A2780 cells, Cav1 was reported to promote the chemoresistance of ovarian cancer by targeting apoptosis through the Notch-1/Akt/NF-κB pathway. Another microarray analysis of ovarian tissues and the SKOV-3 and ES-2 cell lines indicated that CAV1 was likely to act as a tumor suppressor gene in human ovarian epithelium. In addition, Cav1 and miR-375 changed significantly in the beta cells (insulin-secreting cells) of islets from mice and humans with different degrees of fusion. Fasudil was found to protect diabetic rats by blocking the VEGFR2/Src/caveolin-1 signaling pathway (Wiechen et al., 2001; Zou et al., 2015; Ofori et al., 2017).
We used the Comparative Toxicogenomics Database to predict the association of these ten hub genes with ovarian and metabolic diseases, and the results showed that the ten genes were associated with ovarian cancer or with diabetes and metabolic disorders. Meanwhile, analysis of the ten survival curves illustrated that high expression of CDC20, H2AFX, HNRNPD and ISG15 was associated with good prognosis in ovarian cancer. The literature reviewed above indicated that five genes (GJA1, CAV1, ENO1, H2AFX and ISG15) were related to both ovarian cancer and diabetes, whereas CDC20 was involved only in diabetes, and KAT2B, HNRNPD and YWHAE only in ovarian cancer.
With the improvement of people's living conditions, the incidence of diabetes has increased year by year (Clery et al., 2017). Epidemiological studies have demonstrated that diabetes reduced the survival time and median survival of ovarian cancer patients. A rate of 23.3% was reported among newly diagnosed cervical cancer patients, almost twice that of the previous Swiss census. Insulin resistance is common in type 2 diabetics and often leads to hyperinsulinemia. In vitro studies have demonstrated that hyperinsulinemia affected the level of sex hormone-binding protein, leading to elevated estradiol and testosterone levels and thus affecting the prognosis of patients with ovarian malignancy (Ruge et al., 2012). In addition, insulin inhibited cell apoptosis by affecting the PI3K/AKT pathway and mitotic kinase pathway, thereby inducing tumor cell proliferation (Gryko et al., 2014). Meanwhile, oxidative stress is closely related to the development of type 1 diabetes mellitus and leads to genome damage, especially DNA double-strand breaks. Previous studies have shown apparent metabolic changes in cancer tissues: low oxygen results in a higher HIF-1 alpha level, which increases sugar intake in tumor cells. Moreover, increased sugar within tumor cells not only enhances the glycolysis pathway but also increases biosynthesis. Studies in vitro have shown that the P53 pathway influences the reprogramming of glucose metabolism in ovarian cancer cells (Semczuk et al., 2017). Metformin was found to produce anti-cancer effects via AMPK-dependent or AMPK-independent pathways, and that study provided evidence of a link between diabetes and ovarian cancer (Rattan et al., 2011).
Using molecular bioinformatics tools, we identified possible common target genes in ovarian cancer and diabetes and provided molecular-level evidence supporting diabetes as a risk factor for ovarian cancer. This offers researchers a new way of thinking when facing the major challenges in the detection, early diagnosis and medicinal treatment of ovarian malignant tumors.
All in all, we studied differentially expressed genes in ovarian cancer by bioinformatics analysis, with the aim of collecting information on the relationship between diabetes and ovarian cancer to complement population-based case-control studies. However, the study has limitations, and more experiments are needed to further validate our observations, for example, experiments to validate the expression levels of these DEGs and studies with a larger sample size to confirm our findings.