Transcriptomic Profiling of OSCC Patients in an Indian Subset

Background: Tumor-specific biomarkers are needed for early detection, prognosis, and the design of therapeutic strategies. Comprehensive transcriptome profiling offers critical insights into the disease and reveals new avenues for drug discovery. Methods: Five pairs of cancerous and histopathologically normal tissue from five OSCC patients were included in this small study. Transcriptome sequencing was performed on Roche's 454 sequencing platform, and CLC Genomics Workbench was used to examine gene expression in oral cancer (OC) development. Results: A total of 2082 genes were differentially expressed across all five tumor-control pairs collected from the OC patients during surgery. Of these, 1092 were upregulated and 273 were downregulated, whereas 717 genes were found to be non-significant. Genes with p-value < 0.05 and log2 fold change > 1 or < -1 were considered for further enrichment analysis. ToppFun was used for gene enrichment analysis. Pathway enrichment analysis found that cancer-related pathways such as the TNF signaling, p53 signaling, cGMP-PKG signaling, Apelin signaling, and IL-17 signaling pathways were strikingly involved in proliferation and apoptosis of tumor cells. PPI network construction was performed and identified the 8 best protein interactions. Conclusion: The current study reports cancer-associated molecular biomarkers, including INHBA, FJX1, OLR1, CDK2, IGHM, CXCL11, SH2D5, and FABP5, that may help identify potential therapeutic targets for better prognosis of cancer patients. The signature candidates could be translated to clinical practice to increase early diagnostic accuracy.


Introduction
Oral cancer (OC) is the eighth most frequent neoplastic disorder worldwide. Oral squamous cell carcinoma (OSCC) comprises 90% of all malignant lesions [1,2] and has the highest prevalence among head and neck carcinomas [3]. Global estimates for oral cancer indicate an annual incidence of 354,864 cases, 177,384 deaths, and a prevalence of 913,514 [4]. The incidence and prevalence of OC vary greatly between ethnic groups worldwide, which may be attributed to ethnic-specific etiological factors, environmental factors, and the effect of carcinogenic agents [5]. Tobacco, alcohol, and smoking are the principal established risk factors for the development of OSCC. The western part of India (Gujarat) is a tobacco-growing belt [6,7]. Low-quality tobacco products, including pan masala and gutkha, have flooded the Indian market as convenient and cheap forms of betel quid [8]. The components of betel quid induce the generation of reactive oxygen species (ROS), which leads to cell proliferation and results in apoptotic cell death [9]. Thus, apart from these factors, various genetic factors such as HPV involved in activation and deactivation of oncogenes and tumor
With the advent of high-throughput technologies such as microarrays, cancer transcriptome analysis has been widely adopted for biomarker discovery, personalized therapy, and the fundamental study of complex biological systems [14]. Low sensitivity and the likelihood of cross-hybridization between homologous DNA fragments are two major drawbacks of microarrays [15]. High-throughput NGS technologies, such as Illumina's Solexa, Applied Biosystems' SOLiD, Roche's 454 GS-FLX, and Helicos HeliScope platforms, have been developed for the qualitative and quantitative analysis of whole genomes and transcriptomes in response to the demand for low-cost, rapid sequencing [16].
The present study performed transcriptome sequencing (RNA-Seq) of five matched pairs of OSCC tumor mucosa and adjacent non-tumorous mucosa, followed by differential gene expression analysis, to understand the clinical behavior, development, and progression of oral neoplasia at the molecular level, which could facilitate the identification of novel prognostic and diagnostic biomarkers for OSCC.

Tissue collection and histopathology
The study was approved by the Institutional Ethics Committee (Human) of P.D.U. Medical College, Rajkot, Gujarat, India. All procedures and methods were carried out in accordance with accepted protocols and standard guidelines. Oral cancer malignant tissue (n = 5) and adjacent normal tissue (n = 5) were collected in RNAlater® (Thermo Fisher) from patients diagnosed with OSCC at Shashwat Haemato Onco Associates, Rajkot, Gujarat, India, and stored at -20 °C. Every individual who contributed to the study gave informed consent prior to participation.

mRNA isolation and cDNA synthesis
5 µM tissues were washed thrice with 70% ethanol. Total RNA was isolated from each sample using the RNeasy Micro Kit (Qiagen, USA) as per the manufacturer's protocol. RNA quality was evaluated based on the RNA integrity number (RIN) using a 2100 Bioanalyzer (Agilent, USA), and quantity was confirmed using a NanoDrop 1000 spectrophotometer (Thermo Fisher). mRNA was extracted from total RNA with an mRNA isolation kit (Roche Diagnostics) following the user's manual. The purity of the isolated mRNA was analyzed using the RNA 6000 Pico LabChip kit (Agilent, USA) on the Bioanalyzer 2100 (Agilent, USA). The mRNA was then fragmented with RNA fragmentation solution (100 mM ZnCl2 in 100 mM Tris-HCl), and first- and second-strand cDNA was synthesized from the fragmented mRNA using the cDNA synthesis system (Roche Diagnostics). Finally, the quality and quantity of the cDNA were verified using the High Sensitivity DNA Chip kit on the Bioanalyzer 2100 and the QuantiFluor™-ST fluorometer (Promega).

Roche's 454 sequencing
The high-quality cDNA library was ligated with the quick library adapters A and B before being subjected to emulsion PCR. 454 sequencing was then performed on the clonally amplified beads following the manufacturer's instructions (Roche Diagnostics, USA).

In silico gene expression analysis
RNA sequencing of the tissue-specific cDNA libraries was carried out on a 454 Life Sciences FLX sequencer. To compute in silico gene expression levels, we used the commercially available CLC Genomics Workbench v.4.7.1 (http://www.clcbio.com/genomics/) with the human RNA database (http://www.ncbi.nlm.nih.gov/) as the reference. Sequence reads from normal and malignant tissue were evaluated independently using default RNA-seq parameters. Based on their RPKM (Reads Per Kilobase of transcript per Million mapped reads) values, genes were categorized as expressed mostly in normal tissue, mostly in malignant tissue, or in both [17]. The RPKM values of the associated transcripts, as a measure of relative abundance, were used to calculate differential expression between normal and malignant tissue.
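
For readers unfamiliar with the RPKM measure, the standard definition can be sketched in a few lines of Python. This is only an illustration of the formula, not the CLC Genomics Workbench implementation; the function name and the example numbers are hypothetical.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = reads * 10^9 / (transcript length in bp * total mapped reads),
    i.e. the read count normalized by transcript length (in kb) and by
    sequencing depth (in millions of mapped reads).
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp transcript
# in a library of 10 million mapped reads.
example_value = rpkm(500, 2_000, 10_000_000)
```

Because the value is normalized for both transcript length and library size, RPKM values from the normal and malignant libraries can be compared directly.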

Functional categorization
For the RNA-seq analysis, all normal samples and all malignant samples were grouped, and a paired t-test was performed to find differentially expressed genes. Genes with p-value < 0.05 and log2 fold change > 1 or < -1 were considered for further enrichment analysis. ToppFun (https://toppgene.cchmc.org/) was used for gene enrichment analysis to identify over-represented biological processes, molecular functions, and cellular components [18]. All enrichment plots were generated using the clusterProfiler package in R [19].
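
The selection rule described above can be sketched for a single gene as follows. This is a minimal illustration, not the pipeline used in the study: it assumes the inputs are log2-transformed expression values for the five matched pairs, and it replaces the explicit p-value with the equivalent comparison of |t| against the two-sided 5% critical value of the t distribution with 4 degrees of freedom (2.776). The function names are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(tumor, normal):
    """Paired t statistic computed on per-patient differences."""
    if len(tumor) != len(normal) or len(tumor) < 2:
        raise ValueError("need at least two matched pairs")
    diffs = [t - n for t, n in zip(tumor, normal)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("differences have zero variance")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


def is_differentially_expressed(tumor_log2, normal_log2,
                                t_critical=2.776, min_abs_log2fc=1.0):
    """Apply both filters: significance and fold-change magnitude.

    t_critical=2.776 corresponds to p < 0.05 (two-sided) only for
    5 pairs (4 degrees of freedom); change it for other sample sizes.
    """
    mean_log2fc = mean(t - n for t, n in zip(tumor_log2, normal_log2))
    t_stat = paired_t_statistic(tumor_log2, normal_log2)
    return abs(t_stat) > t_critical and abs(mean_log2fc) > min_abs_log2fc
```

A gene passes only when both conditions hold, so a large but inconsistent fold change, or a consistent but small one, is excluded.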

Interactome analysis
The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING v11.0) database (https://string-db.org/) was searched for all candidate DEGs. A minimum interaction score > 0.4 was used as the criterion to establish the PPI network. The data were exported from the STRING website and imported into Cytoscape (version 3.9.1) for visualization [20]. The differentially expressed genes were cross-validated in the TCGA HNSC dataset using GEPIA (http://gepia.cancer-pku.cn/) [21].
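
The score cutoff used to establish the network can be illustrated with a short sketch. It assumes the exported interactions are available as (protein A, protein B, combined score) tuples with scores on a 0 to 1 scale; the function names and the placeholder proteins P1 to P4 are hypothetical and do not come from the study data.

```python
from collections import defaultdict


def build_ppi_network(scored_edges, min_score=0.4):
    """Keep interactions whose score exceeds min_score.

    Returns an undirected adjacency map: protein -> set of partners.
    Proteins with no retained interaction do not appear in the map.
    """
    adjacency = defaultdict(set)
    for protein_a, protein_b, score in scored_edges:
        if protein_a != protein_b and score > min_score:
            adjacency[protein_a].add(protein_b)
            adjacency[protein_b].add(protein_a)
    return dict(adjacency)


def count_edges(adjacency):
    """Each undirected edge is stored twice, once per endpoint."""
    return sum(len(partners) for partners in adjacency.values()) // 2


# Placeholder interactions, for illustration only.
example_edges = [
    ("P1", "P2", 0.90),
    ("P2", "P3", 0.41),
    ("P1", "P3", 0.20),  # below the cutoff, dropped
    ("P3", "P4", 0.40),  # not strictly greater than 0.4, dropped
]
network = build_ppi_network(example_edges)
```

The resulting adjacency map is the kind of node and edge list that a tool such as Cytoscape visualizes.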

RNA integrity and quality control
The highest possible RIN value is 10, indicating highly intact total RNA. Only total RNA with a RIN value of at least 7.0 and an A260/A280 ratio between 2.07 and 2.20 was used to generate the sequencing libraries.

Differentially expressed genes in OSCC patients
RNA-seq analysis of the OSCC and normal tissue samples was performed using CLC Genomics Workbench, which yielded different read counts for each sample (Table 1). Initially, the RPKM values were log2 transformed in all five control vs. test sets, and the log2FC was calculated for each set. The numbers of genes showing log2FC > 1 or log2FC < -1 in the five sets were 18916, 14463, 16987, 14225, and 16826, respectively. Intersection of these gene lists showed that 2082 genes were differentially expressed across all sets (Figure 1), comprising 1092 upregulated and 273 downregulated genes, whereas 717 genes were non-significant (Figure 2). The top 10 upregulated and downregulated genes in cancerous and normal tissue are summarized in Table 2.
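
The per-pair thresholding and intersection described above can be sketched as follows. This is a simplified illustration under stated assumptions: each tumor-control set is represented as a mapping from gene to (tumor RPKM, normal RPKM), and a pseudocount of 1 is added before the log2 transform to avoid taking the logarithm of zero. The pseudocount, the function names, and the placeholder gene names are assumptions, not details reported in the study.

```python
import math


def log2_fold_change(tumor_rpkm, normal_rpkm, pseudocount=1.0):
    """log2(tumor / normal) with a pseudocount to handle zero RPKM."""
    return math.log2((tumor_rpkm + pseudocount) / (normal_rpkm + pseudocount))


def changed_genes(pair, min_abs_log2fc=1.0):
    """Genes with log2FC > 1 or < -1 in one tumor-control set."""
    return {
        gene
        for gene, (tumor, normal) in pair.items()
        if abs(log2_fold_change(tumor, normal)) > min_abs_log2fc
    }


def shared_changed_genes(pairs):
    """Genes passing the threshold in every tumor-control set."""
    gene_sets = [changed_genes(pair) for pair in pairs]
    return set.intersection(*gene_sets) if gene_sets else set()


# Placeholder values for two sets, for illustration only.
set_1 = {"geneA": (7.0, 1.0), "geneB": (3.0, 3.0), "geneC": (0.0, 7.0)}
set_2 = {"geneA": (15.0, 3.0), "geneB": (7.0, 1.0), "geneC": (1.0, 1.0)}
shared = shared_changed_genes([set_1, set_2])
```

Intersecting the per-set lists is what reduces the large per-set counts to the much smaller group of genes that change in every patient.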

Enrichment of cancer pathways and biological processes by mutated genes in OSCC
The GO analysis was carried out using P < 0.05 as the cutoff criterion. A total of 2082 differentially expressed genes (DEGs) were substantially enriched in 27 distinct GO terms, which were classified into three functional categories: biological process, cellular component, and molecular function. Subsequently, pathway enrichment analysis was carried out to identify aberrant gene-associated pathways. A total of 40 different pathways were considerably enriched (Figure 3), among which the p53 signaling pathway, TNF signaling pathway, cGMP-PKG signaling pathway, Apelin signaling pathway, and IL-17 signaling pathway were strikingly associated with OSCC.

PPI network construction
DEG interaction networks were generated using the STRING database. A total of 85 proteins (nodes) were connected by 38 edges. The average local clustering coefficient was 0.29, and the protein-protein interaction enrichment p-value was 0.0044 at a minimum interaction score > 0.40, indicating that the network has significantly more interactions than expected by chance. Eight interactions demonstrated various biological linkages, including cell cycle, energy metabolism, the PPAR signaling pathway, and the PI3K-Akt signaling pathway, which are strongly involved in cancer progression (Figure 4).

Cross-validation using TCGA database
Multiple gene comparison was performed using GEPIA to validate the expression of the selected genes from our cohort against the TCGA database. Selected genes were found to be

Discussion
For decades, numerous cancer studies have concentrated on genes related to cell cycle control and transcriptional regulation. In this small study, a comprehensive transcriptomic analysis of OSCC tumor and surrounding normal tissue pairs was performed to explore DEGs. A growing number of tumor suppressor genes and oncogenes that contribute to OSCC have been identified using Roche's 454 sequencing platform. De novo sequencing discovered transcripts, eliminating the need for tissue-specific transcriptome information. Furthermore, it provided the relative abundance of each transcript, allowing a more accurate comparison of OSCC with normal tissues.
Cancer progression is characterized by six essential alterations in cell physiology: self-sufficiency in growth signals, insensitivity to growth-inhibitory (antigrowth) signals, sustained angiogenesis, evasion of apoptosis, limitless replicative potential, and tissue invasion and metastasis [22]. In the present analysis, we identified eight upregulated genes closely associated with OSCC development: INHBA (Inhibin Subunit Beta A), FJX1 (Four-Jointed Box Kinase 1), OLR1 (Oxidized Low-Density Lipoprotein Receptor 1), CDK2 (Cyclin Dependent Kinase 2), IGHM (Immunoglobulin Heavy Constant Mu), CXCL11 (C-X-C Motif Chemokine Ligand 11), SH2D5 (SH2 Domain Containing 5), and FABP5 (Fatty Acid Binding Protein 5). Overexpressed INHBA upregulates Versican, which allows colon cancer cells to proliferate, migrate, and invade [23]. FJX1 is upregulated in many malignancies and functions as a regulator of angiogenesis [24]. OLR1 overexpression was associated with a poor prognosis in breast cancer, possibly as a result of its ability to induce macrophage polarization and trigger immune evasion [25]. CDK2 is essential for controlling the cell cycle. It phosphorylates and interacts with proteins involved in a wide variety of cellular processes, including repair of DNA damage, intracellular transport, protein degradation, signal transduction, and metabolism of both DNA and RNA [26]. Carbonetti G et al. suggested that disrupting lipid signaling through FABP5 suppression could represent a novel approach to treating metastatic prostate cancer, because FABP5 is a key driver of lipid-mediated metastasis [27]. Overexpression of IGHM, CXCL11, and SH2D5 is also involved in various biological events that promote tumor growth [28][29][30]. Cancer-causing HPV can alter gene expression in a number of ways. The affected genes include NF-kappaB, NF-erythroid2-related factor 2, p16, p53, RB1, and certain microRNA genes [31].
Genetic abnormalities have been linked to the development of various diseases, such as cancer [32]. According to KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis, the majority of DEGs appear to function in metabolic processes. The enrichment of a large number of genes in metabolic pathways was expected, as tumors are known to be metabolically very active. Hence, additional research on the specific functional features of the genes covered in our study would improve our knowledge of oral carcinogenesis in our population.
In conclusion, selective and effective cancer treatment requires knowledge of differentially expressed biomarkers and their role in the dysregulation of biological processes. Notably, KEGG pathway analysis revealed substantial enrichment of DEGs in the p53 signaling and other cancer pathways. The reported cancer-associated biomarkers may help identify potential therapeutic targets for better prognosis of cancer patients. The signature candidates could be translated to clinical practice to increase early diagnostic accuracy.

Figure 1. Venn Diagram Showing Differentially Expressed Genes from Multiple Samples

Table 1. The Outcomes of an RNA-seq Investigation Carried Out with CLC Genomics Workbench

Table 2. Top 10 Upregulated and Downregulated Genes from the 5 OSCC Tumor-normal Pairs