Association of EGFR 1 Gene Alteration in Lung Adenocarcinoma Patients

Background: The epidermal growth factor receptor 1 (EGFR1) plays a significant role in cell proliferation and development. Its regulation in humans is critical and remains incompletely understood in non-small cell lung cancer (NSCLC). Methods: One hundred newly diagnosed NSCLC (lung adenocarcinoma) patients and 100 healthy controls were included. Allele specific (AS) polymerase chain reaction (PCR) was used for genotyping, and expression was analyzed by quantitative real time PCR. Overall survival of patients was analyzed by the Kaplan-Meier method, and ROC curves were plotted to assess prognostic significance. Results: A statistically significant difference (p<0.0001) in the distribution of CC, AA and CA genotypes between patients and healthy controls was observed. Compared to the CC genotype as reference, the OR was 30.40 (95% CI 1.75-524.9, p=0.0002) and 3.97 (95% CI 1.49-10.52, p=0.003) for the homozygous AA and heterozygous CA genotypes, respectively. Kaplan-Meier analysis of the relationship between EGFR1 (-191C/A) genotypes and median progression free survival of NSCLC patients showed a significant difference among genotypes (p=0.0002). In the ROC curve with respect to TNM stage, at the optimal cut-off value of a 9.88 fold increase in EGFR1 mRNA expression, sensitivity and specificity were 92.9% and 83.3%, respectively (AUC=0.95, p<0.0001). In the ROC curve with respect to distant metastases, at the optimal cut-off value of a 13.5 fold change in EGFR1 mRNA expression, sensitivity and specificity were 68.2% and 71.4%, respectively (AUC=0.81, p<0.0001). In the ROC curve with respect to the presence or absence of pleural effusion, at the optimal cut-off value of a 14.8 fold change in EGFR1 mRNA expression, sensitivity and specificity were 66.7% and 68.2%, respectively (AUC=0.71, p=0.009). Conclusions: The EGFR1 promoter polymorphism could be a risk factor associated with the disease, and may be used as a prognostic marker for patient survival and as a predictor of disease worsening.


Introduction
Single nucleotide polymorphisms (SNPs) are the most common source of human genetic variation, and they may contribute to an individual's susceptibility to cancer. Several studies have demonstrated that some variants affect either the expression or the activity of various enzymes, and that they are therefore associated with the risk of cancer development (Park et al., 2006). One SNP in the EGFR promoter occurs at an important binding site for the transcription factor SP1, which is necessary for activation of the EGFR promoter, and it correlates with increased promoter activity and expression of EGFR mRNA (Kobayashi et al., 2005). EGFR is also a possible genetic risk factor for cancer susceptibility. In transgenic mice, overexpression of EGFR initiates formation of oligodendroglioma and breast cancer (Weiss et al., 2003). To date, several features of the 5'-regulatory region of EGFR have been described, including a TATA-less, CAAT-less, high GC content promoter with multiple transcriptional start sites (Ishii et al., 1985). EGFR is highly expressed in many tumor types of epithelial origin, including breast, head and neck, and bladder cancers and non-small cell lung cancer (NSCLC) (Arteaga, 2002). Expression of high levels of EGFR has been associated with a poor prognosis, especially in NSCLC patients (Brabender et al., 2001). Several polymorphisms in the EGFR gene have been reported (Hsieh et al., 2005) and deposited into public databases (ncbi.nlm.nih.gov/SNP). The variant −191C/A (rs712830) has been associated with increased EGFR promoter activity and gene expression (Liu et al., 2005). This study aimed to characterize the frequency of EGFR gene polymorphisms in NSCLC patients and to determine their correlation with clinicopathological features and patient survival.

Study population and sample collection
This study included 100 newly diagnosed, histopathologically confirmed lung adenocarcinoma patients and 100 healthy subjects. The study was approved by the institutional ethics committee of Maulana Azad Medical College and Associated Hospitals and All India Institute of Medical Sciences, New Delhi. After informed consent was obtained, 3 mL of peripheral blood was collected in plain vials from each subject, and serum was separated and stored at -80 °C until circulating DNA and RNA extraction. Patients included in the study were followed for 2 years for overall and progression free survival analysis.

Circulating DNA, RNA isolation and cDNA synthesis
Circulating DNA was extracted from the serum samples of lung adenocarcinoma cases and healthy controls, stored at -80 °C, using a commercially available kit (Epigentech, USA) following the manufacturer's protocol, and circulating total RNA was extracted with Trizol reagent (AMRESCO, USA) according to the manufacturer's protocol. The quantity and quality of the circulating DNA and total RNA samples were assessed using a NanoDrop spectrophotometer. The ratio of absorbance at 260 and 280 nm (A260/280) was used to assess the purity of nucleic acids; A260/280 = 1.8 was considered pure for DNA and A260/280 = 2.0 for RNA. cDNA was synthesized from 100 ng of total RNA following the manufacturer's protocol (Verso, Thermo Scientific, USA). Briefly, 100 ng of total RNA, 5X cDNA synthesis buffer, dNTPs (5 mM each), RT enhancer, Verso RT enzyme mix and random hexamers/oligo dT (400 ng/μL) in a total volume of 20 μL were incubated for 60 min at 42 °C.

Genotyping and quantitative real time polymerase chain reaction (PCR)
The -191C/A (rs712830) promoter polymorphism of the EGFR1 gene was genotyped by the allele specific (AS) PCR method using circulating DNA. PCR was performed in a 25 μL reaction volume containing 3 μL of 100 ng template circulating DNA, 0.25 μL of each primer (25 pmol), and 10 μL of master mix containing 10 mM dNTPs, 20 mM MgCl2 and 5 U/μL Taq polymerase with 10× Taq buffer (Fermentas); the volume was made up to 25 μL with nuclease-free ddH2O. The programme consisted of 10 min of initial denaturation at 95 °C, followed by 40 cycles of 95 °C for 40 s, 64 °C for 40 s and 72 °C for 40 s, with a final 10 min extension step at 72 °C. The PCR product of 175 bp was visualized on a 2% agarose gel containing ethidium bromide (Figure 1). Circulating EGFR-1 mRNA expression was studied by the QRT-PCR method (SYBR Green I technology) with the β-actin gene as internal control. EGFR-1 and β-actin were amplified in a 20 μL reaction volume for 40 cycles of denaturation at 94 °C for 40 s, annealing at 60 °C for 40 s and extension at 72 °C for 40 s, followed by a final extension step at 72 °C for 5 min to complete the reaction. Melting curve analysis was performed over the range 40 to 90 °C to ensure specific amplification. A control without cDNA was included in each experiment as a non-template control, and all reactions were performed in duplicate. The relative quantification method (2^(−ΔΔCT)) was used to analyse the circulating EGFR-1 mRNA expression level with β-actin as internal control, and final results were expressed as mean fold change in circulating EGFR-1 mRNA expression in lung adenocarcinoma patients as compared to controls.
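The relative quantification step can be illustrated with a minimal sketch. The Ct readings below are hypothetical and are not data from this study; the function name `fold_change` is likewise only illustrative. The sketch assumes that duplicate Ct values are averaged before the ΔCT is taken, which the text does not state explicitly.

```python
from statistics import mean


def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    target = gene of interest (here EGFR-1), ref = internal control (here beta-actin).
    """
    d_ct_sample = ct_target_sample - ct_ref_sample      # dCt in the patient sample
    d_ct_control = ct_target_control - ct_ref_control   # dCt in the control (calibrator)
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical duplicate Ct readings (assumed values, not from the study).
egfr_patient = mean([24.0, 24.2])
actin_patient = mean([18.0, 18.2])
egfr_control = mean([27.5, 27.7])
actin_control = mean([18.1, 18.1])

fc = fold_change(egfr_patient, actin_patient, egfr_control, actin_control)
print(f"fold change = {fc:.2f}")
```

A lower Ct in the patient sample than in the control, after normalisation to β-actin, gives a negative ΔΔCT and therefore a fold change above 1.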

Statistical analysis
Differences in selected demographic characteristics and EGFR-1 genotype frequencies between the cases and controls were evaluated using the Chi-square test. The associations between EGFR-1 variant genotypes and risk of lung adenocarcinoma were estimated by calculating odds ratios (ORs) with 95% confidence intervals (CIs). Allele frequencies in the cases and controls were evaluated using the Hardy-Weinberg equilibrium test. The Kruskal-Wallis test was used to analyze the association of gene expression with the different EGFR-1 genotypes. The Kaplan-Meier method was used to calculate the overall and progression free survival of lung adenocarcinoma patients. A P value <0.05 was considered indicative of a statistically significant difference. All statistical analyses were performed using SPSS 16 and GraphPad version 6.0.
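As an illustration of the Hardy-Weinberg equilibrium test, the sketch below applies a chi-square goodness-of-fit test (1 degree of freedom) to the control genotypes. It assumes that the control frequencies reported in the case-control results (94% CC, 6% CA, 0% AA) correspond to counts out of all 100 controls. The function name `hwe_chi_square` is illustrative, and the analysis in the paper was run in SPSS and GraphPad, so this is not the authors' code.

```python
import math


def hwe_chi_square(n_cc, n_ca, n_aa):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df)."""
    n = n_cc + n_ca + n_aa
    p = (2 * n_cc + n_ca) / (2 * n)   # frequency of the C allele
    q = 1.0 - p                       # frequency of the A allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_cc, n_ca, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Assumed control counts: 94 CC, 6 CA, 0 AA out of 100 controls.
chi2_hwe, p_hwe = hwe_chi_square(94, 6, 0)
print(f"chi2 = {chi2_hwe:.4f}, p = {p_hwe:.3f}")
```

With so few expected AA homozygotes the chi-square approximation is rough, so an exact test may be preferable for a table like this one.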

Demographics
The demographic features of the subjects are shown in Table 1. In brief, a total of 100 lung adenocarcinoma patients were analyzed, and healthy controls were matched for age, sex, and smoking history (status and type).

Case-control genotype distribution
The genotype and allele distributions of EGFR1 (-191C/A) in lung adenocarcinoma cases and controls, analyzed using circulating DNA, are summarised in Table 2. A statistically significant difference (p<0.0001) in the distribution of CC, AA and CA genotypes between patients and healthy controls was observed. The CC, AA and CA genotype frequencies among NSCLC patients were 71%, 11% and 18%, whereas among controls they were 94%, 0% and 6%, respectively.
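The genotype comparison can be checked with a short Pearson chi-square sketch. It assumes that the reported percentages correspond to counts out of 100 cases and 100 controls, and it uses the closed-form survival function for 2 degrees of freedom. The function name `chi_square_2x3` is illustrative and not part of the original analysis.

```python
import math


def chi_square_2x3(row_a, row_b):
    """Pearson chi-square test of independence for a 2 x 3 table (2 df)."""
    total = sum(row_a) + sum(row_b)
    col_totals = [a + b for a, b in zip(row_a, row_b)]
    chi2 = 0.0
    for row in (row_a, row_b):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / total
            chi2 += (observed - expected) ** 2 / expected
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value


# Genotype counts in the order CC, AA, CA (assumed from the reported percentages).
cases = (71, 11, 18)
controls = (94, 0, 6)

chi2, p_value = chi_square_2x3(cases, controls)
print(f"chi2 = {chi2:.3f}, p = {p_value:.2e}")
```

Under these assumptions the p-value falls below 0.0001, in line with the reported result.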

EGFR1 (-191C/A) gene polymorphism and risk of lung adenocarcinoma
Odds ratios with 95% confidence intervals were calculated for each group to estimate the degree of association between the EGFR1 (-191C/A) genotypes and risk of NSCLC in Indian patients. Compared to the CC genotype as reference, the OR was 30.40 (95% CI 1.75-524.9, p=0.0002) and 3.97 (95% CI 1.49-10.52, p=0.003) for the homozygous AA and heterozygous CA genotypes, respectively. This suggested a possible dominant effect of the EGFR1 (-191C/A) mutant A allele on NSCLC risk in the Indian population, with the higher risk associated with the AA homozygous genotype (Table 3).
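The odds ratios can be approximately reproduced from the genotype counts with the sketch below. Two assumptions are made that the text does not state: the counts are taken from the reported percentages out of 100 subjects per group, and the zero AA count in controls is handled by adding 0.5 to every cell (the Haldane-Anscombe correction). The confidence interval is a Wald interval on the log odds ratio; the function name `odds_ratio_ci` is illustrative.

```python
import math


def odds_ratio_ci(case_variant, case_ref, ctrl_variant, ctrl_ref, z=1.96):
    """Odds ratio with a Wald confidence interval on the log scale.

    If any cell is zero, 0.5 is added to all four cells
    (Haldane-Anscombe correction, an assumption of this sketch).
    """
    cells = [case_variant, case_ref, ctrl_variant, ctrl_ref]
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Assumed counts: cases CC 71, AA 11, CA 18; controls CC 94, AA 0, CA 6.
or_aa, lo_aa, hi_aa = odds_ratio_ci(11, 71, 0, 94)   # AA versus CC
or_ca, lo_ca, hi_ca = odds_ratio_ci(18, 71, 6, 94)   # CA versus CC

print(f"AA vs CC: OR = {or_aa:.2f} (95% CI {lo_aa:.2f}-{hi_aa:.1f})")
print(f"CA vs CC: OR = {or_ca:.2f} (95% CI {lo_ca:.2f}-{hi_ca:.2f})")
```

Under these assumptions the point estimates come out at about 30.40 and 3.97, matching the reported values, while the interval bounds differ slightly in the last digits. The very wide interval for AA reflects the absence of AA homozygotes among controls.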

EGFR1 (-191C/A) genotypes and circulating EGFR1 mRNA expression
Real time relative quantification analysis showed higher circulating EGFR1 mRNA expression among NSCLC patients than among healthy controls, but no significant difference in circulating mRNA expression was observed with respect to EGFR1 genotype in NSCLC patients. The fold change was calculated with respect to the EGFR1 CC wild type genotype in NSCLC patients, taken as the reference value. Patients with the homozygous AA and heterozygous CA genotypes showed a more than 15 fold (mean) increase in circulating EGFR1 mRNA expression, while the homozygous CC wild type genotype showed a more than 12 fold increase (Table 4).

EGFR1 (-191C/A) genotypes and survival outcome

Overall survival
Kaplan-Meier survival analysis was performed to analyze the relationship of the EGFR1 (-191C/A) polymorphism with overall survival of NSCLC patients. No significant difference in overall survival was observed among genotypes, although patients with the EGFR1 AA genotype (8.9 months) and CA genotype (8.0 months) showed shorter median overall survival than those with the wild type CC homozygous genotype (11.2 months) (Figure 2a).

Progression free survival
Kaplan-Meier survival analysis was also performed to analyze the relationship of EGFR1 (-191C/A) genotypes with median progression free survival of NSCLC patients. Patients with the EGFR1 AA genotype (4.5 months) and CA genotype (4.8 months) showed reduced progression free survival compared to the CC genotype (11.3 months), and the difference was significant (p=0.0002) (Figure 2b).
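The median survival times quoted for each genotype come from Kaplan-Meier curves. The following minimal sketch shows how such an estimate and its median are obtained. The follow-up times and event indicators are hypothetical and are not data from this study; the function names are illustrative.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times: follow-up in months; events: 1 = event observed, 0 = censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Hypothetical follow-up data in months (assumed values, not from the study).
times = [2, 3, 3, 5, 6, 8, 9, 12, 15, 24]
events = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]

curve = kaplan_meier(times, events)
print(median_survival(curve))
```

Censored patients stay in the risk set up to their last follow-up time, which is why they lower the denominator for later events without producing a step in the curve.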

Circulating EGFR1 mRNA expression as predictive/prognostic marker in lung adenocarcinoma patients
To evaluate the role of circulating EGFR1 mRNA as a predictive/prognostic marker for NSCLC patients, different clinical parameters were dichotomised into two groups, and ROC curves were plotted between early and advanced stage NSCLC patients. In the ROC curve with respect to TNM stage, at the optimal cut-off value of a 9.88 fold increase in EGFR1 mRNA expression, sensitivity and specificity were 92.9% and 83.3%, respectively (AUC=0.95, p<0.0001). ROC curves were also plotted with respect to the presence or absence of distant metastases and of pleural effusion. In the ROC curve with respect to distant metastases, at the optimal cut-off value of a 13.5 fold change in EGFR1 mRNA expression, sensitivity and specificity were 68.2% and 71.4%, respectively (AUC=0.81, p<0.0001). In the ROC curve with respect to the presence or absence of pleural effusion, at the optimal cut-off value of a 14.8 fold change in EGFR1 mRNA expression, sensitivity and specificity were 66.7% and 68.2%, respectively (AUC=0.71, p=0.009) (Figure 3).
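The quantities reported for each ROC curve (sensitivity and specificity at a cut-off, and the AUC) can be computed as in the sketch below. The fold-change values and stage labels are hypothetical. The text does not say how the optimal cut-off was chosen, so the use of the Youden index here is an assumption, and the function names are illustrative.

```python
def sensitivity_specificity(values, labels, cutoff):
    """labels: 1 = condition present (e.g. advanced stage), 0 = absent.

    A sample is called positive when its value is >= cutoff.
    """
    pairs = list(zip(values, labels))
    tp = sum(1 for v, y in pairs if y == 1 and v >= cutoff)
    fn = sum(1 for v, y in pairs if y == 1 and v < cutoff)
    tn = sum(1 for v, y in pairs if y == 0 and v < cutoff)
    fp = sum(1 for v, y in pairs if y == 0 and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(values, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(values, labels):
    """Cut-off maximising sensitivity + specificity (assumed selection rule)."""
    candidates = sorted(set(values))
    return max(candidates, key=lambda c: sum(sensitivity_specificity(values, labels, c)))


# Hypothetical fold-change values and stage labels (not data from the study).
fold = [3.1, 5.4, 7.9, 9.0, 10.2, 11.5, 12.8, 14.1, 16.3, 18.7]
stage = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]   # 1 = advanced, 0 = early

cutoff = youden_cutoff(fold, stage)
sens, spec = sensitivity_specificity(fold, stage, cutoff)
print(cutoff, round(sens, 3), round(spec, 3), round(auc(fold, stage), 3))
```

Raising the cut-off trades sensitivity for specificity, which is why each clinical endpoint above has its own optimal fold-change threshold.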

Discussion
Variation in the EGFR1 essential promoter region has been associated with altered promoter activity and gene expression both in vitro and in vivo. The SNP occurs at a binding site for the transcription factor SP1, which is necessary for activation of the EGFR1 promoter, and correlates with increased promoter activity and expression of EGFR1 mRNA (Kobayashi et al., 2005). Several polymorphisms in the EGFR1 gene have been reported (Hsieh et al., 2005) and deposited into public databases (Liu et al., 2005). The variant -191C/A has been associated with increased EGFR1 promoter activity and gene expression (Moriai et al., 1994), and Costa BM et al. in 2011 found that the heterozygous EGFR1 -191C/A genotype could be used as a predictive marker in glioblastoma (Costa et al., 2018). In the present study, the EGFR1 (-191C/A) promoter polymorphism was found to be associated with increased risk and poor prognosis of NSCLC in the Indian population, and a significant difference was observed in the genotype distribution of EGFR1 -191C/A between NSCLC cases and healthy controls. The risk of developing NSCLC was more than 30 fold higher (OR 30.40) with the homozygous EGFR1 -191AA genotype than with the homozygous EGFR1 -191CC genotype. Genome-wide association studies (GWAS) showed that EGFR1 polymorphisms are associated with increased glioma risk (Sanson et al., 2011). Higher circulating EGFR1 mRNA expression was observed with the AA mutant homozygous and CA heterozygous genotypes compared to the CC homozygous wild type genotype, although this difference was not statistically significant. Median overall survival was shorter with the EGFR1 AA mutant homozygous genotype than with the EGFR1 CC wild type genotype, although not significantly so, whereas progression free survival was significantly reduced with the EGFR1 AA mutant homozygous and CA heterozygous genotypes. Ligand binding to EGFR1 triggers different downstream signalling pathways and results in several biological responses, including proliferation, differentiation, migration and apoptosis (Wells, 1999; Pu et al., 2009).
EGFR1 has been found to be upregulated in human carcinomas, and this is often related to poor prognosis or advanced pathological stage (Fischer-Colbrie et al., 1997). A switch from greater cytoplasmic to greater membranous EGFR1 expression occurs during cancer progression (Piyathilake et al., 2002). Increased expression of EGFR1 mRNA in the plasma of NSCLC patients provides a potential tool for a molecular approach to diagnosis (Radostina et al., 2010). In the present study, the high AUC of the ROC curve for advanced versus early stage patients, together with the high sensitivity and specificity of serum EGFR1 mRNA expression, suggests that it could be a useful predictive marker of disease progression in NSCLC. Scagliotti et al. (2004) also reported that EGFR1 expression in NSCLC is associated with reduced survival. EGFR1 is overexpressed in NSCLC patients and is associated with poor survival, and EGFR1 expression is clearly involved in lung cancer pathogenesis (Dowell and Minna, 2005). A previously published expression study found that circulating EGFR1 mRNA overexpression may be a predictive marker of poor prognosis, overall survival and metastatic behaviour in NSCLC patients (Mirza et al., 2015). The current study suggests that the EGFR1 gene polymorphism, detected in circulating DNA, could be an indicative marker of risk for lung adenocarcinoma and of poor progression free survival, and that circulating EGFR1 mRNA expression could be a prognostic indicator for disease advancement and metastatic behaviour.

Conflict of interest
None.