Determining Overall Survival and Risk Factors in Esophageal Cancer Using Censored Quantile Regression
Elaheh Zarean 1, Mahmoud Mahmoudi 2, Tara Azimi 3, Payam Amini 4*

Background: Esophageal cancer is one of the leading causes of death worldwide, and its rising global incidence calls for more attention. The purpose of this study was to determine the overall survival probability of esophageal cancer after diagnosis and to assess the potential risk factors in a population of Iranian patients. Materials and Methods: This retrospective cohort study was conducted on 127 cases of esophageal cancer in East Azarbaijan province, Iran. Participants were diagnosed during 2009-2010 and were followed up for 5 years. The event of interest was death due to esophageal cancer, and patients who survived until the end of the study were treated as right censored. Censored quantile regression was fitted to estimate the overall survival of the patients using the adjusted effects of the variables and was compared with the Cox regression model. Results: Patients' mean and median survival times were 16.99 and 10.06 months, respectively, and 89% of cases died by the end of the study. The 1-, 3-, 6-, 12- and 36-month survival probabilities were 0.95, 0.76, 0.60, 0.43 and 0.18. The median survival times for females and males without surgery were 21.79 and 14.76 months, respectively. The accuracy of predictions was 0.99 and 0.74 for the censored quantile regression and Cox models, respectively. Conclusion: Being male, not having surgery, a longer wait between the onset of symptoms and diagnosis, low socioeconomic status and old age are significant risk factors that reduce the probability of survival from esophageal cancer.


Introduction
Cancer is a group of diseases characterized by abnormal cell growth. It is one of the leading causes of mortality and disability globally (Oliver et al., 1992; Ferlay et al., 2010). About two-thirds of cancers occur in developing countries, where only 5% of the cancer control tools are available (Sung et al., 2005). Despite progress in medical science, the development of treatments and the increase in the number of survivors, a cancer diagnosis is associated with a great fear of dying and a sense of helplessness among patients (Brunelli et al., 2000; Blazeby et al., 2005; Gradauskas et al., 2006). According to the World Health Organization, by 2020 the incidence of cancers will increase by 50%, and cancer will be the leading cause of death in the world by 2030 (Nia et al., 2011). Among all types of cancer, esophageal cancer is the sixth leading cause of death and the eighth most common in the world. It is in fifth place in developing countries, and its 5-year survival probability is about 15-25% (Parkin et al., 2005; Hebert et al., 2006; Pennathur et al., 2013). Esophageal cancer is classified into two main histological types, squamous cell carcinoma and adenocarcinoma. Multiple epidemiological studies have indicated that the incidence of esophageal squamous cell cancer is decreasing whereas the incidence of esophageal adenocarcinoma is rapidly increasing. This type of esophageal cancer affects more than 45,000 people across the world (Zhang et al., 2012). The lack of early symptoms and the presence of strong two-way laryngeal esophagus are reasons this disease is often diagnosed at a late and advanced stage (Allen et al., 1997; Patti and Owen, 1997; Sundelöf et al., 2002). Geography can play an important role in the incidence of esophageal cancer: the highest incidence rate is in "the Asian Esophageal Cancer Belt", which includes countries such as Turkey, Iran, Kazakhstan, and the northern and central parts of China. Men and women are equally affected in European countries, but in North America men are affected more (Pennathur et al., 2013). The highest incidence of esophageal cancer occurs in the age group of 50-70 years. Moreover, it has been indicated that men are more prone to this type of cancer (Ferlay et al., 2010; Zendehdel, 2014). Surgery is known as a definitive treatment for esophageal cancer, and the surgical technique used can itself affect patient survival. Radiotherapy and chemotherapy are the other major treatment options for patients with esophageal cancer (Napier et al., 2014). In Iran, approximately 51,000 new cases of cancer are reported annually.
The most commonly involved organ system in both sexes is the gastrointestinal tract (38%), of which 6,500 cases involve the esophagus (Sadjadi et al., 2005). In addition, gastrointestinal cancers are more common in the northern region of Iran (the region by the Caspian Sea), and the highest incidence of esophageal cancer in the world was reported by the Caspian Sea Cancer Center in 1973 (Mahboubi et al., 1973; Somi et al., 2008). Most Iranian studies have identified esophageal, gastric and breast cancers as the most common cancers in both sexes, skin cancer excepted (Rajaiefard et al., 2011). Esophageal cancer is the third most common cancer in men and the second most common cancer in women in the East Azarbaijan province of Iran (Somi et al., 2008). Risk factors for esophageal cancer include sex, race, drug abuse, alcohol consumption, obesity, and socioeconomic status (SES) (Vaughan et al., 1995; Tran et al., 2005). Of the 35,000 cancer deaths reported in Iran, 5,800 were due to esophageal cancer (Sadjadi et al., 2005). Few studies have examined the survival of patients with esophageal cancer in Iran, although such analysis is very important, and appropriate statistical models can better identify the prognostic factors that matter for improving patients' survival.
Time-to-event data are among the most common types of data in medical research. Many statistical tools are available to analyze such data, including parametric models such as the Weibull model, semi-parametric approaches such as Cox and quantile regression, and non-parametric tools such as the Kaplan-Meier estimator (Portnoy, 2003; Cox, 2018). The semi-parametric Cox proportional hazards model rests on several important assumptions, including the proportional hazards (PH) assumption, under which the effect of a covariate on the hazard function is constant over time. Where the PH assumption does not hold, the estimated hazard ratio (HR) can be misinterpreted. Despite the several advantages of Cox models, the difficulty of interpreting the HR has been identified as a major problem (Xue et al., 2018). Parametric alternatives such as the accelerated failure time (AFT) model give a direct interpretation of covariate effects on event time, but they assume a homogeneous treatment effect. A method that does not require the assumptions of the classic parametric and semi-parametric survival models is therefore needed. Because the censored quantile regression (CQR) model allows the relationship between covariates and survival time to vary across quantiles and has a more straightforward interpretation than classic survival models, it can be considered a useful tool for modeling time-to-event data (Xue et al., 2018).
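The three model families contrasted in this paragraph can be written side by side. The forms below are the standard textbook ones rather than formulas taken from this study, and the symbols h_0 (baseline hazard), \sigma (scale) and \varepsilon (error term) are notation assumed here for illustration only.

```latex
\begin{align*}
\text{Cox PH:} \quad & h(t \mid X) = h_0(t)\,\exp(X^{\prime}\beta) \\
\text{AFT:} \quad & \log T = X^{\prime}\beta + \sigma\varepsilon \\
\text{Quantile regression:} \quad & Q_T(p \mid X) = X^{\prime}\beta_p, \qquad 0 < p < 1
\end{align*}
```

In the Cox model the covariates multiply a common baseline hazard, and in the AFT model a single coefficient vector shifts the whole distribution of log survival time. In quantile regression the coefficient \beta_p is allowed to differ at each quantile p, which is what lets a covariate matter more in one part of the survival distribution than in another.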
Although the prevalence of esophageal cancer in the East Azarbaijan province is high, few studies have assessed the survival of these patients. The present study aims to determine the important risk factors for the survival of patients with esophageal cancer using the censored quantile regression model.

Materials and Methods
This retrospective cohort study obtained information on patients with gastrointestinal cancer diagnosed during 2009-2010 from the centralized national registry of cancer cases in Iran (Country Registration Program). All pathology centers in Iran are asked to report their cancer cases to the registry annually through a computer program. The data for this study were collected from 127 patients with esophageal cancer who lived in the cities of East Azarbaijan province in the northwest of Iran. The patients' information was extracted from their hospital and health center records. The patients referred to health centers and hospitals in this province were followed up for 5 years, until 2015. The start of follow-up was the date of pathologic diagnosis of esophageal cancer, and the event was death due to esophageal cancer. Patients who were still alive at the end of the study were considered right censored.
The current study used several available factors to assess the time to event among the patients. These factors were sex (male/female), age at the time of diagnosis (≤55/≥56 years), smoking habits (yes/no), education (illiterate/literate), marital status (married/single), residence (province center/urban/rural), surgery (yes/no), chemotherapy (yes/no), radiotherapy (yes/no), hormone therapy (yes/no), alcohol consumption (yes/no), biopsy type (endoscopy/other) and the time interval between symptom onset and diagnosis (≤2/≥3 months). In addition, we used principal components analysis to derive an indicator of socioeconomic status (high/low) from a checklist of questions on household fuel consumption, residential facilities, personal family facilities, household appliances used by the family, source of household income and total monthly household income. Patients' age and the symptom-to-diagnosis interval were dichotomized at the cut-off points that were most critical for patient survival.
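As an illustration of how a checklist can be reduced to a two-level SES indicator with principal components analysis, the following is a minimal sketch under stated assumptions. The paper does not say which component or which cut-off was used; here the first component of the standardized items is taken and split at its median. The function names and the toy `checklist` values are hypothetical, not study data.

```python
import math
import statistics


def first_component(rows, iters=200):
    """Loadings and scores of the first principal component of standardized items."""
    n, k = len(rows), len(rows[0])
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against constant items
    z = [[(row[j] - means[j]) / sds[j] for j in range(k)] for row in rows]
    # Correlation matrix of the items
    corr = [[sum(z[i][a] * z[i][b] for i in range(n)) / n for b in range(k)]
            for a in range(k)]
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iters):  # power iteration towards the leading eigenvector
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    if sum(v) < 0:  # the sign of a component is arbitrary; orient it upwards
        v = [-x for x in v]
    scores = [sum(z[i][j] * v[j] for j in range(k)) for i in range(n)]
    return v, scores


def ses_level(scores):
    """Assumed rule: above the median score is 'high', otherwise 'low'."""
    cut = statistics.median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Hypothetical checklist: one row per household, one column per item,
# coded so that larger numbers mean more resources.
checklist = [[0, 0, 1], [1, 0, 1], [1, 1, 2], [2, 1, 3], [3, 2, 3], [3, 3, 4]]
loadings, scores = first_component(checklist)
levels = ses_level(scores)
```

Standardizing the items first means the component is extracted from the correlation matrix, so items measured on different scales (for example income and appliance counts) contribute comparably.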
However, many other factors are related to survival in esophageal cancer, such as metastatic status, tumor size, the stage of disease and the histological type (adenocarcinoma or squamous cell carcinoma). These prognostic factors were not assessed because the patients' medical records were not accessible in the East Azarbaijan cancer registry center.

Statistical analysis
Descriptive characteristics of the patients are presented as mean (± standard error) for continuous variables and frequency (percentage) for categorical variables. The mean survival time for each subgroup of each variable was calculated using the Kaplan-Meier estimator. Moreover, the log-rank test was performed to compare the survival distributions among the subgroups of each variable.
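The two descriptive tools named here can be sketched in a few lines. This is a minimal standard-library illustration, not the analysis code of the study (which used the R "survival" package); the function names and toy numbers are hypothetical. Events are coded 1 for an observed death and 0 for right censoring.

```python
import math
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death."""
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every subject leaves the risk set at their own time
    at_risk = len(times)
    surv, curve = 1.0, []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve


def km_quantile(curve, p):
    """Smallest time at which estimated survival falls to 1 - p or below."""
    for t, s in curve:
        if s <= 1.0 - p:
            return t
    return None  # the curve never drops that far


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square (1 degree of freedom) and its p-value."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    o_minus_e, var = 0.0, 0.0
    for t in sorted({x for x, e in zip(all_times, all_events) if e == 1}):
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        o_minus_e += d_a - d * n_a / n  # observed minus expected deaths in group A
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if var == 0.0:
        return 0.0, 1.0
    chi2 = o_minus_e ** 2 / var
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df


# Hypothetical follow-up times in months
toy_times = [2, 3, 3, 5, 8, 8, 12]
toy_events = [1, 1, 0, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = km_quantile(toy_curve, 0.5)
```

A subject censored at the same time as a death is kept in the risk set for that death, which is the usual convention.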
However, popular classical statistical tools like the Cox model depend on assumptions, such as proportional hazards, that may not hold in practice (Xue et al., 2018).
In the current study, we used censored quantile regression to estimate the overall survival of the patients using the adjusted effects of the variables (Fitzenberger, 1997; Portnoy, 2003; Wang and Wang, 2009). This model provides clinicians with a range of quantiles of survival time conditional on several risk factors. The model estimates the pth quantile of survival time (Qp) as follows, where X denotes the covariates and factors and βp is the coefficient vector for the pth quantile.

$$Q(p \mid X) = X^{\prime}\beta_p$$
For example, Q50 = a (days) means that a randomly selected person from the sample has a probability of 0.5 of experiencing the event within a days. The model uses the bootstrap resampling method to estimate the standard errors of the coefficients. The factors that were significant in the log-rank tests were entered in the CQR model to assess the adjusted effects of the independent factors on patient survival.
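The quantile definition and the bootstrap step can be made concrete with a small sketch. It deliberately ignores censoring and covariates, so it is ordinary quantile estimation rather than the censored estimator used in the study: it only shows that minimizing the check (pinball) loss recovers a sample quantile and how resampling yields a standard error. All names and values are hypothetical.

```python
import random
import statistics


def check_loss(residual, p):
    """Check (pinball) loss: weight p above the fit, 1 - p below it."""
    return residual * (p - (1.0 if residual < 0 else 0.0))


def fit_constant_quantile(y, p):
    """Intercept-only quantile fit; a minimizer always lies at an observed value."""
    return min(sorted(y), key=lambda q: sum(check_loss(v - q, p) for v in y))


def bootstrap_se(y, p, n_boot=200, seed=0):
    """Standard deviation of the estimate over resamples drawn with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(y) for _ in y]
        estimates.append(fit_constant_quantile(sample, p))
    return statistics.stdev(estimates)


# Hypothetical uncensored survival times in months
toy_months = [1, 2, 3, 4, 5, 6, 7, 8, 9]
toy_q50 = fit_constant_quantile(toy_months, 0.5)
toy_q50_se = bootstrap_se(toy_months, 0.5)
```

With covariates, the constant q is replaced by a linear predictor and the same loss is minimized over its coefficients. Handling right censoring requires an additional adjustment of the kind described in the references cited above (for example Portnoy, 2003), which this sketch does not attempt.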
The performance of the CQR model was compared with that of the frequently used Cox proportional hazards regression model using Chambless and Diao's estimator of cumulative/dynamic AUC for right-censored time-to-event data (Chambless and Diao, 2006). To do so, the data were randomly divided into a training set (70%) and a test set (30%). The training set was used for model fitting and the test set for validating the results. To obtain more reliable results, the random split was repeated 500 times and the mean accuracy measure is reported for each model. The Schoenfeld residuals were used to assess the proportionality of hazards between the groups of the independent variables. The data analysis was carried out using the "survival", "quantreg" and "survAUC" packages in R version 3.3.1. All statistical tests were two-sided, and a p-value < 0.05 was considered statistically significant.
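The repeated 70/30 validation scheme can be sketched as follows. Two substitutions are made for brevity and should not be read as the study's method: Harrell's concordance index stands in for the Chambless and Diao AUC estimator, and a deliberately naive group-mean predictor stands in for the fitted CQR and Cox models. All names and the toy records are hypothetical.

```python
import random
import statistics


def harrell_c(times, events, risk):
    """Share of comparable pairs in which the higher risk score died first."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        for j in range(len(times)):
            # comparable: subject i died strictly before j's follow-up ended
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else None


def repeated_holdout(data, fit, score, n_repeats=500, train_frac=0.7, seed=1):
    """Mean test-set score over repeated random train/test splits."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        cut = int(round(train_frac * len(shuffled)))
        model = fit(shuffled[:cut])           # fit on the training part only
        value = score(model, shuffled[cut:])  # evaluate on the held-out part
        if value is not None:
            values.append(value)
    return statistics.fmean(values)


def fit_group_means(train):
    """Naive stand-in model: mean observed time per group (ignores censoring)."""
    return {g: statistics.fmean(t for t, _, grp in train if grp == g) for g in (0, 1)}


def score_c(means, test):
    times = [t for t, _, _ in test]
    events = [e for _, e, _ in test]
    risk = [-means[g] for _, _, g in test]  # shorter predicted time = higher risk
    return harrell_c(times, events, risk)


# Hypothetical records: (time in months, event indicator, group)
toy = [(t, 1, 1) for t in range(1, 11)]
toy += [(t, 0 if t % 3 == 0 else 1, 0) for t in range(11, 21)]
mean_c = repeated_holdout(toy, fit_group_means, score_c, n_repeats=50)
```

Fitting inside the loop, on the training part only, is what keeps the reported mean an out-of-sample measure.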

Results
The mean (± standard deviation) age of the 127 patients was 66.92 (± 11.95) years. The survival time ranged from 0.10 to 69.03 months, and the mean and median survival times were 16.99 and 10.06 months, respectively. A total of 113 individuals (89%) died by the end of the study. Figure 1 shows a decreasing trend in survival probability; the one-, three- and six-month and the one- and three-year survival probabilities were 0.95, 0.76, 0.60, 0.43 and 0.18, respectively.
Table 2 shows the results of the CQR model. The variables that were significant in the unadjusted log-rank test were entered in the model. The impact of the selected variables on patient survival was evaluated at the 5th, 10th, 15th, 20th, 25th, 30th, 40th, 50th, 60th, 70th, 75th, 80th and 85th percentiles of survival time. According to the fitted model, the 10th percentile of survival time for a patient with low SES who is older than 55 years is 1.79 months. That is, such a patient has a 10 percent probability of dying within 1.79 months of diagnosis. The 10th percentile of survival time is 2.96 months shorter for patients older than 55. Similarly, the median survival times for females and males without surgery are 21.79 and 14.76 months. In other words, the probability of survival for a female without surgery at 21.79 months is 50 percent, whereas males are expected to experience the event 7.03 months earlier. In addition, the median survival time for a patient without surgery is 3.02 months shorter than for a patient with surgery. The same interpretation applies to any other quantile. As shown in Table 2, most of the survival time quantiles are affected by sex and by the presence or absence of surgery.
The results of the Cox PH regression model are shown in Table 3. The Schoenfeld residuals indicated proportional hazards between the groups of the independent variables. The hazard of death from esophageal cancer among females was 0.533 (95% CI: 0.356-0.793) times that among males. Patients without surgery were 1.627 times more likely to die from esophageal cancer. Moreover, those with a symptom-to-diagnosis interval of 3 months or more were 1.506 times more likely to die from esophageal cancer.
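The reported hazard ratio and the difference between the two medians can be checked with a few lines of arithmetic. The sketch assumes that the published interval is a Wald interval that is symmetric on the log scale, which the paper does not state; the function name is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover log(HR), its standard error, z and a two-sided p from a 95% CI."""
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return log_hr, se, z, p


# Female versus male hazard ratio as reported in the text
log_hr, se, z, p = wald_from_ci(0.533, 0.356, 0.793)

# The same contrast expressed as male versus female
male_vs_female = 1.0 / 0.533

# Gap between the median survival times of females and males without surgery
median_gap = round(21.79 - 14.76, 2)
```

Under that assumption the geometric mean of the interval limits should sit close to the point estimate, which is a quick consistency check on any reported hazard ratio. The hazard ratio compares instantaneous risks, whereas the median gap is a difference in months, which is the interpretive contrast between the two models drawn in the Introduction.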
Using Chambless and Diao's (2006) estimator of cumulative/dynamic AUC for right-censored time-to-event data, the mean accuracy of the predictions was 0.99 for the CQR model and 0.74 for the Cox PH model.

Discussion
The current study showed that survival time after a diagnosis of esophageal cancer is significantly affected by sex, age, socioeconomic status, surgery, and the time interval between symptom onset and diagnosis. Almost 90% of the patients in our study died before the end of the study. Moreover, the median survival time was below one year. The survival probability remained almost constant from 30 months after diagnosis onward. The 40th to 85th percentiles of survival time were related to the presence or absence of surgery and to the sex of the patient.
Our study showed that a low socioeconomic status score is associated with a lower probability of survival from esophageal cancer. Chemotherapy, hormone therapy, surgery and other potential treatments might not be easily accessible to patients with low SES, contributing to an overall lower survival rate. Louwman et al. studied the prevalence of life-shortening factors among cancer patients with low socioeconomic status, including esophageal cancer patients. In a large population-based study they demonstrated that cancer patients with low SES were 50% more likely to suffer from another serious disease. The overall adverse consequences of esophageal cancer combined with the likelihood of having another serious disease may explain the lower survival rates among low SES patients with cancer (Louwman et al., 2010). Tran et al. (2005) assessed the impact of sex, race, socioeconomic status, and treatment on the survival of esophageal cancer patients. They concluded that lower SES is associated with a lower likelihood of receiving surgery, resulting in lower survival (Summart et al., 2017). In another study in Kashmir, a strong inverse relationship was found between SES and the risk of esophageal squamous cell carcinoma (Dar et al., 2013).
Our data also showed that males are more prone to die earlier. This might be because men are more likely than women to smoke and use tobacco, drink alcohol, and engage in other risky behaviors. However, there was no significant difference between males and females at quantiles of survival time below the 40th percentile. Melhado et al. discussed the changing face of esophageal cancer and demonstrated that the disease is more common among men than women (Melhado et al., 2010). The impact of sex on the survival of patients with esophageal cancer was investigated by Bohanes et al., who showed that women have longer survival in both metastatic and locoregional esophageal cancer. They suggested that hormonal differences and menopause explain the difference in survival between the two sexes (Bohanes et al., 2012). The difference in esophageal cancer survival between males and females was also discussed by Mathieu et al. These researchers concluded that higher estrogen levels in women may act as a protective agent against the cancer. They also showed that this protective effect disappears as females approach menopause (Mathieu et al., 2012).
The present study demonstrated that undergoing surgery extends the survival time of patients with esophageal cancer. Surgery has been introduced as the best option for the management of early-stage esophageal cancer, whereas chemotherapy and radiation therapy are suggested for later stages. D'Amico assessed the outcomes after surgery among patients with esophageal cancer and found slightly longer survival in patients who received chemotherapy after surgery than in those who had surgery alone (D'Amico, 2007). The health-related quality of life after surgery for esophageal cancer was evaluated by Lagergren et al.; more than half of the patients survived more than 3 years with improved emotional function (Lagergren et al., 2007).
The current research also revealed that patients older than 55 years and patients living in rural areas are more likely to have shorter survival. The same cut-off point was found by Bohanes et al. (2012). In another study, the incidence and mortality of esophageal cancer were found to increase with age in both sexes and in both rural and urban residents, with a higher risk of mortality and cancer incidence among rural patients than among those living in urban areas (Zeng et al., 2016). Our data showed that the probability of survival decreases as the time between symptom onset and diagnosis of esophageal cancer increases. We found that those with a period of longer than 3 months between the presence of symptoms and diagnosis are at a higher risk of death. Grotenhuis et al. (2010) assessed the delay in diagnostic workup and treatment of esophageal cancer and found that short-term outcomes such as morbidity and mortality rates are significantly associated with hospital delay, while the long-term consequences are less related. Moreover, we used censored quantile regression to estimate survival at any desired quantile. Given the skewness of time-to-event data, semi-parametric approaches such as quantile regression models can fit the data better and result in more valid estimates and interpretations (Portnoy, 2003).
This study was not without limitations. The sample size was relatively small. Moreover, data were frequently missing from the patients' records in the hospitals and health centers.
We conclude that being male, having no surgery, experiencing a longer time between symptom onset and diagnosis, having low socioeconomic status and being older in age are all significant risk factors in reducing the probability of survival from esophageal cancer.