Association between Telomere-Related Polymorphisms and the Risk of IPF and COPD as a Precursor Lesion of Lung Cancer: Findings from the Fukuoka Tobacco-Related Lung Disease (FOLD) Registry

Background: Lung cancer coexisting with idiopathic pulmonary fibrosis (IPF) or chronic obstructive pulmonary disease (COPD) can lead to a poor prognosis. Telomere-related polymorphisms may be implicated in the pathogenesis of these three lung diseases. Because elucidating the mechanism by which lung cancer arises via IPF or COPD may enable early detection and early treatment of the disease, we first examined the association between telomere-related polymorphisms and the risk of IPF and COPD in a case-control study. Materials and Methods: A total of 572 patients with IPF (n = 155) or COPD (n = 417), derived from our ongoing cohort study, and 379 controls, derived from our previous case-control study, were included in this study. Telomerase reverse transcriptase (TERT) rs2736100, telomerase RNA component (TERC) rs1881984, and oligonucleotide/oligosaccharide-binding fold containing 1 (OBFC1) rs11191865 were genotyped with real-time PCR using TaqMan fluorescent probes. Unconditional logistic regression was used to assess the adjusted odds ratios and 95% confidence intervals. Results: TERT rs2736100 was significantly associated with the risk of IPF; the risk of IPF increased with the number of risk alleles (Ptrend = 0.008). Similarly, TERT rs2736100 was associated with the risk of COPD. In regard to the combined action of the three loci, increasing numbers of “at-risk” genotypes increased the risk of IPF in a dose-dependent manner (Ptrend = 0.003). Conclusions: TERT rs2736100 was associated with the risks of both IPF and COPD in a Japanese population. A combination of the “at-risk” genotypes might help to identify the population at risk for IPF more clearly.


Introduction
Patients with chronic obstructive pulmonary disease (COPD) or idiopathic pulmonary fibrosis (IPF) are at an increased risk of lung cancer, especially in male smokers (Ozawa et al., 2009;El-Zein et al., 2012). As lung cancer is the most common cancer and the leading cause of cancer-related mortality worldwide (Ferlay et al., 2015;Nakagawa-Senda et al., 2017), a considerable number significant morbidity and mortality due to hypoxemic respiratory insufficiency (Yang and Schwartz, 2015). The malignant change from these lung diseases to lung cancer is partly caused by the accumulation of DNA damage. Cigarette smoking is the major cause of DNA damage, but natural biological processes such as aging also contribute to the accumulation of DNA damage.
Telomeres, which are repetitive DNA sequences, are located at the ends of linear chromosomes and protect the chromosome ends from degradation (Blackburn, 2001). Because critically shortened telomeres signal a DNA damage response that can lead to apoptosis, telomere length is thought to be a biomarker of aging (Blackburn et al., 2006;Mather et al., 2011). As smoking may induce oxidative stress and then irretrievable damage to the telomeric DNA (von Zglinicki, 2002), smoking may be associated with shortened telomere length. Despite biological plausibility between smoking and shortened telomere length, there have been inconsistencies in the studies (Harris et al., 2012;Revesz et al., 2015). As telomere length is influenced by non-genetic factors such as age and smoking, the relationship between telomere length and lung cancer is controversial (Jang et al., 2008;Machiela et al., 2015).
Recently, genes involved in telomere length have been implicated in the pathogenesis of a variety of chronic lung diseases (CLDs), including IPF, COPD, and lung cancer (Alder et al., 2011;Codd et al., 2013;Fingerlin et al., 2013;Gansner and Rosas, 2013;Snetselaar et al., 2015;Stanley et al., 2015;Zhou and Wang, 2016). Mutations in the telomerase complex genes TERT (telomerase reverse transcriptase) and TERC (telomerase RNA component), which help maintain telomere lengths and chromosome stability in cells, are of great importance to human health (O'Reilly et al., 1999;Ly, 2009) and have been reported as risk factors for COPD (Ding et al., 2019), IPF (Noth et al., 2013;Kropski et al., 2015) and lung cancer (Fernandez-Garcia et al., 2008;Li et al., 2017). OBFC1 (oligosaccharide-binding fold-containing protein 1) is part of the CST complex (consisting of the CTC1, STN1, and TEN1 proteins), which binds to single-stranded DNA and is important for telomere maintenance (Levy et al., 2010). Polymorphisms of OBFC1 have been reported as a risk factor for glioma (Walsh et al., 2015) and IPF (Fingerlin et al., 2013). Although overlapping genetic risk factors among COPD, IPF, and lung cancer have been reported (Haycock et al., 2017;Hobbs et al., 2017), not all genetic factors necessarily contribute to the development of all three diseases. Namely, the alleles associated with the risk of having a disease depend on the type of disease (van Moorsel, 2018).
TERT rs2736100 and TERC rs1881984 have been relatively well examined in both IPF and COPD, while OBFC1 rs11191865 has been reported to be associated with various carcinomas (Walsh et al., 2015). Because elucidating the mechanisms by which lung cancer arises via IPF or COPD may enable early detection and early treatment of the disease, we first examined the association between telomere-related polymorphisms (TERT rs2736100, TERC rs1881984 and OBFC1 rs11191865), which were selected from genome-wide association studies on IPF (Mushiroda et al., 2008;Fingerlin et al., 2013;Noth et al., 2013;Stuart et al., 2015;Allen et al., 2017), and the risk of IPF and COPD in this case-control study.

Fukuoka tobacco-related lung disease survey and population
In this case-control study, patients with COPD or IPF were selected from a multicenter (29 associated hospitals), prospective cohort study named the Fukuoka Tobacco-Related Lung Disease (FOLD) registry study, conducted in Fukuoka prefecture, Japan between September 1st, 2013 and April 1st, 2016 (Ogata-Suetsugu et al., 2020). The patients who agreed to donate blood samples for genetic testing were included in this study (Supplement and Supporting Data / SSD1). IPF was diagnosed based on the criteria of Raghu et al. (2011), while COPD was diagnosed according to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria (Vestbo et al., 2013). The details of the cohort setting have been described elsewhere (Ogata-Suetsugu et al., 2020).
Controls (n = 379) were derived from a previous case-control study conducted in Fukuoka prefecture between November 1996 and March 2008 (Kiyohara et al., 2014). They were hospitalized patients without a clinical history of any type of malignancy, ischemic heart disease, or chronic respiratory disease. All controls agreed to donate blood samples after providing written informed consent.
All subjects were unrelated ethnic Japanese. The study protocol was approved by our institutional review board and research ethics committee (#25-135, #555-00), and all participants provided written informed consent.

Genetic analysis
Genomic DNA was extracted from blood samples, and genotyping was conducted with blinding to case/control status. TaqMan® SNP Genotyping Assays purchased from Applied Biosystems (Foster City, CA, USA) were used for the following [gene, single nucleotide polymorphism (SNP), assay ID]: TERT, rs2736100, C___1844009_10; TERC, rs1881984, C___176429_10; and OBFC1, rs11191865, C___2818536_10. The real-time PCR reaction conditions were as follows: 95°C for 10 min, followed by 40 cycles of 95°C for 15 s and 60°C for 1 min. For quality control, we repeated assays on a random 5% of all samples, and the replicates were 100% concordant.

Statistical analysis
Comparisons of means and proportions were based on the unpaired t-test (or the Mann-Whitney test when the data did not follow a normal distribution) and the χ² test (or Fisher's exact test in case of n < 5), respectively. The former tests were used for continuous variables, while the latter ones were used for dichotomous variables. Unconditional logistic regression was used to compute the odds ratios (ORs) and their 95% confidence intervals (CIs), with adjustments for several covariates (age, sex, and smoking status). Deviation from Hardy-Weinberg Equilibrium (HWE) for each SNP was tested in controls by the chi-square (Pearson) test. Because there is no generally accepted answer to the question of which alleles are "at-risk" alleles, we selected the category with the largest number of subjects (generally major homozygotes) as the reference category. We then designated the genotype that is presumed to increase the risk of lung disease as the "at-risk" genotype. Thus, analyses were done under a dominant model (the heterozygotes grouped with the homozygotes for the "at-risk" allele), a recessive model (the heterozygotes grouped with the homozygotes for the "non-risk" allele) and a codominant model (genotypic model, three genotypes considered independently) in this study. The trend test was described in two ways: by reference to homozygotes of major alleles and by reference to homozygotes of minor alleles. For example, the trend was assessed by a score test for each genotype as follows: 0 = homozygous for the major allele, 1 = heterozygous, and 2 = homozygous for the minor allele. The cumulative effects of the "at-risk" genotypes were evaluated by assigning an ordinal score of 0 (no "at-risk" genotype), 1 (one "at-risk" genotype), 2 (two "at-risk" genotypes) or 3 (three "at-risk" genotypes).
Patients with respiratory disease may have changed their smoking habits following the appearance of respiratory symptoms before the diagnosis of the disease. As it is difficult to distinguish clearly between current smokers and former smokers, subjects were considered ever smokers if they smoked or had stopped smoking before the date of registration in the FOLD registry. Never-smokers were defined as those who had never smoked in their lifetime.
All statistical analyses were performed using the computer program STATA Version 15.1 (STATA Corporation, College Station, TX). All P values were two-sided, with those less than 0.05 considered statistically significant.

Characteristics of study subjects
SSD 2 shows the distributions of selected characteristics among study subjects. This study included 572 patients with IPF (n = 155) or COPD (n = 417) and 379 controls. Since controls were not selected to match patients in regard to age and sex, there were significant differences in age (P < 0.001) and sex ratio (P < 0.001) between patients and controls. The prevalence of smoking history was higher in both patient groups than in controls (P < 0.001 for both groups). Hence, we included age, sex and smoking history as covariates in all the analyses. The prevalence of a family history of idiopathic interstitial pneumonias was significantly higher in patients with IPF (4.5%) than in those with COPD (0.5%) (Fisher's exact P = 0.002). Diagnostic biopsies were performed in 13 patients with IPF. IPF patients had a lower prevalence of smoking history, a lower percentage of males, and a lower vital capacity (VC) (% predicted) than COPD patients. Conversely, patients with IPF had a higher BMI, a higher forced expiratory volume in 1 second (FEV1.0 (L) and %FEV), and a higher Tiffeneau Index than those with COPD. There was no difference in the prevalence of cancer history or gastroesophageal reflux disease between IPF patients and COPD patients.

Allelic frequencies in patients with IPF or COPD, and in controls
SSD 3 shows the allelic frequencies of telomere-related genetic polymorphisms in study subjects. The genotype distributions of rs2736100, rs1881984, and rs11191865 were in Hardy-Weinberg equilibrium in controls (P > 0.05). The frequencies of the minor alleles of rs2736100, rs1881984, and rs11191865 among controls were 41.3%, 36.0%, and 33.2%, respectively. As for rs2736100, the genotypic distribution among patients with

Association between telomere-related genetic polymorphisms and the risk of IPF or COPD
After adjustment for age, sex, and smoking status, the OR of the minor homozygote of rs2736100 for IPF was 0.35 (95% CI = 0.15-0.76) under the codominant model (Table 1). Increasing the number of minor alleles of rs2736100 significantly decreased the risk of IPF in a dose-dependent manner (Ptrend = 0.008). Namely, the major allele was an "at-risk" allele. Rs2736100 was significantly associated with an increased risk of IPF in all three genetic models [codominant model (TT vs. GG), OR = 2.88, 95% CI = 1.31-6.34; dominant model (TT + TG vs. GG), OR = 2.41, 95% CI = 1.14-5.09; recessive model (TT vs. TG + GG), OR = 1.76, 95% CI = 1.06-2.91]. Increasing the number of minor alleles of rs1881984 and rs11191865 tended to decrease the risk of IPF in a dose-dependent manner, although the trends were not statistically significant (Ptrend = 0.435 and 0.063, respectively).
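The genetic models compared above differ only in how a genotype is turned into a covariate. The sketch below illustrates that coding for rs2736100 and is not the authors' Stata code. The function name `code_genotype` is hypothetical, and T is taken as the "at-risk" allele because the text identifies the major allele as the risk allele.

```python
def code_genotype(genotype: str, risk_allele: str = "T") -> dict:
    """Return the covariate coding of one genotype under each genetic model.

    `genotype` is a two-letter string such as "TG"; `risk_allele` is the
    allele presumed to raise risk (assumed here: T of rs2736100).
    """
    n_risk = genotype.count(risk_allele)  # 0, 1 or 2 copies of the risk allele
    return {
        # Codominant (genotypic) model: three separate categories, 0 = reference.
        "codominant": n_risk,
        # Dominant model: heterozygotes grouped with risk-allele homozygotes.
        "dominant": int(n_risk >= 1),
        # Recessive model: heterozygotes grouped with non-risk homozygotes.
        "recessive": int(n_risk == 2),
        # Trend score: the same 0/1/2 value, entered as one linear term.
        "trend": n_risk,
    }


for g in ("GG", "TG", "TT"):
    print(g, code_genotype(g))
```

In a regression, the codominant value would be entered as a categorical variable (one OR per genotype), whereas the trend score would be entered as a single ordinal term, which is what yields one Ptrend per SNP.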
As shown in

Association between the combination of telomere-related genetic polymorphisms and the risk of IPF or COPD
To achieve adequate statistical power, the minor homozygote and the heterozygote were combined into one group (the reference category) for the subsequent analysis, irrespective of the genetic model. Table 3 shows the association between the combination of telomere-related genetic polymorphisms and the risk of IPF. According to the recessive model, the homozygotes for the "at-risk" allele were scored as 1 (one "at-risk" genotype) in each SNP. Increasing numbers of "at-risk" genotypes increased IPF risk in a dose-dependent manner (OR for one "at-risk" genotype = 1.78, 95% CI = 0.77-4.08; OR for two "at-risk" genotypes = 2.84, 95% CI = 1.20-6.73; OR for three "at-risk" genotypes = 3.51, 95% CI = 1.16-10.59; Ptrend = 0.003). In the two-locus analysis, the ORs for the combinations including OBFC1 rs11191865 were significantly increased.
In contrast, there was no dose-dependent relationship between COPD risk and the number of "at-risk" genotypes, though the OR for three "at-risk" genotypes was the highest (OR = 1.34, 95% CI = 0.51-3.54) (SSD 4).
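The cumulative score behind these combined analyses can be sketched as follows. This is a minimal illustration under the recessive coding described above: each SNP contributes 1 only when the subject carries two copies of its "at-risk" allele. The function name `at_risk_score`, the input format (a count of "at-risk" alleles per SNP) and the example subject are assumptions made for the example.

```python
SNPS = ("rs2736100", "rs1881984", "rs11191865")


def at_risk_score(risk_allele_counts: dict) -> int:
    """Count the SNPs (0 to 3) at which a subject is homozygous for the at-risk allele.

    `risk_allele_counts` maps each SNP id to the number of at-risk alleles
    carried (0, 1 or 2). Heterozygotes fall into the reference group.
    """
    return sum(1 for snp in SNPS if risk_allele_counts[snp] == 2)


# Hypothetical subject: homozygous at-risk at two loci, heterozygous at one.
subject = {"rs2736100": 2, "rs1881984": 1, "rs11191865": 2}
print(at_risk_score(subject))  # 2
```

The resulting 0 to 3 score is the ordinal variable for which the per-category ORs and the trend P value are reported.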
In the present study, the T allele of rs2736100 was associated with an increased risk of IPF [adjusted OR for the T allele = 1.73, 95% CI = 1.18-2.52, P = 0.005 (data not shown)]. A GWAS in a Japanese population reported that rs2736100 might contribute to the risk of IPF; the OR of IPF for the T allele was 1.81 (95% CI = 1.46-2.24) (Mushiroda et al., 2008). In non-Japanese populations, the ORs of IPF for the T allele were 1.75 (P = 0.17) among Koreans, 2.00 (P = 0.05) among Mexicans, and 1.37 (P < 0.001) among non-Hispanic whites (Peljto et al., 2015). It has been reported that the T allele is associated with shorter telomere length while the G allele is associated with longer telomere length (Codd et al., 2013). Although the mechanisms of telomere length regulation by rs2736100 are not currently understood, the SNP lies within a putative regulatory region (Landi et al., 2009) and is considered to influence TERT expression based on the evolutionary and sequence pattern extraction through reduced representations (ESPERR) score (Taylor et al., 2006). ESPERR is a computational method developed to create a reduced representation that removes noise while keeping useful signals for characterizing a class of functional components (available at https://omictools.com/esperr-tool). Germline mutations in telomerase components such as TERT and TERC are detected in 8%-15% of cases of familial IPF but in only 1%-3% of cases of sporadic IPF (Armanios, 2012). Variations in TERT and TERC cause telomerase haploinsufficiency, which results in a short telomere defect, the most clinically recognized telomere dysfunction in autosomal dominant pulmonary fibrosis (Armanios, 2012). However, there are few reports about the association between rs2736100 and COPD.
Stanley et al. (2015) reported that the prevalence of deleterious variations in TERT was 1% in their cohort of smokers with severe emphysema/COPD and that germline mutations in telomerase were a genetic risk factor for severe emphysema among smokers. Short telomeres reduce the threshold for cigarette smoke damage in intraepithelial cells, and the damage then causes epithelial senescence. Senescence may be an important factor in triggering alveolar destruction in telomere-mediated emphysema (Stanley et al., 2015).
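As a rough consistency check on the allele-level estimate quoted at the start of this paragraph (OR = 1.73, 95% CI = 1.18-2.52, P = 0.005), the P value implied by an odds ratio and its confidence interval can be recovered on the log scale. The sketch assumes the interval is a Wald interval, log(OR) ± 1.96 × SE, which is the usual output of logistic regression but is not stated in the text. The function name `wald_p_from_ci` is hypothetical, and rounding of the published figures limits the precision.

```python
import math


def wald_p_from_ci(or_value: float, ci_low: float, ci_high: float) -> float:
    """Two-sided Wald P value implied by an odds ratio and its 95% CI."""
    # A Wald interval is symmetric on the log scale: log(OR) +/- 1.96 * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = math.log(or_value) / se
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


p = wald_p_from_ci(1.73, 1.18, 2.52)
print(f"P = {p:.3f}")  # approximately 0.005
```

Under these assumptions the recovered value agrees with the reported P = 0.005 to the precision given.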
Taking these results together, we considered it biologically plausible that the T allele of rs2736100 associated with shorter telomere length was related to an increased risk of non-cancerous pulmonary disease. Recently, it has been suggested that the common etiology of IPF/COPD is a disease category characterized by alveolar senescence and lung "early aging" (for IPF patients, defects in alveolar epithelial precursor cells; and for COPD patients, defects in mesenchymal precursor cells) (Ito and Barnes, 2009;Chilosi et al., 2012;Tuder et al., 2012).
TERC rs1881984 and OBFC1 rs11191865 were not related to the risk of either IPF or COPD in this study. A significant association between rs11191865 and IPF has been shown in non-Hispanics and Mexicans but not in Koreans (Peljto et al., 2015). Ethnic differences may exist in the association between rs11191865 and IPF and may reflect different polymorphism-polymorphism interactions, or different linkages to the polymorphisms determining the IPF risk.
We also evaluated the relationship between the cumulative "at-risk" genotypes of three telomerase-related polymorphisms and the risks of IPF and COPD. The association between IPF and COPD risk and the pertinent combination of multiple "at-risk" genotypes has not previously been explored. Most epidemiological studies on CLDs have examined the main effect of each SNP or SNP-environment interaction and rarely the SNP-SNP combination. Although there were no significant associations in a single locus analysis, increasing numbers of "at-risk" genotypes increased the IPF risk in a dose-dependent manner. In addition, there was no significant association for the combination of the two "at-risk" genotypes of rs2736100 and rs1881984, but the combinations including the "at-risk" genotype of rs11191865 increased the risk of IPF. Like rs2736100, rs11191865 is located in an intron and has been reported to have no effect on OBFC1 expression and structure (Fingerlin et al., 2013). It remains unclear how rs11191865 is associated with the risk of IPF. Our study is the first to examine the combination of the rs2736100, rs1881984, and rs11191865 polymorphisms in relation to COPD/IPF risk.
The advantages of the present study included the fairly large size of the study population and the higher prevalence of the "at-risk" allele of rs2736100 in Japanese than in other ethnic populations; both matter because a sample size with sufficient statistical power is critical to the success of genetic association studies. Therefore, Japanese individuals may be an appropriate population for studying the association between rs2736100, either alone or with other polymorphisms, and IPF.
Several study limitations should also be discussed. The first was the potential for misclassification of the diagnosis. Clinical investigators were careful to exclude other fibrotic lung diseases, but we cannot exclude the possibility that a small number of non-IPF fibrotic lung diseases may have been classified as IPF. Second, case-control studies are particularly open to selection bias in the control group. In our study, the controls were recruited separately and were not matched to the patients. However, matching addresses confounding during the design phase of a study rather than the analysis phase. In this study, the potential confounders (age, sex, and smoking status) were instead adjusted for in the statistical model. Matching is an option that can improve the efficiency of the estimation of exposure effects in situations where confounding factors are substantially different between the cases and controls.
TERT rs2736100 was associated with the risks of both COPD and IPF in a Japanese population. Three SNPs involved in telomere length had a significant cumulative impact on IPF risk. In a future study, we will compare the role of these three telomere-related polymorphisms between COPD or IPF patients who developed lung cancer and those who did not develop lung cancer in our prospective cohort study.