Gene-Environment Interaction between Arg72Pro SNP and Selected Environmental Exposures among Brazilian Women Diagnosed with Benign Breast Disease

Background: Benign breast disease (BBD) is strongly associated with breast cancer worldwide. The association of the Arg72Pro SNP with breast cancer is controversial, since environmental factors may be required to modulate such risk. No studies have evaluated the interactions between this SNP and environmental factors in BBD. Aim: To determine the frequency of the Arg72Pro SNP in a cohort of women diagnosed with BBD, and to investigate its interactions with environmental factors. Results: The genotype frequency was 44.6% for Arg/Pro, 39.3% for Arg/Arg, and 16.3% for Pro/Pro. Gene-environment interaction analysis showed that, with Arg/Arg as the reference, the Arg/Pro genotype interacted with exposure to fabric (OR=1.90; 95%CI: 1.04, 3.48), solvents (OR=2.21; 95%CI: 1.01, 4.83), and chlorine, bleaches, disinfectants, and liquid wax (OR=2.52; 95%CI: 1.07, 5.91). With the Pro/Pro genotype as the reference, an interaction was observed between alcohol consumption and the recessive model (OR=1.58; 95%CI: 1.00, 2.51). A gene-environment interaction was also observed between exposure to hair dyes, straighteners, or relaxers and Arg/Arg (OR=3.26; 95%CI: 1.21, 8.82). Conclusion: The Arg/Pro genotype was the most frequent in the BBD cohort. Compared with the Arg/Arg genotype, the Arg/Pro genotype combined with exposure to solvents, fabric, and cleaning products increased the risk of BBD. Compared with the Pro/Pro genotype, the recessive model interacted with alcohol consumption and with exposure to hair products on the risk of BBD.


Introduction
Benign breast disease (BBD) is a public health problem, since it is an important risk factor for breast cancer, the most incident neoplasm among women worldwide (Bray et al., 2018). BBD encompasses a heterogeneous group of lesions, which are histologically classified according to breast cancer risk (Page and Dupont, [...] proteins may modulate the risk of cancer development (Dumont et al., 2003; Zhuo et al., 2009).
As far as we know, no study has estimated the frequencies of TP53 SNPs in women diagnosed with BBD. However, according to the literature, the frequency of TP53 polymorphisms in breast cancer ranges from 30% to 50%, with the Arg72Pro SNP being the most common in this disease (Done et al., 2001; Tsuda, 2009). Moreover, it has been shown that the allele frequencies of the Arg72Pro SNP vary among healthy and sick women, depending on the country and continent. These differences can be explained by the fact that the studies were conducted in different ethnic populations, which may affect both allele frequencies and environmental exposures (Själander et al., 1996). Case-control studies conducted in ethnically admixed populations, such as that of Brazil, found that, compared with women with the Arg/Arg genotype, those with at least one Pro allele had breast cancer odds ratios ranging from 0.99 (95%CI: 0.59-1.65) in the Southeast region (Mayorano, 2008) to 2.90 (95%CI: 1.43-3.60) in the Southern region (Damin et al., 2006). Besides the considerable ethnic variation in Brazil, environmental factors also vary throughout the country, reflecting cultural differences among Brazilian regions that influence diet, smoking and alcohol habits, lifestyle, and environmental exposures such as those to benzene and heavy metals (Kresovich et al., 2019; Ma et al., 2019). Furthermore, the biological mechanisms by which this gene-environment relationship modulates the risk of breast cancer are still under investigation.
Despite that, no study has so far evaluated the frequency of the Arg72Pro SNP in women with BBD, or the gene-environment interactions between the Arg72Pro SNP and environmental exposures among these women. Thus, the aim of our study is to determine the frequency of the Arg72Pro SNP in a cohort of Brazilian women diagnosed with BBD, and to investigate gene-environment interactions with selected environmental factors regarding the risk of benign breast disease.

Materials and Methods
A cross-sectional study was carried out in cohorts of women diagnosed with BBD referred to the outpatient clinics of Fernandes Figueira National Institute (FFI/Fiocruz) and the Federal Hospital of Lagoa (FHL), which are BBD reference units in the city of Rio de Janeiro, Brazil. Baseline cohorts comprise women with altered breast tests (mammography, ultrasonography, and fine-needle aspiration, FNA) referred to FFI and FHL from July 2013 to July 2018. Histopathological confirmation was obtained by core biopsy, excisional biopsy, and/or surgery. The Ethics Committee Boards of both FFI/Fiocruz and the National School of Public Health Sérgio Arouca/FIOCRUZ approved the protocol of this study. All included women formally agreed to participate by signing the informed consent form.
Eligibility criteria included women aged over 17 years at interview, with histopathologically confirmed BBD. Women with a history of breast cancer and/or previous BBD, and those diagnosed with cognitive conditions limiting their understanding of the informed consent form, were excluded. Of the 373 women initially included, 14 (3.7%) refused to participate in the study, whereas 359 (96.2%) signed the informed consent form. Among these, 327 (91.1%) had a complete questionnaire and blood sample available; however, 2 of them (0.6%) showed nonspecific bands in the amplification step and were not included in the analysis. Thus, analyses included 325 women (90.5%) who had a complete questionnaire and DNA genotyping (Figure 1).
An interview-based questionnaire was administered to collect data on sociodemographic characteristics, clinical aspects, and selected environmental exposures (smoking habit, alcohol consumption, and domestic and occupational chemical exposures). This instrument was adapted from validated scales (IARC Working Group on the Evaluation of Carcinogenic Risks to Humans, 2004, 2010) and was applied by three trained nurses. Clinical evaluation included weight and height measurements, using a Filizola ergometer scale, regularly calibrated according to Inmetro (1994) criteria. Waist and hip measurements were taken with an inelastic tape measure. Information on clinical conditions and histopathological outcomes was obtained from physical and electronic medical reports. Two 4-mL tubes of peripheral blood were collected, stored at 4°C, and processed at the Laboratory of Molecular Epidemiology of Cancer (ENSP/FIOCRUZ).
Genomic DNA was extracted from leukocytes using the salting-out technique and then diluted in deionized water (Miller et al., 1988). DNA quality was evaluated by spectrophotometry (Nanodrop®), and 0.1-10 μl of DNA was subsequently used for TP53 amplification by polymerase chain reaction (PCR). The forward and reverse primers used to amplify the 296 bp fragment of the polymorphic region were, respectively, 5'-ATCTACAGTCCCCCTTGCCG-3' and 5'-GCAACTGACCGTGCAAGTCA-3'. Arg72Pro SNP genotyping was performed by the PCR-RFLP method (Kumar and Dunn, 1989).
PCR was performed using approximately 1.0-4.0 μl of genomic DNA, 0.15 U of Taq-DNA Polymerase Platinum enzyme (Invitrogen, São Paulo, Brazil), 10 pmol of each primer, and 5 mM dNTPs in a 25 μl final volume. PCR conditions were 94°C for 5 minutes for initial denaturation, followed by 35 cycles of denaturation at 94°C for 30 seconds, annealing at 68°C for 30 seconds, and extension at 72°C for 40 seconds. The final elongation step was performed at 72°C for 7 minutes. After PCR, a 4-μl aliquot was removed and digested with the BstUI restriction enzyme (New England Biolabs, Beverly, MA) at 60°C for at least 6 hours. Digested DNA was subjected to 3% agarose gel electrophoresis, and gels were photographed under UV transillumination. The wild-type Arg allele was indicated by two bands of 169 bp and 127 bp, whereas the undigested mutant Pro allele appeared as a single band of 296 bp.
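The band patterns described above can be turned into a genotype call with a few lines of code. The following is a minimal sketch, not the procedure used in the study: the function name `call_genotype` and the sizing margin `tolerance` are hypothetical, and only the fragment sizes (296, 169, and 127 bp) come from the text.

```python
# Fragment sizes (bp) reported in the text for the BstUI digest.
UNCUT = 296        # Pro allele: amplicon is not digested
CUT = (169, 127)   # Arg allele: amplicon is cut into two fragments


def call_genotype(bands, tolerance=5):
    """Map observed band sizes (bp) to an Arg72Pro genotype.

    `tolerance` is a hypothetical sizing margin for reading the gel;
    it is an assumption of this sketch, not part of the protocol.
    """
    def seen(size):
        # A band counts as present if any observed band is close enough.
        return any(abs(b - size) <= tolerance for b in bands)

    has_pro = seen(UNCUT)
    has_arg = all(seen(size) for size in CUT)

    if has_arg and has_pro:
        return "Arg/Pro"
    if has_arg:
        return "Arg/Arg"
    if has_pro:
        return "Pro/Pro"
    return "undetermined"


print(call_genotype([169, 127]))       # Arg/Arg
print(call_genotype([296, 169, 127]))  # Arg/Pro
print(call_genotype([296]))            # Pro/Pro
```

Note that an incompletely digested Arg/Arg sample would show the same three bands as a heterozygote, so a rule of this kind cannot replace visual inspection and replicate genotyping.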
Sociodemographic characteristics were compared between the cohorts of women treated at FFI/Fiocruz and FHL using the Pearson Chi-square test, and no statistically significant differences were observed. Among women included in the study, distributions of age, skin color, and BBD were compared between those whose blood samples were lost (N=34) and those who were included in the analysis (N=338), using the Chi-square and Fisher's tests, and no statistically significant differences were observed.
Hardy-Weinberg Equilibrium (HWE) was estimated for the Arg72Pro SNP (rs1042522), according to the reference unit, using the Chi-square test (5% significance level). The statistical program R, version 3.4.3, was used in this analysis. Distributions of the Arg72Pro SNP genotypes were in HWE in both units (Table 1). Distributions of Arg72Pro SNP genotypes were evaluated according to BBD histological type. Differences between frequencies were verified using the Chi-square test, with a 5% significance level.
Interaction odds ratios, and their respective 95% confidence intervals, between the Arg72Pro SNP and selected environmental factors were evaluated considering both the Arg/Arg genotype and the Pro/Pro genotype as reference, using the case-only approach (Piegorsch et al., 1994; Yang and Khoury, 1997). The case-only design has been promoted as an efficient and valid approach to gene-environment interaction screening under the assumption of independence between exposure and genotype in the population (Piegorsch et al., 1994). If one's primary interest is assessing possible interaction between genetic and environmental factors in the etiology of a disease, one may do so without employing control subjects (Yang and Khoury, 1997). According to Yang and Khoury (1997), the odds ratio calculated from a case-only design is related to the odds ratios for exposure alone, for genotype alone, and for their joint effects in the case-control design by the following formula:

$$\mathrm{OR}_{ca} = \frac{\mathrm{OR}_{ge}}{\mathrm{OR}_{e} \times \mathrm{OR}_{g}} \times \mathrm{OR}_{co}$$

where OR_ca is the case-only odds ratio; OR_e, OR_g, and OR_ge are the case-control odds ratios for exposure alone, genotype alone, and their joint effect; and OR_co is the odds ratio among control subjects relating exposure and the susceptibility genotype (Yang and Khoury, 1997). Assuming independence between genotype and exposure in the population, the expected value of OR_co becomes unity, and the odds ratio obtained from a case-only study measures the departure from the multiplicative joint effect of genotype and exposure (Yang and Khoury, 1997). The null hypothesis is OR_ca = 1; OR_ca > 1 if the joint effect is more than multiplicative, and OR_ca < 1 if the joint effect is less than multiplicative (e.g., additive) (Khoury and Flanders, 1996). Confidence intervals of case-only odds ratios can be obtained by using standard crude analyses or logistic models that control for the effects of other covariates (Yang and Khoury, 1997).
The outcome of this approach is the effect of the gene-environment interaction on the risk of BBD. Thus, genetic dominance models were created to estimate the allele effect on interactions with environmental factors. When the Arg/Arg genotype was the reference, the dominant model comprised the Arg/Pro + Pro/Pro genotypes and the recessive model comprised the Pro/Pro genotype. When the Pro/Pro genotype was the reference, the dominant model comprised the Arg/Arg + Arg/Pro genotypes and the recessive model comprised the Arg/Arg genotype. Gene-environment interaction analyses were performed using unconditional logistic regression to estimate crude interaction odds ratios, with 95% confidence intervals (95%CI). Statistical analyses were performed using the Statistical Package for Social Sciences (SPSS) for Windows, version 20.0.

Results
Of the 325 women included in the study, 62.8% were under 50 years old, and 37.7% self-reported being white. Non-proliferative lesion was the most frequent type of lesion (73.5%), followed by proliferative lesion without atypia (19.4%), and atypical hyperplasia (7.1%). The frequency of the polymorphic allele (Pro) was 39.0%, whereas the frequency of the Pro homozygous genotype was 16.7%, of the heterozygous genotype 45.5%, and of the Arg homozygous genotype 37.8% (Table 1). In both reference hospitals, genotype distributions were in Hardy-Weinberg equilibrium (p-value>0.05).
In Table 2 we present the distribution of sociodemographic characteristics, clinical aspects, and hormonal exposures, according to the Arg72Pro SNP. In these analyses we observed that domestic and occupational exposure to hair straighteners or dyes was statistically more frequent among women with the Arg/Arg genotype (96.1%) than among women with other genotypes (p-value=0.039).
When compared with the Pro/Pro genotype, a statistically significant positive interaction was observed between current/past alcohol consumption and the recessive model (OR=1.58; 95%CI: 1.00, 2.51) (Table 4). In addition, a strong interaction was observed between early onset of alcohol consumption (≤20 years old) and the Arg/Arg genotype (OR=2.14; 95%CI: 0.86, 5.30), but without statistical significance. Current or past smoking habit interacted with the recessive model (OR=1.60; 95%CI: 0.98, 2.51), without statistical significance. Among those who have never smoked, a significant interaction was observed between second-hand smoking before 22

Discussion
To the best of our knowledge, this is the first study to investigate the frequencies of the Arg72Pro SNP genotypes in women diagnosed with BBD, as well as gene-environment interactions between this SNP and the selected environmental exposures. Thus, Arg/Pro [...] (Damin et al., 2006; Mayorano, 2008; Aoki et al., 2009; Portela De Melo et al., 2009; Ramalho, 2012; Almeida et al., 2016). Brazilian breast cancer studies showed Arg/Arg genotype frequencies ranging from 44.7% to 55.5% in the South of Brazil (Damin et al., 2006; Aoki et al., 2009; Portela De Melo et al., 2009), whereas in the Southeast and Northeast regions the Arg/Pro genotype was the most frequent, varying from 41.4% to 60% (Mayorano, 2008; Ramalho, 2012; Almeida et al., 2016). Among groups of healthy women, we observed that the Arg/Pro genotype frequency ranged from 39.2% to 58.9%. The highest frequencies were observed in Southern Brazil (46.3% to 58.9%) (Damin et al., 2006; Mayorano, 2008; Aoki et al., 2009; Portela De Melo et al., 2009; Almeida et al., 2016), whereas in the Southeastern region frequencies of the heterozygous genotype ranged from 39.2% to 42.4% (Mayorano, 2008; Almeida et al., 2016). Moreover, the Arg/Arg genotype is the second most frequent among groups of healthy Brazilian women, ranging from 33.3% to 45.0%, whereas the Pro/Pro genotype ranged from 10.3% to 16.1% (Damin et al., 2006; Mayorano, 2008; Aoki et al., 2009; Portela De Melo et al., 2009; Almeida et al., 2016).
Table notes: *, statistically significant difference (p-value <0.05); a, only menopausal women; b, only women who reported alcohol consumption; c, only women who reported smoking habit; d, only women who did not report smoking habit; e, only women who had never smoked and were exposed to second-hand smoking; f, hair dyes, straighteners, or relaxers; g, exposure to chlorine, bleaches, disinfectants, and liquid wax.
Differences between studies can be explained by the ethnic admixture of Brazil, which began with Amerindians being colonized by Portuguese peoples who brought enslaved African peoples, and continued with the long history of migration of Arab, Jewish, and European peoples and, more recently, of Japanese and Chinese peoples (Layton and Smith, 2017; Braganholi et al., 2017). Such differences may be reflected in variations of genotype frequencies observed in different regions of the country. However, the different genotyping methods used among studies may also affect genotype determination. The PCR-RFLP method is a qualitative and error-prone method for determining the Arg/Pro genotype. Determination of the heterozygous genotype depends on restriction enzyme quality, since the enzyme can produce partial digestion over time. Furthermore, the method depends on the observer's accuracy when evaluating the gel image. To validate the classification, genotyping was repeated by PCR-RFLP for 10% of the samples. In addition, the genotype frequencies of the SNP were in HWE in the total sample as well as in each reference unit.
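As an illustration of the HWE check mentioned above, the sketch below computes the one-degree-of-freedom Chi-square statistic from genotype counts. It is not the R code used in the study; the function name `hwe_chi_square` is hypothetical and the genotype counts in the usage lines are invented for illustration only.

```python
import math


def hwe_chi_square(n_arg_arg, n_arg_pro, n_pro_pro):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium for a biallelic SNP.

    Returns the Pro allele frequency, the Chi-square statistic, and the p-value.
    """
    n = n_arg_arg + n_arg_pro + n_pro_pro
    q = (2 * n_pro_pro + n_arg_pro) / (2 * n)  # Pro allele frequency
    p = 1 - q                                  # Arg allele frequency

    observed = (n_arg_arg, n_arg_pro, n_pro_pro)
    expected = (p * p * n, 2 * p * q * n, q * q * n)

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the Chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return q, chi2, p_value


# Illustrative counts only (not the study data).
q, chi2, p_value = hwe_chi_square(36, 48, 16)
print(f"Pro allele frequency = {q:.2f}, chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A p-value above 0.05 is read, as in the text, as no evidence of departure from HWE; a systematic excess of heterozygotes caused by partial digestion would tend to show up as a departure in this test.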
Like breast cancer, BBD is a multifactorial disease, which suggests that a single host polymorphism might be insufficient to produce the disease phenotype and that environmental factors must interact with gene polymorphisms/mutations to affect the risk of disease (Ambrosone, 2007). Thus, cultural differences would play a role both in the lifestyle habits characteristic of each population and in the frequencies of environmental exposures. However, the biological mechanisms by which such gene-environment interactions modulate the risk of BBD development are still unclear (Gray et al., 2017; Rodgers et al., 2018). Nevertheless, it is already known that environmental factors associated with breast cancer include endogenous and exogenous exposures to estrogen and progesterone, to tobacco and alcohol, and to specific chemical agents such as petroleum products, solvents, and endocrine disruptors (EDCs) found in cleaning products (Gray et al., 2017; Rodgers et al., 2018). In the present study, we observed a strong association between the Arg allele and alcohol consumption, onset of alcohol consumption by 20 years of age, smoking habit, smoking onset by 18 years of age, passive smoking onset before 22 years of age, and domestic/occupational exposure to hair dyes and straighteners. On the other hand, the presence of at least one Pro allele was strongly associated with exposures to fabric, solvents, and cleaning products such as chlorine, bleaches, disinfectants, and liquid wax.
These findings could be partially explained by the fact that the p53 protein encoded by the Arg allele is more efficient at inducing apoptosis than at DNA repair (Dumont et al., 2003; Pim and Banks, 2004). Therefore, exposures to alcohol, tobacco, and second-hand smoke could interact with this allele and increase BBD risk. In addition, evidence suggests that alcohol and tobacco consumption induce DNA damage, and such consumption has been associated with increased risk of BBD and breast cancer (Pflaum et al., 2016; Ma et al., 2019). Moreover, previous studies suggested that alcohol could act in the carcinogenesis process by two pathways: first, by ER and PR hyperstimulation, which affects tissue sensitivity to estrogen, leads to the development of ER+ breast tumors, and increases circulating estradiol levels; second, via DNA damage associated with increased oxidative stress (Zhao et al., 2017). Furthermore, early age at onset of alcohol consumption may reflect both a greater opportunity for prolonged exposure and exposure during a critical period of breast development, in which susceptibility to BBD is greater (Byrne et al., 2002; Liu et al., 2012). Several lines of evidence indicate that smoking acts from initiation to neoplastic progression, mainly in cells of epithelial origin (IARC Working Group on the Evaluation of Carcinogenic Risks to Humans, 2004). However, regarding breast cancer, the literature is still developing (IARC Working Group on the Evaluation of Carcinogenic Risks to Humans, 2004; IARC Working Group, 2007).
Although the specific mechanisms involved in the association between smoking and TP53 gene mutations are still unclear, the hypothesis of added genotoxicity from smoking seems plausible, since cigarettes contain about 20 carcinogens recognized by IARC, among them aromatic hydrocarbons, nitrosamines, aliphatic compounds, arylamines, and nitroarenes (IARC Working Group on the Evaluation of Carcinogenic Risks to Humans, 2004). These carcinogens bind to DNA, forming adducts (Ma et al., 2019). Our results corroborate other studies whose authors reported that passive smoking was also associated with increased risk of BBD (Liu et al., 2000). Researchers have strongly suggested that breast tissue is a target for the carcinogenic effects of cigarette smoke (Conway et al., 2002), because smoke is also inhaled and absorbed by passive smokers, which would lead to DNA damage just as in active smokers (Johnson et al., 2011; Li et al., 2015).
In contrast, the p53 protein encoded by the Pro allele has been described as more efficient at DNA repair (Siddique et al., 2005; Zhuo et al., 2009). In our study, the Pro allele showed statistically significant interactions with EDCs, which are present in a wide range of products, including household insecticides, pesticides, detergents, cleaning products, solvents, hair products, and plastics. EDCs are also present in occupational exposure to fabrics among textile mill workers or seamstresses, since they are used in textile manufacturing in compounds such as textile dyes, printing agents, fungicides, flame retardants, solvents, plastics, and moth repellents (IARC Working Group, 1990). These compounds are capable of deregulating the expression of ER, PR, or HER2, in addition to the expression of the p53 protein. Thus, one hypothesis to explain this finding is that such changes may be related to BBD development, as already observed in breast cancer (Gray et al., 2017).
Thus, based on our results, we suggest that a deficiency in triggering apoptosis in women with the Arg/Pro genotype or at least one Pro allele, coupled with the hormone receptor hyperstimulation promoted by exposure to solvents and cleaning products, could modulate the risk of BBD. Moreover, domestic or occupational exposure to hair straighteners, relaxers, or dyes interacted with the presence of at least one Arg allele (dominant model), when compared with the Pro/Pro genotype. This finding could be explained by the fact that hair products are also considered EDCs (Gray et al., 2017; McDonald et al., 2018). EDCs have been associated with the development of DNA adducts, which, in the presence of the Arg allele, may meet a p53 protein that has difficulty achieving DNA repair (Thomas et al., 1999; Dumont et al., 2003; Pim and Banks, 2004).
Our study is the first to describe the Arg72Pro SNP distribution among women with benign breast disease, and the first to analyze gene-environment interactions between this SNP and the selected environmental exposures in BBD. For this analysis, we used the case-only approach, which is an efficient and valid approach to gene-environment interaction screening under the assumption of independence between exposure and genotype in the population (Dai et al., 2018). Other advantages of this approach are the reduction of selection and recall bias, which are more likely to occur in case-control studies, and its greater efficiency and lower cost compared with a case-control study. Moreover, this approach is well suited to initial investigations of gene-environment interactions (Dai et al., 2018). Finally, another strength of this study is the use of samples from two reference hospitals in Rio de Janeiro, with the largest sample size among Brazilian BBD studies.
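To make the case-only estimate concrete, the sketch below computes a crude interaction odds ratio as the exposure-by-genotype cross-product ratio among cases, with a Woolf (log-based) 95% confidence interval. This is an illustrative assumption rather than the study's SPSS analysis, although a crude logistic regression with one binary predictor yields the same point estimate and Wald interval. The function name `case_only_or` and the counts in the usage lines are hypothetical.

```python
import math


def case_only_or(a, b, c, d, z=1.96):
    """Crude case-only interaction odds ratio with a Woolf confidence interval.

    All four counts are cases only:
      a: exposed, carrying the risk genotype
      b: exposed, reference genotype
      c: unexposed, carrying the risk genotype
      d: unexposed, reference genotype
    """
    or_ca = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_ca) - z * se)
    upper = math.exp(math.log(or_ca) + z * se)
    return or_ca, lower, upper


# Illustrative counts only (not the study data).
or_ca, lower, upper = case_only_or(20, 10, 10, 20)
print(f"OR_ca = {or_ca:.2f} (95%CI: {lower:.2f}, {upper:.2f})")
```

The estimate is interpretable as an interaction only if genotype and exposure are independent in the source population; an interval that excludes 1 then points to a departure from a multiplicative joint effect.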
However, this study has limitations that must be addressed. First, the prevalence of hormonal replacement therapy was low, probably because the study population is on average very young and therefore includes few menopausal women. Another possible limitation is the lack of statistical significance for strong gene-environment interactions between the Arg72Pro SNP and important environmental factors, such as smoking habit, early age at onset of smoking and alcohol use, early age at second-hand smoking, and exposure to gasoline. This may have occurred due to the small sample size. In addition, a limitation of the case-only approach is that many biologically plausible modes of gene-environment interaction involve a departure from multiplicative effects, and in the case of an additive joint effect the interaction OR derived from a case-only design can be questionable (Gauderman et al., 2019). Thus, future studies, with different study designs and larger sample sizes, are required to test the hypotheses raised by this investigation.