A Geospatial Model to Identify Areas Associated with Late-Stage Breast Cancer: A Spatial Epidemiology Approach

Objective: The aim of this study was to show how a geospatial model can be used to identify areas with a higher probability of late-stage breast cancer (BC) diagnoses. Methods: This was an ecological study. Clinical records at a tertiary care hospital were reviewed to obtain the place of residence and the stage of the disease, classified as early (0-IIA) or late (IIB-IV), for diagnoses made during the 2013-2017 period. The diagnoses were then geolocated to identify their distribution and spatial trend. Subsequently, the location pattern (scattered, random or concentrated) was statistically assessed, and a geospatial model was elaborated to determine the probability of late diagnoses in the state of Jalisco, Mexico. Results: There were 1 954 (N) geolocated BC diagnoses, of which 58.3% were late. During the five-year period, a southwest-northeast trend was identified over nearly 9.5% of the surface of Jalisco, where 6 out of 10 (n= 751) late-stage diagnoses were concentrated. A concentrated and statistically significant pattern was identified in the southern, central and northern Pacific area of Jalisco, where the geospatial model delimited the places with the highest probability of late clinical stages (p <0.05). Conclusion: The geographical differences associated with late BC diagnoses suggest that it is necessary to adapt and focus the strategies for early detection as an alternative to create a major impact on the population. The analysis tools used are reproducible and can be applied in other contexts where geolocation data are available to complement public policies and strategies aimed at controlling BC.


Introduction
Although the breast cancer (BC) incidence has increased exponentially in several countries worldwide (Bray et al., 2018; Feylay et al., 2019), in 2018 it was the most frequent diagnosis among different kinds of cancers in women (24.2%) (Azamjah et al., 2019; Feylay et al., the availability of public policies that guarantee not only organized screening programs (Ghoncheh et al., 2016; Wild et al., 2020) and access to diagnosis in early stages (Ghoncheh et al., 2016; Steponavičienė et al., 2020), but also the availability of human resources trained to provide clinical care and identify BC symptoms, as well as improvements in medical treatments provided once the diagnosis has been made (Unger et al., 2015; Ghoncheh et al., 2016; Steponavičienė et al., 2020).
In the Americas, Mexico has been one of the countries where BC mortality has increased in the last 30 years (Navarrete et al., 2018; Valle et al., 2019; Cárdenas, 2021), with a trend that shows important geographical differences among the 32 states. Such is the case of Jalisco, one of the states with the largest population in the country, where it was recently documented that the mortality rate of BC has increased by 17% (Ramos et al., 2020), which is considerable, since this indicator is associated with diagnoses in late clinical stages (Anaya et al., 2014; Tatalovich et al., 2015; Navarrete et al., 2018; Cárdenas, 2021). Some studies carried out in Mexico have shown that since 2003 (Montemayor, 2014) 5 out of 10 women are generally diagnosed between stages IIB and IV (Montemayor, 2014; Wild et al., 2020; Cárdenas, 2021). Therefore, the increased mortality observed in Mexico has been attributed to clinically late diagnoses (Anaya et al., 2014; Navarrete et al., 2018; Cárdenas, 2021). In this sense, there are two reasons that explain why it is important to conduct research when implementing secondary prevention strategies, i.e. population screening and early diagnosis. First, secondary prevention represents the main strategy to control BC according to the world cancer report (Wild et al., 2020), because it aims to identify the disease as early as possible and to slow or stop its progression (Wild et al., 2020). Second, most of the risk factors (biological, environmental and reproductive history) are not easily modifiable (Bray et al., 2018). Therefore, primary prevention actions have a limited impact on reducing both the incidence and mortality of BC.
On the other hand, spatial epidemiology refers to the study of geographically referenced health events using analytical tools incorporated into Geographic Information Systems (GIS) (Ghoncheh et al., 2016; Kirby et al., 2017). It has the potential to contribute to and reinforce, at the population level, the actions, programs, and public policies related to the control of BC in different aspects. This has been demonstrated and supported by publications related to chronic-degenerative diseases, such as diabetes mellitus (Nurjannah and Baker, 2020; Cuadros et al., 2021) and cancer (Auchincloss et al., 2012; Roquette et al., 2017), and to communicable diseases, such as dengue and SARS-CoV-2 (Canal et al., 2017; Zapata et al., 2020).
In this sense, we consider that the spatial epidemiology approach emphasizes the place where health events occur, hence contributing to identify those places (census units, neighborhoods, municipalities, regions and areas) where women who belong to the group with the highest risk of BC diagnosis reside (≥40 years). Furthermore, it proposes that these places must go through organized screening programs to increase coverage. In Mexico, this indicator has reported a value of less than 30%. Moreover, another contribution is the identification of places associated with a higher probability of detections that are suspicious or highly suggestive of malignancy, as established by the Breast Imaging Report and Database System through mammography studies (BI-RADS 4 and 5, respectively), or, alternatively, places with a greater occurrence of late-stage diagnoses. Therefore, it is necessary to guarantee that secondary prevention has a greater population impact by directing it to those places where it is most needed. Thus, an epidemiological study was carried out in the federal state of Jalisco, Mexico, in order to show the use of a geospatial model to identify areas with a higher probability of late-stage breast cancer (BC) diagnoses.

Materials and Methods
The federal state of Jalisco is located in western Mexico. In 2020, it was classified with a low marginalization level, ranking number 28 out of a total of 32 states (National Population Council, 2021). Jalisco is divided into 13 health regions and 125 municipalities, with a population of 8,348,151 inhabitants (2020); 50.9% (n= 4,249,696) were women, of whom 1,509,737 (35.5%) were 40 years or older. In contrast, the Guadalajara Metropolitan Area (GMA) only has 6 municipalities (Figure 1), characterized by higher population growth, level of urbanization, and economic development (Venegas et al., 2021).
Figure 1. Study area in the context of the federal state of Jalisco (a), Mexico (b). The location and territorial extension of the place where the research was conducted is shown, that is, the Metropolitan Area of Guadalajara (zoom in), in the federal state of Jalisco, which is located in the western part of Mexico. Prepared by authors based on fieldwork, January 2022.
This ecological study was conducted at a hospital specialized in cancer care and located in the city of Guadalajara (GMA), Jalisco. The study was assessed and approved by the Ethics and Research Committee. Data were obtained by reviewing clinical records to collect the home address of those women diagnosed with BC during the five-year period 2013-2017, as well as the clinical stage reported by the histopathological study. The latter was also used to classify diagnoses as early stage (categories 0, I, IIA), and late stage (categories IIB, IIIA, IIIB, IIIC, and IV). When the tumor is less than 2 cm (stage IIA), there is a better prognosis in therapeutic and survival terms (Fuentes, 2014). The records that reported neither histopathological study nor home address in the state of Jalisco were excluded. It should be noted that during data collection, geolocation and analysis, only the number of each record was used, in order to guarantee the confidentiality and anonymity of the patients and to avoid nominal identification.
The data obtained from the clinical records were organized in an Excel file, which was then imported into the Google My Maps platform to geolocate the BC diagnoses using the residence address. As a result, latitude and longitude coordinates were obtained to generate a point map (Roquette et al., 2017), and spatial statistics were used to measure the geographic distribution and the central and directional tendency of BC diagnoses according to clinical stage. To this end, the standard deviational ellipse tool (Wang et al., 2015) was used in the QGIS™ software, version 3.20 (Creative Commons Corporation, Mountain View, California, United States). This tool assesses the distribution of a given data set to identify its orientation (spatial trend) and areas of concentration in a study area (Wang et al., 2015); in our study, it was useful to visualize where and how late-stage diagnoses were distributed in the context of the state of Jalisco.
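As an illustration of the directional analysis described above, the following minimal sketch computes a standard deviational ellipse from the covariance of a set of point coordinates. It is not the QGIS implementation used in the study: the function name `standard_deviational_ellipse` and the sample points are hypothetical, coordinates are assumed to be already projected to a planar system (for example, meters), and no scaling or degrees-of-freedom correction is applied, although some GIS tools do apply one.

```python
import math


def standard_deviational_ellipse(points):
    """Return (center, semi_major, semi_minor, angle_deg) for planar points.

    points: list of (x, y) tuples in a projected coordinate system.
    angle_deg: orientation of the major axis, measured counterclockwise
    from the +x (east) axis, in the range [0, 180).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Population variances and covariance of the coordinates (divide by n).
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n

    # Closed-form eigen-decomposition of the 2x2 covariance matrix:
    # the eigenvalues are the variances along the ellipse axes.
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    semi_major = math.sqrt(half_trace + spread)
    semi_minor = math.sqrt(max(half_trace - spread, 0.0))

    # Direction of the largest variance (the spatial trend).
    angle_deg = math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy)) % 180.0
    return (mean_x, mean_y), semi_major, semi_minor, angle_deg


# Hypothetical points lying along a southwest-northeast diagonal.
demo_points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
center, semi_major, semi_minor, angle_deg = standard_deviational_ellipse(demo_points)
print(center, semi_major, semi_minor, angle_deg)
```

An orientation close to 45 degrees in this convention corresponds to a southwest-northeast trend, while the center and the two semi-axes describe where the points concentrate and how far they spread along and across that trend.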
Additionally, hot spot analysis (Fritz et al., 2013) was used to statistically assess whether or not the concentrated and scattered locations were the result of chance. This analysis considered the geolocation of late-stage diagnoses. As a result, z values, which refer to the spatial pattern (concentrated, scattered or random), and p values, which indicate the probability associated with the spatial pattern, were obtained. A concentrated pattern was considered statistically significant when p <0.05. Finally, the Inverse Distance Weighting (IDW) interpolation method (Auchincloss et al., 2012) was used to group the hot spots whose pattern was concentrated and statistically significant, in order to elaborate a geospatial model that shows, through areas, the probability of late-stage BC diagnoses, as well as its variations across the 13 health regions and 125 municipalities of Jalisco. This method helped to improve the estimation and visualization of the results of the hot spot analysis (Roquette et al., 2017).
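The two steps described above (keeping statistically significant hot spots, then interpolating between them) can be sketched as follows. This is an illustrative outline under stated assumptions, not the procedure run in QGIS: the function names, the sample z values and the power parameter of 2 are hypothetical, the z values are assumed to follow a standard normal distribution under spatial randomness, and positive z values are assumed to denote concentration.

```python
import math


def two_sided_p_value(z):
    """Two-sided p value for a z score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def significant_hot_spots(spots, alpha=0.05):
    """Keep (x, y, z) records with a positive z value and p below alpha."""
    return [s for s in spots if s[2] > 0 and two_sided_p_value(s[2]) < alpha]


def idw_interpolate(samples, target, power=2.0):
    """Inverse Distance Weighting estimate at target from (x, y, value) samples."""
    numerator = 0.0
    denominator = 0.0
    for x, y, value in samples:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            # The target coincides with a sample, so return its value directly.
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical hot spot results: (x, y, z value) in projected coordinates.
spots = [(0.0, 0.0, 2.5), (2.0, 0.0, 3.1), (5.0, 5.0, 0.4), (1.0, 4.0, -2.2)]
hot = significant_hot_spots(spots)

# Estimate the surface on a small hypothetical grid covering the study area.
grid = [(gx, gy) for gx in range(0, 3) for gy in range(0, 3)]
surface = {cell: idw_interpolate(hot, cell) for cell in grid}
print(len(hot), surface[(1, 0)])
```

Closer hot spots dominate each estimate because weights fall off with distance raised to the chosen power; larger powers yield a more localized surface, and smaller powers a smoother one.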

Results
A total of 1 954 (N) clinical records related to BC diagnoses were reviewed and geolocated in the context of the municipalities and health regions of the state of Jalisco, Mexico. Thus, 6 out of 10 cases (n= 1 220) were identified in municipalities geographically close to the GMA, that is, Zapopan, Guadalajara, Ixtlahuacán del Río, Zapotlanejo, Tonalá, El Salto, San Pedro Tlaquepaque and Tlajomulco de Zúñiga, which are part of the 4 health
The hot spot analysis identified a concentrated spatial pattern (p <0.05) that involved 68 late diagnoses (Figure 3) distributed in 18 (14.4%) municipalities, mainly in the central and western areas of Jalisco, which are characterized by a Human Development Index (28) of the highest quintile (0.77 to 0.82), while the level of marginalization reported in 2020 was medium, low and very low (National Population Council, 2021). It should be noted that the concentration pattern was in line with the spatial trend observed in the study period. According to the results of the hot spot analysis, the geospatial model allowed us to visualize the places associated with a greater probability of late-stage diagnoses (p <0.05), whose zoning extended not only over the GMA, where the hospital that made the diagnoses during 2013-2017 is located, but also over more distant geographical areas that involved 8 health regions of Jalisco (Figure 4).

Discussion
The geospatial model we used showed the places associated with a higher probability of late-stage BC diagnoses, which was not a result of chance. It was also in line with the directional trend observed in the study period. We believe that this evidence, underpinned by geographical differences regarding late diagnosis, suggests the need to prioritize and adapt secondary prevention strategies in those places where they are very likely to be effective. Hence, screening programs and early BC diagnosis must be targeted as an alternative to increase their impact on the population, not only in the context of Jalisco, but also in those places where the late clinical stage continues to be a public health concern and where geolocation data are available to analyze and delimit areas of greatest risk.
One of the most important contributions of mammography screening programs is that they reduce BC mortality (Tatalovich et al., 2015; Steponavičienė et al., 2020), since they allow detection and, consequently, diagnosis in the early stages of the disease. This means that their impact depends to a large extent both on the identification of population groups susceptible to screening (women between 40 and 69 years old) and on the places where such groups live, in order to be certain about where to aim the screening programs. In this sense, using analytical tools and spatial statistical methods, accessible via GIS, gains relevance because they consider the geographic dimension in which health events occur (Ghoncheh et al., 2016; Loyola et al., 2002), so they have the potential to complement preventive intervention planning. Furthermore, geospatial analysis tools can be used with various widely relevant epidemiological indicators, such as prevalence, incidence, or mortality (aggregated data), regardless of the territorial scale and source; or with the place of residence associated with the cases (individual data), to identify areas of greater magnitude or probability for different health events, similar to the methodology and results of our study. Based on this, evidence characterized by its territorial precision is obtained, which makes it possible not only to improve decision-making at local levels, but also to geographically allocate and distribute the resources available to respond with effective and precise prevention measures, damage control and health promotion where necessary (Momenimovahed and Salehiniya, 2019).
On the other hand, for several years, research has been published that generates evidence supporting the prioritization and targeting of preventive interventions for communicable (Canal et al., 2017; Momenimovahed and Salehiniya, 2019) and non-communicable diseases (Auchincloss et al., 2012; Roquette et al., 2017; Nurjannah and Baker, 2020; Cuadros et al., 2021; Saleem et al., 2021), identifies and monitors health inequalities at different levels of territorial aggregation (Ghoncheh et al., 2016; Zapata et al., 2020), and proposes geospatial models (Zapata et al., 2020) to anticipate risk scenarios for the population. Regarding BC in women, research has helped to identify places or regions with higher incidence of late-stage BC, which could benefit from targeted (Roche et al., 2022; Tatalovich et al., 2015) and/or geographically adapted strategies (Wang et al., 2015). In some countries, such as Germany or the USA, mammography screening programs have been shown to reduce the incidence of cases in advanced stages (Roche et al., 2002; Tatalovich et al., 2015; Simbrich et al., 2016), because they were directed particularly at the population and places of residence associated with a higher probability of occurrence.
Therefore, we consider that the results, in terms of geographic distribution, spatial patterns and model of probability of occurrence of late diagnoses, delimited in the context of Jalisco, Mexico, the places where the actions aimed at early detection and timely diagnosis of BC in women are more likely to be effective.
This evidence is especially important because in Mexico the incidence may continue to increase due to population aging, as observed in those countries undergoing a demographic transition. However, mortality should not follow the same trend, particularly mortality attributed to late-stage diagnoses, given that some public health interventions and strategies, such as population screening or early diagnosis, could reduce this indicator and be complemented with the use of analytical tools incorporated into GIS to improve the expected results. One of the main factors that explain mortality due to BC is late-stage diagnosis (Anaya et al., 2014; Navarrete and Navarrete, 2018; Cárdenas, 2021), which in Mexico has been on the rise during the last 30 years (Navarrete and Navarrete, 2018; Valle et al., 2019; Cárdenas, 2021). It has been reported that 66.4% of cases are diagnosed in IIB stage or higher (Montemayor, 2014; Wild et al., 2020). However, this situation differs from countries such as England or Australia (Ghoncheh et al., 2016), where most cases are diagnosed before said stage, as public policies and secondary prevention actions are aimed at increasing the number of diagnoses in the early stages, improving the chances of survival for women, along with the fact that the main risk factors for BC can hardly be changed (Bray et al., 2018). Therefore, the methodological strategy and results of our study, considering the spatial epidemiology approach, are useful to propose that secondary prevention should be adapted geographically according to the places associated with a higher probability of late diagnoses, and not only territorially. Since treating BC can be adapted to patients (personalized medicine), be specific (precision medicine) or be adapted to the needs of each patient, it is logical to propose that the strategies and actions effective to control BC may have a greater impact if they are adapted and directed to specific contexts. For instance, those places where women ≥40 years of age and eligible for screening live, characterized by low mammography coverage, or places with an increasing trend of late-stage diagnoses: "by focusing the most effective health interventions on these areas, most of the problems will be solved" (Momenimovahed and Salehiniya, 2019, p. 425).
On the other hand, the limitations of this study are those that characterize epidemiological designs whose data source is secondary. However, in this study a methodological strategy was used to reduce possible biases or errors, particularly when collecting data (files) and performing the geolocation process based on the place of residence of the patients, for which it was necessary to assess reliability (data not shown) before geospatial analysis. Additionally, we recognize that this study is not population-based, that is, data from the private sector or other institutions that are part of the National Health System were not included. Also, this study relied on recorded data related to BC diagnoses (age, BI-RADS, clinical stage), which is simply due to the lack of a population-based registry in Mexico, unlike other countries such as the USA (Roquette et al., 2017).
However, among the strengths we highlight that this study used data from a secondary source (clinical records), whose content is not only important in medical practice, but also reliable, since it is related to the process of detection, diagnosis, and medical treatment of patients. Among these data, the clinical stage of BC reported by the histopathological study was used, a variable that in Mexico has not yet been analyzed through population-based studies, even less in geographical terms and on a small territorial scale that favors implementing interventions in areas with a higher than expected risk. Therefore, the main strength was to recognize the distribution, patterns and geographical differences according to the home address of patients in order to reevaluate the probability of late BC diagnosis. In this sense, the geospatial analysis presented in this study used the most recommended spatial epidemiology techniques to identify structures, trends, and patterns underlying a given data set (Auchincloss et al., 2012; Fritz et al., 2013; Wang et al., 2015). Furthermore, the methodology of this study is reproducible and, as more georeferenced data become available, it could be used not only in a different or more recent study period, but also at another territorial scale (census unit, neighborhood, municipality, region, state) or context, as well as with other important clinical variables, such as BI-RADS categories, histological grade and variety, and hormone receptors, to name a few. It is even possible to use this methodological strategy with other data sources, such as a hospital or population registry, or with other types of cancer whose incidence and mortality need further research to improve decision-making.
In conclusion, the geospatial model showed significant differences in terms of the probability of late-stage BC diagnoses and allowed us to visualize the places where this occurred. Based on this, we consider that detection and timely diagnosis strategies, besides focusing on specific risk groups, should also be geographically adapted to the places where such groups live and where these strategies are most required, as an alternative to increase their impact. This represents a proposal from a population perspective, based on the geospatial model, to complement the public policies and strategies available in Mexico that are implemented on a daily basis by the different institutions of the National Health System, as well as in those contexts where georeferenced data are available and mortality attributable to late-stage diagnoses continues to be a priority public health concern among women.

committees of the hospital under the registry number PRO-12/16. It was also registered and approved by the research and ethics committees of the University Center for Health Sciences of the Universidad de Guadalajara in 2017 under the registry number CI-03920.

Availability of data
The data that support the findings of this study are available from the corresponding author, upon reasonable request.