MLH1 rs1800734 Pathogenic Variant among Patients with Colorectal Cancer in the Lower Northeastern Region of Thailand

Background: The -93G > A (rs1800734) polymorphism within the core promoter region of the MLH1 gene is associated with MLH1 CpG island hypermethylation. This polymorphism has recently been proposed as a low-penetrance variant for colorectal cancer. Many published studies have evaluated the association between the MLH1 -93G > A polymorphism and colorectal cancer risk, but the results remain conflicting rather than conclusive. The aim of this study was to assess the association between the MLH1 -93G > A polymorphism and the risk of colorectal cancer in patients in the lower northeastern region of Thailand. Methods: One hundred fifty-one samples from colorectal cancer patients and 100 samples from a healthy control group were analyzed. Genomic DNA was extracted from the white blood cells of all samples. Real-time polymerase chain reaction (qPCR) was used to genotype the MLH1 rs1800734 polymorphism. Results: The frequency of the MLH1 rs1800734 variant was higher in patients with colorectal cancer than in the healthy control group. The AA genotype was associated with an increased risk of colorectal cancer (p < 0.05). Carriers of the AA genotype had a 1.36-fold higher risk of colorectal cancer, and alcohol consumption was linked to the likelihood of developing colorectal cancer and to tumor grade. Conclusion: The MLH1 rs1800734 AA genotype was associated with colorectal cancer risk in the lower northeastern region of Thailand.


Introduction
Colorectal cancer (CRC) is one of the most prevalent and incident cancers worldwide, as well as a significant cause of mortality. It is a heterogeneous disease associated with a number of genetic or somatic mutations. Diagnostic markers are used for risk stratification and early detection, which might prolong overall survival. Both genetic and environmental factors play an important part in the etiology of colorectal cancer. Several genome-wide association studies have identified polymorphisms associated with colorectal cancer risk (Ilyas et al., 1999). The majority of colorectal cancers are sporadic; approximately three-quarters of patients have a negative family history. Impaired mismatch repair during replication gives rise to an accumulation of DNA mutations (Jasperson et al., 2010). Lifestyle factors also influence the risk of developing colorectal cancer: the risk is increased by smoking, alcohol intake and increased body weight. The loss of genomic and epigenomic stability accelerates the accumulation of mutations and epigenetic alterations in tumor suppressor genes and oncogenes, which drive the malignant transformation of colon cells through rounds of clonal expansion that select for the cells with the most aggressive and malignant behavior (Grady and Carethers, 2008).
Single nucleotide variations in genomic sequences are called single nucleotide polymorphisms (SNPs). SNPs are third-generation molecular markers and among the most common genetic variations in humans. A SNP is a single base change in a DNA sequence, usually with two possible alternative nucleotides at a given position. SNPs may fall within coding regions, non-coding regions or intergenic regions. Their diverse roles in disease pathogenesis have been reported through both experimental and computational methods. SNPs in gene coding regions can change the biological properties of the encoded protein, whereas SNPs in non-coding regulatory regions may affect gene expression levels in an allele-specific manner. These functional SNPs represent an important but relatively unexplored class of genetic variation. SNP markers are a potential tool for improving cancer diagnosis, treatment planning and drug development (Goode et al., 2002).
Several DNA repair pathways are essential for genome integrity, including base excision repair (BER), nucleotide excision repair (NER), DNA double-strand break (DSB) repair and the DNA mismatch repair (MMR) system. A highly conserved set of MMR proteins has long been known to be primarily responsible for repairing bases mismatched during DNA replication. The MMR system is a high-fidelity DNA repair system. It mainly repairs base mismatches and insertion/deletion loops produced during DNA synthesis to maintain the stability of the whole genome (Peltomaki, 2003). Studies have found that SNPs in specific DNA repair genes can affect the expression level and activity of repair enzymes and an individual's DNA damage repair efficiency (Tomlinson et al., 2007). Repair gene defects may lead to genetic instability and cancer, suggesting that individual differences in cancer risk are related to polymorphisms of specific repair genes (Goode et al., 2002).
The MLH1 gene encodes a key component of the MMR system, which participates in the recognition of nucleotide mismatches occurring during DNA replication and in the recruitment of additional mismatch repair proteins to the site to correct the replication error. The gene has a length of 57,375 bp with 19 exons, encodes a protein of 756 amino acids and is located on the short arm of chromosome 3 at position 22.2 (3p22.2). The rs1800734 SNP is located 93 base pairs upstream of the MLH1 transcription start site (MLH1 -93G > A promoter polymorphism), within the core promoter region of MLH1. The variant allele of rs1800734 decreases MLH1 promoter CpG island-mediated transcriptional activity, which provides insight into its potential role as a functional SNP (Herman et al., 1998). MLH1 promoter hypermethylation caused by MLH1 rs1800734 is an important event that silences MLH1 gene expression, preventing the formation of MLH1 protein and the normal activation of DNA repair (Niv, 2007). This frequent polymorphism has been associated with susceptibility to various human malignancies, including tobacco-related oral carcinoma, colorectal cancer, papillary thyroid carcinoma (PTC) and secondary tumors arising after Hodgkin lymphoma (Allan et al., 2008). The same MLH1 SNP has been associated with ovarian cancer in the Chinese population (Niu et al., 2015) and with Lynch syndrome in the Malaysian population (Zahary et al., 2012). However, SNPs may affect gene function differently in different human populations, so the effects of MLH1 rs1800734 still need further verification across race, nation, region and individual differences. There has been no report on the risk associated with the MLH1 rs1800734 variant in Thai CRC patients. The aim of this study was to focus on data from tumor-normal genomic profiles and to identify the potentially pathogenic MLH1 rs1800734 variant in Thai CRC patients in the lower northeastern region of Thailand.

Subjects
This study included 151 patients diagnosed with colorectal cancer (CRC) who lived in the lower northeastern region of Thailand. The patients were recruited from Ubon Ratchathani Cancer Hospital. The clinical and demographic characteristics of the study subjects were collected from medical records. One hundred healthy controls were also studied. The study was approved by the Human Research Ethics Committee of Thammasat University (Science) (HREC-TUSc) (COA No. 026/2020) and the Ethical Committee of Ubon Ratchathani Cancer Hospital, Thailand (EC001/2021).

SNP selection and genotyping
All patients and healthy controls gave written informed consent and provided 5 mL of whole blood (with 1.5 mg/mL EDTA as anticoagulant) for genotyping. The EDTA blood was stored at -20°C. DNA was extracted using the QIAamp DNA Blood Mini Kit. DNA concentrations were determined by spectrophotometric measurement of absorbance at 260 and 280 nm using NanoDrop technology. The MLH1 rs1800734 polymorphism (genotypes GG, GA and AA) was detected by qPCR using the Type-it Fast SNP Probe PCR Kit on a Rotor-Gene Q 5plex HRM instrument.
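The absorbance readings at 260 and 280 nm are typically converted into a concentration and a purity ratio before genotyping. The following is a minimal sketch of that calculation, not the study's procedure: it assumes the standard conversion of 50 ng/µL per A260 unit for double-stranded DNA at a 1 cm path length, and the purity window of 1.7 to 2.0 and all function names are illustrative assumptions.

```python
# Sketch of spectrophotometric DNA quantification.
# Assumptions (not stated in the study): 1 A260 unit = 50 ng/uL dsDNA at a
# 1 cm path length, and an illustrative A260/A280 purity window of 1.7-2.0.

DSDNA_NG_PER_UL_PER_A260 = 50.0  # standard conversion factor for dsDNA


def dna_concentration(a260: float, dilution_factor: float = 1.0) -> float:
    """Return the estimated dsDNA concentration in ng/uL."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def purity_ratio(a260: float, a280: float) -> float:
    """Return the A260/A280 ratio, a common check for protein carry-over."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def is_acceptable(ratio: float, low: float = 1.7, high: float = 2.0) -> bool:
    """Flag samples whose ratio falls inside the assumed purity window."""
    return low <= ratio <= high


if __name__ == "__main__":
    a260, a280 = 0.50, 0.27  # hypothetical readings, not study data
    ratio = purity_ratio(a260, a280)
    print(f"concentration: {dna_concentration(a260):.1f} ng/uL")
    print(f"A260/A280: {ratio:.2f}, acceptable: {is_acceptable(ratio)}")
```

Samples that fall outside the window would normally be re-extracted or re-purified rather than genotyped as they are.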
Total RNA was extracted and checked for quality. One microgram of total RNA was reverse transcribed for first-strand cDNA synthesis. All reactions were performed in triplicate to analyze expression by qPCR using gene-specific primers and TaqMan probes (Table 1), with β-actin as the endogenous reference control. The thermal cycling conditions were 95°C for 5 min, followed by 45 cycles of 95°C for 15 s and 60°C for 45 s. The product size was 134 bp. This method was able to detect all possible genotypes of this gene. Samples were coded for case-control status, and at least 10% of the samples were randomly selected and re-analyzed as quality control to verify the genotyping procedure.
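The repeat-analysis step can be illustrated with a short sketch. It is not the study's actual procedure: the sample identifiers, the fixed seed and the function names are hypothetical, and the code only shows how a random subset of at least 10% can be drawn reproducibly and how agreement between the first and repeat genotype calls can be summarized.

```python
import math
import random


def select_qc_subset(sample_ids, percent=10, seed=2021):
    """Randomly pick at least `percent`% of samples for repeat genotyping."""
    n_repeat = math.ceil(len(sample_ids) * percent / 100)
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    return sorted(rng.sample(sample_ids, n_repeat))


def concordance(first_calls, repeat_calls):
    """Fraction of repeated samples whose two genotype calls agree."""
    shared = [s for s in repeat_calls if s in first_calls]
    if not shared:
        raise ValueError("no overlapping samples between the two runs")
    agree = sum(first_calls[s] == repeat_calls[s] for s in shared)
    return agree / len(shared)


if __name__ == "__main__":
    # Hypothetical coded identifiers for 251 subjects (cases and controls).
    ids = [f"S{i:03d}" for i in range(1, 252)]
    subset = select_qc_subset(ids)
    print(len(subset), "samples selected for repeat analysis")
```

Rounding up with `math.ceil` keeps the subset at or above the stated 10% even when the sample count is not a multiple of ten.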

Statistical analysis
The distributions of demographic, epidemiologic and clinical variables, as well as genotypes, were compared between CRC cases and healthy controls using percentages, odds ratios (OR) and 95% confidence intervals (95% CI). The data were analyzed with STATA software (Version 11.0) to assess the effect of each SNP on colorectal cancer risk, and p < 0.05 was regarded as statistically significant.
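For readers unfamiliar with these measures, the sketch below shows how an odds ratio and its 95% confidence interval can be obtained from a 2x2 table. It uses the Woolf (logit) interval, which is an assumption: the text does not state which method STATA was configured to use, and the counts in the example are hypothetical rather than study data.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio with a Woolf (logit) confidence interval.

    z = 1.96 corresponds to a two-sided 95% interval.
    """
    a, b, c, d = (exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls)
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the logit interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 20 of 100 cases and 10 of 100 controls exposed.
    estimate, lower, upper = odds_ratio_ci(20, 80, 10, 90)
    print(f"OR = {estimate:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

An interval that includes 1 indicates that the association is not statistically significant at the 0.05 level.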

Clinicopathological and demographic characteristics
This study included 251 subjects: 151 CRC patients (mean age 60.2±8.6 years) and 100 healthy controls (mean age 57.9±5.8 years). The demographic characteristics of the subjects are shown in Table 2. An increased CRC risk was observed in those who drank alcohol compared with those who did not and with the 100 healthy controls. MLH1 genotype frequencies in CRC cases and healthy controls are shown in Table 3. GA was the most frequent genotype in both groups, at 48% of CRC cases and 48% of healthy controls. Among CRC cases, GA (48%) was followed by GG (29%) and AA (23%); among healthy controls, GA (48%) was followed by GG (44%) and AA (8%). The homozygous variant genotype (AA) was detected more frequently in CRC cases (23%) than in healthy controls (8%) (p < 0.05). Thus, rs1800734 in MLH1 was associated with the incidence of CRC, and the AA genotype affected CRC risk in the Thai population.
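The genotype comparison can be illustrated with a small calculation. The sketch below reconstructs approximate genotype counts from the reported percentages (an approximation, since the exact counts are given only in Table 3), derives the A-allele frequency in each group and applies a Pearson chi-square test to AA versus non-AA carriers. The choice of test is an assumption; the text does not state which test produced the reported p-value.

```python
import math

# Genotype counts reconstructed from the reported percentages (approximate):
# 29/48/23% of 151 cases and 44/48/8% of 100 controls.
CASES = {"GG": 44, "GA": 72, "AA": 35}
CONTROLS = {"GG": 44, "GA": 48, "AA": 8}


def a_allele_frequency(genotypes):
    """Frequency of the A allele given GG/GA/AA genotype counts."""
    total_alleles = 2 * sum(genotypes.values())
    a_alleles = genotypes["GA"] + 2 * genotypes["AA"]
    return a_alleles / total_alleles


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value for 1 df."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return chi2, p_value


if __name__ == "__main__":
    print(f"A allele, cases: {a_allele_frequency(CASES):.3f}")
    print(f"A allele, controls: {a_allele_frequency(CONTROLS):.3f}")
    # AA versus non-AA carriers in cases and controls.
    chi2, p = chi_square_2x2(CASES["AA"], CASES["GG"] + CASES["GA"],
                             CONTROLS["AA"], CONTROLS["GG"] + CONTROLS["GA"])
    print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

With these reconstructed counts the test gives a p-value below 0.05, in line with the significance level reported for the AA genotype.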

Association between MLH1 rs1800734 variants and clinicopathological and demographic characteristics in colorectal cancer patients
This study examined the correlation between MLH1 rs1800734 variant genotypes and demographic characteristics (age, alcohol drinking and smoking). The heterozygous variant (GA) had OR = 0.96, 95% CI = 0.46-2.01, p = 0.92. Homozygous variant (AA) [...] alcohol in the CRC group (p < 0.05), and 65% of CRC patients had been drinking alcohol for less than ten years. There was no significant difference in CRC risk between smokers and non-smokers (p > 0.05), so smoking might not be a risk factor for CRC in this study.
The tumor grades of the 151 patients with CRC are shown in Table 2. Fifty-nine patients (39.1%) had well-differentiated tumors, 77 patients (50.9%) had moderately differentiated tumors and 15 patients (10%) had poorly differentiated tumors. Of these, 59 patients with moderately differentiated tumors and 11 patients with poorly differentiated tumors drank alcohol. In this study, alcohol consumption among Thai CRC patients was linked to their likelihood of developing colorectal cancer and to tumor grade.

Discussion
A highly conserved set of MMR proteins has long been known to be primarily responsible for repairing bases mismatched during DNA replication. Recent studies have commonly reported that MLH1 plays a central role in the MMR system, in mismatch strand excision and subsequent repair. Genetic variants of the MLH1 gene may affect the MMR capacity of the encoded protein and might therefore contribute to the risk of cancer (Yu et al., 2006). The common polymorphism MLH1 -93G > A is located in the core promoter region, 93 nucleotides upstream of the transcription start site. This region contains putative consensus sequences for transcription factor binding sites and may play an important role in regulating MLH1 promoter activity, thereby modulating susceptibility to cancer. The variant lies within the consensus binding sequence of transcription factor AP-4, which is important for the initiation of transcription (Park et al., 2004; Ku et al., 2009). The A allele of rs1800734 has been reported to perturb the allele-specific binding of AP-4 and consequently to increase DCLK3 expression through a long-range interaction, which promotes malignant transformation by enhancing the expression of genes related to epithelial-to-mesenchymal transition (Liu et al., 2017). Lack of AP-4 binding at the promoter variant SNP may lead to decreased transcriptional activity, possibly to recruitment of other factors and, over time, to increased DNA methylation (Jackstadt et al., 2013; Savio et al., 2017).
These results explore the functional epigenetic regulation and molecular mechanisms occurring at the important MLH1 region in CRC, shedding new light on the epigenetic concept of CpG shores and on how DNA variants play a role in epigenetics and cancer susceptibility. Bi-allelic methylation of the MLH1 promoter region induces transcriptional silencing. MLH1 rs1800734 is an important risk factor for the development of MLH1 methylation (Andrea et al., 2017).
Epigenetic alterations of DNA mismatch repair (MMR) genes are associated with the risk of gastric cancer (GC) by causing microsatellite instability (MSI) (Hichins et al., 2007). MSI is a form of genomic instability that can be detected as changes in the length of repetitive microsatellite sequences. It results from a failure of the MMR system to correct errors made during DNA replication (Campbell et al., 2009). Absence of MMR protein on immunohistochemistry is a surrogate marker for MSI (Lachit et al., 2022). MSI occurs in the majority of tumors from patients with Lynch syndrome who carry germline mutations in the MMR genes (Vilar et al., 2014). MSI also occurs in approximately 15-20% of sporadic colorectal cancers (Weisenberger et al., 2006). MSI is one of the possible predictive markers in metastatic tumors, including colorectal cancer (Sud et al., 2022). Methylation of the MLH1 promoter region and subsequent transcriptional silencing have been shown to play a critical role in the development of MSI-positive CRC (Boland et al., 2010).
MLH1 defects in the MMR system may cause microsatellite instability, which presents as a somatic gain or loss in simple repeat (microsatellite) sequences of DNA. The accumulation of such errors increases the spontaneous mutation rate secondary to genome-wide instability and inactivates tumor suppressor genes, facilitating carcinogenesis. A G > A SNP 93 bp upstream of the transcription start site was identified in the MLH1 gene promoter region [MLH1 -93G > A SNP (rs1800734)], and it has recently been proposed that rs1800734, by resulting in loss of MLH1 protein and microsatellite instability (MSI), may cause colorectal cancer (Mrkonjic et al., 2010; Whiffin et al., 2011).
Epidemiologic reports on MLH1 rs1800734 from many countries, together with this study in Thailand, are summarized in Table 5. The GA genotype has been reported to increase CRC risk more than the AA genotype in Malaysian and Mexican populations (Liu et al., 2019). In our investigation, the AA genotype was detected at a higher rate in CRC cases than in controls. Compared with the wild type (GG), the AA genotype was more closely associated with an increase in colorectal cancer risk than the GA genotype in the Thai (this study) and Spanish populations (Liu et al., 2019). The odds ratio in Table 4 indicated that the AA genotype increased colorectal cancer risk 1.36-fold in the Thai population, and the genotype was detected more frequently in CRC cases (23%) than in healthy controls (8%) (p < 0.05). The MLH1 rs1800734 AA genotype was therefore linked to colorectal cancer risk in this Thai population study. MLH1 rs1800734 was significantly associated with methylation of the MLH1 promoter region in the Thai population. MLH1 rs1800734 caused promoter hypermethylation. This variant induced genomic instability and cell proliferation to the point of colorectal carcinogenesis in the Thai population.
This is the first evidence that MLH1 rs1800734 is associated with a significant risk of developing CRC in the Thai population of the lower northeastern region. The AA genotype affected CRC risk in the Thai population, and alcohol consumption was linked to the likelihood of developing colorectal cancer and to tumor grade. Further independent population-based case-control studies with larger sample sizes, as well as studies in different populations, are warranted to validate our results.