A Multifactor Dimensionality Reduction-logistic Regression Model of Gene Polymorphisms and an Environmental Interaction Analysis in Cancer Research

Abstract

Background: Analyzing interactions between genes and the environment in complex multifactorial human diseases is challenging. A key problem is that parametric statistical methods are limited in detecting gene effects that depend solely or partially on interactions with other genes or with environmental exposures. The aim of this study was to investigate the use of multifactor dimensionality reduction (MDR) and logistic regression to analyze interactions of complex disease genes with other genes and with environmental factors, and to compare the results of the two methods.
Methods: In this case-control study, the two methods were applied to computer-simulated data representing 486 cancer patients and 514 control individuals, with 4 environmental factors (E1 to E4) and 8 gene polymorphism factors (G1 to G8). Unconditional logistic regression was used to analyze risk factors for cancer, and both MDR and logistic regression were used to analyze interactions under various conditions.
Results: MDR detected the high-order gene-environment interaction (E3*G1*G7) but could not detect main effects. Conversely, logistic regression was better at identifying main effects (E3, G1, and G4) but was limited in its analysis of the high-order interaction (E3*G1*G7). Taken together, the results of the two methods on the simulated data indicate that the G1 site, the G4 site, E3, and the E3*G1*G7 interaction may be risk factors for cancer.
Conclusions: MDR and logistic regression are complementary methods that can be combined to analyze gene-gene and gene-environment interactions with good results. This approach should help to determine the causes of chronic non-communicable diseases such as cancer.
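
To make concrete why MDR can detect an interaction where no main effect is visible, the core MDR labeling step can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names (mdr_accuracy, best_model), the toy factors A, B, and C, and the use of balanced accuracy without cross-validation are all assumptions made for the example. Each multifactor cell is labeled high risk when its case:control ratio exceeds the overall ratio, and the resulting two-group rule is scored.

```python
from collections import Counter
from itertools import combinations, product


def mdr_accuracy(rows, labels, factors):
    """Balanced accuracy of the MDR high/low-risk rule for one factor combination."""
    cases, controls = Counter(), Counter()
    for row, y in zip(rows, labels):
        cell = tuple(row[f] for f in factors)  # multifactor cell for this subject
        (cases if y == 1 else controls)[cell] += 1
    n_case, n_ctrl = sum(cases.values()), sum(controls.values())
    threshold = n_case / n_ctrl  # overall case:control ratio
    # A cell is high risk when its case:control ratio exceeds the overall ratio.
    high = {c for c in set(cases) | set(controls)
            if cases[c] > threshold * controls[c]}
    sensitivity = sum(cases[c] for c in high) / n_case
    specificity = sum(n for c, n in controls.items() if c not in high) / n_ctrl
    return 0.5 * (sensitivity + specificity)


def best_model(rows, labels, names, order):
    """Factor combination of the given order with the highest balanced accuracy."""
    return max(combinations(names, order),
               key=lambda fs: mdr_accuracy(rows, labels, fs))


# Hypothetical toy data: disease status depends only on the A-by-B interaction
# (an XOR pattern), so neither A nor B shows a main effect; C is pure noise.
rows = [dict(zip("ABC", values)) for values in product([0, 1], repeat=3)]
labels = [row["A"] ^ row["B"] for row in rows]

single = mdr_accuracy(rows, labels, ("A",))     # no better than chance
pair = mdr_accuracy(rows, labels, ("A", "B"))   # interaction fully separates groups
winner = best_model(rows, labels, "ABC", 2)
```

On this toy data the single-factor model scores at chance level while the two-factor model separates cases from controls, which mirrors the complementary behavior described above. A full MDR analysis would additionally use cross-validation and permutation testing to select and assess the final model, which this sketch omits.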

Keywords