Identification of oxazolo[4,5-g]quinazolin-2(1H)-one Derivatives as EGFR Inhibitors for Cancer Prevention

Objective: Abnormal expression of EGFR (epidermal growth factor receptor) results in different types of human tumors. Quinazoline-containing derivative signify an attractive platform for EGFR inhibitors. The present study aims to discover the potential binders of a group of compounds belonging to oxazolo[4,5-g]quinazoline-2(1H)-one derivative as EGFR inhibitors. Methods: We apply multiple computational procedures, including pharmacophore-based virtual screening methods, to perform a preliminary screening against EGFR over compounds belonging to oxazolo[4,5-g]quinazoline-2(1H)-one derivative. Then, we carried a fine screening by molecular dynamics simulations, followed by free energy calculations. Results: The best pharmacophore model created has five characteristics. Three hydrogen bonds acceptors (A) and two aromatic rings (R) make up AAARR (a sequential representation of chemical features of ligands). We have performed pharmacophore-based screening with different databases like Asinex, Chembridge, Lifechemicals, Maybridge, Specs, and Zinc. Top-scoring 30 molecules were considered for performing induced fit docking. Molecular Dynamics Simulations executed for the top five ligands confirmed that throughout the simulation, the protein-ligand complexes remained stable. Conclusion: Thus, the results obtained from this research provide insights for the development of oxazolo[4,5-g]quinazoline-2(1H)-one derivative as potent EGFR inhibitors.


Introduction
The ability to differentiate between a malignant and a noncancerous cell is essential to intervene in treating cancer cell progression so that the normal cells remain undisturbed. Hence, current research efforts mainly focus on developing targeted therapies on key molecules elaborate in tumor proliferation and survival (Citri and Yarden, 2006;Du et al., 2021;Lee et al., 2021). Epidermal growth factor receptors (EGFR) is one such target that has been successful in curing solid tumors (Viswanathan et al., 2019;Abeer et al., 2021;Zhang et al., 2021). The role of EGFR is to initiate the signalling events and tumors of epithelial cell origin. It is a transmembrane glycoprotein that's one of four tyrosine kinase receptors in the erbB family. The external receptor domain, transmembrane region, and intracellular domain with tyrosine kinase activity make up the EGFR glycoprotein (170kDa) (Saravanan et al., 2018;Nguyen et al., 2021). The association of EGFR to its cognate molecules causes autophosphorylation of receptor tyrosine kinase and subsequent activation of signal transduction pathways
It is significantly expressed in several tumour cells lines and has been linked to a bad prognosis and shorter survival time. The efforts made in the past two decades leading the development of anticancer agents that can affect EGFR activity. The most widely investigated pharmacologic strategy for affecting EGFR function aims to inhibit cellular phosphorylation of the EGFR tyrosine kinase with low-molecular weight compounds (Izadian et al., 2017). Inhibitors operate on the intracellular part of the receptor to block tyrosine kinase phosphorylation and consequent stimulation of signal transduction pathways. The four classifications of EGFR inhibitors are 4-anilinoquinazolines, 4-(ar (alk) ylamino) pyridopyrimidines, 4-phenylaminopyrrolo-pyrimidines (Elrayess et al., 2021), and oxime inhibitors (Xu et al., 2008a(Xu et al., , 2008b. The quinazoline derivatives are a popular family of synthetic compounds that might be used as a scaffold for EGFR inhibitors. Gefitinib (Ranson et al., 2002), Erlotinib (Luo et al., 2011), and Lapatinib (Collins et al., 2009) are examples of anti-tumor medicines developed and marketed using the 4-anilinoquinazoline scaffold. Gefitinib was used as a key molecule in the development of oxazolo [4,5-g] quinazolin-2(1H)-one derivatives (Yin et al., 2015). We have attempted to identify novel potent EGFR inhibitors through pharmacophore-based virtual screening in the present work.

Data Set
Twenty four compounds from oxazolo[4, 5-g] quinazolin-2(1H)-one derivative's series data were retrieved to study EGFR inhibitors with available IC 50 value. All the compounds were refined by Gefitinib as a lead compound (Sim et al., 2018). ChemAxon's Marvin Sketch chemical software constructed the 2D structure of these derivatives. All the in silico approaches of these compounds was done by Schrodinger software.

Ligand Preparation
The preparation process of these derivatives was performed by using the ligprep module in the Schrodinger package. This module confirms an ionization state tautomers and ring conformation to filter the molecules based on the various criteria. The structure with the best chirality is imported to the workspace. This module confirms the 3D conformations and possible combinations. The hydrogen bonds were added, and the bond length was evaluated using OPLS_2005 (Banks et al., 2005).

Pharmacophore Generation
The pharmacophore hypothesis generation of these molecules was done by the PHASE module (Dixon et al., 2006). Rapid torsion angle search approach used to generate conformers and OPLS_2005 force field with an implicit generalised Born and surface area solvation (GB/ SA) model used to minimize each generated structure. Following minimization, each conformer was filtered via a 10kcal/mol relative energy window with a minimal atom deviation of 1.0Å. The six pharmacophoric features which are defining the chemical features of all ligands are, hydrogen bond donor (D), hydrogen bond acceptor (A), hydrophobic group (H), negatively charged group (N), positively charged group (P), and aromatic ring (R). A method for identifying common pharmacophore hypotheses (CPHs) using active analogues. Using a treebased partitioning technique, the common pharmacophores are selected from the conformations of the set of active ligands. The tree-based partitioning approach clusters comparable pharmacophores collectively based on their intersite distances, the distances between the pairs of sites in the pharmacophore.
The pharmacophores that resulted were then assessed and ranked. The scoring goal was to find the best prospective hypothesis that would offer an overall ranking of all the hypotheses. The scoring method additionally considers site point and vector alignment, volume overlap, selectivity, the number of ligands matched, relative conformational energy, and activity. Based on the score analysis and the matching of active compounds to the developed hypothesis, the best highest representative pharmacophore hypothesis was chosen (Figure 1a). The chosen 3D pharmacophore hypothesis (Figure 1b) has the following characteristics: with a survival value of 3.823, there are three hydrogen bond acceptors (A) and two aromatic rings (R).

Pharmacophore Validation
The validation of the pharmacophore was done by using a decoy set. This process validates the discriminative ability of the pharmacophore hypothesis to differentiate the activated compounds. To develop the database, the 24 compounds are combined with the 1000 compound decoy set database. The screening of the databases is done by using "find matches to hypotheses in the PHASE module" (Dixon et al., 2006). The enrichment factor (EF) and goodness of hit score (GH) were computed to estimate the hypothesis capacity for all the active compounds.

Protein Preparation
The atomic coordinates of EGFR kinase protein with the resolution of 1.71Å was obtained from the protein databank (PDB) (Berman et al., 2000). The preparation of the crystal structure was done by the protein preparation wizard. The protein molecule is imported into the workspace and missing side chains and loops are built with primes. The hydrogen bond in the protein structure was optimized. Water molecules under 3Å were deleted. Optimized crystal structure subjected for minimization to confirm the energy of the structure. All the processes were done by the OPLS_2005 force field. Following the preparation of the protein receptor, the grid is generated by the receptor grid generation module.

Structure -Based Virtual Screening (SBVS) Using Glide
The compounds retrieved over pharmacophore-based screening were integrated into Maestro. All the compounds were allowed to the Ligprep module for the addition of hydrogen, salt removal, ionization, and ring conformations (Schrödinger, 2018). The chirality of the original compounds was saved. 3D conformation structures of all the compounds were generated. Pharmacodynamic and pharmacokinetic parameters analyzed drug likeness of all the compounds. For the docking and scoring functions, Virtual Screening Workflow was used. Glide the aqueous environment were done with the Poisson-Boltzmann solver MESP, HOMO, LUMO, and aqueous solvation energy were determined during this operation (Lee et al., 1988). The red and blue colours correspond to the high negative and positive electrostatic potentials, respectively. Other colours (orange, yellow, green) represent intermediary reactivity ranges.

Molecular Dynamics Simulations
The five compounds with the highest glide score and high binding energy were subjected to the molecular dynamic simulations. The molecular dynamic simulations were performed by the Desmond module in the Schrodinger package (Chow et al., 2008). The best-docked structure complexes were allowed to 20ns simulations. The dynamic simulations help to understand the structural stability and structural fluctuation of the protein complexes. The protein-ligand complexes were fixed in the orthorhombic water box that contains TIP3P as water molecules, and neutralized by adding 0.15M Na+ Cl-ions (Mark and Nilsson, 2001). The system equilibrium was maintained at 300K and 1 bar pressure for the whole process. The built-in function of Desmond trajectory analysis was used to determine RMSD, RMSF, backbone comparison in Initial and Final structures, and hydrogen bond interactions (Chow et al., 2008).

Pharmacophore based Virtual Screening
The set of 55 compounds is the precise compound to inhibit EGFR activity. From the set of 55 compounds, 24 compounds were chosen for their better inhibiting activity. Pharmacophore was formed to these compounds respective to the features named as Hydrogen bond was originally performed in a high-throughput virtual screen module (Friesner et al., 2004). 50% of the leading score compounds were docked in standard precision mode and 25% of leading score ligands were docked in extra precision mode (Friesner et al., 2006). All the docking processes were done by default parameters. Scoring functions were used to select the best conformations of each ligand.

Induced Fit Docking Procedure
For initial molecular docking, standard precision mode was used. 20 poses for each ligand were retrieved for structural refinement. The ligand docking process was done by the induced-fit docking procedure in the Schrodinger suite. Induced fit docking process was performed for the high scoring 30 compounds retrieved by using glide docking. First, the ligands were docked with a rigid receptor model. The energy minimization of protein structure was done by using the OPLS-2005 force field with the default parameters (Banks et al., 2005). Three phases are involved in this procedure: a) Glide-based docking of the ligands and receptors in standard precision (SP) mode, with a van der Waals scaling factor of 0.5 provided to the non-polar atoms of the receptor and compounds (Friesner et al., 2004). For the structural refinement, 20 poses of all the ligands were retrieved. b) To generate the protein and ligand complexes, the docked conformers are allowed to side-chain and backbone refinements. c) In the extra precision (XP) mode of Glide docking, the ligands are optimized at each of the binding pockets. Induced-fit docking scoring functions calculates both protein interaction and total energies of the system for ranking of Induced fit docking poses. Thirty compounds that contain the best glide score were selected for further analysis.

Rescoring Using Prime/MM-GBSA
The induced fit docking pose with the best Glide score value was calculated using the Prime/MM-GBSA approach (Genheden and Ryde, 2015). This process has been used to calculate the free binding energy of all the ligands to the receptor. The energy of all complexes was calculated using the OPLS-AA force field and the Generalized-Born Surface Area (GBSA) continuum solvent model. The binding free energy (Gbind) is calculated by Gbind = ER: L -(ER + EL ) + Gsolv + GSA, where ER:L -energy of the protein-ligand complex, ER + EL -sum of energies of the ligand and unliganded receptor, Gsolv -difference between GBSA solvation energy (surface area energy) of the complex and GSA is the sum of corresponding energies of the ligand and un-liganded protein.

Molecular Electrostatic Potential (MESP) Calculation
The five molecules tabbed from the induced fit docking study were subjected to the density functional theory (DFT) calculations (Becke, 1993). The DFT calculations of all the compounds were done by the Jaguar module (Bochevarov et al., 2013). The geometry optimization of all the compounds was done by the hybrid density functional theory. The energy calculations in acceptor and donor, hydrophobic group, positive and negative ions, and aromatic ring. Pharmacophore features of these compounds were developed by default parameter in PHASE. Common features using a 1Å terminal box size and tree-based algorithms that developed together according to their intersite distances, applying scoring functions for five featured pharmacophore hypothesis was ranked considering site point alignment and vector alignment, volume overlap, selectivity, number of leads.
The general pharmacophore alignment of all these compounds was retrieved based on five characteristics, the AAARR hypothesis with the survival score of 3.283 shown in Figure 1A and distance in 1B. This hypothesis includes five pharmacophore features with two acceptors, two donors, and two aromatic rings. These values show that all compounds have the same structural framework and binding orientation. Using these screening pharmacophore models helps extract the best pharmacophore features and active compounds for EGFR protein. Small molecules are screened using seven different databases and the fitness criteria given as >2.0 to test new compounds for a model. We obtained 1,000 compounds from each database and docked the target with EGFR by virtual glide workflow screening.  ZINC03861279 and (e). ZINC01031908. oxazolo [4,5-g]quinazolin-2(1H)-one Derivatives as EGFR Inhibitors

Pharmacophore Validation (Decoy Set Method)
The AAARR hypothesis has been validated and used as an external decoy and active database to control its predictive ability. The data set contains 1,024 compounds that include all 24 compounds from the database and 1,000 decoy compounds from Schrodinger. The 24 active compounds have been obtained as hits through the screening of the database using AAARR, which indicate it is a successful model.

Molecular Docking
All the oxazolo[4, 5-g]quinazolin2(1H)-one derivative were developed using Gefitinib, the documented inhibitor of EGFR [9]. Gefitinib mimics ATP, and the ATP binding site is sandwiched between the lobes where ATP forms essential interactions with hydrogen bonding and the hinge zone. The target protein 4I22 is bound to Gefitinib thus; the interactions formed were taken as a reference point. A total of 1,000 compounds from each database were obtained through pharmacophore-based virtual screening of seven databases. Via virtual screening workflow using Glide, these compounds were docked with EGFR (PDB: 4I22). The glide score value was used to choose the bestdocked compounds out of the 1,000 compounds from each set. We got the 30 best-docked conformations with glide values greater than 9.5kcal/mol (Table 1). The identified inhibitor Gefitinib forms close contact of hydrogen bond with the protein residue Met793. Hydrogen bond interaction with the backbone atoms of Met793 was found to form the selected 34 compounds. Even in several of the docked compounds, the residue Gln791 is found to form interactions. It has also been found that the residues Asp855, Asp800, Lys745, Cys797, Thr854, Arg841, Pro794 form interactions with the target protein. It was also found that these residues form interactions with the reference compound Gefitinib, which indicates that the selected 30 compounds also form many interactions of the receptor. These compounds were subjected to additional analysis.  DOI:10.31557/APJCP.2022.23.5.1687quinazolin-2(1H)-one Derivatives as EGFR Inhibitors

Induced Fit Docking (IFD)
In the Glide docking, the protein was rigidly placed when binding the ligand was made flexible. In the Induced fit docking, both the protein and the ligand were treated flexibly. For each compound, we built induced fit models and the significance of ligand fitting in the binding pocket was taken into account. IFD score had used to rate IFD poses. Thirty (30) compounds were selected out of thirty-four (34) glide docked compounds for IFD analysis. The findings show that Met793 residues formed more numbers of backbone hydrogen bond contacts compared to glide docking. The compounds tested were also found to interact strongly with Gln791 residues. The residues Thr854, Asp855, Asp800, and Lys745 form good contacts through their side-chain interactions. All of these interactions were found to be similar to those of the interactions between Gefitinib and the EGFR target, which shows the reliability of the selected compounds (     Table 4. ADME Properties of the Top-Scoring Five Compounds

Binding Energy Calculations
The binding free energy analysis was performed using the Prime/MM-GBSA method for the docked compounds induced to match. The calculated Gbind values ranged from -80.03kcal/mol to -125.25kcal /mol. The more negative the beliefs, the more successful they are in binding. The primary residue that forms the most important contact with EGFR is Met793, followed by Gln791, Thr854, and Asp855. These are the residues that help effectively bind the compounds to the protein (Table 2).

Molecular Electrostatic Potential (MESP) Profiles
For electronic structure analysis, the best five compounds from induced fit docking and five highly potent compounds (possessing high IC50 values) were taken from the dataset sequence. The results show that all substances share electronic properties identical to each other. The MESP maps displayed the MESP profiles for the five induced fit docked compounds in Figures 2A,  2B, 2C, 2D, and 2E respectively, in which blue colour reflects the potentially electropositive region and the potentially electronegative region reflected by red colour. The negative potential values ranged from -65.75kcal/ mol to -72.69kcal/mol and the positive potential ranged from 56.47 kcal/mol to 98.56 kcal/mol for the induced fit docked compounds, whereas the negative potential ranged from -43.98 kcal/mol to -47.77 kcal/mol and the positive potential ranged from 54.09 kcal/mol to 82.95 kcal/mol for the compounds in the dataset. The positive ranges of electrostatic potentials for the dataset and the screened compounds were almost identical. The green and the light yellow colour represent the molecules' neutral zone, and the reactivity of the selected molecules can be understood as a whole.

Lowest Unoccupied and Highest Occupied Molecular Orbital
The frontier orbitals' energies contribute to the electron donor and acceptor's molecular properties, thus affecting the reactions of chemical compounds. The LUMO energy is related to the affinity of electrons and the measure of possible nucleophilic attack site of molecules. The data set and screened compounds of HOMO and LUMO sites are shown in Figure 3. From this, we found that HOMO and LUMO energies are weak, respect to the range between -0.208 and 0.242eV and between -0.060 and -0.075eV, ranging between -0.202 and -0.223eV (HOMO) for screened compounds and for dataset compounds between -0.074 and -0.084eV that indicates the fragile existence of the bound electrons. The energy difference between oxazolo [4,5-g]quinazolin-2(1H)-one Derivatives as EGFR Inhibitors the HOMO and LUMO energies (HLG) of screened compounds, ranges between 0.144 and 0.182eV, but for dataset compounds 0.123 and 0149eV. It indicates that less stability making these compounds very reactive. It allows both rapid electron transfer and exchange equally possible. It was also discovered that the compounds screened were more reactive than the compounds from the dataset. The compounds ZINC01031908, F2565-0145, and ZINC14436570 have high energy of HOMO and different energies of LUMO. The highly active ZINC01031908 has a lower LUMO energy of -0.060459 and has a larger band difference. Due to its low band gap, ZINC03861279 is a more reactive compound among the three compounds. All the screened compounds have nearly the same HLG values that suggest that they are similarly reactive ( Figure  3A, 3B, 3C, 3D, 3E, 3F, 3G, 3H, 3I & 3J and Table 3). Consequently, the effect of HOMO and LUMO energies may directly affect the inhibitory action against EGFR.

Aqueous Stabilization
Aqueous solubility acts as a major property leading to the drug's oral bioavailability. For the 30 induced fit docked compounds, the solvation-free energies ranged from -4.86kcal/mol to -60.2kcal/mol. The compound LEG 07404742 has -60.2kcal/mol of high-solvation capacity. Next the compounds ZINC03861279, ZINC06483405, and F2565-0145 possess high solvent capacity -33.38 kcal/mol, -29.61 kcal/mol, and -29.02 kcal/mol (Table 4) respectively. Solvation energy indicates the compound's solubility. The more negative the values are, the more is the aqueous solubility of the compound. The solvation energies calculated could provide knowledge about the pharmacokinetic optimization of these compounds.

ADME
The compounds analysed were subjected for property analysis by ADME, as the compounds have to be pharmacologically suitable to serve as potential leads against EGFR objectives. QPlogKsha is -6.5 and -0.5, QPlogKsha is -1.5 and 1.5, QPPCaco is < 25 and > 500, QPlogBB is -3.0 and -1.2, QPPMDCK is < 25 and > 500 and total human oral absorption is between 25 and 80 percent, according to the Qikprop software. For the top five compounds, the obtained values of all parameters are coming under the acceptable range and are further subject to molecular dynamics simulations.

Molecular Dynamics Simulation
For the top-scoring five substances, molecular dynamics (MD) simulations were performed using the Schrödinger Desmond module. The stability of the protein-ligand complexes examined after 20ns of simulations indicates that all ligands have time-period deviations of up to 5ns, the initial fluctuations were considered as a time taken for equilibrium. Throughout the simulation time depicted in the RMSD graph ( Figure 4A), all protein-ligand complexes were later found to be stable. The RMSF graph in Figure 4B shows fluctuations among the target protein residues and we observe that smaller fluctuations in the residues with interaction. The Met793, Asp800 active site residues form stable interactions in the simulations and additional interactions are also formed with Lys745, Glu791, Cys797, Thr854, and Asp855 with water-mediated interactions as shown in Figures 5A, 5B, 5C, 5D, 5E, 5H, 5I, 5J. Water-mediated protein-ligand interactions play a crucial role in the binding affinity, potency, and selectivity of the small molecules.

Discussion
Discovering drug leads can be greatly aided by computer-aided drug screening. Finding effective inhibitors with a limited number of experiments would be difficult. Molecular modelling and molecular dynamics simulation, according to several publications in the literature, considerably assist in narrowing down the candidate pool for EGFR inhibitors at a reliable computational cost . Further, molecular dynamics simulations give more precise atomic details of the protein-ligand interaction, making the prediction much more understandable and aiding in discovering ligand binding mechanisms (Zhang et al., 2021). Considering the above, twenty-four highly active compounds with the five features of the pharmacophore hypothesis were predicted against EGFR. The predicted pharmacophore hypothesis was further carried out by screening seven separate databases for pharmacophore-based virtual screening to carry new and the most active compounds against epidermal growth factor receptors (EGFR). The obtained 7,000 screened compounds were further implemented in induced fit docking protocol with EGFR structure. To assess the inhibitory properties of these compounds, the five best compounds, ZINC06483405, ZINC14436570, F2565-0145, ZINC03861279, and ZINC01031908 were docked with EGFR using the standard protocols. Induced fit docking of the selected five compounds had strong interactions with EGFR. Binding-free energy analysis confirms the findings of IFD as the binding energy of protein-ligand complexes appears to have low values which suggest more stable and feasible binding. Furthermore, molecular dynamics research shows that the interaction and structure of protein-ligand complexes remained constant during simulation. The electronic structure of the top five compounds is analysed to determine their essence, which can aid in the understanding of their chemical and pharmacokinetic features. The ADME prediction of these compounds implies that they have drug-like features, and more investigation into their potential as EGFR inhibitors should have been done. Our findings can greatly aid in selecting prospective compounds from a much larger chemical space, which will greatly aid in discovering true innovative drugs against EGFR.
In summary, we have screened seven small molecular chemical databases to discover molecules with desired pharmacokinetic properties that inhibit EGFR. The hits obtained were subjected to multiple computational methods, including molecular dynamics simulations. Our results suggest oxazolo[4,5-g] quinazoline-2(1H)one derivative that have the potential to facilitate drug development for EGFR. Thus, this computational research gives the first glimpses of a possible binding between oxazolo[4,5-g] quinazoline-2(1H)-one derivative and