Molecular Docking and QSAR Study of Chalcone and Pyrimidine Derivatives as Potent Anti-Malarial Agents against Plasmodium falciparum

A data set of chalcone and pyrimidine derivatives with anti-malarial activity against Plasmodium falciparum was used to investigate the quantitative structure-activity relationship (QSAR). A molecular docking study was performed against Plasmodium falciparum dihydrofolate reductase (PfDHFR-TS). The genetic function approximation (GFA) technique was used to identify the descriptors that influence anti-malarial activity. The most influential molecular descriptors identified include thermodynamic, structural and physical descriptors. The generated model was found to be good on the basis of its correlation coefficient, lack of fit (LOF), $r^2_m$ and $r^2_{cv}$ values. Nrotb, solubility and polarizability may have a negative influence on antimalarial activity or play an important role in


Introduction
Malaria is the most common parasitic disease in the tropical and sub-tropical regions of the world today. The World Health Organization's 2013 'World Malaria Report' estimates that the disease still took 627,000 lives in 2012. Of the total malaria cases reported in the South-East Asia region in 2012, 52% were from India [1]. Nearly all antimalarial drugs were developed because of their action against the asexual erythrocytic forms of malaria parasites that cause clinical illness. Malaria is caused by Plasmodium parasites such as P. falciparum, P. vivax, P. malariae and P. ovale [2], of which the most lethal is Plasmodium falciparum (Pf). The biology of this parasite is well understood with the help of genomic [3] and proteomic projects [4].
Dihydrofolate reductase (DHFR) is responsible for the reduction of the 5,6 double bond of dihydrofolate to tetrahydrofolate through a catalytic cycle in the malaria parasite. Models of PfDHFR have also been reported based on the homology of the DHFR sequences [5,6,7]. In Plasmodia, DHFR and TS coexist as a single-chain bifunctional enzyme, in contrast to prokaryotes and higher eukaryotes, where the two proteins are distinct monofunctional enzymes. In bifunctional DHFR-TS, the individual DHFR and TS domains have polypeptide folds closely related structurally to those of their respective monofunctional counterparts [8]. Before the release of PfDHFR-TS crystal structures, researchers developed homology models to study the resistance mechanism of commonly used antifolate drugs [7]. Drug resistance acquired by the malarial parasites is a major problem in the treatment and control of malaria.
Dihydrofolate reductase (DHFR) in Pf (PfDHFR-TS; TS refers to the thymidylate synthase joined to DHFR in Pf), which catalyzes the reduction of dihydrofolate to tetrahydrofolate, is one of the most widely studied enzymes in antimalarial drug design because of its potential role in DNA synthesis [9,10].

Data set
The ligand set of twenty-one compounds, prepared by the previously reported method of Christian et al. (2014) [11], was used for the MolDock study. The inhibitory concentration of each molecule was expressed as IC50. The general chemical structures and biological activity values of all the compounds are shown in Table 1.

Binding site prediction
It is well known that Plasmodium falciparum dihydrofolate reductase (PfDHFR-TS) possesses a catalytic site for enzymatic activity [12]. The 'Cavity' module of Molegro Virtual Docker (MVD) was used to identify potential ligand-binding sites, using an interaction energy scheme to find energetically favorable binding sites between the protein and a van der Waals probe [13]. Cavities clustered by spatial proximity were used to rank the top-scoring predicted cavities. The experimentally known active site of PfDHFR-TS (Phe96) [12] was used as the docking cavity.

Molecular docking of substituted prop-2-en-1-ones and pyrimidines and their energetic profile
Prior to docking, the ligand dataset was energy minimized using the Ghemical all-atom force field as implemented in the PyRx program [14]. The crystal structure of PfDHFR-TS (entry: 4DPD) was downloaded from the Protein Data Bank (PDB). The protein structure was preprocessed by removing crystallographic waters and ions. The docking simulations were performed using the Molegro Virtual Docker (MVD) 8.0 program. MVD generates ligand poses within the search space of the protein binding cavity using a genetic (evolutionary) algorithm (GA). Docking was carried out with the following customized settings: grid resolution = 0.30, population size = 50, number of generations = 10 and number of solutions = 10. The best pose for each ligand was selected by sorting first on the MolDock score and then on the H-bond energy term.
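The pose selection rule described above (sort on MolDock score, then on H-bond energy) can be sketched as follows. This is a minimal illustration, not the MVD implementation: the `Pose` record, the ligand names and the numeric values are hypothetical, and the sketch assumes that lower (more negative) values are better for both energy terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """One docked pose (hypothetical record, not an MVD data structure)."""
    ligand: str
    moldock_score: float  # assumed: more negative means a better score
    hbond_energy: float   # assumed: more negative means stronger H bonding


def best_pose_per_ligand(poses):
    """Return {ligand: best pose}, ranking by MolDock score, then H-bond energy."""
    grouped = {}
    for pose in poses:
        grouped.setdefault(pose.ligand, []).append(pose)
    return {
        ligand: min(candidates, key=lambda p: (p.moldock_score, p.hbond_energy))
        for ligand, candidates in grouped.items()
    }


# Hypothetical values for illustration only; they are not results from the study.
poses = [
    Pose("ligand_A", -120.5, -2.1),
    Pose("ligand_A", -120.5, -4.8),  # same score, stronger H bonding: preferred
    Pose("ligand_A", -98.3, -6.0),
    Pose("ligand_B", -110.0, -1.0),
]
selected = best_pose_per_ligand(poses)
```

The tuple key makes the H-bond energy a tie-breaker only; it never overrides a better MolDock score.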

Model validation and stereochemistry corrections (530 AA)
The Ramachandran plot [15] showed 95.7% of amino acids in the core areas of the φ-ψ distribution, with 4.0% in additional allowed regions and 0.4% outliers (Figure 1A). In addition, the PROSA-WEB Z-score (a composite scoring function consisting of the following pseudo-energy terms: Cβ interaction, all-atom pairwise, solvation and torsional angle, together with the agreement measures secondary structure and solvent accessibility) [16] of the modeled PfDHFR-TS protein was -9.77, which is close to that of experimental X-ray structures (Figure 1B). The model was therefore judged to be structurally realistic and was used for further analysis.

Interaction of pyrimidine derivatives in PfDHFR-TS
A cavity comprising the active site residues, along with adjacent neighbors within 6 Å, was constructed to serve as the docking grid. A grid space with a site volume of 209 Å³ predicted by MVD [17] was used in the docking simulations. The ligand dataset consisted of twenty-one pyrimidine derivatives. We first examined the binding conformation of the substituted pyrimidine moiety, to identify the floor of the cavity and to allow the terminal groups to interact with the active site residues (Figure 2). The pharmacological activity of pyrimidine is attributed mainly to its lipophilicity (predicted LogP = -0.312) [18].
The MolDock scores of the ligand dataset were dominated by van der Waals contacts, with few H bonds. Because the substituted pyrimidine moiety is rich in H-bond acceptors, H-bond donors and hydrophobic pharmacophore groups, and the active site of PfDHFR-TS contains ionizable and non-polar amino acids, H bonding and other modes of interaction are possible (Table 2). The substituted pyrimidine moiety interacts with the active site residues.

Data set
Potent antimalarial activity against Pf has been reported for a series of chalcones and pyrimidines by Christian et al. [11] (Table 1). These data were used for the QSAR study. The IC50 values were converted to the negative logarithmic scale (pIC50) to obtain a more nearly normal distribution; the values are shown in Table 3. The test set was chosen by random selection of compounds that exhibited a varied range of inhibitory activities and structural diversity. Twenty compounds were randomly divided into a training set of fifteen compounds and a test set of five compounds.
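The pIC50 conversion and the random 15/5 split described above can be sketched as follows. The unit of the IC50 values is not stated here, so the micromolar default is an assumption, as are the function names, the compound identifiers and the fixed random seed.

```python
import math
import random

# Conversion factors to mol/L; the default unit below is an assumption.
_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9}


def pic50_from_ic50(ic50, unit="uM"):
    """Return pIC50 = -log10(IC50 expressed in mol/L)."""
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50 * _TO_MOLAR[unit])


def random_split(compound_ids, n_test, seed=0):
    """Shuffle the identifiers reproducibly and return (training, test) lists."""
    shuffled = list(compound_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical identifiers standing in for the twenty compounds.
compound_ids = [f"cpd_{i}" for i in range(1, 21)]
train_ids, test_ids = random_split(compound_ids, n_test=5)
```

A purely random split does not by itself guarantee that the test set spans the activity range, so the resulting pIC50 ranges of both sets are worth checking after the split.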

Computation and selection of molecular descriptors
Descriptors are a vital element of any ligand-based study, because a biological property can vary with the molecular properties of the chemical structure. In this study, molecular descriptors representing structural, thermodynamic and other properties were calculated using Microsoft Excel 2010. To develop the QSAR models, descriptors were selected by analysis of the correlation matrix. Highly correlated descriptors, with a correlation above 0.75 (implying high multicollinearity), and descriptors with a value of zero were removed from the study. The descriptors used in both studies are listed in Table 4. The calculated descriptor values (for Eq. 1) used in the QSAR study of P. falciparum are given in Table 5. The correlation matrices of the descriptors (for Eq. 1) used in the QSAR study of P. falciparum are given in Table 6.
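The correlation-based descriptor filter described above can be sketched as follows. The text does not say which member of a highly correlated pair is discarded, so this sketch assumes a greedy rule that keeps the descriptor encountered first; it also reads "descriptors with value zero" as descriptors that are zero for every compound. The descriptor names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def filter_descriptors(table, threshold=0.75):
    """Drop all-zero descriptors and any descriptor correlated above the
    threshold (in absolute value) with one that has already been kept."""
    kept = []
    for name, values in table.items():
        if all(v == 0 for v in values):
            continue
        if any(abs(pearson(values, table[k])) > threshold for k in kept):
            continue
        kept.append(name)
    return kept


# Hypothetical descriptor table (one list of values per descriptor).
table = {
    "desc_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "desc_b": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly proportional to desc_a
    "desc_c": [0.0, 0.0, 0.0, 0.0, 0.0],  # zero for every compound
    "desc_d": [3.0, 1.0, 4.0, 1.0, 5.0],
}
kept = filter_descriptors(table)
```

Because the rule is greedy, the outcome depends on descriptor order; ordering the table by correlation with activity before filtering is one possible refinement.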

Validation test
The statistical significance of the relationship between the chemical structural descriptors and antimalarial activity was analysed by variance inflation factor (VIF) analysis, by cross-validation (internal validation with the training set) and by external validation with an external data set.
VIF analysis was performed to check the inter-correlation of the descriptors. The VIF value is calculated as $VIF = 1/(1 - r^2)$, where $r^2$ is the multiple correlation coefficient of one descriptor regressed on the remaining molecular descriptors. A VIF value larger than 10 indicates that the information carried by a descriptor can be hidden by its correlation with the other descriptors, that is, by multicollinearity [19,20].
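The VIF calculation described above can be sketched as follows. Instead of running a separate regression for each descriptor, the sketch uses the standard identity that the VIF of descriptor j equals the j-th diagonal element of the inverse of the descriptor correlation matrix, which gives the same value as 1/(1 - r²). The helper names and the example values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(table):
    """Return {descriptor: VIF} from the inverse of the correlation matrix."""
    names = list(table)
    corr = [[pearson(table[a], table[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Hypothetical descriptor values for five compounds.
example = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [3.0, 1.0, 4.0, 1.0, 5.0],
    "x3": [2.0, 7.0, 1.0, 8.0, 2.0],
}
vif_values = vif(example)
flagged = [name for name, value in vif_values.items() if value > 10]
```

Descriptors appearing in `flagged` would exceed the threshold of 10 quoted above.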

Evaluation of predictive ability of QSAR models
The predictive power of the models was evaluated on an external data set by examining statistical parameters such as $r^2_{pred}$ and $r^2_m$. The predictive correlation coefficient ($r^2_{pred}$) measures the predictive ability of each analysis and was determined from a set of compounds that were not included in the training set. The predictive $r^2$ ($r^2_{pred}$) value is based on molecules of the test set only and is defined as [21]:

$r^2_{pred} = (SD - PRESS)/SD$
where SD is the sum of the squared deviations between the biological activities of the test set molecules and the mean activity of the training set molecules, and PRESS is the sum of the squared deviations between the predicted and actual activity values for every molecule in the test set.
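The definition above translates directly into a few lines of code. The sketch below is an illustration with hypothetical activity values; the function and argument names are not from the original study.

```python
def r2_pred(test_observed, test_predicted, train_observed):
    """Predictive r^2 = (SD - PRESS) / SD for an external test set.

    SD    : squared deviations of observed test activities from the
            mean activity of the training set.
    PRESS : squared deviations between predicted and observed test activities.
    """
    if len(test_observed) != len(test_predicted):
        raise ValueError("observed and predicted test values must pair up")
    train_mean = sum(train_observed) / len(train_observed)
    sd = sum((obs - train_mean) ** 2 for obs in test_observed)
    if sd == 0:
        raise ValueError("SD is zero; r2_pred is undefined")
    press = sum((pred - obs) ** 2
                for obs, pred in zip(test_observed, test_predicted))
    return (sd - press) / sd


# Hypothetical pIC50 values for illustration only.
train_observed = [4.0, 5.0, 6.0]   # training set mean is 5.0
test_observed = [4.0, 6.0]
test_predicted = [4.5, 5.5]
value = r2_pred(test_observed, test_predicted, train_observed)
# SD = 1 + 1 = 2, PRESS = 0.25 + 0.25 = 0.5, so value is 0.75
```

Note that the reference mean comes from the training set, not the test set, so the value can be negative when the model predicts worse than that mean.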
To better understand the external predictability of the models, the modified $r^2$ ($r^2_m$) was determined by the following equation [22]:

$r^2_m = r^2 \left(1 - \sqrt{r^2 - r_0^2}\right)$
where $r^2$ is the squared correlation coefficient between observed and predicted values and $r_0^2$ is the squared correlation coefficient between observed and predicted values without intercept.

International Letters of Chemistry, Physics and Astronomy Vol. 85

To identify which physicochemical properties of the compounds influence antimalarial activity, 100 QSAR equations were generated by the GFA technique from 23 descriptors preselected from the correlation matrix. The best models, limited to four parameters (Table 7) in keeping with the rule of five compounds per descriptor, showed acceptable statistical characteristics for P. falciparum (Table 8; Figure 3, Figure 4). In any QSAR study, the molecular features affecting the biological activity should not be inter-correlated. VIF analysis of both studies indicates low intercorrelation of the descriptors used in the selected models. For example, the correlation matrix values of all the descriptors are given in Table 5.
Model 1 is considered the best on the basis of internal and external validation. Its regression correlation coefficient of 0.90 shows good correlation with biological activity, and its cross-validation coefficient of 0.67 indicates good internal predictability. As external validity parameters, its predicted correlation coefficient ($r^2_{pred}$) of 0.67 and $r^2_m$ value of 0.90 indicate the reliability and significance of the model. Each equation was assessed by Friedman's lack of fit (LOF). The molecular descriptors used in model 1 are Nrotb, solubility and polarizability (Table 9). Of these, polarizability is important for the interaction between the electrons of a ligand and a biological receptor. Likewise, the solubility and Nrotb of the compounds also played an important role in the changes in biological activity (Table 10).
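The modified r² described in this section can be sketched in code as follows. Two points are assumptions rather than details taken from the text: the through-origin coefficient r0² is computed with the commonly used regression of observed on predicted values with zero intercept, and the absolute value of the difference is taken under the square root to guard against small negative differences. All names and values are hypothetical.

```python
import math


def squared_pearson(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    sop = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    soo = sum((o - mean_o) ** 2 for o in observed)
    spp = sum((p - mean_p) ** 2 for p in predicted)
    return sop ** 2 / (soo * spp)


def r0_squared(observed, predicted):
    """r0^2 for the regression observed = k * predicted (no intercept).

    Assumed convention: k = sum(o * p) / sum(p * p), and
    r0^2 = 1 - sum((o - k * p)^2) / sum((o - mean(o))^2).
    """
    k = sum(o * p for o, p in zip(observed, predicted)) / sum(p * p for p in predicted)
    mean_o = sum(observed) / len(observed)
    residual = sum((o - k * p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - residual / total


def rm_squared(observed, predicted):
    """Modified r^2: r^2 * (1 - sqrt(|r^2 - r0^2|))."""
    r2 = squared_pearson(observed, predicted)
    r02 = r0_squared(observed, predicted)
    return r2 * (1.0 - math.sqrt(abs(r2 - r02)))


# Hypothetical values: predictions are perfectly correlated but offset by 1.
observed = [1.0, 2.0, 3.0]
shifted = [2.0, 3.0, 4.0]
perfect_rm2 = rm_squared(observed, observed)
shifted_rm2 = rm_squared(observed, shifted)
```

The shifted example shows why the metric is stricter than r² alone: a constant offset leaves r² at 1 but lowers the modified value.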

Conclusion
The molecular docking study shows that the substituted pyrimidine moiety binds well in the PfDHFR-TS pocket and displays excellent antimalarial activity compared with the chalcone derivatives. The QSAR study shows that structural, thermodynamic and physical descriptors are responsible for describing the activity of chalcones and 2-substituted pyrimidine derivatives. The QSAR model is statistically and chemically sound, with excellent predictive power as evidenced by the predicted activity of the test set compounds. It indicates that Nrotb, solubility and polarizability may have a negative influence on antimalarial activity. The results showed that the 2-substituted pyrimidine derivatives have lower values of these descriptors, which is consistent with their excellent antimalarial activity compared with the chalcone derivatives.