In-silico Rational Protein Engineering and Design Approach to Improve Thermostability of a Haloalkane Dehalogenase Enzyme

Corresponding Author: Raghunath Satpathy School of Life Science, Sambalpur University, Jyoti vihar, Burla, Odisha, 768019, India Email: rnsatpathy@gmail.com Abstract: Thermostability of enzymes is a major prerequisite for use in industrial enzymology. There are as such no simple general principles for achieving thermostability in case of enzymes as many factors are required to fulfil for different enzymes. The present study describes computational methods to design thermostable haloalkane dehalogenase enzyme using the crystal structure available Protein Data Bank (PDB ID: 1EDE). In in silico design strategy rule-based approaches such as disulfide bond geometry, new hydrophobic pocket design, new salt bridge construction and multiple mutations (combination of the above approaches) were introduced to the original enzyme. After each design strategy the functional effect was confirmed in terms of enzyme substrate binding by molecular docking using Autodock vina tool. Best design strategy was evaluated by comparative molecular dynamics simulation applying simulated annealing method at 8 ns using GROMACS tool. The surface hydrophobicity which is the key factor for thermostability in haloalkane dehalogenase was obtained from the simulation result. Upon optimizing the parameters, thermostability of mutant enzyme under consideration was also confirmed by the 5 ns molecular dynamics simulation at 400, 500 and 600 K.


Introduction
Haloalkane dehalogenase enzymes can be used as a potential catalyst for biodegradation of a number of halogenated compounds that are environmentally toxic expelled as industrial by-products (Fetzner and Lingens, 1994;Koudelakova et al., 2013). Along with their potential applications in bioremediation, these enzymes have gained a limited interest in the industry for the production of optically pure alcohols. Therefore, the identification of novel dehalogenating enzymes is very much important in terms of their industrial utility Hasan et al., 2011). The catalytic reaction mechanism of the haloalkane dehalogenase enzymes (EC 3.8.1.5) involves the conversion of 1-haloalkane to a primary alcohol and halide ions following a hydrolysis reaction (Wackett, 1994;Prokop et al., 2003). Structurally, these enzymes belong to alpha/beta hydrolase superfamily that specifically acts on halide bonds present in carbonhalide compounds. Their active site is found to be buried in a hydrophobic cavity at the interface formed by hydrolase core domain and another cap domain (Dijkstra, 1993;Oakley et al., 2002) The general mechanism of the catalysis by haloalkane dehalogenase has been studied in a detailed manner where amino acids ASP-124 and HIS-289 align perfectly with the substrate and catalyze the hydrolysis reaction respectively (Marek et al., 2000;Hesseler et al., 2011). By understanding the importance of these enzymes, designing of a thermostable haloalkane dehalogenase would be very much important. The current in silico approaches in the protein engineering and design to enhance the thermostability in enzymes have been employed as a potential method in many events as explained by many researchers (Lehmann and Wyss, 2001;Damborsky and Brezovsky, 2014;Basu and Sen, 2013). A typical sequence/structure based design to enhance thermostability assumes that the conserved amino acids observed in related sequences are singles which mainly contributes favoring stability. Thus, the engineering strategy in the former case consists of output from multiple alignments of similar sequences followed by implementing suitable design strategies in the target protein that resemble the consensus sequence. Otherwise from a structure-based method it can be summarized that a more rigid protein would be more stable at high temperatures (Watanabe et al., 2006;Razvi and Scholtz, 2006). The different design strategies introduced to the protein structure are increasing the disulfide bonds, salt bridges and stabilizing the loop regions either by introducing residues such as proline or removing glycine. Similarly, another common strategy is finding the most flexible region of the target proteins from the experimental B-factors and then to focus mutagenesis on this aspect (Farias and Bonato, 2003;Kumar et al., 2000;Wang and Lercher, 2010;Anbarasan et al., 2010). One of the direct method in case of proteins of interest having enhanced thermal stability may be obtained either by protein engineering or by searching for homologs in thermophiles Radestock and Gohlke, 2011). Protein engineering may involve directed evolution, design from first principles or some combination of these two strategies (Chakraborty et al., 2013;Barrozo et al., 2012;Lindberg et al., 2013). Protein design strategy can be viewed just like an optimization problem: That considers certain criteria, by which an engineered protein will fold to the desired structure and that holds the novel function (Alvizo et al., 2007). The most common design mechanism to achieve the thermo stability is engineering a desired protein by inserting disulfide bonds, substituting core amino acids, de novo design of the hydrophobic pocket, salt-bridge insertion etc. (Chirakkal et al., 2001;Yokota et al., 2006;Zhou et al.,1993;Charbonneau and Beauregard, 2013). Following computational design it is necessary to evaluate the substrate binding affinity by the molecular docking method. For docking purpose Autodock Vina is one of the most preferred tools as described by many researchers world-wide (Azam and Abbasi, 2013;Chen and Ren, 2014;Frączek et al., 2013). Furthermore the molecular dynamics simulation is used commonly to check the stability, energy and also to evaluate thermostability property of the enzyme at different temperatures (Paul et al., 2014;Manjunath and Sekar, 2013;Noorbatcha et al., 2013).
The objective of the present work is to develop a computational protein design method using the crystal structure of haloalkane dehalogenase enzyme as a template to make it thermostable. Subsequently the design and validation of the designed protein was studied by various in silico approaches.

System Architecture used in this Work
The configuration of the PC used for performing computation studies were as follows; OS Windows XP, Dual Core, RAM 2 GB, 250 GB HD, 1.76 GZ and for docking and MD simulation HPC WIPRO Super Nova, 1 TF, Intel Xeon Quad Core 2.66 GHZ, with 16 Nodes were used.

Selection of Suitable Haloalkane Dehalogenase Enzyme Structure from PDB and In-Silico Protein Design
Currently the PDB contains about 59 haloalkane dehalogenase 3D structures. The most suitable dehalogenase was searched from PDB server using keywords and Boolean criteria in the advanced search (http://www.rcsb.org/pdb/search/advSearch.do) such as "haloalkane dehalogenase", chain length from 200-350 amino acid, single chain, CATH criteria having alpha-beta with no ligands and with a low crystallographic R value.

Disulfide Bond Design
Disulfide bond design is an important biotechnological tool that has great interest in wide areas of research and the same has been applied extensively to improve protein stability, modify functional characteristics. DbD2 web server available at http://cptweb.cpt.wayne.edu/DbD2/ is used for disulfide bond design. DbD2 is a web-based, platform-independent application that significantly extends functionality and visualization capacity. This feature facilitates identification of potential disulfides that are not only likely to form, but are also expected to provide improved thermal stability to the protein that are very reliable. Following the desired PDB file upload, suitable parameters were selected for our desired PDB (details of parameters discussed in the results and discussion section).

Hydrophobic Pocket Design
Binding sites and active sites of proteins are most often associated with structural pockets and cavities. CastPserver (http://stsfw.bioengr.uic.edu/castp/calculation.php) was used for identification and measurements of surface accessible pockets as well as interior inaccessible cavities, for proteins. This program analytically measures the area and volume of each pocket and cavity that can be visualized by the JMOL visualization tool. Castp predicted several pockets for the protein of interest, among them the suitable pocket was chosen based on criteria such as, area >100, cavity not containing any active site residues and has maximum number of hydrophilic residues. Then the selected hydrophilic residues were mutated to corresponding hydrophobic residues based on comparable molecular weight using PyMol tool.

Salt Bridge Mutant Construction
The criteria for salt bridge formation is that at least one Asp or Glu side-chain carboxyl oxygen atom (containing atom type OD in Asp or OE in Glu) and one side-chain nitrogen atom of Arg, Lys or His (contain atom type NH in Arg, NZ in Lys or NE and ND in His) should be within a distance of 4.0 Angstroms. The salt bridge in the selected protein was analyzed by ESBRI web tool (http://bioinformatica.isa.cnr.it/ESBRI/). Other charged residue that does not take part in salt bridge formation in the core and surface were identified using the in-house built software. Other than the existing salt bridges, additional salt bridges were designed by suitable mutation at a specific range and the respective pairs were analyzed whether they are still within the range (about 4.0 angstrom) after the energy minimization.

Multiple Mutant Enzyme Design
All the above mutation strategies mentioned in the earlier sections were introduced in combination to the protein structure template using PyMol tool.

Energy Minimization and Molecular Docking to Check the Substrate Affinity
Following the design of the mutant structures it is necessary to analyze the different energy values obtained before and after minimization to gain an idea about their stability. For the various mutants, preliminary energy minimization was performed using SPD viewer tool followed by GROMACS software. Further molecular docking was performed on the designed protein, including the original (wild type). For docking purpose the chemical structures of substrates for haloalakane dehalogenase enzymes were obtained from the PubChem database (http://pubchem.ncbi.nlm.nih.gov/). Two substrate structures 1, 2, 3-trichloropropane and 1, 2dichloroethane were retrieved based on the toxicity level that are also degraded by the haloalkane dehalogenase enzyme. These compounds were energy minimized to respective 3D structure by PRODRG server (http://davapc1.bioch.dundee.ac.uk/prodrg/) followed by molecular docking using Autodock Vina tool.

Comparative Molecular Dynamics Simulation
Molecular dynamics simulation was run to examine the stability of different designed proteins. This is one of the popular computational methods that calculates the time dependent behavior of protein and provides detailed information in terms of fluctuations and conformational changes. For MD simulation, GROMACS 4.5.0 (Groningen Machine for Advanced Chemical Simulations) software was used. It is a freely available and open source molecular dynamics simulation software mainly designed for simulations of biochemical molecules like proteins, lipids and nucleic acids. The force field used in this study was GROMOS96 53a6. The protocol for simulation includes placing the initial protein structure in a cubic box filled with water, followed by energy minimization of the system without restraint using the steepest descent method for maximum up to 300 steps. The molecular dynamics simulation was set at temperature 298 K, with decreasing harmonic position restraints on the protein atoms over a 50 pico second time interval, followed by 8 ns (up to 333 K) without restraints in the simulated annealing mode. The P-LINCS algorithm was used with a dielectric constant of 1 and a time step of 2 Femto second (fs) was used for the MD simulation purpose with pressure kept constant during the MD simulations. A simulated annealing protocol was performed by dynamically changing the reference temperature (from 298 to 333 K) in the thermostat algorithm. After selection of suitable mutants that achieved thermo stability, another round of molecular dynamics simulation was performed for 5 nano second at temperatures 400, 500 and 600 K respectively.

Results and Discussion
Crystallographic structures of haloalkane dehalogenase from the source organism Xanthobcter autotrophicus GJ10 (PDB entry 1EDE) was used as a template for the initial coordinates for MD simulations (Verschueren et al., 1993). This protein structure has single chain of 310 residues without any ligand whose structure was solved at a relatively low crystallographic R value (Angstrom) of 1.9. For all mutation purpose, the structural conformation of this protein was used as template.

Disulfide Design in the Structure by DbD Tool
For calculating the possible amino acid pairs that can be mutated to form disulfide bonds by DbD server, intra chain interaction was selected and χ3 angle within range -87° or +97°±10 with Cα-Cβ-Sγ angle range 114.6°±8 was considered. From the output a total of 12 pairs of predicted disulfide bond formers were obtained, out of which only 3 residue pairs (ASP 48 -ALA 75, VAL 69 -ALA 299 and TYR 38 -PRO 80) were chosen for mutation to form disulfide bonds due to its low energy (Kcal/Mol) value. Altogether the above mentioned residues were then mutated to cysteine using the Pymol mutagenesis module.

Hydrophobic Pocket Design
From the output of the CastP server, only pocket ID #49 fulfils the criteria stated in materials and methods section and therefore selected. From the predicted pocket, residue ASP 9 and both PHE 12 and TYR 24 were not considered for mutagenesis as the former is present in an alpha helix and the later residue is hydrophobic in nature. All other residues of pocket #49 were converted into corresponding hydrophobic residues based on comparable molecular weight using PyMol mutagenesis module. The mutant protein in this type contains replacement of residue 6ARG/PHE, 7THR/VAL, 35 ARG/PHE, 87 LYS/ILE, 88 SER/VAL, 90 LYS/ILE as shown in Fig. 1.

Result of Salt bridge Construction
The various charged amino acid residues within 4Aº radius were selectively replaced by other appropriately charged residues (Table 1).

Result of Multiple Mutant Construction
Multiple mutant protein contains all the changes that have been performed for disulfide bond design, hydrophobic core design as well as the mutation that is done for salt bridge construction as explained above.

Energy Minimization and Molecular Docking Study
The results indicate that all these mutants along with the original (wild type) protein posses a total energy that is negative, hence stable (Table 2). Followed by a docking protocol, the results show that the binding site of the ligand was found to be explicitly positioned near the crystallographic binding orientation within the protein cavity. From binding energy values in all cases it was concluded that inspite of mutation the mutant enzymes show almost equalaffinity to the substrates (Table 2).

Molecular Dynamics Simulation Results
The molecular dynamics simulation was by placing the protein in a defined grid solvated in a water environment. The simulation was performed by the simulated annealing method for 8000 Pico second at different temperature levels ranging 298 to 333 K as discussed above. Various properties such as RMSD, Solvent-Accessible Surface Area (SASA) both hydrophobic and hydrophilic, Radius of Gyration (RG), Root Mean Square Fluctuation (RMSF), number of hydrogen bonds both in protein and protein-solvent (water) system were computed as a function of time to assess the stability of the proteins from the respective trajectory file of Gromacs output.

Low Root Mean Square Deviation (RMSD) of Hydrophobic Mutant Enzyme
RMSD is a parameter that shows how much a structure differs from its original protein structure and therefore it is considered as an important indicator for evaluating the thermostability of a protein in terms of structural conformation. In general, the smaller the RMSD value of a protein, smaller is its Cα-atomic displacement range and therefore more thermo stable is its conformation (Zhang et al., 2014). The output was obtained as the Cα-atomic displacement parameter to access the stability. The RMSD during the simulations shows the average over the 8 ns simulations (Fig. 2). The overall C-alpha to C-alpha fluctuation obtained for the hydrophobic mutant is comparatively less. At the beginning of the trajectory (t = 0), RMSD of all proteins displayed a value of 0.09 nm indicating further about the movements occurred during the thermalization and equilibration periods. Between 7000-8000 pico seconds (298-333 K), the deviation value of hydrophobic mutant decreases sharply than other mutant protein.
However, it is possible that at simulation times of 8ns, the hydrophobic mutant structure might be stable and reach at the folded state at a variable temperature. This is also due the substituted hydrophobic amino acids are responsible for enhancing the thermal stability of protein structure in this mutant of the haloalkane dehalogenase enzyme.

Low Solvent Accessible Surface Area (SASA) of Hydrophobic Mutant
Solvent Accessible Surface Area (SASA) is basically used to calculate the transfer of free energy that is required to move a protein from a polar solvent to a non-polar solvent and is also used to predict the secondary structure (Momen-Roknabadi et al., 2008). Figure 3 and 4 show a significant decrease in total solvent accessible surface value (for both hydrophobic and hydrophilic) in case of hydrophobic mutants until the end of simulation time. The solvent accessible surface areas of atoms are correlated with their hydrophobicity value and folding process in proteins.
A lower SASA value correlated proportionally to the stability in the protein structure has been explained earlier (Richmond, 1984). The replacement of hydrophilic residues with hydrophobic residues has increased the non-polar residues located inside the protein structure favouring protein packing and thus reducing the internal cavities within protein that providing overall protein thermostability (Salleh et al., 2012).

Distribution Salt Bridge in Mutant Enzyme
Structure during MD Simulation Figure 5 indicates the trends of salt bridge formation and breakage in case of all considered mutants as well as the original protein. In general, surface salt bridges have been shown to contribute to protein stability at high temperatures due to decreased desolvation penalties of charged groups (Fei et al., 2013).
A significant increase in the number of salt-bridges has been reported for most structures of thermostable enzymes, but this trend has not has been explained properly by site-directed mutagenesis experiments (Yip et al., 1998;Trejo et al., 2001). However in this case, at room temperature (0 ns/ 298 K) the number of salt bridges is more but as the temperature increases there is anomalous effect observed in all enzymes.

Lower Root Mean Square Fluctuation (RMSF) Value of Hydrophobic Mutant
Root Mean Square Fluctuation (RMSF) of each residue in the proteins was used to calculate the trajectory to determine the overall flexibility and behavior of the system at different temperatures (298-333 K). The conformational structure of haloalkane dehalogenase mutant structures continuously change during the MD simulations until at the end of the simulation (8 ns).
In this comparative study lower conformational changes were obtained for hydrophobic mutant (Fig. 6). However, due to lack of significant fluctuations (nm), the RMSF does not correlate much to the thermal stability for a specific mutant but indicate better compactness and high rigidity which are the major factors in protein thermal stability in all considered proteins (Kuzmanic and Zagrovic, 2010).

Lower Radius of Gyration of Hydrophobic Mutant
The radius of gyration describes the general size of the protein and basically indicates the global changes that occur in the shape of the protein during MD simulation. The radii of gyration during the simulations are shown in Fig. 3 and the average values over the 8 ns production simulations are shown in Fig.  7. It can be seen that most of the simulations give average radii of gyration of 1.78-1.86 nm. However among all selected proteins the hydrophobic mutant show slightly lesser radii of 1.79 to 1.82 nm between temperature range 298K/28°C and 333K/60°C. This indicates that, in general the mutations do not have any major impact on the general shape of the protein, except some rather small changes when the hydrophobic residues are inserted to the original protein structure, which might have indirectly improved the hydrophobic interaction in the protein structure leading to the rigidity in the structure.
The radius of gyration as a feature for selecting thermo stability has been studied by many researchers (Irani et al., 2013;Feng et al., 2007;Liu et al., 2008). It has been observed that the thermal stability of the proteins is inversely proportional to the radius of gyration. From our analysis hydrophobic mutant show comparatively lower value for radius of gyration indicating thermo stability.

Optimum Hydrogen bond Distribution in Hydrophobic Mutant
Hydrogen bond calculation from the trajectory file of Gromacs was computed at a distance cut-off of 3.5 Angstrom. As the temperature increases with the increase in time, the numbers of intra-molecular hydrogen bonds were observed to be slightly decreased in case of hydrophobic mutant in comparison to the un-mutated (Original) protein. This is reasonable because the structure becomes more distorted as the simulation temperature is raised. It was also observed from the plot (Fig. 8) that, although the number of hydrogen bonds varies in different temperature still it is steadily maintained with the increase in simulation time and temperature (Choudhury et al., 2010;Vogt et al., 1997). Due to a large distortion of regular secondary structural elements and unpacking of the hydrophobic cores, some of water molecules are inserted in hydrophobic cores.
It can be predicted that, the enhanced intermolecular (water-protein) hydrogen bonding might compensate the breakage of intra-molecular Hydrogen bonding thereby providing the stability to the hydrophobic mutant.

Dynamic Behavior of the Hydrophobic Amino Acids in Hydrophobic Mutant
In addition to all the features, the dynamic behavior of designed hydrophobic pockets was also analyzed and shown in Fig. 9. The amino acid residue PHE 6 with VAL 88, PHE 35 and ILE 90 was observed to interact periodically as the time and temperature increased. The hydrophobic interactions have more stabilizing effect in comparison to the salt bridges (Burg et al., 1994;Karjiban et al., 2009;Shirdel et al., 2013).
For each nano second of simulation the non-polar SASA value was computed from Naccess program (http://www.bioinf.manchester.ac.uk/naccess/) to check the behaviour of the considered residues. Figure 10 indicates Val 88 and Ile 90 remaining almost buried in nature and Ile is little exposed (at 5 ns) during the simulation time with temperature increment. However in this condition, due to the exposure of the residues Phe 6 and Phe 35, it strongly promotes the flexibility, enhancing their interaction and burial of adjacent residues strongly inhibiting the flexibility of the central residues (interaction of Val 7 and Ile 87).
The present analysis has shown a new concept of enhancing haloalkane dehalogenase protein thermo stability by hydrophobic amino acid interaction at the surface. So along with the surface exposed hydrophilic residues and core hydrophobic residues, the surface hydrophobic residues in the loop region also play a significant role in attaining thermo stability (Zhang et al., 2009;Moret and Zebende, 2007). This factor can be considered as a new protein design strategy for this particular enzyme to achieve thermo stability.

MD Simulation of Hydrophobic Mutant at high Temperature Conditions
In order to validate its thermal stability, further simulations at (constant) high temperature is necessary (Chang and Mahoney, 1995). The plots of RMSD of the protein versus time at different temperatures are shown in Fig. 11. The plot shows that at 400 and 500 K the enzyme was found to be stable throughout 5ns simulation time. In the trajectory run at 600 K, the starting conformation changed and the backbone RMSD slightly increased fluctuating between 0.2 and 1.5 nm.
The rise in RMSD indicates large changes in protein structure and some disruption in the tertiary structure of protein. Due to the thermostable property, the hydrophobic mutant can stay longer time in the folded state at higher temperature.

Conclusion
In the current work a computational effort has been made to improve the thermostability of a haloalkane dehalogenase enzyme, thereby understanding the thermostability mechanism of the enzymes. After a rulebased mutant protein design, energy minimization and docking showed all the mutants to be stable not losing their substrate binding affinity with selected ligands. During MD simulation, various properties such as the average values of the Solvent-Accessible Surface Area (SASA) for both hydrophilic and hydrophobic, the Radius of Gyration (RG), the number of inter and intraprotein hydrogen bonds and the number of salt bridges were computed as a function of time. On the basis of the simulation result, it was observed that the hydrophobic pocket mutant enzyme that contains mutation of amino acids 6ARG/PHE, 7THR/VAL, 35 ARG/PHE, 87 LYS/ILE, 88 SER/VAL, 90 LYS/ILE in the loop region show thermo stability. The mechanism of thermostability of hydrophobic mutant was studied that was mainly attributed due to the interaction of 6PHE with 35 PHE and 87 ILE in the surface. Furthermore, at high temperature simulation (at 400 and 500 K), the hydrophobic mutant shows low RMSD confirming the thermo stability nature. Surface hydrophobicity is linked to the thermo stability in the haloalkane dehalogenase enzyme has been established in this study. However this is a computational interpretation, therefore further standardized experimental methods as site directed mutagenesis and enzyme activity analysis is required to establish this concept. The whole study offers a general characterization of the computational approaches to design biotechnologically improved thermostable enzymes.

V. Badireenath Konkimalla:
Developed an in house based software for SASA calculation of protein structure, Linking the concept of hydrophobicity to thermo-stability.
Ratha Jagnyeswar: Analysis of the result and overall hypothesis prediction, overall editing in the manuscript.

Ethics
This manuscript is an original research article and contains unpublished material. The corresponding author declares that, the manuscript has been approved by all the three authors and there are no ethical issues involved.