Structural and Substrate Interaction Properties of Alkaline Phosphatase from Paenibacillus sp . PSB04: In-Silico Analysis

: A Phosphate-Solubilising Bacterium (PSB) of Paenibacillus sp . PSB04 was previously isolated from the Sabah tropical rainforest in Malaysia. Interestingly, the genome sequence of the PSB04 strain harbored an Alkaline Phosphatase (AP) (EC 3.1.3.1) gene and was hypothesized to have unique structural characteristics. Therefore, this study aims to determine the AP three-Dimensional (3D) model and catalytic mechanism from Paenibacillus sp . PSB04 (PAP). To address this, the 3D model of this protein was built and docked into a model substrate of p -nitrophenyl phosphate. As a result, the best complex was shown to have the lowest binding energy of -5.9 kcal/moL. Furthermore, the complex showed the atomic coordination of catalytic residues of PAP and the substrate was similar to that of AP from Escherichia coli (ECAP), which implies that both APs shared a similar catalytic mechanism. In this mechanism, Ser 94 of PAP acted as nucleophilic residues, which were activated by the Zn ion. Arg 145 is predicted to be mobile due to its location in the loop region, which allows this residue to stabilize the substrate through direction or water-mediated secondary interaction. Docking simulation of pNPP indicated that the putative residues involved in the catalysis mainly are Ser 94, Ser 141, Ala 146, Thr 147, Pro 148, Asp 293, and Glu 294. Glu 294 is considered a unique residue corresponding to Lys 328 ECAP, allowing the PAP to have a better affinity to stabilize the substrate in the binding cavity. Accordingly, a unique catalytic mechanism of PAP was proposed.


Introduction
Alkaline Phosphatase (AP) (EC 3.1.3.1) is a hydrolytic metalloenzyme found abundantly across microorganisms. Studies on AP were mostly related to their important roles in soil Phosphorus (P) availability (Lindang et al., 2021;Millan, 2006). In addition, alkaline AP was also reported to be produced by some Lactic Acid Bacteria (LAB)-based probiotics which lead to the possibility of involvement in food fermentation for health benefits. Afiyah et al. (2015) indicated that LAB-based probiotics benefit humans by producing various functional enzymes during fermentation. According to AP from Escherichia coli (ECAP), which is the most studied AP, the primary differences between AP, phytase, and acid phosphatase lie in the catalytic mechanism and pH requirement for the catalytic activity to take place (Dong et al., 2021, Liu et al., 2021Millán, 2006). The phytate enzyme hydrolyses only the ester bond of phytate and releases less phosphorylated myo-inositol and inorganic phosphorus as the by-product (Chen et al., 2015). Meanwhile, Acid Phosphatase (AcP) and AP also attack the ester bond of other P compounds at different pH environments and release inorganic P as the byproduct. AcP and AP catalyze optimally in acidic and alkaline conditions, respectively (Coker et al., 2013). Unlike AP, most AcP does not utilize metal ions during catalysis except for tartrate group AcP.
The catalytic residues of AP are highly conserved, consisting of Ser and Arg with the involvement of divalent metal ions of Zn, Mg, Ca, and Cu (Wang and Griffiths, 2009). The structural arrangement of bacteria AP typically consisted of N-Terminal signal peptide, followed by the mature domain and sometimes crown domain. Interestingly, the crown domain was not conserved across the bacterial APs. The most distinctive crown domain structure across reported APs was Vibrio-AP (PDB ID: 3E2D), essential for the protein's stability and function (Helland et al., 2009;Hjörleifsson et al., 2021;Ásgeirsson et al., 2020).
A mesophilic bacterium, Paenibacillus sp. PSB04 was previously isolated from the nutrient-limiting soil rainforest soil of Danum Valley (DV), Sabah, in Malaysia (Lindang et al., 2021). The complete genome of strain PSB04 reveals an alkaline phosphatase encoded in the sequence (Gene Bank ID: WP014096696.1). The specific activity of Paenibacillus sp. PSB04 was considerably high and it was hypothesized that the high specific activity of AP of Paenibacillus sp. PSB04 (PAP) is due to its unique structural features.
The recent study described the three-dimensional model of PAP, constructed through structural homology modeling and binding properties toward the ligand through docking simulation. Accordingly, the unique structural properties and catalytic mechanism of this protein were then proposed.

Structural Homology Modeling of PAP
The amino acid sequence of PAP was retrieved from the complete whole-genome sequence of Paenibacillus sp. PSB04, which originated from Danum Valley, Sabah, Malaysia. The gene sequence of PAP is available in NCBI with the accession number WP 014096696.1. The nucleotide sequence was then used for constructing the homology model of PAP using the ab initio protein structure prediction server of Robetta (Waterhouse et al., 2018;Kelley et al., 2015;Yang et al., 2015;Kim et al., 2004). The model was then evaluated and validated according to Global Model Quality Estimation (GMQE), QMEAN statistical parameters, and G factor Procheck (Benkert et al., 2011, Laskowski et al., 1993Lüthy et al., 1992).

Molecular Docking
The structural model of PAP obtained above was used for molecular docking study as the receptor. Meanwhile, the known substrate for alkaline phosphatases, para Nitrophenyl Phosphate (pNPP), was selected as the ligand. Both receptors and ligands were prepared by minimizing the energy and then 3D protonating in MOE (Scholz et al., 2015). Further, the docking of ligand onto PAP was performed using yasara Structure software (Yet Another Scientific Artificial Reality Application) (Krieger and Vriend, 2015). Briefly, the receptor and ligand files in .pdb format were used to set target and play macro in the yasara structure software. Then, the docking analyses were performed using local and global docking default macro file dock_runlocal.mcr and dock_run.mcr, respectively. The macro files were used to calculate the interaction energy (kcal/mol) and dissociation constant, KD (pm), of the PAP protein complexes. Meanwhile, local docking was executed by a pre-define square grid with the size of 10Å  10Å  10Å around the catalytic site of the receptor. Further, the receptor file was saved in. yob format as required for the local docking study. Both local and global docking study undergoes 25 VINA docking runs of the ligand to the receptor. Afterward, the PAP complexes were converted into PDB files by YASARA software and visualized for its 2D-3D interactive study with the help of Discovery Studio Visualizer version v19.1.0.18287 (BIOVIA, San Diego, CA, USA) and PyMol Software. To note, the selection of the area around the catalytic site for the binding site during docking was based on the finding of other APs which showed that the substrate docked into this area. The catalytic sites of PAP were identified using sequence alignment as detailed below.

Identification of the Essential Residues for the Catalysis
The identification of active site residues reveals the possible catalytic mechanism of the PAP proteins and the active site residues of alkaline phosphatase are known to be conserved. In this study, the identification of the PAP protein was carried out by multiple sequence alignment and validated by determining its configuration based on the position and distance of the active site.

Molecular Dynamic (MD) Simulation
MD simulation was performed to determine the structural stability of PAP in apo and holo forms. The simulation was done using the YASARA structure package (Krieger et al., 2002). The simulation was carried out in an explicit water environment, at constant pressure, using an AMBER14 force field, in a periodic cell boundary condition and the model was simulated at 298 K (25°C), pH 7.4. First, the solvated structure was minimized by the steepest descent method for 15,000 steps at a temperature of 298 K and a constant pressure. Then the complex was equilibrated for a 2 ns period. After equilibration, a production MD was run for 100 ns at a constant temperature and pressure. The Root Mean Square Fluctuation (RMSF) values were analyzed for each simulation. The RMSF for the average equilibrium conformation is often used as an indicator of protein rigidity, which is considered a feature related to the resistance to unfolding (Kuzmanic and Zagrovic, 2010). All the prediction and simulation studies were done on a Dell XPS8940 with the processor of Intel®Core TM i7-10700 CPU@2.90GHz.

Structural Homology Modeling and Validation of PAP
Robetta modeling structure produced five structural models of PAP (model 01-05). To decide on the most reliable PAP model, further structural evaluation and validation were performed on the five models, as shown in Table 1. All models exhibited more than 30% sequence identity against their template (Table 1). Xiang (2006) reported that a protein sequence with over 30% identity to a known structure could often be predicted with an accuracy equivalent to a low-resolution X-ray structure. This indicated that all models are reliable in terms of their sequence identity. Nevertheless, as shown in Table 1, the QMEAN value of model 02 was 0.2, which is the closest to 0. The score reflected the errors of the model structure based on its geometrical aspects globally (entire structure) and locally (residue level). QMEAN value around zero indicates a good agreement between the model structure and experimental structures of similar size. Scores of -4.0 or below indicate low-quality models (Benkert et al., 2011). Further, the Ramachandran plot (Table 1) also revealed that while model 02 has only 94% residues in the favored region, this model has the lowest residues located in the lower region. In addition, the score of Verify 3D was also found to be higher than the requirement (>80%), which indicated the correctness of the model. This score evaluates segments of the model based on how well the environments of the residues in that segment (e.g., burial, secondary structure) correlate with their observed propensities for being in those environments (Eisenberg et al., 1997). In addition, the overall average G-factor of dihedral angles and main-chain covalent forces of model 02 was 0.18, which was greater than the acceptable cut-off of -0.5. The G-factor provides a measure of the plausibility of a stereochemical property and a high G-factor means the property corresponds to a high probability of conformation (Aslanzadeh and Ghaderian, 2012). Altogether, model 02 was considered the best model of PAP.
The dimeric structure of PAP follows the typical quaternary structure of other bacterial APs, which were widely reported to be homo-dimeric structures in crystal structure and solution (Aiba et al., 2017;Sharma et al., 2014). The active site of PAP (Ser 94 and Arg 158) is located between α 5 and β 7 (Fig. 2). These residues are well conserved to the other well-reported bacterial APs. Structural comparison with other APs indicated that PAP has the closest similarity to human placenta AP (PDB ID: 1EW2), with an RMSD of 1.04 Å, as shown in Fig. 1C. Meanwhile, the structural comparison with E. coli AP (ECAP), the most extensively studied AP, revealed a moderate structural similarity between both structures, with an RMSD of 2.12 Å (Fig. 1D). The RMSD reflected the similarity between two proteins, which was computed between aligned pairs of their backbone Cα atoms. The lower RMSD indicated a small deviation between the Cα; hence both structures are considered similar (Reva et al., 1998). While there is no definite RMSD threshold, an RMSD of less than two is often used to consider the high similarity of the aligned structures. Further detailed analysis of the dimeric interface revealed some hydrophobic interactions that are essential for dimerization stability. The critical involvement of hydrophobic interactions in the dimeric interface was found to be common in some proteins (Budiman et al., 2011).

Identification of the Catalytic Active Sites and Metal-binding Sites of PAP
A multiple amino acid sequence alignment of PAP with other selected APs is shown in Fig. 3. The alignment revealed that Ser 94 and Arg 158 of PAP were well aligned to the active sites of ECAP (Ser 102 and Arg 166), which implied that these two residues are likely to be the catalytic sites of PAP. In addition, almost all the residues with essential roles in the metal-binding and active sites were well conserved. This indicated that PAP might utilize the canonical catalytic mechanism of APs with the involvement of similar metal ions. This, nevertheless, remain to be experimentally confirmed.

Molecular Docking of PAP
Docking simulation was performed between PAP and pNPP as the ligand, using global and local docking approaches. In total, seven complex models were obtained, where the best model was depicted based on the consideration of its binding energy, dissociation energy, and orientation towards the ligand. Figure 4(A) showed the best complex of PAP with pNPP as a ligand. The complex has a binding energy of -5.9 kcal/moL with dissociation energy of 48.719 um.
Further, MD simulation revealed that the average RMSF values of the apo-and holo forms of PAP were 2.27 and 2.21 Å, respectively (Fig. 5). This is in the range of acceptable RMSF value (1-3 Å) for protein to be considered structurally stable (Parikesit and Nurdiansyah, 2021). To note, some segment residues showed fluctuation higher than 3. The segments include 65-70, 83-92, 105-120, 150-165, 182-210, 250-260, and 340-375 (Fig. 2). These segments are located in the loop region, therefore acceptable for high flexibility during the simulation. Cheng and Ivanov (2012) indicated that RMSF reflects the flexibility of different regions of a protein, which can be related to crystallographic B factors. Accordingly, it is acceptable for the loop region to have high RMSF values during the simulation due to high B factors. Altogether, the MD simulation convincingly indicated that the model of PAP, in the absence or the presence of substrate (pNPP), is considered stable and reliable for the analysis.
The complex showed that the Ser 94 and Arg 158 catalytic sites were in close contact with the ligand and the distance between Ser 94 and the phosphoryl group of pNPP was 3.4 Å. The side chain of Ser 94 facing toward the phosphate group of the ligand indicates the possibility for the first catalysis reaction to occur. In the case of ECAP, the nucleophilicity of Ser catalytic residue is activated by the Zn 2+ ion. While our model was unable to depict the presence of any metal ion, nevertheless, the sequence alignment (Fig. 3) indicated the possibility of a Zn binding site along with the sequences. To note, the possibility of the presence of a Zn binding site in PAP is solely based on the conservation residues with the known Zn binding site of ECAP (Fig. 3), which indeed requires experimental confirmation. Prediction of the zinc-binding site through sequence alignment with known Zn binding proteins is considered a useful tool for discovering new zinc-binding sites (Furukawa et al., 2018;Hubbard et al., 1991). Figure 4(B) showed one of the coordination of metalbinding sites around Ser 94 and Arg 158 catalytic sites, which indicated that the site formed by Asp 335 or His 336 or Thr 147 or Asp 293 is in proximity to Ser 94. This implied that the activation of Ser 94 nucleophilicity by Zn 2+ , coordinated by any of those residues, is an acceptable route.
On the other hand, Fig. 4(A) indicates the distance between Arg 158 of PAP and PO4 moiety of the ligand is in the range of 5.9-7.0 Å. Holtz and Kantrowitz (1999) reported that the catalytic site of Arg 166 of ECAP directly interacts with the PO4 substrate through its side chain. During the catalysis, its guanidinium group is vital to stabilize the transition state of the reaction. Notably, the distance between Arg 158 of PAP and PO4 is unfavorable for the direct interaction (5.9-7.0 Å). Nevertheless, the position of Arg 158 is located at the loop region (between Ser 153 and Lys 160), which indicates that the 5.9-7.0 Ådistance is quite dynamic. The Arg158 can be at a favorable distance to the PO4 for catalysis due to the mobility of the loop. Binding to the ligand may also stabilize the loop to be an inappropriate distance for the catalysis. A similar phenomenon was also reported for a small cysteine protease of pineapple MD2, harboring mobile catalytic site Cys at the loop region. Structural modeling of the Cys is quite distant from the ligand, yet it exhibits remarkable activity (Razali et al., 2020). Further, it was also known that the loop structure is highly flexible and mobile and the dynamic loop structure facilitates the catalytic activity of PAP (Hjörleifsson et al., 2021;Razali et al., 2020). Alternatively, a water molecule might mediate the interaction between the guanidinium group of Arg158 to PO4. To note, the 5.9-7.0 Å-distance is sufficient to accommodate a water molecule. Interestingly, 4B also showed some metal-binding sites close to Arg 158, mainly Thr 147 and Asp 293. This leads to a possibility of metal ion-mediate interaction of Arg 158 to the substrate.
Further, a 2D interaction map between PAP and pNPP ( Fig. 6) showed that the interaction between PAP and pNPP is dominated by non-polar interaction, facilitated by Ser 94, Ser 141, Ala 146, Thr 147, Pro 148, Asp 293, and Glu 294. Interestingly, Arg158 was not depicted in the 2D map, which indicated that the interaction of this residue to the ligand might not be in a direct way. To note, the docking simulation in this study could only depict the direct interaction, with no water molecule involved. Further, the role of Arg 158 in catalysis is an exciting avenue to be experimentally investigated.
Further, when the essential residues for catalysis of ECAP were aligned to the residues of PAP, most of the residues were highly conserved, except for Glu 294 of PAP (corresponds to Lys 328 ECAP) (Fig. 7). To note, Asp 101, Asp 153 and Lys 328 of ECAP were reported to form secondary interactions with phosphate through either a water molecule or Arg-166. Regarding Lys 328 of ECAP, a salt link between Lys 328 and Asp 153 is formed, which further orients Asp 153 to interact with Mg 2+ for stabilizing the PO4 moiety in the substrate cavity. Lys 328 could also interact with water molecules which further protonates the oxygen atom of PO4 moiety for catalysis.
In PAP, Asp 153 of ECAP corresponds to Asp145, located close to Glu 294. Nevertheless, Glu 294 and Asp 145 would not form a salt bridge due to the repulsion between these residues. The absence of the salt link might lead to a "swing" position of Glu 294, which allows this residue to interact with the phosphate substrate. This assumption is supported by the 2D map (Fig. 6), which showed direct interaction between Glu 294 to the ligand. This direct interaction differs from Lys 328 of ECAP, which interacts with the ligand through a water molecule. In addition, a "swing" orientation was also observed when Lys 328 was mutated to His in ECAP, which further increased the ligand-binding stability (Holtz and Kantrowitz, 1999). This might be the reason for the high catalytic activity of PAP, as reported earlier (Lindang et al., 2021). In addition, the replacement of Lys 328 with Glu 294 might also change the charge of the substrate-binding cavity of PAP. Koutsioulis et al. (2010) andO'Brien et al. (2008) reported that the charge of the residues within the catalytic binding pocket was found to have a serious effect on the catalytic activity of AP. The ZnI metal biding residue is located next to Glu 294 and it is postulated that Asp 293 coordinates the Zn 2+ and facilitates the interaction of these two residues. Nevertheless, the proposed indirect interaction can be experimentally confirmed by an x-ray crystallography study.
To note, it is not only PAP that has different residues at the Lys 328 position. The AP of Geobacillus thermodenitrificus strain T2, GtAP (A8WEG4), was also reported to have Glu at the position corresponding to Lys 328 of ECAP. In addition, the replacement of Lys 328 of ECAP to His can be seen in the human placenta (1EW2), halobacterium (2 X 98), ShrimpAP (1SHN), and Pyrococcus (Q9UZV2). Meanwhile, replacement of the Lys 328 to Trp was seen in Antarctic bacteria Tab5 (2IUC), Parageobacillus (C1K6P2), Bacillus subtillis (P1940), Vibrio (3E2D), and Halomonas (3WBH). This indicated that another residue could take over the role of Lys 328 in the catalysis, which further leads to a unique catalytic mechanism due to the specific part of the non-Lys residue at this position.

Conclusion
Structural homology modeling showed that PAP forms a dimeric structure with canonical catalytic sites of Ser 94 and Arg 158. The sequence alignment of PAP revealed the binding sites for Zn 2+ and Mg 2+ ions, which are suspected of playing an essential role in catalysis. The docking simulation of PAP to the pNPP substrate and comparison with ECAP revealed that the uniqueness of Glu 294, which correspond to Lys 328 of ECAP, might account for the unique catalysis mechanism employed by PAP. This uniqueness may also contribute to the high activity of PAP and is promising to be further harnessed for further applications.