Cloning of the Endoglucanase Gene from Bacillus amyloliquefaciens PSM 3.1 in Escherichia coli Revealed Catalytic Triad Residues Thr-His-Glu

Problem statement: An Indonesian marine bacterial isolate, Bacillus amyloliquefaciens PSM 3.1, was isolated for its ability to hydrolyze cellulose. A 1500-bp nucleotide fragment was amplified from the chromosomal DNA using primers directed against conserved sequences of Bacillus endoglucanase genes obtained from GenBank. Approach: The fragment was cloned and expressed in Escherichia coli. Results: The endoglucanase gene (eglII) had an open reading frame of 1500 nucleotides encoding a protein of 499 amino acids. The EglII protein belonged to Glycosyl Hydrolase family 5 (GH5) and carried a Cellulose Binding Module 3 (CBM 3). The structure model of the EglII protein indicated that the catalytic residues are likely Glu169 (proton donor) and Glu257 (nucleophile) and that the catalytic triad residues are Thr256, His229 and Glu169. The EglII endoglucanase exhibited an optimum pH of 6.0 and an optimum temperature of 50°C, and the enzyme tolerated high salt concentrations. Conclusion/Recommendations: The EglII endoglucanase is a promising candidate for many applications in biomass degradation.

Endoglucanases from a wide range of terrestrial organisms have been reported; however, few endoglucanases from marine sources are known. To explore Indonesian marine microbial collections, we investigated genes encoding cellulose-hydrolyzing enzymes. Of the 32 isolates in our collection, Bacillus amyloliquefaciens PSM 3.1 showed the highest cellulolytic activity. This strain was isolated from the surface of the hard coral Galaxea sp., collected by scuba diving at a depth of approximately 3 m at Pantai Merah, Komodo Island, Flores, Indonesia. Here, we report the cloning of the endoglucanase gene of B. amyloliquefaciens PSM 3.1 in E. coli. A protein structure model built from the amino acid sequence deduced from the gene shows novel catalytic triad residues.

MATERIALS AND METHODS
Bacterial strains, vector and culture conditions: The bacterial strains and plasmids used in this study are listed in Table 1. B. amyloliquefaciens PSM 3.1 was cultivated in marine broth medium (0.25% yeast extract and 0.5% bacto peptone in seawater-water 3:1). The E. coli strains were cultivated in LB broth medium. For plating, 1% bacto agar was added to LB broth medium. For screening of recombinants, 100 μg mL⁻¹ ampicillin, 100 mM isopropyl-β-D-thiogalactopyranoside (IPTG) and 50 mg mL⁻¹ 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal) were added to the agar plates.

DNA manipulations:
Molecular cloning was performed mainly according to Sambrook et al. (1989). Genomic DNA was isolated from B. amyloliquefaciens PSM 3.1 using a Wizard Genomic Purification Kit (Promega). Digestion of DNA with restriction endonucleases, separation of fragments by agarose gel electrophoresis, ligation of DNA fragments, transformation of E. coli with plasmid DNA and extraction of recombinant DNA were all performed as described by Sambrook et al. (1989). DNA fragments were recovered from agarose gels using a Gel Extraction Kit (Qiagen).
Primers and PCR amplification conditions of the endoglucanase gene: Based on the published sequences of endoglucanase genes in the GenBank database (accession nos. AY044252, Z29076, DQ116829, DQ82954 and EU022560), two DNA oligonucleotide primers were designed and synthesized to allow PCR amplification of the entire gene from genomic DNA of B. amyloliquefaciens PSM 3.1. The primers were as follows:
• CelF: 5'-ATGAAACGGTCAATCTC-3' (deduced from a conserved sequence in the genus Bacillus)
• CelR: 5'-CTAATTTGGTTCTGTTCCC-3' (deduced from a conserved sequence in the genus Bacillus)
The PCR reaction mixture (25 μL) contained 1× Dream Taq reaction buffer (Fermentas), 0.2 mM forward and reverse primers, 0.2 mM dNTP, 2 mM MgCl₂, 2.5 U Dream Taq DNA polymerase (Fermentas) and 2 ng of DNA template. The PCR started with a 2 min denaturation at 94°C, followed by 30 cycles of 1 min at 94°C, 1.5 min at 51°C and 2 min at 72°C, and ended with a final extension of 5 min at 72°C. The amplified PCR products were analyzed by agarose gel electrophoresis.
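The primer properties and the length of the cycling program described above can be checked with a short sketch. It is only an illustration: the Wallace rule used for the melting temperature is a rough estimate for short oligonucleotides and is an assumption of this sketch, not the method the authors used to design the primers, and all function names are hypothetical.

```python
# Primer sequences (5'->3') exactly as listed in the text.
CEL_F = "ATGAAACGGTCAATCTC"
CEL_R = "CTAATTTGGTTCTGTTCCC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C: 2*(A+T) + 4*(G+C) (Wallace rule)."""
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 2 * at + 4 * gc


def total_cycling_minutes(initial, cycles, steps, final):
    """Hold times only (ramping ignored): initial + cycles * sum(steps) + final."""
    return initial + cycles * sum(steps) + final


if __name__ == "__main__":
    for name, seq in (("CelF", CEL_F), ("CelR", CEL_R)):
        print(f"{name}: {len(seq)} nt, GC={gc_fraction(seq):.2f}, "
              f"Wallace Tm ~{wallace_tm(seq)} C")
    # The reverse primer read on the coding strand ends with the stop codon.
    print("CelR on coding strand:", reverse_complement(CEL_R))
    # 2 min start, 30 cycles of (1 + 1.5 + 2) min, 5 min final extension.
    print("Program length (min):", total_cycling_minutes(2, 30, (1, 1.5, 2), 5))
```

Under these assumptions, CelF begins with the ATG start codon and the reverse complement of CelR ends with TAG, which is consistent with primers designed to span the entire open reading frame.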

Cloning and expression of endoglucanase gene:
The PCR product of about 1500 bp, predicted to be an endoglucanase gene (Fig. 1A), was cloned into the pGEM-T vector, yielding pG-Cel, which was then transformed into E. coli TOP10 for sequencing and assaying. Based on the sequencing result (Fig. 1B), two new primers (Egl-F and Egl-R) were designed and synthesized to amplify the endoglucanase gene in pG-Cel without its signal peptide sequence (nucleotides 1-87) and to add restriction sites for EcoRI and XhoI. The PCR product was cloned into the pET-20b vector, yielding the expression vector pET-Egl. The recombinant plasmid was then transformed into E. coli BL21 (DE3) for further analysis and expression experiments.
Plate assays for endoglucanase activity: The transformed E. coli BL21 (DE3) cells harbouring pET-Egl were screened for endoglucanase activity by the Congo-red assay (Wood et al., 1988). The transformed cells were overlaid on LB agar medium containing 1% Carboxymethyl Cellulose (CMC) and incubated at 37°C overnight. Following incubation, the plates were stained with 1% Congo-red for 5 min and destained with 1 M NaCl for 5 min. Positive clones were identified by a clear zone around endoglucanase-expressing colonies.
The cloned DNAs were sequenced by dye-terminator sequencing (Macrogen, Korea). A similarity analysis of the nucleotide sequence arising from PCR amplification was carried out using the BLAST(N) program (http://www.ncbi.nlm.nih.gov). A similarity analysis of the deduced amino acid sequence of the cloned gene was performed using the BLAST(P) program (http://www.ncbi.nlm.nih.gov). Protein structure modeling was carried out using SWISS-MODEL (Arnold et al., 2006).

Nucleotide sequence number:
The DNA sequence of the endoglucanase gene from B. amyloliquefaciens PSM 3.1 has been deposited at GenBank under accession no. GU390463.

Enzyme assays:
The positive clone producing endoglucanase was cultured in LB broth medium and induced with IPTG. The cells were harvested by centrifugation (7000×g for 20 min at 4°C), suspended in 50 mM universal buffer (succinic acid, Na₂HPO₄, glycine), pH 6.0, at a ratio of 1 g cells to 2 mL buffer and then sonicated (50% pulse for 15 s at 4°C, repeated 10 times). The endoglucanase was recovered by centrifugation (11000×g for 20 min at 4°C) of the sonicated suspension, and its activity was determined by measuring the release of reducing sugar with the dinitrosalicylic acid (DNS) method (Miller, 1959) using cellobiose as a standard. One unit of endoglucanase activity was defined as the amount of enzyme producing 1 μmol of reducing equivalents per minute under the assay conditions. All assays were conducted in triplicate.
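The unit definition above can be made concrete with a minimal sketch that converts a DNS absorbance reading into activity via a linear standard curve. The standard-curve points, the absorbance, the reaction time and the enzyme volume are all hypothetical values chosen for illustration; none of them come from the study, and the function names are likewise assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def activity_units_per_ml(absorbance, slope, intercept, minutes, enzyme_ml):
    """Activity in U/mL, where 1 U releases 1 umol reducing equivalents/min."""
    umol_released = (absorbance - intercept) / slope
    return umol_released / minutes / enzyme_ml


# Hypothetical standard curve: umol of standard vs. DNS absorbance.
STANDARD_UMOL = [0.0, 0.5, 1.0, 1.5, 2.0]
STANDARD_ABS = [0.00, 0.25, 0.50, 0.75, 1.00]

if __name__ == "__main__":
    slope, intercept = fit_line(STANDARD_UMOL, STANDARD_ABS)
    # Hypothetical assay: absorbance 0.60 after 30 min with 0.1 mL of enzyme.
    units = activity_units_per_ml(0.60, slope, intercept, minutes=30, enzyme_ml=0.1)
    print(f"slope={slope:.3f}, intercept={intercept:.3f}, activity={units:.3f} U/mL")
```

For triplicate assays, the same conversion would be applied to each replicate and the mean reported.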
Effect of pH, temperature and NaCl on endoglucanase activity: The optimal pH of endoglucanase activity was determined by evaluating the hydrolysis reaction in the range of pH 4.0-10.0 using 25 mM universal buffers at 50°C. The optimum temperature of endoglucanase activity was determined by evaluating the hydrolysis reaction in the temperature range of 26-80°C at pH 6.0. The influence of NaCl on the endoglucanase activity was determined by incubating the reaction mixture with different NaCl concentrations at 50°C and pH 6.0.
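Profiles of this kind are usually reported as activity relative to the maximum of the series. The sketch below shows one way to normalize such a series, find the optimum and report the range over which a given fraction of activity is retained. The readings are made-up numbers for illustration only and are not data from this study; the helper names are hypothetical, and the range helper assumes the retained conditions form one contiguous interval.

```python
def relative_activity(profile):
    """Scale each activity to percent of the maximum in the series."""
    peak = max(profile.values())
    return {condition: 100.0 * act / peak for condition, act in profile.items()}


def optimum(profile):
    """Condition (e.g. temperature or pH) with the highest activity."""
    return max(profile, key=profile.get)


def range_retaining(profile, threshold_pct):
    """Lowest and highest condition with relative activity >= threshold.

    Assumes the conditions above the threshold are contiguous.
    """
    rel = relative_activity(profile)
    kept = [condition for condition, pct in rel.items() if pct >= threshold_pct]
    return min(kept), max(kept)


# Hypothetical readings: temperature (deg C) -> activity (arbitrary units).
HYPOTHETICAL_TEMP_PROFILE = {26: 2.0, 40: 5.0, 50: 10.0, 60: 5.5, 70: 3.0, 80: 1.0}

if __name__ == "__main__":
    print("optimum:", optimum(HYPOTHETICAL_TEMP_PROFILE))
    print("relative:", relative_activity(HYPOTHETICAL_TEMP_PROFILE))
    print(">=50% retained between:", range_retaining(HYPOTHETICAL_TEMP_PROFILE, 50.0))
```

The same functions apply unchanged to a pH series or a NaCl series, since only the dictionary keys differ.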
Protein determination: Total protein concentration was determined by the Bradford method using bovine serum albumin as a standard (Bradford, 1976).

RESULTS AND DISCUSSION
Analysis of the endoglucanase gene: Endoglucanases are the most diverse class of enzymes amongst cellulolytic enzymes (http://afmb.cnrsmrs.fr/CAZY/index.html). Most endoglucanase genes are obtained by PCR or by constructing DNA libraries. Here, PCR using B. amyloliquefaciens PSM 3.1 chromosomal DNA as template yielded a product of about 1500 bp along with other non-specific bands (Fig. 1A). The 1500-bp band was recovered from the agarose gel and ligated into the pGEM-T vector, and the recombinant plasmid (pG-Cel) was transformed into E. coli TOP10.
To determine whether pG-Cel contained the targeted endoglucanase gene, the nucleotide sequence of the insert was determined (Fig. 1B). The nucleotide sequence of the endoglucanase gene of B. amyloliquefaciens PSM 3.1 was 97% identical to that of the endoglucanase (engA) gene of B. amyloliquefaciens strain UMAS1002 (GenBank no. AF363635.1) and 97% identical to that of the endoglucanase (bglC) gene of B. subtilis DLG (GenBank no. M16185.1B), with 17 base differences. On the basis of this sequence identity, the 1500-bp PCR product from the genomic DNA is considered to be an endoglucanase gene (designated the eglII gene).
To examine expression of the endoglucanase, the eglII gene in pG-Cel was PCR amplified using primers Egl-F and Egl-R and cloned into the pET-20b vector, and the recombinant plasmid (pET-Egl) was transformed into E. coli BL21 (DE3). The transformed cells were screened, and positive clones were identified by the presence of a clear zone surrounding the colonies, indicating CMC hydrolysis (Fig. 2A). A positive clone contained pET-Egl harboring about 1423 bp of the eglII gene (Fig. 2B). Resequencing of the eglII gene in pET-Egl showed no significant nucleotide sequence difference from the eglII gene in pG-Cel, apart from the absence of the signal peptide sequence. Thus, the eglII gene contained an Open Reading Frame (ORF) of 1500 nucleotides that started with an ATG start codon and terminated with a TAG stop codon (Fig. 1B).
Endoglucanases are modular enzymes consisting of two or more functional modules, such as catalytic modules and carbohydrate binding modules (Ohmiya et al., 1977). The amino acid sequence deduced from the eglII gene is shown in Fig. 3A. The ORF encodes a protein of 499 amino acids, designated EglII endoglucanase. The BLAST(P) result for EglII endoglucanase revealed a modular enzyme composed of a signal peptide (residues 1-29), a catalytic domain (residues 48-301) of Glycosyl Hydrolase family 5 (GH5) and a substrate binding domain (residues 356-437) of Cellulose Binding Module 3 (CBM 3) (Fig. 3B). This modular organization (GH5-CBM 3) is also found in endoglucanases of many members of the genus Bacillus, such as B. licheniformis strain B-41362 (Bischoff et al., 2007), B. amyloliquefaciens DL-3 (Lee et al., 2008) and B. subtilis (Li et al., 2009).
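The length arithmetic behind these figures can be verified with a small sketch. It uses only the numbers stated in the text (a 1500-nucleotide ORF including the stop codon, an 87-nucleotide signal peptide sequence and the reported domain boundaries); the function and variable names are hypothetical.

```python
def protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    # The stop codon occupies one codon but encodes no amino acid.
    return codons - 1 if includes_stop else codons


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive, 1-based residue range."""
    return end - start + 1


ORF_NT = 1500            # ORF length reported in the text
SIGNAL_PEPTIDE_NT = 87   # signal peptide coding sequence reported in the text
DOMAINS = {              # residue ranges reported from the BLAST(P) result
    "signal peptide": (1, 29),
    "GH5 catalytic domain": (48, 301),
    "CBM 3": (356, 437),
}

if __name__ == "__main__":
    total = protein_length(ORF_NT)
    signal = SIGNAL_PEPTIDE_NT // 3
    print("precursor length (aa):", total)
    print("signal peptide (aa):", signal)
    print("protein without signal peptide (aa):", total - signal)
    for name, (start, end) in DOMAINS.items():
        print(f"{name}: residues {start}-{end}, {span_length(start, end)} aa")
```

The 1500-nucleotide ORF corresponds to 500 codons, one of which is the stop codon, giving the 499 amino acids stated above, and the 87-nucleotide signal sequence corresponds to the 29-residue signal peptide.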
To identify the signal peptide of the EglII endoglucanase, the program SignalP 3.0 (http://www.cbs.dtu.dk/services/SignalP) was used. The result showed that the signal peptide comprises the first 29 residues (Fig. 3B). This sequence is conserved within the genus Bacillus, for example in the endoglucanase of B. amyloliquefaciens strain UMAS1002 (GenBank no. AF363635.1), where it is located at residues 1-29, and in the endoglucanase (BglC) of B. subtilis DLG (GenBank no. M16185.1B), where it is located at residues 1-38.
About 150 amino acid residues forming a β-sheet fold make up the CBM 3 of EglII endoglucanase (Fig. 4A). The conserved cellulose-binding residues Asn13-Asp52-Tyr53 of EglII endoglucanase corresponded to residues Asn20-Asp60-His61 of the Clostridium cellulolyticum CBM 3 (PDB no. 1G43). In addition, the conserved residues forming the crevice on the surface of the C. cellulolyticum CBM 3, Arg44-Tyr46-Tyr93-Glu96, corresponded to Arg37-Trp39-Tyr78-Glu80 in the CBM 3 of EglII endoglucanase. CBM 3 binds the cellulose surface and acts as an anchor, leading to strong adsorption of the enzyme onto the cellulose. It assists cellulose hydrolysis by directing a single cellulose chain into the active site of the enzyme (Shimon et al., 2000).
To detect the conserved catalytic domain and substrate binding domain, a structural model of EglII endoglucanase was built with SWISS-MODEL (Arnold et al., 2006). The catalytic domain model of EglII endoglucanase was a core domain with a (β/α)8 barrel structure. Its sequence had 69% identity with the reported Cel5A of B. agaradhaerens (Protein Data Bank/PDB no. 7A3H) and 69% identity with Cel5 of Bacillus sp. (PDB no. 1LF1). Multiple alignment of these catalytic domain sequences indicated that the conserved catalytic residues of EglII endoglucanase were Glu169 (predicted proton donor) and Glu257 (predicted nucleophile). Furthermore, residues His229 and Thr256 were located close to Glu169 (Fig. 4B). These residues (Thr-His-Glu) form a triad that differs from the triad in their counterparts from the genus Bacillus, which is commonly Ser-His-Glu (Davies et al., 1998; Shaw et al., 2002). The position of Thr256 together with His229 in the active site of EglII endoglucanase may control the protonation of the acid/base Glu169. Protonated Glu may facilitate the first step of the catalytic reaction through protonation of the substrate (Shaw et al., 2002).
Activity of the endoglucanase: Recombinant EglII endoglucanase was obtained from E. coli BL21 (DE3) harboring pET-Egl. The pH-activity profile of EglII endoglucanase was a saddle-like curve (Fig. 5A). The optimal pH was determined to be 6.0, and the enzyme retained more than 50% of its activity over a wide pH range (4-10). The apparent optimal temperature at pH 6.0 was 50°C, with 50% of maximum activity retained between 40 and 60°C (Fig. 5B). EglII endoglucanase was found to be salt tolerant (Fig. 5C): the enzyme retained more than 90% of its activity in the presence of up to 2 M NaCl. The overall characteristics of the recombinant EglII endoglucanase were similar to those of the wild-type EglII endoglucanase.
Fig. 5: Effects of pH (A), temperature (B) and NaCl (C) on the activity of EglII endoglucanase from B. amyloliquefaciens PSM 3.1.

CONCLUSION
The EglII endoglucanase described here shows cellulolytic activity across a broad pH range, moderate thermostability and high salt tolerance. On the basis of these characteristics, EglII endoglucanase is a promising candidate for applications in biomass degradation, the cellulose processing industry and biotechnological processes. Further study of the structure-function relationship of the enzyme will be essential.