Expression of Thioredoxin-Fusion Proteins of α-Gliadin, γ-Gliadin and Low Molecular Weight Glutenin, from Wheat Endosperm and their Domains in Enterobacteria

Claudia G. Benitez-Cardoza, Yves Popineau and Jacques Gueguen

Laboratorio de Investigación Bioquímica, Posgrado en Biomedicina Molecular, ENMyH, Instituto Politécnico Nacional, Guillermo Massieu Helguera 239, Fraccionamiento “La Escalera”, Ticoman CP 07320, México DF, México

Unité Biopolymères Interactions Assemblages, Rue de la Géraudière, B.P. 71627/44316 Nantes, Cedex 03, France

Abstract: Wheat seed storage proteins play a determining role in the viscoelastic properties of wheat gluten. The genes encoding α-gliadin, γ-gliadin and Low Molecular Weight glutenin from wheat endosperm, together with their N-central-repetitive and C-terminal domains, were subcloned into a thioredoxin expression system (pET102/D-Topo) and produced as fusion proteins in E. coli. The expression levels varied among constructs from 5 to 12% of the total proteins in E. coli. This indicates that producing prolamins as fusions to thioredoxin has the potential to yield milligram quantities of the proteins tested here. The identity of the synthesized polypeptides was confirmed by immunoblotting and antibody cross-reactions. Two cleavage methods for the removal of thioredoxin were assayed. Nevertheless, the attempts to remove the fusion partner failed for most of the constructs. The only construct that could be cleaved, either by Enterokinase or by acid cleavage, was the N-terminal domain of γ-gliadin. This construct also showed enhanced solubility compared with the rest of the polypeptides produced. Some aspects of the sequence that might contribute to the different behaviour of this construct are discussed. The results presented in this work open new alternatives for the production of large amounts of seed storage proteins, in order to further characterise their structure and interactions.


INTRODUCTION
The wheat seed storage proteins are a major source of protein in the human diet, and are responsible for the properties of wheat doughs that allow a wide range of food products. They are also implicated in wheat allergies [1] and Coeliac disease, an autoimmune condition triggered by some cereal proteins. [2] Gliadins and glutenins are the major storage proteins that accumulate in wheat endosperm cells during seed development. These polymers are among the largest protein molecules known in nature and are the most important determinants of the viscoelastic properties of gluten. [3] Glutenins consist of very large disulfide-linked polymers made up of high molecular weight and low molecular weight (LMW) subunits. LMW glutenins consist of 250-300 residues forming two domains. [4] These proteins have a cysteine residue within the N-terminal domain, which is unlikely to form intramolecular disulfide bonds with cysteine residues within the C-terminal domain because of the rigidity imposed by the repetitive sequence. In addition, LMW glutenins have seven cysteine residues in their C-terminal domain, at least one of which is unpaired and thus available for intermolecular bonding. Gliadins, on the other hand, are divided into four groups according to their mobility in acid PAGE: α- (fastest mobility), β-, γ- and ω-gliadins (slowest mobility). All gliadins are low in ionic amino acids (histidine, arginine, lysine, and free carboxylic groups of aspartic and glutamic acid). Glutamic and aspartic acids occur almost entirely as amides. The γ-gliadins differ from α- and β-gliadins in the amount of aspartic acid, proline, methionine, tyrosine, phenylalanine and tryptophan. [5] In their sequence, α-gliadins have six cysteine residues, while γ-gliadins have eight. They form three and four disulfide bonds, respectively.
There are no free cysteines, and all S-S linkages are intramolecular, which prevents gliadins from participating in the polymeric structure of glutenins. [6] One of the problems in fully characterising the structural, functional and immunological properties of gliadins and glutenins has been the difficulty of extracting and purifying single polypeptides to homogeneity from natural sources.
The development of efficient methods for the production and purification of plant seed storage proteins using different heterologous systems would facilitate structure-function studies of these proteins. There have been some reports of the expression of prolamins in model systems such as E. coli. Nevertheless, previous attempts to produce large amounts of cereal seed storage proteins in E. coli have not always been fully successful. [7][8][9] A frequent problem in the expression of glutenins, gliadins and other prolamins in enterobacterial systems has been a low expression level. This might be due to toxicity of the proteins to the E. coli cells, or to unfavourable mRNA secondary structure.
Improper folding of recombinant prolamins has also been encountered. [10] These latter problems may arise from the lack of specific conditions, or of proteins such as BiP and PDI, which promote correct folding and disulfide bond formation.
It is believed that a eukaryotic host would be more suitable for the expression of proteins from other eukaryotes, because the eukaryotic nature of the host cells should facilitate the production of correctly folded and processed proteins. Therefore, prolamin research has frequently used eukaryotic systems such as yeast, transgenic plants and Xenopus oocytes to study the structure, functions and interactions of prolamins. [11][12][13] However, some eukaryotic systems have failed to produce correctly folded prolamins, which then required refolding in vitro. That was the case for -gliadin expressed in yeast [12] and for LMW glutenin expressed in insect cells using a baculovirus expression system. [14] Conversely, a β-turn-rich barley C-hordein was expressed in E. coli at high levels and showed correctly folded structures. [15] A different approach is to express prolamins in a thioredoxin fusion system (pET system), which has been used successfully to produce soluble target proteins that are otherwise insoluble in E. coli. For example, maize gamma-zein and its N- and C-terminal domains have been expressed in E. coli as thioredoxin fusions. This strategy significantly enhanced the solubility of the C-terminal domain of the fusion protein compared with the C-terminal domain without thioredoxin. [16] In contrast, the N-terminal domain and the full-length gamma-zein were mainly insoluble.
The authors suggested that the insolubility of full-length gamma-zein results from structural interactions of the N-terminus, and that the solubility of the C-terminal domain depends on proper disulfide bond formation. In other work, Tamas and co-workers were able to express C-hordein in E. coli as a correctly folded polypeptide. They suggested that the expression yield and correct folding of prolamins in heterologous systems depend on the protein structure rather than on the expression system. [15] In this work, we report the subcloning, bacterial expression and partial isolation of α-gliadin, γ-gliadin and Low Molecular Weight (LMW) glutenin from wheat endosperm, and of their respective N-repetitive and C-terminal domains, as thioredoxin fusion proteins.

MATERIALS AND METHODS

Amplification and cloning of genes.
Oligonucleotide primers (Invitrogen) were designed to amplify the α-gliadin, γ-gliadin and Low Molecular Weight glutenin genes coding for the mature proteins and their respective N-repetitive and C-terminal domains. The signal peptide sequences were excluded in all cases. The specific primers had the sequences shown in Table 1.
For the polymerase chain reaction (PCR), about 100 ng of cDNA was used in a reaction mixture containing 0.3 mM of each dNTP (Amersham Pharmacia Biotech), 300 nM of each primer, 1 mM MgSO4 and 1.25 units of Platinum Pfx DNA polymerase (Invitrogen). An initial denaturation step of 3 minutes at 94°C was followed by 30 cycles with denaturation, annealing and polymerisation temperatures of 94°C, 55°C and 68°C, respectively.
After the 30 cycles, the polymerisation temperature was maintained for 10 minutes, and the PCR products were then held at 4°C. Purified PCR fragments were subcloned into the pET102/D-Topo vector using the Directional Topo cloning system, according to the instructions in the manufacturer's manual (Invitrogen). The reaction mixture was used to transform chemically competent Top10 cells (Invitrogen), and recombinant plasmids were isolated using Qiagen purification kits (Qiagen). Clones were analysed by BamHI restriction digestion. The sequences of positive clones were determined using the vector-specific primers Trxfus (5´-TTCCTCGACGCTAACCTG-3´) and T7ter (5´-TAGTTATTGCTCAGCGGTGG-3´) (MilleGen Biotechnologies). The vector pET102 (Invitrogen) expresses thioredoxin fusion proteins that are cleavable at a specific enterokinase site. Our constructs were designed to contain a 6-His tag in the region connecting thioredoxin with the corresponding prolamin (Table 1, Figure 1).

Table 1: Primer sequences used for amplification of the α-gliadin (α), γ-gliadin (γ) and Low Molecular Weight glutenin (LMW) genes and their N- and C-terminal domains (prefixes N- and C-, respectively). Start and stop codons of the protein are shown in bold. The region coding for the 6-His tag is italicized.

Expression and partial isolation of recombinant proteins.
Plasmids containing prolamin inserts of the appropriate sequences were transformed into competent BL21(DE3)pLys cells (Invitrogen), which carry the gene for T7 RNA polymerase under the control of the lacUV5 promoter.
Pilot expression.
Pilot expression experiments were performed using freshly transformed single colonies grown in 0.05 litre cultures of LB medium containing 100 µg/ml of ampicillin, at 37°C with agitation at 250 rpm. At a culture density of A600 = 0.8, the cultures were either left at 37°C or transferred to incubators at 25°C or 30°C. After 20 minutes of temperature equilibration, isopropyl β-D-thiogalactopyranoside (IPTG) was added to a final concentration of 0.5 mM. Aliquots were taken every hour for 8 hours, and the culture density was measured. The cells were harvested by centrifugation at 5000 rpm for 10 minutes. The pellets were resuspended in buffer containing 50 mM Tris pH 8.0 and 6 M urea, and incubated at room temperature for 2 hours. Cell debris was removed by centrifugation at 18000 rpm for 45 minutes.
The supernatants were analysed by electrophoresis on 15% SDS-PAGE gels with a 6% stacking gel. Special care was taken to load the equivalent of the same amount of cells in each lane. The optimal temperature and incubation time after induction for maximal protein expression were determined from the intensity of the corresponding band on the SDS gels.
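The induction step described above reduces to a simple dilution calculation (C1·V1 = C2·V2). The sketch below is illustrative only: the function name is hypothetical and the 1 M IPTG stock concentration is an assumption, since the text does not state the stock used.

```python
def stock_volume_ml(final_mM: float, culture_ml: float, stock_mM: float) -> float:
    """Volume of stock solution (mL) needed to reach a target final concentration.

    Uses C1*V1 = C2*V2 and neglects the small volume added to the culture.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_ml / stock_mM


# Assumption: a 1 M (1000 mM) IPTG stock; the text does not state the stock used.
STOCK_MM = 1000.0

# Pilot expression from the text: 0.05 litre culture, 0.5 mM final IPTG.
pilot_ml = stock_volume_ml(final_mM=0.5, culture_ml=50.0, stock_mM=STOCK_MM)
print(f"IPTG stock to add: {pilot_ml * 1000:.1f} microlitres")
```

The same function can be reused for other culture volumes or final concentrations by changing the arguments.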

Medium scale expression and partial isolation.
Freshly transformed single colonies were grown in 0.4 litre cultures of TB medium in the same way as described for pilot expression. At a culture density of A600 = 0.8, the cultures were either left at 37°C or transferred to incubators at 25°C or 30°C, according to the results of the pilot expression. After 20 minutes of temperature equilibration, isopropyl β-D-thiogalactopyranoside was added to a final concentration of 0.4 mM.
The pilot expression indicated that 4 hours was the optimal incubation time after induction for all the constructs. As in the pilot expression, cell pellets obtained by centrifugation for 30 minutes at 5000 rpm were resuspended in 6 M urea, 50 mM Tris and incubated at room temperature for 2 hours. Bacterial cell debris was then removed by centrifugation (45 minutes at 18000 rpm). The supernatant was filtered (0.20 µm) and loaded onto iminodiacetic acid resin (Pharmacia Biotech) previously charged with 100 mM NiSO4 and pre-equilibrated with buffer containing 50 mM Tris and 6 M urea, pH 8.0. To remove non-specifically bound proteins, the Ni-column was washed with 30 column volumes of buffer containing 50 mM Tris pH 8.0, 300 mM NaCl, 20 mM imidazole and 6.0 M urea. The fusion proteins were eluted with three column volumes of buffer containing 50 mM Tris pH 8.0, 300 mM NaCl, 250 mM imidazole and 6 M urea.
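As an aid to reproducing the chromatography buffers listed above, the following sketch converts the stated molar concentrations into grams per batch. The molar masses are standard values for the free compounds (Tris base, NaCl, imidazole, urea), which is an assumption because the text does not specify the reagent forms; the dictionary and function names are hypothetical.

```python
# Standard molar masses in g/mol (assumption: Tris base and anhydrous reagents).
MOLAR_MASS = {"Tris": 121.14, "NaCl": 58.44, "imidazole": 68.08, "urea": 60.06}

# Concentrations in mM, taken from the buffer compositions given in the text.
BUFFERS_MM = {
    "equilibration": {"Tris": 50, "urea": 6000},
    "wash": {"Tris": 50, "NaCl": 300, "imidazole": 20, "urea": 6000},
    "elution": {"Tris": 50, "NaCl": 300, "imidazole": 250, "urea": 6000},
}


def grams_needed(recipe_mM: dict, volume_l: float) -> dict:
    """Mass of each component (g) for the requested buffer volume."""
    return {
        name: conc_mM / 1000.0 * MOLAR_MASS[name] * volume_l
        for name, conc_mM in recipe_mM.items()
    }


for buffer_name, recipe in BUFFERS_MM.items():
    masses = grams_needed(recipe, volume_l=1.0)
    summary = ", ".join(f"{k} {v:.2f} g" for k, v in masses.items())
    print(f"{buffer_name} (1 L): {summary}")
```

The pH adjustment to 8.0 is not modelled here and would still have to be done at the bench.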
Cleavage assays of the fusion proteins.
The fusion proteins were cleaved either with Enterokinase (EK, Biolabs) or by acid cleavage. For EK cleavage, samples were extensively dialysed against 20 mM Tris, 20 mM NaCl and 2 mM CaCl2, pH 8.0, at room temperature. The digestion was performed following the manufacturer's instructions, i.e. 16 hours at 23°C with 6.4 × 10⁻³ units of EK per milligram of fusion protein.
For acid cleavage, the eluted fusion proteins were extensively dialysed against 50 mM acetic acid (pH 3.5) at room temperature; the pH was then lowered to 2.0 by adding 1 M HCl to a final concentration of 10 mM. The digestion was performed under two different conditions: 50°C for 24 hours, or 55°C for 16 hours. The acid cleavage of the fusion protein was followed by reversed-phase HPLC (column: Nucleosil C18, 5 µm, 300 Å, 4 × 300 mm) using a gradient running from 100% buffer A (0.1% TFA in H2O) to 100% buffer B (75% acetonitrile, 24.92% H2O, 0.08% TFA).
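To relate a position in the HPLC gradient to the solvent actually reaching the column, the mobile-phase composition can be computed from the fraction of buffer B. This is a minimal sketch that assumes the percentages are by volume and that the two buffers mix ideally; the gradient duration is not given in the text, so the function (a hypothetical helper) takes the fraction of buffer B directly.

```python
# Buffer compositions in percent, as given in the text
# (buffer A: 0.1% TFA in water, so 99.9% water).
BUFFER_A = {"acetonitrile": 0.0, "water": 99.9, "TFA": 0.1}
BUFFER_B = {"acetonitrile": 75.0, "water": 24.92, "TFA": 0.08}


def mobile_phase(fraction_b: float) -> dict:
    """Composition (%) of the A/B mixture for a given fraction of buffer B (0 to 1)."""
    if not 0.0 <= fraction_b <= 1.0:
        raise ValueError("fraction_b must lie between 0 and 1")
    fraction_a = 1.0 - fraction_b
    return {
        component: fraction_a * BUFFER_A[component] + fraction_b * BUFFER_B[component]
        for component in BUFFER_A
    }


for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    mix = mobile_phase(fraction)
    print(f"{fraction * 100:5.1f}% B: acetonitrile {mix['acetonitrile']:.2f}%")
```

With this mapping, the percentage of buffer B at which a peak elutes can be reported as an acetonitrile concentration.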

SDS Electrophoresis.
The proteins were analysed by one-dimensional sodium dodecyl sulphate/polyacrylamide gel electrophoresis (SDS/PAGE) on 15% gels with 6% stacking gels, and stained with Brilliant Blue R250 (Sigma). Immunoblotting and antibody cross-reactions were performed as described by Denery-Papini et al. [19]

RESULTS
Plasmids containing prolamin inserts of the appropriate sequences were transformed into competent BL21(DE3)pLys cells. The mature peptides α-gliadin, γ-gliadin and Low Molecular Weight glutenin and their N-repetitive and C-terminal domains were expressed in individual cultures at different temperatures (pilot expression, 25°C, 30°C and 37°C). Aliquots were taken each hour after induction. The over-expression of every construct was analysed by SDS-electrophoresis. The optimal expression temperatures and induction times were determined by the intensity of the corresponding band on the gels.
The optimal temperatures after induction are shown in Table 2. Under our expression conditions, the optimal induction time was four hours after adding IPTG.
Medium-scale expression experiments were then carried out using the optimal post-induction temperatures observed in the pilot expression. Figure 2 shows the gels for the optimal expression conditions of each construct. Table 2 shows the expression level of each prolamin as a percentage of the total E. coli proteins. The expression levels varied among constructs from 5 to 12% of the total proteins in E. coli. This indicates that expressing prolamins as fusions to thioredoxin has the potential to yield milligram quantities of the proteins tested here. To confirm the identity of the synthesized polypeptides, immunoblotting and antibody cross-reactions were performed (data not shown).
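The text does not state how the percentages in Table 2 were obtained. One common way to express a band as a percentage of total protein is gel densitometry, and the sketch below illustrates that calculation under this assumption. The function name and the intensity values are hypothetical and are not data from this work.

```python
def percent_of_total(band_intensity: float, lane_intensity: float) -> float:
    """Band intensity as a percentage of the integrated intensity of the whole lane."""
    if lane_intensity <= 0:
        raise ValueError("lane intensity must be positive")
    if not 0 <= band_intensity <= lane_intensity:
        raise ValueError("band intensity must lie between 0 and the lane intensity")
    return 100.0 * band_intensity / lane_intensity


# Hypothetical integrated intensities in arbitrary units (not measured values).
example_band = 1200.0
example_lane = 10000.0
print(f"Expression level: {percent_of_total(example_band, example_lane):.1f}% of total protein")
```

Background subtraction and staining linearity are ignored in this sketch and would affect real measurements.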
In the gels of Figure 2, full-length α-gliadin is over-expressed together with another protein whose band appears at a lower molecular mass. The same result is observed for full-length γ-gliadin and full-length LMW-glutenin. These unexpected bands appear at a molecular mass very similar to that of the corresponding N-terminal domains.
Western blots showed specific immunological reactions of these unexpected bands with the corresponding antibodies. This might indicate that some truncated constructs were also produced, probably related to the repetitive domain sequences.
These truncated polypeptides might be either incompletely synthesised proteins or products of some kind of proteolysis. Further analysis is needed to determine which process gives rise to them.
It has been observed that glutenins and gliadins migrate in SDS/polyacrylamide gels with a lower mobility than expected from their molecular weight. This phenomenon has been attributed to the proline-rich repetitive domain. Therefore, it was not surprising to observe a significant decrease in the gel mobilities of the N-central-repetitive domains of α-gliadin, γ-gliadin and LMW-glutenin. The full-length LMW-glutenin and its C-terminal domain also showed decreased gel mobility, but less markedly than the repetitive domains.

Partial isolation of the fusion proteins was carried out by Ni affinity chromatography, and the fractions eluted from the Ni-column were analysed on SDS gels. The results are shown in Figure 3. The gels show several protein bands. It is clear that the fusion proteins are not homogeneous after affinity chromatography, and further purification steps would be needed to reach homogeneity. The bands of the putative truncated polypeptides seen in the full-length lanes of Figure 2 are retained after the first isolation on the Ni-column (Figure 3).
These observations reinforce the idea that some kind of truncated polypeptide is produced. The sequence scheme of our constructs is presented in Figure 1. The polypeptides are composed of thioredoxin (the first 115 residues) at the very N-terminus, followed by the Asp-Asp-Asp-Asp-Lys sequence specific for Enterokinase cleavage; residues 124 and 125 are Asp-Pro. This pair is quite labile to acid at high temperatures. It therefore constitutes an acid-cleavable site that, under selected conditions, allows recovery of the respective prolamin.
The last residues correspond to the sequence of each prolamin.
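The construct layout described above can be illustrated by locating the two cleavage motifs in a sequence and listing the fragments each method would release. This is a sketch on a short, hypothetical toy sequence, not one of the actual constructs; the helper names are also hypothetical. It relies on Enterokinase cutting after the lysine of Asp-Asp-Asp-Asp-Lys and on acid cleavage occurring between Asp and Pro.

```python
import re

EK_SITE = "DDDDK"  # Enterokinase recognition sequence; the cut falls after the lysine
ACID_SITE = "DP"   # acid-labile pair; the cut falls between Asp and Pro


def cut_positions(seq: str, site: str, offset: int) -> list:
    """Indices i such that the chain is cut between seq[i-1] and seq[i]."""
    pattern = f"(?={re.escape(site)})"  # lookahead so overlapping sites are found
    return [match.start() + offset for match in re.finditer(pattern, seq)]


def fragments(seq: str, cuts: list) -> list:
    """Split seq at the given cut indices."""
    bounds = [0] + sorted(set(cuts)) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical toy fusion: short N-terminal stand-in for thioredoxin, EK site,
# linker, Asp-Pro pair, 6-His tag, and a short prolamin-like repeat.
toy = "MSDKIIHLTG" + "DDDDK" + "LG" + "DP" + "HHHHHH" + "QQPFPQQPQQ"

ek_fragments = fragments(toy, cut_positions(toy, EK_SITE, offset=len(EK_SITE)))
acid_fragments = fragments(toy, cut_positions(toy, ACID_SITE, offset=1))
print("Enterokinase:", ek_fragments)
print("Acid:", acid_fragments)
```

Running the same search on a target that contains further Asp-Pro pairs would list additional acid fragments; whether that applies to the constructs studied here is not stated in the text.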
After the Ni-column, removal of the thioredoxin partner was assayed using Enterokinase or acid cleavage. The efficiency of Enterokinase cleavage was checked on SDS gels (data not shown). For most of the constructs, no appreciable cleavage product was observed on the gels.
The bands corresponding to the fusion proteins remained at the same intensities. The only construct that showed appreciable cleavage was the N-terminal domain of γ-gliadin. It is important to mention that this construct was readily soluble in the buffer composed of 20 mM Tris, 20 mM NaCl and 2 mM CaCl2, pH 8.0, at room temperature, whereas all the other constructs showed evident aggregation under the same conditions. This indicates that EK cleavage might have failed because the protease could not access its specific cleavage site. [20] To solve this problem, one alternative would be to assay further experimental conditions for EK cleavage, such as pH, buffer composition and temperature. Another would be to replace the Enterokinase site of the constructs with the specific cleavage site of another enzyme, using molecular biology techniques.

Acid cleavage of the proteins was followed by reversed-phase HPLC (µRPC C18) using a gradient running from 100% buffer A (0.1% TFA in H2O) to 100% buffer B (75% acetonitrile, 24.92% H2O, 0.08% TFA). The chromatograms obtained for most of the constructs showed a large ensemble of tiny peaks spread across the whole elution gradient. This might indicate that acid cleavage caused extensive proteolysis of the prolamins. The only construct that withstood the acid cleavage treatment was again the N-terminal domain of γ-gliadin. The results obtained with this construct are thoroughly discussed elsewhere. [20]

DISCUSSION
Seed storage proteins are considered the most important determinants of the viscoelastic properties of wheat gluten and, as such, of bread-making quality. Glutenins and gliadins interact to form a protein network in the ripe grain, the precursor of the rheological features of gluten and wheat flour mentioned above. Several sequences of glutenins and gliadins have been determined.
Nevertheless, we still lack much information about their three-dimensional structure and their interactions. To characterise both, large amounts of purified proteins are needed. The availability of a purified protein product is often a major limitation for its analysis. Consequently, in this paper we evaluated the expression of large amounts of α-gliadin, γ-gliadin and Low Molecular Weight glutenin, as well as their repetitive and C-terminal domains, as fusions to thioredoxin in E. coli. Other attempts to express seed storage proteins in prokaryotic, and even in eukaryotic, systems have not always been fully successful. [7][8][9][21] A frequent problem has been very low expression levels. In contrast, very good expression levels are reported here, demonstrating that with a proper fusion partner such as thioredoxin, E. coli BL21 is able to over-express and accumulate wheat prolamins. This strategy has been used before for the expression of maize gamma-zein and its N- and C-terminal domains, and for the production of periodic polymers modelled on the repetitive domain of wheat gliadins. [8,16] All the constructs contain an enterokinase-specific site for thioredoxin removal. Enzymatic cleavage was attempted following the manufacturer's instructions (Invitrogen). Nevertheless, these efforts failed, except for the N-terminal domain of γ-gliadin, probably because of poor accessibility of the protease to the cleavage site. To overcome this, the expression vector could be modified to remove the Enterokinase site and introduce a specific site for a different protease whose optimal activity occurs at low pH, or even in the presence of high concentrations of ethanol. It is known that prolamins show enhanced solubility at lower pH and at high concentrations of ethanol.
All our constructs also contain acid-labile sites for thioredoxin elimination. Acid cleavage was assayed under two experimental conditions: pH 2.1 at 55 °C for 16 hours, and 50 °C for 24 hours. Unfortunately, both conditions seem to be very harsh for most of the constructs assayed here. Only the N-terminal domain of γ-gliadin was properly cleaved at the expected sites. All the other constructs suffered extensive proteolysis, leading to collections of small polypeptides. Acid cleavage has been used before for recombinant periodic polypeptides modelled on the repetitive domain of wheat gliadins. Acid cleavage is easy and cheap to scale up. Therefore, it would be worthwhile to evaluate further experimental conditions to find the right one for each sequence. To explain the enhanced solubility (in buffer containing 20 mM Tris, 20 mM NaCl and 2 mM CaCl2, pH 8.0, at room temperature) and the resistance to acid cleavage of the N-terminal domain of γ-gliadin, a meticulous analysis of the sequences of each prolamin studied was carried out (Table 3). The sequences were compared without considering the thioredoxin moiety, because it is the same in all the constructs and its effects are therefore expected to be approximately the same in all the polypeptides studied. Firstly, the theoretical isoelectric point of the N-terminal domain of γ-gliadin is the lowest (5.97) of all the constructs; in particular, it is the furthest from the optimal pH for the enzymatic activity of Enterokinase. It is well known that the solubility of proteins decreases near their isoelectric points. Furthermore, this construct possesses the smallest percentage of non-polar residues. The grand average of hydropathicity (calculated as the sum of the hydropathy values of all the amino acids divided by the number of residues in the sequence [22]) of the N-terminal domain of γ-gliadin is the most negative of all the constructs.
Also, this polypeptide has the smallest number of charged amino acids (only one aspartic acid) and has no positively charged residues, whereas all the other constructs possess at least three charged residues. It is important to mention that the theoretical isoelectric point and the percentage of non-polar residues are not significantly different from those of the N-terminal domain of α-gliadin; nevertheless, it might be that all the characteristics mentioned, taken together, account for the differences in behaviour encountered. Although it was not possible for us to obtain most of the prolamins separated from their fusion partner, the results presented in this work open new alternatives for the production of large amounts of seed storage proteins, in order to further characterise their structure and interactions.
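The grand average of hydropathicity and the count of charged residues used in the sequence comparison above can be computed directly from a sequence. The sketch below assumes the Kyte-Doolittle hydropathy scale (the scale conventionally used for this index; the text cites [22] without naming it here) and counts Asp/Glu as negative and Lys/Arg as positive, leaving histidine out, which is also an assumption. The example sequence and function names are hypothetical, and the sequence is not one of the constructs studied.

```python
# Kyte-Doolittle hydropathy values (assumed scale for the index described in the text).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Sum of the hydropathy values of all residues divided by the sequence length."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in seq.upper()) / len(seq)


def charged_counts(seq: str) -> tuple:
    """Return (negative, positive) counts: Asp/Glu and Lys/Arg (His not counted)."""
    seq = seq.upper()
    negative = sum(seq.count(residue) for residue in "DE")
    positive = sum(seq.count(residue) for residue in "KR")
    return negative, positive


# Hypothetical glutamine- and proline-rich repeat, for illustration only.
example = "QQPFPQQPQQ"
print(f"GRAVY: {gravy(example):.2f}")
print("Charged residues (negative, positive):", charged_counts(example))
```

A more negative value indicates a more hydrophilic sequence on this scale, which is the sense in which the index is used in the comparison above.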