Structural Characterization at the Atomic Level of a Molecular Nano-Machine: The State of the Art of Helicobacter pylori Flagellum Organization

Corresponding Author: Giuseppe Zanotti, Department of Biomedical Sciences, University of Padua, Viale G. Colombo 3, 35131 Padua, Italy. Tel. ++39-0498276409. Email: giuseppe.zanotti@unipd.it

Abstract: Motility is fundamental for successful colonization and survival of the human pathogen H. pylori, a bacterium that colonizes about half of the human population. Motility is achieved through flagella. This review summarizes the present structural knowledge about the organization of H. pylori flagella, considering not only the structures of its own proteins, but also what is known about the flagella of other Gram-negative bacteria. Despite the limited amount of structural information about the H. pylori nano-machinery, sequence alignments and homology modeling make it possible to investigate the unique structural properties of several H. pylori flagellar proteins.


Introduction
The bacterium Helicobacter pylori colonizes the stomach of more than half of the world's population, making it one of the most successful bacterial pathogens (Rothenbacher and Brenner, 2003). It was isolated and characterized at the beginning of the 1980s (Marshall and Warren, 1984) and has since been the focus of intense research activity. It is a microaerophilic, non-spore-forming, spiral-shaped Gram-negative bacterium, transmitted from human to human possibly by the fecal-oral or oral-oral route (Blaser, 1998). To date, the genomes of several strains have been completely sequenced, among them HP26695, J99 and G27 (Tomb et al., 1997; Doig et al., 1999; Oh et al., 2006). A map of protein-protein interactions is available through the PIMRider server (Rain et al., 2001). Infection in humans is associated with a spectrum of disease outcomes, perhaps linked to potential selective advantages over long periods of human history (Suerbaum and Michetti, 2002; Suerbaum and Josenhans, 2007). While many H. pylori-infected individuals are clinically asymptomatic, most will exhibit some degree of gastritis. Approximately 10% of infected subjects will develop more severe gastric pathologies, such as peptic ulcer disease and atrophic gastritis. Approximately 1% of infected individuals develop gastric adenocarcinoma or lymphoma of the mucosa-associated lymphoid tissue (MALT lymphoma) (Blaser, 1998; Suerbaum and Michetti, 2002). The severity of symptoms largely depends on the genetic diversity of the infecting strain and particularly on specific genotypes of virulence-associated genes, such as the cag pathogenicity island (cag-PAI), which encodes a type IV secretion system (T4SS) that promotes delivery of the CagA effector into host cells, as well as a CagA-independent induction of interleukin-8 secretion via the host AP-1 and NF-κB signaling pathways (Backert and Selbach, 2008; Zanotti, 2011). H. pylori infections can be successfully cured with antibiotic treatment; unfortunately, the available therapies are beginning to lose efficacy because of the emergence of antibiotic resistance.

H. pylori's Flagellum
Success in colonization and survival of H. pylori depends on the regulation of important virulence traits, such as motility, acid resistance, detoxification and metal ion homeostasis (Danielli and Scarlato, 2010). Motility is achieved through flagella. Flagella are relevant for many bacteria, but they are of paramount importance for H. pylori, as they are necessary for the survival of the bacterium in the stomach, particularly during the initial phases of infection (Josenhans and Suerbaum, 2002; Ottemann, 2002; Amieva and El-Omar, 2008). In fact, to survive and colonize the host the bacterium has to avoid the very acidic milieu of the stomach lumen and its periodic mechanical clearance. Flagella and outer membrane adhesins allow H. pylori to swim through the mucus layer and adhere to gastric epithelial cells. In contrast to many other Gram-negative bacteria, Helicobacter (and Campylobacter) species reach an unusually high velocity in viscous media, possibly owing to their helical shape and to the presence of exclusively polar flagella (Lertsethtakarn et al., 2011). Flagella are complex organelles composed of approximately 30 different proteins, but many others are necessary for flagellar expression and assembly, for a total of at least 45 (listed in Table 1). Some of them have been identified as components of the flagellum, others are necessary for its assembly. The structural organization and control of flagella in Gram-negative bacteria have been thoroughly studied (Thomas et al., 2006; Chevance and Hughes, 2008; Lertsethtakarn et al., 2011; Paul et al., 2011). A flagellum can be divided into two main portions, the hook-basal body and the extracellular filament (Fig. 1). The former, in turn, can be divided into three substructures: (i) the base, localized in the inner membrane and extending into the cytoplasm; (ii) the rod and ring structures, located in the periplasm; and (iii) the hook, present on the surface.
The relevance of the flagellum is twofold. On the one hand, detailed knowledge of the molecular architecture of the H. pylori flagellum may help to fight the bacterium more effectively, since flagella are essential for colonization of the host. On the other hand, the flagellum is an ideal example of a molecular machine, and the mechanisms that bacteria have devised over billions of years of evolution to convert chemical into mechanical energy could eventually be copied for nano-technological purposes.
To date, the 3D structures of only seven proteins of the H. pylori flagellum (including flagellar chaperones) have been determined; another nineteen molecular models can be built by homology modeling with an automated server (Phyre2, http://www.sbg.bio.ic.ac.uk/phyre2/), thanks to the structures of homologs from other species. This review summarizes what is known about the structure and organization of the H. pylori flagellum, also taking into consideration what is known about the flagella of other bacteria. All the proteins mentioned in this review refer to the bacterial strain HP26695.

The Bacterial Motor
The bacterial flagellar motor (BTM) is a rotary nano-machine, able to self-assemble, that converts energy, generally derived from a flux of cations, into mechanical energy. The latter manifests itself in the rotation of a long filament that allows the bacterium to move through a viscous medium. The flagellar motor is generally composed of two elements, the stator and the rotor. The stator is non-covalently attached to the peptidoglycan layer. It interacts with the rotor, which is in contact with the MS-ring and is responsible for torque generation (Macnab, 2003).
Several studies have been performed on the bacterial flagellar motor of Escherichia coli and Salmonella sp. Although the crystal structures of the separate components of the torque ring are known, more than one model of the BTM has been proposed (for a review, see for example Stock et al., 2012).

Flagellar Basal Body Organization
In Gram-negative bacteria the basal body, which is embedded in the cell envelope, consists of a membrane ring, called the MS-ring, and a P-ring that passes through the periplasmic space and reaches the outer part of the membrane up to the L-ring (Fig. 1). The basal body has the role of transmitting the torque from the motor to the filament (Chen et al., 2011). Another feature common to most bacterial basal bodies is the presence of a Type 3 Secretion System (T3SS) export apparatus, which translocates unfolded flagellar proteins outside the membrane envelope through a central channel. It allows the assembly of the flagellar hook and of the filament beyond the membrane layer (Minamino and Macnab, 1999). It must be mentioned that flagellar systems and T3SS are evolutionarily related to each other (Abby and Rocha, 2012).

H. pylori Stator Proteins
The stator is composed of two different transmembrane proteins, MotA and MotB, associated in a complex of four copies of MotA and two of MotB (MotA4MotB2) (Terashima et al., 2008). The main function of the stator is to conduct protons across the cytoplasmic membrane; the proton flow, driven by the potential difference across the membrane, is used to generate torque. Proton translocation through the MotA4MotB2 complex induces a conformational change in the MotA cytoplasmic loop, which acts on the rotor via electrostatic interactions with FliG (Chen et al., 2011). There is very little information about MotA of H. pylori, but the structures of MotA and MotB are available for Salmonella typhimurium. MotA includes four transmembrane helices and a large cytoplasmic loop, with a charged residue in the latter that interacts with FliG (Zhou et al., 1998). The MotA transmembrane helices surround two MotB proteins. In particular, the transmembrane helix of MotB and two transmembrane helices of MotA constitute the proton channel (Blair and Berg, 1990; Stolz and Berg, 1991; Zhou et al., 1998). In Salmonella, MotB is divided into three different domains: a small cytoplasmic N-terminal domain (1-28), a single transmembrane helix (29-50) and a periplasmic C-terminal domain (51-209). MotA, owing to its highly hydrophobic nature, has been modeled only for the middle domain, composed of around 60 amino acids (Fig. 2A). Compared with MotA, MotB is much better characterized in H. pylori (Fig. 2B). O'Neill et al. (2011) have solved the structure of the middle and C-terminal part of the protein. The latter is composed of four pairs of alternating β-strands and α-helices, topologically arranged as βαβαβαβα, and is thought to be involved in peptidoglycan (PG) binding (Roujeinikova, 2008). In H. pylori, MotB is composed of an N-terminal cytoplasmic segment (residues 1-28), a single transmembrane helix (residues 29-50) and a large C-terminal periplasmic region (Muramoto and Macnab, 1998). The transmembrane helix contains an aspartate residue (Asp33) that has been hypothesized to be involved in proton translocation across the cell membrane. Finally, residues 149-270 of MotB show sequence similarity to other OmpA-like proteins (O'Neill et al., 2011). In solution, MotB behaves as a dimer; the β-sheets of the two subunits associate through strand β3 in an antiparallel edge-to-edge manner. The dimeric arrangement serves peptidoglycan binding (Kojima et al., 2009): it exposes the binding sites for the glycan chain on opposite sides, in such a way that two different glycan chains can be bound simultaneously.
In H. pylori, the MotA4MotB2 complex is hypothesized to remain inactive inside the membrane until it is incorporated into the motor, at which point the linkers connecting the transmembrane helices and the PG-binding domain extend. Notably, in the absence of the appropriate peptidoglycan maturation enzymes, MotB does not localize properly and this affects flagellar functionality (Roure et al., 2012). The opening of the channel is triggered by the electrostatic interaction between the MotA4 complex and the rotor protein FliG and induces the proton flow through the bacterial inner membrane. The periplasmic P-ring has been proposed to be a binding site for the stator MotB2 complex. Based on the crystal structure of MotB, the following conformational changes have been proposed: (i) unfolding of the linkers connecting the transmembrane part of the MotB protein with the C-terminal one; (ii) interaction with the PG layer; (iii) opening of the proton channel (O'Neill et al., 2011) (Fig. 3). It has been shown that the anchoring of the MotA/MotB complex is essential for motor function (Blair and Berg, 1991; Togashi et al., 1997).
From the superposition of H. pylori MotB with the same protein from Desulfovibrio vulgaris and Salmonella (Fig. 3C-D), it is evident that the N-terminal portion of the H. pylori protein assumes a very different conformation with respect to the others. Another peculiar feature is the different orientation of the α-helix in the Salmonella protein (amino acids 265 to 274) with respect to the H. pylori one (amino acids 231 to 250). These differences suggest a peculiar organization of the H. pylori stator inside the membrane layer.

H. pylori Switch Proteins
The switch complex is responsible for torque reversal. It allows the bacterium to reorient its swimming by switching rotation from counterclockwise (CCW) to clockwise (CW) (Lee et al., 2010). The motion is controlled by a chemotactic signal involving the proteins CheW, CheA and CheY. In brief, the coupling protein CheW breaks its link to the histidine kinase CheA, which auto-phosphorylates and becomes active, activating in turn the response regulator CheY. Once phosphorylated, the latter interacts with the N-terminal part of the rotor protein FliM, which induces the switch from CCW to CW (Sarkar et al., 2010).
Proteins involved in the switching of the rotor have been identified for several Gram-negative bacteria, but the best characterized are those from Salmonella. The switch components of the flagellum are FliG, FliM and FliN, present in 26, 34 and 136 copies, respectively (Thomas et al., 2001). They are arranged in a ring at the base of the flagellum. FliG, the main protein involved in torque generation, is directly in contact with the stator protein MotA, possibly through a single helix located in the C-terminal domain (Zhou et al., 1998). Several hypotheses about the exact position of FliG in the ring have been put forward; in particular, a large number of possible interactions between FliG and FliM have been explored. From 3D electron microscopy reconstructions of the Salmonella C-ring, it seems reasonable to believe that FliG is located on the cytoplasmic face of the MS-ring. A relevant interaction with FliM and FliN has been hypothesized, owing to the tendency of FliG to be disordered in the absence of the C-ring.
In general, three different domains of FliG are known: an N-terminal domain, which binds FliF and acts as an anchor in the C-ring (Levenson et al., 2012); a C-terminal domain involved in the electrostatic interaction with MotA (Lloyd and Blair, 1997); and a middle domain associated with FliM through the sequence EHPQR, highly conserved in most flagellated organisms. Another binding site for FliM is present in the FliG C-domain, but it is not yet clear how this interaction takes place. FliG and FliM are present in a 1:1 stoichiometry, according to a correspondence between FliG in the C-ring and FliF in the MS-ring. A 26-fold symmetry has been detected in both rings (Suzuki et al., 2004).
Three different architectures of the bacterial flagellar motor have been proposed, based on the crystal structures of single proteins and/or on the EM images of Salmonella flagellum.

Model A
The first model (Paul et al., 2011) places FliG on both the inner and the outer side of the C-ring, with the C-terminal part of the protein in the outer part. As seen in EM, the outer lobe of the electron density shows a 34-fold rotational symmetry, but only 26 FliG molecules are present. Model A proposes a solution to this contradiction. It assumes a different arrangement of the FliG molecules aligned on the 34 FliM subunits. It seems reasonable that FliG molecules are separated by several gaps in the outer lobes. The main feature of this model is that both the FliG C-domain and the FliG middle domain interact with FliM. In particular, it is proposed that all 26 FliG C-domains are bound to FliM monomers. The remaining FliM monomers bind the FliG middle domains. Although the geometrical problem of the rotor dimensions seems to be solved, a major problem of this model is that the free hydrophobic surface exposed to the solvent is quite high, around 21,000 Å².
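The counting argument behind Model A can be written out as a minimal sketch. The copy numbers (34 FliM, 26 FliG) come from the text; the function name and the returned keys are hypothetical and serve only to illustrate the bookkeeping, not any published implementation.

```python
def model_a_binding(n_flim: int = 34, n_flig: int = 26) -> dict:
    """Count FliM binding partners under the Model A assumption.

    Assumption (from the text): every FliG C-domain binds one FliM monomer,
    and the FliM monomers left over bind FliG middle domains instead.
    """
    if n_flig > n_flim:
        raise ValueError("Model A assumes at least as many FliM as FliG copies")
    bound_to_c_domain = n_flig                 # one FliM per FliG C-domain
    bound_to_middle_domain = n_flim - n_flig   # leftover FliM monomers
    return {
        "flim_bound_to_flig_c_domain": bound_to_c_domain,
        "flim_bound_to_flig_middle_domain": bound_to_middle_domain,
        # angular spacing of subunits around each ring, in degrees
        "flim_step_deg": 360.0 / n_flim,
        "flig_step_deg": 360.0 / n_flig,
    }


counts = model_a_binding()
print(counts["flim_bound_to_flig_middle_domain"])  # 34 - 26 = 8
```

With the default copy numbers, 8 FliM monomers are left to bind FliG middle domains, which is the origin of the gaps between FliG molecules that the model places in the outer lobes.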
Model B
The second model (Stock et al., 2012) imposes some intermolecular restraints between FliG proteins. In protein crystals, the FliG-FliG interaction is mediated by Armadillo Repeat Motifs (ARM), characterized by a three-helix fold. The motifs are contributed by the FliG middle domain (ARMm) and the FliG C-domain (ARMc). The interaction between ARMm and ARMc shields each monomer from contact with the bulk solvent. The molecule is forced to arrange in a right-handed superhelix. FliG can therefore be divided into a globular N-terminal domain, an ARM superhelix and a C-terminal domain (Stock et al., 2012). The final restrained position of FliG confines the N-terminal domain to the inner lobe of the C-ring, with its helices involved in MS-ring binding. The FliG C-domain is proximal to the membrane. Finally, the FliG middle domain is arranged to form an ARM superhelix (Lee et al., 2010). Even though the model seems to agree better with the EM data, no explanation is given for the symmetry mismatch.
Model C
The third model (Stock et al., 2012) was proposed after the structures of the FliG C-domain and middle domain were determined. Model C fits the X-ray structure into the EM electron density as model B does, but with some significant differences. For instance, the helix connecting the C-domain and the middle domain is not in close contact with the middle domain. Indeed, it was observed that the close position is not physiological. Finally, the contact between ARMc and ARMm occurs within the same subunit.
These three different models imply three different switching mechanisms. Model A requires the rotation of the FliG C-domain, because of its direct binding to FliM. The turn reorients the torque-generating charges, which causes rotational switching. The FliN-FliM C-domain interaction can be seen as a cooperative mechanism, which generates the torque after a chemotactic signal. In model B, on the other hand, the motion is essentially due to conformational changes in the helix connecting the FliG C-domain and the FliG middle domain. This movement causes the adjacent FliG protomer to rotate. Finally, in model C the movement is intrinsically similar to that of model B (Stock et al., 2012).
Unlike other bacteria, H. pylori encodes one more protein involved in flagellar switch regulation in its genome. The four motor switch proteins are identified as FliG (HP0352), FliM (HP1031), FliN (HP0584) and FliY (HP1030) (Fig. 4). Chen et al. (2011) demonstrated by cryo-EM studies that the main core of the C-ring in most bacteria is principally determined by the presence of FliG, FliM and FliN, but not FliY. According to Lowenthal et al. (2009), the FliY protein has an N-terminal domain that belongs to the CheC/CheX/FliY/FliM family. Owing to this feature, it is possible that FliY may substitute for FliM in the interaction with the FliG C-domain, but experimental confirmation is lacking. Following this hypothesis, the structural assembly of the H. pylori flagellum should be only slightly different from that of other organisms.
FliN (Fig. 5C) also plays a structural role in building the flagellar architecture. It binds the flagellar export protein FliH and co-localizes with FliI and FliJ. In H. pylori, the FliN C-terminal domain is fused with a phosphatase/CheC-like domain (McMurry et al., 2006; Paul and Blair, 2006). Moreover, FliN is involved in the direct binding of the regulator protein CheY-P (Paul and Blair, 2006). Lowenthal et al. (2009) aligned FliN and FliY sequences taken from well-characterized bacteria. They noticed highly conserved structural regions, even though some specific amino acids were not conserved. Taking the differences into account, the alignment data seem to suggest that both FliN and FliY are involved in export and protein-protein interaction, beyond the motility function. Moreover, fliY and fliN mutants were found to be only partially flagellated.

H. pylori Flagellar Export Apparatus
The general assembly of the export apparatus resembles the Type III Secretion System (T3SS), a virulence factor common to most Gram-negative bacteria. The two systems usually share several homologous proteins (Cornelis, 2006). In fact, the flagellar T3SS has been hypothesized to be also involved in the secretion of virulence factors in Campylobacter jejuni, a species closely related to H. pylori (Ó'Cróinín and Backert, 2012). Very little information is present in the literature about the flagellar export apparatus of H. pylori, but features in common with other bacteria can be detected. The export apparatus consists of six membrane proteins (FliO, FliP, FliQ, FliR, FlhA and FlhB) and three soluble ones (FliH, FliI and FliJ) (Minamino and Namba, 2004). All of them associate with the MS-ring and assemble together to build the export machinery. Among all the proteins listed, FlhA from H. pylori is the best characterized, especially its cytoplasmic domain, whose structure was solved in 2010 (Moore and Jia, 2010) (Fig. 4D). Biochemical studies have demonstrated a strong interaction between the membrane fragment of FlhA and FliF, which is the sole component of the MS-ring (McMurry et al., 2004). Interactions between FlhA and FlgM, a protein involved in the regulation of transcription of flagellar genes, have also been demonstrated (Colland et al., 2001).
In analogy with its homolog in the T3SS, FlhA presents an N-terminal part inserted in the cytoplasmic membrane, a C-terminal domain floating in the cytoplasm and a linker portion. The C-terminal domain contains a thioredoxin-like domain, an RNA recognition motif domain inserted into the thioredoxin-like domain, a helical domain and a C-terminal β/α-domain. The alignment was carried out on 98 atoms with an RMS of 2.45 Å. The C-terminal β/α-domain is characterized by a highly hydrophobic surface, which suggests the presence of a ligand-binding site. The protein, crystallized in a closed conformation, presents the RNA recognition domain and the C-terminal domain in close contact. Most of the protein domains seem to be highly conserved in structures from different organisms. The thioredoxin-like and the helical domains seem to be the most highly conserved in sequence. The most variable part is the linker segment, which nevertheless seems to be crucial for the correct conformation and position of the protein (Moore and Jia, 2010). FlhB in H. pylori is protein HP0770. FlhB from Salmonella typhimurium has been demonstrated to play a structural role in the formation of the rod and the basal body of the flagellum and to contribute to the regulation of hook length (Minamino et al., 1994; Minamino and Macnab, 1999). The hypothesis is confirmed in H. pylori by the study of mutated cells, which show an aflagellate, non-motile and non-pathogenic behavior (Foynes et al., 1999). From a structural point of view, a comparison of the cytoplasmic domain of the Salmonella protein with the H. pylori model reveals a significant divergence in the C-terminal portion (Fig. 5C).
HP1419 shows high sequence homology with FliQ from Salmonella typhimurium, which has been demonstrated to be a membrane protein involved in the flagellar export apparatus. H. pylori fliQ mutants show a reduced ability to adhere to gastric cells and are non-motile (Foynes et al., 1999). The experimental evidence supports the hypothesis that the protein is involved in the structural building of the flagellar export apparatus and has an influential role in bacterial adherence to the gastric membrane.
HP0685 is the membrane protein orthologous to the better characterized FliP protein of Salmonella typhimurium. FliP is another flagellar export component, which in Salmonella assembles to form a component containing five subunits. FliP seems to be involved in the export of flagellin components, since H. pylori fliP mutants are not able to export flagellar components (Josenhans et al., 2000).
FliO is largely conserved in the export apparatus of several different organisms. In H. pylori, its role is played by HP0583. fliO mutants are aflagellate (Tsang and Hoover, 2014). Since fliO knockout induces a decrease in the levels of FlhA and in the expression of the RpoN regulon (RpoN is involved in the expression of flagellar genes) in H. pylori cells, causing reduced levels of FlaA and of the hook protein FlgE, it has been postulated that FliO is essential for the transcription of RpoN-regulated flagellar genes. In H. pylori, FliO seems to present a large periplasmic N-terminal domain that is absent in the Salmonella protein.
Minamino (2014) has reported a detailed description of the flagellar basal body of Salmonella typhimurium. According to his description, FlhA (through its N-terminal domain), FliR and FliP associate with the MS-ring. Moreover, FlhA also binds to FliO, FliP and FliQ, while FliO interacts directly with FliP, which stabilizes the protein. Finally, a proximal position of the FliR and FlhB proteins has been highlighted. A general interaction between the whole export apparatus and the MS-ring can be supposed in most flagellated organisms, resulting in a multisubunit complex ending with the channel of the export apparatus (Minamino, 2014).
Nothing is known about the H. pylori ATPase proteins involved in the export of hook and filament proteins outside the cell membrane, but some considerations can be proposed based on homology with Salmonella proteins. The proteins arranged in the flagellar export apparatus interact with the FliI-FliHx complex, which guides substrates and ensures a correct assembly of the flagellum (Claret et al., 2003). FliI is a peripheral membrane ATPase essential for the correct function of the export apparatus (Vogler et al., 1991). It converts chemical energy, obtained from ATP hydrolysis, into the mechanical motion necessary for the protein export process. It is assembled as a homohexamer around the export gate, where it is able to bind to the cytoplasmic domains of the FlhA and FlhB proteins (Claret et al., 2003). FliI binds the FlgN chaperone as well, confirming its role in the recognition of specific proteins involved in flagellar growth (Thomas et al., 2004). FliH has an antagonistic role in FliI regulation. FliH counteracts FliI oligomerization (Lane et al., 2006) and ATPase activity by binding to the N-terminal region and forming the FliH2-FliI complex. On the other hand, FliI is not able to dock correctly to the export gate proteins if FliH is not bound, suggesting that FliH plays a crucial role in the FliI binding process (Minamino et al., 2003). The models of the H. pylori proteins are reported in Fig. 6A and B.

MS Ring Proteins
The MS-ring consists of FliF, which in H. pylori is supposed to be protein HP0351. The orthologous protein from Salmonella is well characterized: it assembles into a ring with a dimension of 25 nm, which is comparable with the M basal body. Moreover, an axial projection map of the ring reveals a central channel, involved in the export of flagellins. The ring is kept together mostly by electrostatic forces (Suzuki et al., 1998). Some attempts at homology modeling of FliF were made to predict the hypothetical topological arrangement of the protein, but the model obtained lacked the C-terminal part.

The Periplasmic Rod and Rings
Once all components of the base of the hook-basal body have been properly constructed, Helicobacter pylori flagellar assembly continues with the formation of the periplasmic elements: the rod (FliE, FlgB, FlgC, FlgF and FlgG), the P-ring (FlgI) and the L-ring (FlgH) (Fig. 1).
The three-dimensional image reconstruction of the basal body implies that the thin proximal portion of the rod is inserted into the cylindrical hollow (a distal portion of FliF) of the MS-ring. Genetic and biochemical data indicate a physical interaction between FliE and FlgB, suggesting that FliE acts as a structural adapter between the MS-ring and the rod.
FlgG is located in the distal part of the rod (Okino et al., 1989). For FlgB, FlgC and FlgF, neither the order along the rod nor the regulatory mechanisms that determine the rod length are clear. In vitro trials (Saijo-Hamano et al., 2004) did not result in successful self-assembly into the rod structure, unlike the case of flagellin (Aizawa et al., 1980) and other flagellar axial proteins (Kato et al., 1982; Vonderviszt et al., 1995; Furukawa et al., 2002). The reason could lie in the lack of properly folded terminal regions responsible for the formation of the inner core of the axial structure.
FlgI and FlgH form the P- and the L-ring, respectively, around the rod. Figure 1 shows a model in which the P-ring is located in the peptidoglycan layer and the L-ring in the outer membrane of the H. pylori flagellum (Lertsethtakarn et al., 2011). The smooth and mechanically stable rotation of the rod within the L- and P-rings requires two main properties: rigid backbones and flexible surface-exposed side chains. It has been thought that the symmetry mismatch allows rod proteins to interact with the circularly symmetric elements of the MS-ring (FliF protein) and of the L- and P-rings. In addition, this allows easy rotation of the rod inside the L- and P-rings. For the moment, structural data on these proteins from H. pylori or other similar organisms (like Salmonella sp. or E. coli) are still missing. Moreover, the rod is difficult to visualize by electron microscopy because the L-, P- and MS-rings usually cover its surface.

Characterization of rod proteins from Salmonella typhimurium in solution showed that all rod proteins have a tendency to aggregate (Saijo-Hamano et al., 2004), a fact that by itself explains the lack of structural information on these proteins.

The Hook
The hook is a highly curved, tubular structure that bridges the basal body and the filament. It is composed of about 120 copies of a single protein, FlgE. Until now, only the crystal structure of FlgE from Salmonella enterica subsp. enterica serovar Typhimurium (PDB ID 1WLG; Samatey et al., 2004) has been determined, at 1.8 Å resolution (Fig. 7). The structure corresponds to residues 71-369 out of a total of 402. The crystallized fragment of FlgE lacks both the N- and C-terminal regions, since the full-length protein formed filaments and thus failed to crystallize. The same behavior occurs during the crystallization of the full-length flagellin protein (see paragraph Extracellular filament). The authors who solved the crystal structure of FlgE built a model of the hook by using electron cryomicroscopy and image analysis, together with the docked crystal structure of FlgE. According to the density map, the hook is composed of three domains: the outermost domain at the surface (7.5 nm), the middle domain (5-6 nm) and the inner core domain that forms a tube (1 nm thick; 3 nm axial lumen). Samatey et al. (2004) assume that the terminal chains of FlgE are located in the inner core domain, in a similar way to those of flagellin in the filament (Yonekura et al., 2003).
The similarity between the crystal structure of FlgE from Salmonella typhimurium and the modeled 3D structure of FlgE from H. pylori is shown in Fig. 7. In domain 2 of the modeled HpFlgE, the portion from Ala 528 to Val 532 includes a small α-helix that is absent in StFlgE. In addition, an α-helix is present from Gly 652 to Ser 654 in domain 1 of HpFlgE, whereas the corresponding residues of StFlgE are flexible and form a loop. The sequence of FlgE from H. pylori is 718 amino acids long, almost double the size of the orthologous protein from other bacteria. It is interesting to highlight that E. coli and Salmonella share a sequence identity of 88% for FlgE, while the identity of the H. pylori sequence with the E. coli and Salmonella proteins is only 36 and 35%, respectively. The main part of the H. pylori sequence that is missing in other organisms spans residues 200 to 443. Even though it is not clear why this protein requires around 250 additional amino acids, this property could be mandatory for proper hook assembly in the extreme environment to which H. pylori is exposed, and the crystal structure of FlgE from H. pylori would possibly give some clues about it.
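As an illustration of how percent identity values of this kind are obtained, the following minimal sketch computes the identity between two sequences that have already been aligned. The function name, the gap character and the toy sequences are assumptions made for this example; they are not the FlgE sequences, and alignment programs differ in the denominator they use (here, only columns without gaps are counted), so the figures quoted above may rest on a different convention.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a.upper() == res_b.upper():
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Toy alignment (hypothetical sequences): 7 gap-free columns, 6 identical
print(round(percent_identity("MKT-AYIAK", "MRTQAY-AK"), 1))  # 85.7
```

A large insertion such as the one described for HpFlgE appears as a long run of gap columns in the shorter sequence; under this convention those columns do not enter the identity calculation.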
The proper hook assembly of H. pylori needs other proteins, like FliK, FlgD and FlhB. FliK is supposed to be involved in measurement of a correct hook length (Ryan et al., 2005;Kamal et al., 2007). The predicted role of FlgD is that of controlling the number of FlgE monomers that are used for the hook growing (Kubori et al., 1992). Ohnishi et al. (1997) demonstrated that FlgD is displaced by FlgK prior to filament formation, suggesting it is present only as an intermediate during the hook polymerization. FlhB (part of T3SS) is located in the inner membrane and helps the hook formation by interacting with the Cdomain of FliK. When this interaction occurs, it induces a signal for the termination of the export of proteins involved in hook assembly and a signal for the export of proteins necessary for filament formation (Minamino et al., 2009).
H. pylori FlgD is composed of 301 amino acid residues. To date, two crystal structures of FlgD have been solved, from P. aeruginosa (PDB ID 3OSV; Zhou et al., 2011) and X. campestris (PDB ID 3C12; Kuo et al., 2008). Neither structure includes the N-terminal domain, which is largely flexible. A similar behavior occurs in H. pylori FlgD, whose crystals have been grown by our group. A study of FlgD from E. coli (Weber-Sparenberg et al., 2006) indicated that the first 71 N-terminal residues represent a signal for export into the flagellar channel. The superimposition of FlgD structures is shown in Fig. 8. The structure of HpFlgD was modeled with the software Phyre2 (identity with the template 34%), limited to the part of the protein from residues 125 to 266. HpFlgD superimposes onto the PaFlgD structure with an r.m.s.d. of 3.4 Å for 89 aligned Cα atoms (Fig. 8A) and onto the XcFlgD structure with a calculated r.m.s.d. of 6.7 Å for 82 aligned Cα atoms (Fig. 8B). As in the other two homologs, the modeled HpFlgD is composed of two domains, mainly consisting of β-sheets and flexible loops. The difference is that HpFlgD comprises two α-helices in domain 1 that are not found in PaFlgD and XcFlgD. In addition, HpFlgD forms a small α-helix in the N-terminal part, as in XcFlgD, which is not found in the Pa homolog.
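For reference, the r.m.s.d. reported for a superimposition is the square root of the mean squared distance between paired Cα atoms, and therefore has units of length (Å). The minimal sketch below computes it for two coordinate sets that are assumed to be already superimposed and paired one-to-one; the optimal rotation and translation step performed by superposition programs is not included, and the function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D coordinates (in Å).

    Assumes the two structures are already superimposed and that
    coords_a[i] corresponds to coords_b[i] (e.g. aligned Cα atoms).
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")

    total_sq = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance for this atom pair.
        total_sq += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2

    return math.sqrt(total_sq / len(coords_a))


# Toy coordinates (hypothetical, not taken from any PDB entry).
model_ca = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template_ca = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
toy_rmsd = rmsd(model_ca, template_ca)
```

Because the value depends on which atoms are paired, an r.m.s.d. is only meaningful together with the number of aligned Cα atoms, as given in the text.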
There are also two proteins between the hook and the filament, called hook-associated proteins: FlgK and FlgL. These proteins form a very short hook-filament junction zone, important for adapting these two mechanically different structures. The hook is relatively flexible, while the filament works as a propeller and for this reason has a much more rigid structure. Accordingly, these two proteins should share structural characteristics with the hook and filament proteins. In fact, FlgK from Salmonella typhimurium is rich in both α- and β-structural elements (PDB ID 2D4Y), while FlgL is mainly composed of α-helices (PDB ID 2D4X). Modeled HpFlgK and HpFlgL have the same composition in terms of secondary structure elements (Fig. 9A and 9C). Moreover, the superimposition of HpFlgK onto StFlgK gives a calculated r.m.s.d. of 4.9 Å for 342 aligned Cα atoms (Fig. 9B), while the superimposition of HpFlgL onto StFlgL shows a higher r.m.s.d. of 11.4 Å for 149 aligned Cα atoms (Fig. 9D). The latter is explained by the presence in HpFlgL of two extra β-strands with respect to the Salmonella orthologue.

The Extracellular Filament
The biosynthesis of the flagellum ends with the assembly of the filament. Its function is that of a helical propeller, whose movement is generated by the flagellar motor. The H. pylori filament is composed of the major flagellin FlaA and the minor flagellin FlaB, co-expressed in different amounts, and a filament-associated cap protein, FliD.
In some other organisms (like P. aeruginosa and Salmonella) the filament is built through the assembly of a single flagellin, FliC (Maki-Yonekura et al., 2010; Song and Yoon, 2014). Electron cryomicroscopy data (Yonekura et al., 2003) have shown that 11 protofilaments are involved in filament assembly and that each filament is formed by the polymerization of a single flagellin, FliC. The H. pylori flagellins FlaA and FlaB are composed of 510 and 514 amino acids, respectively. The protein sequence identity between these two flagellins is 61%. Both flagellins are essential for H. pylori motility (Josenhans et al., 1995). It is thought that the FlaA/FlaB expression ratio in H. pylori depends on environmental factors (such as viscosity or pH), providing a mechanism that allows the bacterium to tune the filament properties for motility under a given condition. The proposed models of HpFlaA and HpFlaB are given in Fig. 10. Sequence identities of the models with the template are 26 and 29%, respectively. From Fig. 11 it is evident that both Hp-flagellins share similar characteristics. Superposition of HpFlaA onto HpFlaB shows an r.m.s.d. of 2.1 Å for 235 aligned Cα atoms. The comparison of the known crystal structure of the FliC protein from P. aeruginosa with the H. pylori flagellins (Fig. 11B and 11C) shows slight differences in their structural arrangement.
Finally, it is interesting to stress that flagellins that form the tubular filament have a structure similar to that of FlgE proteins in the hook, despite the fact that the hook and flagellin proteins have clearly different amino acid sequences.
The crystal structures of some chaperones of H. pylori filament proteins are known: FliS (PDB ID 3IQC), HP1076 (PDB ID 3K1H) and their complex (Fig. 12; PDB ID 3K1I). The role of HP1076 is not clear, but it is known that this protein appears only in Campylobacter-related species. Structural studies of the FliS-HP1076 complex suggest a role of HP1076 in flagellar biosynthesis. From Fig. 12 it is possible to see that the contacts between these two proteins are formed by helical stacking, with a measured association constant of 1.5×10⁷ M⁻¹ (Lam et al., 2010).
The role of FliS would be that of preventing the premature polymerization of filament proteins. The 100 residues of the C-terminal part are essential for FliS interaction with FlaB. The high (61%) sequence identity between FlaA and FlaB in H. pylori suggests that the same interaction could occur in both the FliS-FlaA and FliS-FlaB complexes. Interestingly, the activity of the FliS protein in H. pylori may not be limited to the flagellins: it has been hypothesized that FliS can interact with other flagellar proteins, like the hook-filament junction protein FlgK and the filament cap protein FliD (Lam et al., 2010). The importance of FliD goes beyond its chaperone role, since it is a possible marker for serologic diagnosis of H. pylori infection (Khalifeh Gholi et al., 2013).

Flagellar Protein Glycosylation
A significant number of H. pylori proteins are glycosylated (Champasa et al., 2014). Among them, the flagellins FlaA and FlaB are heavily glycosylated with the nine-carbon sugar pseudaminic acid (Schirm et al., 2003), a post-translational modification absolutely essential for the formation of functional flagella. In fact, deletion of any of the enzymes involved in the pseudaminic acid biosynthetic pathway results in non-motile bacteria (Schirm et al., 2003). In this respect, inhibition of the enzymes of this pathway could be a key to indirectly target H. pylori motility and consequently stop colonization.

Conclusion
Although structural information about bacterial flagella is available from species like Salmonella typhimurium and Escherichia coli, the H. pylori flagellum is still in the relatively early stages of structural investigation. For the moment, the crystal structures of the C-ring proteins HpFliG and HpFliM and their complex, of the cytosolic domain of the T3SS protein HpFlhA, of the flagellar motor stator component HpMotB, and of the flagellin chaperone HpFliS and its complex with HP1076 have been solved. We still lack the structures of the H. pylori proteins that build the periplasmic rod and rings, the hook and the extracellular filament. Nevertheless, only nineteen structural models could be constructed on the basis of the existing crystal structures of orthologs from other species. Structural investigations of the H. pylori flagellar proteins could, together with the understanding of flagellar gene transcription and chemotactic response, explain how this bacterium achieves a functional rotary nano-machine required for swimming to a more favorable environment during the infection process.

Funding Information
This study was supported by PRIN 2010-2011 "Unraveling structural and functional determinants behind Helicobacter pylori pathogenesis and persistence". V.L. and I.P. were partially supported by fellowships from the same project.

Author's Contributions
All three authors contributed to writing the paper.

Ethics
There are no ethical issues.