In Silico Characterization and Motif Election of Neurotoxins from Snake Venom

Corresponding Author: Mahmudul Hasan, Department of Genetic Engineering and Biotechnology, Shahjalal University of Science and Technology, Sylhet-3114, Bangladesh. Email: hasan_sust@yahoo.com

Abstract: Snake venom is a mixture of many biological components. Snakes use their venom armory to tackle different prey and predators in an adverse natural world. Among the various components of snake venom, neurotoxins play an important role by blocking neuromuscular transmission through selective binding to muscle nicotinic acetylcholine receptors (nAChR). A set of 30 reference protein sequences representing snake venom neurotoxins was retrieved from the NCBI protein database and characterized by physicochemical properties, Multiple Sequence Alignment (MSA), phylogenetic analysis and motif election. The physicochemical properties of the selected proteins were analyzed using ExPASy’s ProtParam tool, and the Molecular Weight (M.Wt) of most proteins was found to be around 10000 Da. The isoelectric points (pI) of nearly all the proteins were basic. The Aliphatic Index (AI) suggests that the neurotoxins differ in thermostability, as 16 proteins showed an AI above 70 and the others an AI below 70. The negative GRAVY values indicate better interaction with water. Secondary structure prediction was done with SOPMA, which showed that random coils dominated all other conformations. Multiple sequence analysis and phylogenetic analysis of the neurotoxins were carried out with MEGA 5. Motif election was done with MEME, which identified motif 1 (21 sequences), motif 2 (11 sequences) and motif 3 (3 sequences); Pfam searches assigned these regions to the DUF3963, Toxin_1 (PF008) and CAP (PF0188) protein families, respectively. Motif 2 gives insight into the functional domain of neurotoxins and also suggests a degenerate primer for the neurotoxin protein family.


Introduction
The venomic composition of snakes is a mix of biologically active proteins and polypeptides. The primary function of snake venom is the incapacitation and immobilization of prey (as an offensive armory). Venom that evolved to aid in catching prey exhibits fatal and enfeebling effects. The secondary function of venom is to serve as defensive machinery against predators. Snake venom also assists in digesting the varied diets of snakes (Kang et al., 2011). Besides this, snake venom is considered an important biological resource with various applications in human welfare. Several isolated snake venom proteins with a known mode of action have found practical application as pharmaceutical agents, diagnostic reagents or preparative tools in haemostaseology, neurobiology and complement research (Stocker, 1999). The use of snake venom as medicine has been known for centuries. It is over sixty years since it was first realized that the physiologically active components of snake venoms might have therapeutic potential (Sanjoy et al., 2002). In the Unani system of medicine, cobra venom has been used as a tonic, as a hepatic stimulant and for revival in collapsed conditions (Debnath et al., 1972). Venoms of viper, Crotalus, cobra and Lachesis are also routinely used in homeopathic medicine. Chinese physicians use snake venom products routinely to treat stroke and view them as effective and relatively safe (Senior, 1999). Natural protease inhibitors to haemorrhagins in snake venom and their potential use in medicine have also been reported (Perez and Sanchez, 1999). Snake venom has been used to develop newer drugs to combat various diseases, including cancer. Calmetta et al. (1993) investigated the use of cobra venom in the treatment of cancer in mice (Gomes et al., 2001). Cobra venom in extremely minute doses was shown to produce analgesic effects, which led to the possibility of therapeutic use of cobra venom in arthritis and cancer (Match, 1936).
It is important to study the snake venom proteins that are of pharmacological value. Among the different venom components, snake venom cytotoxins and short neurotoxins are non-enzymatic polypeptide candidates (Yee et al., 2004). Short neurotoxins exert their effect by blocking neuromuscular transmission through selective binding to muscle nicotinic acetylcholine receptors (nAChR) (Changeux, 1990). Venoms of several snakes are known to cause muscular paralysis, and several neurotoxic components that inhibit neuromuscular transmission by attacking different targets have subsequently been isolated. Neurotoxins from snake venom have been used in pharmacological and biochemical studies of nicotinic acetylcholine receptors (nAChRs) in neurotransmission and at the neuromuscular junction (Takacs et al., 2004).
The aim of the present study was to analyze the amino acid sequence diversity, secondary structure, conservation pattern of amino acid residues and phylogenetic relationships of snake venom neurotoxin proteins from some common snakes of different regions of the world. The study characterizes the physicochemical and structural properties of snake neurotoxins and contributes to a better understanding of their conserved motif structure.

Materials and Methods
A set of 30 neurotoxin sequences (Table 1) was retrieved from the National Center for Biotechnology Information (NCBI) protein database. The sequences represent neurotoxins of snakes from various regions of the world.
The physicochemical properties of the neurotoxin proteins were computed using ExPASy's ProtParam tool; these properties can be deduced from a protein sequence. The computed isoelectric point (pI) is useful for developing buffer systems for purification by the isoelectric focusing method (Sivakumar et al., 2007). The instability index provides an estimate of the stability of a protein. A protein whose instability index is smaller than 40 is predicted to be stable; a value above 40 predicts that the protein may be unstable (Guruprasad et al., 1990). The aliphatic index of a protein is defined as the relative volume occupied by aliphatic side chains (alanine, valine, isoleucine and leucine). It may be regarded as a positive factor for the increase of thermostability of globular proteins (Walker, 2005).
The secondary structure was predicted with the Self-Optimized Prediction Method with Alignment (SOPMA), which was employed to calculate the secondary structural features of the selected protein sequences (Neelima et al., 2009). This method calculates the content of α-helices, β-sheets, turns, random coils and extended strands. SOPMA is a neural network based method; global sequence prediction may be done by this method (Prashant et al., 2010).
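The aliphatic index defined above, and the GRAVY score reported alongside it in the Results, can both be computed directly from a sequence. The following is a minimal sketch, not the authors' pipeline: it assumes the formulation documented for ProtParam, in which the aliphatic index is X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)) with X in mole percent, and GRAVY is the mean Kyte-Doolittle hydropathy per residue. The coefficients, the hydropathy table and the toy sequences are not given in the original text.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not listed in the paper).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aliphatic_index(seq: str) -> float:
    """Aliphatic index: Ala + 2.9*Val + 3.9*(Ile + Leu), in mole percent."""
    seq = seq.upper()
    n = len(seq)

    def mole_percent(residue: str) -> float:
        return 100.0 * seq.count(residue) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Toy peptides for illustration only; they are not sequences from Table 1.
for toy in ("AVIL", "KKDE"):
    print(toy, round(aliphatic_index(toy), 2), round(gravy(toy), 3))
```

A negative GRAVY value from this function corresponds to the hydrophilic behaviour discussed in the Results, and a larger aliphatic index to a larger share of aliphatic side chains.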
Motif election is very important for predicting the probable domains of neurotoxins. Motif election and domain analysis were done using MEME (http://meme.nbcr.net/meme/). Neurotoxins from different organisms were subjected to multiple sequence alignment with ClustalW2. Phylogenetic analyses based on protein sequences were carried out using the maximum-likelihood method with MEGA version 5.2.2.

Results
The physicochemical properties of the neurotoxins were predicted using the ProtParam tool. ProtParam computes the following parameters: Molecular Weight (M.Wt), theoretical pI, Instability Index (II), Aliphatic Index (AI) and grand average of hydropathicity (GRAVY) (Table 2). The molecular weight of most neurotoxins is around 10000 Da. The highest molecular weight, 26,914.3 Da, was found in Gloydius blomhoffii (Q8JI40.1). The isoelectric point (pI) is the pH at which the surface of the protein is covered with charge but the net charge of the protein is zero. The computed pI values showed that the neurotoxins were basic in nature (pI > 7), except those of Daboia russelii (A8CG87.1) and Gloydius blomhoffii (Q8JI40.1). The instability index showed that, except for five proteins, all of the studied neurotoxins were stable, as their instability index stayed below 42. The instability index is used to estimate the in vivo half-life of a protein (Guruprasad et al., 1990). Proteins reported to have an in vivo half-life of less than 5 h showed an instability index greater than 40, whereas those with a half-life of more than 16 h (Rogers et al., 1986) have an instability index of less than 40. Among the studied neurotoxins, 18 sequences had an instability index below 40, which indicates a stable nature with a half-life of more than 16 h. For the Aliphatic Index (AI), the neurotoxins fell into two groups that suggest differing thermostability: 16 proteins showed an AI above 70 and the others an AI below 70. The GRAVY values showed that most proteins (20 sequences) have low GRAVY values, which indicates better interaction of those proteins with water.
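The cut-offs used in this paragraph (instability index below 40, pI above 7, AI above 70, negative GRAVY) can be applied to a table of ProtParam output in a few lines. This is an illustrative sketch only: the record type, the function names and the three rows are hypothetical placeholders and are not values from Table 2.

```python
from dataclasses import dataclass


@dataclass
class ProtParamRow:
    """One row of ProtParam output (hypothetical record layout)."""
    accession: str
    pi: float
    instability_index: float
    aliphatic_index: float
    gravy: float


def classify(row: ProtParamRow) -> dict:
    """Apply the thresholds quoted in the text to a single protein."""
    return {
        "stable": row.instability_index < 40.0,      # Guruprasad cut-off
        "basic": row.pi > 7.0,                       # pI above neutral pH
        "high_aliphatic": row.aliphatic_index > 70.0,
        "hydrophilic": row.gravy < 0.0,              # negative GRAVY
    }


def tally(rows) -> dict:
    """Count how many proteins fall on the 'True' side of each threshold."""
    counts = {"stable": 0, "basic": 0, "high_aliphatic": 0, "hydrophilic": 0}
    for row in rows:
        for label, flag in classify(row).items():
            counts[label] += int(flag)
    return counts


# Made-up values for illustration; replace with the real Table 2 rows.
EXAMPLE_ROWS = [
    ProtParamRow("toy_1", 8.6, 35.2, 74.1, -0.31),
    ProtParamRow("toy_2", 6.4, 44.8, 62.0, -0.55),
    ProtParamRow("toy_3", 9.1, 39.9, 70.5, 0.12),
]

print(tally(EXAMPLE_ROWS))
```

Running the same tally on the full table would reproduce counts of the kind reported above, such as the number of sequences with an instability index below 40.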
The secondary structure pattern of the studied neurotoxins shows whether a given amino acid lies in a helix, strand or coil. The secondary structure prediction showed that random coil predominates over the other structures, whereas β-turns are the least frequent conformation (Table 3). In all the neurotoxins analyzed, β-turns accounted for a very small percentage of the conformation (below 10%). In most of the neurotoxins, extended strands ranged from 10 to 30%.
Phylogenetic analyses based on protein sequences were carried out using the maximum-likelihood method with MEGA 5.2.2. The resulting tree is shown in Fig. 1. The phylogenetic tree revealed three major clusters. Cluster 1 is the largest, containing 12 sequences, of which 6 (A8S6B0.1, P01384, AEHO5953, AAD40974.1, A6MFK5, A8HDK7) are of Australian origin. Cluster 1 also contains proteins (P82662, C6JUP2, P58370, C1IC47.1) of Asian and South American origin. Cluster 2 contains 11 sequences, and most of them (P19959, Q5UFR8, P10459, Q9YGC2, Q6EER3) represent ocean snake species. Cluster 3 contains 7 sequences representing Asian (P0991, Q8JI40, P62388), American (Q7ZT99, BOVXV6, P86095) and African (P01424) snake species.
Motif analysis of the sequences was conducted using MEME. The MEME output shows a colour graphical alignment as well as the regular expression of each motif. The block diagram, on the other hand, shows the start and end points of each motif within the amino acid sequences, along with the motif length. The E-value describes the statistical significance of a motif. According to Baker et al. (2002), by default MEME looks for up to three motifs, each of which may be present in some or all of the input sequences. MEME chooses the width and number of occurrences of each motif automatically in order to minimize the E-value of the motif, that is, the probability of finding an equally well-conserved pattern in random sequences. The motif overview reports E-values of 6.8e-350 for motif 1, 3.3e-210 for motif 2 and 1.4e-074 for motif 3. The E-value, width, sites, sequence logo and regular expression of each motif are given in Table 4. All three motifs (Fig. 2) found by MEME were submitted to Pfam (protein family database) to identify the protein family domains related to the MEME motifs. Multiple sequence alignment with ClustalW2 shows a significant alignment pattern (Fig. 3).
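E-values as small as those reported here cannot be compared as ordinary double-precision numbers, because anything below roughly 5e-324 underflows to zero. A minimal sketch of one way to rank the three motifs is to work with log10 of the E-value, parsed with Python's standard `decimal` module. The E-value strings are taken from the text; the function and variable names are illustrative assumptions.

```python
from decimal import Decimal

# E-values as reported for the three MEME motifs.
MOTIF_EVALUES = {
    "motif 1": "6.8e-350",
    "motif 2": "3.3e-210",
    "motif 3": "1.4e-074",
}


def log10_evalue(text: str) -> float:
    """Return log10 of an E-value given as text, avoiding float underflow."""
    return float(Decimal(text).log10())


def rank_motifs(evalues: dict) -> list:
    """Order motif names from most to least significant (smallest E-value first)."""
    return sorted(evalues, key=lambda name: log10_evalue(evalues[name]))


# A plain float loses motif 1 entirely: the value underflows to 0.0.
print(float("6.8e-350"))

for name in rank_motifs(MOTIF_EVALUES):
    print(name, round(log10_evalue(MOTIF_EVALUES[name]), 2))
```

On the log scale the three motifs remain well separated, with motif 1 the most significant and motif 3 the least.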

Discussion
The physicochemical properties of the neurotoxins cover molecular weight, theoretical pI, instability index, aliphatic index and grand average of hydropathicity. Most of the studied neurotoxins are stable, basic in nature and can show better interaction with water. Random coil is the predominant secondary structure of the studied neurotoxins. The phylogeny represents the evolutionary spectrum of the studied neurotoxins. Motifs are very important for representing the domain of a particular protein family. A motif helps to find the functional domain of proteins and also represents the conserved pattern in protein sequences, from which degenerate primers for those sequences can be designed. Motif 1 shows similarity to the Domain of Unknown Function (DUF) protein family. More than 20% of all protein domains are currently annotated as "Domains of Unknown Function" (DUFs). Evolutionary conservation suggests that many of these DUFs are important (Goodacre et al., 2013). Motif 2 represents the domain of the Toxin_1 (PF008) protein family. Multiple sequence alignment with ClustalW2 exhibits a significant alignment pattern in the region where motif 2 was found. A conserved cysteine is found at one position in all sequences, and other conserved cysteine positions are found in most sequences. Cysteine residues form disulphide bonds, and their conservation also indicates the existence of a functional domain. Motif 2 may therefore mark the functional domain of neurotoxins, which may be related to the interaction of neurotoxins with neurotransmitters. Motif 3 represents the domain of the CAP (PF0188) protein family.
CAP protein family members are most often secreted and have an extracellular endocrine or paracrine function. They are involved in processes including the regulation of extracellular matrix and branching morphogenesis, potentially as either proteases or protease inhibitors; in ion channel regulation in fertility; as tumour suppressor or pro-oncogenic genes in tissues including the prostate; and in cell-cell adhesion during fertilization (Gibbs et al., 2008).
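The degenerate primer idea raised in this Discussion can be illustrated by back-translating a conserved peptide into IUPAC nucleotide codes under the standard genetic code. The sketch below is a simplified illustration, not a primer design tool: the peptide is a hypothetical cysteine-rich stretch rather than one of the motifs in Table 4, the function names are invented for the example, and the codes chosen for the six-codon residues (Arg, Leu, Ser) are deliberately over-inclusive.

```python
# One IUPAC-degenerate codon per amino acid (standard genetic code).
# Arg (MGN), Leu (YTN) and Ser (WSN) cover all six codons in one triplet,
# at the cost of also matching a few codons for other residues.
DEGENERATE_CODON = {
    "A": "GCN", "R": "MGN", "N": "AAY", "D": "GAY", "C": "TGY",
    "Q": "CAR", "E": "GAR", "G": "GGN", "H": "CAY", "I": "ATH",
    "L": "YTN", "K": "AAR", "M": "ATG", "F": "TTY", "P": "CCN",
    "S": "WSN", "T": "ACN", "W": "TGG", "Y": "TAY", "V": "GTN",
}

# Number of bases each IUPAC symbol used above stands for.
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "M": 2, "W": 2, "S": 2,
    "H": 3, "N": 4,
}


def back_translate(peptide: str) -> str:
    """Back-translate a peptide into a degenerate DNA sequence."""
    return "".join(DEGENERATE_CODON[aa] for aa in peptide.upper())


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    total = 1
    for symbol in primer:
        total *= IUPAC_SIZE[symbol]
    return total


# Hypothetical peptide, for illustration only.
peptide = "MKCW"
primer = back_translate(peptide)
print(primer, degeneracy(primer))
```

In practice, stretches rich in low-degeneracy residues such as Cys, Trp and Met keep the primer pool small, which is one reason conserved cysteine positions are attractive starting points.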

Conclusion
In this study, 30 neurotoxin sequences were selected to gain an understanding of their physicochemical properties and functional motifs using in silico techniques. Physicochemical characterization gives insight into properties such as M.Wt, pI, AI, GRAVY and Instability Index, which are essential in providing data about the proteins. SOPMA predicted that all the neurotoxins contain a large percentage of random coils and that β-turns are the least frequent conformation. Conserved sequences in motifs provide significant insight into the functional domain. Conserved sequences may also be used to design specific degenerate primers for the identification and isolation of types and classes of neurotoxins, as numerous neurotoxins are being isolated to meet the need for efficient application in various systems. This study provides a preliminary overview of neurotoxins. Further study is necessary to understand the structure of neurotoxins and the mechanism by which they block neuromuscular transmission through selective binding to muscle nicotinic acetylcholine receptors (nAChR). Future research will also extend to the isolation and industrial production of neurotoxins from snake venom, which has immense application in the pharmaceutical industry.