Development of Neighbors Atomic Coordinate Tables and Plot-Graphical Representation for Tertiary Structure in Proteins

Problem statement: Contact map prediction is of interest for its applications in fold recognition and protein tertiary structure determination. Approach: This study developed a method to predict a protein's tertiary structure from its amino acid sequence using a contact map. The method predicts a protein in a binary space of exposure states. It comprises several procedures that find a set of three-dimensional coordinates consistent with a contact map of threshold t. Results: The approach used Matlab tools to improve protein prediction accuracy. The developed tool selects a set of three-dimensional coordinates that satisfy the distances between nodes, and the nodes are then mapped onto the protein structure. Conclusion: The approach produced a plot-graphical representation of tertiary structure in proteins.


INTRODUCTION
Contact maps are used for protein structure prediction. In view of this, a direct random-coordinate approach was desired, one that accounts for the contribution of atomic neighbors to residue exposure when predicting a protein's tertiary structure. The present study aims to predict a protein's tertiary structure from its contact map and to examine how the random prediction of an atomic coordinate is affected by a change in the threshold t for neighboring residues, or by the positions of those residues. In addition, a graphical method of representing a protein's tertiary structure was developed.
Over the past few years, several tools have been developed to help predict protein 3D structure and thereby understand protein function. Vassura et al. (2007) produced a software tool for reconstructing a protein's 3D structure from its contact map. The tool is based on distance geometry and finds a set of three-dimensional coordinates consistent with a given contact map of threshold t. Hu et al. (2002) presented techniques showing how data mining can be used to extract valuable information from contact maps. Their tool uses the contact map to discover the 3D structure by testing each pair of amino acids and determining their 3D distance from the coordinates of the α-carbon atoms. Pollastri and Baldi (2002) used a neural network to predict protein contact maps and find the 3D structure. Their tool focuses on grained contact map prediction. The approach concentrates on finding a 3D structure from the linear sequence of a protein. The major task in this approach is to propose and verify a precise and robust adaptation rule for predicting the contact map. Moré and Wu (1999) developed a tool based on Gaussian smoothing, with the aim of producing efficient and reliable code for solving the distance geometry problem in protein structure. The algorithm in this tool works with a sparse set of distance constraints, whereas other distance geometry algorithms tend to work with a dense set of constraints.

MATERIALS AND METHODS
Predicting a protein's tertiary structure from its primary structure is one of the greatest problems in bioinformatics (Fariselli et al., 2001). Solving this problem by going from the contact map to the protein structure requires an efficient and fast algorithm. Many methods have been introduced to reconstruct structures from predicted contacts in several ways (Vassura et al., 2008a; 2008b; Hu et al., 2002; Moult, 1999; Gomes, 2006; Gutpa et al., 2005; Fariselli et al., 2001; Casbon, 2002).
Traditionally, a contact map is a Boolean matrix created from a distance map using a pre-assigned threshold value t. The distance map D is an N×N matrix, where N is the number of residues in the protein and D[i,j] is the distance, measured in angstroms (Å), between the coordinates of the α-carbon atoms of residues i and j. Two residues i and j are in contact with each other if the distance D[i,j] is less than or equal to the threshold value t.
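The definition above can be illustrated with a minimal sketch. It is written in Python rather than Matlab purely for illustration; the function names and the four toy α-carbon coordinates are assumptions introduced here and are not taken from the study.

```python
import math


def distance_map(coords):
    """Return the N x N matrix D, where D[i][j] is the Euclidean distance
    (in angstroms) between the alpha-carbon coordinates of residues i and j."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def contact_map(dmap, t):
    """Return the Boolean (0/1) contact map: 1 where D[i][j] <= t, else 0."""
    n = len(dmap)
    return [[1 if dmap[i][j] <= t else 0 for j in range(n)] for i in range(n)]


# Hypothetical toy chain of four alpha carbons spaced 3.8 angstroms apart.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 3.8, 0.0)]
toy_cm = contact_map(distance_map(toy_coords), t=7.0)
# toy_cm == [[1, 1, 0, 0],
#            [1, 1, 1, 1],
#            [0, 1, 1, 1],
#            [0, 1, 1, 1]]
```

Residues 0 and 2 are 7.6 Å apart, so they are not in contact at t = 7 Å but would be at t = 8 Å, which shows how the choice of threshold changes the map.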
This study developed a method, supported by Matlab tools, to predict a protein structure from its contact map. The approach comprises several procedures that find a set of three-dimensional coordinates consistent with a contact map of threshold t.
The research method contains three modules, as shown in Fig. 1. The scanner module reads the protein ID from a list extracted from the PDB. It then accepts the Contact Map (CM) of the protein as input and produces a New Contact Map (NCM) by a scanning method. This process favors the quality of the predicted contacts over their quantity. The producer module runs the distance matrix procedure, which finds a possible set of distances between nodes, and then computes 3D points using a nonlinear function from the Matlab tools. A new contact map is extracted from the new coordinates and compared with the native contact map to find the number of differences. The final module, the corrector module, generates a set of coordinates consistent with the given contact map and then uses the Matlab 3D plot function to map the protein's tertiary structure (Vassura et al., 2007).
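The comparison step between the new and the native contact map can be sketched as follows. This is an illustrative Python sketch, not the study's Matlab code. The paper does not state how the error percentage is defined, so the definition used here (differing residue pairs divided by all residue pairs, times 100) is an assumption, as are the function names and the toy maps.

```python
def count_differences(native_cm, new_cm):
    """Count residue pairs (i < j) whose contact state differs between two maps."""
    n = len(native_cm)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if native_cm[i][j] != new_cm[i][j]
    )


def error_percentage(native_cm, new_cm):
    """Assumed definition: differing pairs over all N(N-1)/2 pairs, in percent."""
    n = len(native_cm)
    pairs = n * (n - 1) // 2
    if pairs == 0:
        return 0.0
    return 100.0 * count_differences(native_cm, new_cm) / pairs


# Hypothetical 4-residue example: the new map disagrees on three of six pairs.
native_cm = [[1, 1, 0, 0], [1, 1, 1, 1], [0, 1, 1, 1], [0, 1, 1, 1]]
new_cm = [[1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 1, 1], [1, 0, 1, 1]]
print(count_differences(native_cm, new_cm))  # 3
print(error_percentage(native_cm, new_cm))   # 50.0
```

Only the upper triangle is counted because a contact map is symmetric and the diagonal is trivially in contact.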

RESULTS AND DISCUSSION
The experimental results show the efficiency of the prediction method for protein structure. We took a list of proteins of different lengths, belonging to the most popular classes, from the PDB.
For each protein in the selected list, we generated different contact maps by changing the threshold value. The results were analyzed automatically for prediction purposes. Based on the results, we scanned the contact map with a pre-assigned threshold to show the accuracy of extracting the contact map from dense areas. Initially, the results were generated for part of the sequence instead of the whole area. The method also shows the effect of the threshold on the tertiary structure of a protein. This study presents experimental results for different proteins, which were analyzed and compared with the original proteins. Table 1 shows the error percentage of 12 different contact maps before and after the corrector module, together with the average time. The contact threshold was varied from 7 to 18 Å. The analysis shows that the corrector module decreases the error percentage when it is applied iteratively. The producer module generates an accurate set of coordinates consistent with the native contact map to predict the protein's tertiary structure. The results show that the running time increases and decreases depending on the threshold value and protein length.
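The threshold sweep described above can be illustrated with a short Python sketch. The toy coordinates are a hypothetical helix-like α-carbon trace (radius 2.3 Å, 100° turn and 1.5 Å rise per residue), not a protein from the study, and the integer step between 7 and 18 Å is an assumption; all function names are introduced here for illustration only.

```python
import math


def contact_map(coords, t):
    """Boolean contact map: 1 where the alpha-carbon distance is <= t angstroms."""
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) <= t else 0 for j in range(n)]
        for i in range(n)
    ]


def contact_count(cm):
    """Number of residue pairs (i < j) in contact."""
    n = len(cm)
    return sum(cm[i][j] for i in range(n) for j in range(i + 1, n))


def helix_like_trace(n_residues):
    """Hypothetical toy trace with idealized alpha-helix geometry."""
    return [
        (
            2.3 * math.cos(math.radians(100 * k)),
            2.3 * math.sin(math.radians(100 * k)),
            1.5 * k,
        )
        for k in range(n_residues)
    ]


coords = helix_like_trace(30)
# One contact map per threshold, assuming integer steps from 7 to 18 angstroms.
sweep = {t: contact_count(contact_map(coords, t)) for t in range(7, 19)}
for t, count in sweep.items():
    print(f"t = {t:2d} A: {count} contacts")
```

Because every pair in contact at a given threshold is also in contact at any larger one, the contact count can only grow as t increases.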
The contact map computed with a threshold of 7 Å does not contain enough global information about the protein structure. The result is also similar to the contact maps of a large number of proteins, and the predicted structure is not clear compared with the native structure. Figure 2b shows the prediction of the tertiary structure of protein 451C from its contact map at a threshold of 16 Å, with an error percentage of 0.08. The predicted tertiary structure closely resembles the native tertiary structure shown in Fig. 2a. The experimental results show that a threshold of 12 Å or more gives more accurate predictions: contact maps computed using threshold values of 12-18 Å allow better tertiary structure recovery than those computed at thresholds of 7-9 Å. Also, the corrector procedure improves the coordinates to obtain the best set consistent with the native contact map, as shown in Table 1. Figure 3 shows that the running time of predicting protein structure is not affected by the threshold value: the average time decreases and increases in an arbitrary way. The average time varies because the algorithm takes random numbers as starting points. The approach solves a nonlinear system with the FSOLVE function from the Matlab tools.
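The role of the random starting points and of iterative correction can be illustrated with a minimal Python sketch. It does not reproduce the FSOLVE-based procedure of the study: a simple pairwise push-pull update is swapped in as a stand-in, and the function names, parameters (sweeps, box, margin) and the toy helix-like trace are all assumptions made for illustration.

```python
import math
import random


def contact_map(coords, t):
    """Boolean contact map: 1 where the distance is <= t angstroms."""
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) <= t else 0 for j in range(n)]
        for i in range(n)
    ]


def count_differences(cm_a, cm_b):
    """Number of residue pairs (i < j) whose contact state differs."""
    n = len(cm_a)
    return sum(
        1 for i in range(n) for j in range(i + 1, n) if cm_a[i][j] != cm_b[i][j]
    )


def refine(native_cm, t, seed, sweeps=200, box=20.0, margin=0.5):
    """Start from random coordinates and repeatedly move violating pairs.

    Stand-in for the nonlinear solve (not FSOLVE). Returns the best coordinates
    found, the initial difference count, and the best difference count.
    """
    rng = random.Random(seed)
    n = len(native_cm)
    coords = [[rng.uniform(0.0, box) for _ in range(3)] for _ in range(n)]
    start_err = count_differences(native_cm, contact_map(coords, t))
    best, best_err = [p[:] for p in coords], start_err
    for _ in range(sweeps):
        if best_err == 0:
            break
        for i in range(n):
            for j in range(i + 1, n):
                d = math.dist(coords[i], coords[j])
                if (d <= t) == bool(native_cm[i][j]):
                    continue  # this pair already agrees with the native map
                if d == 0.0:
                    continue  # coincident points: no direction to move along
                # Aim just inside the threshold for contacts, just outside otherwise.
                target = t - margin if native_cm[i][j] else t + margin
                shift = 0.5 * (d - target) / d  # > 0 pulls together, < 0 pushes apart
                for k in range(3):
                    delta = shift * (coords[j][k] - coords[i][k])
                    coords[i][k] += delta
                    coords[j][k] -= delta
        err = count_differences(native_cm, contact_map(coords, t))
        if err < best_err:  # keep only improvements, so the error never grows
            best, best_err = [p[:] for p in coords], err
    return best, start_err, best_err


# Hypothetical 12-residue helix-like trace used as the "native" structure.
native = [
    (2.3 * math.cos(math.radians(100 * k)), 2.3 * math.sin(math.radians(100 * k)), 1.5 * k)
    for k in range(12)
]
native_cm = contact_map(native, 8.0)
coords, start_err, best_err = refine(native_cm, 8.0, seed=0)
print(f"differences before: {start_err}, after: {best_err}")
```

Running the sketch with different seeds gives different starting coordinates, so the number of sweeps needed varies from run to run. This is consistent with the explanation above that random starting points make the average time changeable.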

CONCLUSION
Protein structure prediction is one of the approaches that have been used to study the folding of a protein into its 3D structure. Over the past few years, several efforts have been made to help predict protein 3D structure and thereby understand protein function. These efforts used machine learning approaches, such as neural networks and support vector machines, as well as distance geometry.
This study used the contact map matrix to predict the tertiary structure of a protein. The results show that contact maps computed using threshold values of 12-18 Å allow better tertiary structure recovery than those computed at thresholds of 7-9 Å.
The experimental results show that scanning the contact map of a protein is a more reliable way to predict the important areas of the protein structure. The study used the FSOLVE function from Matlab. The approach provides an efficient and fast way to predict a protein structure with high prediction accuracy.