Research Article Open Access

Genome Sequence Analysis: A Survey

Hassan Mathkour and Muneer Ahmad

Abstract

Problem statement: Sequence analysis problems are NP-hard and call for optimal solutions. Problems of interest include duplicate sequence detection, sequence matching by relevance, sequence analysis using approximate comparison, either in general or with tools such as MATLAB, and multi-lingual sequence analysis. The usefulness of these operations is highlighted, and future expectations are described. Approach: This study describes the concepts, tools, methodologies, and algorithms used for sequence analysis. Sequences contain valuable information that must be mined before it can be put to use. Modeling an optimal solution requires considerable care. Similarity and alignment cannot be addressed directly with a single technique or algorithm; better performance is achieved by bringing together different concepts. Results: We compared different approaches using example data and found that ClustalW2 is a fairly good tool for analysis. We assigned different weight values to the relevant features and obtained a score of 95 for comparison and 45 for alignment. Conclusion: Different techniques and approaches were evaluated and compared.

Journal of Computer Science
Volume 5 No. 9, 2009, 651-660

DOI: https://doi.org/10.3844/jcssp.2009.651.660

Submitted On: 29 June 2009
Published On: 30 September 2009

How to Cite: Mathkour, H. & Ahmad, M. (2009). Genome Sequence Analysis: A Survey. Journal of Computer Science, 5(9), 651-660. https://doi.org/10.3844/jcssp.2009.651.660

Keywords

  • Genome
  • multi-lingual
  • approximate matching
  • nucleotide base pair
  • corpora
  • duplicate sequences