COBALT, webPRANK, DbClustal
Kamer Burak İŞÇİ*
Dept. of Molecular Biology and Genetics
Izmir Institute of Technology
Izmir, Turkey
kamerisci@std.iyte.edu.tr
Cem TOSUN*
Dept. of Molecular Biology and Genetics
Izmir Institute of Technology
Izmir, Turkey
cemtosun@std.iyte.edu.tr
Bita SABET*
Dept. of Molecular Biology and Genetics
Izmir Institute of Technology
Izmir, Turkey
bitasabet@std.iyte.edu.tr
Abstract—Multiple sequence alignment (MSA) tools identify similarities among three or more biological sequences such as DNA, RNA, or proteins. A wide range of MSA tools is available, so the output of different tools can be compared to obtain results that are as precise as possible. This study describes the general working principles of three MSA tools, COBALT, webPRANK, and DbClustal, and compares their results both internally and with one another.
Index Terms—COBALT, webPRANK, DbClustal
Introduction
The alignment of three or more biological sequences, which may be protein, DNA, or RNA, is called multiple sequence alignment (MSA) [1]. MSA is generally used to infer evolutionary relationships, on the assumption that the aligned sequences share a lineage and descend from a common ancestor. Computational algorithms are used to produce and analyze the alignments. Most MSA tools use heuristic methods rather than global optimization, because finding the optimal alignment of more than a few sequences of moderate length is computationally prohibitive. There are two main approaches to MSA: progressive and iterative. The progressive method first computes a distance matrix from pairwise comparisons of the sequences and builds a guide tree from that matrix. It then starts from the most similar sequences and adds the others one at a time, with the guide tree determining which sequence is added to the alignment next. Progressive MSA is a faster approach than pairwise alignment applied to multiple sequences,
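The guide-tree stage of the progressive method can be illustrated with a minimal Python sketch. It is not the procedure used by COBALT, webPRANK, or DbClustal. It assumes that the similarity ratio from the standard-library `difflib.SequenceMatcher` can stand in for a distance derived from a real pairwise alignment, and that average-linkage (UPGMA-style) clustering is used to build the tree. The function names `pairwise_distance` and `guide_tree` and the toy sequences are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_distance(a, b):
    """Crude distance in [0, 1]: 1 minus difflib's similarity ratio.

    Assumption: this stands in for a distance derived from a real
    pairwise alignment score.
    """
    return 1.0 - SequenceMatcher(None, a, b, autojunk=False).ratio()


def guide_tree(seqs):
    """Cluster named sequences into a guide tree of nested 2-tuples.

    Uses average-linkage (UPGMA-style) clustering: at each step the two
    clusters with the smallest mean pairwise distance are merged.
    """
    # Distance matrix, stored as {frozenset({name1, name2}): distance}.
    dist = {
        frozenset(pair): pairwise_distance(seqs[pair[0]], seqs[pair[1]])
        for pair in combinations(seqs, 2)
    }
    # Map each subtree (a name or a nested tuple) to its leaf names.
    clusters = {name: [name] for name in seqs}

    def mean_distance(c1, c2):
        values = [dist[frozenset((x, y))]
                  for x in clusters[c1] for y in clusters[c2]]
        return sum(values) / len(values)

    while len(clusters) > 1:
        # Pick the closest pair of clusters and merge them into one node.
        left, right = min(combinations(clusters, 2),
                          key=lambda pair: mean_distance(*pair))
        leaves = clusters.pop(left) + clusters.pop(right)
        clusters[(left, right)] = leaves
    return next(iter(clusters))


if __name__ == "__main__":
    # Toy DNA sequences (made up for illustration only).
    toy = {
        "A": "ACGTACGTAC",
        "B": "ACGTACGTAA",
        "C": "TTTTGGGGCC",
        "D": "TTTTGGGACA",
    }
    print(guide_tree(toy))
```

The nested tuples give the order in which a progressive aligner would combine sequences: the innermost pairs are aligned first, and the resulting groups are then aligned with each other. A real tool would replace `pairwise_distance` with a score from an actual pairwise alignment and would go on to build the alignment itself, which this sketch does not attempt.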
References
Budd, A. (2009) Multiple sequence alignment exercises and demonstrations. European Molecular Biology Laboratory, 10 February 2009. Retrieved June 30, 2010.
Mount, D. W. (2004) Bioinformatics: Sequence and Genome Analysis, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Papadopoulos, J. S. and Agarwala, R. (2007) COBALT: a constraint-based alignment tool for multiple protein sequences. Bioinformatics, 23(9), 1073-1079.
Zhang, X. and Kahveci, T. (2006) A new approach for alignment of multiple proteins. Pac. Symp. Biocomput., 11, 339-350.
Ogden, T. H. and Rosenberg, M. S. (2006) Multiple sequence alignment accuracy and phylogenetic inference. Systematic Biol., 55, 314-328.
[1] Bahr, A. et al. (2001) BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations. Nucleic Acids Res., 29, 323-326.
[2] Kececioglu, J. D. and Starrett, D. (2004) Aligning alignments exactly. In Proceedings of the 8th ACM Conference Research in Computational Molecular Biology, pp. 85-96.
[4] Loytynoja, A. and Goldman, N. (2010) webPRANK: a phylogeny-aware multiple sequence aligner with interactive alignment browser. BMC Bioinformatics, 11(1), 579.