Constrained pairwise and center-star sequences alignment problems

Yong Zhang, Joseph W T CHAN, Francis Y.L. Chin, Hing Fung Ting, Deshi Ye*, Feng Zhang, Jianyu Shi

*Corresponding author for this work

Research output: Contribution to journalJournal articlepeer-review

3 Citations (Scopus)

Abstract

Sequence alignment is a fundamental problem in computational biology, which is also important in theoretical computer science. In this paper, we consider the problem of aligning a set of sequences subject to a given constrained sequence. Given two sequences A= a1a2… an and B= b1b2… bn with a given distance function and a constrained sequence C= c1c2… ck, our goal is to find the optimal sequence alignment of A and B w.r.t. the constraint C. We investigate several variants of this problem. If C= ck, i.e., all characters in C are same, the optimal constrained pairwise sequence alignment can be solved in O(min { kn2, (t- k) n2}) time, where t is the minimum number of occurrences of character c in A and B. If in the final alignment, the alignment score between any two consecutive constrained characters is upper bounded by some value, which is called GB-CPSA, we give a dynamic programming with the time complexity O(kn4/ log n). For the constrained center-star sequence alignment (CCSA), we prove that it is NP-hard to achieve the optimal alignment even over the binary alphabet. Furthermore, we show a negative result for CCSA, i.e., there is no polynomial-time algorithm to approximate the CCSA within any constant ratio.

Original languageEnglish
Pages (from-to)79-94
Number of pages16
JournalJournal of Combinatorial Optimization
Volume32
Issue number1
DOIs
Publication statusPublished - 1 Jul 2016

Scopus Subject Areas

  • Computer Science Applications
  • Discrete Mathematics and Combinatorics
  • Control and Optimization
  • Computational Theory and Mathematics
  • Applied Mathematics

User-Defined Keywords

  • Complexity
  • Dynamic programming
  • Sequence alignment

Fingerprint

Dive into the research topics of 'Constrained pairwise and center-star sequences alignment problems'. Together they form a unique fingerprint.

Cite this