Abstract
Sequence alignment is a fundamental problem in computational biology, which is also important in theoretical computer science. In this paper, we consider the problem of aligning a set of sequences subject to a given constrained sequence. Given two sequences A= a1a2… an and B= b1b2… bn with a given distance function and a constrained sequence C= c1c2… ck, our goal is to find the optimal sequence alignment of A and B w.r.t. the constraint C. We investigate several variants of this problem. If C= ck, i.e., all characters in C are same, the optimal constrained pairwise sequence alignment can be solved in O(min { kn2, (t- k) n2}) time, where t is the minimum number of occurrences of character c in A and B. If in the final alignment, the alignment score between any two consecutive constrained characters is upper bounded by some value, which is called GB-CPSA, we give a dynamic programming with the time complexity O(kn4/ log n). For the constrained center-star sequence alignment (CCSA), we prove that it is NP-hard to achieve the optimal alignment even over the binary alphabet. Furthermore, we show a negative result for CCSA, i.e., there is no polynomial-time algorithm to approximate the CCSA within any constant ratio.
Original language | English |
---|---|
Pages (from-to) | 79-94 |
Number of pages | 16 |
Journal | Journal of Combinatorial Optimization |
Volume | 32 |
Issue number | 1 |
DOIs | |
Publication status | Published - 1 Jul 2016 |
Scopus Subject Areas
- Computer Science Applications
- Discrete Mathematics and Combinatorics
- Control and Optimization
- Computational Theory and Mathematics
- Applied Mathematics
User-Defined Keywords
- Complexity
- Dynamic programming
- Sequence alignment