## Abstract

Sequence alignment is a fundamental problem in computational biology, which is also important in theoretical computer science. In this paper, we consider the problem of aligning a set of sequences subject to a given constrained sequence. Given two sequences A= a_{1}a_{2}… a_{n} and B= b_{1}b_{2}… b_{n} with a given distance function and a constrained sequence C= c_{1}c_{2}… c_{k}, our goal is to find the optimal sequence alignment of A and B w.r.t. the constraint C. We investigate several variants of this problem. If C= c^{k}, i.e., all characters in C are same, the optimal constrained pairwise sequence alignment can be solved in O(min { kn^{2}, (t- k) n^{2}}) time, where t is the minimum number of occurrences of character c in A and B. If in the final alignment, the alignment score between any two consecutive constrained characters is upper bounded by some value, which is called GB-CPSA, we give a dynamic programming with the time complexity O(kn^{4}/ log n). For the constrained center-star sequence alignment (CCSA), we prove that it is NP-hard to achieve the optimal alignment even over the binary alphabet. Furthermore, we show a negative result for CCSA, i.e., there is no polynomial-time algorithm to approximate the CCSA within any constant ratio.

Original language | English |
---|---|

Pages (from-to) | 79-94 |

Number of pages | 16 |

Journal | Journal of Combinatorial Optimization |

Volume | 32 |

Issue number | 1 |

DOIs | |

Publication status | Published - 1 Jul 2016 |

## Scopus Subject Areas

- Computer Science Applications
- Discrete Mathematics and Combinatorics
- Control and Optimization
- Computational Theory and Mathematics
- Applied Mathematics

## User-Defined Keywords

- Complexity
- Dynamic programming
- Sequence alignment