Abstract
We present three linear-time algorithms for three formulations of the problem of finding motifs with gaps. The formulations differ in the constraints they place on the size of the gaps. The outline of the algorithm is the same in every case, but it is adapted to each specific problem while maintaining linear time complexity with respect to the input size. The approach is based on rewriting the text over a new alphabet made of labels that represent words of the original input text. The low computational complexity of the algorithm also makes it practical for finding long motifs. The algorithm is in fact general enough to be applied to several variants of the problem other than those considered in this paper.
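The relabeling idea can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes fixed-length words of length `k`, a single fixed gap size, and a plain dictionary in place of the suffix tree named in the keywords. The function names `relabel` and `gapped_motifs` and the parameters `k`, `gap`, and `quorum` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def relabel(text, k):
    """Rewrite text as a list of integer labels, one per length-k word.

    Each distinct word of length k receives a fresh integer label the first
    time it is seen; position i of the result is the label of text[i:i+k].
    """
    labels = {}
    rewritten = []
    for i in range(len(text) - k + 1):
        word = text[i:i + k]
        if word not in labels:
            labels[word] = len(labels)
        rewritten.append(labels[word])
    return rewritten, labels


def gapped_motifs(text, k, gap, quorum):
    """Count motifs of the form word1, `gap` arbitrary symbols, word2.

    Both words have length k (an assumption of this sketch). Returns a dict
    mapping (word1, word2) to its occurrence count, keeping only motifs that
    occur at least `quorum` times.
    """
    rewritten, labels = relabel(text, k)
    words = {label: word for word, label in labels.items()}
    offset = k + gap  # distance between the starts of the two words
    counts = Counter()
    # One pass over the rewritten text: each pair of labels is one symbol
    # comparison, regardless of how long the underlying words are.
    for i in range(len(rewritten) - offset):
        counts[(rewritten[i], rewritten[i + offset])] += 1
    return {
        (words[a], words[b]): c
        for (a, b), c in counts.items()
        if c >= quorum
    }


print(gapped_motifs("ACGTTACGAACGTTACG", k=3, gap=2, quorum=2))
```

Once the text is rewritten over the label alphabet, a motif made of two words is handled as a pair of single symbols, which is what keeps the scan to one pass over the input for a fixed `k`.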
Original language | English (US) |
---|---|
Pages (from-to) | 1145-1154 |
Number of pages | 10 |
Journal | International Journal of Foundations of Computer Science |
Volume | 16 |
Issue number | 6 |
State | Published - Dec 2005 |
All Science Journal Classification (ASJC) codes
- Computer Science (miscellaneous)
Keywords
- DNA
- Gapped Motifs
- Inference
- Suffix tree