Given a set of sequences, S, and a degeneracy parameter, d, the Consensus Sequence problem asks whether there exists a sequence that has Hamming distance at most d from each sequence in S. A valid motif set is a set of sequences for which such a consensus sequence exists, while a decoy set is a set of sequences that has no consensus sequence but whose pairwise Hamming distances are all at most 2d. At present, no efficient solution to the Consensus Sequence problem is known when the number of sequences is greater than three. For instances of Consensus Sequence with binary sequences and cardinality four, we present a combinatorial characterization of decoy sets and a linear-time exact algorithm, resolving an open problem posed by Gramm et al.
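To make the definitions concrete, the following is a minimal brute-force sketch for binary sequences. It is not the linear-time algorithm of the paper: it enumerates every candidate consensus sequence and therefore takes time exponential in the sequence length. The function names (`hamming`, `find_consensus`, `classify`) and the labels "motif", "decoy", and "neither" are assumptions chosen for illustration.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_consensus(seqs, d, alphabet="01"):
    """Return a sequence within Hamming distance d of every member of seqs,
    or None if no such sequence exists. Exhaustive search over all
    len(alphabet) ** length candidates (illustrative only)."""
    length = len(seqs[0])
    for candidate in product(alphabet, repeat=length):
        c = "".join(candidate)
        if all(hamming(c, s) <= d for s in seqs):
            return c
    return None


def classify(seqs, d):
    """Label a set as a valid motif set, a decoy set, or neither."""
    if find_consensus(seqs, d) is not None:
        return "motif"
    # No consensus exists; a decoy set additionally requires that
    # every pairwise distance is at most 2d.
    if all(hamming(a, b) <= 2 * d for a, b in combinations(seqs, 2)):
        return "decoy"
    return "neither"


if __name__ == "__main__":
    # Four binary sequences with a common sequence at distance 1 from each.
    print(classify(["0000", "0011", "0101", "1001"], 1))
    # Pairwise distances are all 2 = 2d, yet no consensus sequence exists.
    print(classify(["000", "011", "101", "110"], 1))
```

With d = 1, the set {000, 011, 101, 110} is a decoy set of cardinality four: every pair differs in exactly two positions, but each of the eight candidate sequences is at distance at least two from some member.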
Christina Boucher, Daniel Brown, and Stephane Durocher. On the structure of small motif recognition instances. In Proceedings of the 15th Annual String Processing and Information Retrieval Symposium (SPIRE 2008), pages 269--281.