Hi, First of all, thanks for a great library. I have some questions about using the library in my project (not developing it per se), I hope this is the right venue for asking. I would like to do end-to-end consensus alignment between several DNA sequences. I currently have this implemented as follows: for (unsigned i = 0; i < v.size(); ++i) { seqan::appendRead(store, v[i].data()); seqan::appendAlignedRead( store, i, 0, 0, 0 + (int)v[i].size()); } options.useContigID = true; options.usePositions = true; seqan::consensusAlignment(store, options); // seqan::reAlignment(store, 0, 1, 10, false); // needed? result = store.contigStore[0].seq; This works ok, but it has 2 shortcomings: 1. I noticed after some runs is that the library only processes the read store if it contains sequences with length greater than 20bp. I'm not sure if all sequences, or only some of them, need to be of this length for it to work. If all sequences are shorter than 20bp, it seems that the consensus output is just the first sequence. So, for instance: Reads: A AA AA Consensus alignment: A . So the output is "A", not "AA" as I would expect. Is there a way to force the consensus algorithm to run regardless of the size of the sequences? I couldn't find such an option documented anywhere. As a workaround, I tried to add an "N" * 20 prefix to all sequences, but now I'm occasionally getting strange results such as: Reads: NNNNNNNNNNNNNNNNNNNNTTTTTTTTTTTTTTTTTTTTTAG NNNNNNNNNNNNNNNNNNNNTTTTTTTTTTTTTTTTTTTAG NNNNNNNNNNNNNNNNNNNNTTTTTTTTTTTTTTTTTTTTTAG NNNNNNNNNNNNNNNNNNNNTTTTTTTTTTTTTTTTTTAG Consensus alignment: NNNNNNNNNNNNNNNNNNNNTTTTTTTTTTTTTTTTTTTTNNNNNNNNNNNNNNNNNNNN--TTTTTTTTTTTTTTTTTTTAG ........................................TAG ....................**..................... ....................TT..................... ...................**N.................... In case this gets wrapped, the idea is that the output of the consensus alignment contains (unexpectedly) more than 1 initial run of Ns. I tried to do reAlignment() with a bandwidth of 20, but it doesn't always fix it. Is the 20 1- or 2-sided? Note- if I were to use real bases instead of Ns, then it would become nontrivial to remove them from the consensus sequence, because they might get mixed up with the real sequence. Is there a way to use seqan to do the consensus on short sequences as well? 2. The second problem I noticed is that the alignments are not always end-to-end. This is usually seen only when the tails are homopolymers: Reads: TTTTTTTTTTTTTTTTTTTTGAA TTTTTTTTTTTTTTTTTTTGAGA TTTTTTTTTTTTTTTTTTTTGAGA TTTTTTTTTTTTTTTTTTTGAGA TTTTTTTTTTTTTTTTTTTGAGA TTTTTTTTTTTTTTTTTTTGAGA Consensus alignment: TTTTTTTTTTTTTTTTTTTTGAGA ......................A ........................ ....................... ....................... ....................... ....................... ^ - note the spaces here Here, I would have hoped the consensus to be the sequence which occurs 4 times, with one less T than the in the largest sequence. As before, I can potentially work around this by adding a common homopolymer prefix and suffix, then removing it from the output: Reads: CCCCCTTTTTTTTTTTTTTTTTTTTGAACCCCC CCCCCTTTTTTTTTTTTTTTTTTTGAGACCCCC CCCCCTTTTTTTTTTTTTTTTTTTTGAGACCCCC CCCCCTTTTTTTTTTTTTTTTTTTGAGACCCCC CCCCCTTTTTTTTTTTTTTTTTTTGAGACCCCC CCCCCTTTTTTTTTTTTTTTTTTTGAGACCCCC Consensus alignment: (note: one less T) CCCCC-TTTTTTTTTTTTTTTTTTTGAGACCCCC .....T.....................*...... .....*............................ .....T............................ .....*............................ .....*............................ .....*............................ However, this works ok only if the homopolymer I add doesn't get mixed up with the rest of the input sequences. I wonder if there is a way to get seqan to do end-to-end (re-)alignment, with no free end gaps. Thanks, Matei