Hi Birte, Another remark: for the sequences: s1: GGGAGGGGAGGGGGAGGGGGGAGGG s2: GGGGGGAGGGGGAGGGGAGGG using :localAlignment(align, finder, Score<int> scoring(1, -1, -100, -100), 13, WatermanEggert())
I get: Score = 15 0 . : . GGGGAGGGGGAGGGG ||||||||||||||| GGGGAGGGGGAGGGG Aligns Seq1[4:18] and Seq2[2:16] Score = 140 . : . GGGAGGGGAGGGGGAGGG
|||||||| |||||||| GGGAGGGGGAGGGGAGGG Aligns Seq1[0:17] and Seq2[3:20] Score = 130 . : . : GGGGAGGGGGAGGGGGGAGGG
|||| | ||| | |||||||| GGGGGGAGGGGGAGGGGAGGG Aligns Seq1[4:24] and Seq2[0:20]I would have though I only get the first since its giving the best score and the other two alignments clearly overlap with the sequence parts used in the first alignment. I thought the WatermanEggert would filter these alignments as not feasible(?) Is this a bug or am I mistaken?
Cheers, Fabian Fabian Buske wrote:
Hi Birte, Thanks for the quick response.Would you mind to add an example how to use the BandedWatermanEggert. I get smacked when I simply adjust the example in alignment_local under demos to :Align< String<Dna5> > ali2; appendValue(rows(ali2), "AAAAAAANAAAGGGNGGGGGGGGNGGGGGANAA"); appendValue(rows(ali2), "GGGGGGCGGGGGGGA"); LocalAlignmentFinder<> finder(ali2); Score<int> scoring(1, -1, -1, -1);while (localAlignment(ali2, finder, scoring, 5, BandedWatermanEggert())) {::std::cout << "Score = " << getScore(finder) << ::std::endl; ::std::cout << ali2;::std::cout << "Aligns Seq1[" << sourceBeginPosition(row(ali2, 0)) << ":" << (sourceEndPosition(row(ali2, 0))-1) << "]"; ::std::cout << " and Seq2[" << sourceBeginPosition(row(ali2, 1)) << ":" << (sourceEndPosition(row(ali2, 1))-1) << "]" << ::std::endl << ::std::endl;} The compiler complains: alignment_local.cpp: In function 'int main()':alignment_local.cpp:28: error: no matching function for call to 'localAlignment(seqan::Align<seqan::String<seqan::Dna5, seqan::Alloc<void> >, seqan::ArrayGaps>&, seqan::LocalAlignmentFinder<int>&, seqan::Score<int, seqan::Simple>&, int, const seqan::BandedWatermanEggert)'The (non-banded) WatermanEggert is throwing a "Bad access memory exception" from time to time (even in the alignment example under demos). Its difficult to narrow down what the problem actually is. I take it thats the reason why the example is commented out in the first place?Also this method in find_swift.h template <typename TIndex, typename TSpec, typename TText>inline typename Infix< typename GetSequenceByNo< TIndex const >::Type >::Typeinfix(Pattern<TIndex, Swift<TSpec> > const & pattern, TText &text) { __int64 hitBegin = pattern.curBeginPos; __int64 hitEnd = pattern.curEndPos;__int64 textLength = sequenceLength(pattern.curSeqNo, needle(pattern));if (hitEnd > textLength) hitEnd = textLength; if (hitBegin < 0) hitBegin = 0; return infix(text, hitBegin, hitEnd); }complains about TText &text and would rather prefer a TText const &text (at least for the data structure I input).Cheers, Fabian Kehr, Birte wrote:Hi Fabian,you are right, there is no Hamming specialization for SwiftLocal in SeqAn yet.I am currently working on a verification strategy for the more general edit distance version.In your case, I would suggest to use the more general edit distance filter (SwiftLocal). Swift is only a filter algorithm, so all hamming distance matches will be contained in the results from the edit distance version. And you will have to verify the reported hits in any case.You are also right that the local Swift computes not only the positions in the haystack but also the positions in the needle. The local version is thought to be a filter for local alignments between two long sequences.Once you have called the find function on a finder and pattern, e.g. find(finder, pattern, epsilon, minLength), you can obtain the positions of a hit in the haystack and needle with the function positionRange(finder) and postionRange(pattern), and the corresponding sequences with infix(finder) and infix(pattern).For the verification , you should be aware that Swift only guarantees to report regions that **overlap** with possible epsilon-matches. My suggestion is to use banded local alignment (BandedWatermanEggert) on the swift hits (parallelograms). The local alignments could be used as seeds for ungapped extension (UngappedXDrop). Here are the corresponding links to the SeqAn documentation:http://www.seqan.de/dddoc/html_devel/FUNCTION.local_Alignment.html http://www.seqan.de/dddoc/html_devel/FUNCTION.extend_Seed.htmlYou may also find the sections about local alignment and seed extension of the SeqAn tutorial interesting:https://trac.mi.fu-berlin.de/seqan/wiki/Tutorial/Alignments#Localhttps://trac.mi.fu-berlin.de/seqan/wiki/Tutorial/Seed-and-Extend#SeedExtensionAndBandedAlignmentFor the details of your verification step you will have to see what is appropriate for your special application.Cheers, Birte ------------------------------------------------------------------------ _______________________________________________ seqan-dev mailing list seqan-dev@lists.fu-berlin.de https://lists.fu-berlin.de/listinfo/seqan-dev
-- Fabian Buske Institute for Molecular BioscienceThe University of Queensland Brisbane, Qld. 4072 Australia
Phone: (61)-(7)-334-62608