Hi Birte, Kehr, Birte wrote:
Apologies for the confusion about the commented WatermanEggert example. I did that myself and forgot about it.Hi Fabian, I added an example for the BandedWatermanEggert in demos/alignment_local.cpp. Your problem was that you did not specify the band - the lowest diagonal and the highest diagonal that is to be computed in the alignment matrix. Also make sure that you have updated to the current version of SeqAn. (I have fixed a small bug in the banded local alignment.)The (non-banded) WatermanEggert is throwing a "Bad access memoryexception" from time to time (even in the alignment example under demos). Its difficult to narrow down what the problem actually is. I take it thats the reason why the example is commented out in the first place?
Here, I need some more information from you. In the current version no example is commented out in demos/alignment_local.cpp. I also do not get an exception. Could you send me the code that is producing the error message? Again, make sure that you have checked out the current version of SeqAn.
I've updated my local version to the current trunk but I still get a "Bus error" when running the Waterman Eggert algorithm. In fact the very first of the examples of alignment_local demo causes the problem. I'm working on i686-apple-darwin9-gcc-4.0.1 on a Mac osx.
I dump the debugger tree below:#0 0x0000725a in seqan::previousPosition<int, 0u, unsigned long> at matrix_base.h:413
#1 0x0000813c in seqan::goPrevious<int, 0u> at matrix_base.h:543#2 0x0001fd90 in seqan::_smith_waterman_declump<int, seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna>, seqan::Alloc<void> >, seqan::ArrayGaps> at align_local_dynprog.h:414 #3 0x00020654 in seqan::_smithWatermanGetNext<seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna>, seqan::Alloc<void> >, seqan::ArrayGaps, int> at align_local_dynprog.h:765 #4 0x00020886 in seqan::localAlignment<seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna>, seqan::Alloc<void> >, seqan::ArrayGaps, int, int, int> at align_local_dynprog.h:841
#5 0x000036e9 in main at alignment_local.cpp:26 The variables in previousPosition contain: position_ = 3221222916 dimension_ = 1 while me.data_factors is pretty much empty (capacity=0) So I take it that is has not been initialized properly.
Your other remark:... the other two alignments clearly overlap with the sequence parts used in the first alignment. I thought the WatermanEggert would filter these alignments as not feasible(?) Is this a bug or am I mistaken?The Waterman-Eggert algorithm computes no alignments where a character from seq1 is matched to the same character in seq2 as in a previous alignment. Sequence parts are allowed to overlap, only the traces in the alignment matrix are not allowed to overlap. In your example, there are different possibilities to match a G from one sequence to a G in the other sequence. The pairs of characters that are matched to each other are all different.
Thanks for making that clear.
Also this method in find_swift.h [...] complains about TText &text and would rather prefer a TText const &text(at least for the data structure I input). We have fixed a c&p bug here, it should work now. Thanks. In my applications I do not specify the text separately. If you have a reason for specifying it, I would be curious about it.
I'm not doing that either. The compiler just complained about it. Cheers, Fabian
Cheers, Birte Kehr, Birte wrote:Hi Fabian,you are right, there is no Hamming specialization for SwiftLocal in SeqAn yet.I am currently working on a verification strategy for the more general edit distance version.In your case, I would suggest to use the more general edit distance filter (SwiftLocal). Swift is only a filter algorithm, so all hamming distance matches will be contained in the results from the edit distance version. And you will have to verify the reported hits in any case.You are also right that the local Swift computes not only the positions in the haystack but also the positions in the needle. The local version is thought to be a filter for local alignments between two long sequences.Once you have called the find function on a finder and pattern, e.g. find(finder, pattern, epsilon, minLength), you can obtain the positions of a hit in the haystack and needle with the function positionRange(finder) and postionRange(pattern), and the corresponding sequences with infix(finder) and infix(pattern).For the verification , you should be aware that Swift only guarantees to report regions that **overlap** with possible epsilon-matches. My suggestion is to use banded local alignment (BandedWatermanEggert) on the swift hits (parallelograms). The local alignments could be used as seeds for ungapped extension (UngappedXDrop). Here are the corresponding links to the SeqAn documentation:http://www.seqan.de/dddoc/html_devel/FUNCTION.local_Alignment.html http://www.seqan.de/dddoc/html_devel/FUNCTION.extend_Seed.htmlYou may also find the sections about local alignment and seed extension of the SeqAn tutorial interesting:https://trac.mi.fu-berlin.de/seqan/wiki/Tutorial/Alignments#Local https://trac.mi.fu-berlin.de/seqan/wiki/Tutorial/Seed-and-Extend#SeedExtensionAndBandedAlignmentFor the details of your verification step you will have to see what is appropriate for your special application.Cheers, Birte ------------------------------------------------------------------------ _______________________________________________ seqan-dev mailing list seqan-dev@lists.fu-berlin.de https://lists.fu-berlin.de/listinfo/seqan-dev
-- Fabian Buske Institute for Molecular BioscienceThe University of Queensland Brisbane, Qld. 4072 Australia
Phone: (61)-(7)-334-62608