Re: [Seqan-dev] SwiftLocal specialization with Hamming distance - problems with WatermanEggert
Hi Birte,
Thanks for the quick response.
Would you mind to add an example how to use the BandedWatermanEggert. I
get smacked when I simply adjust the example in alignment_local under
demos to :
Align< String<Dna5> > ali2;
appendValue(rows(ali2), "AAAAAAANAAAGGGNGGGGGGGGNGGGGGANAA");
appendValue(rows(ali2), "GGGGGGCGGGGGGGA");
LocalAlignmentFinder<> finder(ali2);
Score<int> scoring(1, -1, -1, -1);
while (localAlignment(ali2, finder, scoring, 5, BandedWatermanEggert())) {
::std::cout << "Score = " << getScore(finder) << ::std::endl;
::std::cout << ali2;
::std::cout << "Aligns Seq1[" << sourceBeginPosition(row(ali2, 0))
<< ":" << (sourceEndPosition(row(ali2, 0))-1) << "]";
::std::cout << " and Seq2[" << sourceBeginPosition(row(ali2, 1)) <<
":" << (sourceEndPosition(row(ali2, 1))-1) << "]" << ::std::endl <<
::std::endl;
}
The compiler complains:
alignment_local.cpp: In function 'int main()':
alignment_local.cpp:28: error: no matching function for call to
'localAlignment(seqan::Align<seqan::String<seqan::Dna5,
seqan::Alloc<void> >, seqan::ArrayGaps>&,
seqan::LocalAlignmentFinder<int>&, seqan::Score<int, seqan::Simple>&,
int, const seqan::BandedWatermanEggert)'
The (non-banded) WatermanEggert is throwing a "Bad access memory
exception" from time to time (even in the alignment example under
demos). Its difficult to narrow down what the problem actually is. I
take it thats the reason why the example is commented out in the first
place?
Also this method in find_swift.h
template <typename TIndex, typename TSpec, typename TText>
inline typename Infix< typename GetSequenceByNo< TIndex const >::Type
>::Type
infix(Pattern<TIndex, Swift<TSpec> > const & pattern, TText &text)
{
__int64 hitBegin = pattern.curBeginPos;
__int64 hitEnd = pattern.curEndPos;
__int64 textLength = sequenceLength(pattern.curSeqNo, needle(pattern));
if (hitEnd > textLength) hitEnd = textLength;
if (hitBegin < 0) hitBegin = 0;
return infix(text, hitBegin, hitEnd);
}
complains about TText &text and would rather prefer a TText const &text
(at least for the data structure I input).
Cheers,
Fabian
Kehr, Birte wrote:
Hi Fabian,
you are right, there is no Hamming specialization for SwiftLocal in
SeqAn yet.
I am currently working on a verification strategy for the more general
edit distance version.
In your case, I would suggest to use the more general edit distance
filter (SwiftLocal). Swift is only a filter algorithm, so all hamming
distance matches will be contained in the results from the edit
distance version. And you will have to verify the reported hits in any
case.
You are also right that the local Swift computes not only the
positions in the haystack but also the positions in the needle. The
local version is thought to be a filter for local alignments between
two long sequences.
Once you have called the find function on a finder and pattern, e.g.
find(finder, pattern, epsilon, minLength), you can obtain the
positions of a hit in the haystack and needle with the function
positionRange(finder) and postionRange(pattern), and the corresponding
sequences with infix(finder) and infix(pattern).
For the verification , you should be aware that Swift only guarantees
to report regions that **overlap** with possible epsilon-matches. My
suggestion is to use banded local alignment (BandedWatermanEggert) on
the swift hits (parallelograms). The local alignments could be used as
seeds for ungapped extension (UngappedXDrop). Here are the
corresponding links to the SeqAn documentation:
http://www.seqan.de/dddoc/html_devel/FUNCTION.local_Alignment.html
http://www.seqan.de/dddoc/html_devel/FUNCTION.extend_Seed.html
You may also find the sections about local alignment and seed
extension of the SeqAn tutorial interesting:
https://trac.mi.fu-berlin.de/seqan/wiki/Tutorial/Alignments#Local
https://trac.mi.fu-berlin.de/seqan/wiki/Tutorial/Seed-and-Extend#SeedExtensionAndBandedAlignment
For the details of your verification step you will have to see what is
appropriate for your special application.
Cheers,
Birte
------------------------------------------------------------------------
_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev
--
Fabian Buske
Institute for Molecular Bioscience
The University of Queensland
Brisbane, Qld. 4072 Australia
Phone: (61)-(7)-334-62608