[Seqan-dev] bandedChainAlignment default failing due to default k
- From: Brett Bowman <bnbowman@gmail.com>
- To: SeqAn Development <seqan-dev@lists.fu-berlin.de>
- Date: Fri, 22 May 2015 15:53:28 -0700
- Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
- Subject: [Seqan-dev] bandedChainAlignment default failing due to default k
I'm trying to align two highly similar sequences found via Kmer search:
Query = "ATCTCTCTCAACAAAACAACGAGGAGGAGTGAAAAGAGAGAGAT"
Reference = "ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT"
The expected alignment looks like this:
Score: 80
0 . : . : . : . : .
ATCTCTCTCAACAA-AACAAC-GAGGAGGAGTGAAAAGAGAGAGAT
|||||||||||||| |||||| ||||||||| ||||||||||||||
ATCTCTCTCAACAACAACAACGGAGGAGGAG-GAAAAGAGAGAGAT
But when I align it using the default values suggested by the tutorial, it doesn't show any inserted gaps at all, and I wind up with this instead:
Score: 80
0 . : . : . : . :
ATCTCTCTCAACAAAACAACGAGGAGGAGTGAAAAGAGAGAGAT
|||||||||||||| | | | | | | |||
ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGA
I finally traced it down to the k-value (bandExtension value) passed into the alignment algorithm - values of K <= 13 succeed and generate the top-most alignment, while the values of 14-15 like the default (15) report the low-quality alignment.
Yet oddly, both alignments report the correct alignment score at the end - so it's not failing, precisely. It's just not storing or displaying the correct alignment.
So I have two questions:
1) What exactly does the k / bandExtension variable do?
2) What is going on here?
My code is pasted below for your use.
Sincerely,
-Brett
"""
#include <seqan/seeds.h>
using namespace seqan;
DnaString query = "ATCTCTCTCAACAAAACAACGAGGAGGAGTGAAAAGAGAGAGAT";
DnaString ref = "ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT";
typedef Seed<Simple> TSeed;
String<TSeed> seedChain;
appendValue(seedChain, TSeed( 0, 0, 14));
appendValue(seedChain, TSeed(30, 31, 14));
Score<int, Simple> scoringScheme(2, -1, -2);
Align<DnaString, ArrayGaps> alignment1;
resize(rows(alignment1), 2);
assignSource(row(alignment1, 0), query);
assignSource(row(alignment1, 1), ref);
Align<DnaString, ArrayGaps> alignment2;
resize(rows(alignment2), 2);
assignSource(row(alignment2, 0), query);
assignSource(row(alignment2, 1), ref);
int result1 = bandedChainAlignment(alignment1, seedChain, scoringScheme, 14);
std::cout << "Score: " << result1 << std::endl;
std::cout << alignment1 << std::endl;
int result2 = bandedChainAlignment(alignment2, seedChain, scoringScheme, 13);
std::cout << "Score: " << result2 << std::endl;
std::cout << alignment2 << std::endl;
"""
-
seqan-dev - May 2015 - Archives indexes sorted by:
[ thread ] [ subject ] [ author ] [ date ] - Complete archive of the seqan-dev mailing list
- More info on this list...