FU Logo
  • Startseite
  • Kontakt
  • Impressum
  • Home
  • Listenauswahl
  • Anleitungen

[Seqan-dev] bandedChainAlignment default failing due to default k

<-- thread -->
<-- date -->
  • From: Brett Bowman <bnbowman@gmail.com>
  • To: SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Date: Fri, 22 May 2015 15:53:28 -0700
  • Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Subject: [Seqan-dev] bandedChainAlignment default failing due to default k

I'm trying to align two highly similar sequences found via Kmer search:
Query = "ATCTCTCTCAACAAAACAACGAGGAGGAGTGAAAAGAGAGAGAT"
Reference = "ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT"

The expected alignment looks like this:

Score: 80
      0     .    :    .    :    .    :    .    :    .  
        ATCTCTCTCAACAA-AACAAC-GAGGAGGAGTGAAAAGAGAGAGAT
        |||||||||||||| |||||| ||||||||| ||||||||||||||
        ATCTCTCTCAACAACAACAACGGAGGAGGAG-GAAAAGAGAGAGAT

But when I align it using the default values suggested by the tutorial, it doesn't show any inserted gaps at all, and I wind up with this instead:

Score: 80
      0     .    :    .    :    .    :    .    :     
        ATCTCTCTCAACAAAACAACGAGGAGGAGTGAAAAGAGAGAGAT
        |||||||||||||| |  |   |  |  | | |||         
        ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGA

I finally traced it down to the k-value (bandExtension value) passed into the alignment algorithm - values of K <= 13 succeed and generate the top-most alignment, while the values of 14-15 like the default (15) report the low-quality alignment.  

Yet oddly, both alignments report the correct alignment score at the end - so it's not failing, precisely.  It's just not storing or displaying the correct alignment.

So I have two questions:
1) What exactly does the  k / bandExtension variable do?
2) What is going on here?

My code is pasted below for your use.

Sincerely,
-Brett

"""
     #include <seqan/seeds.h>

     using namespace seqan;

     DnaString query = "ATCTCTCTCAACAAAACAACGAGGAGGAGTGAAAAGAGAGAGAT";
     DnaString ref   = "ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT";
 
     typedef Seed<Simple> TSeed;
     String<TSeed> seedChain;
     appendValue(seedChain, TSeed( 0,  0, 14));
     appendValue(seedChain, TSeed(30, 31, 14));
     Score<int, Simple> scoringScheme(2, -1, -2);
 
     Align<DnaString, ArrayGaps> alignment1;
     resize(rows(alignment1), 2);
     assignSource(row(alignment1, 0), query);
     assignSource(row(alignment1, 1), ref);
 
     Align<DnaString, ArrayGaps> alignment2;
     resize(rows(alignment2), 2);
     assignSource(row(alignment2, 0), query);
     assignSource(row(alignment2, 1), ref);
 
     int result1 = bandedChainAlignment(alignment1, seedChain, scoringScheme, 14);
     std::cout << "Score: " << result1 << std::endl;
     std::cout << alignment1 << std::endl;
 
     int result2 = bandedChainAlignment(alignment2, seedChain, scoringScheme, 13);
     std::cout << "Score: " << result2 << std::endl;
     std::cout << alignment2 << std::endl;
"""
<-- thread -->
<-- date -->
  • seqan-dev - May 2015 - Archives indexes sorted by:
    [ thread ] [ subject ] [ author ] [ date ]
  • Complete archive of the seqan-dev mailing list
  • More info on this list...

Hilfe

  • FAQ
  • Dienstbeschreibung
  • ZEDAT Beratung
  • postmaster@lists.fu-berlin.de

Service-Navigation

  • Startseite
  • Listenauswahl

Einrichtung Mailingliste

  • ZEDAT-Portal
  • Mailinglisten Portal