FU Logo
  • Startseite
  • Kontakt
  • Impressum
  • Home
  • Listenauswahl
  • Anleitungen

[Seqan-dev] Fwd: Building QGramSA fibre from StringSet

<-- thread -->
<-- date -->
  • From: Daniel James <danielpeterjames@gmail.com>
  • To: seqan-dev@lists.fu-berlin.de
  • Date: Sun, 10 Jul 2011 16:50:30 +0100
  • Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Subject: [Seqan-dev] Fwd: Building QGramSA fibre from StringSet

Hi

I'm getting an error whilst trying to get the SA fibre of a QGram
index over a StringSet

There's a minimal example below. Is there any chance someone could
have a look at this?

I'm on Rev: 9996 of trunk.

Daniel


#include <algorithm>
#include <seqan.h>

using namespace seqan;

// Generates random nucleotides.
struct MyGenerator : std::unary_function<char, void>
{
    std::string syms;
    MyGenerator (std::string syms = "ACGT") : syms(syms) { srand(time(NULL)); }
   char operator()(void) { return syms[rand() % syms.size()]; }
};

int main(int argc, char** argv)
{
   typedef StringSet<DnaString>                                  TMyStringSet;
   typedef Index<TMyStringSet, IndexQGram<UngappedShape<8> > >   TMyIndex;

   StringSet<DnaString> myStringSet;

   for (unsigned i = 0; i < 100; ++i) {
       DnaString input;
       resize(input, 60);
       generate_n(begin(input), 60, MyGenerator());
       appendValue(myStringSet, input);
   }

   std::cout << myStringSet[0] << std::endl;
   std::cout << "requiring QGramSA..." << std::endl;
   TMyIndex index(myStringSet);
   indexRequire(index, QGramSA());

   return 0;
}




---------- Forwarded message ----------
From: Daniel James <danielpeterjames@gmail.com>
Date: 5 July 2011 15:18
Subject: Building QGramSA fibre from StringSet
To: seqan-dev@lists.fu-berlin.de


Hi

I'm running into some unexpected behaviour whilst try to use an SA
fibre of a QGram index that's built over a string set.

The below code runs OK with 1 or 10 as command line args, (strings in
string set), but fails at 100 with the following exception:

/Users/dj5/usr/local/include/seqan/sequence/string_base.h:238
Assertion failed : static_cast<TStringPos>(pos) <
static_cast<TStringPos>(length(me)) was: 171599218 >= 100 (Trying to
access an element behind the last one!)

Have I made a coding blunder or is this a bug?


Many thanks,

Daniel



<code>

#include <ctime>
#include <cstdlib>

#include <iostream>
#include <sstream>

#include <string>

#include <algorithm>

#include <seqan.h>

using namespace std;
using namespace seqan;

// Generates random nucleotides.
struct MyGenerator : public unary_function<char, void>
{
  string syms;
  MyGenerator (string syms = "ACGT") : syms(syms) { srand(time(NULL)); }
  char operator()(void) { return syms[rand() % syms.size()]; }
};

int main(int argc, char** argv)
{
  // Input the number of sequences for the string set.
  stringstream ss(argv[1]);
  unsigned n;
  ss >> n;

  typedef StringSet<DnaString>                                  TMyStringSet;
  typedef Index<TMyStringSet, IndexQGram<UngappedShape<8> > >   TMyIndex;
  typedef Fibre<TMyIndex, QGramSA>::Type                        TMySAFibre;
  typedef Fibre<TMyIndex, QGramDir>::Type                       TMyDirFibre;

  // Fill a string set with 60-mer DNA sequences.
  StringSet<DnaString> myStringSet;
  string input_s;
  DnaString input;
  for (unsigned i = 0; i < n; ++i)
  {
      input_s.resize(60);
      generate_n(input_s.begin(), 60, MyGenerator());
      input = input_s;
      appendValue(myStringSet, input);
  }


  // Build the index.
  TMyIndex index(myStringSet);

  // Require the QGramSA fibre.
  cout << "requiring SA fibre:\n";
  float t0 = clock();
  indexRequire(index, QGramSA());
  cout << (clock() - t0)/CLOCKS_PER_SEC << endl;

  cout << "requiring Dir fibre:\n";
  t0 = clock();
  indexRequire(index, QGramDir());
  cout << (clock() - t0)/CLOCKS_PER_SEC << endl;

  TMySAFibre mySAFibre = getFibre(index, QGramSA());
  cout << "QGramSA length: " << length(mySAFibre) << endl;

  TMyDirFibre myDirFibre = getFibre(index, QGramDir());
  cout << "QGramDir length: " << length(myDirFibre) << endl;

  return 0;
}

</code>



<-- thread -->
<-- date -->
  • References:
    • [Seqan-dev] Building QGramSA fibre from StringSet
      • From: Daniel James <danielpeterjames@gmail.com>
  • seqan-dev - July 2011 - Archives indexes sorted by:
    [ thread ] [ subject ] [ author ] [ date ]
  • Complete archive of the seqan-dev mailing list
  • More info on this list...

Hilfe

  • FAQ
  • Dienstbeschreibung
  • ZEDAT Beratung
  • postmaster@lists.fu-berlin.de

Service-Navigation

  • Startseite
  • Listenauswahl

Einrichtung Mailingliste

  • ZEDAT-Portal
  • Mailinglisten Portal