[Seqan-dev] Performance advice for whole genome ESA

  • From: John Reid <j.reid@mail.cryst.bbk.ac.uk>
  • To: SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Date: Thu, 21 Jun 2012 16:33:26 +0100
  • Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Subject: [Seqan-dev] Performance advice for whole genome ESA

Hi,

I'm reading the whole mouse genome into a seqan::IndexEsa built on a
seqan::StringSet. At the moment the genome (2,730,871,774 bp) is
stored in a single uncompressed FASTA file on disk. Once the genome is
loaded, I iterate over it many times, looking at all words shorter than
about 20 bp. I'm wondering whether there is a better way to go about
this. Should I be looking at memory-mapped files and/or compression on
disk? Any pointers or advice would be welcome.
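
To make the memory-mapping part of the question concrete, here is a minimal sketch, assuming a POSIX system (open, fstat, mmap). It is plain C++ rather than SeqAn code, and the function name count_bases_mmap is hypothetical. It maps a FASTA file read-only and counts the sequence characters, so the kernel pages the file in on demand instead of the program copying it into a buffer first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Count sequence characters in a FASTA file through a read-only memory map.
// Header lines (starting with '>') and line breaks are skipped.
// Returns -1 if the file cannot be opened, inspected, or mapped.
// Hypothetical helper for illustration; not part of SeqAn.
long long count_bases_mmap(const char* path) {
    assert(path != nullptr);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        std::perror("open");
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) != 0) {
        std::perror("fstat");
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {  // mmap rejects a zero-length mapping
        close(fd);
        return 0;
    }

    const std::size_t length = static_cast<std::size_t>(st.st_size);
    void* mapping = mmap(nullptr, length, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (mapping == MAP_FAILED) {
        std::perror("mmap");
        return -1;
    }

    const char* data = static_cast<const char*>(mapping);
    long long bases = 0;
    bool line_start = true;
    bool in_header = false;
    for (std::size_t i = 0; i < length; ++i) {
        const char c = data[i];
        if (c == '\n') {  // end of line: the next character starts a new line
            line_start = true;
            in_header = false;
            continue;
        }
        if (line_start && c == '>') {
            in_header = true;  // FASTA header line, not sequence data
        }
        line_start = false;
        if (!in_header && c != '\r') {
            ++bases;
        }
    }

    munmap(mapping, length);
    return bases;
}
```

Calling count_bases_mmap on the genome file would return the number of sequence characters, or -1 on failure. Whether this helps with the index itself is a separate matter: the sketch only covers reading the raw text, not building or storing the suffix array.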

Thanks,
John.



  • Follow-Ups:
    • Re: [Seqan-dev] Performance advice for whole genome ESA
      • From: John Reid <j.reid@mail.cryst.bbk.ac.uk>