Re: [Seqan-dev] Performance advice for whole genome ESA

From: &quot;Holtgrewe, Manuel&quot; &lt; manuel.holtgrewe@fu-berlin.de &gt;
To: SeqAn Development &lt; seqan-dev@lists.fu-berlin.de &gt;
Date: Tue, 26 Jun 2012 15:20:54 +0000
Reply-to: SeqAn Development &lt; seqan-dev@lists.fu-berlin.de &gt;
Subject: Re: [Seqan-dev] Performance advice for whole genome ESA

"Holtgrewe, Manuel" <manuel.holtgrewe@fu-berlin.de> · Tue, 26 Jun 2012 15:20:54 +0000

Hi,

I'm reading the whole mouse genome into a seqan::IndexEsa based on a
seqan::StringSet. At the moment I have the genome (2,730,871,774 bp)
stored in one uncompressed fasta file on disk. Once I have the genome
loaded I'm iterating over it many times looking at all the words < about
20bp. I'm wondering if there is a better way to go about this. Should I
be looking at memory mapped files and/or compression on disk? Any
pointers or advice would be welcome.

Thanks,
John.

_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev