Re: [Seqan-dev] {Disarmed} Re: Performance advice for whole genome ESA

From: "Siragusa, Enrico" <Enrico.Siragusa@fu-berlin.de>
To: SeqAn Development <seqan-dev@lists.fu-berlin.de>
Date: Thu, 5 Jul 2012 08:27:29 +0000
Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
Subject: Re: [Seqan-dev] {Disarmed} Re: Performance advice for whole genome ESA

Hi,

I'm reading the whole mouse genome into a seqan::IndexEsa based on a
seqan::StringSet. At the moment the genome (2,730,871,774 bp) is
stored in a single uncompressed FASTA file on disk. Once the genome is
loaded, I iterate over it many times, looking at all words shorter than
about 20 bp. I'm wondering whether there is a better way to go about
this. Should I be looking at memory-mapped files and/or compression on
disk? Any pointers or advice would be welcome.

Thanks,
John.

_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev