[Seqan-dev] Performance advice for whole genome ESA
- From: John Reid <j.reid@mail.cryst.bbk.ac.uk>
- To: SeqAn Development <seqan-dev@lists.fu-berlin.de>
- Date: Thu, 21 Jun 2012 16:33:26 +0100
- Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
- Subject: [Seqan-dev] Performance advice for whole genome ESA
Hi,

I'm reading the whole mouse genome into a seqan::IndexEsa built on a seqan::StringSet. At the moment the genome (2,730,871,774 bp) is stored in a single uncompressed FASTA file on disk. Once the genome is loaded, I iterate over it many times, looking at all words shorter than about 20 bp.

I'm wondering if there is a better way to go about this. Should I be looking at memory-mapped files and/or compression on disk? Any pointers or advice would be welcome.

Thanks,
John.
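To make the compression side of the question concrete, here is a rough sketch of 2-bit packing for DNA, which stores four bases per byte. It is plain standard C++ and not SeqAn's API: `PackedDna`, `encodeBase`, `decodeBase` and `packedBytes` are hypothetical names made up for illustration, and mapping non-ACGT characters such as N to A is a simplifying assumption rather than a recommendation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Map a nucleotide to a 2-bit code (A=0, C=1, G=2, T=3).
// Assumption: any other character (e.g. N) is stored as A; real code
// would need a separate mask to remember ambiguous positions.
inline std::uint8_t encodeBase(char c) {
    switch (c) {
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default:            return 0;
    }
}

// Inverse of encodeBase for the four unambiguous bases.
inline char decodeBase(std::uint8_t code) {
    static const char alphabet[] = {'A', 'C', 'G', 'T'};
    return alphabet[code & 3u];
}

// Hypothetical container holding four bases per byte.
class PackedDna {
public:
    explicit PackedDna(const std::string& seq)
        : length_(seq.size()), data_((seq.size() + 3) / 4, 0) {
        for (std::size_t i = 0; i < seq.size(); ++i) {
            // Base i lives in byte i/4, at bit offset 2*(i%4).
            data_[i / 4] |= static_cast<std::uint8_t>(
                encodeBase(seq[i]) << (2 * (i % 4)));
        }
    }

    // Decode the base at position i.
    char at(std::size_t i) const {
        assert(i < length_);
        return decodeBase(
            static_cast<std::uint8_t>(data_[i / 4] >> (2 * (i % 4))));
    }

    std::size_t size() const { return length_; }   // number of bases
    std::size_t bytes() const { return data_.size(); }  // packed storage

private:
    std::size_t length_;
    std::vector<std::uint8_t> data_;
};

// Bytes needed to hold n bases at 2 bits per base (rounded up).
inline std::uint64_t packedBytes(std::uint64_t n) { return (n + 3) / 4; }
```

Under these assumptions, `packedBytes(2730871774ULL)` gives 682,717,944 bytes for the sequence text, against roughly 2.7 GB at one byte per base. This only covers the text itself; the suffix array and the other ESA tables are separate and are not reduced by packing the sequence.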