Hi,

I'm reading the whole mouse genome into a seqan::IndexEsa built over a seqan::StringSet. At the moment the genome (2,730,871,774 bp) is stored in a single uncompressed FASTA file on disk. Once the genome is loaded, I iterate over the index many times, looking at all words up to about 20 bp. A stripped-down sketch of my setup is included below my signature.

I'm wondering whether there is a better way to go about this. Should I be looking at memory-mapped files and/or compression on disk? Any pointers or advice would be welcome.

Thanks, John.
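
P.S. Here is roughly what the loading and traversal look like. This is a simplified sketch, not my actual code: the file name "mouse_genome.fa", the word-length constant maxLen, and the depth-limited top-down traversal are placeholders standing in for what I really run.

#include <seqan/seq_io.h>
#include <seqan/index.h>

using namespace seqan;

int main()
{
    // Load all chromosomes from the uncompressed FASTA file into a StringSet.
    StringSet<CharString> ids;
    StringSet<Dna5String> seqs;
    SeqFileIn seqFileIn("mouse_genome.fa");  // placeholder file name
    readRecords(ids, seqs, seqFileIn);

    // Build the enhanced suffix array index over the whole StringSet.
    typedef Index<StringSet<Dna5String>, IndexEsa<> > TIndex;
    TIndex index(seqs);

    // Depth-limited DFS over the virtual suffix tree: only descend from
    // nodes whose representative is still shorter than maxLen characters.
    unsigned const maxLen = 20;  // placeholder word length
    Iterator<TIndex, TopDown<ParentLinks<> > >::Type it(index);
    do
    {
        // ... look at representative(it) / countOccurrences(it) here ...
        if (!(repLength(it) < maxLen && goDown(it)) && !goRight(it))
            while (goUp(it) && !goRight(it))
                ;
    } while (!isRoot(it));

    return 0;
}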