I am having a lot of trouble using lambda_indexer to index full
UniRef100 fasta files. I followed the instructions on the Lambda
website:

% /path/to/segmasker -infmt fasta -in db.fasta -outfmt interval -out db.seg
% bin/lambda_indexer -d db.fasta -s db.seg

I first tried the current UniRef100 release (2015_05), which is huge
(26GB uncompressed fasta file), and ran into the memory problems that
are documented in lambda_indexer's help. So I ended up using
"-a skew7ext" and ran it on the largest-memory system I had available
(128GB). The program ran, but at some point after "Generating Index..."
it died with a segfault and no other information.

Then I decided to try a smaller UniRef, so I took an older version
(2012_06, only 8GB uncompressed fasta file). I ran it again with
"-a skew7ext" and this time it got further, but it eventually died too:

Dumping unreduced Subj Sequences… done.
Generating Index…Asynchronous I/O operation failed (waitFor): "Success"
fildes: 5
buffer: 7fcb59d7f000
offset: 1ee3e0000
nbytes: 20000
event: 1
Raddr: 0x2566998
/home/mi/h4nn3s/takifugu/seqan-lambda-v0.4.7/core/include/seqan/file/file_page.h:752 FAILED!
(WRITING operation could not be completed: "Success")

stack trace:
0 [0x5263be] seqan::ClassTest::fail() + 0xe
1 [0x4dfeb5] ./bin/lambda_indexer()
2 [0x5b1471] seqan::PageChain<seqan::Buffer<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::PageFrame<seqan::File<seqan::Async<void> >, seqan::Dynamic> > >::getReadyPage() + 0x171
3 [0x5b2b09] seqan::Handler<seqan::Pool<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::MapperSpec<seqan::MapperConfigSize<seqan::_skew7NMapSliced<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long>, unsigned long, seqan::File<seqan::Async<void> > > > >, seqan::MapperAsyncWriter>::_writeBucket(seqan::PageBucket<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> > >&, unsigned int) + 0xa9
4 [0x6a4f97] bool seqan::Pipe<seqan::Pipe<seqan::Pool<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::MapperSpec<seqan::MapperConfigSize<seqan::filterI1<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long>, unsigned long, seqan::File<seqan::Async<void> > > > >, seqan::Filter<seqan::filterI2<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long> > >, seqan::Skew7>::process<seqan::Pipe<seqan::Pool<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::MapperSpec<seqan::MapperConfigSize<seqan::filterI1<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long>, unsigned long, seqan::File<seqan::Async<void> > > > >, seqan::Filter<seqan::filterI2<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long> > > >(seqan::Pipe<seqan::Pool<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::MapperSpec<seqan::MapperConfigSize<seqan::filterI1<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long>, unsigned long, seqan::File<seqan::Async<void> > > > >, seqan::Filter<seqan::filterI2<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long> > >&) + 0x1be7
5 [0x6a6811] seqan::Pipe<seqan::Pipe<seqan::Pool<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::MapperSpec<seqan::MapperConfigSize<seqan::filterI1<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long>, unsigned long, seqan::File<seqan::Async<void> > > > >, seqan::Filter<seqan::filterI2<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long> > >, seqan::Skew7>::Pipe(seqan::Pipe<seqan::Pool<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::MapperSpec<seqan::MapperConfigSize<seqan::filterI1<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long>, unsigned long, seqan::File<seqan::Async<void> > > > >, seqan::Filter<seqan::filterI2<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long> > >&) + 0x971

I'm pretty sure I have enough space available on the disk where I'm
running it (>100GB).

Is there anything obvious that I am doing wrong? Do you have any
experience indexing files this large?

Apologies in advance if this message does not belong on the dev list;
I couldn't find a more appropriate place to post it. Before submitting
an issue on GitHub, I'd like to confirm that this is a bug rather than
me misusing the software.

Any pointers are very much appreciated.

Thanks,
Jose