Hi Hannes,
Thanks so much for the answers. Some comments below.
If this is a different error from the one below, it is unexpected. Can
you open an issue for this in the seqan bug tracker with a link to the
exact file used? Please note that the requirements for free disk space
for skew are very high (see below).
The error looked different, just a segfault without a trace. But of
course it must have been the disk space as you explain.
Indeed the requirements for disk space are quite high for skew. As
described in the help-page, I have measured 30x. So if your file is 8GB
and say 6GB of this is sequence data, then the external space
requirement might well be 180GB...
You might want to try the quicksort or quicksortbuckets algorithms.
They don't require external disk space and if you have 128GB of RAM,
this should be enough to build the index for your 8GB file.
Alright, I'll try that, thanks for the tip. But at the end of the day I
would like to index the current UniRef100. Following your estimates I
would need something like 600GB of disk space for that... I might
still be able to try it, but surely in a few months from now UniRef100
will have a size that will be impossible to deal with. It's great that
you guys are already working on new algos for indexing :)
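For what it's worth, here is a rough sketch of the arithmetic behind these numbers. It assumes the 30x factor measured for skew scales linearly with the amount of sequence data; that linearity and the function names are assumptions for illustration only, not anything LAMBDA provides.

```python
# Rough estimate only: the 30x factor is the measurement quoted above,
# not a guarantee. Function names are made up for this sketch.
SKEW_DISK_FACTOR = 30


def skew_disk_gb(sequence_gb, factor=SKEW_DISK_FACTOR):
    """External disk space (GB) that skew may need for the given sequence data."""
    return sequence_gb * factor


def max_sequence_gb(free_disk_gb, factor=SKEW_DISK_FACTOR):
    """Largest amount of sequence data (GB) that fits into the given free disk."""
    return free_disk_gb / factor


if __name__ == "__main__":
    # The 8GB file with roughly 6GB of actual sequence data.
    print(skew_disk_gb(6))
    # Amount of sequence data that a 600GB estimate corresponds to.
    print(max_sequence_gb(600))
```

Read the other way round, the second function tells how much sequence data a given amount of free disk could handle with skew before switching to quicksort or quicksortbuckets becomes necessary.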
On an unrelated note, I also tried out the pre-indexed nr files you
guys distribute from your website. There I get this:
./bin/lambda -q query.fasta -d nr/nr.fasta -p blastp
LAMBDA - the Local Aligner for Massive Biological DatA
======================================================
Version 0.4.7
Loading Subj Sequences… done.
Loading Subj Ids…/home/mi/h4nn3s/takifugu/seqan-lambda-v0.4.7/core/include/seqan/basic/basic_exception.h:345
FAILED! (Uncaught exception of type std::bad_alloc: std::bad_alloc)
stack trace:
0 [0xa97a0e] seqan::ClassTest::fail() + 0xe
1 [0x8fd5a2] ./bin/lambda()
2 [0x1510ed6] __cxxabiv1::__terminate(void (*)()) + 0x6
3 [0x1510f03] ./bin/lambda()
4 [0x151131e] ./bin/lambda()
5 [0x151121d] operator new(unsigned long) + 0x7d
6 [0xfb8dd2] void seqan::AssignString_<seqan::Tag<seqan::TagExact_> >::assign_<seqan::String<char, seqan::Alloc<void> >, seqan::String<char, seqan::External<seqan::ExternalConfigLarge<seqan::File<seqan::Async<void> >, 4194304u, 2u> > > const>(seqan::String<char, seqan::Alloc<void> >&, seqan::String<char, seqan::External<seqan::ExternalConfigLarge<seqan::File<seqan::Async<void> >, 4194304u, 2u> > > const&) + 0x192
7 [0x919266] ./bin/lambda()
8 [0xfe74d0] int loadSubjects<(seqan::BlastFormatFile)8, (seqan::BlastFormatProgram)1, (seqan::BlastFormatGeneration)0, seqan::SimpleType<unsigned char, seqan::ReducedAminoAcid_<seqan::Tag<seqan::Murphy10_> > >, seqan::Score<int, seqan::ScoreMatrix<seqan::SimpleType<unsigned char, seqan::AminoAcid_>, seqan::Blosum62_> >, seqan::FMIndex<void, seqan::FMIndexConfig<void> > >(GlobalDataHolder<seqan::SimpleType<unsigned char, seqan::ReducedAminoAcid_<seqan::Tag<seqan::Murphy10_> > >, seqan::Score<int, seqan::ScoreMatrix<seqan::SimpleType<unsigned char, seqan::AminoAcid_>, seqan::Blosum62_> >, seqan::FMIndex<void, seqan::FMIndexConfig<void> >, (seqan::BlastFormatFile)8, (seqan::BlastFormatProgram)1, (seqan::BlastFormatGeneration)0>&, LambdaOptions const&) + 0x230
9 [0x14e15f5] int argConv2<(seqan::BlastFormatFile)8, (seqan::BlastFormatProgram)1, (seqan::BlastFormatGeneration)0, seqan::SimpleType<unsigned char, seqan::ReducedAminoAcid_<seqan::Tag<seqan::Murphy10_> > > >(LambdaOptions const&, seqan::Tag<seqan::BlastFormat_<(seqan::BlastFormatFile)8, (seqan::BlastFormatProgram)1, (seqan::BlastFormatGeneration)0> > const&, seqan::SimpleType<unsigned char, seqan::ReducedAminoAcid_<seqan::Tag<seqan::Murphy10_> > > const&) + 0x335
10 [0x15078ec] argConv0(LambdaOptions const&) + 0x6c
11 [0x8e5cb7] main + 0x3e7
12 [0x7fea204d0a40] __libc_start_main + 0xf0
13 [0x8e6299] ./bin/lambda()
Aborted (core dumped)
I am assuming that the nr db is protein and that I can do protein
queries (blastp) against it, is that right?
Thanks for all the help!
Jose
_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev