[Seqan-dev] Memory problem with Razers3
Dear Seqan development team,
We are trying to run Razers3 to produce some benchmark datasets for Rabema. We used the command shown in Rabema's user manual: "razers3 -tc 12 -v -rr 100 -i 92 -m 1000000 -ds". It works fine for human, chimp and other genomes, and finishes within a reasonable amount of time and resources. But for zebrafish, a relatively medium-sized genome, it took much more memory than we expected and failed even when we set the memory to 1 TB. We only used a set of 1M simulated reads generated with ART.
Is it possible that razers3 tries to align each single read to multiple positions in the genome, and so increases the memory load in order to hold all of those alignments? I set the number of best hits with "--max-hits 1000000" (the "-m 1000000" above), which means "Collect up to 1M alignments per read", even though a single read should not have remotely that many SAM records. This number should be impossible to reach, since each read is only about 70 bp long and the number of perfect matches along the genome wouldn't be that many. But for zebrafish (a highly repetitive genome) and a set of simulated reads, could the problem be related to this setting? I have attached the error message "zebrafish_70.err-1T"; please have a look and see whether you could suggest a potential solution.
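To make the question about the hit limit concrete, here is a rough sketch of the arithmetic behind it. The read count and peak memory are taken from the attached log; the 32 bytes per stored match is a made-up placeholder, not a measured Razers3 value, and the helper names are only for illustration.

```python
# Back-of-envelope check of the "-m" hypothesis.
N_READS = 1_001_445        # "reads loaded", from the attached razers3 log
MAX_HITS = 1_000_000       # value passed to -m / --max-hits
BYTES_PER_RECORD = 32      # ASSUMPTION: hypothetical size of one stored match
PEAK_RSS_KB = 893_420_816  # "Maximum resident set size (kbytes)" from the log


def worst_case_bytes(n_reads, max_hits, bytes_per_record):
    """Memory needed if every read actually reached the hit limit."""
    return n_reads * max_hits * bytes_per_record


def records_per_read_at(rss_kb, n_reads, bytes_per_record):
    """Average matches per read that would fill rss_kb, if matches were
    the only thing held in memory (they are not, so this is an upper bound)."""
    return rss_kb * 1024 // (n_reads * bytes_per_record)


if __name__ == "__main__":
    worst = worst_case_bytes(N_READS, MAX_HITS, BYTES_PER_RECORD)
    avg = records_per_read_at(PEAK_RSS_KB, N_READS, BYTES_PER_RECORD)
    print(f"worst case with -m {MAX_HITS}: {worst / 1e12:.1f} TB")
    print(f"matches per read that would fill the observed peak: {avg}")
```

Under that assumed record size, the limit alone would permit tens of terabytes in the worst case, so it does not by itself keep the run below 1 TB. Whether the reads really collect that many matches is exactly what we cannot tell from the log.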
Also, could you please let me know whether a set of paired-end reads can be used in Rabema's "Using Rabema In Normal Mode" workflow? We have 2M paired-end reads simulated using wgsim with an insert length of about 500 bp. We submitted them to Razers3 with a similar set of parameters, but the run failed again. I have also attached that error file, named "zebrafish_PE_70_wgsim.err".
Thanks. We look forward to your reply.
Regards,
Sean
___SETTINGS____________
Genome file: /flush/li11k/Razers3/genomes/zebrafish.fa
Read file: /flush/li11k/Razers3/reads/zebrafish.art.70.1M.fq
Compute forward matches: YES
Compute reverse matches: YES
Allow Indels: YES
Error rate: 0.08
Pigeonhole mode with overlap: 0
Shape: 11111111111
Repeat threshold: 1000
Overabundance threshold: 1
Program PID: 252031
69799170 bps of 1001445 reads loaded.
Loading reads took 2.29319 seconds
Number of threads: 128
Initialization took 0.406083 seconds
Process genome seq #0[fwd]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M.........37M.........38M.........39M.........40M.........41M.........42M.........43M.........44M.........45M.........46M.........47M.........48M.........49M.........50M.........51M.........52M.........53M.........54M.........55M.........56M...
Process genome seq #0[rev]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M.........37M.........38M.........39M.........40M.........41M.........42M.........43M.........44M.........45M.........46M.........47M.........48M.........49M.........50M.........51M.........52M.........53M.........54M.........55M.........56M...
Process genome seq #1[fwd]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M.........37M.........38M.........39M.........40M.........41M.........42M.........43M.........44M.........45M.........46M.........47M.........48M.........49M.........50M.........51M.........52M.........53M.........54M....
Process genome seq #1[rev]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M.........37M.........38M.........39M.........40M.........41M.........42M.........43M.........44M.........45M.........46M.........47M.........48M.........49M.........50M.........51M.........52M.........53M.........54M....
Process genome seq #2[fwd]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M.........37M.........38M.........39M.........40M.........41M.........42M.........43M.........44M.........45M.........46M.........47M.........48M.........49M.........50M.........51M.........52M.........53M.........54M.........55M.........56M.........57M.........58M.........59M.........60M.........61M.........62M.........63M
Process genome seq #2[rev]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M.........37M.........38M.........39M.........40M.........41M.........42M.........43M.........44M.........45M.........46M.........47M.........48M.........49M.........50M.........51M.........52M.........53M.........54M.........55M.........56M.........57M.........58M.........59M.........60M.........61M.........62M.........63M
Process genome seq #3[fwd]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M.........37M.........38M.........39M.........40M.........41M.........42M.......
Process genome seq #3[rev]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M.........37M.........38M.........39M.........40M.........41M.........42M.......
Process genome seq #4[fwd]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M.........37M.........38M.........39M.........40M.........41M.........42M.........43M.........44M.........45M.........46M.........47M.........48M.........49M.........50M.........51M.........52M.........53M.........54M.........55M.........56M.........57M.........58M.........59M.........60M.........61M.........62M.........63M.........64M.........65M.........66M.........67M.........68M.........69M.........70M....
Process genome seq #4[rev]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M.........37M.........38M.........39M.........40M.........41M.........42M.........43M.........44M.........45M.........46M.........47M.........48M.........49M.........50M.........51M.........52M.........53M.........54M.........55M.........56M.........57M.........58M.........59M.........60M.........61M.........62M.........63M.........64M.........65M.........66M.........67M.........68M.........69M.........70M....
Process genome seq #5[fwd]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M.........37M.........38M.........39M.........40M.........41M.........42M.........43M.........44M.........45M.........46M.........47M.........48M.........49M.........50M.........51M.........52M.........53M.........54M.........55M.........56M.........57M.........58M.........59M...
Process genome seq #5[rev]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M.........37M.........38M.........39M.........40M.........41M.........42M.........43M.........44M.........45M.........46M.........47M.........48M.........49M.........50M.........51M.........52M.........53M.........54M.........55M.........56M.........57M.........58M.........59M...
Process genome seq #6[fwd]........1M.........2M.........3M.........4M.........5M.........6M.........7M.........8M.........9M.........10M.........11M.........12M.........13M.........14M.........15M.........16M.........17M.........18M.........19M.........20M.........21M.........22M.........23M.........24M.........25M.........26M.........27M.........28M.........29M.........30M.........31M.........32M.........33M.........34M.........35M.........36M....../home/asc/hli001/flush/seqan/seqan-trunk/core/include/seqan/basic/basic_exception.h:236 FAILED! (Uncaught exception of type St9bad_alloc: std::bad_alloc)
stack trace:
0 [0x5e947e] seqan::ClassTest::fail() + 0xe
1 [0x599009] /apps/seqan/1.1b/bin/razers3()
2 [0x86cb56] __cxxabiv1::__terminate(void (*)()) + 0x6
3 [0x8bede9] /apps/seqan/1.1b/bin/razers3()
4 [0x86ca5a] __gxx_personality_v0 + 0x52a
5 [0x8c41c3] /apps/seqan/1.1b/bin/razers3()
6 [0x8c481a] /apps/seqan/1.1b/bin/razers3()
7 [0x86cc81] __cxa_throw + 0x51
8 [0x86cf4d] operator new(unsigned long) + 0x7d
9 [0x5997fa] /apps/seqan/1.1b/bin/razers3()
10 [0x7c2b36] void seqan::writeBackToLocal<seqan::String<seqan::MatchRecord<unsigned long>, seqan::Alloc<void> >, seqan::FragmentStore<MyFragStoreConfig, seqan::FragmentStoreConfig<MyFragStoreConfig> >, seqan::Finder<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5Q_>, seqan::Alloc<void> >, seqan::Pigeonhole<void> >, seqan::Pattern<seqan::Index<seqan::StringSet<seqan::Segment<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5Q_>, seqan::Alloc<void> >, seqan::InfixSegment> const, seqan::Owner<seqan::Tag<seqan::Default_> > >, seqan::IndexQGram<seqan::Shape<seqan::SimpleType<unsigned char, seqan::Dna_>, seqan::OneGappedShape>, seqan::Tag<seqan::OpenAddressing_> > >, seqan::Pigeonhole<void> >, seqan::Shape<seqan::SimpleType<unsigned char, seqan::Dna_>, seqan::OneGappedShape>, seqan::RazerSOptions<seqan::RazerSSpec<false, false> >, seqan::String<seqan::String<unsigned short, seqan::Alloc<void> >, seqan::Alloc<void> >, seqan::RazerSMode<seqan::RazerSGlobal, seqan::RazerSGapped, seqan::RazerSErrors, seqan::NMatchesNone_> >(seqan::ThreadLocalStorage<seqan::MapSingleReads<seqan::String<seqan::MatchRecord<unsigned long>, seqan::Alloc<void> >, seqan::FragmentStore<MyFragStoreConfig, seqan::FragmentStoreConfig<MyFragStoreConfig> >, seqan::Finder<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5Q_>, seqan::Alloc<void> >, seqan::Pigeonhole<void> >, seqan::Pattern<seqan::Index<seqan::StringSet<seqan::Segment<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5Q_>, seqan::Alloc<void> >, seqan::InfixSegment> const, seqan::Owner<seqan::Tag<seqan::Default_> > >, seqan::IndexQGram<seqan::Shape<seqan::SimpleType<unsigned char, seqan::Dna_>, seqan::OneGappedShape>, seqan::Tag<seqan::OpenAddressing_> > >, seqan::Pigeonhole<void> >, seqan::Shape<seqan::SimpleType<unsigned char, seqan::Dna_>, seqan::OneGappedShape>, seqan::RazerSOptions<seqan::RazerSSpec<false, false> >, seqan::String<seqan::String<unsigned short, seqan::Alloc<void> >, seqan::Alloc<void> >, 
seqan::RazerSMode<seqan::RazerSGlobal, seqan::RazerSGapped, seqan::RazerSErrors, seqan::NMatchesNone_> > >&, seqan::String<seqan::SingleVerificationResult<seqan::String<seqan::MatchRecord<unsigned long>, seqan::Alloc<void> > >, seqan::Alloc<void> >&, bool) + 0x246
11 [0x5ce9b9] /apps/seqan/1.1b/bin/razers3()
12 [0x7ffff7546bba] /apps/gcc/4.8.1/lib64/libgomp.so.1(+0x8bba)
13 [0x7ffff73287f6] /lib64/libpthread.so.0(+0x77f6)
14 [0x7ffff7083f8d] clone + 0x6d
Command terminated by signal 6
Command being timed: "/apps/seqan/1.1b/bin/razers3 -tc 128 -v -rr 100 -i 92 -m 1000000 -ds -o /flush/li11k/Razers3/reads/gold_pre.zebrafish.70.1M.sam /flush/li11k/Razers3/genomes/zebrafish.fa /flush/li11k/Razers3/reads/zebrafish.art.70.1M.fq"
User time (seconds): 1885731.17
System time (seconds): 95033.81
Percent of CPU this job got: 6655%
Elapsed (wall clock) time (h:mm:ss or m:ss): 8:15:59
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 893420816
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 12
Minor (reclaiming a frame) page faults: 755025775
Voluntary context switches: 97743709
Involuntary context switches: 87673555
Swaps: 0
File system inputs: 3926056
File system outputs: 11792
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
___SETTINGS____________
Genome file: /flush/li11k/GPUS/datasets/genomes/zebrafish.fa
Read files: /flush/li11k/GPUS/datasets/synthetic_data/wgsim/zebrafish/70_PE/zebrafish.1M.1.fq
/flush/li11k/GPUS/datasets/synthetic_data/wgsim/zebrafish/70_PE/zebrafish.1M.2.fq
Compute forward matches: YES
Compute reverse matches: YES
Allow Indels: YES
Error rate: 0.08
Pigeonhole mode with overlap: 0
Shape: 11111111111
Repeat threshold: 1000
Overabundance threshold: 1
Program PID: 7585
138774160 bps of 1999998 reads loaded.
Loading reads took 7.09714 seconds
Number of threads: 12
Initialization took 0.532618 seconds
Process genome seq #0[fwd]/home/hli001/flush/seqan/seqan-trunk/core/include/seqan/basic/basic_exception.h:/home/hli001/flush/seqan/seqan-trunk/core/include/seqan/basic/basic_exception.h:236 FAILED! (236 FAILED! (Uncaught exception of type St9bad_alloc: std::bad_alloc)
Uncaught exception of type St9bad_alloc: std::bad_alloc)
stack trace:
stack trace:
0 [0x79a14e] seqan::ClassTest::fail() + 0xe
1 [0x745689] razers3()
2 [0xa60026] __cxxabiv1::__terminate(void (*)()) + 0x6
3 [0xa600e9] razers3()
4 [0xa5f641] __gxx_personality_v0 + 0x481
5 [0xabb30b] razers3()
6 [0xabb90c] razers3()
7 [0xa602c1] __cxa_throw + 0x51
8 [0xa5fffd] operator new(unsigned long) + 0x7d
9 [0x744090] razers3()
10 [0x7440f5] razers3()
11 [0x8011a2] seqan::Size<seqan::String<seqan::SwiftHitSemiGlobal_<long>, seqan::Alloc<void> > >::Type seqan::ClearSpaceExpandStringBase_<seqan::Tag<seqan::TagGenerous_> >::_clearSpace_<seqan::String<seqan::SwiftHitSemiGlobal_<long>, seqan::Alloc<void> > >(seqan::String<seqan::SwiftHitSemiGlobal_<long>, seqan::Alloc<void> >&, seqan::Size<seqan::String<seqan::SwiftHitSemiGlobal_<long>, seqan::Alloc<void> > >::Type) + 0x22
12 [0x8df989] void seqan::AssignString_<seqan::Tag<seqan::TagGenerous_> >::assign_<seqan::String<seqan::SwiftHitSemiGlobal_<long>, seqan::Alloc<void> >, seqan::String<seqan::SwiftHitSemiGlobal_<long>, seqan::Alloc<void> > const>(seqan::String<seqan::SwiftHitSemiGlobal_<long>, seqan::Alloc<void> >&, seqan::String<seqan::SwiftHitSemiGlobal_<long>, seqan::Alloc<void> > const&) + 0x39
13 [0x8e126e] seqan::Finder<seqan::Segment<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5Q_>, seqan::Alloc<void> >, seqan::InfixSegment>, seqan::Pigeonhole<void> >::operator=(seqan::Finder<seqan::Segment<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5Q_>, seqan::Alloc<void> >, seqan::InfixSegment>, seqan::Pigeonhole<void> > const&) + 0x5e
14 [0x782d4b] razers3()
15 [0x7f255d8b006a] /apps/gcc/4.7.0/lib64/libgomp.so.1(+0x906a)
16 [0x7f255d6917b6] /lib64/libpthread.so.0(+0x77b6)
17 [0x7f255d3ecc5d] clone + 0x6d
Command terminated by signal 6
Command being timed: "razers3 -tc 12 -v -rr 100 -i 92 -ll 420 -le 80 -m 10000 -ds -o /flush/li11k/GPUS/datasets/synthetic_data/sam_rabema/zebrafish_wgsim/gold_pre.zebrafish.PE.70.1M.sam /flush/li11k/GPUS/datasets/genomes/zebrafish.fa /flush/li11k/GPUS/datasets/synthetic_data/wgsim/zebrafish/70_PE/zebrafish.1M.1.fq /flush/li11k/GPUS/datasets/synthetic_data/wgsim/zebrafish/70_PE/zebrafish.1M.2.fq"
User time (seconds): 16.22
System time (seconds): 3.10
Percent of CPU this job got: 130%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.80
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 6397728
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 3
Minor (reclaiming a frame) page faults: 889194
Voluntary context switches: 43111
Involuntary context switches: 2498
Swaps: 0
File system inputs: 3305768
File system outputs: 2115488
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
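For a quick side-by-side of the two runs, the sketch below pulls the peak memory and wall-clock time out of the GNU time reports above. The two lines per run are copied from the logs; the run labels and helper names are made up for illustration.

```python
import re

# Two lines copied from each "Command being timed" report above.
RUNS = {
    "single-end (ART)": """
        Elapsed (wall clock) time (h:mm:ss or m:ss): 8:15:59
        Maximum resident set size (kbytes): 893420816
    """,
    "paired-end (wgsim)": """
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.80
        Maximum resident set size (kbytes): 6397728
    """,
}


def peak_rss_gib(report):
    """Return the peak resident set size in GiB (1 GiB = 1024**2 kB)."""
    m = re.search(r"Maximum resident set size \(kbytes\):\s*(\d+)", report)
    if m is None:
        raise ValueError("no peak RSS line found")
    return int(m.group(1)) / 1024**2


def elapsed_seconds(report):
    """Convert the h:mm:ss or m:ss wall-clock field to seconds."""
    m = re.search(
        r"Elapsed \(wall clock\) time \(h:mm:ss or m:ss\):\s*(\S+)", report
    )
    if m is None:
        raise ValueError("no elapsed time line found")
    seconds = 0.0
    for part in m.group(1).split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


if __name__ == "__main__":
    for label, report in RUNS.items():
        print(f"{label}: peak {peak_rss_gib(report):.2f} GiB, "
              f"wall clock {elapsed_seconds(report):.1f} s")
```

With the values logged above, the single-end run peaks at roughly 852 GiB after more than eight hours, while the paired-end run stops at roughly 6 GiB after about 15 seconds.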