Hi Matthias,

RazerS keeps a q-gram index of the reads in memory, so its memory consumption is directly proportional to the input size; it needs roughly 10 GB for 10M x 100bp reads. Unfortunately, there is currently no other option than to split the input files into chunks and map them independently, either one after another or in parallel on a cluster. BAM output will certainly be supported in the near future, and gzipped FASTQ input could be supported as well, but that first requires benchmarking the alternative I/O module.

Cheers,
Dave

--
David Weese                 weese@inf.fu-berlin.de
Freie Universität Berlin    http://www.inf.fu-berlin.de/
Institut für Informatik     Phone: +49 30 838 75137
Takustraße 9                Algorithmic Bioinformatics
14195 Berlin                Room 020

On 05.06.2013, at 11:44, Matthias Lienhard <lienhard@molgen.mpg.de> wrote:

> Hi,
> when running razers3 on my paired-end HiSeq fastq files I get the following errors:
>
>   razers3 -i 94 -rr 95 -tc 20 -o sample.sam reads1.fastq reads2.fastq
>   terminate called recursively
>   terminate called recursively
>   Aborted
>
> or
>
>   terminate called after throwing an instance of 'std::bad_alloc'
>   what():  std::bad_alloc
>   Aborted
>
> Memory usage seems to be very high (>50 GB). Each of the fastq files is about 7 GB. When I take only the first 100000 reads, razers3 seems to work fine. However, I don't want to split the files into small chunks and merge them together afterwards (because of disk usage and convenience - I have about 50 samples to process).
> Is there another way to handle this issue?
>
> Also, it would be very convenient if gzipped fastq files could be used as input directly - and output in BAM format would be nice as well.
>
> Best wishes,
> Matthias
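
For reference, a minimal shell sketch of the split-and-map workaround described above. It assumes GNU split, that both FASTQ files list the mates in the same order, and that records are not line-wrapped; the chunk size and file names are placeholders, and the razers3 arguments are simply copied from Matthias's command.

    #!/bin/sh
    # Split both mate files into chunks of 1M reads (4M FASTQ lines) each.
    # Pairing stays intact because both files are cut at the same record
    # boundaries. The chunk size is an arbitrary assumption.
    CHUNK_LINES=4000000
    split -l $CHUNK_LINES -d reads1.fastq chunk_R1_
    split -l $CHUNK_LINES -d reads2.fastq chunk_R2_

    # Map each chunk separately (flags taken from the original command);
    # these loop iterations could also run as independent cluster jobs.
    for r1 in chunk_R1_*; do
        i=${r1#chunk_R1_}
        razers3 -i 94 -rr 95 -tc 20 -o sample.$i.sam "$r1" "chunk_R2_$i"
    done

    # Merge the per-chunk SAM files: keep the header of the first chunk,
    # then append only the alignment lines (non-'@' lines) of the rest.
    cp sample.00.sam sample.sam
    for f in sample.*.sam; do
        [ "$f" = sample.00.sam ] && continue
        grep -v '^@' "$f" >> sample.sam
    done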