Re: [Seqan-dev] RazerS 3

From: "Weese, David" <weese@campus.fu-berlin.de>
To: "seqan-dev@lists.fu-berlin.de" <seqan-dev@lists.fu-berlin.de>
Date: Fri, 17 Aug 2012 17:21:14 +0000
Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
Subject: Re: [Seqan-dev] RazerS 3

Hi Sébastien,

How's it going? It seems the time has come to move RazerS 3 from my sandbox to the public trunk (seqan/extras/). :-) Right now only versions 1 and 2 are there.

The bug in the binary is strange, because here everything seems to be fine:

weese@sequoia:~/sandbox/seqan-trunk/build/mak3$ valgrind ./tmp/razers3.0-beta1/linux64/razers3 --help
==3110== Memcheck, a memory error detector
==3110== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==3110== Command: ./tmp/razers3.0-beta1/linux64/razers3 --help
==3110==
***********************************************************
*** RazerS - Fast Read Mapping with Sensitivity Control ***
***********************************************************

Usage: razers3 [OPTION]... <GENOME FILE> <READS FILE>
       razers3 [OPTION]... <GENOME FILE> <MP-READS FILE1> <MP-READS FILE2>

  -h, --help                    displays this help message
  --write-ctd                   exports the app's interface description to a .ctd file
  -V, --version                 print version information

Main Options:
  -f, --forward                 only compute forward matches
  -r, --reverse                 only compute reverse complement matches
  -i, --percent-identity NUM    set the percent identity threshold (default: 92)
  -rr, --recognition-rate NUM   set the percent recognition rate (default: 99)
  -mr, --mutation-rate NUM      set the percent mutation rate (default: 5)
  -pd, --param-dir DIR          folder containing user-computed parameter files (optional)
  -id, --indels                 allow indels (default: mismatches only)
  -ll, --library-length NUM     mate-pair library length (default: 220)
  -le, --library-error NUM      mate-pair library length tolerance (default: 50)
  -m, --max-hits NUM            output only NUM of the best hits (default: 100)
  --unique                      output only unique best matches (-m 1 -dr 0 -pa)
  -tr, --trim-reads NUM         trim reads to given length (default off)
  -o, --output FILE             change output filename (default <READS FILE>.result)
  -v, --verbose                 verbose mode
  -vv, --vverbose               very verbose mode

Output Format Options:
  -a, --alignment               dump the alignment for each match
  -pa, --purge-ambiguous        purge reads with more than max-hits best matches
  -dr, --distance-range NUM     only consider matches with at most NUM more errors compared to the best (default output all)
  -of, --output-format NUM      set output format (default: 0)
                                0 = Razer format
                                1 = enhanced Fasta format
                                2 = Eland format
                                3 = Gff format
                                4 = Sam format
                                5 = Amos AFG format
  -gn, --genome-naming NUM      select how genomes are named (default: 0)
                                0 = use Fasta id
                                1 = enumerate beginning with 1
  -rn, --read-naming NUM        select how reads are named (default: 0)
                                0 = use Fasta id
                                1 = enumerate beginning with 1
                                2 = use the read sequence (only for short reads!)
                                3 = use the Fasta id, do NOT append '/L' or '/R' for mate pairs
  --full-readid                 use the whole read id (don't clip after whitespace)
  -so, --sort-order NUM         select how matches are sorted (default: 0)
                                0 = 1. read number, 2. genome position
                                1 = 1. genome position, 2. read number
  -pf, --position-format NUM    select begin/end position numbering (default: 0)
                                0 = gap space
                                1 = position space
  -ga, --global-alignment       compute global alignment (in SAM output) (default: 0)

Misc Options:
  -cm, --compact-mult NUM       multiply compaction treshold by this value after reaching and compacting (default: 2.2)
  -ncf, --no-compact-frac NUM   don't compact if in this last fraction of genome (default: 0.05)

Filtration Options:
  -s, --shape BITSTRING         set k-mer shape (default: 11111111111)
  -t, --threshold NUM           set minimum k-mer threshold (0=pigeonhole principle) (default: 1)
  -ol, --overlap-length NUM     set the overlap length of adjacent q-grams (pigeonhole mode) (default: 0)
  -oc, --overabundance-cut NUM  set k-mer overabundance cut ratio (default: 1)
  -rl, --repeat-length NUM      set simple-repeat length threshold (default: 1000)
  -tl, --taboo-length NUM       set taboo length (default: 1)
  -lf, --load-factor NUM        set the load factor for the open addressing q-gram index (default: 1.6)

Verification Options:
  -mN, --match-N                'N' matches with all other characters
  -ed, --error-distr FILE       write error distribution to FILE
  -mf, --mismatch-file FILE     write mismatch patterns to FILE

Parallelism Options:
  -tc, --thread-count NUM       Set the number of threads to use (0 to force sequential mode). (default: 1)
  -pws, --parallel-window-size NUM  Collect SWIFT hits in windows of this length. (default: 500000)
  -pvs, --parallel-verification-size NUM  Verify SWIFT hits in packages of this size. (default: 100)
  -pvmpc, --parallel-verification-max-package-count NUM  Largest number of packages to create for verification per thread-1, go over package size if this limit is reached.. (default: 100)
  -amms, --available-matches-memory-size NUM  Bytes of main memory available for storing matches. Used to switch to external sorting. -1 for always external, 0 for never, other value as threshold. (default: 0)
  -mhst, --match-histo-start-threshold NUM  When to start histogram. (default: 5)
==3110==
==3110== HEAP SUMMARY:
==3110==     in use at exit: 0 bytes in 0 blocks
==3110==   total heap usage: 1,826 allocs, 1,826 frees, 235,933 bytes allocated
==3110==
==3110== All heap blocks were freed -- no leaks are possible
==3110==
==3110== For counts of detected and suppressed errors, rerun with: -v
==3110== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

I will get back to you once I have moved it.

Cheers,

Dave

--
David Weese weese@inf.fu-berlin.de
Freie Universität Berlin http://www.inf.fu-berlin.de/
Institut für Informatik Phone: +49 30 838 75137
Takustraße 9 Algorithmic Bioinformatics
14195 Berlin Room 020

On 17.08.2012 at 17:32, Sébastien Boisvert <sebastien.boisvert.3@ulaval.ca> wrote:

Hello David and others,

After reading the preprint of the RazerS3 paper, I am really looking
forward to testing the tool!

According to the readme, it can be installed in binary form
or from source code.

In the Subversion source tree, there are only RazerS 1.1 and RazerS 2:

[sboisver12@colosse1 software]$ grep RazerS seqan/core/apps/*/*.cpp |grep version
seqan/core/apps/micro_razers/micro_razers.cpp: cerr << "MicroRazerS version 0.1 20090710 (prerelease)" << rev << endl;
seqan/core/apps/razers2/razers.cpp:     addVersionLine(parser, "RazerS version 2.0 20110518 [" + rev.substr(11, 4) + "]");
seqan/core/apps/razers/razers.cpp:      addVersionLine(parser, "RazerS version 1.1 20100618 [" + rev.substr(11, 4) + "]");
seqan/core/apps/splazers/splazers.cpp: addVersionLine(parser, "RazerS version 1.1 20100618 [" + rev.substr(11, 4) + "]");
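
A note on the `rev.substr(11, 4)` calls in the lines above: they appear to cut a four-character revision number out of a longer revision string. The sketch below assumes `rev` holds an expanded Subversion keyword such as "$Revision: 4711 $", whose prefix "$Revision: " is 11 characters long; that assumption, the sample value, and the helper name `revisionDigits` are illustrative and not taken from the SeqAn sources.

```cpp
#include <string>

// Hypothetical helper mirroring rev.substr(11, 4) from the version lines.
// Assumption: rev looks like "$Revision: 4711 $", so index 11 is the first
// character after the prefix "$Revision: ".
std::string revisionDigits(const std::string& rev)
{
    // substr throws std::out_of_range when the start position exceeds size(),
    // so strings shorter than the prefix yield an empty result instead.
    if (rev.size() < 11)
        return std::string();
    // Takes at most four characters; longer revision numbers are truncated.
    return rev.substr(11, 4);
}
```

With the assumed input "$Revision: 4711 $" the helper returns the four characters after the prefix, which is the part printed in square brackets in the version line.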

And running the distributed binary with 'razers3 --help' produces this on my system:

[sboisver12@colosse1 linux64]$ ./razers3 --help
*** glibc detected *** ./razers3: double free or corruption (out): 0x00000000008d8140 ***
======= Backtrace: =========
/lib64/libc.so.6[0x396e27245f]
/lib64/libc.so.6(cfree+0x4b)[0x396e2728bb]
/software/compilers/gcc/4.6.1/lib64/libstdc++.so.6(_ZNSs9_M_mutateEmmm+0x1bf)[0x2aed46c40aef]
/software/compilers/gcc/4.6.1/lib64/libstdc++.so.6(_ZNSs15_M_replace_safeEmmPKcm+0x2c)[0x2aed46c40b2c]
./razers3(_ZN5seqan13RazerSOptionsINS_10RazerSSpecILb0ELb0EEEEC1Ev+0x1d2)[0x57a702]
./razers3(main+0x27)[0x68d147]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x396e21d994]
./razers3[0x53a449]
======= Memory map: ========
00400000-006d8000 r-xp 00000000 c96:9a5e 272865020                       /rap/nne-790-ab/software/RazerS/v3.0-beta1/razers3.0-beta1/linux64/razers3
008d7000-008d8000 rw-p 002d7000 c96:9a5e 272865020                       /rap/nne-790-ab/software/RazerS/v3.0-beta1/razers3.0-beta1/linux64/razers3
008d8000-008da000 rw-p 008d8000 00:00 0
04efd000-04f1e000 rw-p 04efd000 00:00 0                                  [heap]
396de00000-396de1c000 r-xp 00000000 08:01 27394050                       /lib64/ld-2.5.so
396e01c000-396e01d000 r--p 0001c000 08:01 27394050                       /lib64/ld-2.5.so
396e01d000-396e01e000 rw-p 0001d000 08:01 27394050                       /lib64/ld-2.5.so
396e200000-396e34e000 r-xp 00000000 08:01 27394052                       /lib64/libc-2.5.so
396e34e000-396e54e000 ---p 0014e000 08:01 27394052                       /lib64/libc-2.5.so
396e54e000-396e552000 r--p 0014e000 08:01 27394052                       /lib64/libc-2.5.so
396e552000-396e553000 rw-p 00152000 08:01 27394052                       /lib64/libc-2.5.so
396e553000-396e558000 rw-p 396e553000 00:00 0
396ea00000-396ea82000 r-xp 00000000 08:01 27394082                       /lib64/libm-2.5.so
396ea82000-396ec81000 ---p 00082000 08:01 27394082                       /lib64/libm-2.5.so
396ec81000-396ec82000 r--p 00081000 08:01 27394082                       /lib64/libm-2.5.so
396ec82000-396ec83000 rw-p 00082000 08:01 27394082                       /lib64/libm-2.5.so
396ee00000-396ee16000 r-xp 00000000 08:01 27394070                       /lib64/libpthread-2.5.so
396ee16000-396f015000 ---p 00016000 08:01 27394070                       /lib64/libpthread-2.5.so
396f015000-396f016000 r--p 00015000 08:01 27394070                       /lib64/libpthread-2.5.so
396f016000-396f017000 rw-p 00016000 08:01 27394070                       /lib64/libpthread-2.5.so
396f017000-396f01b000 rw-p 396f017000 00:00 0
3971200000-3971207000 r-xp 00000000 08:01 27394084                       /lib64/librt-2.5.so
3971207000-3971407000 ---p 00007000 08:01 27394084                       /lib64/librt-2.5.so
3971407000-3971408000 r--p 00007000 08:01 27394084                       /lib64/librt-2.5.so
3971408000-3971409000 rw-p 00008000 08:01 27394084                       /lib64/librt-2.5.so
2aed46b86000-2aed46b88000 rw-p 2aed46b86000 00:00 0
2aed46ba0000-2aed46c86000 r-xp 00000000 c96:9a5e 241766367               /software/compilers/gcc/4.6.1/lib64/libstdc++.so.6.0.16
2aed46c86000-2aed46e86000 ---p 000e6000 c96:9a5e 241766367               /software/compilers/gcc/4.6.1/lib64/libstdc++.so.6.0.16
2aed46e86000-2aed46e8e000 r--p 000e6000 c96:9a5e 241766367               /software/compilers/gcc/4.6.1/lib64/libstdc++.so.6.0.16
2aed46e8e000-2aed46e90000 rw-p 000ee000 c96:9a5e 241766367               /software/compilers/gcc/4.6.1/lib64/libstdc++.so.6.0.16
2aed46e90000-2aed46ea6000 rw-p 2aed46e90000 00:00 0
2aed46ea6000-2aed46eb3000 r-xp 00000000 c96:9a5e 241766181               /software/compilers/gcc/4.6.1/lib64/libgomp.so.1.0.0
2aed46eb3000-2aed470b2000 ---p 0000d000 c96:9a5e 241766181               /software/compilers/gcc/4.6.1/lib64/libgomp.so.1.0.0
2aed470b2000-2aed470b3000 rw-p 0000c000 c96:9a5e 241766181               /software/compilers/gcc/4.6.1/lib64/libgomp.so.1.0.0
2aed470b3000-2aed470c8000 r-xp 00000000 c96:9a5e 241766007               /software/compilers/gcc/4.6.1/lib64/libgcc_s.so.1
2aed470c8000-2aed472c7000 ---p 00015000 c96:9a5e 241766367          Aborted (core dumped)
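
The mangled frames in the backtrace above are easier to read once demangled. A minimal sketch, assuming a GCC or Clang toolchain that ships `<cxxabi.h>`; `abi::__cxa_demangle` is the real ABI function, while the wrapper name `demangleSymbol` is made up for illustration:

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC and Clang)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: demangle one Itanium-ABI symbol name.
// Returns the input unchanged if it cannot be demangled.
std::string demangleSymbol(const std::string& mangled)
{
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc.
    char* result = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || result == nullptr)
        return mangled;
    std::string readable(result);
    std::free(result);  // the caller owns the malloc-allocated buffer
    return readable;
}
```

Applied to the frames above, `_ZNSs9_M_mutateEmmm` is the `std::string` member function `_M_mutate`, and `_ZN5seqan13RazerSOptionsINS_10RazerSSpecILb0ELb0EEEEC1Ev` is the constructor of `seqan::RazerSOptions<seqan::RazerSSpec<false, false> >`. So the abort happens while that constructor modifies a `std::string`, before any command-line parsing in `main`.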

Sébastien Boisvert
U. Laval
