From jose.duarte@psi.ch Fri May 08 10:27:07 2015
From: Jose Manuel Duarte
Date: Fri, 8 May 2015 10:27:01 +0200
Subject: [Seqan-dev] lambda_indexer trouble

I am having a lot of trouble using lambda_indexer to index full UniRef100 fasta files. I followed the instructions on the Lambda website:

% /path/to/segmasker -infmt fasta -in db.fasta -outfmt interval -out db.seg

% bin/lambda_indexer -d db.fasta -s db.seg


I first tried the current UniRef100 release (2015_05), which is huge (a 26GB uncompressed fasta file), and ran into the memory problems that are documented in lambda_indexer's help. So I ended up using "-a skew7ext" and ran it on the largest-memory system I had available (128GB). The program ran, but at some point after "Generating Index..." it died with a segfault and no other information.

Then I decided to try a smaller UniRef, so I took an older version (2012_06, only an 8GB uncompressed fasta file). I ran it again with "-a skew7ext" and this time it got further, but eventually it also died:



Dumping unreduced Subj Sequences… done.
Generating Index…Asynchronous I/O operation failed (waitFor): "Success"
fildes:  5
buffer:  7fcb59d7f000
offset:  1ee3e0000
nbytes:  20000
event:   1
Raddr:   0x2566998
/home/mi/h4nn3s/takifugu/seqan-lambda-v0.4.7/core/include/seqan/file/file_page.h:752 FAILED!  (WRITING operation could not be completed: "Success")

stack trace:
  0          [0x5263be]  seqan::ClassTest::fail() + 0xe
  1          [0x4dfeb5]  ./bin/lambda_indexer()
  2          [0x5b1471]  seqan::PageChain<seqan::Buffer<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::PageFrame<seqan::File<seqan::Async<void> >, seqan::Dynamic> > >::getReadyPage() + 0x171
  3          [0x5b2b09]  seqan::Handler<seqan::Pool<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::MapperSpec<seqan::MapperConfigSize<seqan::_skew7NMapSliced<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long>, unsigned long, seqan::File<seqan::Async<void> > > > >, seqan::MapperAsyncWriter>::_writeBucket(seqan::PageBucket<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> > >&, unsigned int) + 0xa9
  4          [0x6a4f97]  bool seqan::Pipe<seqan::Pipe<seqan::Pool<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::MapperSpec<seqan::MapperConfigSize<seqan::filterI1<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long>, unsigned long, seqan::File<seqan::Async<void> > > > >, seqan::Filter<seqan::filterI2<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long> > >, seqan::Skew7>::process<seqan::Pipe<seqan::Pool<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::MapperSpec<seqan::MapperConfigSize<seqan::filterI1<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long>, unsigned long, seqan::File<seqan::Async<void> > > > >, seqan::Filter<seqan::filterI2<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long> > > >(seqan::Pipe<seqan::Pool<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::MapperSpec<seqan::MapperConfigSize<seqan::filterI1<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long>, unsigned long, seqan::File<seqan::Async<void> > > > >, seqan::Filter<seqan::filterI2<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long> > >&) + 0x1be7
  5          [0x6a6811]  seqan::Pipe<seqan::Pipe<seqan::Pool<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::MapperSpec<seqan::MapperConfigSize<seqan::filterI1<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long>, unsigned long, seqan::File<seqan::Async<void> > > > >, seqan::Filter<seqan::filterI2<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long> > >, seqan::Skew7>::Pipe(seqan::Pipe<seqan::Pool<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, seqan::MapperSpec<seqan::MapperConfigSize<seqan::filterI1<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long>, unsigned long, seqan::File<seqan::Async<void> > > > >, seqan::Filter<seqan::filterI2<seqan::Pair<unsigned long, unsigned long, seqan::Tag<seqan::Pack_> >, unsigned long> > >&) + 0x971




I'm pretty sure I have enough space available on the disk where I'm running it (>100GB). Is there anything obvious that I am doing wrong? Do you guys have any experience with indexing large files like this?

Apologies in advance if the message does not belong on the dev list. I couldn't find any more appropriate place to post it. I'd like to confirm this is a bug rather than me misusing the software before submitting an issue to GitHub.

Any pointers are very much appreciated.

Thanks

Jose

From hannes.hauswedell@fu-berlin.de Fri May 08 11:20:08 2015
From: Hannes Hauswedell
To: seqan-dev@lists.fu-berlin.de
Date: Fri, 08 May 2015 11:21:43 +0200
Subject: Re: [Seqan-dev] lambda_indexer trouble

Dear Jose,

I am sorry to hear that it is not working for you as expected. The next Lambda version will contain indexing methods that are more memory efficient. I will still try to answer your questions between the lines:

On Friday, 8 May 2015, 10:27:01, Jose Manuel Duarte wrote:
> I am having a lot of trouble using lambda_indexer to index full
> UniRef100 fasta files. I followed the instructions in the lamda website:
>
> % /path/to/segmasker -infmt fasta -in db.fasta -outfmt interval -out db.seg
>
> % bin/lambda_indexer -d db.fasta -s db.seg

That's correct.

> I first tried with the current UniRef100 release (2015_05), which is
> huge (26GB uncompressed fasta file) and then I came across the memory
> problems that are documented in lambda_indexer's help. So I ended up
> using "-a skew7ext" and ran it in the largest memory system I had
> available (128GB). The program ran, but at some point after "Generating
> Index..." it died with a segfault and no other information.

If this is a different error from the one below, it is unexpected. Can you open an issue for this in the seqan bug tracker with a link to the exact file used? Please note that the requirements for free disk space for skew are very high (see below).

> Then I decided to try on a smaller UniRef, so I took an older version
> (2012_06, only 8GB uncompressed fasta file). I ran again with "-a
> skew7ext" and this time it did go further, but eventually also died:
>
> Dumping unreduced Subj Sequences… done.
> Generating Index…Asynchronous I/O operation failed (waitFor): "Success"
> [...]

This is always an indicator of running out of disk space in the TMPDIR.

> I'm pretty sure I have enough space available on the disk where I'm
> running it (>100GB). Is there anything obvious that I am doing wrong? Do
> you guys have any experience in indexing large files like this?

Indeed the requirements for disk space are quite high for skew. As described in the help page, I have measured 30x. So if your file is 8GB and, say, 6GB of this is sequence data, then the external space requirement might well be 180GB...

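The arithmetic in the paragraph above can be written down as a small pre-flight check. This is only a sketch, not part of lambda: the factor 30 is the figure quoted in the thread, the function names are hypothetical, and it assumes the temporary files end up in the directory that Python reports as the temp dir (which follows TMPDIR on POSIX systems).

```python
import shutil
import tempfile

# Factor quoted in the thread ("I have measured 30x"); an empirical figure,
# not a guarantee.
SKEW_DISK_FACTOR = 30


def estimated_skew_disk_gb(sequence_gb, factor=SKEW_DISK_FACTOR):
    """Rough external disk space (GB) for a skew-based index build."""
    return sequence_gb * factor


def free_tmp_gb(path=None):
    """Free space (GB) on the filesystem that holds the temp directory."""
    path = path or tempfile.gettempdir()  # honours TMPDIR when it is set
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    needed = estimated_skew_disk_gb(6)  # 6 GB of sequence data, as in the mail
    available = free_tmp_gb()
    print(f"estimated need: {needed} GB, free in temp dir: {available:.0f} GB")
    if available < needed:
        print("temp dir is probably too small for a skew7ext build")
```

With 6 GB of sequence data the estimate is 180 GB, matching the number given above; comparing it with the free space in the temp dir before starting would flag the shortfall without waiting for the I/O error.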
You might want to try the quicksort or quicksortbuckets algorithms. They don't require external disk space, and if you have 128GB of RAM, this should be enough to build the index for your 8GB file.

> Apologies in advance if the message does not belong in the dev list. I
> couldn't find any more appropriate place to post it. I'd like to confirm
> this is a bug rather than me misusing the software before submitting an
> issue to github.

Feel free to write to this list, post on GitHub or write to me directly! All ways are accepted :)

Best regards,
-- 
Hannes Hauswedell
PhD student
Max Planck Institute for Molecular Genetics / Freie Universität Berlin

address
Institut für Informatik
Takustraße 9
Room 019
14195 Berlin

telephone +49 (0)30 838-75241
fax +49 (0)30 838-75218
e-mail hannes.hauswedell@[molgen.mpg.de|fu-berlin.de]

From jose.duarte@psi.ch Fri May 08 12:55:33 2015
From: Jose Manuel Duarte
To: Hannes Hauswedell
Date: Fri, 8 May 2015 12:55:28 +0200
Subject: Re: [Seqan-dev] lambda_indexer trouble

Hi Hannes

Thanks so much for the answers. Some comments below

> If this is a different error from the one below, it is unexpected. Can you
> open an issue for this in the seqan bug tracker with a link to the exact file
> used? Please note that the requirements for free disk space for skew are very
> high (see below).

The error looked different, just a segfault without a trace. But of course it must have been the disk space as you explain.

> Indeed the requirements for disk space are quite high for skew. As described
> in the help-page, I have measured 30x. So if your file is 8GB and say 6GB of
> this is sequence data, than the external space requirement might well be
> 180GB...
>
> You might want to try the quicksort or quicksortbuckets algorithms. The don't
> require external disk space and if you have 128GB of RAM, this should be
> enough to build the index for your 8GB file.

Alright I'll try that, thanks for the tip. But at the end of the day I would like to index the current UniRef100. Following your estimates I would need something like 600GB of disk space for that...
I might still be able to try it, but surely a few months from now UniRef100 will have reached a size that is impossible to deal with. It's great that you guys are already working on new algos for indexing :)

On an unrelated note, I also tried out the pre-indexed nr files you guys distribute from your website. There I get this:

./bin/lambda -q query.fasta -d nr/nr.fasta -p blastp
LAMBDA - the Local Aligner for Massive Biological DatA
======================================================
Version 0.4.7

Loading Subj Sequences… done.
Loading Subj Ids…/home/mi/h4nn3s/takifugu/seqan-lambda-v0.4.7/core/include/seqan/basic/basic_exception.h:345 FAILED! (Uncaught exception of type std::bad_alloc: std::bad_alloc)

stack trace:
  0 [0xa97a0e] seqan::ClassTest::fail() + 0xe
  1 [0x8fd5a2] ./bin/lambda()
  2 [0x1510ed6] __cxxabiv1::__terminate(void (*)()) + 0x6
  3 [0x1510f03] ./bin/lambda()
  4 [0x151131e] ./bin/lambda()
  5 [0x151121d] operator new(unsigned long) + 0x7d
  6 [0xfb8dd2] void seqan::AssignString_ >::assign_ >, seqan::String >, 4194304u, 2u> > > const>(seqan::String >&, seqan::String >, 4194304u, 2u> > > const&) + 0x192
  7 [0x919266] ./bin/lambda()
  8 [0xfe74d0] int loadSubjects<(seqan::BlastFormatFile)8, (seqan::BlastFormatProgram)1, (seqan::BlastFormatGeneration)0, seqan::SimpleType > >, seqan::Score, seqan::Blosum62_> >, seqan::FMIndex > >(GlobalDataHolder > >, seqan::Score, seqan::Blosum62_> >, seqan::FMIndex >, (seqan::BlastFormatFile)8, (seqan::BlastFormatProgram)1, (seqan::BlastFormatGeneration)0>&, LambdaOptions const&) + 0x230
  9 [0x14e15f5] int argConv2<(seqan::BlastFormatFile)8, (seqan::BlastFormatProgram)1, (seqan::BlastFormatGeneration)0, seqan::SimpleType > > >(LambdaOptions const&, seqan::Tag > const&, seqan::SimpleType > > const&) + 0x335
  10 [0x15078ec] argConv0(LambdaOptions const&) + 0x6c
  11 [0x8e5cb7] main + 0x3e7
  12 [0x7fea204d0a40] __libc_start_main + 0xf0
  13 [0x8e6299] ./bin/lambda()

Aborted (core dumped)

I am assuming that the nr db is protein and that I can do protein queries (blastp) against it, is that right?

Thanks for all the help!

Jose

From jose.duarte@psi.ch Sat May 09 12:31:31 2015
From: Jose Manuel Duarte
Date: Sat, 9 May 2015 12:31:26 +0200
Subject: Re: [Seqan-dev] lambda_indexer trouble

Just to say that in the end I've managed to index UniRef100 2012_06. So I could finally run lambda and test that it lives up to expectations. It could do multiple query sequences in less than a second per query! That really is impressive, we will try to switch from blast as soon as possible! For our application (http://www.eppic-web.org) we need speed but not so much sensitivity, we are only interested in homologs with > 50% sequence identity.

Cheers

Jose

From jose.duarte@psi.ch Mon May 11 12:06:14 2015
From: Jose Manuel Duarte
Date: Mon, 11 May 2015 12:05:55 +0200
Subject: [Seqan-dev] Client-server lambda?

After trying lambda a bit, I am quite impressed with the results. One logical question that comes with it is whether a client-server implementation of the search would be possible.

As I understand it, loading the indexed database into memory takes quite some time (> 1 min for my db), then the multiquery search is very fast. That's fine if one has just a one-off multiple query. But for other use-cases one would want to query it from time to time. The optimal way to do that would be to have some kind of client-server system where the queries are served by a server having the db pre-loaded in memory.

This is for instance what BLAT or SANSparallel do. Are there any plans to implement such a feature in lambda? That would be an incredibly helpful feature.

Cheers

Jose

From hannes.hauswedell@fu-berlin.de Mon May 11 14:00:09 2015
From: Hannes Hauswedell
To: Jose Manuel Duarte
Cc: SeqAn Development
Date: Mon, 11 May 2015 14:01:39 +0200
Subject: Re: [Seqan-dev] Client-server lambda?

On Monday, 11 May 2015, 12:05:55, Jose Manuel Duarte wrote:
> After trying lambda a bit, I am quite impressed with the results. One
> logical question that comes with it, is whether a client-server
> implementation of the search would be possible.
>
> As I understand it, loading the indexed database into memory takes quite
> some time (> 1 min for my db), then the multiquery search is very fast.
> That's fine if one has just a one-off multiple query. But for other
> use-cases one would want to query it from time to time. The optimal way
> to do that would be to have some kind of client-server system where the
> queries are served by a server having the db pre-loaded in memory.
>
> This is for instance what BLAT or SANSparallel do. Is there any plans to
> implement such a feature in lambda? That would be an incredibly helpful
> feature.

Thanks for the feedback! Most of the use cases reported were very large query files (where the loading time of the database is small compared to the total time), so we haven't thought about this yet. However, I do understand that if you repeatedly search few sequences in a large database, the database loading time will be a large factor.

Have you tried storing the database file (including lambda's files) in a shared memory filesystem, e.g. /dev/shm? If you do this, all data will already be in main memory when the program is started -- however, it will still need to be copied around, so of course it's not optimal. Also, during program run-time the sequences will be both in the program's allocated memory and in the shm, so they will effectively use double the space. But it might still be worthwhile for you, I can't say without knowing the exact use-case and hardware available.

What we also have planned for one of the next releases is using mmapped IO for the database loading; this should reduce the initial loading time significantly (although obviously it still has to be loaded from disk one way or another).

A separation into a client-server architecture might be done in the future, but I can't promise a time-frame for that.

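To illustrate the memory-mapped IO idea mentioned above, here is a minimal generic sketch. It says nothing about how lambda loads or will load its index; the function name and the demo file are made up for illustration. The point it shows is that a mapped file is not read up front: the operating system pages data in on access and keeps it in its page cache.

```python
import mmap
import os
import tempfile


def read_slice_mmapped(path, offset, length):
    """Return `length` bytes at `offset` without reading the whole file."""
    with open(path, "rb") as fh:
        # Length 0 maps the entire file; pages are only loaded when touched.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


if __name__ == "__main__":
    # Stand-in for an index file; a real index would be far larger.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"ACDEFGHIKLMNPQRSTVWY" * 1000)
        demo_path = tmp.name
    try:
        print(read_slice_mmapped(demo_path, 20, 5))
    finally:
        os.remove(demo_path)
```

On a repeated run against the same file, the touched pages are typically still in the operating system's cache, which is where the expected start-up saving would come from.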
Best regards,
-- 
Hannes Hauswedell
PhD student
Max Planck Institute for Molecular Genetics / Freie Universität Berlin

address    Institut für Informatik
           Takustraße 9
           Room 019
           14195 Berlin
telephone  +49 (0)30 838-75241
fax        +49 (0)30 838-75218
e-mail     hannes.hauswedell@[molgen.mpg.de|fu-berlin.de]

From Johannes.Droege@uni-duesseldorf.de Mon May 11 14:23:05 2015 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1Yrmjg-000UYK-9Q>; Mon, 11 May 2015 14:23:04 +0200 Received: from mailout-apollo.uni-duesseldorf.de ([134.99.128.36]) by relay1.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtps (envelope-from ) id <1Yrmjg-000PRw-7Y>; Mon, 11 May 2015 14:23:04 +0200 MIME-version: 1.0 Content-transfer-encoding: 8BIT Content-type: text/plain; charset=utf-8 Received: from [172.17.228.46] (meerkat.cs.uni-duesseldorf.de [134.99.112.124]) by mail.rz.uni-duesseldorf.de (Oracle Communications Messaging Server 7.0.5.34.0 64bit (built Oct 14 2014)) with ESMTPA id <0NO600DUTQEE7U90@mail.rz.uni-duesseldorf.de> for seqan-dev@lists.fu-berlin.de; Mon, 11 May 2015 14:23:02 +0200 (MEST) Message-id: <55509F40.5050608@uni-duesseldorf.de> Date: Mon, 11 May 2015 14:23:28 +0200 From: =?UTF-8?B?Sm9oYW5uZXMgRHLDtmdl?= Organization: Helmholtz Centre for Infection Research/Heinrich Heine University User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 To: SeqAn Development References: <55507F03.10600@psi.ch> <12955394.izmhf2Peqr@celegans.imp.fu-berlin.de> In-reply-to: <12955394.izmhf2Peqr@celegans.imp.fu-berlin.de> OpenPGP: url=http://keys.fungs.de/6ea5e4.asc X-Originating-IP: 134.99.128.36 X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1431346984-00000CF1-7588D6FC/0/0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.002320, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-2.3 required=5.0
tests=RCVD_IN_DNSWL_MED X-Spam-Checker-Version: SpamAssassin 3.4.0 on Tokelau.ZEDAT.FU-Berlin.DE X-Spam-Level: Subject: Re: [Seqan-dev] Client-server lambda? X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.16 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 May 2015 12:23:05 -0000 Hi, memory mapping is a decent solution as it will leave the caching (with frequent access) and other optimizations to the operating system. Multiple processes using the same memory-mapped file will also use the same in-memory cache. Gruß Johannes -- Johannes Dröge, M.Sc. Computational Biology, Helmholtz Centre for Infection Research, Braunschweig Algorithmic Bioinformatics, Heinrich Heine University, Düsseldorf PGP: http://keys.fungs.de/6ea5e4.asc (55F2720303A7F236A94666F20E2360727A6EA5E4) Web: algbio.cs.uni-duesseldorf.de | Tel/Fax: +49 211 81-12644/13464 From jose.duarte@psi.ch Mon May 11 15:11:34 2015 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1YrnUa-000ZNl-Oo>; Mon, 11 May 2015 15:11:32 +0200 Received: from edge10.ethz.ch ([82.130.75.186]) by relay1.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtps (envelope-from ) id <1YrnUa-000aZl-LK>; Mon, 11 May 2015 15:11:32 +0200 Received: from CAS22.d.ethz.ch (172.31.51.112) by edge10.ethz.ch (82.130.75.186) with Microsoft SMTP Server (TLS) id 14.3.195.1; Mon, 11 May 2015 15:11:29 +0200 Received: from [129.129.205.109] (129.129.205.109) by mail.ethz.ch (172.31.51.112) with Microsoft SMTP Server (TLS) id 14.3.195.1; Mon, 11 May 2015 15:11:29 +0200 Message-ID: <5550AA81.5010306@psi.ch> Date: Mon, 11 May 2015 15:11:29 +0200 From: Jose Manuel Duarte User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: Hannes 
Hauswedell References: <55507F03.10600@psi.ch> <12955394.izmhf2Peqr@celegans.imp.fu-berlin.de> In-Reply-To: <12955394.izmhf2Peqr@celegans.imp.fu-berlin.de> Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: 82.130.75.186 X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1431349892-00000CF1-69A992ED/0/0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.099119, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-2.3 required=5.0 tests=RCVD_IN_DNSWL_MED, RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL X-Spam-Checker-Version: SpamAssassin 3.4.0 on Palau.ZEDAT.FU-Berlin.DE X-Spam-Level: Cc: SeqAn Development Subject: Re: [Seqan-dev] Client-server lambda? X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.16 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 May 2015 13:11:34 -0000 > Have you tried storing the database file (including lambda's files) in a > shared memory filesystem, e.g. /dev/shm ? If you do this all data will already > be in main memory when the program is started -- however it will still need to > be copied around, so of course its not optimal. Also during program run-time > the sequences will both be in the program's allocated memory and in the shm, > so they will effectively use double the space. But it might still be > worthwhile for you, I can't say without knowing the exact use-case and > hardware available. I've tried /dev/shm already but it didn't make a difference. Here are the runtimes from 3 consecutive runs (269 sequences in one file against a database that takes 7.7GB in plain text fasta file). 
Reading from disk: real 1m49.591s real 1m49.259s real 1m49.282s Reading from /dev/shm: real 1m49.480s real 1m49.290s real 1m49.007s As you say the data still needs to be copied around and that is most likely where most of that time is spent (steps "Loading Subj Sequences" and "Loading Subj Ids" seem to be the slow ones). My guess is that there's also a lot of disk buffering happening when it's read from disk (the system I'm running has 128GB of memory and not so loaded at the moment, so I'm sure it has enough memory to keep all the files in the buffer cache), that's my explanation as to why there's not much difference between the disk and the /dev/shm runs. Cheers Jose From bnbowman@gmail.com Sat May 23 00:53:33 2015 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1Yvvop-001uJO-Q4>; Sat, 23 May 2015 00:53:31 +0200 Received: from mail-ig0-f180.google.com ([209.85.213.180]) by relay1.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtps (envelope-from ) id <1Yvvop-000Z7G-Gb>; Sat, 23 May 2015 00:53:31 +0200 Received: by igbsb11 with SMTP id sb11so1395395igb.0 for ; Fri, 22 May 2015 15:53:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=vmSR9SkX839GumCKWz4GvgKJ9qZIBPD/BU0Fta872Lw=; b=pMdnVLcRDPTGy7qb0aa6KqWDL5LId/X9dJR1QjOQOQRqPEkSAJ9163GmEnE1alufei XizwIEDnPLHjXY99yAMcrFBZyHOHcghltBHF24mjXgh2uSrgbLUrc/c+xrjBpOINod1y yD0g05Vz+WZHrn+AYjLcxEBwdUAwGeiLMQ33TwdverrY8l8T+1u+FE+yhgbEB/xt83WH jBBkH88VA/GornKXCCLXpr/FhZ5bsaSvo9Z7PKQ5UijjgwrDHiSFw6vneUNy2UNjovER r8Nv6QjamYwjLpMV5FRwASEBZMJuImSGeQ3pL3ZxmAyoJKaVX6lYaHdavpAHdr7PiQ3L BTNg== MIME-Version: 1.0 X-Received: by 10.42.206.9 with SMTP id fs9mr11837095icb.19.1432335208963; Fri, 22 May 2015 15:53:28 -0700 (PDT) Received: by 10.36.84.206 with HTTP; Fri, 22 May 2015 15:53:28 -0700 
(PDT) Date: Fri, 22 May 2015 15:53:28 -0700 Message-ID: From: Brett Bowman To: SeqAn Development Content-Type: multipart/alternative; boundary=20cf303f64d05466ab0516b387cb X-Originating-IP: 209.85.213.180 X-ZEDAT-Hint: A X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1432335211-00000CF1-2E4A975D/0/0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.071457, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-0.7 required=5.0 tests=FREEMAIL_FROM,HTML_MESSAGE, RCVD_IN_DNSWL_LOW,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,T_DKIM_INVALID X-Spam-Checker-Version: SpamAssassin 3.4.1 on Vanuatu.ZEDAT.FU-Berlin.DE X-Spam-Level: Subject: [Seqan-dev] bandedChainAlignment default failing due to default k X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.16 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 May 2015 22:53:33 -0000 --20cf303f64d05466ab0516b387cb Content-Type: text/plain; charset=UTF-8

I'm trying to align two highly similar sequences found via Kmer search:

Query     = "ATCTCTCTCAACAAAACAACGAGGAGGAGTGAAAAGAGAGAGAT"
Reference = "ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT"

The expected alignment looks like this:

Score: 80
      0     .    :    .    :    .    :    .    :    .
        ATCTCTCTCAACAA-AACAAC-GAGGAGGAGTGAAAAGAGAGAGAT
        |||||||||||||| |||||| ||||||||| ||||||||||||||
        ATCTCTCTCAACAACAACAACGGAGGAGGAG-GAAAAGAGAGAGAT

But when I align it using the default values suggested by the tutorial, it doesn't show any inserted gaps at all, and I wind up with this instead:

Score: 80
      0     .    :    .    :    .    :    .    :
        ATCTCTCTCAACAAAACAACGAGGAGGAGTGAAAAGAGAGAGAT
        |||||||||||||| |  |   |  |  | | |||
        ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGA

I finally traced it down to the k-value (bandExtension value) passed into the alignment algorithm - values of K <= 13 succeed and generate the top-most alignment, while the values of 14-15 like the default (15) report the low-quality alignment.

Yet oddly, both alignments report the correct alignment score at the end - so it's not failing, precisely. It's just not storing or displaying the correct alignment.

So I have two questions:
1) What exactly does the k / bandExtension variable do?
2) What is going on here?

My code is pasted below for your use.

Sincerely,
-Brett

"""
#include <iostream>
#include <seqan/align.h>
#include <seqan/seeds.h>

using namespace seqan;

int main()
{
    DnaString query = "ATCTCTCTCAACAAAACAACGAGGAGGAGTGAAAAGAGAGAGAT";
    DnaString ref   = "ATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGAT";

    // Chain of two seeds: (begin in query, begin in reference, length).
    typedef Seed<Simple> TSeed;
    String<TSeed> seedChain;
    appendValue(seedChain, TSeed( 0,  0, 14));
    appendValue(seedChain, TSeed(30, 31, 14));

    // Match 2, mismatch -1, gap -2.
    Score<int, Simple> scoringScheme(2, -1, -2);

    Align<DnaString, ArrayGaps> alignment1;
    resize(rows(alignment1), 2);
    assignSource(row(alignment1, 0), query);
    assignSource(row(alignment1, 1), ref);

    Align<DnaString, ArrayGaps> alignment2;
    resize(rows(alignment2), 2);
    assignSource(row(alignment2, 0), query);
    assignSource(row(alignment2, 1), ref);

    // Band extension 14: score is 80, but the ungapped alignment is shown.
    int result1 = bandedChainAlignment(alignment1, seedChain, scoringScheme, 14);
    std::cout << "Score: " << result1 << std::endl;
    std::cout << alignment1 << std::endl;

    // Band extension 13: score is 80 and the expected gapped alignment is shown.
    int result2 = bandedChainAlignment(alignment2, seedChain, scoringScheme, 13);
    std::cout << "Score: " << result2 << std::endl;
    std::cout << alignment2 << std::endl;

    return 0;
}
"""
--20cf303f64d05466ab0516b387cb-- From bnbowman@gmail.com Tue May 26 23:58:03 2015 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1YxMrJ-001EDT-NG>; Tue, 26 May 2015 23:58:01 +0200 Received: from mail-ig0-f182.google.com ([209.85.213.182]) by relay1.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtps (envelope-from ) id <1YxMrJ-000DWy-HW>; Tue, 26 May 2015 23:58:01 +0200 Received: by igbyr2 with SMTP id yr2so70240053igb.0 for ; Tue, 26 May 2015 14:57:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=UUhE3z1cQ6nSyzyhwjxi6YL3Ds6SC77HJ2+oIzcGacA=; b=NBNLWCMv3gfwpHWRUNiuJWdbPbJGOb4pOXtr41HE+NhHVONHHR6JpzKfaTe2jPenXT muqp0WflaahEFibBXKjdapIhBvLvhTNJV0tyRqZPrN2Hrkep3YGsEakEaiTEvklsLSSS IzbgX63dskXzYM6X3afScfi2Jt4H9sJ7ZyTMhtrGLt0NHc0PQpa8RCJ7A/24+p/6wTmq UP7zlcNiHhWWX2BeWXVXP+oCNR/Lmt8VXLBoR2YfwJwvw+z0W6fkhcZyHGSADolcaDKy leawsmFlf7+oPN4jzdVEh6lAHRGMVdO0Hr7q0FUKJdyHs2WJe6LK+1iIrisYjte6D4/j bngA== MIME-Version: 1.0 X-Received: by 10.50.39.105 with SMTP id o9mr32695654igk.39.1432677478793; Tue, 26 May 2015 14:57:58 -0700 (PDT) Received: by 10.36.84.206 with HTTP; Tue, 26 May 2015 14:57:58 -0700 (PDT) Date: Tue, 26 May 2015 14:57:58 -0700 Message-ID: From: Brett Bowman To: SeqAn Development Content-Type: multipart/alternative; boundary=047d7bdca06c3388e10517033891 X-Originating-IP: 209.85.213.182 X-ZEDAT-Hint: A X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1432677481-00000CF1-3C05B96A/0/0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.384588, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=0.9 required=5.0 tests=FREEMAIL_FROM, HTML_IMAGE_ONLY_12,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL,T_DKIM_INVALID,T_REMOTE_IMAGE X-Spam-Checker-Version: SpamAssassin 3.4.1 on Niue.ZEDAT.FU-Berlin.DE 
X-Spam-Level: Subject: [Seqan-dev] Asymmetric Scoring of Insertions / Deletions in Alignments X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.16 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 May 2015 21:58:03 -0000 --047d7bdca06c3388e10517033891 Content-Type: text/plain; charset=UTF-8

I'd like to create some scoring schemes with asymmetric weights for insertion / deletion errors for working with single-molecule sequencing data. Raw data from both PacBio and Oxford Nanopore have known insertion biases, so optimal scoring schemes need to penalize those errors less than deletions.

This appears to be partially supported by the SeqAn API already, since the "Score" class has separate interface functions for "scoreGapHorizontal" and "scoreGapVertical", but I can't see any existing specializations that utilize them.

Is there an allowed / recommended way to do this currently, or do I need to create my own Score specialization?

Sincerely,
-Brett
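
To make the idea of asymmetric insertion / deletion weights concrete, here is a small standalone dynamic-programming sketch. It is not SeqAn's Score API and says nothing about whether an existing specialization supports this; AsymmetricScore and globalScore are hypothetical names introduced only for illustration, and gap costs are linear to keep it short.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parameter bundle (not a SeqAn type): linear gap costs that
// depend on which row of the alignment receives the gap.
struct AsymmetricScore
{
    int match;      // reward for identical characters
    int mismatch;   // penalty for a substitution
    int gapInRef;   // gap in the reference row, i.e. an insertion in the read
    int gapInRead;  // gap in the read row, i.e. a deletion from the read
};

// Needleman-Wunsch global alignment score with separate insertion and
// deletion costs, computed with two rolling rows.
int globalScore(const std::string& read, const std::string& ref,
                const AsymmetricScore& s)
{
    const std::size_t n = read.size();
    const std::size_t m = ref.size();
    std::vector<int> prev(m + 1, 0), cur(m + 1, 0);

    // First row: reference characters aligned against gaps in the read.
    for (std::size_t j = 1; j <= m; ++j)
        prev[j] = prev[j - 1] + s.gapInRead;

    for (std::size_t i = 1; i <= n; ++i)
    {
        // First column: read characters aligned against gaps in the reference.
        cur[0] = prev[0] + s.gapInRef;
        for (std::size_t j = 1; j <= m; ++j)
        {
            int diag = prev[j - 1] + (read[i - 1] == ref[j - 1] ? s.match : s.mismatch);
            int insertion = prev[j] + s.gapInRef;     // consume a read character only
            int deletion  = cur[j - 1] + s.gapInRead; // consume a reference character only
            cur[j] = std::max(diag, std::max(insertion, deletion));
        }
        std::swap(prev, cur);
    }
    return prev[m];
}
```

With the costs {2, -1, -1, -3}, aligning the read "ACGGT" to the reference "ACGT" scores 7 (one cheap insertion), while aligning "ACT" to "ACGT" scores 3 (one expensive deletion).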
--047d7bdca06c3388e10517033891-- From vowinkel.alexander@gmail.com Fri May 29 23:35:49 2015 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1YyRwR-0027tO-C6>; Fri, 29 May 2015 23:35:47 +0200 Received: from mail-oi0-f46.google.com ([209.85.218.46]) by relay1.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtps (envelope-from ) id <1YyRwR-003jLm-6L>; Fri, 29 May 2015 23:35:47 +0200 Received: by oifu123 with SMTP id u123so66241752oif.1 for ; Fri, 29 May 2015 14:35:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=uWnNN38yo7+7fxj4e1PUCkkD6KTcX5r9wJ+fmhGC5vs=; b=rSuNpfX3kEzhDdTNkQUT+MtakfWmdXDm+RzGrjyd6L14XeGiMQvQ4ofoCDZrAbSan0 oRrKjHqunXz5oxvm72NNmob3SYj8J4vjxvRdXuE1GWDiqXbHn1e+AyVOtJYMYVpZCrnz vzO7PMID3SHLJWRmAgTXNqhSghFBowvfWO6rq+ahUAkyZq/nhg3uVVxuExoZ7j2sFZYn 04Rgy9peDIQcLmAgaMV7WnkNVoAabTBgYSXeq9Gw1RplXN7m14U58FfukSJcTbIkb30N lPre9P7Z1CvTW9ScLRHFub3OsjHPcD2Rg7CH9g3BOpPPmWsSbQzmYs0bvXOnOMX/o2Tq 3GVg== MIME-Version: 1.0 X-Received: by 10.182.5.4 with SMTP id o4mr6376856obo.67.1432935344961; Fri, 29 May 2015 14:35:44 -0700 (PDT) Received: by 10.76.27.132 with HTTP; Fri, 29 May 2015 14:35:44 -0700 (PDT) Date: Fri, 29 May 2015 16:35:44 -0500 Message-ID: From: Alexander Vowinkel To: seqan-dev@lists.fu-berlin.de Content-Type: multipart/alternative; boundary=001a1134b02c38fff205173f42aa X-Originating-IP: 209.85.218.46 X-ZEDAT-Hint: A X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1432935347-00000CF1-864D8ABE/0/0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-0.7 required=5.0 tests=FREEMAIL_FROM,HTML_MESSAGE, RCVD_IN_DNSWL_LOW,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,T_DKIM_INVALID X-Spam-Checker-Version: SpamAssassin 3.4.1 on Palau.ZEDAT.FU-Berlin.DE 
X-Spam-Level: Subject: [Seqan-dev] String of SeqFileOut X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.16 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 29 May 2015 21:35:49 -0000 --001a1134b02c38fff205173f42aa Content-Type: text/plain; charset=UTF-8

Hi,

I want to store some SeqFileOut in a seqan::String.
I tried both:
Using a seqan::String<seqan::SeqFileOut>
and using a seqan::String<seqan::SeqFileOut *>

Both fail. I don't understand why.
Can someone help me with this?
Now I'm using std::vector<seqan::SeqFileOut *> which
does the job, but I'd like to use this as a Property Map
for a Graph which needs to be a String if I got that right.

Thanks!
Alexander

> int main()
> {
>     typedef seqan::SeqFileOut TFilePointer;
>     typedef seqan::String<TFilePointer> TFilePointerSet;
>     TFilePointerSet files;
>     seqan::SeqFileIn inputFile1;
>     append(files, inputFile1);
>     seqan::SeqFileIn inputFile2;
>     append(files, inputFile2);
>     return 0;
> }
--001a1134b02c38fff205173f42aa-- From rene.maerker@fu-berlin.de Sat May 30 20:34:35 2015 Received: from outpost9.zedat.fu-berlin.de ([130.133.4.95]) by list1.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1Yylab-003IrX-0m>; Sat, 30 May 2015 20:34:33 +0200 Received: from relay2.zedat.fu-berlin.de ([130.133.4.80]) by outpost.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1Yylaa-003cVH-Sf>; Sat, 30 May 2015 20:34:32 +0200 Received: from cas1.campus.fu-berlin.de ([130.133.170.201]) by relay2.zedat.fu-berlin.de (Exim 4.85) for seqan-dev@lists.fu-berlin.de with esmtps (envelope-from ) id <1Yylaa-001DNZ-JJ>; Sat, 30 May 2015 20:34:32 +0200 Received: from EX03A.campus.fu-berlin.de ([130.133.170.134]) by CAS1.campus.fu-berlin.de ([130.133.170.201]) with mapi id 14.03.0224.002; Sat, 30 May 2015 20:34:31 +0200 From: =?utf-8?B?UmFobiwgUmVuw6k=?= To: SeqAn Development Thread-Topic: [Seqan-dev] String of SeqFileOut Thread-Index: AQHQmld0KPq2Jmv3d0ys1RCn2jC1252UuFOA Message-ID: <16CCF226-056F-42BF-8898-393F494F1D1F@campus.fu-berlin.de> References: In-Reply-To: Accept-Language: de-DE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: Content-Type: multipart/alternative; boundary="_000_16CCF226056F42BF8898393F494F1D1Fcampusfuberlinde_" MIME-Version: 1.0 Date: Sat, 30 May 2015 20:34:31 +0200 X-Original-Date: Sat, 30 May 2015 18:34:31 +0000 X-Originating-IP: 130.133.170.201 X-ZEDAT-Hint: XA X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1433010873-00000CF1-B0315C78/0/0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-50.0 required=5.0 tests=ALL_TRUSTED,HTML_MESSAGE X-Spam-Checker-Version: SpamAssassin 3.4.1 on Kiribati.ZEDAT.FU-Berlin.DE X-Spam-Level: Subject: Re: [Seqan-dev] String of SeqFileOut X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.16 Precedence: list Reply-To: 
SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 30 May 2015 18:34:35 -0000 --_000_16CCF226056F42BF8898393F494F1D1Fcampusfuberlinde_ Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit

Hi Alexander,

please use appendValue instead of append.
The function append concatenates one string to another.
In order to just add a new value at the end you need to call appendValue.

IHTH!

cheers,

René

On May 29, 2015, at 23:35, Alexander Vowinkel <vowinkel.alexander@gmail.com> wrote:

> Hi,
>
> I want to store some SeqFileOut in a seqan::String.
> I tried both:
> Using a seqan::String<seqan::SeqFileOut>
> and unsing a seqan::String<seqan::SeqFileOut *>
>
> Both fails. I don't understand why.
> Can someone help me with this?
> Now I'm using std::vector<seqan::SeqFileOut *> which
> does the job, but I'd like to use this as a Property Map
> for a Graph which needs to be a String if I got that right.
>
> Thanks!
> Alexander
>
> int main()
> {
>     typedef seqan::SeqFileOut TFilePointer;
>     typedef seqan::String<TFilePointer> TFilePointerSet;
>     TFilePointerSet files;
>     seqan::SeqFileIn inputFile1;
>     append(files, inputFile1);
>     seqan::SeqFileIn inputFile2;
>     append(files, inputFile2);
>     return 0;
> }

---

René Rahn
Ph.D. Student
--------------------------------
Tel:  (+49) 30 838 75277
Mail: rene.rahn@fu-berlin.de
--------------------------------
Institute of Computer Science
Algorithmic Bioinformatics (ABI)
--------------------------------
Freie Universität Berlin
Takustraße 9
14195 Berlin
--------------------------------

--_000_16CCF226056F42BF8898393F494F1D1Fcampusfuberlinde_--