From: Krakau, Sabrina
To: seqan-interests@lists.fu-berlin.de, SeqAn Development
Date: Mon, 02 Sep 2013 16:51:07 +0200
Subject: [Seqan-dev] SeqAn - BioStore Workshop 2013, Berlin, September 17th - 19th

Dear SeqAn Users and Developers,

We are looking forward to seeing you at the SeqAn - BioStore Workshop 2013, September 17th - 19th in Berlin.
There are still a few free places available, so if there are any last-minute registrations, please send an email to sabrina.krakau@fu-berlin.
You can find general information about the workshop on our BioStore website:
http://www.seqan-biostore.de/wp/seqan-workshops/2013-09-seqan-workshop/

Workshop Preparation
Please bring your own laptop for the SeqAn Tutorials. Your computer should have the following installed:
  * C++ compiler and/or an IDE like Xcode, Visual C++ or Eclipse
  * CMake (http://www.cmake.org/)
In preparation for the workshop, please go through the 'Getting Started' tutorial to install SeqAn and create a first "Hello World!" application:
http://trac.seqan.de/wiki/Tutorial/GettingStarted
Additionally, we offer a 'SeqAn Install Session' at 9:00 a.m. on the first day of the workshop in case of unforeseen difficulties.
For the KNIME Tutorial on the last day, you can already install the KNIME SDK (http://www.knime.org/node/81).

Date
September 17th - 19th
(Tuesday - Thursday)

Location
Freie Universität Berlin
Institute of Computer Science
Takustr. 9, 14195 Berlin

See you in Berlin,
The SeqAn team

--
Sabrina Krakau
Freie Universität Berlin
Institute of Computer Science
Algorithmic Bioinformatics - Project BioStore
Takustr. 9, 14195 Berlin
Phone: +49 (0)30 838 75228

From: Kamil Slowikowski
To: seqan-dev@lists.fu-berlin.de
Date: Mon, 2 Sep 2013 19:51:54 -0400
Subject: [Seqan-dev] Read GCT gene expression matrix

Hi,

I'd like to try using SeqAn for my research because it seems you can read some standard formats like GFF, GTF, BED, BAM, etc.

Could you show me an example of how to read a GCT file?

http://www.broadinstitute.org/cancer/software/genepattern/gp_guides/file-formats/sections/gct

I saw that there is a Matrix class

http://docs.seqan.de/seqan/1.4.1/CLASS_Matrix.html

so perhaps you could show a simple example of reading a GCT file, accessing rows, columns, and individual elements, and perhaps writing a GCT file?

I'd be happy to try writing this myself, but I wonder if you already have some code for this task, so I thought I should ask.

Thanks a lot!
Kamil
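
A rough sketch of the reading, element access, and writing asked about above, in plain standard C++ rather than with SeqAn's Matrix class (no SeqAn GCT reader appears in this thread, so none is assumed here). It assumes the GCT v1.2 layout: a version line "#1.2", a line with the row and column counts, a header line with "Name", "Description", and the sample names, then one tab-separated row per gene. GctMatrix, readFields, readGct, and writeGct are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for a GCT v1.2 matrix; not a SeqAn type.
struct GctMatrix
{
    std::vector<std::string> rowNames;         // first column ("Name")
    std::vector<std::string> rowDescriptions;  // second column ("Description")
    std::vector<std::string> colNames;         // sample names from the header
    std::vector<double> values;                // row-major, rows() * cols() entries

    std::size_t rows() const { return rowNames.size(); }
    std::size_t cols() const { return colNames.size(); }
    double & at(std::size_t r, std::size_t c)
    {
        assert(r < rows() && c < cols());
        return values[r * cols() + c];
    }
    double at(std::size_t r, std::size_t c) const
    {
        assert(r < rows() && c < cols());
        return values[r * cols() + c];
    }
};

// Read one line, drop a trailing '\r', and split it at tab characters.
inline std::vector<std::string> readFields(std::istream & in)
{
    std::string line;
    if (!std::getline(in, line))
        throw std::runtime_error("GCT: unexpected end of input");
    if (!line.empty() && line.back() == '\r')
        line.pop_back();
    std::vector<std::string> fields;
    std::string field;
    std::istringstream ss(line);
    while (std::getline(ss, field, '\t'))
        fields.push_back(field);
    return fields;
}

inline GctMatrix readGct(std::istream & in)
{
    std::vector<std::string> f = readFields(in);
    if (f.empty() || f[0] != "#1.2")
        throw std::runtime_error("GCT: expected version line #1.2");

    f = readFields(in);  // row count, TAB, column count
    if (f.size() < 2)
        throw std::runtime_error("GCT: expected row and column counts");
    std::size_t const nRows = std::stoul(f[0]);
    std::size_t const nCols = std::stoul(f[1]);

    f = readFields(in);  // Name, Description, then the sample names
    if (f.size() != nCols + 2)
        throw std::runtime_error("GCT: header does not match column count");
    GctMatrix m;
    m.colNames.assign(f.begin() + 2, f.end());

    m.values.reserve(nRows * nCols);
    for (std::size_t r = 0; r < nRows; ++r)
    {
        f = readFields(in);
        if (f.size() != nCols + 2)
            throw std::runtime_error("GCT: row does not match column count");
        m.rowNames.push_back(f[0]);
        m.rowDescriptions.push_back(f[1]);
        for (std::size_t c = 0; c < nCols; ++c)
            m.values.push_back(std::stod(f[c + 2]));
    }
    return m;
}

inline void writeGct(std::ostream & out, GctMatrix const & m)
{
    out << "#1.2\n" << m.rows() << '\t' << m.cols() << '\n';
    out << "Name\tDescription";
    for (std::string const & name : m.colNames)
        out << '\t' << name;
    out << '\n';
    for (std::size_t r = 0; r < m.rows(); ++r)
    {
        out << m.rowNames[r] << '\t' << m.rowDescriptions[r];
        for (std::size_t c = 0; c < m.cols(); ++c)
            out << '\t' << m.at(r, c);
        out << '\n';
    }
}
```

With this sketch, m.at(r, c) reads or writes a single element, row r occupies values[r * cols()] through values[(r + 1) * cols() - 1], and a column is visited by calling at(r, c) for every r. Note that writeGct uses the stream's default floating-point precision, so set the precision on the stream first if full round-trip accuracy matters.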
From: Hannes Hauswedell
To: seqan-dev@lists.fu-berlin.de
Date: Fri, 06 Sep 2013 16:57:24 +0200
Subject: [Seqan-dev] Web-Site and resources offline...

... does anyone know when they will be up again?

Thanks,
Hannes

From: John Reid
To: SeqAn Development
Date: Tue, 10 Sep 2013 16:40:29 +0100
Subject: Re: [Seqan-dev] Disk-based index

On 28/08/13 10:46, Siragusa, Enrico wrote:
> Hi John,
>
> On Aug 28, 2013, at 10:54 AM, John Reid <j.reid@mail.cryst.bbk.ac.uk> wrote:
>
>> Hi all,
>>
>> I would like to index the mouse or human genome with an ESA. I need to do this more than once though and would like to store the ESA on disk as it takes some hours to construct. Is this feasible? Is there any way to do this in SeqAn already?
>
> Sure. To save an index after constructing it, you can call save(index, "/path/to/index"). To load it, call open(index, "/path/to/index"). The path must be given as a C style string, so if you're using a SeqAn String, please use toCString() to convert it.

Enrico,

Thanks for the help. Is it possible to save the index into a compressed file(s)? I'm guessing the format SeqAn currently uses is not compressed.

Regards,
John.
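
The build-once, reload-later pattern from Enrico's quoted answer can be illustrated with a small stand-alone sketch. This is plain standard C++ and deliberately does not use SeqAn's save() and open(); the naive suffix array, the uncompressed file layout (a 64-bit length followed by the raw entries), and the names buildSuffixArray, saveArray, loadArray, and loadOrBuild are all assumptions made for illustration and say nothing about SeqAn's on-disk format.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <numeric>
#include <string>
#include <vector>

// Naive suffix array: sort the suffix start positions lexicographically.
// Fine for a demo; genome-scale texts need a dedicated construction algorithm.
inline std::vector<std::uint64_t> buildSuffixArray(std::string const & text)
{
    std::vector<std::uint64_t> sa(text.size());
    std::iota(sa.begin(), sa.end(), std::uint64_t(0));
    std::sort(sa.begin(), sa.end(), [&text](std::uint64_t a, std::uint64_t b) {
        return text.compare(a, std::string::npos, text, b, std::string::npos) < 0;
    });
    return sa;
}

// Write the array as: entry count (64 bit), then the raw entries.
inline bool saveArray(std::string const & path, std::vector<std::uint64_t> const & sa)
{
    std::ofstream out(path.c_str(), std::ios::binary);
    std::uint64_t const n = sa.size();
    out.write(reinterpret_cast<char const *>(&n), sizeof n);
    out.write(reinterpret_cast<char const *>(sa.data()), n * sizeof(std::uint64_t));
    return static_cast<bool>(out);
}

// Read the array back; returns false if the file is missing or truncated.
inline bool loadArray(std::string const & path, std::vector<std::uint64_t> & sa)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    std::uint64_t n = 0;
    if (!in.read(reinterpret_cast<char *>(&n), sizeof n))
        return false;
    sa.resize(n);
    in.read(reinterpret_cast<char *>(sa.data()), n * sizeof(std::uint64_t));
    return static_cast<bool>(in);
}

// Load the cached array if present; otherwise build it once and cache it.
inline std::vector<std::uint64_t> loadOrBuild(std::string const & text, std::string const & path)
{
    std::vector<std::uint64_t> sa;
    if (!loadArray(path, sa))
    {
        sa = buildSuffixArray(text);
        saveArray(path, sa);
    }
    assert(sa.size() == text.size());  // a mismatch would indicate a stale cache file
    return sa;
}
```

Calling loadOrBuild(text, path) a second time with the same path skips the construction step. In a layout like this one, compression would have to be added as a separate step on the written file.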

From: Bartha Dániel
To: SeqAn Development
Date: Wed, 11 Sep 2013 15:43:54 +0200
Subject: Re: [Seqan-dev] question about the efficiency of the sequan sequence classes

Hi Manuel and everyone,

I promised to report on the performance comparison between seqan::String<seqan::Dna5> and std::string. Here are the (for me) surprising results:

I replaced the strings and chars with the SeqAn types throughout my source files. I access the characters in the SeqAn strings through the [] operator and corrected the functions where needed.

The program does its job, but it is 5 times slower than the simple std implementation! That's not exactly what I expected; I thought it would be a little slower or much faster, but not this extreme a slowdown.

I suppose it happens because I don't use SeqAn the right way. Do you have an idea what the reason is? I paste the two responsible functions here; it would be great if someone could spend a couple of minutes on them.


Dna5 eventspace::select_event(Dna5 base, double p)
{
    /** this function only gives back a Dna5 char if the random number I pass in lies in one of the pre-stored intervals, so nothing special **/
    for(event e : E[base])
    {
        if(e.a > p)
        {
            if(p >= e.b)
            {
                return e.to;
                //which is a seqan::Dna5 character
            }
        }
    }
}

seqan::String<seqan::Dna5> replicate2(framework& sys, seqan::String<seqan::Dna5> seq, default_random_engine engine)
{
    uniform_real_distribution<> ur_dist(0, sys.Getscale());
    //this and the default_random_engine are needed for real random number generation

    vector<double> probs(length(seq));
    vector<int> index;

    for(unsigned i=0; i<probs.size(); ++i)
    {
        probs[i]=ur_dist(engine);
        if(probs[i] > sys.lookup[seq[i]])index.push_back(i);
    }
    for(unsigned i : index)
    {
        seq[i]=sys.events.select_event(seq[i],probs[i]);
        /**so practically one Dna5 = the other Dna5 variable, with assign() is it even a little slower**/
    }
    return seq;
}


Do you have any idea, or is this slowdown maybe normal?

Thanks, regards:

Daniel

Live long and prosper
Bartha Dániel
MTA-VMRI, 2013


2013/8/28 Bartha Dániel <daniel.bartha@gmail.com>
> Hi Manuel (and other C++ fellows),
>
> I will try it and tell you if it's better.
>
> But there is another problem now, and there was already a discussion about it in February (https://lists.fu-berlin.de/pipermail/seqan-dev/2013-February/msg00002.htm).
> I don't know whether it is solved or not, but I still/again get exactly the same error message:
>
> /usr/include/seqan/bam_io/cigar.h||In function ‘bool seqan::operator<(const seqan::CigarElement<TOperation, TCount>&, const seqan::CigarElement<TOperation, TCount>&)’:|
> /usr/include/seqan/bam_io/cigar.h|120|error: parse error in template argument list|
> ||=== Build finished: 1 errors, 0 warnings (0 minutes, 2 seconds) ===|
>
> This is caused by the line #include <seqan/seq_io.h>, and the program is otherwise completely empty (return 0;...). I use Ubuntu Linux amd64 and g++ 4.7.3.
>
> I bypass the usage of this header for now, but the problem doesn't seem to be unique.
>
> Thank you very much again, and have a good day!
>
> Daniel
>
> Live long and prosper
> Bartha Dániel
> MTA-VMRI, 2013


> 2013/8/28 Holtgrewe, Manuel <manuel.holtgrewe@fu-berlin.de>
>> Hi Daniel,
>>
>> it depends on your application and what you do with your strings. Using the SeqAn library can yield more elegant and faster code than using std::string or self-written string classes, but it depends on the actual use case.
>>
>> For sequences, there are two aspects:
>>
>> (1) SeqAn's Dna and Dna5 character types store the alphabet internally as the numbers 0..3 and 0..4, respectively. This makes things easier for indices and mappings since they can work directly and efficiently on the ordinal value (ordValue).
>>
>> For example, if you are counting the nucleotide content along strings, you can simply have a 5-element container (one counter per Dna5 value; a String in this case) for each position in your reads (thus a String of Strings). You do not need a mapping 'A' => 0, 'C' => 1, 'G' => 2, 'T' => 3, 'N' => 4 since the mapping is done beforehand.
>>
>> String<String<unsigned> > counters;
>> for (unsigned i = 0; i < length(reads); ++i)
>> {
>>     // Increase number of counters if reads[i] is longer than the previous reads.
>>     if (length(counters) < length(reads[i]))
>>     {
>>         unsigned oldSize = length(counters);
>>         resize(counters, length(reads[i]));
>>         for (unsigned j = oldSize; j < length(counters); ++j)
>>             resize(counters[j], 5, 0);
>>     }
>>
>>     // Count nucleotides for each position j in reads[i].
>>     for (unsigned j = 0; j < length(reads[i]); ++j)
>>         counters[j][ordValue(reads[i][j])] += 1;
>> }
>>
>> (2) SeqAn's String class additionally allows choosing an alternative implementation. The default implementation simply uses an array and would store a Dna character in a byte. By using the Packed String, you can pack four characters of the 4-letter DNA alphabet into one byte (each only needs 2 bits). This comes at the cost of some computation but in this case leads to a 4x reduction in memory consumption.
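
The 2-bits-per-base idea from point (2) can be sketched in a few lines of plain C++. This only illustrates the packing arithmetic; it is not SeqAn's Packed String, and the class name PackedDna and its interface are hypothetical. Because a 4-letter alphabet has no spare code, the sketch stores any character other than A, C, or G as T.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical 2-bit packed DNA string; not SeqAn's implementation.
class PackedDna
{
public:
    explicit PackedDna(std::string const & seq)
        : size_(seq.size()), data_((seq.size() + 3) / 4, 0)  // 4 bases per byte
    {
        for (std::size_t i = 0; i < size_; ++i)
            set(i, seq[i]);
    }

    std::size_t size() const { return size_; }         // number of bases
    std::size_t bytes() const { return data_.size(); } // bytes actually used

    // Extract the 2-bit code at position i and map it back to a letter.
    char get(std::size_t i) const
    {
        assert(i < size_);
        static char const letters[] = {'A', 'C', 'G', 'T'};
        return letters[(data_[i / 4] >> (2 * (i % 4))) & 3u];
    }

    // Clear the 2-bit slot at position i, then store the new code.
    void set(std::size_t i, char c)
    {
        assert(i < size_);
        unsigned const shift = 2 * (i % 4);
        data_[i / 4] = static_cast<std::uint8_t>(
            (data_[i / 4] & ~(3u << shift)) | (rank(c) << shift));
    }

private:
    // A => 0, C => 1, G => 2, everything else => 3 (T).
    static unsigned rank(char c)
    {
        switch (c)
        {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            default:  return 3;
        }
    }

    std::size_t size_;
    std::vector<std::uint8_t> data_;
};
```

Every get() and set() pays for a shift and a mask, which is the "cost of some computation" mentioned above, while a 10-base sequence fits into 3 bytes instead of 10.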

>> We as library writers can now combine these two aspects, sequences and alphabets, with generic programming and write algorithms that allow the user to change the alphabet type and the string implementation depending on the user's requirements and get the best possible implementation for this case. Because template specialization allows us to select the correct implementation of ordValue(), length() etc. at *compile time*, we do not need virtual functions and thus pay no cost for runtime polymorphism.
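
The compile-time selection described in this paragraph can be illustrated without SeqAn. In the sketch below, the tag types DnaTag and Dna5Tag, the trait AlphabetSize, and the functions toRank and countLetters are hypothetical names chosen for illustration; they only mimic the general technique of template specialization and tag-based overloading and are not SeqAn's actual interface. How the 4-letter variant treats unknown characters (mapped to rank 0 here) is an arbitrary choice of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical alphabet tags; the names are not SeqAn's.
struct DnaTag {};
struct Dna5Tag {};

// Trait specialized per alphabet: the value is known at compile time.
template <typename TAlphabet> struct AlphabetSize;
template <> struct AlphabetSize<DnaTag>  { static constexpr std::size_t VALUE = 4; };
template <> struct AlphabetSize<Dna5Tag> { static constexpr std::size_t VALUE = 5; };

// Overloads selected by tag type at compile time; no virtual functions.
inline std::size_t toRank(char c, DnaTag)
{
    switch (c)
    {
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return 0;  // 'A'; unknown characters also land here in this sketch
    }
}

inline std::size_t toRank(char c, Dna5Tag)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return 4;  // 'N' and anything else
    }
}

// Generic algorithm: the same source works for both alphabets, and the
// compiler resolves AlphabetSize and toRank for the chosen TAlphabet.
template <typename TAlphabet>
std::vector<std::size_t> countLetters(std::string const & seq)
{
    std::vector<std::size_t> counts(AlphabetSize<TAlphabet>::VALUE, 0);
    for (char c : seq)
    {
        std::size_t const r = toRank(c, TAlphabet());
        assert(r < counts.size());
        ++counts[r];
    }
    return counts;
}
```

A caller picks the alphabet in the template argument, for example countLetters<Dna5Tag>(read), and the choice costs nothing at run time because no dynamic dispatch is involved.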

>> If you want to use the algorithms in the SeqAn library, then you could benefit from using SeqAn sequences. However, many algorithms also work with std::string, and without knowing your application and code it is hard to make any promise about acceleration.
>>
>> Cheers,
>> Manuel
>>

>> From: Bartha Dániel [daniel.bartha@gmail.com]
>> Sent: Wednesday, August 28, 2013 11:49 AM
>> To: SeqAn Development
>> Subject: [Seqan-dev] question about the efficiency of the sequan sequence classes
>>
>> Hi All,
>>
>> I have a big question here. I wrote an application that currently uses my own custom std::string-based implementation for some DNA mutation work. I basically have to access every single character in the DNA and then do something with it, but that is not important for the question.
>>
>> I tend to rewrite the whole app with SeqAn, but that only makes sense if manipulating and accessing the SeqAn classes is significantly faster than my own code. I read about the efficiency in the Motivation chapter, but does anybody have experience with the concrete speedup that can be expected?
>>
>> Thanks!
>>
>> Regards: Daniel
>>
>> Live long and prosper
>> Bartha Dániel
>> MTA-VMRI, 2013
>>
>> _______________________________________________
>> seqan-dev mailing list
>> seqan-dev@lists.fu-berlin.de
>> https://lists.fu-berlin.de/listinfo/seqan-dev



From: John Reid
To: SeqAn Development
Date: Fri, 13 Sep 2013 11:04:28 +0100
Subject: Re: [Seqan-dev] Disk-based index

Hi Enrico,

On 28/08/13 10:46, Siragusa, Enrico wrote:
> Hi John,
>
> On Aug 28, 2013, at 10:54 AM, John Reid <j.reid@mail.cryst.bbk.ac.uk> wrote:
>
>> Hi all,
>>
>> I would like to index the mouse or human genome with an ESA. I need to do this more than once though and would like to store the ESA on disk as it takes some hours to construct. Is this feasible? Is there any way to do this in SeqAn already?
>
> Sure. To save an index after constructing it, you can call save(index, "/path/to/index"). To load it, call open(index, "/path/to/index"). The path must be given as a C-style string, so if you're using a SeqAn String, please use toCString() to convert it.
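
The build-once, save, then reload pattern discussed above can be sketched without SeqAn. This is only an illustration under stated assumptions: buildSuffixArray, saveTable and loadTable are hypothetical helper names, the on-disk layout (an entry count followed by raw 32-bit entries) is invented for this sketch and is not SeqAn's index file format, and the naive sort merely stands in for the expensive construction step. In SeqAn itself the corresponding calls are the save(index, path) and open(index, path) mentioned in the reply.

```cpp
#include <algorithm>
#include <cstdint>
#include <fstream>
#include <numeric>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: build a suffix array by plain sorting of suffixes.
// Fine for a demo, far too slow for a genome; it only stands in for the
// construction step that takes hours in the real use case.
std::vector<std::uint32_t> buildSuffixArray(const std::string& text)
{
    std::vector<std::uint32_t> sa(text.size());
    std::iota(sa.begin(), sa.end(), 0u);
    std::sort(sa.begin(), sa.end(),
              [&text](std::uint32_t a, std::uint32_t b)
              {
                  // Compare the suffix starting at a with the one starting at b.
                  return text.compare(a, std::string::npos,
                                      text, b, std::string::npos) < 0;
              });
    return sa;
}

// Hypothetical helper: write the table as a 64-bit entry count followed by
// the raw entries (assumed layout, not SeqAn's).
void saveTable(const std::vector<std::uint32_t>& sa, const std::string& path)
{
    std::ofstream out(path, std::ios::binary);
    if (!out)
        throw std::runtime_error("cannot open " + path + " for writing");
    const std::uint64_t n = sa.size();
    out.write(reinterpret_cast<const char*>(&n), sizeof n);
    out.write(reinterpret_cast<const char*>(sa.data()),
              static_cast<std::streamsize>(n * sizeof(std::uint32_t)));
    if (!out)
        throw std::runtime_error("write to " + path + " failed");
}

// Hypothetical helper: read a table back in the layout used by saveTable.
std::vector<std::uint32_t> loadTable(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path + " for reading");
    std::uint64_t n = 0;
    in.read(reinterpret_cast<char*>(&n), sizeof n);
    if (!in)
        throw std::runtime_error("missing header in " + path);
    std::vector<std::uint32_t> sa(static_cast<std::size_t>(n));
    in.read(reinterpret_cast<char*>(sa.data()),
            static_cast<std::streamsize>(n * sizeof(std::uint32_t)));
    if (!in)
        throw std::runtime_error("truncated table in " + path);
    return sa;
}
```

A program would call buildSuffixArray and saveTable once, and on later runs call only loadTable, so the construction cost is paid a single time. Writing everything into one file is a choice made for this sketch; it says nothing about how SeqAn lays out its own index files.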

Do you have any experience using this functionality with genome-sized indexes (3 Gb or so)? Would you expect it to work? I seem to be running into some issues that I need to debug, and I was wondering whether anyone else had used it in this way.

Also, the save function seems to create many files in the same directory, which I imagine could be a problem for some filesystems. Might you consider changing this? And, as mentioned before, the ability to save in a compressed format would be very attractive to me as well.

Thanks for all the great work in SeqAn,
John.


From: Rahn, René
To: SeqAn Development
Date: Fri, 13 Sep 2013 12:34:45 +0200
Subject: Re: [Seqan-dev] question about the efficiency of the sequan sequence classes
Content-Type: multipart/mixed; boundary="_004_9CAA1752576D407FB96E9F3EAEC916C0campusfuberlinde_"

--_004_9CAA1752576D407FB96E9F3EAEC916C0campusfuberlinde_
Content-Type: multipart/alternative; boundary="_000_9CAA1752576D407FB96E9F3EAEC916C0campusfuberlinde_"

--_000_9CAA1752576D407FB96E9F3EAEC916C0campusfuberlinde_
Content-Type: text/plain; charset="utf-8"

Hey Daniel,

I tried out your code examples below. I did have some surprising observations, but they are different from what you were reporting. I replaced some of your functionality. I adapted the select_event function to simply return the complement of a given base. I removed the randomness factor to select the index and simply used every index to be converted. I loaded the chr22 sequence of the human genome (~50 Mb) and measured the time of running 50 times a) the replicate function and b) the inner loop with the assignment. I did the experiments with seqan::String<Dna5>, std::vector<Dna5>, std::basic_string<Dna5> and std::string. I also implemented a replicate3 function, which performs best as it reduces the number of times whole Strings are copied.
I iterated over the index with a C++11 range-based for loop and with the standard for loop.
Here are my results, built in release mode on a 2.3 GHz Core i7.

All times are the sum of 50 experiments.

C++11 style:

                         Time         Inner Loop
Seqan String             11.18 s      2.58064 s
STL Vector               10.9798 s    2.53835 s
STL Basic String Dna5    10.6501 s    3.94554 s
STL Basic String Char    11.4799 s    4.85506 s
replicate3               8.67172 s    2.52474 s

C++98 style:

                         Time         Inner Loop
Seqan String             11.0828 s    2.49667 s
STL Vector               10.9178 s    2.54614 s
STL Basic String Dna5    10.9048 s    4.20024 s
STL Basic String Char    12.3184 s    5.61231 s
replicate3               9.55719 s    3.30052 s

As you can see, the replicate3 function outperforms the other versions. However, the inner loop gets slower when using the standard for loop, and I am not quite sure that I completely understand why, because I can't observe the same performance drop in the replicate2 function.
However, when comparing results with the C++11 version, the assignment of the seqan::String is like the std::vector and faster than the std::string versions.

Can you please give us some information about the dimension of your problem? How many sequences are you replicating? How long are the sequences?
Please consider the following performance boosters. Always prefer passing parameters by const-reference over passing them by copy (as long as you are sure these are not just simple types). Copying a big container with many values is slower than copying a 4/8 Byte reference :).

I also appended the benchmark file. So maybe you can run the tests on your machine and report your experience.


Kind regards,

René

On 11.09.2013 at 15:43, Bartha Dániel <daniel.bartha@gmail.com> wrote:

Hi Manuel and People there,

I promised to report on the performance comparison between seqan::String<seqan::Dna5> and std::string. So here are the (for me) surprising results:

I replaced the strings and chars with the seqan types all over my source files. I access the characters in the seqan strings through the [] operator and corrected the functions where needed.

The program does its job, but it's 5 times slower than the simple std implementation! That's not exactly what I expected. I thought it would be a little slower or much faster, but not this extreme slowdown.

I suppose it happens because I don't use seqan the right way. Do you have an idea what the reason is? I paste the two responsible functions here; it would be great if someone could spend a couple of minutes.


Dna5 eventspace::select_event(Dna5 base, double p)
{
    /**this function only gives back a Dna5 char if the random number I give is in one of the pre-stored intervals, so nothing special**/
    for(event e : E[base])
    {

        if(e.a > p)
        {
            if(p >= e.b)
            {
                return e.to;
                //which is a seqan::Dna5 character
            }
        }
    }
}

seqan::String<seqan::Dna5> replicate2(framework& sys, seqan::String<seqan::Dna5> seq, default_random_engine engine)
{
    uniform_real_distribution<> ur_dist(0, sys.Getscale());
    //this and the default_random_engine are needed for real random number generation

    vector<double> probs(length(seq));
    vector<int> index;

    for(unsigned i=0; i<probs.size(); ++i)
    {
        probs[i]=ur_dist(engine);
        if(probs[i] > sys.lookup[seq[i]])index.push_back(i);
    }
    for(unsigned i : index)
    {
       seq[i]=sys.events.select_event(seq[i],probs[i]);
       /**so practically one Dna5 = the other Dna5 variable, with assign() it is even a little slower**/
    }
return seq;}

Do you have any idea, or is this slowdown maybe normal?

Thanks, regards:

Daniel

Live long and prosper
Bartha Dániel
MTA-VMRI, 2013


2013/8/28 Bartha Dániel <daniel.bartha@gmail.com>
Hi Manuel (and other C++ fellows),

I will try it and tell you if it's better.

But there is another problem now, and there was already a discussion about it in February (https://lists.fu-berlin.de/pipermail/seqan-dev/2013-February/msg00002.htm).
I don't know if it is solved or not, but I still/again get exactly the same error message:

/usr/include/seqan/bam_io/cigar.h||In function ‘bool seqan::operator<(const seqan::CigarElement<TOperation, TCount>&, const seqan::CigarElement<TOperation, TCount>&)’:|
/usr/include/seqan/bam_io/cigar.h|120|error: parse error in template argument list|
||=== Build finished: 1 errors, 0 warnings (0 minutes, 2 seconds) ===|

This is caused by including #include <seqan/seq_io.h>, and the program is completely empty (return 0;...). I use Ubuntu Linux amd64 and g++ 4.7.3.

I bypass the usage of this header for now, but the problem doesn't seem to be unique.

Thank you very much again, and have a good day!

Daniel


Live long and prosper
Bartha Dániel
MTA-VMRI, 2013


2013/8/28 Holtgrewe, Manuel <manuel.holtgrewe@fu-berlin.de>
Hi Daniel,

it depends on your application and what you do with your strings. Using the SeqAn library can yield more elegant and faster code than using std::string or self-written string classes but it depends on the actual use case.

For Sequences, there are two aspects:

(1) Using SeqAn's Dna5, Dna for characters stores the alphabet as numbers 0..3/4 internally. This makes it easier for indices and mappings since they can work directly and efficiently on the ordinal value (ordValue).

For example, if you are counting the nucleotide content along strings, you can simply have a 4-element container (String in this case; 5 elements for Dna5, as in the code below) for each position in your reads (thus a String of Strings). Thus, you do not need a possible mapping for 'A' => 0, 'C' => 1, 'G' => 2, 'T' => 3, 'N' => 4 since the mapping is done beforehand.

String<String<unsigned> > counters;
for (unsigned i = 0; i < length(reads); ++i)
{
    // Increase number of counters if reads[i] is longer than the previous reads.
    if (length(counters) < length(reads[i]))
    {
        unsigned oldSize = length(counters);
        resize(counters, length(reads[i]));
        for (unsigned j = oldSize; j < length(counters); ++j)
            resize(counters[j], 5, 0);
    }

    // Count nucleotides for each position in reads[i];
    // counters is indexed by position j first, then by ordinal value.
    for (unsigned j = 0; j < length(reads[i]); ++j)
        counters[j][ordValue(reads[i][j])] += 1;
}

(2) SeqAn's String class additionally allows giving an alternative implementation. The default implementation simply uses an array and would store a Dna character in a Byte. By using the Packed String, you can pack four DNA characters (4-letter alphabet) into one Byte (each only needs 2 bits). This comes at the cost of some computation but in this case leads to a 4x memory consumption reduction.

We as library writers can now combine these two aspects of sequences and alphabets with generic programming and write algorithms that allow the user to change the alphabet type and the string implementation depending on the user's requirements and get the best possible implementation for this case. Because template specialization allows us to decide on the correct implementation of ordValue(), length() etc. at *compile time*, we do not need virtual functions and thus no cost for runtime polymorphism.

If you want to use the algorithms in the SeqAn library then you could benefit from using SeqAn sequences. However, many algorithms also work with std::string and without knowing your application and code it is hard to make any promise on acceleration.

Cheers,
Manuel

________________________________
From: Bartha Dániel [daniel.bartha@gmail.com]
Sent: Wednesday, August 28, 2013 11:49 AM
To: SeqAn Development
Subject: [Seqan-dev] question about the efficiency of the sequan sequence classes

Hi All,

I have a big question here. I wrote an application that currently uses my own custom std::string based implementation for some DNA mutation stuff. I basically have to access every single character in the DNA and then do something with them, but that is not important for the question.

I am inclined to rewrite the whole app with SeqAn, but that only makes sense if manipulating and accessing the SeqAn classes is significantly faster than my own implementation. I read about the efficiency in the Motivation chapter, but does anybody have experience with the concrete speedup one can expect?

Thanks!

Regards: Daniel

Live long and prosper
Bartha Dániel
MTA-VMRI, 2013

_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev

---

René Rahn
Ph.D. Student
--------------------------------
Tel:  (+49) 30 838 75277
Mail: rene.rahn@fu-berlin.de
--------------------------------
Institute of Computer Science
Algorithmic Bioinformatics (ABI)
--------------------------------
Freie Universität Berlin
Takustraße 9
14195 Berlin
--------------------------------

--_000_9CAA1752576D407FB96E9F3EAEC916C0campusfuberlinde_ Content-Type: text/html; charset="utf-8" Content-ID: Content-Transfer-Encoding: base64 PGh0bWw+DQo8aGVhZD4NCjxtZXRhIGh0dHAtZXF1aXY9IkNvbnRlbnQtVHlwZSIgY29udGVudD0i dGV4dC9odG1sOyBjaGFyc2V0PXV0Zi04Ij4NCjwvaGVhZD4NCjxib2R5Pg0KPGRpdiBzdHlsZT0i d29yZC13cmFwOmJyZWFrLXdvcmQiPkhleSBEYW5pZWwsJm5ic3A7DQo8ZGl2Pjxicj4NCjwvZGl2 Pg0KPGRpdj5JIHRyaWVkIG91dCB5b3VyIGNvZGUgZXhhbXBsZXMgYmVsb3cuIEkgZGlkIGhhdmUg c29tZSBzdXJwcmlzaW5nIG9ic2VydmF0aW9ucyBidXQgdGhlcmUgYXJlIGRpZmZlcmVudCBmcm9t IHdoYXQgeW91IHdoZXJlIHJlcG9ydGluZy4gSSByZXBsYWNlZCBzb21lIG9mIHlvdXIgZnVuY3Rp b25hbGl0eS4gSSBhZGFwdGVkIHRoZSBzZWxlY3RfZXZlbnQgZnVuY3Rpb24gdG8gc2ltcGx5IHJl dHVybiB0aGUgY29tcGxlbWVudCBvZiBhIGdpdmVuIGJhc2UuDQogSSByZW1vdmVkIHRoZSByYW5k b21uZXNzIGZhY3RvciB0byBzZWxlY3QgdGhlIGluZGV4IGFuZCBzaW1wbHkgdXNlZCBldmVyeSBp bmRleCB0byBiZSBjb252ZXJ0ZWQuIEkgbG9hZGVkIHRoZSBjaHIyMiBzZXF1ZW5jZSBvZiB0aGUg
aHVtYW4gZ2Vub21lICh+NTAgTWIpICZuYnNwO2FuZCBtZWFzdXJlZCB0aGUgdGltZSBvZiBydW5u aW5nIDUwIHRpbWVzIGEpIHRoZSByZXBsaWNhdGUgZnVuY3Rpb24gYW5kIGIpIHRoZSBpbm5lciBs b29wIHdpdGggdGhlIGFzc2lnbm1lbnQuDQogSSBkaWQgdGhlIGV4cGVyaW1lbnRzIHdpdGggdGhl IHNlcWFuOjpTdHJpbmcmbHQ7RG5hNSZndDssIHN0ZDo6dmVjdG9yJmx0O0RuYTUmZ3Q7ICwgc3Rk OjpiYXNpY19zdHJpbmcmbHQ7RG5hNSZndDsgYW5kIHN0ZDo6c3RyaW5nLiBJIGFsc28gaW1wbGVt ZW50ZWQgYSByZXBsaWNhdGUzIGZ1bmN0aW9uIHdoaWNoIHBlcmZvcm1zIGJlc3QgYXMgaXQgcmVk dWNlcyB0aGUgbnVtYmVyIG9mIGNvcHlpbmcgd2hvbGUgU3RyaW5ncy48L2Rpdj4NCjxkaXY+SSBk aWQgdGhlIHBhcnNpbmcgb3ZlciB0aGUgaW5kZXggd2l0aCBhbiBjJiM0MzsmIzQzOzExIHJhbmdl LWJhc2VkIGZvciBsb29wIGFuZCB0aGUgc3RhbmRhcmQgZm9yIGxvb3AuPC9kaXY+DQo8ZGl2Pkhl cmUgYXJlIG15IHJlc3VsdHMgYnVpbHQgaW4gcmVsZWFzZSBtb2RlIG9uIGEgMi4zIEdIeiBDb3Jl IGk3LjwvZGl2Pg0KPGRpdj48YnI+DQo8L2Rpdj4NCjxkaXY+QWxsIHRpbWVzIGFyZSB0aGUgc3Vt IG9mIDUwIGV4cGVyaW1lbnRzLjwvZGl2Pg0KPGRpdj48YnI+DQo8L2Rpdj4NCjxkaXY+QyYjNDM7 JiM0MzsxMSBzdHlsZTo8L2Rpdj4NCjxkaXY+PGJyPg0KPC9kaXY+DQo8ZGl2Pg0KPGRpdj48Zm9u dCBmYWNlPSJDb25zb2xhcyI+U2VxYW4gU3RyaW5nPHNwYW4gY2xhc3M9InhfQXBwbGUtdGFiLXNw YW4iIHN0eWxlPSJ3aGl0ZS1zcGFjZTpwcmUiPg0KPC9zcGFuPjwvZm9udD48c3BhbiBzdHlsZT0i Zm9udC1mYW1pbHk6Q29uc29sYXMiPlRpbWU6IDExLjE4IHMuICZuYnNwOyBJbm5lciBMb29wOiAy LjU4MDY0IHMuPC9zcGFuPjwvZGl2Pg0KPGRpdj48Zm9udCBmYWNlPSJDb25zb2xhcyI+U1RMIFZl Y3RvcjxzcGFuIGNsYXNzPSJ4X0FwcGxlLXRhYi1zcGFuIiBzdHlsZT0id2hpdGUtc3BhY2U6cHJl Ij4NCjwvc3Bhbj48L2ZvbnQ+PHNwYW4gc3R5bGU9ImZvbnQtZmFtaWx5OkNvbnNvbGFzIj5UaW1l OiAxMC45Nzk4IHMuIElubmVyJm5ic3A7PC9zcGFuPjxzcGFuIHN0eWxlPSJmb250LWZhbWlseTpD b25zb2xhcyI+TG9vcDwvc3Bhbj48c3BhbiBzdHlsZT0iZm9udC1mYW1pbHk6Q29uc29sYXMiPjog Mi41MzgzNSBzLjwvc3Bhbj48L2Rpdj4NCjxkaXY+PGZvbnQgZmFjZT0iQ29uc29sYXMiPlNUTCBC YXNpYyBTdHJpbmcgRG5hNTxzcGFuIGNsYXNzPSJ4X0FwcGxlLXRhYi1zcGFuIiBzdHlsZT0id2hp dGUtc3BhY2U6cHJlIj4NCjwvc3Bhbj48L2ZvbnQ+PHNwYW4gc3R5bGU9ImZvbnQtZmFtaWx5OkNv bnNvbGFzIj5UaW1lOiAxMC42NTAxIHMuIElubmVyJm5ic3A7PC9zcGFuPjxzcGFuIHN0eWxlPSJm 
b250LWZhbWlseTpDb25zb2xhcyI+TG9vcDwvc3Bhbj48c3BhbiBzdHlsZT0iZm9udC1mYW1pbHk6 Q29uc29sYXMiPjogMy45NDU1NCBzLjwvc3Bhbj48L2Rpdj4NCjxkaXY+PGZvbnQgZmFjZT0iQ29u c29sYXMiPlNUTCBCYXNpYyBTdHJpbmcgQ2hhcjxzcGFuIGNsYXNzPSJ4X0FwcGxlLXRhYi1zcGFu IiBzdHlsZT0id2hpdGUtc3BhY2U6cHJlIj4NCjwvc3Bhbj48L2ZvbnQ+PHNwYW4gc3R5bGU9ImZv bnQtZmFtaWx5OkNvbnNvbGFzIj5UaW1lOiAxMS40Nzk5IHMuIElubmVyJm5ic3A7PC9zcGFuPjxz cGFuIHN0eWxlPSJmb250LWZhbWlseTpDb25zb2xhcyI+TG9vcDwvc3Bhbj48c3BhbiBzdHlsZT0i Zm9udC1mYW1pbHk6Q29uc29sYXMiPjogNC44NTUwNiBzLjwvc3Bhbj48L2Rpdj4NCjxkaXY+PGZv bnQgZmFjZT0iQ29uc29sYXMiPnJlcGxpY2F0ZTM8c3BhbiBjbGFzcz0ieF9BcHBsZS10YWItc3Bh biIgc3R5bGU9IndoaXRlLXNwYWNlOnByZSI+DQo8L3NwYW4+PHNwYW4gY2xhc3M9InhfQXBwbGUt dGFiLXNwYW4iIHN0eWxlPSJ3aGl0ZS1zcGFjZTpwcmUiPjwvc3Bhbj5UaW1lOiA4LjY3MTcyIHMu IElubmVyJm5ic3A7PC9mb250PjxzcGFuIHN0eWxlPSJmb250LWZhbWlseTpDb25zb2xhcyI+TG9v cDwvc3Bhbj48Zm9udCBmYWNlPSJDb25zb2xhcyI+OiAyLjUyNDc0IHMuPC9mb250PjwvZGl2Pg0K PC9kaXY+DQo8ZGl2PjxzcGFuIHN0eWxlPSJmb250LWZhbWlseTpDb25zb2xhcyI+PGJyPg0KPC9z cGFuPjwvZGl2Pg0KPGRpdj48c3BhbiBzdHlsZT0iZm9udC1mYW1pbHk6Q29uc29sYXMiPkMmIzQz OyYjNDM7OTggc3R5bGU8L3NwYW4+PC9kaXY+DQo8ZGl2Pjxicj4NCjwvZGl2Pg0KPGRpdj4NCjxk aXY+PGZvbnQgZmFjZT0iQ29uc29sYXMiPlNlcWFuIFN0cmluZzxzcGFuIGNsYXNzPSJ4X0FwcGxl LXRhYi1zcGFuIiBzdHlsZT0id2hpdGUtc3BhY2U6cHJlIj4NCjwvc3Bhbj48L2ZvbnQ+PHNwYW4g c3R5bGU9ImZvbnQtZmFtaWx5OkNvbnNvbGFzIj5UaW1lOiAxMS4wODI4IHMuIElubmVyIExvb3A6 IDIuNDk2Njcgcy48L3NwYW4+PC9kaXY+DQo8ZGl2Pjxmb250IGZhY2U9IkNvbnNvbGFzIj5TVEwg VmVjdG9yPHNwYW4gY2xhc3M9InhfQXBwbGUtdGFiLXNwYW4iIHN0eWxlPSJ3aGl0ZS1zcGFjZTpw cmUiPg0KPC9zcGFuPjwvZm9udD48c3BhbiBzdHlsZT0iZm9udC1mYW1pbHk6Q29uc29sYXMiPlRp bWU6IDEwLjkxNzggcy4gSW5uZXImbmJzcDs8L3NwYW4+PHNwYW4gc3R5bGU9ImZvbnQtZmFtaWx5 OkNvbnNvbGFzIj5Mb29wPC9zcGFuPjxzcGFuIHN0eWxlPSJmb250LWZhbWlseTpDb25zb2xhcyI+ OiAyLjU0NjE0IHMuPC9zcGFuPjwvZGl2Pg0KPGRpdj48Zm9udCBmYWNlPSJDb25zb2xhcyI+U1RM IEJhc2ljIFN0cmluZyBEbmE1PHNwYW4gY2xhc3M9InhfQXBwbGUtdGFiLXNwYW4iIHN0eWxlPSJ3 
aGl0ZS1zcGFjZTpwcmUiPg0KPC9zcGFuPjwvZm9udD48c3BhbiBzdHlsZT0iZm9udC1mYW1pbHk6 Q29uc29sYXMiPlRpbWU6IDEwLjkwNDggcy4gSW5uZXImbmJzcDs8L3NwYW4+PHNwYW4gc3R5bGU9 ImZvbnQtZmFtaWx5OkNvbnNvbGFzIj5Mb29wPC9zcGFuPjxzcGFuIHN0eWxlPSJmb250LWZhbWls eTpDb25zb2xhcyI+OiA0LjIwMDI0IHMuPC9zcGFuPjwvZGl2Pg0KPGRpdj48Zm9udCBmYWNlPSJD b25zb2xhcyI+U1RMIEJhc2ljIFN0cmluZyBDaGFyPHNwYW4gY2xhc3M9InhfQXBwbGUtdGFiLXNw YW4iIHN0eWxlPSJ3aGl0ZS1zcGFjZTpwcmUiPg0KPC9zcGFuPjwvZm9udD48c3BhbiBzdHlsZT0i Zm9udC1mYW1pbHk6Q29uc29sYXMiPlRpbWU6IDEyLjMxODQgcy4gSW5uZXImbmJzcDs8L3NwYW4+ PHNwYW4gc3R5bGU9ImZvbnQtZmFtaWx5OkNvbnNvbGFzIj5Mb29wPC9zcGFuPjxzcGFuIHN0eWxl PSJmb250LWZhbWlseTpDb25zb2xhcyI+OiA1LjYxMjMxIHMuPC9zcGFuPjwvZGl2Pg0KPGRpdj48 Zm9udCBmYWNlPSJDb25zb2xhcyI+cmVwbGlhY3RlMzxzcGFuIGNsYXNzPSJ4X0FwcGxlLXRhYi1z cGFuIiBzdHlsZT0id2hpdGUtc3BhY2U6cHJlIj4NCjwvc3Bhbj48c3BhbiBjbGFzcz0ieF9BcHBs ZS10YWItc3BhbiIgc3R5bGU9IndoaXRlLXNwYWNlOnByZSI+PC9zcGFuPjwvZm9udD48c3BhbiBz dHlsZT0iZm9udC1mYW1pbHk6Q29uc29sYXMiPlRpbWU6IDkuNTU3MTkgcy4gSW5uZXImbmJzcDs8 L3NwYW4+PHNwYW4gc3R5bGU9ImZvbnQtZmFtaWx5OkNvbnNvbGFzIj5Mb29wPC9zcGFuPjxzcGFu IHN0eWxlPSJmb250LWZhbWlseTpDb25zb2xhcyI+OiAzLjMwMDUyIHMuPC9zcGFuPjwvZGl2Pg0K PC9kaXY+DQo8ZGl2PjxzcGFuIHN0eWxlPSJmb250LWZhbWlseTpDb25zb2xhcyI+PGJyPg0KPC9z cGFuPjwvZGl2Pg0KPGRpdj5BcyB5b3UgY2FuIHNlZSB0aGUgcmVwbGljYXRlMyBmdW5jdGlvbiBv dXRwZXJmb3JtcyB0aGUgb3RoZXIgdmVyc2lvbnMsIGhvd2V2ZXIgdGhlIGlubmVyIGxvb3AgZ2V0 cyBzbG93ZXIgd2hlbiB1c2luZyB0aGUgc3RhbmRhcmQgZm9yIGxvb3AsIGFuZCBJIGFtIG5vdCBx dWl0ZSBzdXJlIHRoYXQgSSBjb21wbGV0ZWx5IHVuZGVyc3RhbmQgd2h5LCBiZWNhdXNlIEkgY2Fu J3Qgb2JzZXJ2ZSB0aGUgc2FtZSBwZXJmb3JtYW5jZSBkcm9wIGluIHRoZQ0KIHJlcGxpY2F0ZTIg ZnVuY3Rpb24uPC9kaXY+DQo8ZGl2Pkhvd2V2ZXIsIHdoZW4gY29tcGFyaW5nIHJlc3VsdHMgd2l0 aCB0aGUgQyYjNDM7JiM0MzsxMSB2ZXJzaW9uIHRoZSBhc3NpZ25tZW50IG9mIHRoZSBzZXFhbjo6 U3RyaW5nIGlzIGxpa2UgdGhlIHN0ZDo6dmVjdG9yIGFuZCBmYXN0ZXIgdGhhbiB0aGUgc3RkOjpz dHJpbmcgdmVyc2lvbnMuJm5ic3A7PC9kaXY+DQo8ZGl2Pjxicj4NCjwvZGl2Pg0KPGRpdj5DYW4g 
eW91IHBsZWFzZSBnaXZlIHVzIHNvbWUgaW5mb3JtYXRpb24gYWJvdXQgdGhlIGRpbWVuc2lvbiBv ZiB5b3UgcHJvYmxlbS4gSG93IG1hbnkgc2VxdWVuY2VzIGFyZSB5b3UgcmVwbGljYXRpbmc/IEhv dyBsb25nIGFyZSB0aGUgc2VxdWVuY2VzPzwvZGl2Pg0KPGRpdj5QbGVhc2UgY29uc2lkZXIgdGhl IGZvbGxvd2luZyBwZXJmb3JtYW5jZSBib29zdGVycy4gQWx3YXlzIHByZWZlciBwYXNzaW5nIHBh cmFtZXRlcnMgYnkgY29uc3QtcmVmZXJlbmNlIG92ZXIgcGFzc2luZyB0aGVtIGJ5IGNvcHkgKGFz IGxvbmcgYXMgeW91IGFyZSBzdXJlIHRoZXNlIGFyZSBub3QganVzdCBzaW1wbGUgdHlwZXMpLiBD b3B5aW5nIGEgYmlnIGNvbnRhaW5lciB3aXRoIG1hbnkgdmFsdWVzIGlzIHNsb3dlciB0aGFuIGNv cHlpbmcNCiBhIDQvOCBCeXRlIHJlZmVyZW5jZSA6KS48L2Rpdj4NCjxkaXY+PGJyPg0KPC9kaXY+ DQo8ZGl2PkkgYWxzbyBhcHBlbmRlZCB0aGUgYmVuY2htYXJrIGZpbGUuIFNvIG1heWJlIHlvdSBj YW4gcnVuIHRoZSB0ZXN0cyBvbiB5b3VyIG1hY2hpbmUgYW5kIHJlcG9ydCB5b3VyIGV4cGVyaWVu Y2UuPC9kaXY+DQo8ZGl2Pjxicj4NCjwvZGl2Pg0KPGRpdj48L2Rpdj4NCjwvZGl2Pg0KPGRpdiBz dHlsZT0id29yZC13cmFwOmJyZWFrLXdvcmQiPg0KPGRpdj48L2Rpdj4NCjxkaXY+PGJyPg0KPC9k aXY+DQo8ZGl2PktpbmQgcmVnYXJkcywmbmJzcDs8L2Rpdj4NCjxkaXY+PGJyPg0KPC9kaXY+DQo8 ZGl2PlJlbu+/vTwvZGl2Pg0KPGRpdj48YnI+DQo8ZGl2Pg0KPGRpdj5BbSAxMS4wOS4yMDEzIHVt IDE1OjQzIHNjaHJpZWIgQmFydGhhIETvv71uaWVsICZsdDs8YSBocmVmPSJtYWlsdG86ZGFuaWVs LmJhcnRoYUBnbWFpbC5jb20iPmRhbmllbC5iYXJ0aGFAZ21haWwuY29tPC9hPiZndDs6PC9kaXY+ DQo8YnIgY2xhc3M9InhfQXBwbGUtaW50ZXJjaGFuZ2UtbmV3bGluZSI+DQo8YmxvY2txdW90ZSB0 eXBlPSJjaXRlIj4NCjxkaXYgZGlyPSJsdHIiPg0KPGRpdj4NCjxkaXY+DQo8ZGl2Pg0KPGRpdj4N CjxkaXY+SGkgTWFudWVsIGFuZCBQZW9wbGUgdGhlcmUsPGJyPg0KPGJyPg0KPC9kaXY+DQppIHBy b21pc2VkIHRvIHJlcG9ydCBvdmVyIHRoZSBwZXJmb3JtYW5jZSBjb21wYXJzaW9uIGJldHdlZW4g c2VxYW46OlN0cmluZyZsdDtzZXFhbjo6RG5hNSZndDsgYW5kIHN0ZDo6c3RyaW5nLiBTbyBoZXJl IGFyZSB0aGUgKGZvciBtZSkgc3VycHJpc2luZyByZXN1bHRzOjxicj4NCjxicj4NCjwvZGl2Pg0K SSByZXBsYWNlZCB0aGUgc3RyaW5ncyBhbmQgY2hhcnMgd2l0aCB0aGUgc2VxYW4gdHlwZXMgaW4g YWxsIG92ZXIgbXkgc291cmNlIGZpbGVzLiBJIGFjY2VzcyB0aGUgY2hhcmFjdGVycyBpbiB0aGUg c2VxYW4gc3RyaW5ncyB0cm91Z2ggW10gb3BlcmF0b3IgYW5kIGNvcnJlY3RlZCB0aGUgZnVuY3Rp 
b25zIHdoZXJlIG5lZWRlZC48YnI+DQo8YnI+DQo8L2Rpdj4NClRoZSBwcm9ncmFtIGRvZXMgaXRz IGpvYiwgYnV0IGl0cyA1IHRpbWVzIHNsb3dlciB0aGVuIHRoZSBzaW1wbGUgc3RkIGltcGxlbWVu dGF0aW9uISBUaGF0cyBub3QgZXhhY3RseSB3aGF0IGkgZXhwZWN0ZWQsIGkgdGhvdWdodCBpdCB3 aWxsIGJlIGEgbGl0dGxlIHNsb3dlciBvciBtdWNoIGZhc3RlciwgYnV0IG5vdCB0aGlzIGV4dHJl bWUgc2xvd2Rvd24uDQo8YnI+DQo8YnI+DQo8L2Rpdj4NCkkgc3VwcG9zZSBpdCBoYXBwZW5zIGJl Y2F1c2UgaSBkb250IHVzZSBzZXFhbiB0aGUgcmlnaHQgd2F5LiBEbyB5b3UgaGF2ZSBhbiBpZGVh LCB3aGF0cyB0aGUgcmVhc29uPyBJIHBhc3RlIGhlcmUgdGhlIHJlc3BvbnNpYmxlIHR3byBmdW5j dGlvbnMsIGl0IHdvdWxkIGJlIGdyZWF0LCBpZiBzb21lb25lIGNvdWxkIHNwZW5kIGEgY291cGxl IG9mIG1pbnV0ZXMuPGJyPg0KPGJyPg0KPGJyPg0KPHNwYW4gc3R5bGU9ImZvbnQtZmFtaWx5OmNv dXJpZXIgbmV3LG1vbm9zcGFjZSI+PGI+RG5hNSBldmVudHNwYWNlOjpzZWxlY3RfZXZlbnQoRG5h NSBiYXNlLCBkb3VibGUgcCk8L2I+PGJyPg0Kezxicj4NCjwvc3Bhbj48L2Rpdj4NCjxkaXY+PHNw YW4gc3R5bGU9ImZvbnQtZmFtaWx5OmNvdXJpZXIgbmV3LG1vbm9zcGFjZSI+Jm5ic3A7Jm5ic3A7 Jm5ic3A7IDxzcGFuIHN0eWxlPSJjb2xvcjpyZ2IoMTA2LDE2OCw3OSkiPg0KLyoqdGhpcyBmdW5j dGlvbiBkb2VzIG9ubHkgZ2l2ZXMgYmFjayBhIERuYTUgY2hhciwgaWYgdGhlIHJhbmRvbSBudW1i ZXIgaSBnaXZlIGlzIGluIHNvbWUgb2YgdGhlIHByZS1zdG9yZWQgaW50ZXJ2YWxzLCBzbyBub3Ro aW5nIHNwZWNpYWwqKi88L3NwYW4+PGJyPg0KPC9zcGFuPjwvZGl2Pg0KPGRpdj48c3BhbiBzdHls ZT0iZm9udC1mYW1pbHk6Y291cmllciBuZXcsbW9ub3NwYWNlIj4mbmJzcDsmbmJzcDsmbmJzcDsg Zm9yKGV2ZW50IGUgOiBFW2Jhc2VdKTxicj4NCiZuYnNwOyZuYnNwOyZuYnNwOyB7PGJyPg0KPGJy Pg0KJm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7IGlmKGUuYSAmZ3Q7 IHApPGJyPg0KJm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7IHs8YnI+ DQombmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsm bmJzcDsmbmJzcDsgaWYocCAmZ3Q7PSBlLmIpPGJyPg0KJm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7 Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7IHs8YnI+DQombmJzcDsm bmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJz cDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsgcmV0dXJuIDxhIGhyZWY9Imh0dHA6Ly9lLnRvLyI+ 
ZS50bzwvYT47PGJyPg0KPC9zcGFuPjwvZGl2Pg0KPGRpdj48c3BhbiBzdHlsZT0iZm9udC1mYW1p bHk6Y291cmllciBuZXcsbW9ub3NwYWNlIj4mbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsm bmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJz cDsgPHNwYW4gc3R5bGU9ImNvbG9yOnJnYigxMDYsMTY4LDc5KSI+DQovL3doaWNoIGlzIGEgc2Vx YW46OkRuYTUgY2hhcmFjdGVyPC9zcGFuPjxicj4NCjwvc3Bhbj48L2Rpdj4NCjxkaXY+PHNwYW4g c3R5bGU9ImZvbnQtZmFtaWx5OmNvdXJpZXIgbmV3LG1vbm9zcGFjZSI+Jm5ic3A7Jm5ic3A7Jm5i c3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7IH08YnI+ DQombmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsgfTxicj4NCiZuYnNw OyZuYnNwOyZuYnNwOyB9PGJyPg0KfTwvc3Bhbj48YnI+DQo8YnI+DQo8c3BhbiBzdHlsZT0iZm9u dC1mYW1pbHk6Y291cmllciBuZXcsbW9ub3NwYWNlIj48Yj5zZXFhbjo6U3RyaW5nJmx0O3NlcWFu OjpEbmE1Jmd0OyByZXBsaWNhdGUyKGZyYW1ld29yayZhbXA7IHN5cywgc2VxYW46OlN0cmluZyZs dDtzZXFhbjo6RG5hNSZndDsgc2VxLCBkZWZhdWx0X3JhbmRvbV9lbmdpbmUgZW5naW5lKTwvYj48 YnI+DQp7PGJyPg0KJm5ic3A7Jm5ic3A7Jm5ic3A7IHVuaWZvcm1fcmVhbF9kaXN0cmlidXRpb24m bHQ7Jmd0OyB1cl9kaXN0KDAsIHN5cy5HZXRzY2FsZSgpKTs8YnI+DQo8L3NwYW4+PC9kaXY+DQo8 c3BhbiBzdHlsZT0iZm9udC1mYW1pbHk6Y291cmllciBuZXcsbW9ub3NwYWNlIj4mbmJzcDsmbmJz cDsmbmJzcDsgPHNwYW4gc3R5bGU9ImNvbG9yOnJnYigxMDYsMTY4LDc5KSI+DQovL3RoaXMgYW5k IHRoZSBkZWZhdWx0X3JhbmRvbV9lbmdpbmUgYXJlIG5lZWRlZCBmb3IgcmVhbCByYW5kb20gbnVt YmVyIGdlbmVyYXRpb248L3NwYW4+PGJyPg0KPC9zcGFuPg0KPGRpdj48c3BhbiBzdHlsZT0iZm9u dC1mYW1pbHk6Y291cmllciBuZXcsbW9ub3NwYWNlIj48YnI+DQombmJzcDsmbmJzcDsmbmJzcDsg dmVjdG9yJmx0O2RvdWJsZSZndDsgcHJvYnMobGVuZ3RoKHNlcSkpOzxicj4NCjwvc3Bhbj48L2Rp dj4NCjxkaXY+PHNwYW4gc3R5bGU9ImZvbnQtZmFtaWx5OmNvdXJpZXIgbmV3LG1vbm9zcGFjZSI+ Jm5ic3A7Jm5ic3A7Jm5ic3A7IHZlY3RvciZsdDtpbnQmZ3Q7IGluZGV4Ozxicj4NCjxicj4NCiZu YnNwOyZuYnNwOyZuYnNwOyBmb3IodW5zaWduZWQgaT0wOyBpJmx0O3Byb2JzLnNpemUoKTsgJiM0 MzsmIzQzO2kpPGJyPg0KJm5ic3A7Jm5ic3A7Jm5ic3A7IHs8YnI+DQombmJzcDsmbmJzcDsmbmJz cDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsgcHJvYnNbaV09dXJfZGlzdChlbmdpbmUpOzxicj4N 
CiZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyBpZihwcm9ic1tpXSAm Z3Q7IHN5cy5sb29rdXBbc2VxW2ldXSlpbmRleC5wdXNoX2JhY2soaSk7PGJyPg0KJm5ic3A7Jm5i c3A7Jm5ic3A7IH08YnI+DQombmJzcDsmbmJzcDsmbmJzcDsgZm9yKHVuc2lnbmVkIGkgOiBpbmRl eCk8YnI+DQombmJzcDsmbmJzcDsmbmJzcDsgezxicj4NCiZuYnNwOyZuYnNwOyZuYnNwOyZuYnNw OyZuYnNwOyZuYnNwOyBzZXFbaV09c3lzLmV2ZW50cy48c3BhbiBzdHlsZT0iY29sb3I6cmdiKDI1 NSwwLDApIj5zZWxlY3RfZXZlbnQoc2VxW2ldLHByb2JzW2ldKTwvc3Bhbj47PGJyPg0KPC9zcGFu PjwvZGl2Pg0KPGRpdj48c3BhbiBzdHlsZT0iZm9udC1mYW1pbHk6Y291cmllciBuZXcsbW9ub3Nw YWNlIj4mbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsmbmJzcDsgPHNwYW4gc3R5bGU9ImNv bG9yOnJnYigxMDYsMTY4LDc5KSI+DQovKipzbyBwcmFjdGljYWxseSBvbmUgRG5hNSA9IHRoZSBv dGhlciBEbmE1IHZhcmlhYmxlLCB3aXRoIGFzc2lnbigpIGlzIGl0IGV2ZW4gYSBsaXR0bGUgc2xv d2VyKiovPC9zcGFuPjxicj4NCjwvc3Bhbj48L2Rpdj4NCjxkaXY+PHNwYW4gc3R5bGU9ImZvbnQt ZmFtaWx5OmNvdXJpZXIgbmV3LG1vbm9zcGFjZSI+Jm5ic3A7Jm5ic3A7Jm5ic3A7IH08YnI+DQpy ZXR1cm4gc2VxO308L3NwYW4+PGJyPg0KPGJyPg0KPC9kaXY+DQo8ZGl2PkRvIHlvdSBoYXZlIGFu eSBpZGVhLCBvciBpcyB0aGlzIHNsb3dkb3duIG1heWJlIG5vcm1hbD88YnI+DQo8YnI+DQo8L2Rp dj4NCjxkaXY+VGhhbmtzLCByZWdhcmRzOjxicj4NCjxicj4NCjwvZGl2Pg0KPGRpdj5EYW5pZWw8 YnI+DQo8L2Rpdj4NCjwvZGl2Pg0KPGRpdiBjbGFzcz0ieF9nbWFpbF9leHRyYSI+PGJyIGNsZWFy PSJhbGwiPg0KPGRpdj4NCjxkaXYgZGlyPSJsdHIiPjxmb250IGNvbG9yPSIjNjY2NjY2Ij5MaXZl IGxvbmcgYW5kIHByb3NwZXI8YnI+DQo8L2ZvbnQ+DQo8ZGl2Pjxmb250IGNvbG9yPSIjNjY2NjY2 Ij5CYXJ0aGEgRO+/vW5pZWw8YnI+DQpNVEEtVk1SSSwgMjAxMzwvZm9udD48L2Rpdj4NCjwvZGl2 Pg0KPC9kaXY+DQo8YnI+DQo8YnI+DQo8ZGl2IGNsYXNzPSJ4X2dtYWlsX3F1b3RlIj4yMDEzLzgv MjggQmFydGhhIETvv71uaWVsIDxzcGFuIGRpcj0ibHRyIj4mbHQ7PGEgaHJlZj0ibWFpbHRvOmRh bmllbC5iYXJ0aGFAZ21haWwuY29tIiB0YXJnZXQ9Il9ibGFuayI+ZGFuaWVsLmJhcnRoYUBnbWFp bC5jb208L2E+Jmd0Ozwvc3Bhbj48YnI+DQo8YmxvY2txdW90ZSBjbGFzcz0ieF9nbWFpbF9xdW90 ZSIgc3R5bGU9Im1hcmdpbjowIDAgMCAuOGV4OyBib3JkZXItbGVmdDoxcHggI2NjYyBzb2xpZDsg cGFkZGluZy1sZWZ0OjFleCI+DQo8ZGl2IGRpcj0ibHRyIj4NCjxkaXY+DQo8ZGl2Pg0KPGRpdj4N 
CjxkaXY+DQo8ZGl2PkhpIE1hbnVlbCAoYW5kIG90aGVyIGMmIzQzOyYjNDM7IGZlbGxvd3MpLDxi cj4NCjxicj4NCjwvZGl2Pg0KaSB0cnkgaXQsIGFuZCB0ZWxsIHlvdSwgaWYgaXQncyBiZXR0ZXIu IDxicj4NCjxicj4NCkJ1dCB0aGVyZSBpcyBhbiBvdGhlciBwcm9ibGVtIG5vdywgYW5kIHRoZXJl IHdhcyBhIGRpc2N1c3Npb24gYWJvdXQgaW4gZmVicnVhciBhbHJlYWR5Lig8YSBocmVmPSJodHRw czovL2xpc3RzLmZ1LWJlcmxpbi5kZS9waXBlcm1haWwvc2VxYW4tZGV2LzIwMTMtRmVicnVhcnkv bXNnMDAwMDIuaHRtIiB0YXJnZXQ9Il9ibGFuayI+aHR0cHM6Ly9saXN0cy5mdS1iZXJsaW4uZGUv cGlwZXJtYWlsL3NlcWFuLWRldi8yMDEzLUZlYnJ1YXJ5L21zZzAwMDAyLmh0bTwvYT48YnI+DQo8 L2Rpdj4NCkkgZG9udCBrbm93IGlmIGl0IGlzIHNvbHZlZCBvciBub3QsIGJ1dCBpIHN0aWxsL2Fn YWluIGdldCBleGFjdCB0aGUgc2FtZSBlcnJvciBtZXNzYWdlOjxicj4NCjxicj4NCi91c3IvaW5j bHVkZS9zZXFhbi9iYW1faW8vY2lnYXIuaHx8SW4gZnVuY3Rpb24g77+9Ym9vbCBzZXFhbjo6b3Bl cmF0b3ImbHQ7KGNvbnN0IHNlcWFuOjpDaWdhckVsZW1lbnQmbHQ7VE9wZXJhdGlvbiwgVENvdW50 Jmd0OyZhbXA7LCBjb25zdCBzZXFhbjo6Q2lnYXJFbGVtZW50Jmx0O1RPcGVyYXRpb24sIFRDb3Vu dCZndDsmYW1wOynvv706fDxicj4NCi91c3IvaW5jbHVkZS9zZXFhbi9iYW1faW8vY2lnYXIuaHwx MjB8ZXJyb3I6IHBhcnNlIGVycm9yIGluIHRlbXBsYXRlIGFyZ3VtZW50IGxpc3R8PGJyPg0KfHw9 PT0gQnVpbGQgZmluaXNoZWQ6IDEgZXJyb3JzLCAwIHdhcm5pbmdzICgwIG1pbnV0ZXMsIDIgc2Vj b25kcykgPT09fDxicj4NCjxicj4NCjwvZGl2Pg0KVGhpcyBpcyBjYXVzZWQgYnkgdGhlIGluY2x1 ZGluZyBvZiAjaW5jbHVkZSAmbHQ7c2VxYW4vc2VxX2lvLmgmZ3Q7LCBhbmQgdGhlIHByb2dyYW0g aXMgY29tcGxldGx5IGVtcHR5IChyZXR1cm4gMDsuLi4pLiBJIHVzZSB1YnVudHUgbGludXggYW1k NjQsIGFuZCBnJiM0MzsmIzQzOyA0LjcuMy4NCjxicj4NCjxicj4NCkkgYnlwYXNzIHRoZSB1c2Fn ZSBvZiB0aGlzIGhlYWRlciBub3csIGJ1dCBpdCBkb2Vzbid0IHNlZW1zIHRvIGJlIHVuaXFlLjxi cj4NCjxicj4NCjwvZGl2Pg0KVGhhbmsgeW91IHZlcnkgbXVjaCBhZ2FpbiwgYW5kIGhhdmUgYSBn b29kIGRheSE8YnI+DQo8YnI+DQo8L2Rpdj4NCkRhbmllbDxicj4NCjxkaXY+PGJyPg0KPC9kaXY+ DQo8L2Rpdj4NCjxkaXYgY2xhc3M9InhfZ21haWxfZXh0cmEiPg0KPGRpdiBjbGFzcz0ieF9pbSI+ PGJyIGNsZWFyPSJhbGwiPg0KPGRpdj4NCjxkaXYgZGlyPSJsdHIiPjxmb250IGNvbG9yPSIjNjY2 NjY2Ij5MaXZlIGxvbmcgYW5kIHByb3NwZXI8YnI+DQo8L2ZvbnQ+DQo8ZGl2Pjxmb250IGNvbG9y 
PSIjNjY2NjY2Ij5CYXJ0aGEgRO+/vW5pZWw8YnI+DQpNVEEtVk1SSSwgMjAxMzwvZm9udD48L2Rp dj4NCjwvZGl2Pg0KPC9kaXY+DQo8YnI+DQo8YnI+DQo8L2Rpdj4NCjxkaXYgY2xhc3M9InhfZ21h aWxfcXVvdGUiPjIwMTMvOC8yOCBIb2x0Z3Jld2UsIE1hbnVlbCA8c3BhbiBkaXI9Imx0ciI+Jmx0 OzxhIGhyZWY9Im1haWx0bzptYW51ZWwuaG9sdGdyZXdlQGZ1LWJlcmxpbi5kZSIgdGFyZ2V0PSJf YmxhbmsiPm1hbnVlbC5ob2x0Z3Jld2VAZnUtYmVybGluLmRlPC9hPiZndDs8L3NwYW4+PGJyPg0K PGJsb2NrcXVvdGUgY2xhc3M9InhfZ21haWxfcXVvdGUiIHN0eWxlPSJtYXJnaW46MCAwIDAgLjhl eDsgYm9yZGVyLWxlZnQ6MXB4ICNjY2Mgc29saWQ7IHBhZGRpbmctbGVmdDoxZXgiPg0KPGRpdj4N CjxkaXYgY2xhc3M9InhfaDUiPg0KPGRpdj4NCjxkaXYgc3R5bGU9ImRpcmVjdGlvbjpsdHI7IGZv bnQtc2l6ZToxMHB0OyBmb250LWZhbWlseTpUYWhvbWEiPkhpIERhbmllbCwNCjxkaXY+PGJyPg0K PC9kaXY+DQo8ZGl2Pml0IGRlcGVuZHMgb24geW91ciBhcHBsaWNhdGlvbiBhbmQgd2hhdCB5b3Ug ZG8gd2l0aCB5b3VyIHN0cmluZ3MuIFVzaW5nIHRoZSBTZXFBbiBsaWJyYXJ5IGNhbiB5aWVsZCBt b3JlIGVsZWdhbnQgYW5kIGZhc3RlciBjb2RlIHRoYW4gdXNpbmcgc3RkOjpzdHJpbmcgb3Igc2Vs Zi13cml0dGVuIHN0cmluZyBjbGFzc2VzIGJ1dCBpdCBkZXBlbmRzIG9uIHRoZSBhY3R1YWwgdXNl IGNhc2UuPC9kaXY+DQo8ZGl2Pjxicj4NCjwvZGl2Pg0KPGRpdj5Gb3IgU2VxdWVuY2VzLCB0aGVy ZSBhcmUgdHdvIGFzcGVjdHM6PC9kaXY+DQo8ZGl2Pjxicj4NCjwvZGl2Pg0KPGRpdj4oMSkgVXNp bmcgU2VxQW4ncyBEbmE1LCBEbmEgZm9yIGNoYXJhY3RlcnMgc3RvcmVzIHRoZSBhbHBoYWJldCBh cyBudW1iZXJzIDAuLjMvNCBpbnRlcm5hbGx5LiBUaGlzIG1ha2VzIGl0IGVhc2llciBmb3IgaW5k aWNlcyBhbmQgbWFwcGluZ3Mgc2luY2UgdGhleSBjYW4gd29yayBkaXJlY3RseSBhbmQgZWZmaWNp ZW50bHkgb24gdGhlIG9yZGluYWwgdmFsdWUgKG9yZFZhbHVlKS48L2Rpdj4NCjxkaXY+PGJyPg0K PC9kaXY+DQo8ZGl2PkZvciBleGFtcGxlLCBpZiB5b3UgYXJlIGNvdW50aW5nIHRoZSBudWNsZW90 aWRlIGNvbnRlbnQgYWxvbmcgc3RyaW5ncywgeW91IGNhbiBzaW1wbHkgaGF2ZSBhIDQtZWxlbWVu dCBjb250YWluZXIgKFN0cmluZyBpbiB0aGlzIGNhc2UpIGZvciBlYWNoIHBvc2l0aW9uIGluIHlv dXIgcmVhZHMgKHRodXMgYSBTdHJpbmcgb2YgU3RyaW5ncykuIFRodXMsIHlvdSBkbyBub3QgbmVl ZCBhIHBvc3NpYmxlIG1hcHBpbmcgZm9yICdBJyA9Jmd0OyAwLCAnQycNCiA9Jmd0OyAxLCAnRycg PSZndDsgMiwgJ1QnID0mZ3Q7IDMsICdOJyA9Jmd0OyA0IHNpbmNlIHRoZSBtYXBwaW5nIGlzIGRv 
bmUgYmVmb3JlaGFuZC48L2Rpdj4NCjxkaXY+PGJyPg0KPC9kaXY+DQo8ZGl2Pjxmb250IGZhY2U9 IkNvdXJpZXIgTmV3Ij5TdHJpbmcmbHQ7U3RyaW5nJmx0O3Vuc2lnbmVkJmd0OyAmZ3Q7IGNvdW50 ZXJzOzwvZm9udD48L2Rpdj4NCjxkaXY+PGZvbnQgZmFjZT0iQ291cmllciBOZXciPmZvciAodW5z aWduZWQgaSA9IDA7IGkgJmx0OyBsZW5ndGgocmVhZHMpOyAmIzQzOyYjNDM7aSk8L2ZvbnQ+PC9k aXY+DQo8ZGl2Pjxmb250IGZhY2U9IkNvdXJpZXIgTmV3Ij57PC9mb250PjwvZGl2Pg0KPGRpdj48 Zm9udCBmYWNlPSJDb3VyaWVyIE5ldyI+Jm5ic3A7ICZuYnNwOyAvLyBJbmNyZWFzZSBudW1iZXIg b2YgY291bnRlcnMgaWYgcmVhZHNbaV0gaXMgbG9uZ2VyIHRoYW4gdGhlIHByZXZpb3VzIHJlYWRz LjwvZm9udD48L2Rpdj4NCjxkaXY+PGZvbnQgZmFjZT0iQ291cmllciBOZXciPiZuYnNwOyAmbmJz cDsgaWYgKGxlbmd0aChjb3VudGVycykgJmx0OyBsZW5ndGgocmVhZHNbaV0pKTwvZm9udD48L2Rp dj4NCjxkaXY+PGZvbnQgZmFjZT0iQ291cmllciBOZXciPiZuYnNwOyAmbmJzcDsgezwvZm9udD48 L2Rpdj4NCjxkaXY+PGZvbnQgZmFjZT0iQ291cmllciBOZXciPiZuYnNwOyAmbmJzcDsgJm5ic3A7 ICZuYnNwOyB1bnNpZ25lZCBvbGRTaXplID0gbGVuZ3RoKGNvdW50ZXJzKTs8L2ZvbnQ+PC9kaXY+ DQo8ZGl2Pjxmb250IGZhY2U9IkNvdXJpZXIgTmV3Ij4mbmJzcDsgJm5ic3A7ICZuYnNwOyAmbmJz cDsgcmVzaXplKGNvdW50ZXJzLCBsZW5ndGgocmVhZHNbaV0pKTs8L2ZvbnQ+PC9kaXY+DQo8ZGl2 Pjxmb250IGZhY2U9IkNvdXJpZXIgTmV3Ij4mbmJzcDsgJm5ic3A7ICZuYnNwOyAmbmJzcDsgZm9y ICh1bnNpZ25lZCBqID0gb2xkU2l6ZTsgaiAmbHQ7IGxlbmd0aChjb3VudGVycyk7ICYjNDM7JiM0 MztqKTwvZm9udD48L2Rpdj4NCjxkaXY+PGZvbnQgZmFjZT0iQ291cmllciBOZXciPiZuYnNwOyAm bmJzcDsgJm5ic3A7ICZuYnNwOyAmbmJzcDsgJm5ic3A7IHJlc2l6ZShjb3VudGVyc1tqXSwgNSwg MCk7PC9mb250PjwvZGl2Pg0KPGRpdj48Zm9udCBmYWNlPSJDb3VyaWVyIE5ldyI+Jm5ic3A7ICZu YnNwOyB9PC9mb250PjwvZGl2Pg0KPGRpdj48Zm9udCBmYWNlPSJDb3VyaWVyIE5ldyI+PGJyPg0K PC9mb250PjwvZGl2Pg0KPGRpdj48Zm9udCBmYWNlPSJDb3VyaWVyIE5ldyI+Jm5ic3A7ICZuYnNw OyAvLyBDb3VudCBudWNsZW90aWRlcyBmb3IgZWFjaCBwb3NpdGlvbiBpbiByZWFkc1tpXTs8L2Zv bnQ+PC9kaXY+DQo8ZGl2Pjxmb250IGZhY2U9IkNvdXJpZXIgTmV3Ij4mbmJzcDsgJm5ic3A7IGZv ciAodW5zaWduZWQgaiA9IDA7IGogJmx0OyBsZW5ndGgocmVhZHNbaV0pOyAmIzQzOyYjNDM7aik8 L2ZvbnQ+PC9kaXY+DQo8ZGl2Pjxmb250IGZhY2U9IkNvdXJpZXIgTmV3Ij4mbmJzcDsgJm5ic3A7 
ICZuYnNwOyAmbmJzcDsgY291bnRlcnNbb3JkVmFsdWUocmVhZHNbaV1bal0pXSAmIzQzOz0gMTs8 L2ZvbnQ+PC9kaXY+DQo8ZGl2PjxzcGFuIHN0eWxlPSJmb250LWZhbWlseTonQ291cmllciBOZXcn OyBmb250LXNpemU6MTBwdCI+fTwvc3Bhbj48L2Rpdj4NCjxkaXY+PGJyPg0KPC9kaXY+DQo8ZGl2 PigyKSBTZXFBbidzIFN0cmluZyBjbGFzcyBhbGxvd3MgYWRkaXRpb25hbGx5IGdpdmluZyBhbiBh bHRlcm5hdGl2ZSBpbXBsZW1lbnRhdGlvbi4gVGhlIGRlZmF1bHQgaW1wbGVtZW50YXRpb24gc2lt cGx5IHVzZXMgYW4gYXJyYXkgYW5kIHdvdWxkIHN0b3JlIGEgRG5hIGNoYXJhY3RlciBpbiBhIEJ5 dGUuIEJ5IHVzaW5nIHRoZSBQYWNrZWQgU3RyaW5nLCB5b3UgY2FuIGJ5dGUtY29tcHJlc3MgZm91 ciA0LWNoYXJhY3RlciBETkEgY2hhcmFjdGVycw0KIGludG8gb25lIEJ5dGUgKGVhY2ggb25seSBu ZWVkcyAyIGJpdHMpLiBUaGlzIGNvbWVzIGF0IHRoZSBjb3N0IG9mIHNvbWUgY29tcHV0YXRpb24g YnV0IGluIHRoaXMgY2FzZSBsZWFkcyB0byBhIDR4IG1lbW9yeSBjb25zdW1wdGlvbiBkaXJlY3Rp b24uPC9kaXY+DQo8ZGl2Pjxicj4NCjwvZGl2Pg0KPGRpdj5XZSBhcyBsaWJyYXJ5IHdyaXRlcnMg Y2FuIG5vdyBjb21iaW5lIHRoZXNlIHR3byBhc3BlY3RzIG9mIHNlcXVlbmNlcyBhbmQgYWxwaGFi ZXRzIHdpdGggZ2VuZXJpYyBwcm9ncmFtbWluZyBhbmQgd3JpdGUgYWxnb3JpdGhtcyB0aGF0IGFs bG93IHRoZSB1c2VyIHRvIGNoYW5nZSB0aGUgYWxwaGFiZXQgdHlwZSBhbmQgdGhlIHN0cmluZyBp bXBsZW1lbnRhdGlvbiBkZXBlbmRpbmcgb24gdGhlIHVzZXIncyByZXF1aXJlbWVudHMgYW5kIGdl dA0KIHRoZSBiZXN0IHBvc3NpYmxlIGltcGxlbWVudGF0aW9uIGZvciB0aGlzIGNhc2UuIEJlY2F1 c2UgdGVtcGxhdGUgc3BlY2lhbGl6YXRpb24gYWxsb3dzIHVzIHRvIGRlY2lkZSBmb3IgdGhlIHRo ZSBjb3JyZWN0IGltcGxlbWVudGF0aW9uIG9mIG9yZFZhbHVlKCksIGxlbmd0aCgpIGV0Yy4gYXQg KmNvbXBpbGUgdGltZSosIHdlIGRvIG5vdCBuZWVkIHZpcnR1YWwgZnVuY3Rpb25zIGFuZCB0aHVz IG5vIGNvc3QgZm9yIHJ1bnRpbWUgcG9seW1vcnBoaXNtLjwvZGl2Pg0KPGRpdj48YnI+DQo8L2Rp dj4NCjxkaXY+SWYgeW91IHdhbnQgdG8gdXNlIHRoZSBhbGdvcml0aG1zIGluIHRoZSBTZXFBbiBs aWJyYXJ5IHRoZW4geW91IGNvdWxkIGJlbmVmaXQgZnJvbSB1c2luZyBTZXFBbiBzZXF1ZW5jZXMu IEhvd2V2ZXIsIG1hbnkgYWxnb3JpdGhtcyBhbHNvIHdvcmsgd2l0aCBzdGQ6OnN0cmluZyBhbmQg d2l0aG91dCBrbm93aW5nIHlvdXIgYXBwbGljYXRpb24gYW5kIGNvZGUgaXQgaXMgaGFyZCB0byBt YWtlIGFueSBwcm9taXNlIG9uIGFjY2VsZWFydGlvbi48L2Rpdj4NCjxkaXY+PGJyPg0KPC9kaXY+ 
DQo8ZGl2PkNoZWVycyw8L2Rpdj4NCjxkaXY+TWFudWVsPC9kaXY+DQo8ZGl2Pjxicj4NCjxkaXYg c3R5bGU9ImZvbnQtc2l6ZToxNnB4OyBmb250LWZhbWlseTpUaW1lcyBOZXcgUm9tYW4iPg0KPGhy Pg0KPGRpdiBzdHlsZT0iZGlyZWN0aW9uOmx0ciI+PGZvbnQgZmFjZT0iVGFob21hIj48Yj5Gcm9t OjwvYj4gQmFydGhhIETvv71uaWVsIFs8YSBocmVmPSJtYWlsdG86ZGFuaWVsLmJhcnRoYUBnbWFp bC5jb20iIHRhcmdldD0iX2JsYW5rIj5kYW5pZWwuYmFydGhhQGdtYWlsLmNvbTwvYT5dPGJyPg0K PGI+U2VudDo8L2I+IFdlZG5lc2RheSwgQXVndXN0IDI4LCAyMDEzIDExOjQ5IEFNPGJyPg0KPGI+ VG86PC9iPiBTZXFBbiBEZXZlbG9wbWVudDxicj4NCjxiPlN1YmplY3Q6PC9iPiBbU2VxYW4tZGV2 XSBxdWVzdGlvbiBhYm91dCB0aGUgZWZmaWNpZW5jeSBvZiB0aGUgc2VxdWFuIHNlcXVlbmNlIGNs YXNzZXM8YnI+DQo8L2ZvbnQ+PGJyPg0KPC9kaXY+DQo8ZGl2Pg0KPGRpdj48L2Rpdj4NCjxkaXY+ DQo8ZGl2IGRpcj0ibHRyIj4NCjxkaXY+DQo8ZGl2Pg0KPGRpdj4NCjxkaXY+SGkgQWxsLDxicj4N Cjxicj4NCjwvZGl2Pg0KaSBoYXZlIGEgYmlnIHF1ZXN0b24gdGhlcmUuIEkgd3JvdGUgYW4gYXBw bGljYXRpb24sIHRoYXQgY3VycmVudGx5IHVzZXMgbXkgb3duIGN1c3RvbSBzdGQ6OnN0cmluZyBi YXNlZCBpbXBsZW1lbnRhdGlvbiBmb3Igc29tZSBkbmEgbXV0YXRpb24gc3R1ZmYuIEkgYmFzaWNh bGx5IGhhdmUgdG8gYWNjZXNzIGV2ZXJ5IHNpbXBsZSBjaGFyYWN0ZXIgaW4gdGhlIGRuYSwgYW5k IHRoZW4gZG8gc29tZXRoaW5nIHdpdGggdGhlbSwgYnV0IHRoYXQgaXMgbm90DQogaW1wb3J0YW50 IGZvciB0aGUgcXVlc3Rpb24uIDxicj4NCjxicj4NCjwvZGl2Pg0KSSB0ZW5kIHRvIHJld3JpdGUg dGhlIHdob2xlIGFwcCB3aXRoIHNlcWFuLCBidXQgaXQgb25seSBoYXMgc2Vuc2UsIGlmIHRoZSBt YW5pcHVsYXRpb24gYW5kIGFjY2Vzc2luZyBvZiB0aGUgc2VxYW4gY2xhc3NlcyBzaWduaWZpY2Fu dCBmYXN0ZXIgaXMsIHRoYW4gbXkgb3duLiBJIHJlYWQgYWJvdXQgdGhlIGVmZmVjdGl2ZW5lc3Mg aW4gdGhlIE1vdGl2YXRpb24gY2hhcHRlciwgYnV0IGRvZXMgYW55Ym9keSBoYXZlIGFueSBleHBl cmllbmNlIGFib3V0DQogdGhlIGNvbmNyZXRlIHlpZWxkIG9mIHBvc3NpYmxlIGFjY2VsZXJhdGlv bj88YnI+DQo8YnI+DQo8L2Rpdj4NClRoYW5rcyE8YnI+DQo8YnI+DQo8L2Rpdj4NClJlZ2FyZHM6 IERhbmllbDxicj4NCjxiciBjbGVhcj0iYWxsIj4NCjxkaXY+DQo8ZGl2Pg0KPGRpdj4NCjxkaXY+ DQo8ZGl2Pg0KPGRpdj4NCjxkaXYgZGlyPSJsdHIiPjxmb250IGNvbG9yPSIjNjY2NjY2Ij5MaXZl IGxvbmcgYW5kIHByb3NwZXI8YnI+DQo8L2ZvbnQ+DQo8ZGl2Pjxmb250IGNvbG9yPSIjNjY2NjY2 
Ij5CYXJ0aGEgRO+/vW5pZWw8YnI+DQpNVEEtVk1SSSwgMjAxMzwvZm9udD48L2Rpdj4NCjwvZGl2 Pg0KPC9kaXY+DQo8L2Rpdj4NCjwvZGl2Pg0KPC9kaXY+DQo8L2Rpdj4NCjwvZGl2Pg0KPC9kaXY+ DQo8L2Rpdj4NCjwvZGl2Pg0KPC9kaXY+DQo8L2Rpdj4NCjwvZGl2Pg0KPC9kaXY+DQo8YnI+DQo8 L2Rpdj4NCjwvZGl2Pg0KX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f X19fX188YnI+DQpzZXFhbi1kZXYgbWFpbGluZyBsaXN0PGJyPg0KPGEgaHJlZj0ibWFpbHRvOnNl cWFuLWRldkBsaXN0cy5mdS1iZXJsaW4uZGUiIHRhcmdldD0iX2JsYW5rIj5zZXFhbi1kZXZAbGlz dHMuZnUtYmVybGluLmRlPC9hPjxicj4NCjxhIGhyZWY9Imh0dHBzOi8vbGlzdHMuZnUtYmVybGlu LmRlL2xpc3RpbmZvL3NlcWFuLWRldiIgdGFyZ2V0PSJfYmxhbmsiPmh0dHBzOi8vbGlzdHMuZnUt YmVybGluLmRlL2xpc3RpbmZvL3NlcWFuLWRldjwvYT48YnI+DQo8YnI+DQo8L2Jsb2NrcXVvdGU+ DQo8L2Rpdj4NCjxicj4NCjwvZGl2Pg0KPC9ibG9ja3F1b3RlPg0KPC9kaXY+DQo8YnI+DQo8L2Rp dj4NCl9fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fPGJyPg0K c2VxYW4tZGV2IG1haWxpbmcgbGlzdDxicj4NCjxhIGhyZWY9Im1haWx0bzpzZXFhbi1kZXZAbGlz dHMuZnUtYmVybGluLmRlIj5zZXFhbi1kZXZAbGlzdHMuZnUtYmVybGluLmRlPC9hPjxicj4NCmh0 dHBzOi8vbGlzdHMuZnUtYmVybGluLmRlL2xpc3RpbmZvL3NlcWFuLWRldjxicj4NCjwvYmxvY2tx dW90ZT4NCjwvZGl2Pg0KPGJyPg0KPGRpdj4NCjxkaXYgc3R5bGU9ImNvbG9yOnJnYigwLDAsMCk7 IGZvbnQtZmFtaWx5OkhlbHZldGljYTsgZm9udC1zaXplOm1lZGl1bTsgZm9udC1zdHlsZTpub3Jt YWw7IGZvbnQtdmFyaWFudDpub3JtYWw7IGZvbnQtd2VpZ2h0Om5vcm1hbDsgbGV0dGVyLXNwYWNp bmc6bm9ybWFsOyBsaW5lLWhlaWdodDpub3JtYWw7IG9ycGhhbnM6MjsgdGV4dC1pbmRlbnQ6MHB4 OyB0ZXh0LXRyYW5zZm9ybTpub25lOyB3aGl0ZS1zcGFjZTpub3JtYWw7IHdpZG93czoyOyB3b3Jk LXNwYWNpbmc6MHB4OyB3b3JkLXdyYXA6YnJlYWstd29yZCI+DQo8ZGl2IHN0eWxlPSJjb2xvcjpy Z2IoMCwwLDApOyBmb250LXZhcmlhbnQ6bm9ybWFsOyBsZXR0ZXItc3BhY2luZzpub3JtYWw7IGxp bmUtaGVpZ2h0Om5vcm1hbDsgb3JwaGFuczoyOyB0ZXh0LWluZGVudDowcHg7IHRleHQtdHJhbnNm b3JtOm5vbmU7IHdoaXRlLXNwYWNlOm5vcm1hbDsgd2lkb3dzOjI7IHdvcmQtc3BhY2luZzowcHg7 IHdvcmQtd3JhcDpicmVhay13b3JkIj4NCjxkaXYgc3R5bGU9ImNvbG9yOnJnYigwLDAsMCk7IGZv bnQtdmFyaWFudDpub3JtYWw7IGxldHRlci1zcGFjaW5nOm5vcm1hbDsgbGluZS1oZWlnaHQ6bm9y 
bWFsOyBvcnBoYW5zOjI7IHRleHQtaW5kZW50OjBweDsgdGV4dC10cmFuc2Zvcm06bm9uZTsgd2hp dGUtc3BhY2U6bm9ybWFsOyB3aWRvd3M6Mjsgd29yZC1zcGFjaW5nOjBweDsgd29yZC13cmFwOmJy ZWFrLXdvcmQiPg0KPGRpdj48Zm9udCBmYWNlPSJDb3VyaWVyIE5ldyI+LS0tPC9mb250PjwvZGl2 Pg0KPGRpdj48Zm9udCBmYWNlPSJDb3VyaWVyIE5ldyI+PGJyPg0KPC9mb250PjwvZGl2Pg0KPGRp dj48Zm9udCBmYWNlPSJDb3VyaWVyIE5ldyI+UmVu77+9IFJhaG48L2ZvbnQ+PC9kaXY+DQo8ZGl2 Pjxmb250IGZhY2U9IkNvdXJpZXIgTmV3Ij5QaC5ELiBTdHVkZW50PC9mb250PjwvZGl2Pg0KPGRp dj48Zm9udCBmYWNlPSJDb3VyaWVyIE5ldyI+LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS08L2ZvbnQ+PC9kaXY+DQo8ZGl2Pg0KPGRpdj48Zm9udCBmYWNlPSJDb3VyaWVyIE5ldyI+VGVs OiAmbmJzcDsoJiM0Mzs0OSkgMzAgODM4IDc1Mjc3PC9mb250PjwvZGl2Pg0KPGRpdj48Zm9udCBm YWNlPSJDb3VyaWVyIE5ldyI+TWFpbDogPGEgaHJlZj0ibWFpbHRvOnJlbmUucmFobkBmdS1iZXJs aW4uZGUiPnJlbmUucmFobkBmdS1iZXJsaW4uZGU8L2E+PC9mb250PjwvZGl2Pg0KPGRpdj48Zm9u dCBmYWNlPSJDb3VyaWVyIE5ldyI+LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS08L2Zv bnQ+PC9kaXY+DQo8L2Rpdj4NCjxkaXY+PGZvbnQgZmFjZT0iQ291cmllciBOZXciPkluc3RpdHV0 ZSBvZiBDb21wdXRlciBTY2llbmNlPC9mb250PjwvZGl2Pg0KPGRpdj48Zm9udCBmYWNlPSJDb3Vy aWVyIE5ldyI+QWxnb3JpdGhtaWMgQmlvaW5mb3JtYXRpY3MgKEFCSSk8L2ZvbnQ+PC9kaXY+DQo8 ZGl2Pjxmb250IGZhY2U9IkNvdXJpZXIgTmV3Ij4tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLTwvZm9udD48L2Rpdj4NCjxkaXY+PGZvbnQgZmFjZT0iQ291cmllciBOZXciPkZyZWllIFVu aXZlcnNpdO+/vXQgQmVybGluPC9mb250PjwvZGl2Pg0KPGRpdj48Zm9udCBmYWNlPSJDb3VyaWVy IE5ldyI+VGFrdXN0cmHvv71lIDk8L2ZvbnQ+PC9kaXY+DQo8ZGl2Pjxmb250IGZhY2U9IkNvdXJp ZXIgTmV3Ij4xNDE5NSBCZXJsaW48L2ZvbnQ+PC9kaXY+DQo8ZGl2Pjxmb250IGZhY2U9IkNvdXJp ZXIgTmV3Ij4tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLTwvZm9udD48L2Rpdj4NCjwv ZGl2Pg0KPC9kaXY+DQo8L2Rpdj4NCjwvZGl2Pg0KPGJyPg0KPC9kaXY+DQo8L2Rpdj4NCjwvYm9k eT4NCjwvaHRtbD4NCg== --_000_9CAA1752576D407FB96E9F3EAEC916C0campusfuberlinde_-- --_004_9CAA1752576D407FB96E9F3EAEC916C0campusfuberlinde_ Content-Type: application/octet-stream; name="string_bench.cpp" Content-Description: string_bench.cpp Content-Disposition: attachment; filename="string_bench.cpp"; 
size=6944; creation-date="Fri, 13 Sep 2013 10:34:45 GMT"; modification-date="Fri, 13 Sep 2013 10:34:45 GMT" Content-ID: <6CC7F03E765FB944B12F84F9DCF5C3CA@campus.fu-berlin.de> Content-Transfer-Encoding: base64 Ly8gPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT0NCi8vICAgICAgICAgICAgICAgICBTZXFBbiAtIFRoZSBMaWJy YXJ5IGZvciBTZXF1ZW5jZSBBbmFseXNpcw0KLy8gPT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0NCi8vIENvcHly aWdodCAoYykgMjAwNi0yMDEzLCBLbnV0IFJlaW5lcnQsIEZVIEJlcmxpbg0KLy8gQWxsIHJpZ2h0 cyByZXNlcnZlZC4NCi8vDQovLyBSZWRpc3RyaWJ1dGlvbiBhbmQgdXNlIGluIHNvdXJjZSBhbmQg YmluYXJ5IGZvcm1zLCB3aXRoIG9yIHdpdGhvdXQNCi8vIG1vZGlmaWNhdGlvbiwgYXJlIHBlcm1p dHRlZCBwcm92aWRlZCB0aGF0IHRoZSBmb2xsb3dpbmcgY29uZGl0aW9ucyBhcmUgbWV0Og0KLy8N Ci8vICAgICAqIFJlZGlzdHJpYnV0aW9ucyBvZiBzb3VyY2UgY29kZSBtdXN0IHJldGFpbiB0aGUg YWJvdmUgY29weXJpZ2h0DQovLyAgICAgICBub3RpY2UsIHRoaXMgbGlzdCBvZiBjb25kaXRpb25z IGFuZCB0aGUgZm9sbG93aW5nIGRpc2NsYWltZXIuDQovLyAgICAgKiBSZWRpc3RyaWJ1dGlvbnMg aW4gYmluYXJ5IGZvcm0gbXVzdCByZXByb2R1Y2UgdGhlIGFib3ZlIGNvcHlyaWdodA0KLy8gICAg ICAgbm90aWNlLCB0aGlzIGxpc3Qgb2YgY29uZGl0aW9ucyBhbmQgdGhlIGZvbGxvd2luZyBkaXNj bGFpbWVyIGluIHRoZQ0KLy8gICAgICAgZG9jdW1lbnRhdGlvbiBhbmQvb3Igb3RoZXIgbWF0ZXJp YWxzIHByb3ZpZGVkIHdpdGggdGhlIGRpc3RyaWJ1dGlvbi4NCi8vICAgICAqIE5laXRoZXIgdGhl IG5hbWUgb2YgS251dCBSZWluZXJ0IG9yIHRoZSBGVSBCZXJsaW4gbm9yIHRoZSBuYW1lcyBvZg0K Ly8gICAgICAgaXRzIGNvbnRyaWJ1dG9ycyBtYXkgYmUgdXNlZCB0byBlbmRvcnNlIG9yIHByb21v dGUgcHJvZHVjdHMgZGVyaXZlZA0KLy8gICAgICAgZnJvbSB0aGlzIHNvZnR3YXJlIHdpdGhvdXQg c3BlY2lmaWMgcHJpb3Igd3JpdHRlbiBwZXJtaXNzaW9uLg0KLy8NCi8vIFRISVMgU09GVFdBUkUg SVMgUFJPVklERUQgQlkgVEhFIENPUFlSSUdIVCBIT0xERVJTIEFORCBDT05UUklCVVRPUlMgIkFT IElTIg0KLy8gQU5EIEFOWSBFWFBSRVNTIE9SIElNUExJRUQgV0FSUkFOVElFUywgSU5DTFVESU5H LCBCVVQgTk9UIExJTUlURUQgVE8sIFRIRQ0KLy8gSU1QTElFRCBXQVJSQU5USUVTIE9GIE1FUkNI QU5UQUJJTElUWSBBTkQgRklUTkVTUyBGT1IgQSBQQVJUSUNVTEFSIFBVUlBPU0UNCi8vIEFSRSBE 
SVNDTEFJTUVELiBJTiBOTyBFVkVOVCBTSEFMTCBLTlVUIFJFSU5FUlQgT1IgVEhFIEZVIEJFUkxJ TiBCRSBMSUFCTEUNCi8vIEZPUiBBTlkgRElSRUNULCBJTkRJUkVDVCwgSU5DSURFTlRBTCwgU1BF Q0lBTCwgRVhFTVBMQVJZLCBPUiBDT05TRVFVRU5USUFMDQovLyBEQU1BR0VTIChJTkNMVURJTkcs IEJVVCBOT1QgTElNSVRFRCBUTywgUFJPQ1VSRU1FTlQgT0YgU1VCU1RJVFVURSBHT09EUyBPUg0K Ly8gU0VSVklDRVM7IExPU1MgT0YgVVNFLCBEQVRBLCBPUiBQUk9GSVRTOyBPUiBCVVNJTkVTUyBJ TlRFUlJVUFRJT04pIEhPV0VWRVINCi8vIENBVVNFRCBBTkQgT04gQU5ZIFRIRU9SWSBPRiBMSUFC SUxJVFksIFdIRVRIRVIgSU4gQ09OVFJBQ1QsIFNUUklDVA0KLy8gTElBQklMSVRZLCBPUiBUT1JU IChJTkNMVURJTkcgTkVHTElHRU5DRSBPUiBPVEhFUldJU0UpIEFSSVNJTkcgSU4gQU5ZIFdBWQ0K Ly8gT1VUIE9GIFRIRSBVU0UgT0YgVEhJUyBTT0ZUV0FSRSwgRVZFTiBJRiBBRFZJU0VEIE9GIFRI RSBQT1NTSUJJTElUWSBPRiBTVUNIDQovLyBEQU1BR0UuDQovLw0KLy8gPT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT0NCi8vIEF1dGhvcjogUmVuZSBSYWhuIDxyZW5lLnJhaG5AZnUtYmVybGluLmRlPg0KLy8gPT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT0NCg0KI2luY2x1ZGUgPHN0cmluZz4NCiNpbmNsdWRlIDx2ZWN0b3I+DQoN CiNpbmNsdWRlIDxzZXFhbi9iYXNpYy5oPg0KI2luY2x1ZGUgPHNlcWFuL3NlcXVlbmNlLmg+DQoj aW5jbHVkZSA8c2VxYW4vc2VxX2lvLmg+DQojaW5jbHVkZSA8c2VxYW4vcmFuZG9tLmg+DQoNCiNp bmNsdWRlIDxzZXFhbi9hcmdfcGFyc2UuaD4NCg0KDQp1c2luZyBuYW1lc3BhY2Ugc2VxYW47DQoN Ci8vID09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09DQovLyBDbGFzc2VzDQovLyA9PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0K DQoNCi8vID09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09DQovLyBGdW5jdGlvbnMNCi8vID09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09DQoNCg0KRG5hNSBzZWxlY3RfZXZlbnQoRG5hNSBiYXNlKQ0Kew0KICAgIC8qKnRoaXMgZnVu Y3Rpb24gZG9lcyBvbmx5IGdpdmVzIGJhY2sgYSBEbmE1IGNoYXIsIGlmIHRoZSByYW5kb20gbnVt 
YmVyIGkgZ2l2ZSBpcyBpbiBzb21lIG9mIHRoZSBwcmUtc3RvcmVkIGludGVydmFscywgc28gbm90 aGluZyBzcGVjaWFsKiovDQogICAgc3dpdGNoKG9yZFZhbHVlKGJhc2UpKQ0KICAgIHsNCiAgICAg ICAgY2FzZSAnQSc6IHJldHVybiAnVCc7DQogICAgICAgIGNhc2UgJ0MnOiByZXR1cm4gJ0cnOw0K ICAgICAgICBjYXNlICdHJzogcmV0dXJuICdDJzsNCiAgICAgICAgY2FzZSAnVCc6IHJldHVybiAn QSc7DQogICAgICAgIGRlZmF1bHQ6IHJldHVybiAnTic7DQogICAgfQ0KfQ0KDQp0ZW1wbGF0ZSA8 dHlwZW5hbWUgVFN0cmluZz4NClRTdHJpbmcgcmVwbGljYXRlMihUU3RyaW5nIHNlcSwgZG91Ymxl ICYgaW5uZXJUaW1lKQ0Kew0KICAgIHN0ZDo6dmVjdG9yPHVuc2lnbmVkPiBpbmRleDsNCiAgICBp bmRleC5yZXNpemUobGVuZ3RoKHNlcSkpOw0KICAgIGZvciAodW5zaWduZWQgaSA9IDA7IGkgPCBp bmRleC5zaXplKCk7ICsraSkNCiAgICAgICAgaW5kZXhbaV0gPSBpOw0KDQogICAgZG91YmxlIHN0 YXJ0ID0gc3lzVGltZSgpOw0KICAgIGZvciAoYXV0byBpIDogaW5kZXgpDQovLyAgICBmb3IgKHVu c2lnbmVkIGkgPSAwOyBpIDwgaW5kZXguc2l6ZSgpOyArK2kpDQogICAgew0KICAgICAgIHNlcVtp XSA9IHNlbGVjdF9ldmVudChzZXFbaV0pOw0KICAgICAgIC8qKnNvIHByYWN0aWNhbGx5IG9uZSBE bmE1ID0gdGhlIG90aGVyIERuYTUgdmFyaWFibGUsIHdpdGggYXNzaWduKCkgaXMgaXQgZXZlbiBh IGxpdHRsZSBzbG93ZXIqKi8NCiAgICB9DQogICAgaW5uZXJUaW1lICs9IHN5c1RpbWUoKSAtIHN0 YXJ0Ow0KICAgIHJldHVybiBzZXE7DQp9DQoNCnRlbXBsYXRlIDx0eXBlbmFtZSBUU3RyaW5nPg0K aW5saW5lIHZvaWQgcmVwbGljYXRlMyhUU3RyaW5nICYgdGFyZ2V0LCBUU3RyaW5nIGNvbnN0ICYg c291cmNlLCBkb3VibGUgJiBpbm5lclRpbWUpDQp7DQogICAgc3RkOjp2ZWN0b3I8dW5zaWduZWQ+ IGluZGV4Ow0KICAgIGluZGV4LnJlc2l6ZShsZW5ndGgoc291cmNlKSk7DQogICAgZm9yICh1bnNp Z25lZCBpID0gMDsgaSA8IGluZGV4LnNpemUoKTsgKytpKQ0KICAgICAgICBpbmRleFtpXSA9IGk7 DQoNCiAgICByZXNpemUodGFyZ2V0LCBsZW5ndGgoc291cmNlKSwgRXhhY3QoKSk7DQoNCiAgICBk b3VibGUgc3RhcnQgPSBzeXNUaW1lKCk7DQogICAgZm9yIChhdXRvIGkgOiBpbmRleCkNCi8vICAg IGZvciAodW5zaWduZWQgaSA9IDA7IGkgPCBpbmRleC5zaXplKCk7ICsraSkNCiAgICB7DQogICAg ICAgdGFyZ2V0W2ldID0gc2VsZWN0X2V2ZW50KHNvdXJjZVtpXSk7DQogICAgfQ0KICAgIGlubmVy VGltZSArPSBzeXNUaW1lKCkgLSBzdGFydDsNCn0NCg0KLy8gLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCi8v 
IEZ1bmN0aW9uIG1haW4oKQ0KLy8gLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCg0KLy8gUHJvZ3JhbSBlbnRy eSBwb2ludC4NCg0KaW50IG1haW4oaW50IGFyZ2MsIGNoYXIqIGFyZ3ZbXSkNCnsNCiAgICBpZiAo YXJnYyA8PSAxKQ0KICAgIHsNCiAgICAgICAgc3RkOjpjZXJyIDw8ICJzdHJpbmdfYmVuY2ggRklM RS5mYSIgPDwgc3RkOjplbmRsOw0KICAgICAgICByZXR1cm4gMTsNCiAgICB9DQogICAgc3RkOjpp ZnN0cmVhbSBmaWxlU3RyZWFtKGFyZ3ZbMV0sIHN0ZDo6aW9zX2Jhc2U6OmluKTsNCiAgICBpZiAo IWZpbGVTdHJlYW0uZ29vZCgpKQ0KICAgIHsNCiAgICAgICAgc3RkOjpjZXJyIDw8ICJFcnJvciIg PDwgc3RkOjplbmRsOw0KICAgICAgICByZXR1cm4gMTsNCiAgICB9DQoNCiAgICBSZWNvcmRSZWFk ZXI8c3RkOjppZnN0cmVhbSwgU2luZ2xlUGFzczw+ID4gcmVjUmVhZGVyKGZpbGVTdHJlYW0pOw0K ICAgIENoYXJTdHJpbmcgaWQ7DQogICAgU3RyaW5nPERuYTU+IGJ1ZmZlcjsNCg0KDQogICAgaWYg KHJlYWRSZWNvcmQoaWQsIGJ1ZmZlciwgcmVjUmVhZGVyLCBGYXN0YSgpKSAhPSAwKQ0KICAgIHsN CiAgICAgICAgc3RkOjpjZXJyIDw8ICJFcnJvciIgPDwgc3RkOjplbmRsOw0KICAgICAgICByZXR1 cm4gMTsNCiAgICB9DQoNCiAgICBzdGQ6OmNvdXQgPDwgIlNlcWFuIFN0cmluZyIgPDwgc3RkOjpl bmRsOw0KICAgIGRvdWJsZSB0aW1lU3RhcnQgPSBzeXNUaW1lKCk7DQogICAgZG91YmxlIGlubmVy VGltZSA9IDAuMDsNCiAgICBmb3IgKHVuc2lnbmVkIGkgPSAwOyBpIDwgNTA7ICsraSkNCiAgICAg ICAgYnVmZmVyID0gcmVwbGljYXRlMihidWZmZXIsIGlubmVyVGltZSk7DQogICAgc3RkOjpjb3V0 IDw8ICJUaW1lOiAiIDw8IHN5c1RpbWUoKSAtIHRpbWVTdGFydCA8PCAiIHMuIElubmVyIExvb3A6 ICIgPDwgaW5uZXJUaW1lIDw8ICIgcy4iIDw8IHN0ZDo6ZW5kbDsNCg0KICAgIHN0ZDo6Y291dCA8 PCAiU1RMIFZlY3RvciIgPDwgc3RkOjplbmRsOw0KICAgIHN0ZDo6dmVjdG9yPERuYTU+IGJ1ZmZl clZlYzsNCiAgICBidWZmZXJWZWMucmVzaXplKGxlbmd0aChidWZmZXIpKTsNCiAgICBhcnJheUNv cHlGb3J3YXJkKGJlZ2luKGJ1ZmZlciksIGVuZChidWZmZXIpLCBidWZmZXJWZWMuYmVnaW4oKSk7 DQoNCiAgICB0aW1lU3RhcnQgPSBzeXNUaW1lKCk7DQogICAgZG91YmxlIGlubmVyVGltZTEgPSAw LjA7DQogICAgZm9yICh1bnNpZ25lZCBpID0gMDsgaSA8IDUwOyArK2kpDQogICAgICAgIGJ1ZmZl clZlYyA9IHJlcGxpY2F0ZTIoYnVmZmVyVmVjLCBpbm5lclRpbWUxKTsNCiAgICBzdGQ6OmNvdXQg PDwgIlRpbWU6ICIgPDwgc3lzVGltZSgpIC0gdGltZVN0YXJ0IDw8ICIgcy4gSW5uZXIgTG9vcDog 
IiA8PCBpbm5lclRpbWUxIDw8ICIgcy4iIDw8IHN0ZDo6ZW5kbDsNCg0KICAgIHN0ZDo6Y291dCA8 PCAiU1RMIEJhc2ljIFN0cmluZyBEbmE1IiA8PCBzdGQ6OmVuZGw7DQogICAgc3RkOjpiYXNpY19z dHJpbmc8RG5hNT4gYnVmZmVyU3RyaW5nOw0KICAgIGJ1ZmZlclN0cmluZy5yZXNpemUobGVuZ3Ro KGJ1ZmZlcikpOw0KICAgIGFycmF5Q29weUZvcndhcmQoYmVnaW4oYnVmZmVyKSwgZW5kKGJ1ZmZl ciksIGJ1ZmZlclN0cmluZy5iZWdpbigpKTsNCg0KICAgIHRpbWVTdGFydCA9IHN5c1RpbWUoKTsN CiAgICBkb3VibGUgaW5uZXJUaW1lMiA9IDAuMDsNCiAgICBmb3IgKHVuc2lnbmVkIGkgPSAwOyBp IDwgNTA7ICsraSkNCiAgICAgICAgYnVmZmVyU3RyaW5nID0gcmVwbGljYXRlMihidWZmZXJTdHJp bmcsIGlubmVyVGltZTIpOw0KICAgIHN0ZDo6Y291dCA8PCAiVGltZTogIiA8PCBzeXNUaW1lKCkg LSB0aW1lU3RhcnQgPDwgIiBzLiBJbm5lciBMb29wOiAiIDw8IGlubmVyVGltZTIgPDwgIiBzLiIg PDwgc3RkOjplbmRsOw0KDQogICAgc3RkOjpjb3V0IDw8ICJTVEwgQmFzaWMgU3RyaW5nIENoYXIi IDw8IHN0ZDo6ZW5kbDsNCiAgICBzdGQ6OnN0cmluZyBjaGFyU3RyaW5nOw0KICAgIGNoYXJTdHJp bmcucmVzaXplKGxlbmd0aChidWZmZXIpKTsNCiAgICBhcnJheUNvcHlGb3J3YXJkKGJlZ2luKGJ1 ZmZlciksIGVuZChidWZmZXIpLCBjaGFyU3RyaW5nLmJlZ2luKCkpOw0KDQogICAgdGltZVN0YXJ0 ID0gc3lzVGltZSgpOw0KICAgIGRvdWJsZSBpbm5lclRpbWUzID0gMC4wOw0KICAgIGZvciAodW5z aWduZWQgaSA9IDA7IGkgPCA1MDsgKytpKQ0KICAgICAgICBjaGFyU3RyaW5nID0gcmVwbGljYXRl MihjaGFyU3RyaW5nLCBpbm5lclRpbWUzKTsNCiAgICBzdGQ6OmNvdXQgPDwgIlRpbWU6ICIgPDwg c3lzVGltZSgpIC0gdGltZVN0YXJ0IDw8ICIgcy4gSW5uZXIgTG9vcDogIiA8PCBpbm5lclRpbWUz IDw8ICIgcy4iIDw8IHN0ZDo6ZW5kbDsNCg0KICAgIHN0ZDo6Y291dCA8PCAiVGhlIFdheSBXZSBX b3VsZCBEbyBJdCIgPDwgc3RkOjplbmRsOw0KICAgIFN0cmluZzxEbmE1PiBidWZmZXIyOw0KICAg IHRpbWVTdGFydCA9IHN5c1RpbWUoKTsNCiAgICBkb3VibGUgaW5uZXJUaW1lNCA9IDAuMDsNCiAg ICBmb3IgKHVuc2lnbmVkIGkgPSAwOyBpIDwgNTA7ICsraSkNCiAgICAgICAgcmVwbGljYXRlMyhi dWZmZXIyLCBidWZmZXIsIGlubmVyVGltZTQpOw0KICAgIHN0ZDo6Y291dCA8PCAiVGltZTogIiA8 PCBzeXNUaW1lKCkgLSB0aW1lU3RhcnQgPDwgIiBzLiBJbm5lciBMb29wOiAiIDw8IGlubmVyVGlt ZTQgPDwgIiBzLiIgPDwgc3RkOjplbmRsOw0KDQogICAgcmV0dXJuIDA7DQp9DQo= --_004_9CAA1752576D407FB96E9F3EAEC916C0campusfuberlinde_-- From Jochen.Singer@fu-berlin.de Fri Sep 13 12:44:08 2013 Received: from outpost1.zedat.fu-berlin.de 
([130.133.4.66]) by list1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VKQr8-000cOL-SP>; Fri, 13 Sep 2013 12:44:06 +0200 Received: from relay2.zedat.fu-berlin.de ([130.133.4.80]) by outpost1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VKQr8-003YkG-QN>; Fri, 13 Sep 2013 12:44:06 +0200 Received: from cas2.campus.fu-berlin.de ([130.133.170.202]) by relay2.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VKQr8-0038Cb-CX>; Fri, 13 Sep 2013 12:44:06 +0200 Received: from EX01B.campus.fu-berlin.de ([130.133.170.131]) by CAS2.campus.fu-berlin.de ([130.133.170.202]) with mapi id 14.03.0123.003; Fri, 13 Sep 2013 12:44:05 +0200 From: "Singer, Jochen" To: SeqAn Development Thread-Topic: [Seqan-dev] Disk-based index Thread-Index: AQHOo8w+2ooEBAe2V0WzgQUmrScMqZmqPc8AgBkqVQCAAAsQAA== Message-ID: <99117094-0EB5-4C51-958E-2BA66F6CE07F@campus.fu-berlin.de> References: <521DBAC6.1@mail.cryst.bbk.ac.uk> <2497807B-B9C0-4907-BE51-87448BC9493D@fu-berlin.de> <5232E32C.40605@mail.cryst.bbk.ac.uk> In-Reply-To: <5232E32C.40605@mail.cryst.bbk.ac.uk> Accept-Language: en-US, de-DE Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: Content-Type: multipart/alternative; boundary="_000_991170940EB54C51958E2BA66F6CE07Fcampusfuberlinde_" MIME-Version: 1.0 Date: Fri, 13 Sep 2013 12:44:04 +0200 X-Original-Date: Fri, 13 Sep 2013 10:44:04 +0000 X-Originating-IP: 130.133.170.202 X-ZEDAT-Hint: A X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1379069046-0000097E-73008EBB/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-50.0 required=5.0 tests=ALL_TRUSTED,HTML_MESSAGE X-Spam-Checker-Version: SpamAssassin 3.3.3-zedat0a54d5a on Benin.ZEDAT.FU-Berlin.DE X-Spam-Level: Subject: Re: [Seqan-dev] Disk-based index X-BeenThere: seqan-dev@lists.fu-berlin.de 
X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 13 Sep 2013 10:44:08 -0000 --_000_991170940EB54C51958E2BA66F6CE07Fcampusfuberlinde_ Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable

Hi John,

we use the indices in some application targeted at mapping reads to human reference genomes. Therefore all indices should work on large data sets. Could you provide some more information on the problem you are running into?

Concerning the large number of files which are created, I assume you are using an index built over a StringSet. The save function stores each String in the StringSet into a separate file. However, if you specify the StringSet to be a ConcatDirect StringSet (StringSet<TString, Owner<ConcatDirect<> > >), then all strings are concatenated internally and only two files are stored (one with the sequence and one with the sequence length information).

At the moment there is no compression of the index files available; you would have to do it manually, but it's a thought we should keep in mind.

I hope that helps!

Kind regards,
Jochen

On 13.09.2013, at 12:04, John Reid wrote:

Hi Enrico,

On 28/08/13 10:46, Siragusa, Enrico wrote:

Hi John,

On Aug 28, 2013, at 10:54 AM, John Reid wrote:

Hi all,

I would like to index the mouse or human genome with an ESA. I need to do this more than once though and would like to store the ESA on disk as it takes some hours to construct. Is this feasible? Is there any way to do this in SeqAn already?

Sure. To save an index after constructing it, you can call save(index, "/path/to/index"). To load it, call open(index, "/path/to/index"). The path must be given as a C style string, so if you're using a SeqAn String, please use toCString() to convert it.

Do you have any experience using this functionality with genome sized indexes (3Gb or so)?
Would you expect it to work? I seem to be running into some issues I need to debug. I was just wondering if anyone else had used it in this way. Also the save function seems to create many files in the same directory. I imagine this could be a problem for some filesystems. Might you consider changing this? Also as mentioned before the ability to save in a compressed format would be very attractive to me as well.

Thanks for all the great work in SeqAn,
John.

_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev

Jochen Singer
Institute of Computer Science
Algorithmic Bioinformatics Working Group

Freie Universität Berlin
Takustr. 9, 14195 Berlin
Phone +49 30 838 75228, Room K25

--_000_991170940EB54C51958E2BA66F6CE07Fcampusfuberlinde_ Content-Type: text/html; charset="iso-8859-1" Content-ID: <91E7DB2FEF8BF74A8FCA9876ABB6F5D1@campus.fu-berlin.de> Content-Transfer-Encoding: quoted-printable

Hi John,

we use the indices in some application targeted at mapping reads to human reference genomes. Therefore all indices should work on large data sets. Could you provide some more information on the problem you are running into?

Concerning the large number of files which are created, I assume you are using an index built over a StringSet. The save function stores each String in the StringSet into a separate file. However, if you specify the StringSet to be a ConcatDirect StringSet (StringSet<TString, Owner<ConcatDirect<> > >), then all strings are concatenated internally and only two files are stored (one with the sequence and one with the sequence length information).
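To illustrate why a concatenated layout needs only two files, here is a minimal standard-library sketch of the idea. It is not SeqAn's actual implementation: `ConcatSet`, `saveSet`, and the `.concat`/`.limits` file suffixes are hypothetical names made up for this example, and the on-disk format SeqAn uses may differ.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a concatenated string set (NOT SeqAn's class).
struct ConcatSet
{
    std::string concat;                  // all sequences stored back to back
    std::vector<std::size_t> limits{0};  // limits[i] is the start of sequence i,
                                         // the last entry is the total length

    void append(std::string const & seq)
    {
        concat += seq;
        limits.push_back(concat.size());
    }

    std::size_t size() const { return limits.size() - 1; }

    // Recover sequence i from the concatenation using its two limits.
    std::string at(std::size_t i) const
    {
        return concat.substr(limits[i], limits[i + 1] - limits[i]);
    }
};

// Writes exactly two files, however many sequences the set holds:
// one with the concatenated sequences and one with the limits.
bool saveSet(ConcatSet const & set, std::string const & prefix)
{
    std::ofstream seqOut(prefix + ".concat", std::ios::binary);
    std::ofstream limOut(prefix + ".limits", std::ios::binary);
    if (!seqOut || !limOut)
        return false;

    seqOut.write(set.concat.data(),
                 static_cast<std::streamsize>(set.concat.size()));
    limOut.write(reinterpret_cast<char const *>(set.limits.data()),
                 static_cast<std::streamsize>(set.limits.size() * sizeof(std::size_t)));
    return seqOut.good() && limOut.good();
}
```

After appending any number of sequences, a call such as `saveSet(set, "genome")` would write only `genome.concat` and `genome.limits`, whereas a one-file-per-string layout grows with the number of sequences.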

At the moment there is no compression of the index files available; you would have to do it manually, but it's a thought we should keep in mind.

I hope that helps!

Kind regards,
Jochen

On 13.09.2013, at 12:04, John Reid wrote:

Hi Enrico,

On 28/08/13 10:46, Siragusa, Enrico wrote:
Hi John,

Do you have any experience using this functionality with genome sized indexes (3Gb or so)? Would you expect it to work? I seem to be running into some issues I need to debug. I was just wondering if anyone else had used it in this way. Also the save function seems to create many files in the same directory. I imagine this could be a problem for some filesystems. Might you consider changing this? Also as mentioned before the ability to save in a compressed format would be very attractive to me as well.

Thanks for all the great work in SeqAn,
John.


_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev

Jochen Singer
Institute of Computer Science
Algorithmic Bioinformatics Working Group

Freie Universität Berlin
Takustr. 9, 14195 Berlin
Phone +49 30 838 75228, Room K25



--_000_991170940EB54C51958E2BA66F6CE07Fcampusfuberlinde_-- From jer15@hermes.cam.ac.uk Fri Sep 13 12:56:56 2013 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VKR3W-000d0n-E5>; Fri, 13 Sep 2013 12:56:54 +0200 Received: from ppsw-32.csi.cam.ac.uk ([131.111.8.132]) by relay1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VKR3W-001YAD-Be>; Fri, 13 Sep 2013 12:56:54 +0200 X-Cam-AntiVirus: no malware found X-Cam-ScannerInfo: http://www.cam.ac.uk/cs/email/scanner/ Received: from cpc6-dals15-2-0-cust115.hari.cable.virginmedia.com ([82.35.196.116]:58322 helo=[192.168.1.4]) by ppsw-32.csi.cam.ac.uk (smtp.hermes.cam.ac.uk [131.111.8.156]:587) with esmtpsa (PLAIN:jer15) (TLSv1:DHE-RSA-CAMELLIA256-SHA:256) id 1VKR3V-0001Si-0e (Exim 4.80_167-5a66dd3) for seqan-dev@lists.fu-berlin.de (return-path ); Fri, 13 Sep 2013 11:56:53 +0100 Message-ID: <5232EF74.6030608@mail.cryst.bbk.ac.uk> Date: Fri, 13 Sep 2013 11:56:52 +0100 From: John Reid User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130803 Thunderbird/17.0.8 MIME-Version: 1.0 To: SeqAn Development References: <521DBAC6.1@mail.cryst.bbk.ac.uk> <2497807B-B9C0-4907-BE51-87448BC9493D@fu-berlin.de> <5232E32C.40605@mail.cryst.bbk.ac.uk> <99117094-0EB5-4C51-958E-2BA66F6CE07F@campus.fu-berlin.de> In-Reply-To: <99117094-0EB5-4C51-958E-2BA66F6CE07F@campus.fu-berlin.de> Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: "J.E. 
Reid" X-Originating-IP: 131.111.8.132 X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1379069814-0000097E-B9CD60AE/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.064531, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-1.2 required=5.0 tests=HTML_MESSAGE, MIME_HTML_ONLY, RCVD_IN_DNSWL_MED X-Spam-Checker-Version: SpamAssassin 3.3.3-zedat0a54d5a on Burundi.ZEDAT.FU-Berlin.DE X-Spam-Level: Subject: Re: [Seqan-dev] Disk-based index X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 13 Sep 2013 10:56:56 -0000 Hi Jochen,

Thanks for the quick response.

On 13/09/13 11:44, Singer, Jochen wrote:
Hi John,

We use the indices in some applications targeted at mapping reads to human reference genomes. Therefore, all indices should work on large data sets. Could you provide some more information on the problem you are running into?
I'll try and dig a bit deeper into it.


Concerning the large number of files that are created, I assume you are using an index built over a StringSet. The save function stores each String in the StringSet in a separate file. However, if you specify the StringSet to be a ConcatDirect StringSet (StringSet<TString, Owner<ConcatDirect<> > >), then all strings are concatenated internally and only two files are stored (one with the sequences and one with the sequence length information).
Thanks for the tip regarding the stringset. I'm not sure I can change my code so easily to use that but I'll have a look.
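To illustrate the idea behind a concatenated string set, here is a minimal sketch in plain C++ that does not use SeqAn. The type `ConcatStringSet` and its members are hypothetical names chosen for this illustration and are not SeqAn's implementation; the sketch only shows why two buffers (one with the concatenated sequences, one with the length information) are enough to represent any number of strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration only; this is NOT SeqAn's ConcatDirect StringSet.
// All strings live in one buffer; limits[i] and limits[i + 1] delimit string i.
struct ConcatStringSet {
    std::string concat;                  // all sequences stored back to back
    std::vector<std::size_t> limits{0};  // running total of the sequence lengths

    // Append one sequence to the end of the shared buffer.
    void append(const std::string& s) {
        concat += s;
        limits.push_back(concat.size());
    }

    // Number of sequences stored.
    std::size_t size() const { return limits.size() - 1; }

    // Return a copy of the i-th sequence.
    std::string get(std::size_t i) const {
        assert(i < size());
        return concat.substr(limits[i], limits[i + 1] - limits[i]);
    }
};
```

With such a layout, writing `concat` and `limits` to disk would give two files regardless of how many sequences the set holds.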


At the moment there is no compression of the index files available; you would have to do it manually, but it's a thought we should keep in mind.
I might well do it myself then.


I hope that helps!
Thanks!
John.
From Sabrina.Krakau@fu-berlin.de Fri Sep 13 14:21:55 2013 Received: from outpost1.zedat.fu-berlin.de ([130.133.4.66]) by list1.zedat.fu-berlin.de (Exim 4.80.1) with esmtp (envelope-from ) id <1VKSNj-000hi1-HQ>; Fri, 13 Sep 2013 14:21:51 +0200 Received: from relay2.zedat.fu-berlin.de ([130.133.4.80]) by outpost1.zedat.fu-berlin.de (Exim 4.80.1) with esmtp (envelope-from ) id <1VKSNj-0046NB-FM>; Fri, 13 Sep 2013 14:21:51 +0200 Received: from cas1.campus.fu-berlin.de ([130.133.170.201]) by relay2.zedat.fu-berlin.de (Exim 4.80.1) with esmtp (envelope-from ) id <1VKSNj-003LO2-17>; Fri, 13 Sep 2013 14:21:51 +0200 Received: from EX03A.campus.fu-berlin.de ([130.133.170.134]) by CAS1.campus.fu-berlin.de ([130.133.170.201]) with mapi id 14.03.0158.001; Fri, 13 Sep 2013 14:21:48 +0200 From: "Krakau, Sabrina" To: AG ABI ABI , "seqan-interests@lists.fu-berlin.de" , SeqAn Development Thread-Topic: SeqAn - BioStore Workshop 2013, Berlin, September 17th - 19th Thread-Index: Ac6we9BtwyaJV6UHRXK9YJVumcEw4Q== Message-ID: Accept-Language: en-US, de-DE Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: Content-Type: multipart/alternative; boundary="_000_CBC5629F5E78A84A853AD8A3D5AF81BF51D5535Cex03acampusfube_" MIME-Version: 1.0 Date: Fri, 13 Sep 2013 14:21:46 +0200 X-Original-Date: Fri, 13 Sep 2013 12:21:46 +0000 X-Originating-IP: 130.133.170.201 X-ZEDAT-Hint: A X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1379074911-0000097E-357D822C/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.005011, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-49.3 required=5.0 tests=ALL_TRUSTED, HTML_IMAGE_ONLY_28,HTML_MESSAGE X-Spam-Checker-Version: SpamAssassin 3.3.3-zedat0a54d5a on Botsuana.ZEDAT.FU-Berlin.DE X-Spam-Level: Subject: [Seqan-dev] SeqAn - BioStore Workshop 2013, Berlin, September 17th - 19th X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 13 Sep 2013 12:21:55 -0000 --_000_CBC5629F5E78A84A853AD8A3D5AF81BF51D5535Cex03acampusfube_ Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
Dear SeqAn Users and Developers,

We look forward to your participation at the SeqAn - BioStore Workshop next week.
You can find the up-to-date schedule and further information on our website:
http://www.seqan-biostore.de/wp/seqan-workshops/2013-09-seqan-workshop/schedule/

The workshop will start on Tuesday at 9 a.m. and will end on Thursday. As a highlight of the workshop, we will have dinner on board the cruise ship "Josephine" on the Spree river on Wednesday evening.

Workshop Preparation
Please bring your own laptop for the SeqAn Tutorials. Your computer should have the following installed:
  • C++ compiler and/or IDEs like Xcode, Visual C++ or Eclipse
  • CMake (http://www.cmake.org/)
In preparation for the workshop, please go through the 'Getting Started' tutorial to install SeqAn and create a first "Hello World!" application:
http://trac.seqan.de/wiki/Tutorial/GettingStarted
Additionally, we offer a 'SeqAn Install Session' at 9:00 a.m. on the first day of the workshop in case of unforeseen difficulties.
For the KNIME Tutorial on the last day, you can already install the KNIME SDK (http://www.knime.org/node/81).

The workshop fee of 80 Euro for graduates and 20 Euro for undergraduates (Bachelor and Master students) can be paid at the beginning of the workshop.
If you have any questions, please do not hesitate to send a mail to sabrina.krakau@fu-berlin.de.
See you next week in Berlin,
The SeqAn team

--

Sabrina Krakau

[Logo]

Freie Universität Berlin
Institute of Computer Science
Algorithmic Bioinformatics - Project BioStore

Takustr. 9, 14195 Berlin
Telefon: +49 (0)30 838 75228

--_000_CBC5629F5E78A84A853AD8A3D5AF81BF51D5535Cex03acampusfube_-- From daniel.bartha@gmail.com Fri Sep 13 14:55:47 2013 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VKSuW-000jV3-IX>; Fri, 13 Sep 2013 14:55:44 +0200 Received: from mail-ve0-f174.google.com ([209.85.128.174]) by relay1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VKSuW-001sIk-5Y>; Fri, 13 Sep 2013 14:55:44 +0200 Received: by mail-ve0-f174.google.com with SMTP id jy13so911173veb.33 for ; Fri, 13 Sep 2013 05:55:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=PIfdOMRuxWtn+3eLfdkKe7x03r/++XbHknBGDaiu/g8=; b=b3JLxTfzYZn0F8gdCpFny/WHG/F0Xn9YyN2LEp9HnCfcpJFeSUWxc7r1525r0jZu1F GaovWrmvesgTjjIzerq/bN5a+DecsnwN6/T3SWZwMvCBjorLw91XsWxXJu+IKvkk6gWY d04b9zGYhjhHFYoiat90MgeAoFJUXHlPcCjDq2bXxh5QtHkXHD4+1JSrY2VeZ/uwI4FU o9piP/5fKi1Gxq99UR8qJtliqOBpoR0PNqxSBFqj3UCeR7DjN1GLkqjNcpdX3vRVB1gx uibpWPcvCj0nW5xdwsaZWv35S83agqbDr5Qs0zUcnzFYEjPBlxhG7tcbbevzrUbOVwIe a2wA== MIME-Version: 1.0 X-Received: by 10.52.227.6 with SMTP id rw6mr9994540vdc.19.1379076942179; Fri, 13 Sep 2013 05:55:42 -0700 (PDT) Received: by 10.58.254.166 with HTTP; Fri, 13 Sep 2013 05:55:42 -0700 (PDT) In-Reply-To: <9CAA1752-576D-407F-B96E-9F3EAEC916C0@campus.fu-berlin.de> References: <9CAA1752-576D-407F-B96E-9F3EAEC916C0@campus.fu-berlin.de> Date: Fri, 13 Sep 2013 14:55:42 +0200 Message-ID: From: =?UTF-8?Q?Bartha_D=C3=A1niel?= To: SeqAn Development Content-Type: multipart/alternative; boundary=089e0116166041c13104e6435f30 X-Originating-IP: 209.85.128.174 X-ZEDAT-Hint: A X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1379076944-0000097E-0060D9A8/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.287126, version=1.2.4 X-Spam-Flag: NO 
X-Spam-Status: No, score=-0.7 required=5.0 tests=FREEMAIL_FROM,HTML_MESSAGE, RCVD_IN_DNSWL_LOW,T_DKIM_INVALID X-Spam-Checker-Version: SpamAssassin 3.3.3-zedat0a54d5a on Benin.ZEDAT.FU-Berlin.DE X-Spam-Level: Subject: Re: [Seqan-dev] question about the efficiency of the sequan sequence classes X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 13 Sep 2013 12:55:47 -0000 --089e0116166041c13104e6435f30 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Hi René,

thanks a lot for your work! To be honest, I also expected the same ratios you got, so there is probably something I am not finding. I can check your code a bit later and tell you what I get; note, however, that the chr22 I have is only ~40 Mbp long.

You should know that the entire application is a bit more complex, but the two functions given are the main time-consuming parts. I have some experience with C++ and know roughly which operation uses how many resources (thanks to my "Master" at the Uni :) ), and if I am correct, all variables are passed with the fastest possible method, provided that method is thread-safe. The re-implementation of the replicate function is nice, but the randomness there is desired, and the changes are also not all the same, so that cannot be saved.

I replicate ~5 kbp sequences with a custom evolutionary model. How many? Well, as many as possible :D. That's why the optimization matters, and why I am trying SeqAn now. Because of my degenerate object-oriented mind I feel lost again with generic programming, but I try my best :) At the moment, without SeqAn, I can process 250 Mbp/sec with an 8-threaded i7 3770k; that must become 10-20 times faster for me to be satisfied.

Regards:

Daniel

Live long and prosper
Bartha Dániel
MTA-VMRI, 2013


2013/9/13 Rahn, René <rene.maerker@fu-berlin.de>
Hey Daniel,

I tried out your code examples below. I did make some surprising observations, but they are different from what you were reporting. I replaced some of your functionality: I adapted the select_event function to simply return the complement of a given base, and I removed the randomness factor for selecting the index and simply converted every index. I loaded the chr22 sequence of the human genome (~50 Mb) and measured the time of running 50 times a) the replicate function and b) the inner loop with the assignment. I did the experiments with seqan::String<Dna5>, std::vector<Dna5>, std::basic_string<Dna5> and std::string. I also implemented a replicate3 function, which performs best as it reduces the number of copies of whole Strings.
I iterated over the index with a C++11 range-based for loop and with the standard for loop.
Here are my results, built in release mode on a 2.3 GHz Core i7.

All times are the sum of 50 experiments.

C++11 style:

Variant                   Time         Inner Loop
Seqan String              11.18 s      2.58064 s
STL Vector                10.9798 s    2.53835 s
STL Basic String Dna5     10.6501 s    3.94554 s
STL Basic String Char     11.4799 s    4.85506 s
replicate3                8.67172 s    2.52474 s

C++98 style:

Variant                   Time         Inner Loop
Seqan String              11.0828 s    2.49667 s
STL Vector                10.9178 s    2.54614 s
STL Basic String Dna5     10.9048 s    4.20024 s
STL Basic String Char     12.3184 s    5.61231 s
replicate3                9.55719 s    3.30052 s

As you can see, the replicate3 function outperforms the other versions. However, its inner loop gets slower when using the standard for loop, and I am not quite sure I completely understand why, because I can't observe the same performance drop in the replicate2 function.
However, when comparing the results of the C++11 version, the assignment for seqan::String is on par with std::vector and faster than for the std::string versions.

Can you please give us some information about the dimension of your problem? How many sequences are you replicating? How long are the sequences?
Please consider the following performance booster: always prefer passing parameters by const reference over passing them by copy (unless you are sure they are just simple types). Copying a big container with many values is slower than copying a 4/8-byte reference :).
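The const-reference advice can be sketched with two otherwise identical functions. The names `countByCopy` and `countByConstRef` are hypothetical and not taken from the benchmark file mentioned in this thread; the only point is that the first signature copies the whole container on every call, while the second only passes a reference.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: the container is passed by value,
// so every call copies all elements before counting.
std::size_t countByCopy(std::vector<char> seq, char base) {
    std::size_t n = 0;
    for (char c : seq) {
        if (c == base) ++n;
    }
    return n;
}

// Hypothetical helper: the container is passed by const reference,
// so no elements are copied and the caller's data cannot be modified.
std::size_t countByConstRef(const std::vector<char>& seq, char base) {
    std::size_t n = 0;
    for (char c : seq) {
        if (c == base) ++n;
    }
    return n;
}
```

Both functions return the same result; the difference only shows up in run time and memory traffic, and it grows with the size of the container.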

I also appended the benchmark file, so maybe you can run the tests on your machine and report your experience.


Kind regards,

René

On 11.09.2013 at 15:43, Bartha Dániel <daniel.bartha@gmail.com> wrote:

Hi Manuel and people there,

I promised to report on the performance comparison between seqan::String<seqan::Dna5> and std::string. So here are the (for me) surprising results:

I replaced the strings and chars with the SeqAn types all over my source files. I access the characters in the SeqAn strings through the [] operator and corrected the functions where needed.

The program does its job, but it's 5 times slower than the simple std implementation! That's not exactly what I expected; I thought it would be a little slower or much faster, but not this extreme slowdown.

I suppose it happens because I don't use SeqAn the right way. Do you have an idea what the reason is? I paste the two responsible functions here; it would be great if someone could spend a couple of minutes.


Dna5 eventspace::select_event(Dna5 base, double p)
{
    /**this function does only gives back a Dna5 char, if the random number i give is in some of the pre-stored intervals, so nothing special**/
    for(event e : E[base])
    {

        if(e.a > p)
        {
            if(p >= e.b)
            {
                return e.to;
                //which is a seqan::Dna5 character
            }
        }
    }
}


seqan::String<seqan::Dna5> replicate2(framework& sys, seqan::String<seqan::Dna5> seq, default_random_engine engine)
{
    uniform_real_distribution<> ur_dist(0, sys.Getscale());
    //this and the default_random_engine are needed for real random number generation

    vector<double> probs(length(seq));
    vector<int> index;

    for(unsigned i=0; i<probs.size(); ++i)
    {
        probs[i]=ur_dist(engine);
        if(probs[i] > sys.lookup[seq[i]])index.push_back(i);
    }
    for(unsigned i : index)
    {
        seq[i]=sys.events.select_event(seq[i],probs[i]);
        /**so practically one Dna5 = the other Dna5 variable, with assign() is it even a little slower**/
    }
    return seq;
}
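As a possible variation on the function above, the following sketch mutates the sequence in place and takes the random engine by reference. It is a simplification under stated assumptions, not the replicate3 function from the benchmark: it uses std::string instead of a SeqAn string, `mutateBase` is a hypothetical stand-in for select_event that returns the complement, and `threshold` is a hypothetical single cut-off replacing the per-base lookup table. Note that passing a std::default_random_engine by value copies its state, so the caller's engine is not advanced by the call.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical stand-in for select_event: returns the complement of a base.
inline char mutateBase(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  return base;  // leave unknown characters untouched
    }
}

// Mutates seq in place and returns the number of changed positions.
// 'threshold' is a hypothetical cut-off: a position is mutated when its
// random draw from [0, 1) exceeds the threshold.
// The engine is taken by reference so that its state advances across calls.
inline std::size_t replicateInPlace(std::string& seq,
                                    std::default_random_engine& engine,
                                    double threshold) {
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::size_t changed = 0;
    for (char& c : seq) {
        if (dist(engine) > threshold) {
            c = mutateBase(c);
            ++changed;
        }
    }
    return changed;
}
```

This avoids the temporary probs and index vectors and the copy of the sequence on entry and on return; whether that matters for the ~5 kbp sequences discussed here would have to be measured.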


Do you have any idea, or is this slowdown maybe normal?

Thanks, regards:

Daniel

Live long and prosper
Bartha Dániel
MTA-VMRI, 2013


2013/8/28 Bartha Dániel <daniel.bartha@gmail.com>
Hi Manuel (and other C++ fellows),

I will try it and tell you if it's better.

But there is another problem now, and there was already a discussion about it in February (https://lists.fu-berlin.de/pipermail/seqan-dev/2013-February/msg00002.htm).
I don't know whether it is solved or not, but I still/again get exactly the same error message:

/usr/include/seqan/bam_io/cigar.h||In function ‘bool seqan::operator<(const seqan::CigarElement<TOperation, TCount>&, const seqan::CigarElement<TOperation, TCount>&)’:|
/usr/include/seqan/bam_io/cigar.h|120|error: parse error in template argument list|
||=== Build finished: 1 errors, 0 warnings (0 minutes, 2 seconds) ===|

This is caused by including #include <seqan/seq_io.h>, and the program is otherwise completely empty (return 0;...). I use Ubuntu Linux amd64 and g++ 4.7.3.

I bypass the usage of this header for now, but the problem doesn't seem to be unique.

Thank you very much again, and have a good day!

Daniel


Live long and prosper
Bartha Dániel
MTA-VMRI, 2013


2013/8/28 Holtgrewe, Manuel <manuel.holtgrewe@fu-berlin.de>
Hi Daniel,

it depends on your application and what you do with your strings. Using the SeqAn library can yield more elegant and faster code than using std::string or self-written string classes, but it depends on the actual use case.

For sequences, there are two aspects:

(1) Using SeqAn's Dna5 or Dna for characters stores the alphabet as numbers 0..3/4 internally. This makes it easier for indices and mappings, since they can work directly and efficiently on the ordinal value (ordValue).

For example, if you are counting the nucleotide content along strings, you can simply have a 5-element container (String in this case) for each position in your reads (thus a String of Strings). You do not need an explicit mapping 'A' => 0, 'C' => 1, 'G' => 2, 'T' => 3, 'N' => 4, since the mapping is done beforehand.

String<String<unsigned> > counters;
for (unsigned i = 0; i < length(reads); ++i)
{
    // Increase number of counters if reads[i] is longer than the previous reads.
    if (length(counters) < length(reads[i]))
    {
        unsigned oldSize = length(counters);
        resize(counters, length(reads[i]));
        for (unsigned j = oldSize; j < length(counters); ++j)
            resize(counters[j], 5, 0);
    }

    // Count nucleotides for each position in reads[i];
    for (unsigned j = 0; j < length(reads[i]); ++j)
        counters[j][ordValue(reads[i][j])] += 1;
}
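For comparison, the same per-position counting can be sketched with standard containers only. This is an illustration under assumptions, not SeqAn code: the function `ordinal` is a hypothetical replacement for ordValue that performs the character-to-number mapping explicitly, which is exactly the step a Dna5-like alphabet makes unnecessary.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical mapping to ordinal values 0..4 ('A', 'C', 'G', 'T', anything else).
inline std::size_t ordinal(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return 4;
    }
}

// One counter per ordinal value, for one read position.
using Counts = std::array<unsigned, 5>;

// Returns, for every read position, how often each ordinal value occurs there.
std::vector<Counts> countPerPosition(const std::vector<std::string>& reads) {
    std::vector<Counts> counters;
    for (const std::string& read : reads) {
        // Grow the table if this read is longer than all previous reads;
        // new positions start with all counters at zero.
        if (counters.size() < read.size())
            counters.resize(read.size(), Counts{});

        // Count the nucleotide at each position of this read.
        for (std::size_t j = 0; j < read.size(); ++j)
            counters[j][ordinal(read[j])] += 1;
    }
    return counters;
}
```

For the reads "AC", "AG" and "N", position 0 would hold two counts for 'A' and one for 'N', and position 1 one count each for 'C' and 'G'.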

(2) SeqAn's String class additionally allows choosing an alternative implementation. The default implementation simply uses an array and would store a Dna character in one byte. By using the Packed String, you can pack four characters of the 4-letter Dna alphabet into one byte (each needs only 2 bits). This comes at the cost of some computation but in this case leads to a 4x reduction in memory consumption.
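The 2-bit packing idea can be sketched as follows. The class `PackedDna` is a hypothetical illustration and not SeqAn's Packed String; it assumes the characters are already given as ordinal codes 0..3 and shows the extra shifting and masking that pays for the smaller memory footprint.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2-bit packed container; NOT SeqAn's Packed String.
// Stores ordinal codes 0..3, four of them per byte.
class PackedDna {
public:
    // Append one code (must be in 0..3).
    void push_back(unsigned code) {
        assert(code < 4);
        if (size_ % 4 == 0)
            bytes_.push_back(0);  // start a new byte every four codes
        bytes_[size_ / 4] |= static_cast<std::uint8_t>(code << (2 * (size_ % 4)));
        ++size_;
    }

    // Read the code at position i by shifting and masking.
    unsigned operator[](std::size_t i) const {
        assert(i < size_);
        return (bytes_[i / 4] >> (2 * (i % 4))) & 3u;
    }

    std::size_t size() const { return size_; }            // number of codes
    std::size_t bytesUsed() const { return bytes_.size(); }  // number of bytes

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t size_ = 0;
};
```

Six codes occupy two bytes here instead of six, at the price of a shift and a mask on every access.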

We as library writers can now combine these two aspects of sequences and alphabets with generic programming and write algorithms that allow the user to change the alphabet type and the string implementation depending on the user's requirements, and get the best possible implementation for each case. Because template specialization allows us to select the correct implementation of ordValue(), length() etc. at *compile time*, we do not need virtual functions and thus pay no cost for runtime polymorphism.
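A small sketch of compile-time selection, assuming nothing about SeqAn's actual internals: the tag types `CharAlphabet` and `OrdinalAlphabet` and the function `ordValueOf` are hypothetical names. The generic `histogram` function is written once, and the compiler picks the matching `ordValueOf` overload for each instantiation, so no virtual call is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical alphabet tags used only to select an overload at compile time.
struct CharAlphabet {};     // values are letters such as 'A', 'C', 'G', 'T'
struct OrdinalAlphabet {};  // values are already stored as numbers 0..4

// Letters need an explicit mapping to ordinal values.
inline std::size_t ordValueOf(char c, CharAlphabet) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return 4;
    }
}

// Ordinal storage needs no mapping at all.
inline std::size_t ordValueOf(unsigned char v, OrdinalAlphabet) {
    assert(v < 5);
    return v;
}

// Generic algorithm: works for any container and any alphabet tag for which
// an ordValueOf overload exists; the overload is resolved at compile time.
template <typename TContainer, typename TAlphabet>
std::vector<std::size_t> histogram(const TContainer& seq, TAlphabet tag) {
    std::vector<std::size_t> counts(5, 0);
    for (auto v : seq)
        counts[ordValueOf(v, tag)] += 1;
    return counts;
}
```

Calling `histogram(std::string("ACCA"), CharAlphabet{})` and `histogram(std::vector<unsigned char>{0, 3, 3}, OrdinalAlphabet{})` instantiates two separate functions from the same template.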

If you want to use the algorithms in the SeqAn library, then you could benefit from using SeqAn sequences. However, many algorithms also work with std::string, and without knowing your application and code it is hard to make any promise about acceleration.

Cheers,
Manuel


From: Bartha Dániel [daniel.bartha@gmail.com]
Sent: Wednesday, August 28, 2013 11:49 AM
To: SeqAn Development
Subject: [Seqan-dev] question about the efficiency of the sequan sequence classes

Hi All,

I have a big question. I wrote an application that currently uses my own custom std::string-based implementation for some DNA mutation stuff. I basically have to access every single character in the DNA and then do something with it, but that is not important for the question.

I am inclined to rewrite the whole app with SeqAn, but that only makes sense if manipulating and accessing the SeqAn classes is significantly faster than my own implementation. I read about the efficiency in the Motivation chapter, but does anybody have any experience with the concrete speedup that is possible?

Thanks!

Regards: Daniel

Live long and prosper
Bartha Dániel
MTA-VMRI, 2013

_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev




---

René Rahn
Ph.D. Student
--------------------------------
Institute of Computer Science
Algorithmic Bioinformatics (ABI)
--------------------------------
Freie Universität Berlin
Takustraße 9
14195 Berlin
--------------------------------




--089e0116166041c13104e6435f30-- From jer15@hermes.cam.ac.uk Fri Sep 13 15:15:59 2013 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VKTE3-000kap-Bv>; Fri, 13 Sep 2013 15:15:55 +0200 Received: from ppsw-52.csi.cam.ac.uk ([131.111.8.152]) by relay1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VKTE3-001wCf-6o>; Fri, 13 Sep 2013 15:15:55 +0200 X-Cam-AntiVirus: no malware found X-Cam-ScannerInfo: http://www.cam.ac.uk/cs/email/scanner/ Received: from cpc6-dals15-2-0-cust115.hari.cable.virginmedia.com ([82.35.196.116]:58998 helo=[192.168.1.4]) by ppsw-52.csi.cam.ac.uk (smtp.hermes.cam.ac.uk [131.111.8.158]:587) with esmtpsa (PLAIN:jer15) (TLSv1:DHE-RSA-CAMELLIA256-SHA:256) id 1VKTE1-0006pn-En (Exim 4.80_167-5a66dd3) for seqan-dev@lists.fu-berlin.de (return-path ); Fri, 13 Sep 2013 14:15:53 +0100 Message-ID: <52331009.9090308@mail.cryst.bbk.ac.uk> Date: Fri, 13 Sep 2013 14:15:53 +0100 From: John Reid User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130803 Thunderbird/17.0.8 MIME-Version: 1.0 To: SeqAn Development References: <9CAA1752-576D-407F-B96E-9F3EAEC916C0@campus.fu-berlin.de> In-Reply-To: <9CAA1752-576D-407F-B96E-9F3EAEC916C0@campus.fu-berlin.de> Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: "J.E. 
Reid" X-Originating-IP: 131.111.8.152 X-purgate: suspect X-purgate-type: suspect X-purgate-ID: 151147::1379078155-0000097E-7F635B74/3979172734-0/0-1 X-Bogosity: Ham, tests=bogofilter, spamicity=0.059908, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-0.2 required=5.0 tests=FU_XPURGATE_SUSP, HTML_MESSAGE, MIME_HTML_ONLY,RCVD_IN_DNSWL_MED X-Spam-Checker-Version: SpamAssassin 3.3.3-zedat0a54d5a on Benin.ZEDAT.FU-Berlin.DE X-Spam-Level: Subject: Re: [Seqan-dev] question about the efficiency of the sequan sequence classes X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 13 Sep 2013 13:15:59 -0000 Hi Rene,

That looks like an interesting test but I think it is worth pointing out that it is not always best to pass-by-value as you assume. If you're interested Dave Abrahams explains why in this article:

http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/

Regards, John.


On 13/09/13 11:34, Rahn, René wrote:
Hey Daniel, 

I tried out your code examples below. I did make some surprising observations, but they are different from what you were reporting. I replaced some of your functionality: I adapted the select_event function to simply return the complement of a given base, and I removed the randomness factor for selecting the indices and simply converted every index. I loaded the chr22 sequence of the human genome (~50 Mb) and measured the time of running 50 times a) the replicate function and b) the inner loop with the assignment. I did the experiments with seqan::String<Dna5>, std::vector<Dna5>, std::basic_string<Dna5> and std::string. I also implemented a replicate3 function, which performs best as it reduces the number of times whole Strings are copied.
I iterated over the index with a C++11 range-based for loop and with the standard for loop.
Here are my results, built in release mode on a 2.3 GHz Core i7.

All times are the sum of 50 experiments.

C++11 style:

Seqan String Time: 11.18 s.   Inner Loop: 2.58064 s.
STL Vector Time: 10.9798 s. Inner Loop: 2.53835 s.
STL Basic String Dna5 Time: 10.6501 s. Inner Loop: 3.94554 s.
STL Basic String Char Time: 11.4799 s. Inner Loop: 4.85506 s.
replicate3 Time: 8.67172 s. Inner Loop: 2.52474 s.

C++98 style

Seqan String Time: 11.0828 s. Inner Loop: 2.49667 s.
STL Vector Time: 10.9178 s. Inner Loop: 2.54614 s.
STL Basic String Dna5 Time: 10.9048 s. Inner Loop: 4.20024 s.
STL Basic String Char Time: 12.3184 s. Inner Loop: 5.61231 s.
replicate3 Time: 9.55719 s. Inner Loop: 3.30052 s.

As you can see, the replicate3 function outperforms the other versions. However, its inner loop gets slower when using the standard for loop, and I am not sure I completely understand why, because I cannot observe the same performance drop in the replicate2 function.
In the C++11 results, the assignment into the seqan::String is about as fast as into the std::vector and faster than into the std::string versions.
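
Timings of this kind can be collected with a small harness. The following is only a sketch and not the attached benchmark file: `complementBase`, `complementInPlace` and `timeRepeated` are hypothetical names, and a plain `std::string` stands in for the sequence types compared above.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the simplified select_event:
// returns the complement of a base, 'N' for anything else.
inline char complementBase(char base)
{
    switch (base)
    {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  return 'N';
    }
}

// The "inner loop": complements every position of seq in place.
inline void complementInPlace(std::string & seq)
{
    for (std::size_t i = 0; i < seq.size(); ++i)
        seq[i] = complementBase(seq[i]);
}

// Calls func(seq) `runs` times and returns the summed wall-clock
// time of all runs in seconds.
template <typename TFunc>
double timeRepeated(TFunc func, std::string & seq, unsigned runs)
{
    double total = 0.0;
    for (unsigned r = 0; r < runs; ++r)
    {
        std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
        func(seq);
        std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
        total += std::chrono::duration<double>(stop - start).count();
    }
    return total;
}
```

Calling `timeRepeated(complementInPlace, seq, 50)` on a loaded sequence would give a number comparable in spirit to the "Inner Loop" column. Build with optimization enabled, otherwise the differences between container types are dominated by unoptimized accessor calls.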

Can you please give us some information about the size of your problem? How many sequences are you replicating? How long are the sequences?
Please also consider the following performance hint: always prefer passing parameters by const reference over passing them by value (unless you are sure they are just simple types). Copying a big container with many values is slower than copying a 4/8-byte reference :).
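
The cost of by-value parameters can be made visible with a type that counts its own copies. This is a minimal sketch under stated assumptions: `CountedSeq`, `countBase`, `countBaseByValue` and `maskBase` are hypothetical names for illustration and are not part of SeqAn or of the benchmark.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sequence wrapper that counts how often it is copied.
struct CountedSeq
{
    std::string data;
    static unsigned copies;

    explicit CountedSeq(std::string const & d) : data(d) {}
    CountedSeq(CountedSeq const & other) : data(other.data) { ++copies; }
    CountedSeq & operator=(CountedSeq const & other)
    {
        data = other.data;
        ++copies;
        return *this;
    }
};

unsigned CountedSeq::copies = 0;

// Read-only access: a const reference does not copy the container.
inline std::size_t countBase(CountedSeq const & seq, char base)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < seq.data.size(); ++i)
        if (seq.data[i] == base)
            ++n;
    return n;
}

// Same computation, but the by-value parameter copies the whole
// sequence on every call with an lvalue argument.
inline std::size_t countBaseByValue(CountedSeq seq, char base)
{
    return countBase(seq, base);
}

// Modifying through a non-const reference changes the caller's
// sequence in place, without copying it in and out.
inline void maskBase(CountedSeq & seq, char base)
{
    for (std::size_t i = 0; i < seq.data.size(); ++i)
        if (seq.data[i] == base)
            seq.data[i] = 'N';
}
```

For a 50 Mb sequence, each extra copy means allocating and writing another 50 Mb, which can easily outweigh the work done inside the function. If a function needs its own modifiable copy anyway, taking the argument by value is a reasonable alternative to copying inside the function.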

I have also attached the benchmark file, so maybe you can run the tests on your machine and report your experience.


Kind regards, 

René

On 11.09.2013 at 15:43, Bartha Dániel <daniel.bartha@gmail.com> wrote:

Hi Manuel and People there,

I promised to report on the performance comparison between seqan::String<seqan::Dna5> and std::string. Here are the (for me) surprising results:

I replaced the strings and chars with the SeqAn types throughout my source files. I access the characters in the SeqAn strings through the [] operator and corrected the functions where needed.

The program does its job, but it is 5 times slower than the simple std implementation! That's not exactly what I expected; I thought it would be a little slower or much faster, but not this extreme slowdown.

I suppose it happens because I don't use SeqAn the right way. Do you have an idea what the reason is? I paste the two responsible functions here; it would be great if someone could spend a couple of minutes on them.


Dna5 eventspace::select_event(Dna5 base, double p)
{
    // Returns the Dna5 character of the pre-stored interval [e.b, e.a)
    // that contains the random number p; nothing special.
    for (event e : E[base])
    {
        if (e.a > p)
        {
            if (p >= e.b)
            {
                return e.to;  // a seqan::Dna5 character
            }
        }
    }
    // Assumption (not in the original post): keep the base unchanged if no
    // interval matches, so the function never ends without a return value.
    return base;
}


seqan::String<seqan::Dna5> replicate2(framework& sys, seqan::String<seqan::Dna5> seq, default_random_engine engine)
{
    uniform_real_distribution<> ur_dist(0, sys.Getscale());
    //this and the default_random_engine are needed for real random number generation

    vector<double> probs(length(seq));
    vector<int> index;

    for(unsigned i=0; i<probs.size(); ++i)
    {
        probs[i]=ur_dist(engine);
        if(probs[i] > sys.lookup[seq[i]])index.push_back(i);
    }
    for(unsigned i : index)
    {
        seq[i] = sys.events.select_event(seq[i], probs[i]);
        // Practically one Dna5 assigned to another Dna5 variable; with assign() it is even a little slower.
    }
    return seq;
}


Do you have any idea, or is this slowdown maybe normal?

Thanks, regards:

Daniel

Live long and prosper
Bartha Dániel
MTA-VMRI, 2013


2013/8/28 Bartha Dániel <daniel.bartha@gmail.com>
Hi Manuel (and other C++ fellows),

I will try it and tell you whether it's better.

But there is another problem now, and there was already a discussion about it in February: https://lists.fu-berlin.de/pipermail/seqan-dev/2013-February/msg00002.htm
I don't know whether it has been solved or not, but I still/again get exactly the same error message:

/usr/include/seqan/bam_io/cigar.h||In function ‘bool seqan::operator<(const seqan::CigarElement<TOperation, TCount>&, const seqan::CigarElement<TOperation, TCount>&)’:|
/usr/include/seqan/bam_io/cigar.h|120|error: parse error in template argument list|
||=== Build finished: 1 errors, 0 warnings (0 minutes, 2 seconds) ===|

This is caused by including <seqan/seq_io.h>, even when the program is completely empty (return 0;...). I use Ubuntu Linux amd64 and g++ 4.7.3.

I avoid using this header for now, but the problem does not seem to be unique to my setup.

Thank you very much again, and have a good day!

Daniel


Live long and prosper
Bartha Dániel
MTA-VMRI, 2013


2013/8/28 Holtgrewe, Manuel <manuel.holtgrewe@fu-berlin.de>
Hi Daniel,

it depends on your application and what you do with your strings. Using the SeqAn library can yield more elegant and faster code than using std::string or self-written string classes but it depends on the actual use case.

For Sequences, there are two aspects:

(1) SeqAn's Dna and Dna5 character types store the alphabet as the numbers 0..3 and 0..4 internally. This makes things easier for indices and mappings since they can work directly and efficiently on the ordinal value (ordValue).

For example, if you are counting the nucleotide content along strings, you can simply have a 5-element container (String in this case) for each position in your reads (thus a String of Strings). You do not need an explicit mapping 'A' => 0, 'C' => 1, 'G' => 2, 'T' => 3, 'N' => 4 since the mapping is done beforehand.

String<String<unsigned> > counters;
for (unsigned i = 0; i < length(reads); ++i)
{
    // Increase number of counters if reads[i] is longer than the previous reads.
    if (length(counters) < length(reads[i]))
    {
        unsigned oldSize = length(counters);
        resize(counters, length(reads[i]));
        for (unsigned j = oldSize; j < length(counters); ++j)
            resize(counters[j], 5, 0);
    }

    // Count nucleotides for each position in reads[i].
    for (unsigned j = 0; j < length(reads[i]); ++j)
        counters[j][ordValue(reads[i][j])] += 1;
}

(2) SeqAn's String class additionally allows choosing an alternative implementation. The default implementation simply uses an array and stores each Dna character in one byte. By using the Packed String, you can compress four Dna characters into one byte (each only needs 2 bits). This comes at the cost of some computation but in this case leads to a 4x reduction in memory consumption.
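
The idea behind 2-bit packing can be sketched in plain C++. This is an illustration only and not SeqAn's Packed String: the class `PackedDna` and its members are hypothetical, and the characters are represented by their ordinal values 0..3.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative 2-bit packed container for a 4-letter alphabet.
class PackedDna
{
public:
    // Allocates one byte per four characters, rounded up.
    explicit PackedDna(std::size_t length)
        : size_(length), bytes_((length + 3) / 4, 0) {}

    std::size_t size() const { return size_; }
    std::size_t bytesUsed() const { return bytes_.size(); }

    // Stores an ordinal value 0..3 (A, C, G, T) at position pos.
    void set(std::size_t pos, unsigned ord)
    {
        assert(pos < size_ && ord < 4);
        unsigned shift = 2 * static_cast<unsigned>(pos % 4);
        std::uint8_t & byte = bytes_[pos / 4];
        // Clear the two bits of this slot, then write the new value.
        byte = static_cast<std::uint8_t>((byte & ~(3u << shift)) | (ord << shift));
    }

    // Returns the ordinal value stored at position pos.
    unsigned get(std::size_t pos) const
    {
        assert(pos < size_);
        unsigned shift = 2 * static_cast<unsigned>(pos % 4);
        return (bytes_[pos / 4] >> shift) & 3u;
    }

private:
    std::size_t size_;
    std::vector<std::uint8_t> bytes_;
};
```

Every access needs a division, a shift and a mask, which is the computational cost mentioned above; in exchange, ten characters fit into three bytes instead of ten.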

We as library writers can now combine these two aspects of sequences and alphabets with generic programming and write algorithms that allow the user to change the alphabet type and the string implementation depending on their requirements and get the best possible implementation for their case. Because template specialization allows us to select the correct implementation of ordValue(), length() etc. at *compile time*, we do not need virtual functions and thus pay no cost for runtime polymorphism.
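
How such compile-time selection works can be shown with a small generic function. The names below (`SimpleDna`, `AlphabetSize`, `ordinal`, `countSymbols`) are hypothetical and only mimic the pattern; they are not SeqAn's ordValue() or ValueSize, and standard containers stand in for SeqAn strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical 4-letter alphabet type that stores the ordinal value 0..3 directly.
struct SimpleDna
{
    unsigned char value;
};

// Alphabet size per character type, selected by template specialization.
template <typename TChar> struct AlphabetSize;
template <> struct AlphabetSize<SimpleDna> { static const std::size_t VALUE = 4; };
template <> struct AlphabetSize<char>      { static const std::size_t VALUE = 256; };

// Ordinal value per character type, selected by overload resolution.
inline std::size_t ordinal(SimpleDna c) { return c.value; }
inline std::size_t ordinal(char c) { return static_cast<unsigned char>(c); }

// Generic algorithm: counts how often each symbol occurs. The compiler
// picks AlphabetSize and ordinal() for the container's value type, so
// no virtual call is involved.
template <typename TString>
std::vector<std::size_t> countSymbols(TString const & text)
{
    typedef typename TString::value_type TChar;
    std::vector<std::size_t> counts(AlphabetSize<TChar>::VALUE, 0);
    for (std::size_t i = 0; i < text.size(); ++i)
        ++counts[ordinal(text[i])];
    return counts;
}
```

The same `countSymbols` then works on a `std::vector<SimpleDna>` with a 4-entry table and on a `std::string` with a 256-entry table, and both instantiations are resolved and can be inlined at compile time.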

If you want to use the algorithms in the SeqAn library then you could benefit from using SeqAn sequences. However, many algorithms also work with std::string, and without knowing your application and code it is hard to make any promise about acceleration.

Cheers,
Manuel


From: Bartha Dániel [daniel.bartha@gmail.com]
Sent: Wednesday, August 28, 2013 11:49 AM
To: SeqAn Development
Subject: [Seqan-dev] question about the efficiency of the sequan sequence classes

Hi All,

I have a big question. I wrote an application that currently uses my own custom std::string-based implementation for some DNA mutation stuff. I basically have to access every single character in the DNA and then do something with it, but that is not important for the question.

I am inclined to rewrite the whole app with SeqAn, but that only makes sense if manipulating and accessing the SeqAn classes is significantly faster than my own implementation. I read about the efficiency in the Motivation chapter, but does anybody have any experience with the concrete speedup that is possible?

Thanks!

Regards: Daniel

Live long and prosper
Bartha Dániel
MTA-VMRI, 2013


---

René Rahn
Ph.D. Student
--------------------------------
Tel:  (+49) 30 838 75277
--------------------------------
Institute of Computer Science
Algorithmic Bioinformatics (ABI)
--------------------------------
Freie Universität Berlin
Takustraße 9
14195 Berlin
--------------------------------




From jer15@hermes.cam.ac.uk Fri Sep 13 15:24:33 2013 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VKTMM-000l3u-4D>; Fri, 13 Sep 2013 15:24:30 +0200 Received: from ppsw-52.csi.cam.ac.uk ([131.111.8.152]) by relay1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VKTML-001xbu-VM>; Fri, 13 Sep 2013 15:24:30 +0200 X-Cam-AntiVirus: no malware found X-Cam-ScannerInfo: http://www.cam.ac.uk/cs/email/scanner/ Received: from cpc6-dals15-2-0-cust115.hari.cable.virginmedia.com ([82.35.196.116]:59127 helo=[192.168.1.4]) by ppsw-52.csi.cam.ac.uk (smtp.hermes.cam.ac.uk [131.111.8.158]:587) with esmtpsa (PLAIN:jer15) (TLSv1:DHE-RSA-CAMELLIA256-SHA:256) id 1VKTMK-0001rA-EM (Exim 4.80_167-5a66dd3) for seqan-dev@lists.fu-berlin.de (return-path ); Fri, 13 Sep 2013 14:24:28 +0100 Message-ID: <5233120C.6000002@mail.cryst.bbk.ac.uk> Date: Fri, 13 Sep 2013 14:24:28 +0100 From: John Reid User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130803 Thunderbird/17.0.8 MIME-Version: 1.0 To: SeqAn Development References: <9CAA1752-576D-407F-B96E-9F3EAEC916C0@campus.fu-berlin.de> <52331009.9090308@mail.cryst.bbk.ac.uk> In-Reply-To: <52331009.9090308@mail.cryst.bbk.ac.uk> Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: "J.E. 
Reid" X-Originating-IP: 131.111.8.152 X-purgate: suspect X-purgate-type: suspect X-purgate-ID: 151147::1379078670-0000097E-C87F57A9/3979189363-0/0-19 X-Bogosity: Ham, tests=bogofilter, spamicity=0.059908, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-0.2 required=5.0 tests=FU_XPURGATE_SUSP, HTML_MESSAGE, MIME_HTML_ONLY,RCVD_IN_DNSWL_MED X-Spam-Checker-Version: SpamAssassin 3.3.3-zedat0a54d5a on Algerien.ZEDAT.FU-Berlin.DE X-Spam-Level: Subject: Re: [Seqan-dev] question about the efficiency of the sequan sequence classes X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 13 Sep 2013 13:24:34 -0000 Sorry I meant "it is not always best to pass-by-reference"


From johdro@mpi-inf.mpg.de Wed Sep 18 14:53:23 2013 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VMHFx-003lYU-Rg>; Wed, 18 Sep 2013 14:53:21 +0200 Received: from infao0809.mpi-klsb.mpg.de ([139.19.1.49] helo=hera.mpi-klsb.mpg.de) by relay1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VMHFx-002VRI-PA>; Wed, 18 Sep 2013 14:53:21 +0200 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mpi-inf.mpg.de; s=mail200803; h=Content-Transfer-Encoding:Content-Type:Subject:To:MIME-Version:From:Date:Message-ID; bh=nsJLSTxKbRH++LklSHJ419HdifEtKEtMWce1f5Bhuu4=; b=FjYgrUXJ7SQUAGfc11OMkDSD5ILoL2OH38ppoIB6CwexRidEwYxL14zF1oX/xut4wu28dJVHXPjCC/CzSWhpP1Abox+OFr8IbT+80ZY7j7TYHonm/FlhNe6se8d2kopQOa1kISNxjEpVrkw3U1drI25abHd2BATZpJn7btmYNbA=; Received: from maniac.mpi-klsb.mpg.de ([139.19.1.28]:43021) by hera.mpi-klsb.mpg.de (envelope-from ) with esmtp (Exim 4.72) id 1VMHFu-0001c0-JB for seqan-dev@lists.fu-berlin.de; Wed, 18 Sep 2013 14:53:20 +0200 Received: from monster.cs.uni-duesseldorf.de ([134.99.112.114]:60674 helo=linux-eu7n.site) by maniac.mpi-klsb.mpg.de (envelope-from ) with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.72) id 1VMHFu-0001rH-8U for seqan-dev@lists.fu-berlin.de; Wed, 18 Sep 2013 14:53:18 +0200 Message-ID: <5239A23D.30404@mpi-inf.mpg.de> Date: Wed, 18 Sep 2013 14:53:17 +0200 From: =?ISO-8859-15?Q?Johannes_Dr=F6ge?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130801 Thunderbird/17.0.8 MIME-Version: 1.0 To: SeqAn Development Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 8bit X-MPI-Local-Sender: true X-Originating-IP: 139.19.1.49 X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1379508801-0000097E-775B0AD1/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 X-Spam-Flag: NO 
X-Spam-Status: No, score=-2.3 required=5.0 tests=RCVD_IN_DNSWL_MED, T_DKIM_INVALID X-Spam-Checker-Version: SpamAssassin 3.3.3-zedat0a54d5a on Algerien.ZEDAT.FU-Berlin.DE X-Spam-Level: Subject: [Seqan-dev] FaiIndex + getIdByName/readRegion not threads-safe X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 Sep 2013 12:53:23 -0000

Hi Seqan team,

I recently ported my code to use an off-memory sequence store with your FaiIndex class. Here is the problem: Whenever I use more than two threads to read a region using getIdByName+readRegion, the region returned is incorrect. The class does not seem to be thread-safe, although I cannot imagine why read-only functions like the above should not be run in parallel. I could wrap the lookup into a class with a locking mechanism, but I imagine it would be easy and quick to fix the class without introducing unnecessary locks.

When I give readRegion an object of type "const FaiIndex", it segfaults. IMO, whenever your index is read-only, the methods should be thread-safe or not available at all.
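
A wrapper with a locking mechanism of the kind mentioned above could look roughly like the following sketch. It makes no SeqAn calls: `NameIndex` is a hypothetical stand-in for an index whose const lookup writes to internal scratch state, and `LockedNameIndex` is a hypothetical wrapper name. In real code the body of `lookup` would call getIdByName and readRegion on the FaiIndex instead.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for an index whose lookup is logically read-only
// but writes to internal scratch state, so concurrent calls would race.
class NameIndex
{
public:
    void add(std::string const & name, unsigned id) { ids_[name] = id; }

    // Returns true and sets id if name is known; leaves id untouched otherwise.
    bool lookup(unsigned & id, std::string const & name) const
    {
        scratch_ = name;  // unsynchronized write to shared mutable state
        std::map<std::string, unsigned>::const_iterator it = ids_.find(scratch_);
        if (it == ids_.end())
            return false;
        id = it->second;
        return true;
    }

private:
    std::map<std::string, unsigned> ids_;
    mutable std::string scratch_;
};

// Serializes all lookups on one index with a mutex, so that only one
// thread at a time can touch the index's internal state.
class LockedNameIndex
{
public:
    explicit LockedNameIndex(NameIndex const & index) : index_(index) {}

    bool lookup(unsigned & id, std::string const & name) const
    {
        std::lock_guard<std::mutex> guard(mutex_);
        return index_.lookup(id, name);
    }

private:
    NameIndex const & index_;
    mutable std::mutex mutex_;
};
```

All threads must go through the same `LockedNameIndex` object for this to help. The lock serializes the lookups, so it trades parallelism for correctness until the index itself becomes safe for concurrent reads.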
Gruß Johannes From weese@campus.fu-berlin.de Wed Sep 25 10:48:46 2013 Received: from outpost1.zedat.fu-berlin.de ([130.133.4.66]) by list1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VOkm4-000euE-AX>; Wed, 25 Sep 2013 10:48:44 +0200 Received: from relay2.zedat.fu-berlin.de ([130.133.4.80]) by outpost1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VOkm4-002pzr-8G>; Wed, 25 Sep 2013 10:48:44 +0200 Received: from cas2.campus.fu-berlin.de ([130.133.170.202]) by relay2.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VOkm3-001BHQ-QZ>; Wed, 25 Sep 2013 10:48:44 +0200 Received: from EX02A.campus.fu-berlin.de ([130.133.170.132]) by CAS2.campus.fu-berlin.de ([130.133.170.202]) with mapi id 14.03.0123.003; Wed, 25 Sep 2013 10:48:43 +0200 From: "Weese, David" To: SeqAn Development Thread-Topic: [Seqan-dev] FaiIndex + getIdByName/readRegion not threads-safe Thread-Index: AQHOtG486KA/gAaEMky0JFwHGpt/4ZnWDa2A Message-ID: <7D097497-C4B6-4AF5-9217-4B9EC46EBEF4@fu-berlin.de> References: <5239A23D.30404@mpi-inf.mpg.de> In-Reply-To: <5239A23D.30404@mpi-inf.mpg.de> Accept-Language: de-DE, en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: Content-Type: multipart/signed; boundary="Apple-Mail=_676E9607-DA17-4015-8FA0-354C437D509F"; protocol="application/pkcs7-signature"; micalg=sha1 MIME-Version: 1.0 Date: Wed, 25 Sep 2013 10:48:42 +0200 X-Original-Date: Wed, 25 Sep 2013 08:48:42 +0000 X-Originating-IP: 130.133.170.202 X-ZEDAT-Hint: A X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1380098924-0000097E-771F8A64/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-50.0 required=5.0 tests=ALL_TRUSTED,HTML_MESSAGE X-Spam-Checker-Version: SpamAssassin 3.3.3-zedat0a54d5a on Botsuana.ZEDAT.FU-Berlin.DE X-Spam-Level: Subject: Re: 
[Seqan-dev] FaiIndex + getIdByName/readRegion not threads-safe X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Sep 2013 08:48:46 -0000 --Apple-Mail=_676E9607-DA17-4015-8FA0-354C437D509F Content-Type: multipart/alternative; boundary="Apple-Mail=_F8711DD8-6F5C-447A-81DD-88681179507F" --Apple-Mail=_F8711DD8-6F5C-447A-81DD-88681179507F Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=iso-8859-1

Hi Johannes,

there is a mutable private string inside the NameCache that is not thread-safe. So although getIdByName is used with a constant name store cache, there will be some concurrent write accesses if you use it with multiple threads. Internally, a name store cache is a set of ids (unsigned ints) that are sorted lexicographically with respect to the corresponding names in the name store (a StringSet).
std::set::find() only supports searching for keys of the set, i.e. ids. That means that if an arbitrary string needs to be looked up, we first store it temporarily in a private mutable string inside the less operator and then search for the id -1, which signals to the less operator that this string should be compared with the strings inside the name store. If std::set were more generic in the types that find and the less operator accept, we wouldn't need to do it this way.

So for now, you should put your lookups inside a critical section until we find a better solution.

Cheers,
David

--
David Weese, Ph.D.                david.weese@fu-berlin.de
Freie Universität Berlin          http://www.inf.fu-berlin.de/
Institut für Informatik           Phone: +49 30 838 75137
Takustraße 9                      Algorithmic Bioinformatics
14195 Berlin                      Room 020

On 18.09.2013 at 14:53, Johannes Dröge wrote:

> Hi Seqan team,
>
> I recently ported my code to use an off-memory sequence store with your FaiIndex class. Here is the problem: whenever I use more than two threads to read a region using getIdByName+readRegion, the region returned is incorrect. The class does not seem to be thread-safe, although I cannot imagine why read-only functions like the above should not be run in parallel. I could wrap the lookup into a class with a locking mechanism, but I imagine it would be easy and quick to fix the class without introducing unnecessary locks.
>
> When I give readRegion an object of type "const FaiIndex", it segfaults. IMO, whenever your index is read-only, the methods should be thread-safe or not available at all.
>
> Gruß Johannes
>
> _______________________________________________
> seqan-dev mailing list
> seqan-dev@lists.fu-berlin.de
> https://lists.fu-berlin.de/listinfo/seqan-dev

--Apple-Mail=_F8711DD8-6F5C-447A-81DD-88681179507F-- --Apple-Mail=_676E9607-DA17-4015-8FA0-354C437D509F Content-Disposition: attachment; filename="smime.p7s" Content-Type: application/pkcs7-signature; name="smime.p7s" Content-Transfer-Encoding: base64 --Apple-Mail=_676E9607-DA17-4015-8FA0-354C437D509F-- From johdro@mpi-inf.mpg.de Thu Sep 26 15:29:33 2013 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VPBdL-0024fG-Uy>; Thu, 26 Sep 2013 15:29:32 +0200 Received: from hera.mpi-sb.mpg.de ([139.19.1.49] helo=hera.mpi-klsb.mpg.de) by relay1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VPBdL-000I7s-SQ>; Thu, 26 Sep 2013 15:29:31 +0200 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mpi-inf.mpg.de; s=mail200803; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:To:MIME-Version:From:Date:Message-ID; 
bh=eiBrusTH+maEQBaAKiQsnT9mjldu3M9Q5umVqAWSXCM=; b=weklEZ7G0whrz3WoW2K0H0hrz0ZyeeqV1xAUax6toXqbTi6TNhajhlktHHBfIz7D0i3q3kA8/7k9oNXhssiIi6PAFHkVkD3ucJKz2qOla0Mk0uujqpSWaFWMOyj4u98QO0mQC2RFwbpOu7naYGQ+ojBPNrrshwpiAVdnhfyAvIw=; Received: from maniac.mpi-klsb.mpg.de ([139.19.1.28]:57349) by hera.mpi-klsb.mpg.de (envelope-from ) with esmtp (Exim 4.72) id 1VPBdI-0008VB-Md for seqan-dev@lists.fu-berlin.de; Thu, 26 Sep 2013 15:29:30 +0200 Received: from monster.cs.uni-duesseldorf.de ([134.99.112.114]:36232 helo=linux-eu7n.site) by maniac.mpi-klsb.mpg.de (envelope-from ) with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.72) id 1VPBdI-0008Ss-E3 for seqan-dev@lists.fu-berlin.de; Thu, 26 Sep 2013 15:29:28 +0200 Message-ID: <524436B6.1020209@mpi-inf.mpg.de> Date: Thu, 26 Sep 2013 15:29:26 +0200 From: =?ISO-8859-1?Q?Johannes_Dr=F6ge?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130801 Thunderbird/17.0.8 MIME-Version: 1.0 To: SeqAn Development References: <5239A23D.30404@mpi-inf.mpg.de> <7D097497-C4B6-4AF5-9217-4B9EC46EBEF4@fu-berlin.de> In-Reply-To: <7D097497-C4B6-4AF5-9217-4B9EC46EBEF4@fu-berlin.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-MPI-Local-Sender: true X-Originating-IP: 139.19.1.49 X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1380202171-0000097E-32BAB424/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-2.3 required=5.0 tests=RCVD_IN_DNSWL_MED, T_DKIM_INVALID X-Spam-Checker-Version: SpamAssassin 3.3.3-zedat0a54d5a on Dschibuti.ZEDAT.FU-Berlin.DE X-Spam-Level: Subject: Re: [Seqan-dev] FaiIndex + getIdByName/readRegion not threads-safe X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Sep 2013 13:29:33 -0000 Hi David, 
thanks for the detailed explanation, but I haven't understood all of it. I should probably look at the code. So:

1) I suppose you cannot just make the private mutable string a local variable?
2) I have sometimes found it useful to work with sets of memory addresses/pointers when it comes to string comparisons or lookups. Not sure if this is applicable here.
3) As far as I understand, getIdByName is the evil part here, so instead of locking its access I could also keep a global read-only lookup map outside your container?

Gruß Johannes

On 25.09.2013 10:48, Weese, David wrote:
> Hi Johannes,
>
> there is a mutable private string inside the NameCache that is not thread-safe. So although getIdByName is used with a constant name store cache, there will be some concurrent write accesses if you use it with multiple threads. Internally, a name store cache is a set of ids (unsigned ints) that are sorted lexicographically with respect to the corresponding names in the name store (a StringSet).
> std::set::find() only supports searching for keys of the set, i.e. ids. That means that if an arbitrary string needs to be looked up, we first store it temporarily in a private mutable string inside the less operator and then search for the id -1, which signals to the less operator that this string should be compared with the strings inside the name store. If std::set were more generic in the types that find and the less operator accept, we wouldn't need to do it this way.
>
> So for now, you should put your lookups inside a critical section until we find a better solution.
> > Cheers, > David From listes.rusconi@laposte.net Thu Sep 26 18:15:42 2013 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VPEE8-002UwU-El>; Thu, 26 Sep 2013 18:15:40 +0200 Received: from smtp1.u-psud.fr ([129.175.33.41]) by relay1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VPEE8-000pJD-9k>; Thu, 26 Sep 2013 18:15:40 +0200 Received: from smtp1.u-psud.fr (localhost [127.0.0.1]) by localhost (MTA) with SMTP id 9002C2565CA; Thu, 26 Sep 2013 18:15:38 +0200 (CEST) Received: from roma.clio.u-psud.fr (roma.lcp.u-psud.fr [129.175.101.95]) by smtp1.u-psud.fr (MTA) with ESMTP id 28C702565ED; Thu, 26 Sep 2013 18:15:38 +0200 (CEST) Received: by roma.clio.u-psud.fr (Postfix, from userid 1000) id 300DFB6157D; Thu, 26 Sep 2013 18:15:38 +0200 (CEST) Date: Thu, 26 Sep 2013 18:15:38 +0200 From: Filippo Rusconi To: Hannes =?utf-8?B?UsO2c3Q=?= Message-ID: <20130926161538.GB26528@licorne> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit User-Agent: Mutt/1.5.21 (2010-09-15) X-Originating-IP: 129.175.33.41 X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1380212140-0000097E-94B1615B/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-0.7 required=5.0 tests=FAKE_REPLY_C,FREEMAIL_FROM, RCVD_IN_DNSWL_LOW X-Spam-Checker-Version: SpamAssassin 3.3.3-zedat0a54d5a on Botsuana.ZEDAT.FU-Berlin.DE X-Spam-Level: Cc: tille@debian.org, sonne@debian.org, seqan-dev@lists.fu-berlin.de Subject: Re: [Seqan-dev] Seqan 1.4 : (OpenMS 1.11 packaging in Debian) X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Filippo Rusconi , SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Thu, 26 Sep 2013 16:15:42 -0000

Greetings Hannes, and Fellow Developers,

[[ CC seqan-dev@lists.fu-berlin.de as this discussion is relevant to the seqan developers, who kindly offered to try to find a solution to the problems that have arisen for Debian packaging after the switch to svn-based software releases was implemented ]]

On Tue, Sep 24, 2013 at 02:20:03PM +0200, Hannes Röst wrote:
> I ran into a problem when trying to get the dependencies for packaging
> OpenMS 1.11 which depends on seqan 1.4. Filippo and I found this
> previous conversation
> https://lists.debian.org/debian-med/2013/08/msg00037.html . I hope it's
> OK to write to you directly, I didn't quite understand yet where to
> best add comments to a certain bug in debian (maybe you can help me
> here as well?).

1) You get the number of the bug.
2) You copy/paste into your MUA the text of the bug report that is relevant to the discussion at hand. You write your own stuff.
3) You send your mail to @bugs.debian.org. For example, the bug that is about seqan-dev is recorded via the following address: 720995@bugs.debian.org. You CC any person you want.

> Concerning seqan:
>
> I just asked Knut Reinert directly and apparently all source code of
> the apps is available in the svn tree at
>
> http://svn.seqan.de/seqan/trunk/extras/apps/

Yes, indeed, that is the problem. That source tree is very large. We, Debian Developers, would like to have some script feature that would automatically extract the source code and make tarballs that are relevant:

- to the library (that is for the seqan-dev package)
- to the binary apps (that is for the -apps package)

How could that be set up?

> there is no licence change (still BSD) and probably an oversight on
> their side to not provide the sources of the apps as tar-package.

No, in fact this appears to be a decision, not an oversight.

> But the source code is in the repository and it's all in the build
> system so building it should be as easy as before.

But we need to have a tarball that is unambiguously related to the version number of the software.

> So a "cmake ." after a fresh checkout of trunk configures all the apps
> on my system and allows you to build them using for example "make
> splazers". For our purposes (e.g. OpenMS building) we mostly need the
> headers anyway which might be nice to have as seqan-1.4.1-dev package
> or so as standalone ...

The seqan-dev package is for the library and it has always been a standalone library package. The apps were packaged in another deb: seqan-apps.

> So what would be the best course of action here to get seqan 1.4 packaged?

This is the whole point. The last mail of the following thread did seem encouraging:

https://lists.fu-berlin.de/htdig/seqan-dev/2013-August/msg00022.html

The proposal read:

1. The SeqAn team provides a "seqan-src-1.4.1.tar.gz" that essentially is a tarballed version of the whole SVN tag.
2. The seqan-dev package is built from this using the -DSEQAN_RELEASE_LIBRARY flag to cmake.
3. The SeqAn team adjusts the build system with a -DSEQAN_IS_DEBIAN_BUILD=TRUE flag that allows building the apps using the headers from the seqan-dev package built earlier.

This was looking pretty nice, although one limitation was that the seqan-src-1.4.1.tar.gz tarball (step 1) would contain the huge SVN tag stuff. That, as said above, is too large for us to handle as a source package, as it is filled with stuff that is of no use for building the binary packages.

My suggestion here would be that we somehow put into that single tarball the code subset that is needed to perform steps 2 and 3.

How about putting that setup to work?

Andreas, would that setup be a correct one for you?

Cheers, Filippo -- Filippo Rusconi, PhD - public crypto key C78F687C @ pgp.mit.edu Researcher at CNRS and Debian Developer my massXpert software: http://www.massxpert.org From listes.rusconi@laposte.net Mon Sep 30 13:13:55 2013 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VQbQH-000SbD-El>; Mon, 30 Sep 2013 13:13:53 +0200 Received: from smtp1.u-psud.fr ([129.175.33.41]) by relay1.zedat.fu-berlin.de (Exim 4.80.1) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1VQbQH-0007vM-A4>; Mon, 30 Sep 2013 13:13:53 +0200 Received: from smtp1.u-psud.fr (localhost [127.0.0.1]) by localhost (MTA) with SMTP id 74FFF25657F; Mon, 30 Sep 2013 13:13:51 +0200 (CEST) Received: from roma.clio.u-psud.fr (roma.lcp.u-psud.fr [129.175.101.95]) by smtp1.u-psud.fr (MTA) with ESMTP id 18B9625657D; Mon, 30 Sep 2013 13:13:51 +0200 (CEST) Received: by roma.clio.u-psud.fr (Postfix, from userid 1000) id 1F960B6049C; Mon, 30 Sep 2013 13:13:51 +0200 (CEST) Date: Mon, 30 Sep 2013 13:13:51 +0200 From: Filippo Rusconi To: Hannes =?utf-8?B?UsO2c3Q=?= Message-ID: <20130930111351.GA9697@licorne> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-Originating-IP: 129.175.33.41 X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1380539633-0000097E-0153F080/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.170019, version=1.2.4 X-Spam-Flag: NO X-Spam-Status: No, score=-0.7 required=5.0 tests=FAKE_REPLY_C,FREEMAIL_FROM, RCVD_IN_DNSWL_LOW X-Spam-Checker-Version: SpamAssassin 3.3.3-zedat0a54d5a on Benin.ZEDAT.FU-Berlin.DE X-Spam-Level: Cc: tille@debian.org, sonne@debian.org, seqan-dev@lists.fu-berlin.de Subject: Re: [Seqan-dev] Seqan 1.4 : (OpenMS 1.11 packaging in Debian) X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Filippo Rusconi , SeqAn 
Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Sep 2013 11:13:55 -0000

Greetings Hannes, and Fellow Developers,

[[ CC seqan-dev@lists.fu-berlin.de as this discussion is relevant to the seqan developers, who kindly offered to try to find a solution to the problems that have arisen for Debian packaging after the switch to svn-based software releases was implemented ]]

As a follow-up to my previous ramblings about packaging seqan for Debian, which is required since OpenMS depends on it, here are my tests.

I want to remind you of some points: seqan was packaged for Debian in the form of two binary packages:

- seqan-dev
- seqan-apps

Since the most recent versions of seqan, the software cannot be packaged in Debian because the source release mechanisms have changed. Specifically, there are no more source tarballs for each new version upon which to make packages for Debian. In lieu of the tarballs, developers are now required to download full svn tagged branches that are, of course, way too large for making a source tarball by simply compressing the resulting directories.

Because Debian uses, in the package creation process, a source tarball that is versioned and absolutely immutable (that is, with an immutable checksum), we cannot package seqan anymore.

I have followed the descriptions of Manuel (see thread https://lists.fu-berlin.de/htdig/seqan-dev/2013-August/msg00022.html) to build both the library and the apps subpackages. To do that, I first "purified" the directory in which the svn branch was checked out.

Here is a transcript of my source tree "purifying" commands, interspersed with commands to monitor the size of the tree after each removal of unused stuff:

$ svn co http://svn.seqan.de/seqan/tags/seqan-1.4.1 seqan-1.4.1
$ du -sh seqan-1.4.1
478M seqan-1.4.1

./core/tests$ du -sh .svn
225M .svn
# We'll remove that .svn directory later

$ find -name "tests" -type d | xargs rm -rf
./core/apps/sak/tests
./core/apps/mason/tests
./core/apps/seqan_tcoffee/tests
./core/apps/razers/tests
./core/apps/rabema/tests
./core/apps/stellar/tests
./core/apps/razers2/tests
./core/apps/pair_align/tests
./core/apps/snp_store/tests
./core/apps/dfi/tests
./core/apps/splazers/tests
./core/apps/micro_razers/tests
./core/apps/tree_recon/tests
./extras/tests
./extras/apps/razers3/tests
./extras/apps/insegt/tests
./extras/apps/gustaf/tests
./extras/apps/searchjoin/tests
./extras/apps/masai/tests
./extras/apps/variant_comp/tests
./extras/apps/sgip/tests
./extras/apps/breakpoint_calculator/tests
./misc/seqan_instrumentation/bin/classes/simplejson/tests
$ du -sh ../seqan-1.4.1
319M ../seqan-1.4.1

$ rm -rf extras/
(We'll see later that, for the time being, we need this directory to build non-extra stuff)
$ du -sh ../seqan-1.4.1
311M ../seqan-1.4.1

$ rm -rf .svn
$ du -sh ../seqan-1.4.1
87M ../seqan-1.4.1
# Good, that seems reasonable now. When compressed, that will make a source tarball of about 30 MB, which is fine.

# Remove the corresponding directory from the main CMakeLists file.
# Delete lines
-> message (STATUS "Configuring extras")
-> add_subdirectory (extras)

# Now test the build

Ok, start with the apps
=======================

cd ..
mkdir -p seqan-1.4.1-build/release
cd seqan-1.4.1-build/release
cmake ../../seqan-1.4.1 -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../../seqan-1.4.1-install -DSEQAN_BUILD_SYSTEM=SEQAN_RELEASE_APPS
$ make
Scanning dependencies of target dfi
[ 75%] Building CXX object core/apps/dfi/CMakeFiles/dfi.dir/dfi.cpp.o
/home/rusconi/devel/packaging/seqan/seqan-1.4.1/core/apps/dfi/dfi.cpp:29:45: fatal error: ../../extras/include/seqan/math.h: No such file or directory
 #include <../../extras/include/seqan/math.h>
 ^
compilation terminated.

# The point seems to be that the core/apps stuff needs headers that are located in extras/include/seqan. That's odd. Why does core stuff need something in extras?

When the extras directory was not removed, the build went fine up to the end. So, I decided to leave that extras directory in the source tree.

OK, now, to the library stuff.
==============================

$ cd ..
$ mkdir library
$ cd library
$ cmake ../../seqan-1.4.1 -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../../seqan-lib-1.4.1-install -DSEQAN_BUILD_SYSTEM=SEQAN_RELEASE_LIBRARY
$ make docs
$ make install

This seems to end up with a proper installation of the library stuff. And I can see seq_io.h installed correctly, which is fine, since this is the file that was missing during the OpenMS build with a previous version of seqan.

Ramblings
=========

So, at this time I would say the following:

The size of the tarball is

-rw-r--r-- 1 rusconi rusconi 30M Sep 30 10:28 seqan-1.4.1-cleaned.tar.gz

which is perfectly fine for us.

How could we automate this tarball creation step, which amounts to cleaning up the stuff in the subversion tag checkout that the build does not use?

I suggest that you put a simple script like what I have done above in the source tree and that you run it ONCE on the subversion branch during a release, so as to produce a single versioned source tarball.

For example, running this script against the 1.4.1 tag would yield a single tar.gz file named:

seqan-1.4.1.tar.gz

That source tarball would then be made available on the downloads page, maybe with some explanation of how it differs from the corresponding tag checked out of the svn repository.

How about such a possibility?

I feel like this should work fine in the long run, if the script is amended when the source tree changes significantly. We, Debian Developers, would in any case provide feedback upon using the source tarball for the creation of the packages.

Please let me know your opinions on the matter above.

With my best regards,
Filippo Rusconi

--
Filippo Rusconi, PhD - public crypto key C78F687C @ pgp.mit.edu
Researcher at CNRS and Debian Developer
my massXpert software: http://www.massxpert.org
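
The cleanup steps from the transcript above could be collected into a small release script along the following lines. This is only a sketch under stated assumptions: the function name make_clean_tarball and its two arguments are made up for illustration and are not part of the SeqAn build system. It takes an existing checkout of the tag rather than running svn itself. Whether the build really needs none of the "tests" directories has only been checked as far as the transcript goes.

```shell
#!/bin/sh
# Sketch only: strip a SeqAn tag checkout down to what the build uses,
# then pack it into a versioned tarball. All names are illustrative.
set -eu

make_clean_tarball() {
    src_dir=$1    # path to the tag checkout, e.g. seqan-1.4.1
    version=$2    # version used in the tarball name, e.g. 1.4.1

    # Remove Subversion metadata and every directory named "tests".
    # -prune keeps find from descending into directories about to be deleted.
    find "$src_dir" -type d \( -name .svn -o -name tests \) -prune \
        -exec rm -rf {} +

    # extras/ is kept on purpose: core/apps/dfi includes a header from it.

    # Write seqan-<version>.tar.gz into the current directory, with the
    # checkout directory as the single top-level entry.
    parent=$(dirname "$src_dir")
    base=$(basename "$src_dir")
    tar -czf "seqan-$version.tar.gz" -C "$parent" "$base"
}
```

A call such as `make_clean_tarball seqan-1.4.1 1.4.1` would then leave seqan-1.4.1.tar.gz in the current directory. Starting from `svn export` instead of `svn co` would avoid the .svn directories in the first place, since an export contains no Subversion metadata.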