From Sabrina.Krakau@fu-berlin.de Mon Jul 02 15:55:29 2012
From: "Krakau, Sabrina"
To: "Krakau, Sabrina"
Cc: AG ABI ABI, SeqAn Development, "seqan-interests@lists.fu-berlin.de"
Date: Mon, 2 Jul 2012 13:55:17 +0000
Subject: [Seqan-dev] Fwd: SeqAn - BioStore Workshop 2012, Berlin, September the 4th - 6th

Dear SeqAn Users and Developers,

We would like to remind you of our online poll for this year's workshop tutorials.
This is your chance to get the tutorials you are interested in, and it takes only 3 minutes of your time.
You can vote for your favorite topics until the end of this week (6th of July).

The detailed schedule based on your needs will be sent to you soon.

Thank you for participating!

The SeqAn team

http://www.seqan-biostore.de/wp/seqan-workshops/2012-seqan-workshop/

Sabrina Krakau
Freie Universität Berlin
Institute of Computer Science
Algorithmic Bioinformatics - Project BioStore
Takustr. 9, 14195 Berlin
Phone: +49 (0)30 838 75228

Begin forwarded message:

From: "Krakau, Sabrina"
Subject: SeqAn - BioStore Workshop 2012, Berlin, September the 4th - 6th
Date: 22 June 2012 13:36:34 CEST
To: Sabrina Krakau <Sabrina.Krakau@fu-berlin.de>
Cc: <seqan-interests@lists.fu-berlin.de>, SeqAn Development <seqan-dev@lists.fu-berlin.de>, AG ABI ABI <agabi@mi.fu-berlin.de>, "Knut Reinert" <Knut.Reinert@fu-berlin.de>

Dear SeqAn Users and Developers,

The 4th SeqAn user meeting will take place from September 4th to 6th in Berlin.

This year the workshop will include
* Detailed description of SeqAn apps
* Short, focused tutorials for beginners and advanced programmers
* Tutorials about integrating SeqAn apps into a workflow engine
* Reports of users about their experience with SeqAn

New this year: you can choose the tutorials that interest you!
Simply vote on our webpage for the topics you would like to know more about.

For beginners we offer:
* SeqAn style vs. object oriented programming
* Sequence I/O and handling
* The SeqAn FragmentStore
* Indices
* Alignments in SeqAn

For advanced users we offer:
* BLASTX
* Alignment free comparison
* Bowtie
* RNA-Seq expression
* Data compression
* Local alignments

Additionally we offer a range of SeqAn apps you can choose from.

If you have ongoing projects with SeqAn, or any experience with it that you would like to share, just send us an email.
We appreciate your interest and will soon send a final schedule based on your preferences.

See you in September,

The SeqAn team

Sabrina Krakau
Freie Universität Berlin
Institute of Computer Science
Algorithmic Bioinformatics - Project BioStore
Takustr. 9, 14195 Berlin
Phone: +49 (0)30 838 75228

From jer15@hermes.cam.ac.uk Thu Jul 05 09:46:51 2012
From: John Reid
To: SeqAn Development
Date: Thu, 05 Jul 2012 08:46:47 +0100
Subject: Re: [Seqan-dev] {Disarmed} Re: Performance advice for whole genome ESA

Hi Manuel,

Thanks for the advice.

I'm having some memory problems when I build an ESA on a whole genome (3 Gb or so). I don't even know if I can reasonably expect to do this. Does anyone out there have any experience with this? If so, what sort of hardware are you running on, and did you have to take any special measures in software to handle such large sequence sets?
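A back-of-the-envelope estimate can show why memory becomes tight at this scale. The sketch below is not SeqAn code. It assumes an enhanced suffix array made of three tables (suffix array, LCP table, child table), each holding one entry per text position, plus the text at one byte per base. The function name `estimateEsaBytes` and the parameter `entryBytes` are hypothetical, and the estimate ignores any temporary memory used during construction as well as the position types SeqAn actually chooses.

```cpp
#include <cstdint>

// Rough memory estimate for an enhanced suffix array (ESA).
// Assumption: three tables (suffix array, LCP table, child table), each with
// one entry per text position, plus the text itself at one byte per base.
// entryBytes is a hypothetical parameter: the width of one table entry.
std::uint64_t estimateEsaBytes(std::uint64_t textLength, std::uint64_t entryBytes)
{
    const std::uint64_t tables = 3;              // SA + LCP + child table
    const std::uint64_t textBytes = textLength;  // one byte per base
    return textBytes + tables * entryBytes * textLength;
}
```

Under these assumptions, `estimateEsaBytes(2730871774ULL, 8)` for the 2,730,871,774 bp mouse genome mentioned in this thread comes to about 68 GB, so the index tables rather than the sequence itself would dominate.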

Thanks,
John.


On 26/06/12 16:20, Holtgrewe, Manuel wrote:
Hi John,

I would recommend using a Double-Pass MMap RecordReader as described here:


I'm not sure how much compression on disk will help you; it depends on where the overhead is.

You could also use the GZFile Stream and use a Single-Pass RecordReader for this. The question is whether your disk (for reading compressed data) or your CPU (for decompressing the data) is then the bottleneck.
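The disk-versus-CPU question can be made concrete with a small model. This is only a sketch under stated assumptions: reading and decompression are assumed to overlap, so the slower stage determines the load time. The function name `estimateLoadSeconds` and all of its parameters are hypothetical; the throughput values would have to be measured on the machine in question.

```cpp
#include <algorithm>

// Estimated seconds to load a file when reading and decompression overlap
// (pipelined), so the slower of the two stages dominates.
// All parameters are hypothetical inputs to be measured on your own machine.
double estimateLoadSeconds(double uncompressedBytes,
                           double compressionRatio,    // uncompressed / compressed size; 1.0 = no compression
                           double diskBytesPerSec,     // sequential read throughput
                           double inflateBytesPerSec)  // decompressor output throughput
{
    // Time to pull the (possibly compressed) bytes off the disk.
    double readSeconds = (uncompressedBytes / compressionRatio) / diskBytesPerSec;
    // Time to decompress; zero when the file is stored uncompressed.
    double inflateSeconds =
        (compressionRatio > 1.0) ? uncompressedBytes / inflateBytesPerSec : 0.0;
    return std::max(readSeconds, inflateSeconds);
}
```

In this model, compression only pays off when the decompressor can produce output faster than the disk can deliver the uncompressed file.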


Cheers,
Manuel


From: John Reid [j.reid@mail.cryst.bbk.ac.uk]
Sent: Tuesday, June 26, 2012 4:20 PM
To: SeqAn Development
Subject: Re: [Seqan-dev] Performance advice for whole genome ESA

Hi,

I've done some more reading (http://trac.seqan.de/wiki/HowTo/EfficientImportOfMillionsOfSequences) and as far as I can tell I should just be using memory mapped files as a mechanism to read large sequence sets into main memory. Likewise this is the area where compression on disk could help. If I want to iterate over an ESA, I'm best off copying the sequences into a standard seqan StringSet in main memory and creating the ESA on top of that. Please let me know if I've got the wrong end of the stick.
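The "map the file, then copy the sequences into main memory" idea can be sketched without SeqAn. The code below is an illustration using plain POSIX `mmap` and `std::string`, not the SeqAn RecordReader API discussed in this thread. The names `parseFasta` and `loadFastaMmap` are hypothetical, and the parser assumes a simple FASTA layout (header lines starting with `>`, everything else sequence data).

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Parse FASTA records from a memory buffer: lines starting with '>' begin a
// new record; all other non-empty lines are appended to the current sequence.
std::vector<std::string> parseFasta(const char* data, std::size_t size)
{
    std::vector<std::string> seqs;
    std::size_t i = 0;
    while (i < size) {
        std::size_t end = i;
        while (end < size && data[end] != '\n') ++end;    // find end of line
        std::size_t len = end - i;
        if (len > 0 && data[i + len - 1] == '\r') --len;  // tolerate CRLF
        if (len > 0 && data[i] == '>') {
            seqs.emplace_back();                          // header: start a new record
        } else if (len > 0 && !seqs.empty()) {
            seqs.back().append(data + i, len);            // sequence line
        }
        i = end + 1;
    }
    return seqs;
}

// Map a FASTA file read-only and copy its sequences into main memory.
std::vector<std::string> loadFastaMmap(const std::string& path)
{
    int fd = ::open(path.c_str(), O_RDONLY);
    if (fd < 0) throw std::runtime_error("cannot open " + path);
    struct stat st;
    if (::fstat(fd, &st) != 0) {
        ::close(fd);
        throw std::runtime_error("cannot stat " + path);
    }
    if (st.st_size == 0) {                                // mmap rejects zero-length maps
        ::close(fd);
        return {};
    }
    std::size_t size = static_cast<std::size_t>(st.st_size);
    void* p = ::mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    ::close(fd);                                          // the mapping stays valid after close
    if (p == MAP_FAILED) throw std::runtime_error("cannot mmap " + path);
    std::vector<std::string> seqs = parseFasta(static_cast<const char*>(p), size);
    ::munmap(p, size);
    return seqs;
}
```

The mapping is released once the sequences have been copied, so any index built afterwards works on ordinary in-memory strings rather than on the mapped file.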

Regards,
John.


On 21/06/12 16:33, John Reid wrote:
Hi,

I'm reading the whole mouse genome into a seqan::IndexEsa based on a
seqan::StringSet. At the moment I have the genome (2,730,871,774 bp)
stored in one uncompressed fasta file on disk. Once I have the genome
loaded I'm iterating over it many times looking at all the words shorter than
about 20 bp. I'm wondering if there is a better way to go about this. Should I
be looking at memory mapped files and/or compression on disk? Any
pointers or advice would be welcome.
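For words of a fixed length, one alternative worth considering is to count k-mers directly instead of traversing an index. The sketch below swaps in a different technique from the ESA discussed here: a rolling 2-bit encoding with a hash map. The function name `countKmers` is hypothetical, it handles a single k of at most 32, and it skips any window containing a character other than A, C, G or T.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Count every k-mer (1 <= k <= 32) of seq using a rolling 2-bit encoding
// (A=0, C=1, G=2, T=3). Windows containing any other character are skipped.
std::unordered_map<std::uint64_t, std::uint64_t>
countKmers(const std::string& seq, unsigned k)
{
    std::unordered_map<std::uint64_t, std::uint64_t> counts;
    if (k == 0 || k > 32) return counts;
    const std::uint64_t mask = (k == 32) ? ~0ULL : ((1ULL << (2 * k)) - 1);
    std::uint64_t code = 0;
    unsigned valid = 0;  // consecutive valid bases in the current window, capped at k
    for (char c : seq) {
        std::uint64_t b;
        switch (c) {
            case 'A': case 'a': b = 0; break;
            case 'C': case 'c': b = 1; break;
            case 'G': case 'g': b = 2; break;
            case 'T': case 't': b = 3; break;
            default:             // e.g. N: restart the window
                valid = 0;
                code = 0;
                continue;        // continues the enclosing for loop
        }
        code = ((code << 2) | b) & mask;  // shift in the new base, drop the oldest
        if (valid < k) ++valid;
        if (valid == k) ++counts[code];
    }
    return counts;
}
```

Covering all word lengths up to about 20 bp would mean one pass per length with this approach, and the table for long k on a whole genome can itself grow large, so whether it beats an ESA traversal depends on the workload.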

Thanks,
John.

_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev



From: Siragusa, Enrico
Date: Thu, 5 Jul 2012 08:27:29 +0000
Subject: Re: [Seqan-dev] Performance advice for whole genome ESA

Hi John,

On Jul 5, 2012, at 9:46 AM, John Reid wrote:

Hi Manuel,

Thanks for the advice.

I'm having some memory problems when I build an ESA on a whole genome (3 Gbp or so). I don't even know if I can reasonably expect to do this. Does anyone out there have any experience with this? If so, what sort of hardware are you running on and did you have to take any special measures in software to handle such large sequence sets?

If you build the ESA as it is, it will consume 4 long ints (4 x 8 = 32 bytes) per char on a 64-bit machine and take ~96 GB of memory for a 3 Gbp genome.
But you can redefine the index fibres for your needs, i.e. you can replace long ints with ints or chars.

// TGenome is the type of sequence, e.g. StringSet<Dna5String>
typedef StringSet<Dna5String>                      TGenome;
typedef Index<TGenome, IndexEsa<> >                TGenomeEsa;

namespace seqan
{
    template <>
    struct Fibre<TGenomeEsa, FibreSA>
    {
        // Works for up to 256 contigs of length 4Gbp
        typedef String< Pair<unsigned char, unsigned int, Compressed>,
                        DefaultIndexStringSpec<TGenomeEsa>::Type >              Type;

        // Use a mmapped string
//        typedef String< Pair<unsigned char, unsigned int, Compressed>, MMap<> >   Type;
    };

    template <>
    struct Fibre<TGenomeEsa, FibreLcp>
    {
        typedef String<unsigned int, DefaultIndexStringSpec<TGenomeEsa>::Type >   Type;
    };

    template <>
    struct Fibre<TGenomeEsa, FibreChildtab>
    {
        typedef String<unsigned int, DefaultIndexStringSpec<TGenomeEsa>::Type >   Type;
    };
}

In this way your ESA will fit in ~38 GB of memory.
You might want to try out mmapped strings depending on your memory requirements and the access pattern of your algorithm.
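
The two estimates above can be checked with a little arithmetic. What follows is only a sketch: esa_bytes and its parameter names are made up for illustration and are not part of SeqAn. It assumes the three ESA tables (suffix array, lcp, childtab) each hold one entry per text position, and it ignores the text itself and any allocator overhead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a SeqAn function): estimated size in bytes of the
// three ESA tables for a text of n characters, given the size of one entry
// in each table.
std::uint64_t esa_bytes(std::uint64_t n,
                        std::uint64_t saEntryBytes,
                        std::uint64_t lcpEntryBytes,
                        std::uint64_t childtabEntryBytes)
{
    const std::uint64_t perChar = saEntryBytes + lcpEntryBytes + childtabEntryBytes;
    assert(perChar == 0 || n <= UINT64_MAX / perChar);  // guard against overflow
    return n * perChar;
}

// Default on an LP64 machine: an SA entry is a pair of two 8-byte integers,
// lcp and childtab entries are 8 bytes each, i.e. 32 bytes per character.
std::uint64_t default_esa_bytes(std::uint64_t n)
{
    return esa_bytes(n, 2 * 8, 8, 8);
}

// With the fibres redefined as above: packed (1-byte id, 4-byte offset)
// pair, 4-byte lcp, 4-byte childtab, i.e. 13 bytes per character.
std::uint64_t reduced_esa_bytes(std::uint64_t n)
{
    return esa_bytes(n, 1 + 4, 4, 4);
}
```

For n = 3,000,000,000 the first function gives 96,000,000,000 bytes (the ~96 GB figure) and the second gives 39,000,000,000 bytes, in line with the ~38 GB estimate.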

You can also try to redefine the Size and StringSetLimits metafunctions for your sequence types.

namespace seqan
{
    template <>
    struct Size<Dna5String>
    {
        typedef unsigned int            Type;
    };

    template <>
    struct StringSetLimits<TGenome>
    {
        typedef String<unsigned char>   Type;
    };
}

Please overload metafunctions only in your applications, not in library modules!
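
Before shrinking these types it is worth checking that the data actually fits. The sketch below is plain C++ and does not use SeqAn; fits_in and fits_small_sa_pair are hypothetical helpers named only for this example. It checks the limits mentioned in the code comment above: sequence ids stored in an unsigned char and positions stored in an unsigned int.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if value can be stored in TTarget without loss.
template <typename TTarget>
bool fits_in(std::uint64_t value)
{
    return value <= static_cast<std::uint64_t>(std::numeric_limits<TTarget>::max());
}

// True if a set of numSeqs sequences, the longest having maxLength
// characters, can be addressed by an (unsigned char, unsigned int) pair.
// Sequence ids run from 0 to numSeqs - 1; for positions we conservatively
// require the length itself to fit.
bool fits_small_sa_pair(std::uint64_t numSeqs, std::uint64_t maxLength)
{
    if (numSeqs == 0)
        return true;  // nothing to index
    return fits_in<unsigned char>(numSeqs - 1) && fits_in<unsigned int>(maxLength);
}
```

With the usual 8-bit char and 32-bit int, a set of 256 sequences passes and a set of 257 does not, and any single sequence longer than 4,294,967,295 characters fails regardless of the set size.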

Ciao,
Enrico


Thanks,
John.


On 26/06/12 16:20, Holtgrewe, Manuel wrote:
Hi John,

I would recommend using a Double-Pass MMap RecordReader as described here:

http://trac.seqan.de/wiki/Tutorial/ReadingSequenceFiles#DocumentReadingAPI

I'm not sure how much compression on disk will help you; it depends on where the overhead is.

You could also use the GZFile Stream and use a Single-Pass RecordReader for this. The question is whether your disk (for reading compressed data) or your CPU (for decompressing the data) is then the bottleneck.

http://trac.seqan.de/wiki/Tutorial/FileIO2#CompressedStreams

Cheers,
Manuel



From: John Reid
Date: Thu, 05 Jul 2012 10:02:36 +0100
Subject: Re: [Seqan-dev] Performance advice for whole genome ESA

Great. That looks very helpful. So in your example, how do you arrive at 38 GB? You are using unsigned int instead of long unsigned int. Where does the unsigned char in the Fibre<>::Type come into the calculation? I think I need my code to handle sequence sets with more than 256 sequences. I'm guessing if I replace the unsigned char with unsigned long I get back to 48 GB?

Thanks,
John.




From: Siragusa, Enrico
Date: Thu, 5 Jul 2012 10:40:38 +0000
Subject: Re: [Seqan-dev] Performance advice for whole genome ESA

On Jul 5, 2012, at 11:02 AM, John Reid wrote:

Great. That looks very helpful. So in your example, how do you arrive at 38 GB? You are using unsigned int instead of long unsigned int. Where does the unsigned char in the Fibre<>::Type come into the calculation? I think I need my code to handle sequence sets with more than 256 sequences. I'm guessing if I replace the unsigned char with unsigned long I get back to 48 GB?

I counted 13n bytes: 1+4 bytes for suffix array values, 4 bytes for lcp values and 4 bytes for childtab values. Then for n equal to 3 Gbp you get roughly 38 GB.
Value sizes really depend on your input sequences. How many strings do you want to index, and what is their maximum length?
Do you have enough memory available? Depending on your application, another index could be more efficient...
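
To see where the 1+4 bytes come from, and what a wider sequence id would cost, here is a small stand-alone sketch. It does not use SeqAn: the PairId* structs are hypothetical stand-ins for a compressed (sequence id, offset) pair, laid out with #pragma pack, which GCC, Clang and MSVC support but which is not standard C++. The lcp and childtab entries are assumed to stay at 4 bytes each.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for a packed (sequence id, offset) pair.
// #pragma pack(1) removes the padding the compiler would otherwise add.
#pragma pack(push, 1)
struct PairId8  { std::uint8_t  seqId; std::uint32_t offset; };  // up to 256 sequences
struct PairId16 { std::uint16_t seqId; std::uint32_t offset; };  // up to 65,536 sequences
struct PairId32 { std::uint32_t seqId; std::uint32_t offset; };
struct PairId64 { std::uint64_t seqId; std::uint32_t offset; };
#pragma pack(pop)

// Bytes per text position: one SA entry plus a 4-byte lcp entry and a
// 4-byte childtab entry.
template <typename TPair>
constexpr std::size_t bytes_per_position()
{
    return sizeof(TPair) + 4 + 4;
}
```

Under these assumptions the four variants cost 13, 14, 16 and 20 bytes per position. For n = 3 Gbp that is about 39, 42, 48 and 60 GB, so the 48 GB figure corresponds to a 4-byte sequence id, while an 8-byte unsigned long id would come to roughly 60 GB. A 2-byte id already lifts the 256-sequence limit for one extra byte per position.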

Thanks,
John.


On 05/07/12 09:27, Siragusa, Enrico wrote:
Hi John,

On Jul 5, 2012, at 9:46 AM, John Reid wrote:

Hi Manuel,

Thanks for the advice.

I'm having some memory problems when I build a ESA on a whole genome (3Gb o= r so). I don't even know if I can reasonably expect to do this. Does anyone= out there have any experience with this? If so, what sort of hardware are = you running on and did you have to take any special measures in software to handle such large sequence set= s?

If you build the Esa as it is, it will consume 4 long ints per ch= ar on a 64bit machine and take ~96Gb of memory for a 3Gb gen= ome.
But you can redefine index fibres for your needs, i.e. you can replace= long ints with ints or chars.

// TGenome is the= type of sequence, e.g. StringSet<Dna5String>
= typedef StringSet<= ;Dna5String>                = ;      TGenome;
typedef Index<TGenome, IndexEsa<= ;> >                   &= nbsp; TGenomeEsa;

namespace seqan
{
    template <>
    struct Fibre<TGenomeEsa, FibreSA>
    {
// Works for up to 256 contigs of length 4Gbp
        typedef S= tring< Pair<unsigned char, = unsigned int, Compressed>,
                     = ;   DefaultIndexStringSpec<TGenomeEsa>::Type >    &= nbsp;         Type;

// Use a mmapped string
//        typedef String< Pair<unsigned char, uns= igned int, Compressed>, MMap<> >   Type;
    };
    
    template <>
    struct Fibre<TGenomeEsa, FibreLcp>
    {
        typedef S= tring<unsigned int, DefaultIndexStringSpec<TGenom= eEsa>::Type >   Type;
    };
    
    template <>
    struct Fibre<TGenomeEsa, FibreChildtab>
    {
        typedef S= tring<unsigned int, DefaultIndexStringSpec<TGenom= eEsa>::Type >   Type;
    };
}

In this way your Esa will fit in ~38Gb of memory.
You might want to try out mmapped strings depending on your memor= y requirements and the access pattern of your algorithm.

You can also try to redefine size and limits metafunctions for you seq= uence types.

namespace seqan
{
    template <>
    struct Size<Dna5Stri= ng>
    {
        typedef <= span style=3D"color: #b833a1"> unsigned int    &nbs= p;       Type;
    };
    
    template <>
    struct StringSetLimits&= lt;TGenome>
    {
        typedef S= tring<unsigned char>   Type;
    };
}

Please overload metafunctions only in your applications, not in librar= y modules!

Ciao,
Enrico


Thanks,
John.


On 26/06/12 16:20, Holtgrewe, Manuel wrote:<= br>
Hi John,

I would recommend using a Double-Pass MMap RecordReader as described here:

http://trac.seqan.de/wiki/Tutorial/ReadingSequenceFiles#DocumentReadingAPI

I'm not sure how much compression on disk will help you, i.e. where the overhead is.

You could also use the GZFile Stream with a Single-Pass RecordReader for this. The question is whether your disk (for reading compressed data) or your CPU (for decompressing the data) is then the bottleneck.

http://trac.seqan.de/wiki/Tutorial/FileIO2#CompressedStreams

Cheers,
Manuel


From: John Reid [j.reid@mail.cryst.bbk.ac.uk]
Sent: Tuesday, June 26, 2012 4:20 PM
To: SeqAn Development
Subject: Re: [Seqan-dev] Performance advice for whole genome ESA

Hi,

I've done some more reading ( http://trac.seqan.de/wiki/HowTo/EfficientImportOfMillionsOfSequences) and as far as I can tell I should just be using memory mapped files as a mechanism to read large sequence sets into main memory. Likewise this is the area where compression on disk could help. If I want to iterate over an ESA I'm best off copying the sequences into a standard seqan StringSet in main memory and creating the ESA on top of that. Please let me know if I've got the wrong end of the stick.

Regards,
John.


On 21/06/12 16:33, John Reid wrote:
Hi,

I'm reading the whole mouse genome into a seqan::IndexEsa based on a
seqan::StringSet. At the moment I have the genome (2,730,871,774 bp)
stored in one uncompressed fasta file on disk. Once I have the genome
loaded I'm iterating over it many times looking at all the words < about
20bp. I'm wondering if there is a better way to go about this. Should I
be looking at memory mapped files and/or compression on disk? Any
pointers or advice would be welcome.

Thanks,
John.

_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev


--_000_3C98A92F0EF87E41AFE85105908BEDC319A62F55ex02acampusfube_--

From jer15@hermes.cam.ac.uk Thu Jul 05 13:44:55 2012
Message-ID: <4FF57E32.2090402@mail.cryst.bbk.ac.uk>
Date: Thu, 05 Jul 2012 12:44:50 +0100
From: John Reid
To: SeqAn Development
In-Reply-To: <3C98A92F0EF87E41AFE85105908BEDC319A62F55@ex02a.campus.fu-berlin.de>
Content-Type: text/html; charset=ISO-8859-1
Subject: Re: [Seqan-dev] {Disarmed} Re: Performance advice for whole genome ESA

On 05/07/12 11:40, Siragusa, Enrico wrote:

On Jul 5, 2012, at 11:02 AM, John Reid wrote:

Great. That looks very helpful. So in your example, how do you arrive at 38Gb? You are using unsigned int instead of long unsigned int. Where does the unsigned char in the Fibre<>::Type come into the calculation? I think I need my code to handle sequence sets with more than 256 sequences. I'm guessing if I replace the unsigned char  with unsigned long I get back to 48Gb?

I counted 15n bytes: 1+4 bytes for suffix array values, 4 bytes for lcp values and 4 bytes for childtab values. Then for n equals 3Gbp you get roughly 38Gb.
Value sizes really depend on your input sequences. How many strings do you want to index and what is their maximum length?
Do you have enough memory available? Depending on your application another index could be more efficient...
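The arithmetic behind these estimates can be written down as a small sketch. It is plain C++ with hypothetical names (`EsaEntrySizes`, `kDefault64`, `kCompact`), and the split of "4 long ints per char" into 16 + 8 + 8 bytes is an assumption. Note that the component sizes listed in the reply (1+4, 4, and 4 bytes) add up to 13 bytes per character, while the reply quotes 15n; the sketch simply sums whatever sizes it is given, so either figure can be plugged in.

```cpp
#include <cstdint>

// Bytes stored per indexed character in each table of the enhanced suffix array.
struct EsaEntrySizes {
    std::uint64_t suffixArray;
    std::uint64_t lcp;
    std::uint64_t childtab;
};

constexpr std::uint64_t bytesPerChar(const EsaEntrySizes& s) {
    return s.suffixArray + s.lcp + s.childtab;
}

constexpr std::uint64_t indexBytes(const EsaEntrySizes& s, std::uint64_t textLength) {
    return bytesPerChar(s) * textLength;
}

// Decimal gigabytes (10^9 bytes).
constexpr double toGB(std::uint64_t bytes) {
    return static_cast<double>(bytes) / 1e9;
}

// Assumed split of "4 long ints per char": a pair of two 8-byte values in the
// suffix array plus one 8-byte value each for lcp and childtab.
constexpr EsaEntrySizes kDefault64{16, 8, 8};

// Sizes listed in the reply: 1+4 bytes, 4 bytes, and 4 bytes.
constexpr EsaEntrySizes kCompact{5, 4, 4};
```

For a text of 3,000,000,000 characters, `indexBytes(kDefault64, n)` gives 96,000,000,000 bytes, matching the ~96Gb figure, and `indexBytes(kCompact, n)` gives 39,000,000,000 bytes. Under the same model, replacing the 1-byte sequence number with an 8-byte one would correspond to `{12, 4, 4}`, i.e. 20 bytes per character.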

I want to index all the W-mers where the maximum for W in any given run of my application could be between 6 and 30. I need to iterate over them in a tree-based style, as my branch-and-bound algorithm ignores sets of W-mers based on their common prefix.
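The tree-style iteration described here can be sketched without SeqAn. The example below is only an illustration under simplifying assumptions: it sorts the suffixes naively and finds child ranges by a linear scan, where a real ESA would use its lcp and childtab tables. `buildSuffixArray`, `visitWords`, and `collectWords` are hypothetical names. The callback returns `false` to skip every longer word that shares the current prefix, which is the pruning step a branch-and-bound search needs.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <numeric>
#include <string>
#include <vector>

// Builds a suffix array by plain sorting; fine for a demo, far too slow for a genome.
std::vector<std::size_t> buildSuffixArray(const std::string& text) {
    std::vector<std::size_t> sa(text.size());
    std::iota(sa.begin(), sa.end(), std::size_t(0));
    std::sort(sa.begin(), sa.end(), [&text](std::size_t a, std::size_t b) {
        return text.compare(a, std::string::npos, text, b, std::string::npos) < 0;
    });
    return sa;
}

// Visits every distinct word of length <= maxLen in depth-first, lexicographic order.
// [lo, hi) is the suffix array range whose suffixes all start with `prefix`.
// visit(word, count) returns false to skip all longer words sharing that prefix.
void visitWords(const std::string& text, const std::vector<std::size_t>& sa,
                std::size_t lo, std::size_t hi, std::string& prefix, std::size_t maxLen,
                const std::function<bool(const std::string&, std::size_t)>& visit) {
    const std::size_t depth = prefix.size();
    if (depth == maxLen) return;
    std::size_t i = lo;
    // A suffix that ends exactly at this depth sorts first and has no next character.
    while (i < hi && sa[i] + depth >= text.size()) ++i;
    while (i < hi) {
        const char c = text[sa[i] + depth];
        std::size_t j = i;
        while (j < hi && text[sa[j] + depth] == c) ++j;
        prefix.push_back(c);
        if (visit(prefix, j - i))  // j - i is the number of occurrences
            visitWords(text, sa, i, j, prefix, maxLen, visit);
        prefix.pop_back();
        i = j;
    }
}

// Collects the visited words, pruning everything below skipPrefix.
// An empty skipPrefix disables pruning.
std::vector<std::string> collectWords(const std::string& text, std::size_t maxLen,
                                      const std::string& skipPrefix) {
    const std::vector<std::size_t> sa = buildSuffixArray(text);
    std::vector<std::string> words;
    std::string prefix;
    visitWords(text, sa, 0, sa.size(), prefix, maxLen,
               [&](const std::string& w, std::size_t) {
                   words.push_back(w);
                   return w != skipPrefix;
               });
    return words;
}
```

For the text `ACGACG` and a maximum length of 2, the traversal visits A, AC, C, CG, G, GA; pruning at the prefix `C` still reports C itself but skips CG.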

Thanks for the advice,
John.

Thanks,
John.


From Enrico.Siragusa@fu-berlin.de Thu Jul 05 16:09:04 2012
From: "Siragusa, Enrico"
To: SeqAn Development
Date: Thu, 5 Jul 2012 14:09:00 +0000
Message-ID: <3C98A92F0EF87E41AFE85105908BEDC319A63318@ex02a.campus.fu-berlin.de>
In-Reply-To: <4FF57E32.2090402@mail.cryst.bbk.ac.uk>
Content-Type: multipart/alternative; boundary="_000_3C98A92F0EF87E41AFE85105908BEDC319A63318ex02acampusfube_"
Subject: Re: [Seqan-dev] {Disarmed} Re: Performance advice for whole genome ESA

--_000_3C98A92F0EF87E41AFE85105908BEDC319A63318ex02acampusfube_
Content-Type: text/html; charset="us-ascii"

On Jul 5, 2012, at 1:44 PM, John Reid wrote:


On 05/07/12 11:40, Siragusa, Enrico wrote:

On Jul 5, 2012, at 11:02 AM, John Reid wrote:

Great. That looks very helpful. So in your example, how do you arrive at 38Gb? You are using unsigned int instead of long unsigned int. Where does the unsigned char in the Fibre<>::Type come into the calculation? I think I need my code to handle sequence sets with more than 256 sequences. I'm guessing if I replace the unsigned char with unsigned long I get back to 48Gb?

I counted 15n bytes: 1+4 bytes for suffix array values, 4 bytes for lcp values and 4 bytes for childtab values. Then for n equals 3Gbp you get roughly 38Gb.
Value sizes really depend on your input sequences. How many strings do you want to index and what is their maximum length?
Do you have enough memory available? Depending on your application another index could be more efficient...

I want to index all the W-mers where the maximum for W in any given run of my application could be between 6 and 30. I need to iterate over them in a tree-based style, as my branch-and-bound algorithm ignores sets of W-mers based on their common prefix.

Ok the values in the code snippet should work for any genome, i.e. they work fine for hg18/hg19.

Concerning index construction: if I am right, the Esa for StringSets should be built on external memory by default.
Concerning index querying: if you don't have 40Gb of memory, then overload fibres to be memory mapped (as in the commented line in the code snippet). In this way only a small part of the index will be kept in memory.
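What "only a small part will be kept in memory" means can be illustrated with the operating system's own memory mapping. The sketch below is POSIX-only, uses hypothetical names (`MappedTable`, `writeTable`), and is not SeqAn's `MMap` string; it maps a file of raw 32-bit values and lets the OS page in only the parts that are actually read.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Read-only view of a file holding raw 32-bit values (POSIX only).
class MappedTable {
public:
    explicit MappedTable(const std::string& path) {
        const int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) throw std::runtime_error("cannot open " + path);
        struct stat st;
        if (::fstat(fd, &st) != 0 || st.st_size == 0) {
            ::close(fd);
            throw std::runtime_error("cannot stat, or empty file: " + path);
        }
        bytes_ = static_cast<std::size_t>(st.st_size);
        void* p = ::mmap(nullptr, bytes_, PROT_READ, MAP_PRIVATE, fd, 0);
        ::close(fd);  // the mapping stays valid after the descriptor is closed
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed for " + path);
        data_ = static_cast<const std::uint32_t*>(p);
    }
    ~MappedTable() { ::munmap(const_cast<std::uint32_t*>(data_), bytes_); }
    MappedTable(const MappedTable&) = delete;
    MappedTable& operator=(const MappedTable&) = delete;

    std::size_t size() const { return bytes_ / sizeof(std::uint32_t); }

    // Reading an entry makes the OS load only the page that holds it.
    std::uint32_t operator[](std::size_t i) const { return data_[i]; }

private:
    const std::uint32_t* data_ = nullptr;
    std::size_t bytes_ = 0;
};

// Writes values as raw 32-bit integers (helper for trying the sketch out).
void writeTable(const std::string& path, const std::vector<std::uint32_t>& values) {
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(std::uint32_t)));
}
```

Whether this pays off depends on the access pattern: a traversal that stays near the top of the tree touches few pages, while random access across the whole table will keep the disk busy.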

Alternatively, if you only need the top of the tree along with some sparse subtrees, you could try using a lazy suffix tree (Wotd index in SeqAn) instead of the Esa.
The Wotd provides the same iterator interface as the Esa. Moreover, you can overload the Wotd FibreSA metafunction in exactly the same way.
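The idea of a lazy suffix tree is that a node's children are computed only when the traversal first asks for them. A minimal sketch of that idea follows; it is an uncompressed trie with one character per edge, written in plain C++ with a hypothetical `LazyNode` class, and it is not how SeqAn's Wotd index is implemented.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// A node of a lazily built suffix trie: children are computed on first access.
// The text must outlive all nodes, because only a reference is stored.
class LazyNode {
public:
    // Root node: represents the empty word and holds every suffix.
    explicit LazyNode(const std::string& text) : text_(text), depth_(0) {
        for (std::size_t i = 0; i < text.size(); ++i) suffixes_.push_back(i);
    }

    LazyNode(const std::string& text, std::vector<std::size_t> suffixes, std::size_t depth)
        : text_(text), suffixes_(std::move(suffixes)), depth_(depth) {}

    // Number of occurrences of the word this node represents.
    std::size_t count() const { return suffixes_.size(); }

    bool expanded() const { return expanded_; }

    // Returns the child reached by c, or nullptr; expands this node if needed.
    LazyNode* child(char c) {
        expand();
        auto it = children_.find(c);
        return it == children_.end() ? nullptr : it->second.get();
    }

private:
    // Buckets this node's suffixes by their next character.
    void expand() {
        if (expanded_) return;
        std::map<char, std::vector<std::size_t>> buckets;
        for (std::size_t s : suffixes_)
            if (s + depth_ < text_.size())
                buckets[text_[s + depth_]].push_back(s);
        for (auto& b : buckets)
            children_[b.first].reset(new LazyNode(text_, std::move(b.second), depth_ + 1));
        expanded_ = true;
    }

    const std::string& text_;
    std::vector<std::size_t> suffixes_;
    std::size_t depth_;
    bool expanded_ = false;
    std::map<char, std::unique_ptr<LazyNode>> children_;
};
```

Subtrees that a branch-and-bound search prunes are never expanded, so their cost is limited to the bucket of suffix positions created when the parent was expanded.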

Or if you are very limited by memory you might want to try the FM-Index (it is not yet in the core library).
The constructed FM-Index would fit into 3 Gb of memory.
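For readers who have not met the FM-index, the following sketch shows the backward search it is built on. It is a toy version under stated assumptions: the Burrows-Wheeler transform is built by naive suffix sorting, rank queries are linear scans, and `TinyFmIndex` is a hypothetical class unrelated to SeqAn's implementation. A practical FM-index gets its small footprint from storing the transformed text plus sampled rank tables instead of a full suffix array.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

class TinyFmIndex {
public:
    // text must not contain the sentinel character '$'.
    explicit TinyFmIndex(std::string text) {
        text.push_back('$');  // sentinel, smaller than A, C, G, T in ASCII
        const std::size_t n = text.size();
        std::vector<std::size_t> sa(n);
        std::iota(sa.begin(), sa.end(), std::size_t(0));
        std::sort(sa.begin(), sa.end(), [&text](std::size_t a, std::size_t b) {
            return text.compare(a, std::string::npos, text, b, std::string::npos) < 0;
        });
        // BWT: the character preceding each sorted suffix.
        bwt_.resize(n);
        for (std::size_t i = 0; i < n; ++i)
            bwt_[i] = text[(sa[i] + n - 1) % n];
        // first_[c]: number of characters in the text that are smaller than c.
        std::array<std::size_t, 256> freq{};
        for (unsigned char ch : text) ++freq[ch];
        std::size_t sum = 0;
        for (std::size_t c = 0; c < 256; ++c) {
            first_[c] = sum;
            sum += freq[c];
        }
    }

    // Counts occurrences of a non-empty pattern by backward search.
    std::size_t count(const std::string& pattern) const {
        std::size_t lo = 0, hi = bwt_.size();
        for (std::size_t k = pattern.size(); k-- > 0 && lo < hi;) {
            const unsigned char c = static_cast<unsigned char>(pattern[k]);
            lo = first_[c] + rank(c, lo);
            hi = first_[c] + rank(c, hi);
        }
        return hi > lo ? hi - lo : 0;
    }

private:
    // Occurrences of c in bwt_[0, pos); a linear scan keeps the sketch short.
    std::size_t rank(unsigned char c, std::size_t pos) const {
        return static_cast<std::size_t>(
            std::count(bwt_.begin(), bwt_.begin() + pos, static_cast<char>(c)));
    }

    std::string bwt_;
    std::array<std::size_t, 256> first_{};
};
```

The trade-off relative to the Esa is that backward search extends patterns to the left, so a prefix-oriented tree traversal needs extra care (for example, indexing the reversed text).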

Thanks for the advice,
John.

Thanks,
John.


--_000_3C98A92F0EF87E41AFE85105908BEDC319A63318ex02acampusfube_--

From jer15@hermes.cam.ac.uk Thu Jul 05 16:30:46 2012
Message-ID: <4FF5A513.3020209@mail.cryst.bbk.ac.uk>
Date: Thu, 05 Jul 2012 15:30:43 +0100
From: John Reid
To: seqan-dev@lists.fu-berlin.de
In-Reply-To: <3C98A92F0EF87E41AFE85105908BEDC319A63318@ex02a.campus.fu-berlin.de>
Content-Type: text/html; charset=ISO-8859-1
Subject: Re: [Seqan-dev] {Disarmed} Re: Performance advice for whole genome ESA

On 05/07/12 15:09, Siragusa, Enrico wrote:
> On Jul 5, 2012, at 1:44 PM, John Reid wrote:
>
>> On 05/07/12 11:40, Siragusa, Enrico wrote:
>>> On Jul 5, 2012, at 11:02 AM, John Reid wrote:
>>>
>>>> Great. That looks very helpful. So in your example, how do you arrive at 38Gb? You are using unsigned int instead of long unsigned int. Where does the unsigned char in the Fibre<>::Type come into the calculation? I think I need my code to handle sequence sets with more than 256 sequences. I'm guessing if I replace the unsigned char with unsigned long I get back to 48Gb?
>>>
>>> I counted 15n bytes: 1+4 bytes for suffix array values, 4 bytes for lcp values and 4 bytes for childtab values. Then for n equals 3Gbp you get roughly 38Gb.
>>> Value sizes really depend on your input sequences. How many strings do you want to index and which is their maximum length?
>>> Do you dispose of enough memory? Depending on your application another index could be more efficient...
>>
>> I want to index all the W-mers where the maximum for W in any given run of my application could be between 6 and 30. I need to iterate over them in a tree-based style, as my branch-and-bound algorithm ignores sets of W-mers based on their common prefix.
>
> Ok the values in the code snippet should work for any genome, i.e. they work fine for hg18/hg19.
>
> Concerning index construction: if I am right, the Esa for StringSets should be built on external memory by default.
> Concerning index querying: if you don't have 40Gb of memory, then overload fibres to be memory mapped (as in the commented line in the code snippet). In this way only a small part of the index will be kept in memory.

I do have 40Gb of memory.

> Alternatively, if you only need the top of the tree along with some sparse subtrees, you could try using a lazy suffix tree (Wotd index in SeqAn) instead of the Esa.
> The Wotd provides the same iterators interface as the Esa. Moreover, you can overload the Wotd FibreSA metafunction exactly in the same way.

I have to iterate over the tree many times, each time with sparse subtrees. Over many iterations I should visit all of the nodes to a given depth but I will try both to see what works best.

> Or if you are very limited by memory you might want to try the FM-Index (it is not yet in the core library).
> The constructed FM-Index would fit into 3 Gb of memory.

This sounds interesting. I'll give it a go if none of the above works but it sounds like I won't need to.

Thanks again,
John.
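
The memory estimate quoted above can be checked with a few lines of arithmetic. The sketch below is plain Python rather than SeqAn code. The table sizes are the ones itemized in the thread (a 1+4 byte suffix array entry, assumed here to be stored without padding, plus 4 bytes each for lcp and childtab), and the genome length of 3.1 Gbp is an assumption, since the thread only says "3Gbp". Note that the itemized sizes add up to 13 bytes per character rather than the quoted 15; the thread does not say where the other 2 bytes would come from, but 13 bytes per character does land close to the quoted 38Gb.

```python
# Back-of-the-envelope ESA memory estimate (illustrative only; not SeqAn code).
# Per-character table sizes are the ones itemized in the thread.
BYTES_PER_CHAR = {
    "suffix array (unsigned char + unsigned int pair, no padding assumed)": 1 + 4,
    "lcp table (unsigned int)": 4,
    "childtab (unsigned int)": 4,
}


def esa_bytes(n_chars, bytes_per_char=BYTES_PER_CHAR):
    """Total bytes for the three tables over a text of n_chars characters."""
    return n_chars * sum(bytes_per_char.values())


def to_gib(n_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    n = 3_100_000_000  # assumed genome length; the thread only says "3Gbp"
    print(sum(BYTES_PER_CHAR.values()), "bytes per character")
    print(round(to_gib(esa_bytes(n)), 1), "GiB for the three tables")
```

Under these assumptions the three tables come to a little under 38 GiB, so they fit in 40Gb of memory only with little headroom for the text itself and for working memory.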
From jer15@hermes.cam.ac.uk Fri Jul 06 14:56:02 2012 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1Sn84n-0007gP-Mv>; Fri, 06 Jul 2012 14:56:01 +0200 Received: from ppsw-43.csi.cam.ac.uk ([131.111.8.143]) by relay1.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1Sn84n-0008LX-HS>; Fri, 06 Jul 2012 14:56:01 +0200 X-Cam-AntiVirus: no malware found X-Cam-SpamDetails: not scanned X-Cam-ScannerInfo: http://www.cam.ac.uk/cs/email/scanner/ Received: from cpc6-dals15-2-0-cust115.hari.cable.virginmedia.com ([82.35.196.116]:58285 helo=[192.168.1.4]) by ppsw-43.csi.cam.ac.uk (smtp.hermes.cam.ac.uk [131.111.8.159]:587) with esmtpsa (PLAIN:jer15) (TLSv1:DHE-RSA-CAMELLIA256-SHA:256) id 1Sn84l-0007EM-oa (Exim 4.72) for seqan-dev@lists.fu-berlin.de (return-path ); Fri, 06 Jul 2012 13:55:59 +0100 Message-ID: <4FF6E05E.9030500@mail.cryst.bbk.ac.uk> Date: Fri, 06 Jul 2012 13:55:58 +0100 From: John Reid User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: SeqAn Development References: <4FE33EC6.9070906@mail.cryst.bbk.ac.uk>, <4FE9C513.1000204@mail.cryst.bbk.ac.uk> <4FF54667.1000203@mail.cryst.bbk.ac.uk> <3C98A92F0EF87E41AFE85105908BEDC319A62CE4@ex02a.campus.fu-berlin.de> <4FF5582C.5020809@mail.cryst.bbk.ac.uk> <3C98A92F0EF87E41AFE85105908BEDC319A62F55@ex02a.campus.fu-berlin.de> In-Reply-To: <3C98A92F0EF87E41AFE85105908BEDC319A62F55@ex02a.campus.fu-berlin.de> Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: "J.E. 
Reid" X-Originating-IP: 131.111.8.143 X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1341579361-00000D73-B9E59B88/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.158442, version=1.2.2 X-Spam-Flag: NO X-Spam-Checker-Version: SpamAssassin 3.0.4 on Dschibuti.ZEDAT.FU-Berlin.DE X-Spam-Level: x X-Spam-Status: No, score=1.2 required=5.0 tests=HTML_40_50,HTML_MESSAGE, MIME_HTML_ONLY Subject: Re: [Seqan-dev] {Disarmed} Re: Performance advice for whole genome ESA X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Jul 2012 12:56:02 -0000
On 05/07/12 11:40, Siragusa, Enrico wrote:
> On Jul 5, 2012, at 11:02 AM, John Reid wrote:
>
>> Great. That looks very helpful. So in your example, how do you arrive at 38Gb? You are using unsigned int instead of long unsigned int. Where does the unsigned char in the Fibre<>::Type come into the calculation? I think I need my code to handle sequence sets with more than 256 sequences. I'm guessing if I replace the unsigned char with unsigned long I get back to 48Gb?
>
> I counted 15n bytes: 1+4 bytes for suffix array values, 4 bytes for lcp values and 4 bytes for childtab values. Then for n equals 3Gbp you get roughly 38Gb.
> Value sizes really depend on your input sequences. How many strings do you want to index and which is their maximum length?
> Do you dispose of enough memory? Depending on your application another index could be more efficient...

I have one more question if you don't mind. What are the constraints on the string sets I can pass to an index for which I have overloaded these types?

For example, I think the types in the FibreSA String of Pairs limit the number of sequences and then the number of items in each sequence. But what about the types for the FibreLcp and the FibreChildtab? Do these need to be the same as the second type in the FibreSA Pair or do they relate to something different? Sorry my knowledge of the internals of suffix arrays is not up to speed.

Thanks,
John.
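
John's reading of the FibreSA pair can be made concrete by checking value ranges. The sketch below is plain Python, not SeqAn code. The byte widths are those typical of 64-bit Linux (an assumption, since C++ only fixes minimum widths), and `capacity` and `sa_pair_fits` are hypothetical helper names. It treats the first pair member as a string index and the second as a suffix position, as described in the question above.

```python
# Value ranges of unsigned integer types, using byte widths common on
# 64-bit Linux (an assumption; C++ only guarantees minimum widths).
WIDTH_BYTES = {
    "unsigned char": 1,
    "unsigned short": 2,
    "unsigned int": 4,
    "unsigned long": 8,
}


def capacity(type_name):
    """Number of distinct values an unsigned type of this width can hold."""
    return 2 ** (8 * WIDTH_BYTES[type_name])


def sa_pair_fits(seq_type, pos_type, n_sequences, max_length):
    """True if (string index, suffix position) pairs can address the whole set.

    String indices run from 0 to n_sequences - 1 and suffix positions from
    0 to max_length - 1, so each count must not exceed the type's capacity.
    """
    return n_sequences <= capacity(seq_type) and max_length <= capacity(pos_type)


if __name__ == "__main__":
    print(sa_pair_fits("unsigned char", "unsigned int", 256, 2**32))
    print(sa_pair_fits("unsigned char", "unsigned int", 257, 1000))
```

The second call is the 256-sequence limit John is worried about. Widening the first member to unsigned short would lift it to 65,536 sequences, at the cost of one extra byte per suffix array entry if the pair is stored without padding.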
From Enrico.Siragusa@fu-berlin.de Fri Jul 06 19:51:42 2012 Received: from outpost1.zedat.fu-berlin.de ([130.133.4.66]) by list1.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1SnCgu-00085O-Pn>; Fri, 06 Jul 2012 19:51:40 +0200 Received: from relay2.zedat.fu-berlin.de ([130.133.4.80]) by outpost1.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1SnCgu-00063o-N5>; Fri, 06 Jul 2012 19:51:40 +0200 Received: from cas3.campus.fu-berlin.de ([130.133.170.203]) by relay2.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1SnCgu-0007eg-H8>; Fri, 06 Jul 2012 19:51:40 +0200 Received: from EX02A.campus.fu-berlin.de ([130.133.170.132]) by CAS3.campus.fu-berlin.de ([130.133.170.203]) with mapi id 14.02.0309.002; Fri, 6 Jul 2012 19:51:39 +0200 From: "Siragusa, Enrico" To: SeqAn Development Thread-Topic: [Seqan-dev] {Disarmed} Re: Performance advice for whole genome ESA Thread-Index: AQHNWoKFZMXpZwtyfkq1/7Uj9mqBxJcaOWwAgAAJ3wCAABtVAIABuDMAgABSm4A= Date: Fri, 6 Jul 2012 17:51:38 +0000 Message-ID: <3C98A92F0EF87E41AFE85105908BEDC319A64998@ex02a.campus.fu-berlin.de> References: <4FE33EC6.9070906@mail.cryst.bbk.ac.uk>, <4FE9C513.1000204@mail.cryst.bbk.ac.uk> <4FF54667.1000203@mail.cryst.bbk.ac.uk> <3C98A92F0EF87E41AFE85105908BEDC319A62CE4@ex02a.campus.fu-berlin.de> <4FF5582C.5020809@mail.cryst.bbk.ac.uk> <3C98A92F0EF87E41AFE85105908BEDC319A62F55@ex02a.campus.fu-berlin.de> <4FF6E05E.9030500@mail.cryst.bbk.ac.uk> In-Reply-To: <4FF6E05E.9030500@mail.cryst.bbk.ac.uk> Accept-Language: en-US, de-DE Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: Content-Type: text/plain; charset="us-ascii" Content-ID: <06B754600146AD4F83747253457A428F@campus.fu-berlin.de> Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Originating-IP: 130.133.170.203 X-purgate: clean X-purgate-type: clean X-purgate-ID: 
151147::1341597100-00000D73-AE07E338/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.032291, version=1.2.2 X-Spam-Flag: NO X-Spam-Checker-Version: SpamAssassin 3.0.4 on Burundi.ZEDAT.FU-Berlin.DE X-Spam-Level: X-Spam-Status: No, score=-2.8 required=5.0 tests=ALL_TRUSTED Subject: Re: [Seqan-dev] {Disarmed} Re: Performance advice for whole genome ESA X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Jul 2012 17:51:42 -0000

On 6 Jul 2012, at 14:55, John Reid wrote:

>
> On 05/07/12 11:40, Siragusa, Enrico wrote:
>>
>> On Jul 5, 2012, at 11:02 AM, John Reid wrote:
>>
>>> Great. That looks very helpful. So in your example, how do you arrive at 38Gb? You are using unsigned int instead of long unsigned int. Where does the unsigned char in the Fibre<>::Type come into the calculation? I think I need my code to handle sequence sets with more than 256 sequences. I'm guessing if I replace the unsigned char with unsigned long I get back to 48Gb?
>>
>> I counted 15n bytes: 1+4 bytes for suffix array values, 4 bytes for lcp values and 4 bytes for childtab values. Then for n equals 3Gbp you get roughly 38Gb.
>> Value sizes really depend on your input sequences. How many strings do you want to index and which is their maximum length?
>> Do you dispose of enough memory? Depending on your application another index could be more efficient...
>>
> I have one more question if you don't mind. What are the constraints on the string sets I can pass to an index for which I have overloaded these types?
>
> For example, I think the types in the FibreSA String of Pairs limit the number of sequences and then the number of items in each sequence. But what about the types for the FibreLcp and the FibreChildtab? Do these need to be the same as the second type in the FibreSA Pair or do they relate to something different? Sorry my knowledge of the internals of suffix arrays is not up to speed.

Here [1] is a fast introduction to the Esa.

Visually the suffix array represents the leaves of the suffix tree. It stores the position of all suffixes of the text sorted in lexicographical order. Then the max value in the suffix array equals the length of the text.
For a text consisting of a set of strings we build a generalized suffix array containing pairs (i,j) where i is the index of the string in the set and j is the suffix position.
If you overload the FibreSA with (unsigned char, unsigned int) the Esa will be able to index a set of up to 256 strings, each up to 2^32 characters long.
In this way you can index hg18/19 and any other genome containing not more than 256 contigs.

The lcp table stores the longest common prefix of all pairs of adjacent suffixes. Visually lcp values represent edge lengths in the suffix tree.
The max lcp value corresponds to the length of the longest path from the root to a branching node, which can be at most the length of the text.
With some tricks we could spare some space here. If you only visit the top of the suffix tree up to depth 20 or so, then all big lcp values > 20 are superfluous.
Unfortunately all lcp values are needed at construction time to build the childtab, so you cannot simply redefine FibreLcp to use unsigned chars.
In practice we could spare 9Gb here, but actually in SeqAn there is no easy way to do it...

The childtab stores information for tree traversal, i.e. it represents parent-siblings-children of each node.
Childtab values store table intervals, therefore the max value does not exceed the total length of the indexed text.
In practice here you need a type able to index the whole array, i.e. unsigned int for big arrays up to 2^32 elements is enough.

If I understood correctly, you want to index a reference genome and not a large set of short sequences.
If you want to do the latter, then of course you have to use different data types.
For example, if you want to index up to 16M reads not longer than 256bp you can set:
FibreSA: (unsigned short, unsigned char)
FibreLcp: unsigned char
FibreChildtab: unsigned int

Ciao,
Enrico

[1] http://theorie.informatik.uni-ulm.de/Personen/mibrahim/TheEnahancedSuffixArray.pdf

> Thanks,
> John.
> _______________________________________________
> seqan-dev mailing list
> seqan-dev@lists.fu-berlin.de
> https://lists.fu-berlin.de/listinfo/seqan-dev

From jer15@hermes.cam.ac.uk Sat Jul 07 10:01:35 2012 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1SnPxN-0006jA-Ky>; Sat, 07 Jul 2012 10:01:33 +0200 Received: from ppsw-41.csi.cam.ac.uk ([131.111.8.141]) by relay1.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1SnPxN-00042u-FM>; Sat, 07 Jul 2012 10:01:33 +0200 X-Cam-AntiVirus: no malware found X-Cam-SpamDetails: not scanned X-Cam-ScannerInfo: http://www.cam.ac.uk/cs/email/scanner/ Received: from cpc6-dals15-2-0-cust115.hari.cable.virginmedia.com ([82.35.196.116]:51166 helo=[192.168.1.4]) by ppsw-41.csi.cam.ac.uk (smtp.hermes.cam.ac.uk [131.111.8.156]:587) with esmtpsa (PLAIN:jer15) (TLSv1:DHE-RSA-CAMELLIA256-SHA:256) id 1SnPxL-0003GD-SD (Exim 4.72) for seqan-dev@lists.fu-berlin.de (return-path ); Sat, 07 Jul 2012 09:01:32 +0100 Message-ID: <4FF7ECDA.7070502@mail.cryst.bbk.ac.uk> Date: Sat, 07 Jul 2012 09:01:30 +0100 From: John Reid User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: SeqAn Development References: <4FE33EC6.9070906@mail.cryst.bbk.ac.uk>, <4FE9C513.1000204@mail.cryst.bbk.ac.uk> <4FF54667.1000203@mail.cryst.bbk.ac.uk>
<3C98A92F0EF87E41AFE85105908BEDC319A62CE4@ex02a.campus.fu-berlin.de> <4FF5582C.5020809@mail.cryst.bbk.ac.uk> <3C98A92F0EF87E41AFE85105908BEDC319A62F55@ex02a.campus.fu-berlin.de> <4FF6E05E.9030500@mail.cryst.bbk.ac.uk> <3C98A92F0EF87E41AFE85105908BEDC319A64998@ex02a.campus.fu-berlin.de> In-Reply-To: <3C98A92F0EF87E41AFE85105908BEDC319A64998@ex02a.campus.fu-berlin.de> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: "J.E. Reid" X-Originating-IP: 131.111.8.141 X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1341648093-00000D73-D5F7AE72/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.030479, version=1.2.2 X-Spam-Flag: NO X-Spam-Checker-Version: SpamAssassin 3.0.4 on Botsuana.ZEDAT.FU-Berlin.DE X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none Subject: Re: [Seqan-dev] {Disarmed} Re: Performance advice for whole genome ESA X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 07 Jul 2012 08:01:36 -0000

On 06/07/12 18:51, Siragusa, Enrico wrote:
> On 6 Jul 2012, at 14:55, John Reid wrote:
>
>> On 05/07/12 11:40, Siragusa, Enrico wrote:
>>> On Jul 5, 2012, at 11:02 AM, John Reid wrote:
>>>
>>>> Great. That looks very helpful. So in your example, how do you arrive at 38Gb? You are using unsigned int instead of long unsigned int. Where does the unsigned char in the Fibre<>::Type come into the calculation? I think I need my code to handle sequence sets with more than 256 sequences. I'm guessing if I replace the unsigned char with unsigned long I get back to 48Gb?
>>> I counted 15n bytes: 1+4 bytes for suffix array values, 4 bytes for lcp values and 4 bytes for childtab values. Then for n equals 3Gbp you get roughly 38Gb.
>>> Value sizes really depend on your input sequences. How many strings do you want to index and which is their maximum length?
>>> Do you dispose of enough memory? Depending on your application another index could be more efficient...
>>>
>> I have one more question if you don't mind. What are the constraints on the string sets I can pass to an index for which I have overloaded these types?
>>
>> For example, I think the types in the FibreSA String of Pairs limit the number of sequences and then the number of items in each sequence. But what about the types for the FibreLcp and the FibreChildtab? Do these need to be the same as the second type in the FibreSA Pair or do they relate to something different? Sorry my knowledge of the internals of suffix arrays is not up to speed.
> Here [1] is a fast introduction to the Esa.
>
> Visually the suffix array represents the leaves of the suffix tree. It stores the position of all suffixes of the text sorted in lexicographical order. Then the max value in the suffix array equals the length of the text.
> For a text consisting of a set of strings we build a generalized suffix array containing pairs (i,j) where i is the index of the string in the set and j is the suffix position.
> If you overload the FibreSA with (unsigned char, unsigned int) the Esa will be able to index a set of up to 256 strings, each up to 2^32 characters long.
> In this way you can index hg18/19 and any other genome containing not more than 256 contigs.
>
> The lcp table stores the longest common prefix of all pairs of adjacent suffixes. Visually lcp values represent edge lengths in the suffix tree.
> The max lcp value corresponds to the length of the longest path from the root to a branching node, which can be at most the length of the text.
> With some tricks we could spare some space here. If you only visit the top of the suffix tree up to depth 20 or so, then all big lcp values > 20 are superfluous.
> Unfortunately all lcp values are needed at construction time to build the childtab, so you cannot simply redefine FibreLcp to use unsigned chars.
> In practice we could spare 9Gb here, but actually in SeqAn there is no easy way to do it...
>
> The childtab stores information for tree traversal, i.e. it represents parent-siblings-children of each node.
> Childtab values store table intervals, therefore the max value does not exceed the total length of the indexed text.
> In practice here you need a type able to index the whole array, i.e. unsigned int for big arrays up to 2^32 elements is enough.
>
> If I understood correctly, you want to index a reference genome and not a large set of short sequences.
> If you want to do the latter, then of course you have to use different data types.
> For example, if you want to index up to 16M reads not longer than 256bp you can set:
> FibreSA: (unsigned short, unsigned char)
> FibreLcp: unsigned char
> FibreChildtab: unsigned int
>
> Ciao,
> Enrico
>
> [1] http://theorie.informatik.uni-ulm.de/Personen/mibrahim/TheEnahancedSuffixArray.pdf
>

That's very helpful.

Thanks,
John.
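
To make the description of the tables concrete, here is a toy construction of a generalized suffix array and its lcp table for a two-string set. It is a naive Python sketch for illustration only: suffixes are sorted by direct string comparison, no childtab is built, and ties between identical suffixes are broken by pair order, which is an arbitrary choice. SeqAn's construction algorithms are different, and the function names are made up for this example.

```python
def generalized_suffix_array(strings):
    """Sorted list of (string index, suffix position) pairs, built naively."""
    pairs = [(i, j) for i, s in enumerate(strings) for j in range(len(s))]
    # Sort by the suffix text; identical suffixes are ordered by their pair.
    return sorted(pairs, key=lambda p: (strings[p[0]][p[1]:], p))


def lcp_table(strings, sa):
    """lcp[k] is the common-prefix length of suffixes sa[k-1] and sa[k]; lcp[0] = 0."""

    def suffix(pair):
        i, j = pair
        return strings[i][j:]

    def common_prefix_length(a, b):
        n = 0
        while n < min(len(a), len(b)) and a[n] == b[n]:
            n += 1
        return n

    return [0] + [
        common_prefix_length(suffix(sa[k - 1]), suffix(sa[k]))
        for k in range(1, len(sa))
    ]


if __name__ == "__main__":
    strings = ["ACGT", "ACG"]
    sa = generalized_suffix_array(strings)
    lcp = lcp_table(strings, sa)
    for pair, value in zip(sa, lcp):
        i, j = pair
        print(pair, value, strings[i][j:])
```

In this example no suffix position or lcp value exceeds the length of the longest string, which is why the position and lcp types only need to cover the text length. One caveat on the short-read settings quoted above: a 16-bit unsigned short distinguishes 65,536 string indices, so indexing 16M reads would appear to need a wider first pair member; the thread does not address this.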
From Sabrina.Krakau@fu-berlin.de Mon Jul 16 09:34:22 2012 Received: from outpost1.zedat.fu-berlin.de ([130.133.4.66]) by list1.zedat.fu-berlin.de (Exim 4.69) with esmtp (envelope-from ) id <1Sqfoy-0002js-26>; Mon, 16 Jul 2012 09:34:20 +0200 Received: from relay2.zedat.fu-berlin.de ([130.133.4.80]) by outpost1.zedat.fu-berlin.de (Exim 4.69) with esmtp (envelope-from ) id <1Sqfox-0008G7-Ul>; Mon, 16 Jul 2012 09:34:20 +0200 Received: from cas1.campus.fu-berlin.de ([130.133.170.201]) by relay2.zedat.fu-berlin.de (Exim 4.69) with esmtp (envelope-from ) id <1Sqfox-0003sg-K5>; Mon, 16 Jul 2012 09:34:19 +0200 Received: from EX03A.campus.fu-berlin.de ([130.133.170.134]) by CAS1.campus.fu-berlin.de ([130.133.170.201]) with mapi id 14.02.0309.002; Mon, 16 Jul 2012 09:34:16 +0200 From: "Krakau, Sabrina" To: "Krakau, Sabrina" Thread-Topic: SeqAn - BioStore Workshop 2012, Berlin, September the 4th - 6th Thread-Index: AQHNYyVnf5mz6lZmG0WpiiTNTEE0WQ== Date: Mon, 16 Jul 2012 07:34:16 +0000 Message-ID: Accept-Language: en-US, de-DE Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: Content-Type: multipart/related; boundary="_005_CBC5629F5E78A84A853AD8A3D5AF81BF1E28D863ex03acampusfube_"; type="multipart/alternative" MIME-Version: 1.0 X-Originating-IP: 130.133.170.201 X-ZEDAT-Hint: A X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1342424060-00000D73-8A27772C/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.024591, version=1.2.2 X-Spam-Flag: NO X-Spam-Checker-Version: SpamAssassin 3.0.4 on Botsuana.ZEDAT.FU-Berlin.DE X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=ALL_TRUSTED, EXTRA_MPART_TYPE, HTML_MESSAGE Cc: AG ABI ABI , SeqAn Development , "seqan-interests@lists.fu-berlin.de" Subject: [Seqan-dev] SeqAn - BioStore Workshop 2012, Berlin, September the 4th - 6th X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 16 Jul 2012 07:34:22 -0000 --_005_CBC5629F5E78A84A853AD8A3D5AF81BF1E28D863ex03acampusfube_ Content-Type: multipart/alternative; boundary="_000_CBC5629F5E78A84A853AD8A3D5AF81BF1E28D863ex03acampusfube_" --_000_CBC5629F5E78A84A853AD8A3D5AF81BF1E28D863ex03acampusfube_ Content-Type: text/plain; charset="Windows-1252" Content-Transfer-Encoding: quoted-printable

Dear SeqAn Users and Developers,

Thank you for your interest and participation in our poll.
Based on your preferences, we have now scheduled the workshop in detail.
You can have a look at our webpage for an overview and detailed information:
http://www.seqan-biostore.de/wp/seqan-workshops/2012-seqan-workshop/schedule/

The preliminary schedule will include beginner's and advanced tutorials on the following topics:

    Beginners:
    • SeqAn Install Session, Basics, Sequences & Iterators, Basic Sequence I/O, Alignments & MSA, Indices, Fragment Store

    Advanced:
    • Input/Output and Writing Parsers, RNA-Seq with the FragmentStore, Sequence Compression by Journaling, Simple Bowtie

    Apart from the tutorials, the workshop will address:
    • Workflows in KNIME with SeqAn, detailed description of the following SeqAn apps:
      RazerS 3, Stellar, SplazerS, Masai, Mason, Rabema, SAK, SeqCons, SeqAn::T-Coffee, SnpStore, Command Line Parser

One highlight this year will be the social event:
After the second workshop day we will go together to Berlin-Kreuzberg, where we will have dinner on board the cruise ship "Philippa".
http://www.vanloon.de/kulinarische-rundfahrten/schiffe/

There will be a moderate workshop fee of 50,- € (15,- € for undergraduates), payable at the beginning of the workshop.

Action items:
1) Please send us an email to sabrina.krakau@fu-berlin by the 23rd of July to register for the workshop, so we can take the number of participants into account for our planning.
2) If you would like to give a presentation about your use of SeqAn or about a problem you need an efficient algorithm for, please let us know. We can accommodate this in our schedule.

We are looking forward to meeting you in September.

The SeqAn team

[cid:12D5553D-AD16-40D5-986F-BF6FABF0587B] [cid:53AF5887-D1F2-4BAA-BF6C-CF01BF1C6DB7]

Sabrina Krakau
Freie Universität Berlin
Institute of Computer Science
Algorithmic Bioinformatics - Project BioStore

Takustr. 9, 14195 Berlin
Telefon: +49 (0)30 838 75228

--_000_CBC5629F5E78A84A853AD8A3D5AF81BF1E28D863ex03acampusfube_ Content-Type: text/html; charset="Windows-1252" Content-ID: Content-Transfer-Encoding: quoted-printable
--_000_CBC5629F5E78A84A853AD8A3D5AF81BF1E28D863ex03acampusfube_-- --_005_CBC5629F5E78A84A853AD8A3D5AF81BF1E28D863ex03acampusfube_ Content-Type: image/png; name="BioStore-Logo-60.png" Content-Description: BioStore-Logo-60.png Content-Disposition: inline; filename="BioStore-Logo-60.png"; size=4697; creation-date="Mon, 16 Jul 2012 07:34:15 GMT"; modification-date="Mon, 16 Jul 2012 07:34:15 GMT" Content-ID: <12D5553D-AD16-40D5-986F-BF6FABF0587B> Content-Transfer-Encoding: base64 iVBORw0KGgoAAAANSUhEUgAAAGoAAAA8CAYAAACO9i99AAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJ bWFnZVJlYWR5ccllPAAAEftJREFUeNrsXAl0HOWRrr+7Z0YaaSRLlmTLhw4kywfGmNjG+HoxDg53 OEw2yQtvs84aJ9mQzb59OUnykpBASMwuS/BuAgmBJEACBJZwGOzFGBsb8IkNSD5ly/IhS7Jk69bM dPefqu5q6/doRhdax++5S688R/f801PfX1Vf1f+3xYy7ngYAEQChjRaaNg4fiwG0UiG0QnyO74sW AaISX2+TQuzDcy18D5w/fJT4CI7SAz9+BJF6AELN1ZBb+RRIzQBfXPEsEUc94qiEd1y7y5CQslAK mIVgLMTXX0NoghJgDR58AvV933znTrQ+jkVRa1CfRf0qAjgfpPw2AlaBr7eibkC90jfh3x+oRImh rka9GQGbIUG24eMb+Pp/Ucf5pjx/gFLlQ9TrMQzehlnqE/j8A9QbfHOef0B58hx5F4LVhM9fQv2i b9LzEyiievsxb12LehxBexTfuNU3698JKAxxYEsd4lYAYmYI4nbAfZ/ouUvRESy4g09/GLXUN+05 AorqI9M2HGAQDchKa4Hy/H0wp/QtmFiwGywZdGopRVbhic+iV+Xh8+/7pv3/qaPOiC01BCgAQd2G gkg9lObVwPjcIzAi/TSkGV2gayaCpINtB6G6aSIE9Jj68V+h930aUf4cFr8r8PUe38TDDBQZ3zIN yEzrhkl5B2BK4R4YldXgAGHbGh7XHA8jNRCsOSVvQF3rWIiaaaAJ2xtmmxDyCHrjePSsm/H1fYNv TUgOtr6cBRTlHWoBjYy0wOTCA1Ax+iCMCLc49nJDX7DXh+j93HATTB+7BTYeWoTeF/UOtQkJu0HI 8TjmHN+8wwjUlZPfgbRAFIrzjkNGsJPBCfT7QQJ4yuidsKdhMjR35jshkeU40Qv0qovwuU7OOgQm 6SOTSCYuK66CSWOqMcTFMYwFwbL7JYJkfJ1yWTjQiV61HfOaUBmiJSVRETkCX4XdMDYYpQfLD3+J QJH3xFGlHNAszge3v6d5xCMaD7p5pUdHkj/RgG5vffCafrKSc5UvQyl4J6L+AHUvuN12B9xjLeN5 ocMxMo4nS13Pkq14QncCiL2UvM+0dGeyxOwwBE7VQHrjbme5w5c+6HkKmYf6AOqXwV0OcTMJsr3y vL1Q04TYSIf5jcPaqow8CjE8LhnQVEL5kCAuyKzDOq0VbLyc7up3MfJhvjN8oAYL1I2of0D9B9Qd qEQDs1BbpNTiVGsZWhRzm47AyZkY7zIdMiFT11DkezErCPkZDTC7aBMU5x6EoBHD8kDAus0CTkHA SYS+DByoW1Cf5vbQelTqlHd64U9i5Nx19FKIxQNg6OQ8YglFNAIK/9mUbEAbASWgpo/ZhiC9DeFQ 
--_005_CBC5629F5E78A84A853AD8A3D5AF81BF1E28D863ex03acampusfube_ Content-Type: image/png; name="BMBF_CMYK_Gef_XXL_e-60.png" Content-Description: BMBF_CMYK_Gef_XXL_e-60.png Content-Disposition: inline; filename="BMBF_CMYK_Gef_XXL_e-60.png"; size=8173; creation-date="Mon, 16 Jul 2012 07:34:15 GMT"; modification-date="Mon, 16 Jul 2012 07:34:15 GMT" Content-ID: <53AF5887-D1F2-4BAA-BF6C-CF01BF1C6DB7> Content-Transfer-Encoding: base64
--_005_CBC5629F5E78A84A853AD8A3D5AF81BF1E28D863ex03acampusfube_-- From ZickmannF@rki.de Wed Jul 18 09:25:17 2012 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1SrOdH-0004iZ-Pf>; Wed, 18 Jul 2012 09:25:15 +0200 Received: from m3-bn.bund.de ([77.87.228.75]) by relay1.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1SrOdH-0000Vv-05>; Wed, 18 Jul 2012 09:25:15 +0200 Received: from m3.mfw.bn.ivbb.bund.de (localhost [127.0.0.1]) by m3-bn.bund.de (8.14.3/8.14.3) with ESMTP id q6I7PDee001970 for ; Wed, 18 Jul 2012 09:25:13 +0200 (CEST) Received: (from localhost) by m3.mfw.bn.ivbb.bund.de (MSCAN) id 4/m3.mfw.bn.ivbb.bund.de/smtp-gw/mscan; Wed Jul 18 09:25:13 2012 X-P350-Id: 33269244eb7ec2 From: "Zickmann, Franziska" To: SeqAn Development Date: Wed, 18 Jul 2012 09:25:03 +0200 Thread-Topic: Registrierung BioStore Workshop 2012 Thread-Index: Ac1ktbvOrEj4QJ0CQTipPDX0iHmN6A==
Message-ID: <76C738ACAD4EDA429A07ED71617581599084E39ABB@semail08.rki.local> Accept-Language: de-DE Content-Language: de-DE X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: de-DE x-tm-as-product-ver: SMEX-8.0.0.4177-6.500.1024-19048.004 x-tm-as-result: No--47.455500-0.000000-31 x-tm-as-user-approved-sender: Yes x-tm-as-user-blocked-sender: No Content-Type: text/plain; charset="Windows-1252" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Originating-IP: 77.87.228.75 X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1342596315-00000D73-55C66532/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000161, version=1.2.2 X-Spam-Flag: NO X-Spam-Checker-Version: SpamAssassin 3.0.4 on Gabun.ZEDAT.FU-Berlin.DE X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none Subject: [Seqan-dev] Registrierung BioStore Workshop 2012 X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 Jul 2012 07:25:17 -0000

Hi Sabrina,

I would like to register with you for the SeqAn-BioStore Workshop in September.

See you then, I'm already looking forward to it.

Best regards,
Franzi

----------------------------------------------------------------------------------------------
Franziska Zickmann
NG4 Bioinformatik
Robert Koch-Institut
Nordufer 20
13353 Berlin
Tel.: +49 (0)30 18754 2125

-----Original Message-----
From: Krakau, Sabrina [mailto:Sabrina.Krakau@fu-berlin.de]
Sent: 16 July 2012 09:34
To: Krakau, Sabrina
Cc: AG ABI ABI; SeqAn Development; seqan-interests@lists.fu-berlin.de
Subject: [Seqan-dev] SeqAn - BioStore Workshop 2012, Berlin, September the 4th - 6th

Dear SeqAn Users and Developers,

Thank you for your interest and participation in our poll.
Based on your preferences, we have now been able to schedule the workshop in detail.
You can have a look at our webpage for an overview and detailed information.
http://www.seqan-biostore.de/wp/seqan-workshops/2012-seqan-workshop/schedule/

The preliminary schedule will include beginner's and advanced tutorials on the following topics:

Beginners:
• SeqAn Install Session, Basics, Sequences & Iterators, Basic Sequence I/O, Alignments & MSA, Indices, Fragment Store

Advanced:
• Input/Output and Writing Parsers, RNA-Seq with the FragmentStore, Sequence Compression by Journaling, Simple Bowtie

Apart from the tutorials, the workshop will address:
• Workflows in KNIME with SeqAn, detailed description of the following SeqAn apps:
  RazerS 3, Stellar, SplazerS, Masai, Mason, Rabema, SAK, SeqCons, SeqAn::T-Coffee, SnpStore, Command Line Parser

One highlight this year will be the social event:
After the second workshop day we will go together to Berlin-Kreuzberg, where we will have dinner on board the cruise ship "Philippa".
http://www.vanloon.de/kulinarische-rundfahrten/schiffe/

There will be a moderate workshop fee of 50,- € (15,- € for undergraduates), which is to be paid at the beginning of the workshop.

Action items:
1) Please send us an email at sabrina.krakau@fu-berlin by the 23rd of July to register for the workshop, so we can take the number of participants into account for our planning.
2) If you would like to give a presentation about your use of SeqAn or about a problem you need an efficient algorithm for, please let us know. We can accommodate this in our schedule.

We are looking forward to meeting you in September.

The SeqAn team

Sabrina Krakau
Freie Universität Berlin
Institute of Computer Science
Algorithmic Bioinformatics - Project BioStore
Takustr.
9, 14195 Berlin Telefon: +49 (0)30 838 75228 From isaacyho@gmail.com Tue Jul 24 00:54:54 2012 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1StRWf-0000Oo-IE>; Tue, 24 Jul 2012 00:54:53 +0200 Received: from mail-qa0-f54.google.com ([209.85.216.54]) by relay1.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1StRWf-000350-BX>; Tue, 24 Jul 2012 00:54:53 +0200 Received: by qaat11 with SMTP id t11so1577166qaa.13 for ; Mon, 23 Jul 2012 15:54:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=NJFgvEYPir1nDgGXSiU4LgVctpcDRPvIYf/OM8AEq98=; b=Fcocvk8dkZ85fr4e+TdyeXjIhjzwRc1KvnFse/xFnPLBNWLwK/6WsnJiNogArd8N98 tM58ExuP7zKNjdw2LxG5xyqHdxE42q5Ao3WDpQbE4hDxrC26QyJSfJHICZOG5rOgEead SeJ7z9W10zDPYet6Lfc0IIDSnQzWYIgTHO6P8AcpJGQwL00jihLB8lMzvvZIgwTviZ7L CqeCf+OyYWVyCwXN5paviF/xqdNwNOAuGhrdGP4M3zIQAkPnU+8PAl6tN7EhVVQJJeMk kik2gI0NhKoU6qQ3DZYslByOYAoIDWs58+YPjl24/2mBBpwdw7QMUgx4YbknfvkBNFpn VU9Q== MIME-Version: 1.0 Received: by 10.224.0.202 with SMTP id 10mr27222160qac.5.1343084090962; Mon, 23 Jul 2012 15:54:50 -0700 (PDT) Received: by 10.229.13.130 with HTTP; Mon, 23 Jul 2012 15:54:50 -0700 (PDT) Date: Mon, 23 Jul 2012 15:54:50 -0700 Message-ID: From: Isaac Ho To: seqan-dev@lists.fu-berlin.de Content-Type: multipart/alternative; boundary=20cf30667c0d25330804c58722ad X-Originating-IP: 209.85.216.54 X-ZEDAT-Hint: A X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1343084093-00000D73-9A98C6D3/0-0/0-0 X-Bogosity: Unsure, tests=bogofilter, spamicity=0.488817, version=1.2.2 X-Spam-Flag: NO X-Spam-Checker-Version: SpamAssassin 3.0.4 on Benin.ZEDAT.FU-Berlin.DE X-Spam-Level: xx X-Spam-Status: No, score=2.2 required=5.0 tests=DNS_FROM_RFC_ABUSE, FU_BOGO_UNSURE,HTML_10_20,HTML_MESSAGE,RCVD_BY_IP,SPF_HELO_PASS, SPF_PASS 
Subject: [Seqan-dev] fast + memory efficient hashtable of sequences X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Jul 2012 22:54:54 -0000 --20cf30667c0d25330804c58722ad Content-Type: text/plain; charset=ISO-8859-1

I'm trying to store a hash table of SeqAn sequences but am having trouble getting it to be fast enough. I'm using boost::unordered_map< String< Dna > >. My comparison function looks like this:

Note: PackedMer in this case is actually just String< Dna >.

class CompareKeys
{
public:
  bool operator()( PackedMer *a, PackedMer *b ) const
  {
    return( *a == *b );
  }
};

My hash function:

class HashKeys
{
public:
  std::size_t operator()( PackedMer *a ) const
  {
    unsigned long hash = 5381;
    typedef Iterator< Mer >::Type TIterator;
    Mer s = *a;  // copies the whole sequence on every call
    for ( TIterator it = begin( s ); it != end( s ); ++it )
    {
      char ch = ( char ) value( it );
      hash = ((hash << 5) + hash) + ( int ) ch;  // hash * 33 + ch
    }
    return hash;
  }
};

I simply took my code that was using "const char *" as "packedMer" and replaced it with the SeqAn equivalent. Compared with const char *, the program runs at least 20x slower. Any ideas? I've narrowed the bottlenecks down to these two functions. It makes sense that these might be slow, but what would be a good workaround? (I need a fast way to compute a hash value and a fast way to test keys for equality.)

Thanks,

Isaac
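
One possible direction for the question above, offered only as a sketch: assuming every key is a DNA k-mer of one fixed length k <= 32, each k-mer can be packed into a single 64-bit integer (two bits per base). Hashing then becomes an integer mix and the equality test a single integer comparison. The names packKmer, PackedKmerHash and countKmers are hypothetical; they are not part of SeqAn or Boost, and the sketch uses only the C++ standard library rather than the String< Dna > type from the post.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical helper (not a SeqAn function): pack a DNA k-mer of at most
// 32 bases into 64 bits, two bits per base (A=0, C=1, G=2, T=3).
// Returns false if the k-mer is too long or contains any other character.
// Assumption: all keys in one table share the same length k; otherwise
// k-mers such as "A" and "AA" would pack to the same value.
bool packKmer(const std::string& kmer, std::uint64_t& packed)
{
    if (kmer.size() > 32) return false;
    packed = 0;
    for (char base : kmer) {
        std::uint64_t code;
        switch (base) {
            case 'A': code = 0; break;
            case 'C': code = 1; break;
            case 'G': code = 2; break;
            case 'T': code = 3; break;
            default:  return false;
        }
        packed = (packed << 2) | code;
    }
    return true;
}

// Hash functor for packed keys: a 64-bit mixing step (the constants are
// those of the splitmix64 finalizer) that spreads nearby keys apart.
struct PackedKmerHash {
    std::size_t operator()(std::uint64_t key) const
    {
        key ^= key >> 30; key *= 0xbf58476d1ce4e5b9ULL;
        key ^= key >> 27; key *= 0x94d049bb133111ebULL;
        key ^= key >> 31;
        return static_cast<std::size_t>(key);
    }
};

// Keys are plain integers, so the default equality is one comparison.
using KmerCounts = std::unordered_map<std::uint64_t, unsigned, PackedKmerHash>;

// Count every k-mer of `text`; windows containing non-ACGT characters
// are skipped.
KmerCounts countKmers(const std::string& text, std::size_t k)
{
    KmerCounts counts;
    if (k == 0 || k > 32 || text.size() < k) return counts;
    for (std::size_t i = 0; i + k <= text.size(); ++i) {
        std::uint64_t key;
        if (packKmer(text.substr(i, k), key)) ++counts[key];
    }
    return counts;
}
```

Calling countKmers("ACGTACGT", 4) would fill the table with the packed forms of ACGT, CGTA, GTAC and TACG. Whether this beats the const char * version depends on the key lengths and access pattern, so it would need measuring; for keys longer than 32 bases a different scheme would be needed.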
--20cf30667c0d25330804c58722ad-- From Sabrina.Krakau@fu-berlin.de Fri Jul 27 12:06:57 2012 Received: from outpost1.zedat.fu-berlin.de ([130.133.4.66]) by list1.zedat.fu-berlin.de (Exim 4.69) with esmtp (envelope-from ) id <1SuhRe-0006tF-HS>; Fri, 27 Jul 2012 12:06:54 +0200 Received: from relay2.zedat.fu-berlin.de ([130.133.4.80]) by outpost1.zedat.fu-berlin.de (Exim 4.69) with esmtp (envelope-from ) id <1SuhRe-0004HO-DE>; Fri, 27 Jul 2012 12:06:54 +0200 Received: from cas3.campus.fu-berlin.de ([130.133.170.203]) by relay2.zedat.fu-berlin.de (Exim 4.69) with esmtp (envelope-from ) id <1SuhRd-0006wP-Rl>; Fri, 27 Jul 2012 12:06:54 +0200 Received: from EX03A.campus.fu-berlin.de ([130.133.170.134]) by CAS3.campus.fu-berlin.de ([130.133.170.203]) with mapi id 14.02.0309.002; Fri, 27 Jul 2012 12:06:50 +0200 From: "Krakau, Sabrina" To: "Krakau, Sabrina" Thread-Topic: SeqAn - BioStore Workshop 2012, Berlin, September the 4th - 6th Thread-Index: AQHNa9+J/hZ0PLYsLkiB5lmuG4c/Vg== Date: Fri, 27 Jul 2012 10:06:49 +0000 Message-ID: References: <69731BDC-A8D8-41F2-BC76-19C4C899B645@fu-berlin.de> Accept-Language: en-US, de-DE Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: Content-Type: multipart/related; boundary="_005_CBC5629F5E78A84A853AD8A3D5AF81BF1E29752Aex03acampusfube_"; type="multipart/alternative" MIME-Version: 1.0 X-Originating-IP: 130.133.170.203 X-ZEDAT-Hint: A X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1343383614-00000D73-AFE56D8F/0-0/0-0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000357, version=1.2.2 X-Spam-Flag: NO X-Spam-Checker-Version: SpamAssassin 3.0.4 on Dschibuti.ZEDAT.FU-Berlin.DE X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=ALL_TRUSTED, EXTRA_MPART_TYPE, HTML_MESSAGE Cc: AG ABI ABI , SeqAn Development , "seqan-interests@lists.fu-berlin.de" Subject: [Seqan-dev] Fwd: SeqAn - BioStore Workshop 2012, Berlin, September the 4th - 6th X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 
Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Jul 2012 10:06:58 -0000 --_005_CBC5629F5E78A84A853AD8A3D5AF81BF1E29752Aex03acampusfube_ Content-Type: multipart/alternative; boundary="_000_CBC5629F5E78A84A853AD8A3D5AF81BF1E29752Aex03acampusfube_" --_000_CBC5629F5E78A84A853AD8A3D5AF81BF1E29752Aex03acampusfube_ Content-Type: text/plain; charset="windows-1250" Content-Transfer-Encoding: quoted-printable

Dear SeqAn Users and Developers,

We would like to remind you of the registration for the SeqAn - BioStore Workshop 2012.
If you have not registered yet, please drop a short line now to sabrina.krakau@fu-berlin.de to let us know whether you want to participate.

As announced, the workshop will offer a range of new tutorials for both beginners and advanced SeqAn users and developers.
Additionally, the most popular SeqAn apps will be presented.
You can find the detailed schedule on our webpage:
http://www.seqan-biostore.de/wp/seqan-workshops/2012-seqan-workshop/schedule/

The workshop fee of 50,- € (15,- € for B.Sc./M.Sc. students) can be paid at the beginning of the workshop and includes the dinner on board the cruise ship on the Spree.

See you in September!

The SeqAn team

Begin forwarded message:

From: Sabrina Krakau
Subject: SeqAn - BioStore Workshop 2012, Berlin, September the 4th - 6th
Date: 16 July 2012 09:34:14 CEST
To: Sabrina Krakau
Cc: Knut Reinert, SeqAn Development, AG ABI ABI

Dear SeqAn Users and Developers,

Thank you for your interest and participation in our poll.
Based on your preferences, we have now been able to schedule the workshop in detail.
You can have a look at our webpage for an overview and detailed information.
http://www.seqan-biostore.de/wp/seqan-workshops/2012-seqan-workshop/schedule/

The preliminary schedule will include beginner's and advanced tutorials on the following topics:

Beginners:
• SeqAn Install Session, Basics, Sequences & Iterators, Basic Sequence I/O, Alignments & MSA, Indices, Fragment Store

Advanced:
• Input/Output and Writing Parsers, RNA-Seq with the FragmentStore, Sequence Compression by Journaling, Simple Bowtie

Apart from the tutorials, the workshop will address:
• Workflows in KNIME with SeqAn, detailed description of the following SeqAn apps:
  RazerS 3, Stellar, SplazerS, Masai, Mason, Rabema, SAK, SeqCons, SeqAn::T-Coffee, SnpStore, Command Line Parser

One highlight this year will be the social event:
After the second workshop day we will go together to Berlin-Kreuzberg, where we will have dinner on board the cruise ship "Philippa".
http://www.vanloon.de/kulinarische-rundfahrten/schiffe/

There will be a moderate workshop fee of 50,- € (15,- € for undergraduates), which is to be paid at the beginning of the workshop.

Action items:
1) Please send us an email at sabrina.krakau@fu-berlin by the 23rd of July to register for the workshop, so we can take the number of participants into account for our planning.
2) If you would like to give a presentation about your use of SeqAn or about a problem you need an efficient algorithm for, please let us know. We can accommodate this in our schedule.

We are looking forward to meeting you in September.

The SeqAn team

[cid:12D5553D-AD16-40D5-986F-BF6FABF0587B] [cid:53AF5887-D1F2-4BAA-BF6C-CF01BF1C6DB7]

Sabrina Krakau
Freie Universität Berlin
Institute of Computer Science
Algorithmic Bioinformatics - Project BioStore
Takustr.
9, 14195 Berlin Telefon: +49 (0)30 838 75228
CPxMy8h8Lgbai3lmcHDI+exlOeehCbmSVGxc3IfhEZHcljbebNtX1WoVRUZGkkgsonUbNxLLKOw2 20RKWvrzak0AwZMpOi621B/PFYkllb4QyrOltq+xoZvbYo7BOzD7m3sqyZ/v4jzwWjdFMZLhlZ8h unEbsTU2ph3hZVxdje1cQr7PFRpn9+m5WIGADYrTmnPrDdzBRDrXhu2Vw6DY59nN2x4ud53VpLMq k7Wf/86ucwAe2Hckt5yGZxuK2OSxooWDu+dskdTlcnIHkzKOOUK8mud9KeYxw7nBlMsGG8lHP0IO wdfckcb0kUFH/MLjoE4xXS8v4VVdjpWXPAKSFRyjqK5asomkALAFOBpr45grqzh5s+kG+02I8KnI J49XKD6LrmPjMduBGX3Pv0VBU91kD5AtbGO2YLaexwtzsH+IIJ4a3/z4c0ZHiuaaIu+WM+SJ2cgW Yq5P47k9PJLwXKTJmSBBEMDTvdCBAlxddhJYejmJwpfHITfww2fnpwCY4BpGCF8iuzuK25t8HXre 3Aq/FKclnN5Y8HY/HivCzpcKJDyuKDAPrVwFGH9L5BXUa5aQnbf1u7K/8psN2ytKJO5rIFf37MGZ 0n3ldov5Wzrw2FBpITkFU9TmumWuG87vnfG+8WnnXXZ4vqSVh0QkdtSR0nMWms1Gc5nY/9/Gu9zD eF9hZpdHRCGyYgrj74e2spKHJ/xe0ci3CjgL2Z7tAagKnN2kcJ4FWXP/G7Tg3ejfbcz7mga71qEz DwwUn6dgcRcNO7K5kL4+PI93bQe3fHHZTwtHSgEk0SAFiCpmV+qvC+N5ru24PA28VgM4XCKK0OhI IUUe6+FdB8b7XwrdK73wetB5f43he743tvt2dN73z4e+I8bzfL+c5zsXtqw6aWH/mYTDfj173/8I MADOw7X/kZPH9QAAAABJRU5ErkJggg== --_005_CBC5629F5E78A84A853AD8A3D5AF81BF1E29752Aex03acampusfube_-- From tmy1018@gmail.com Sun Jul 29 16:21:48 2012 Received: from relay1.zedat.fu-berlin.de ([130.133.4.67]) by list1.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1SvUNP-0001s5-Bm>; Sun, 29 Jul 2012 16:21:47 +0200 Received: from mail-ob0-f182.google.com ([209.85.214.182]) by relay1.zedat.fu-berlin.de (Exim 4.69) for seqan-dev@lists.fu-berlin.de with esmtp (envelope-from ) id <1SvUNP-0000a0-4H>; Sun, 29 Jul 2012 16:21:47 +0200 Received: by obbun3 with SMTP id un3so9617549obb.13 for ; Sun, 29 Jul 2012 07:21:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=0ZAypr7TS9AK496fwNnIpJhCl339/ITnXw2xHw/vN5I=; b=0EvRaRRFwzRjGg5LtQWK3FYaU+H4qTdupq1tsJ5fy5oxPPkp0CjTwwJXU0BbYSDqR2 1guhbBS54pZmWyvmBu7w7gYmsxl2Jmixv/KP+zd+T6625OORauAF4Qga/PTEPuUb+dkt Cco4gJu6a8QUSHBhZgjsguVdJhBnurisFo51VkHA2NL3RipsgXGtTZ2TailK92hMWE+D 
l96qTPCFzr/KfgP8k5Cf+ZMPMzuYJHVF3RXubgcGoLhLm0xQ3m8Om2CmCrwDZtHG8Zol el40a6xHOygush9ArAOHn3H31iWHDYs2GeGaelojxfgKMgHNGBUJT67a5wKAfCf1LJR/ Rlxg== MIME-Version: 1.0 Received: by 10.182.75.100 with SMTP id b4mr13062927obw.12.1343571704718; Sun, 29 Jul 2012 07:21:44 -0700 (PDT) Received: by 10.60.169.105 with HTTP; Sun, 29 Jul 2012 07:21:44 -0700 (PDT) Date: Sun, 29 Jul 2012 22:21:44 +0800 Message-ID: From: =?UTF-8?B?VGlhbnlhbmcgTGkg5p2O5aSp6ZizIFRvbW15IExp?= To: seqan-dev@lists.fu-berlin.de Content-Type: text/plain; charset=UTF-8 X-Originating-IP: 209.85.214.182 X-purgate: clean X-purgate-type: clean X-purgate-ID: 151147::1343571707-00000D73-433B045D/0-0/0-0 X-Bogosity: Unsure, tests=bogofilter, spamicity=0.526808, version=1.2.2 X-Spam-Flag: NO X-Spam-Checker-Version: SpamAssassin 3.0.4 on Dschibuti.ZEDAT.FU-Berlin.DE X-Spam-Level: xx X-Spam-Status: No, score=2.4 required=5.0 tests=DNS_FROM_RFC_ABUSE, FROM_ENDS_IN_NUMS,FU_BOGO_UNSURE,RCVD_BY_IP,SPF_HELO_PASS,SPF_PASS Subject: [Seqan-dev] Is it thread safe to use Seqan for sequence alignment? X-BeenThere: seqan-dev@lists.fu-berlin.de X-Mailman-Version: 2.1.14 Precedence: list Reply-To: SeqAn Development List-Id: SeqAn Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Jul 2012 14:21:48 -0000 Hi, I have a question about Seqan's thread safety. Currently I'd like to use Seqan to do Smith-Waterman alignment of nucleotide sequences, and I'd like to speed it up by multi-threading. But I'm not sure about whether it'll be thread safe to do so. Thanks! 
Best,
Tianyang

From tmy1018@gmail.com Mon Jul 30 05:09:31 2012
Date: Mon, 30 Jul 2012 11:09:28 +0800
From: Tianyang Li 李天阳 Tommy Li
To: seqan-dev@lists.fu-berlin.de
Cc: anne-katrin.emde@fu-berlin.de, david.weese@fu-berlin.de
Subject: Re: [Seqan-dev] Is it thread safe to use Seqan for sequence alignment?

Hi,

Sorry for sending this email directly to all of you, but I was
wondering if I can use Seqan's alignment functions in a multithreading
setting?

If not, are there other C/C++ libraries that you would recommend?

Thanks!

Best,
Tianyang

On Sun, Jul 29, 2012 at 10:21 PM, Tianyang Li 李天阳 Tommy Li wrote:
> Hi,
>
> I have a question about Seqan's thread safety.
>
> Currently I'd like to use Seqan to do Smith-Waterman alignment of
> nucleotide sequences, and I'd like to speed it up by multi-threading.
>
> But I'm not sure about whether it'll be thread safe to do so.
>
> Thanks!
>
> Best,
> Tianyang

From tmy1018@gmail.com Mon Jul 30 10:39:44 2012
Date: Mon, 30 Jul 2012 16:39:40 +0800
From: Tianyang Li 李天阳 Tommy Li
To: Rahn, René
Cc: SeqAn Development
Subject: Re: [Seqan-dev] Is it thread safe to use Seqan for sequence alignment?

Hi Rene,

Thank you very much!

Best,
Tianyang

On Mon, Jul 30, 2012 at 4:37 PM, Rahn, René wrote:
> Dear Tianyang,
>
> if I get you right, then you want to compute all alignments within a set of
> sequences in multiple threads. In this case you should be fine with our
> alignment algorithms as there are no manipulations of shared data structures
> involved. The only issue that has to be addressed by your program is the
> management of the alignment data structures in which you can store the
> alignment. Each thread would require its own alignment data structure to
> write the alignment, otherwise multiple threads would manipulate the same
> data structure, which is not handled by our alignment algorithms or data
> structures.
>
> Kind regards,
>
> René
>
> On Jul 30, 2012, at 5:09 AM, Tianyang Li 李天阳 Tommy Li wrote:
>
> Hi,
>
> Sorry for sending this email directly to all of you, but I was
> wondering if I can use Seqan's alignment functions in a multithreading
> setting?
>
> If not, are there other C/C++ libraries that you would recommend?
>
> Thanks!
>
> Best,
> Tianyang
>
> On Sun, Jul 29, 2012 at 10:21 PM, Tianyang Li 李天阳 Tommy Li
> wrote:
> > Hi,
>
> > I have a question about Seqan's thread safety.
>
> > Currently I'd like to use Seqan to do Smith-Waterman alignment of
> > nucleotide sequences, and I'd like to speed it up by multi-threading.
>
> > But I'm not sure about whether it'll be thread safe to do so.
>
> > Thanks!
>
> > Best,
> > Tianyang
>
>
> _______________________________________________
> seqan-dev mailing list
> seqan-dev@lists.fu-berlin.de
> https://lists.fu-berlin.de/listinfo/seqan-dev
>
> ---
>
> René Rahn
> Ph.D. Student
> ------------------------------------------------
> rene.rahn@fu-berlin.de
> +49 (0)30 838 75 277
> ------------------------------------------------
> Algorithmic Bioinformatics (ABI)
> Department of Informatics
> Room 018
> ------------------------------------------------
> Freie Universität Berlin
> Takustraße 9
> 14195 Berlin
> ------------------------------------------------
>
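
The rule in the reply above (read-only sequences may be shared, but each
thread needs its own structure to write results into) can be sketched in
plain C++. This is only an illustration under stated assumptions: it does
not call SeqAn at all, the names localScore and scoreAllPairs are
hypothetical, and the scoring values (match 2, mismatch -1, linear gap -2)
are arbitrary. With SeqAn itself, the analogous arrangement would
presumably be one alignment object per thread or per sequence pair.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Plain Smith-Waterman local alignment score with a linear gap penalty.
// Hypothetical helper, not a SeqAn function. All working memory (the two
// DP rows) is local to the call, so concurrent calls share no mutable state.
int localScore(const std::string& a, const std::string& b,
               int match = 2, int mismatch = -1, int gap = -2)
{
    std::vector<int> prev(b.size() + 1, 0), curr(b.size() + 1, 0);
    int best = 0;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = 0;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            int up = prev[j] + gap;
            int left = curr[j - 1] + gap;
            curr[j] = std::max({0, diag, up, left});
            best = std::max(best, curr[j]);
        }
        std::swap(prev, curr);
    }
    return best;
}

// Scores every pair using numThreads worker threads. Worker t handles pairs
// t, t + numThreads, ... and writes only to its own slots of `scores`;
// the input sequences are only read.
std::vector<int> scoreAllPairs(
    const std::vector<std::pair<std::string, std::string>>& pairs,
    unsigned numThreads)
{
    std::vector<int> scores(pairs.size(), 0);
    if (numThreads == 0) numThreads = 1;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&pairs, &scores, t, numThreads] {
            for (std::size_t k = t; k < pairs.size(); k += numThreads)
                scores[k] = localScore(pairs[k].first, pairs[k].second);
        });
    }
    for (std::thread& w : workers) w.join();
    return scores;
}
```

Calling scoreAllPairs(pairs, 4) would score all pairs on four threads. No
locking is needed in this sketch because no two threads ever write to the
same element of the result vector, which mirrors the "own alignment data
structure per thread" advice.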