Re: [Seqan-dev] Client-server lambda?

Hannes Hauswedell <hannes.hauswedell@fu-berlin.de> · Mon, 11 May 2015 14:01:39 +0200

On Monday, 11 May 2015, 12:05:55, Jose Manuel Duarte wrote:
> After trying lambda a bit, I am quite impressed with the results. One
> logical question that comes with it, is whether a client-server
> implementation of the search would be possible.
> 
> As I understand it, loading the indexed database into memory takes quite
> some time (> 1 min for my db), then the multiquery search is very fast.
> That's fine if one has just a one-off multiple query. But for other
> use-cases one would want to query it from time to time. The optimal way
> to do that would be to have some kind of client-server system where the
> queries are served by a server having the db pre-loaded in memory.
> 
> This is for instance what BLAT or SANSparallel do. Are there any plans to
> implement such a feature in lambda? That would be an incredibly helpful
> feature.

Thanks for the feedback! Most of the use cases reported were very large query 
files (where the loading time of the database is small compared to the total 
time), so we haven't thought about this yet. However, I do understand that if 
you repeatedly search a few sequences in a large database, the database 
loading time will be a large factor.
Have you tried storing the database file (including lambda's files) in a 
shared memory filesystem, e.g. /dev/shm? If you do this, all data will already 
be in main memory when the program is started -- however, it will still need 
to be copied around, so of course it's not optimal. Also, during program 
run-time the sequences will be both in the program's allocated memory and in 
the shm, so they will effectively use double the space. But it might still be 
worthwhile for you; I can't say without knowing the exact use case and 
hardware available.
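
In case it helps, the staging step could be scripted roughly like this. This 
is only a sketch in Python, not part of lambda: the function name 
stage_in_memory and its parameters are made up for illustration, and it 
assumes a Linux system where /dev/shm is a RAM-backed tmpfs with enough free 
space for the database and its index files.

```python
import shutil
import tempfile
from pathlib import Path


def stage_in_memory(db_path, shm_root="/dev/shm"):
    """Copy a database file or directory into a RAM-backed filesystem.

    Hypothetical helper: `db_path` is whatever file or directory holds the
    database and its index files; `shm_root` is assumed to be a tmpfs mount.
    Returns the path of the in-memory copy.
    """
    src = Path(db_path)
    dest = Path(shm_root) / src.name
    if src.is_dir():
        # dirs_exist_ok (Python 3.8+) lets a re-run refresh an existing copy
        shutil.copytree(src, dest, dirs_exist_ok=True)
    else:
        shutil.copy2(src, dest)
    return dest
```

You would then point the search at the returned path instead of the on-disk 
location. Note that the copy stays in RAM until it is deleted or the machine 
reboots, so it should be removed when no longer needed.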

What we also have planned for one of the next releases is using mmapped IO for 
the database loading; this should reduce the initial loading time 
significantly (although obviously it still has to be loaded from disk one way 
or another).
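
To illustrate the general idea behind memory-mapped IO (this is a generic 
Python sketch, not lambda's implementation; the function name read_slice_mmap 
is made up): the file is mapped into the address space and the operating 
system loads pages lazily on first access, so nothing is read up front and 
pages already in the page cache are reused.

```python
import mmap
import os
import tempfile


def read_slice_mmap(path, offset, length):
    """Return up to `length` bytes starting at `offset` via a memory mapping.

    Illustrative helper only. The whole file is mapped read-only, but only
    the pages that are actually touched get loaded from disk.
    """
    with open(path, "rb") as fh:
        # A length of 0 maps the entire file (the file must not be empty)
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]
```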

A separation into a client-server architecture might be done in the future, 
but I can't promise a time-frame for that.

Best regards,
-- 
Hannes Hauswedell

PhD student
Max Planck Institute for Molecular Genetics / Freie Universität Berlin

address     Institut für Informatik
            Takustraße 9
            Room 019
            14195 Berlin
telephone   +49 (0)30 838-75241
fax         +49 (0)30 838-75218
e-mail      hannes.hauswedell@[molgen.mpg.de|fu-berlin.de]