Re: [Seqan-dev] Best index for task

Marcel Schulz <maschulz@andrew.cmu.edu> · Tue, 22 Mar 2011 12:23:08 -0400

Hi John,

You should definitely try the Wotd algorithm if you do a pruned search.
We have seen huge improvements over the ESA on several problems, in
terms of both memory and running time, using the Wotd index, especially
if you're working on a range of suffix lengths, where it is more
technical to employ q-gram indices.

Also, pruning with the Wotd algorithm is very effective if you work on
large alphabets, such as proteins.
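
To illustrate why lazy, top-down construction pairs well with pruning,
here is a toy sketch in plain C++. It is not SeqAn's IndexWotd and does
not use its compact representation; the names LazyTrie, LazyNode and
countOccurrences are made up for this example. The only point it makes
is that a node's children are built the first time the node is visited,
so subtrees a pruned search never enters are never constructed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// One node of a lazily built suffix trie (hypothetical illustration type).
struct LazyNode {
    std::vector<std::size_t> starts;  // start positions of suffixes below this node
    std::size_t depth = 0;            // length of the prefix spelled by this node
    bool expanded = false;            // have the children been built yet?
    std::map<char, std::unique_ptr<LazyNode>> children;
};

class LazyTrie {
public:
    explicit LazyTrie(std::string text) : text_(std::move(text)) {
        root_.starts.resize(text_.size());
        for (std::size_t i = 0; i < text_.size(); ++i) root_.starts[i] = i;
    }

    LazyNode& root() { return root_; }

    // Build the children of `node` on first access by bucketing its
    // suffixes on their next character; later calls reuse the result.
    std::map<char, std::unique_ptr<LazyNode>>& children(LazyNode& node) {
        assert(node.depth <= text_.size());
        if (!node.expanded) {
            for (std::size_t s : node.starts) {
                if (s + node.depth >= text_.size()) continue;  // suffix ends here
                char c = text_[s + node.depth];
                std::unique_ptr<LazyNode>& child = node.children[c];
                if (!child) {
                    child = std::make_unique<LazyNode>();
                    child->depth = node.depth + 1;
                    ++nodesCreated_;
                }
                child->starts.push_back(s);
            }
            node.expanded = true;
        }
        return node.children;
    }

    // Number of non-root nodes materialised so far.
    std::size_t nodesCreated() const { return nodesCreated_; }

private:
    std::string text_;
    LazyNode root_;
    std::size_t nodesCreated_ = 0;
};

// Follow `pattern` from the root, expanding only the nodes on its path.
std::size_t countOccurrences(LazyTrie& trie, const std::string& pattern) {
    LazyNode* node = &trie.root();
    for (char c : pattern) {
        auto& kids = trie.children(*node);
        auto it = kids.find(c);
        if (it == kids.end()) return 0;
        node = it->second.get();
    }
    return node->starts.size();
}
```

For the text "banana", looking up "an" reports 2 occurrences and creates
only 4 nodes (a, b, n, an); the subtrees below "b" and "n" are left
unbuilt until something descends into them.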

Bests,
Marcel

On 22.03.11 06:50, John Reid wrote:

Hi,

I have a motif search algorithm that I have coded using an enhanced
suffix array. I'm wondering if it's worth investigating other indexes
to see if they are more efficient. The algorithm builds an index over a
set of sequences, say 5Mb average total size. My algorithm descends the
index to a given maximum depth (say 20 bases) many times but never goes
deeper. It doesn't descend all paths; it does some pruning on the way
down.

Up until now I have been using the IndexEsa. I notice I could also use
the IndexWotd, the IndexQGram or perhaps something from Pizza&Chili.
Has anyone got any recommendations about what might be quickest for
this sort of task? I realise I haven't given you too much to go on, but
perhaps it is enough without describing the algorithm in full.

My code compiles with either the IndexWotd or the IndexEsa, but with
IndexQGram I get compilation errors. Should these indexes have the same
programming interface?
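
As a rough picture of the access pattern described above (repeated
top-down descent to a fixed maximum depth with pruning), here is a
minimal sketch over a naively built suffix array in plain C++. It is
not the actual motif search and does not use SeqAn; buildSuffixArray,
descend, frequentSubstrings and the minimum-count pruning rule are
placeholders chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <string>
#include <vector>

// Sorted suffix start positions of `text` (naive construction; fine for a sketch).
std::vector<std::size_t> buildSuffixArray(const std::string& text) {
    std::vector<std::size_t> sa(text.size());
    std::iota(sa.begin(), sa.end(), std::size_t{0});
    std::sort(sa.begin(), sa.end(), [&](std::size_t a, std::size_t b) {
        return text.compare(a, std::string::npos, text, b, std::string::npos) < 0;
    });
    return sa;
}

// Visit the substrings of length <= maxDepth top-down. [lo, hi) is the
// suffix-array interval of suffixes starting with `prefix`.
// keep(prefix, count) returning false prunes everything below that prefix.
void descend(const std::string& text, const std::vector<std::size_t>& sa,
             std::size_t lo, std::size_t hi, std::string& prefix,
             std::size_t maxDepth,
             const std::function<bool(const std::string&, std::size_t)>& keep) {
    assert(lo <= hi && hi <= sa.size());
    const std::size_t depth = prefix.size();
    if (depth == maxDepth) return;  // never go deeper than maxDepth
    std::size_t i = lo;
    // A suffix that ends exactly at this depth sorts first; it has no child.
    while (i < hi && sa[i] + depth >= text.size()) ++i;
    while (i < hi) {
        const char c = text[sa[i] + depth];
        std::size_t j = i;
        while (j < hi && sa[j] + depth < text.size() && text[sa[j] + depth] == c) ++j;
        prefix.push_back(c);
        if (keep(prefix, j - i)) descend(text, sa, i, j, prefix, maxDepth, keep);
        prefix.pop_back();
        i = j;  // move on to the next child interval
    }
}

// Example pruning rule: keep only substrings occurring at least minCount times.
std::vector<std::string> frequentSubstrings(const std::string& text,
                                            std::size_t maxDepth,
                                            std::size_t minCount) {
    const std::vector<std::size_t> sa = buildSuffixArray(text);
    std::vector<std::string> kept;
    std::string prefix;
    descend(text, sa, 0, sa.size(), prefix, maxDepth,
            [&](const std::string& p, std::size_t count) {
                if (count < minCount) return false;  // prune this subtree
                kept.push_back(p);
                return true;
            });
    return kept;
}
```

For the text "banana" with maxDepth 2 and minCount 2 this returns a,
an, n, na: the branch below "b" is cut at depth 1 and nothing is
explored beyond depth 2.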

Thanks for a great library,
John.

--
------------------------------------------------------------------------------
Marcel H. Schulz
Ray and Stephanie Lane Center      email: maschulz@cs.cmu.edu
for Computational Biology  		
Carnegie Mellon University	
7413 Gates-Hillman Complex
5000 Forbes Avenue
Pittsburgh, PA 15213
http://www.cs.cmu.edu/~maschulz/
------------------------------------------------------------------------------