Re: [Seqan-dev] Best index for task

"Weese, David" <weese@campus.fu-berlin.de> · Wed, 23 Mar 2011 19:44:30 +0100

Hi John,

are you descending the suffix tree with a TopDown iterator? IndexEsa, IndexWotd and IndexDfi are the only indices that support the suffix tree interface. IndexDfi is certainly not what you want, unless you are mining strings with a certain frequency. In our applications it turned out that using the Wotd index to iterate over only parts of the suffix array is much faster than constructing the whole enhanced suffix array (SA, LCP, ChildTab) as the ESA index does. So, without knowing the details of your problem, I would recommend IndexWotd. The q-gram index only supports searching for occurrences of certain q-grams and does not provide a suffix tree interface. However, a q-gram index is the fastest index for retrieving occurrences of substrings up to length q.
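
To illustrate why lazy construction pays off for a depth-bounded, pruned descent, here is a minimal sketch in plain C++. It does not use SeqAn and is not how IndexWotd is implemented: it walks an uncompacted suffix trie one character at a time and recomputes children on every visit. The names expandNode, descend and frequentPrefixes are hypothetical, and the pruning rule (a minimum occurrence count, minOcc) is only a stand-in for whatever criterion the real algorithm uses. The point is that the children of a node are computed only when the descent actually reaches it, so pruned subtrees cost nothing.

```cpp
#include <cstddef>
#include <map>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper (not a SeqAn function): groups the suffixes in `starts`
// by the character found `depth` positions after each suffix start.
// Suffixes shorter than depth + 1 characters are dropped.
std::map<char, std::vector<std::size_t>>
expandNode(const std::string& text,
           const std::vector<std::size_t>& starts,
           std::size_t depth)
{
    std::map<char, std::vector<std::size_t>> children;
    for (std::size_t s : starts)
        if (s + depth < text.size())
            children[text[s + depth]].push_back(s);
    return children;
}

// Depth-bounded descent with pruning. A child is expanded only if its prefix
// occurs at least `minOcc` times (assumed placeholder criterion); prefixes
// that reach length `maxDepth` are recorded in `hits`.
void descend(const std::string& text,
             const std::vector<std::size_t>& starts,
             std::string& prefix,
             std::size_t maxDepth,
             std::size_t minOcc,
             std::vector<std::string>& hits)
{
    if (prefix.size() == maxDepth) {
        hits.push_back(prefix);
        return;
    }
    for (const auto& child : expandNode(text, starts, prefix.size())) {
        if (child.second.size() < minOcc)
            continue;  // pruned: nothing below this node is ever computed
        prefix.push_back(child.first);
        descend(text, child.second, prefix, maxDepth, minOcc, hits);
        prefix.pop_back();
    }
}

// Returns, in lexicographic order, all prefixes of length `maxDepth` that
// survive the pruning on every level of the descent.
std::vector<std::string> frequentPrefixes(const std::string& text,
                                          std::size_t maxDepth,
                                          std::size_t minOcc)
{
    std::vector<std::size_t> starts(text.size());
    std::iota(starts.begin(), starts.end(), std::size_t{0});  // root: all suffixes
    std::vector<std::string> hits;
    std::string prefix;
    descend(text, starts, prefix, maxDepth, minOcc, hits);
    return hits;
}
```

For example, frequentPrefixes("ACGTACGTAC", 3, 2) yields ACG, CGT, GTA and TAC, while raising minOcc to 3 prunes every branch before depth 3 and yields nothing.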

Hope that helps,
David

On 22.03.2011 at 11:50, John Reid wrote:

> Hi,
> 
> I have a motif search algorithm I have coded using an enhanced suffix 
> array. I'm wondering if it's worth investigating other indexes to see if 
> they are more efficient. The algorithm builds an index over a set of 
> sequences, say 5Mb average total size. My algorithm descends the index 
> to a given maximum depth (say 20 bases) many times but never goes 
> deeper. It doesn't descend all paths, it does some pruning on the way 
> down. Up until now I have been using the IndexEsa. I notice I could also 
> use the IndexWotd, the IndexQGram or perhaps something from Pizza&Chili. 
> Has anyone got any recommendations about what might be quickest for this 
> sort of task? I realise I haven't given you too much to go on but 
> perhaps it is enough without describing the algorithm in full. My code 
> compiles with either the IndexWotd or the IndexEsa but with IndexQGram I 
> get compilation errors. Should these indexes have the same programming 
> interface?
> 
> Thanks for a great library,
> John.