Re: [Seqan-dev] Best index for task
Hi John,
do you descend the suffix tree using a TopDown iterator? IndexEsa, IndexWotd and IndexDfi are the only indices that support the suffix tree interface. IndexDfi is certainly not what you want, unless you are mining strings with a certain frequency. In our applications it turned out that using the Wotd-Index for iterating only parts of the suffix array is much faster than constructing the whole enhanced suffix array (SA, LCP, ChildTab) as the ESA-Index does. So without knowing the details of your problem I would recommend the IndexWotd.

The q-gram Index only supports searching for occurrences of certain q-grams and does not provide a suffix tree interface, which is why your code does not compile with it. However, a q-gram Index is the fastest index for retrieving the occurrences of substrings up to length q.
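
To illustrate why lazy evaluation pays off for a pruned, depth-bounded descent, here is a minimal standard-library sketch. It is not SeqAn code and not how IndexWotd is implemented: the names LazyNode, expand, descend and frequentPrefixes are hypothetical, the structure is an uncompressed suffix trie rather than a suffix tree, and the minimum-count rule is only an assumed stand-in for whatever pruning the motif search really does. The point it shows is that the suffixes below a node are partitioned only when that node is visited, so pruned subtrees cost nothing.

```cpp
#include <cstddef>
#include <map>
#include <numeric>
#include <string>
#include <vector>

// One node of the (uncompressed) suffix trie: all suffixes that share a
// common prefix of length `depth`. Children are computed only on request.
struct LazyNode {
    std::vector<std::size_t> suffixes;  // start positions in the text
    std::size_t depth = 0;              // length of the shared prefix
};

// Partition the node's suffixes by their next character, one bucket per child.
// Suffixes that end exactly at this node have no next character and are skipped.
std::map<char, LazyNode> expand(const std::string& text, const LazyNode& node) {
    std::map<char, LazyNode> children;
    for (std::size_t pos : node.suffixes) {
        if (pos + node.depth >= text.size()) continue;
        LazyNode& child = children[text[pos + node.depth]];
        child.depth = node.depth + 1;
        child.suffixes.push_back(pos);
    }
    return children;
}

// Depth-bounded descent with pruning: a subtree is entered only if its prefix
// occurs at least minCount times (hypothetical pruning rule). Prefixes of
// length exactly maxDepth are recorded together with their occurrence count.
void descend(const std::string& text, const LazyNode& node,
             const std::string& prefix, std::size_t maxDepth,
             std::size_t minCount,
             std::map<std::string, std::size_t>& found) {
    if (node.depth == maxDepth) {
        found[prefix] = node.suffixes.size();
        return;
    }
    for (const auto& entry : expand(text, node)) {
        if (entry.second.suffixes.size() < minCount) continue;  // prune
        descend(text, entry.second, prefix + entry.first,
                maxDepth, minCount, found);
    }
}

// Start at the root, which holds every suffix of the text.
std::map<std::string, std::size_t> frequentPrefixes(const std::string& text,
                                                    std::size_t maxDepth,
                                                    std::size_t minCount) {
    LazyNode root;
    root.suffixes.resize(text.size());
    std::iota(root.suffixes.begin(), root.suffixes.end(), std::size_t{0});
    std::map<std::string, std::size_t> found;
    descend(text, root, "", maxDepth, minCount, found);
    return found;
}
```

For example, frequentPrefixes("ACGTACGTAC", 3, 2) returns the four 3-mers ACG, CGT, GTA and TAC, each with count 2, while raising minCount to 3 prunes every path before depth 3 and returns nothing. A real index would additionally compress unary paths and reuse the sorted buckets across repeated descents, which this sketch does not attempt.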
Hope that helps,
David
On 22.03.2011 at 11:50, John Reid wrote:
> Hi,
>
> I have a motif search algorithm I have coded using an enhanced suffix
> array. I'm wondering if it's worth investigating other indexes to see if
> they are more efficient. The algorithm builds an index over a set of
> sequences, say 5Mb average total size. My algorithm descends the index
> to a given maximum depth (say 20 bases) many times but never goes
> deeper. It doesn't descend all paths, it does some pruning on the way
> down. Up until now I have been using the IndexEsa. I notice I could also
> use the IndexWotd, the IndexQGram or perhaps something from Pizza&Chili.
> Has anyone got any recommendations about what might be quickest for this
> sort of task? I realise I haven't given you too much to go on but
> perhaps it is enough without describing the algorithm in full. My code
> compiles with either the IndexWotd or the IndexEsa but with IndexQGram I
> get compilation errors. Should these indexes have the same programming
> interface?
>
> Thanks for a great library,
> John.