Hi Beifang, the problem here is that the seed is defined on the original sequences, but the extension is conducted on the infixes. So, we got some confusion about sequence positions. If you specify the seed positions relative to the infix start positions, it will work. In your example this would be: typedef Infix<DnaString>::Type TInfix; TInfix infix0 = infix(seq0, 9 - 4, 9 + 7 + 4); TInfix infix1 = infix(seq1, 10 - 4, 10 + 7 + 4); TSeed seed(4, 4, 7); writeSeed(seed, infix0, infix1); extendSeed(seed, scoreDropOff, scoreMatrix, infix0, infix1, 0, GappedXDrop()); Sorry about that. -Birte From: Beifang Niu [mailto:neilniu.cn@gmail.com] Hi Birte, I test the extension limit but it doesn't work. It still exceed the extension limit. (attachment is source code) Is the extendSeed seed extension globally? I got one alignment with score -1 from 20 exact match seed extension and the match number is less than 20. It is not what I want and actually I just want a best local extension around the seed and this extension includes the seed part with score >=20 at least. There is also no extension score cutoff for globally extension. I tried use extendSeedScore to do extension and set the score=20 (A reference to the score of the seed. This will be increased by the score of the extension.) . I still got the extension with score -1. It is so lower score extension. thanks, Beifang. On Fri, Oct 28, 2011 at 2:25 PM, Kehr, Birte <Birte.Kehr@fu-berlin.de> wrote: Hi Beifang, in non-template functions like main you do not need the keyword “typename”. But keep the “::Type”: typedef Infix<DnaString>::Type TInfix; -Birte From: Beifang Niu [mailto:neilniu.cn@gmail.com]
Hi Birte, There is error when I try to compile the limit extension code you provided. " typedef typename Infix<DnaString>::Type TInfix; TInfix infix0 = infix(seq0, _max(0,leftPosition(seed, 0)-100), _min(length(seq0),rightPosition(seed, 0)+ 101) ); " error message: ./seedextend.cpp: In function 'int main()': ./seedextend.cpp:48: error: using 'typename' outside of template when I changed the code like this: " typedef Infix<DnaString> TInfix; TInfix infix0 = infix(seq0, _max(0,leftPosition(seed, 0)-100), _min(length(seq0),rightPosition(seed, 0)+ 101) ); " error message: ./seedextend.cpp: In function 'int main()': ./seedextend.cpp:50: error: conversion from 'seqan::Segment<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna_>, seqan::Alloc<void> >, seqan::InfixSegment>' to non-scalar type 'seqan::Infix<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna_>, seqan::Alloc<void> > >' requested any ideas? thanks, Beifang. On Fri, Oct 28, 2011 at 12:21 PM, Kehr, Birte <Birte.Kehr@fu-berlin.de> wrote: Hi Beifang, Apart from modifying the function extendSeed, there is another simple way how you can limit the extension to up to 100bp. You can use infixes of the input sequences that you pass over to the function extendSeed instead of seq0 and seq1, e.g.: typedef typename Infix<DnaString>::Type TInfix; TInfix infix0 = infix(seq0, _max(0, leftPosition(seed, 0) – 100), _min(length(seq0), rightPosition(seed, 0) + 101)); TInfix infix1 = infix(seq1, _max(0, leftPosition(seed, 1) – 100), _min(length(seq1), rightPosition(seed, 1) + 101)); extendSeed(…, infix0, infix1, 2, …); In order to get the number of matching positions of an alignment you have to iterate over the alignment and test at every position if there is a gap in any of the sequences, and if not if the characters are equal. Have a look at the Alignment Tutorial for an introduction to the Align data structure. http://trac.mi.fu-berlin.de/seqan/wiki/Tutorial/Alignments -Birte From: Beifang Niu [mailto:neilniu.cn@gmail.com]
Hi Birte, I have other questions for you. One is about the scope of seed. I just want to do seed extension around seed because there are so many seeds (maximal exact matchs) between two genomes. There is just one seed extension example for two sequences in the tutorial. Can I set the scope of seed extension? for example, I just want to do seed extension within 100bps around the seed in two directions, not the extension on the whole sequence. I checked the code of gapped extension and found the prefix() and suffix() function. I don't know if it is feasible to modify these two functions to get the part prefix and suffix of the seed. It will be simple to ungapped extension and I can directly give a threshold to limit the seed extension within 100bps. another question is : How do i get the match numbers from globalAlignment results of the seed? thank you, Beifang. On Thu, Oct 27, 2011 at 3:41 PM, Kehr, Birte <Birte.Kehr@fu-berlin.de> wrote: Yes, in the case of gapped X-drop extension you have to do globalAlignment. The function extendSeed does not do the traceback and does not determine the number of matching positions.
Sent: Thursday, October 27, 2011 11:15 PM
On Thu, Oct 27, 2011 at 2:08 PM, Kehr, Birte <Birte.Kehr@fu-berlin.de<mailto:Birte.Kehr@fu-berlin.de>> wrote: From: Beifang Niu [neilniu.cn@gmail.com<mailto:neilniu.cn@gmail.com>] Sent: Thursday, October 27, 2011 9:19 PM On Tue, Oct 25, 2011 at 3:00 AM, <seqan-dev-request@lists.fu-berlin.de<mailto:seqan-dev-request@lists.fu-berlin.de><mailto:seqan-dev-request@lists.fu-berlin.de<mailto:seqan-dev-request@lists.fu-berlin.de>>> wrote: seqan-dev@lists.fu-berlin.de<mailto:seqan-dev@lists.fu-berlin.de><mailto:seqan-dev@lists.fu-berlin.de<mailto:seqan-dev@lists.fu-berlin.de>>
seqan-dev-request@lists.fu-berlin.de<mailto:seqan-dev-request@lists.fu-berlin.de><mailto:seqan-dev-request@lists.fu-berlin.de<mailto:seqan-dev-request@lists.fu-berlin.de>>
seqan-dev-owner@lists.fu-berlin.de<mailto:seqan-dev-owner@lists.fu-berlin.de><mailto:seqan-dev-owner@lists.fu-berlin.de<mailto:seqan-dev-owner@lists.fu-berlin.de>>
From: Beifang Niu <neilniu.cn@gmail.com<mailto:neilniu.cn@gmail.com><mailto:neilniu.cn@gmail.com<mailto:neilniu.cn@gmail.com>>> Subject: [Seqan-dev] about extendSeed of Seqan To: seqan-dev@lists.fu-berlin.de<mailto:seqan-dev@lists.fu-berlin.de><mailto:seqan-dev@lists.fu-berlin.de<mailto:seqan-dev@lists.fu-berlin.de>> Content-Type: text/plain; charset="iso-8859-1" From: "Kehr, Birte" <Birte.Kehr@fu-berlin.de<mailto:Birte.Kehr@fu-berlin.de><mailto:Birte.Kehr@fu-berlin.de<mailto:Birte.Kehr@fu-berlin.de>>> Subject: Re: [Seqan-dev] about extendSeed of Seqan To: SeqAn Development <seqan-dev@lists.fu-berlin.de<mailto:seqan-dev@lists.fu-berlin.de><mailto:seqan-dev@lists.fu-berlin.de<mailto:seqan-dev@lists.fu-berlin.de>>> Content-Type: text/plain; charset="us-ascii" From: Beifang Niu [mailto:neilniu.cn@gmail.com<mailto:neilniu.cn@gmail.com><mailto:neilniu.cn@gmail.com<mailto:neilniu.cn@gmail.com>>] Sent: Montag, 24. Oktober 2011 14:51 To: seqan-dev@lists.fu-berlin.de<mailto:seqan-dev@lists.fu-berlin.de><mailto:seqan-dev@lists.fu-berlin.de<mailto:seqan-dev@lists.fu-berlin.de>> Subject: [Seqan-dev] about extendSeed of Seqan seqan-dev@lists.fu-berlin.de<mailto:seqan-dev@lists.fu-berlin.de><mailto:seqan-dev@lists.fu-berlin.de<mailto:seqan-dev@lists.fu-berlin.de>> https://lists.fu-berlin.de/listinfo/seqan-dev |