
Re: [Seqan-dev] about extendSeed of Seqan

  • From: "Kehr, Birte" <Birte.Kehr@fu-berlin.de>
  • To: Beifang Niu <neilniu.cn@gmail.com>, SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Date: Fri, 28 Oct 2011 21:21:10 +0200
  • Acceptlanguage: en-US, de-DE
  • Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Subject: Re: [Seqan-dev] about extendSeed of Seqan

Hi Beifang,

 

Apart from modifying the function extendSeed, there is another simple way to limit the extension to at most 100 bp.

Instead of seq0 and seq1, pass infixes of the input sequences to extendSeed, e.g.:

 

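typedef typename Infix<DnaString>::Type TInfix;

// Clip each window to the sequence bounds: up to 100 bp on either side of the seed.
// Caution: if the seed positions are unsigned, leftPosition(...) - 100 wraps around
// when the seed starts within the first 100 bp; guard that case separately.
TInfix infix0 = infix(seq0, _max(0, leftPosition(seed, 0) - 100), _min(length(seq0), rightPosition(seed, 0) + 101));

TInfix infix1 = infix(seq1, _max(0, leftPosition(seed, 1) - 100), _min(length(seq1), rightPosition(seed, 1) + 101));

extendSeed(…, infix0, infix1, 2, …);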
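The window arithmetic above can be sketched without SeqAn, which makes the clipping at both sequence ends easier to check. This is only an illustration: flankedWindow is a hypothetical helper, not a SeqAn function, and it assumes the seed is given by its first and last position (both inclusive), as the +101 above suggests.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper (not part of SeqAn): returns the half-open range
// [begin, end) that covers a seed plus up to `flank` positions on each
// side, clipped to a sequence of length `seqLength`.
// `seedLeft` and `seedRight` are the seed's first and last position (inclusive).
std::pair<std::size_t, std::size_t>
flankedWindow(std::size_t seedLeft, std::size_t seedRight,
              std::size_t seqLength, std::size_t flank)
{
    assert(seedLeft <= seedRight && seedRight < seqLength);
    // Compare before subtracting: seedLeft - flank would wrap around for
    // unsigned types whenever seedLeft < flank.
    std::size_t begin = (seedLeft > flank) ? seedLeft - flank : 0;
    // seedRight + 1 is the exclusive end of the seed itself.
    std::size_t end = std::min(seqLength, seedRight + 1 + flank);
    return std::make_pair(begin, end);
}
```

For a seed at positions 250 to 299 in a sequence of length 1000, flankedWindow(250, 299, 1000, 100) gives the range 150 to 400; a seed close to either end of the sequence is clipped to the sequence bounds instead.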

 

To get the number of matching positions of an alignment, you have to iterate over the alignment and test at every position whether there is a gap in either sequence and, if not, whether the characters are equal. Have a look at the Alignment Tutorial for an introduction to the Align data structure.

http://trac.mi.fu-berlin.de/seqan/wiki/Tutorial/Alignments
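As a rough sketch of that counting loop, independent of the Align data structure: the version below assumes the two alignment rows have already been written out as plain strings of equal length, with '-' marking a gap. Both the function name and the string representation are assumptions made for illustration; with SeqAn you would iterate over the rows of the Align object instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts the alignment columns in which neither row has a gap and both
// characters are identical. Rows are plain strings of equal length, with
// '-' standing in for a gap (an assumption of this sketch).
std::size_t countMatchingColumns(const std::string& row0, const std::string& row1)
{
    assert(row0.size() == row1.size());
    std::size_t matches = 0;
    for (std::size_t col = 0; col < row0.size(); ++col)
    {
        bool gapInColumn = (row0[col] == '-') || (row1[col] == '-');
        if (!gapInColumn && row0[col] == row1[col])
            ++matches;
    }
    return matches;
}
```

For the rows ACG-TAGT and ACGGTA-T this counts 6: two of the eight columns contain a gap and the remaining six agree.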

 

-Birte

 

 

From: Beifang Niu [mailto:neilniu.cn@gmail.com]
Sent: Thursday, 27 October 2011 17:17
To: Kehr, Birte
Subject: Re: [Seqan-dev] about extendSeed of Seqan

 

Hi Birte,

 

I have other questions for you.

One is about the scope of the seed extension. I just want to extend in the neighbourhood of each seed, because there are so many seeds (maximal exact matches) between the two genomes.

There is just one seed extension example for two sequences in the tutorial. Can I set the scope of the seed extension? For example, I just want to extend within 100 bp around the seed in both directions, not across the whole sequence.

I checked the code of the gapped extension and found the prefix() and suffix() functions. I don't know whether it is feasible to modify these two functions so that they return only part of the prefix and suffix around the seed.

For ungapped extension it would be simple: I could directly set a threshold that limits the seed extension to 100 bp.

 

 

Another question:

How do I get the number of matches from the globalAlignment result for the seed?

 

thank you,

Beifang.

 

On Thu, Oct 27, 2011 at 3:41 PM, Kehr, Birte <Birte.Kehr@fu-berlin.de> wrote:

Yes, in the case of gapped X-drop extension you have to do globalAlignment. The function extendSeed does not do the traceback and does not determine the number of matching positions.
But it does compute maximal and minimal diagonals, such that you can band the global alignment (see the Tutorial example).

What kind of score are you using? Would the score of the extensions help you?


-Birte

________________________________________
From: Beifang Niu [neilniu.cn@gmail.com]

Sent: Thursday, October 27, 2011 11:15 PM
To: Kehr, Birte
Subject: Re: [Seqan-dev] about extendSeed of Seqan


Hi Birte,


Unfortunately, I have to use gapped X-drop extension.
Do I have to run globalAlignment to get the number of matches? I only need that number after gapped seed extension, and running globalAlignment just to obtain it would increase the computation time.
Any ideas?

thanks,
Beifang.

On Thu, Oct 27, 2011 at 2:08 PM, Kehr, Birte <Birte.Kehr@fu-berlin.de> wrote:
Hi Beifang,

the function extendSeed does not return the number of matched sequence positions.

I assume you have used ungapped X-drop extension? Then you can count matching positions by simply iterating over the infixes:

typedef typename Infix<TSeq>::Type TInfix;
// Seed segments in both sequences; the +1 makes the end position exclusive.
TInfix infix1 = infix(seq1, leftPosition(seed, 0), rightPosition(seed, 0)+1);
TInfix infix2 = infix(seq2, leftPosition(seed, 1), rightPosition(seed, 1)+1);

// Count the positions at which both segments carry the same character.
unsigned count = 0;
for (unsigned i = 0; i < length(seed); ++i)
{
  if (value(infix1, i) == value(infix2, i))
      ++count;
}
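
The same loop can be tried out without SeqAn. The sketch below is only an illustration under stated assumptions: it uses std::string in place of the SeqAn sequence types, and countUngappedMatches together with its begin and length parameters is a hypothetical stand-in for the seed accessors used above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts equal characters along an ungapped seed: `length` consecutive
// positions starting at `begin0` in seq0 and at `begin1` in seq1.
std::size_t countUngappedMatches(const std::string& seq0, std::size_t begin0,
                                 const std::string& seq1, std::size_t begin1,
                                 std::size_t length)
{
    assert(begin0 + length <= seq0.size());
    assert(begin1 + length <= seq1.size());
    std::size_t count = 0;
    for (std::size_t i = 0; i < length; ++i)
    {
        if (seq0[begin0 + i] == seq1[begin1 + i])
            ++count;
    }
    return count;
}
```

Called as countUngappedMatches("ACGTAGTTT", 0, "ACGTGGTTT", 0, 9), it returns 8, because the two sequences differ at exactly one of their nine positions.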

-Birte

________________________________________

From: Beifang Niu [neilniu.cn@gmail.com]

Sent: Thursday, October 27, 2011 9:19 PM
To: Kehr, Birte
Subject: Re: seqan-dev Digest, Vol 25, Issue 6

Hi Birte,

Thank you for your prompt response, but I didn't receive your reply via the seqan development mailing list.
I did see the example for seed extension in the SeqAn-Tutorial. Now I have other questions for you.
How can I get the actual number of aligned bases between two extended seeds after running extendSeeds?
For example, for the sequences ACGTAGTTT and ACGTGGTTT there is one seed, GTTT. After extending it to the left, I get the extended seeds ACGTAGTTT and ACGTGGTTT (with only one mismatch), so the actual number of aligned bases is 8.
I want to get this number, but I don't know how to get it by only running extendSeeds.



thank you,
Beifang.

On Tue, Oct 25, 2011 at 3:00 AM, <seqan-dev-request@lists.fu-berlin.de> wrote:

Today's Topics:

 1. about extendSeed of Seqan (Beifang Niu)
 2. Re: about extendSeed of Seqan (Kehr, Birte)


----------------------------------------------------------------------

Message: 1
Date: Mon, 24 Oct 2011 14:50:33 -0700

From: Beifang Niu <neilniu.cn@gmail.com>

Subject: [Seqan-dev] about extendSeed of Seqan

To: seqan-dev@lists.fu-berlin.de
Message-ID:
     <CABnPkb9P5nAvhQD_4in6mo+Vtim6uhjWEFgzGMhxfb5XWuLQ2g@mail.gmail.com>

Content-Type: text/plain; charset="iso-8859-1"

Hi,

I am trying to use the Seqan library to do MEM (maximal exact match) extension.
First, I get the MEMs of the two genome sequences using MUMMER3, and then I
want to use extendSeed of Seqan to extend the MEMs.
Are there any examples for the extendSeed function of the seeds class?

thanks,
Beifang.

------------------------------

Message: 2
Date: Tue, 25 Oct 2011 00:59:24 +0200

From: "Kehr, Birte" <Birte.Kehr@fu-berlin.de>

Subject: Re: [Seqan-dev] about extendSeed of Seqan

To: SeqAn Development <seqan-dev@lists.fu-berlin.de>
Message-ID:
     <DAD226CB6878494EABEFD5215AA102015A8DE1CD68@exchange6.fu-berlin.de>

Content-Type: text/plain; charset="us-ascii"

Hi Beifang,

you can find an example for seed extension in the SeqAn-Tutorial at
http://trac.mi.fu-berlin.de/seqan/wiki/Tutorial/Seed-and-Extend#SeedExtensionAndBandedAlignment.

You might also want to consider using the seeds2 module instead of the seeds module, since we plan to replace the seeds module with the seeds2 module. Unfortunately, there is no example of how to use the seeds2 module yet.

-Birte

From: Beifang Niu [mailto:neilniu.cn@gmail.com]
Sent: Monday, 24 October 2011 14:51
To: seqan-dev@lists.fu-berlin.de
Subject: [Seqan-dev] about extendSeed of Seqan

Hi,

I am trying to use the Seqan library to do MEM (maximal exact match) extension.
First, I get the MEMs of the two genome sequences using MUMMER3, and then I want to use extendSeed of Seqan to extend the MEMs.
Are there any examples for the extendSeed function of the seeds class?

thanks,
Beifang.

------------------------------

_______________________________________________
seqan-dev mailing list

seqan-dev@lists.fu-berlin.de

https://lists.fu-berlin.de/listinfo/seqan-dev


End of seqan-dev Digest, Vol 25, Issue 6
****************************************

 
