[Seqan-dev] Read trimming question

From: John St John <johnthesaintjohn@gmail.com>
To: seqan-dev@lists.fu-berlin.de
Date: Sat, 8 Oct 2011 17:06:47 -0700
Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
Subject: [Seqan-dev] Read trimming question

Hello,

I am working on a quick rewrite, using SeqAn, of an alignment-based short-read trimmer that I originally wrote in C. So far things are going really well. I followed the Alignment tutorial on the trac wiki, and now I have an Alignment Graph of a global alignment in which gaps at either end of the two short sequences aren't penalized and gaps in the middle are penalized heavily.

So pictorially I have an alignment like this:

Seq1: ----ACATAG

Seq2: TTAGATA---

I want to output the following trimmed sequences:

Seq1: ACA

Seq2: ATA
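To make the desired output concrete, here is a minimal sketch of the trimming step in plain C++. It does not use the SeqAn Alignment Graph API at all: it assumes the two alignment rows have already been written out as equal-length gapped strings with '-' for gaps. The names TrimmedRead, overlapColumns and trimRow are hypothetical helpers made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper type: the trimmed bases plus the half-open range
// [begin, end) they occupy in the original, ungapped read.
struct TrimmedRead {
    std::string bases;
    std::size_t begin;
    std::size_t end;
};

// Find the alignment columns [colBegin, colEnd) spanned by both reads.
// Returns false if the reads do not overlap at all.
bool overlapColumns(const std::string& row1, const std::string& row2,
                    std::size_t& colBegin, std::size_t& colEnd) {
    assert(row1.size() == row2.size());
    const std::size_t first1 = row1.find_first_not_of('-');
    const std::size_t first2 = row2.find_first_not_of('-');
    if (first1 == std::string::npos || first2 == std::string::npos)
        return false;
    const std::size_t last1 = row1.find_last_not_of('-');
    const std::size_t last2 = row2.find_last_not_of('-');
    colBegin = std::max(first1, first2);
    colEnd = std::min(last1, last2) + 1;
    return colBegin < colEnd;
}

// Cut one gapped row down to columns [colBegin, colEnd), dropping gaps and
// recording which positions of the ungapped read were kept.
TrimmedRead trimRow(const std::string& row,
                    std::size_t colBegin, std::size_t colEnd) {
    TrimmedRead out{"", 0, 0};
    std::size_t srcPos = 0;  // position in the ungapped read
    for (std::size_t col = 0; col < colEnd && col < row.size(); ++col) {
        if (row[col] == '-')
            continue;
        if (col >= colBegin) {
            if (out.bases.empty())
                out.begin = srcPos;
            out.bases += row[col];
            out.end = srcPos + 1;
        }
        ++srcPos;
    }
    return out;
}
```

The begin and end fields are what would let the quality string be trimmed to match: the trimmed quality is the substring [begin, end) of the original quality string.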

However, if the above alignment were reversed:

Seq1: TTAGATA---

Seq2: ----ACATAG

Then I want to output the merged and extended consensus, where I resolve mismatches using a separate quality score string as a tie breaker.
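As an illustration of the merge case, here is a minimal sketch in plain C++, again working on gapped row strings rather than on the SeqAn Alignment Graph itself. It rests on several assumptions that are not part of the description above: quality strings are ASCII-encoded so that a larger character means higher quality, each quality string has one character per base of its ungapped read, a tie in quality goes to the first read, and a gap in one row simply takes the base from the other row. MergedRead and mergeRows are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type: consensus bases with one quality per base.
struct MergedRead {
    std::string bases;
    std::string quals;
};

// Walk the alignment column by column and build the merged consensus.
// row1/row2 are equal-length gapped rows; qual1/qual2 are the ungapped
// quality strings of the corresponding reads.
MergedRead mergeRows(const std::string& row1, const std::string& qual1,
                     const std::string& row2, const std::string& qual2) {
    assert(row1.size() == row2.size());
    // Each quality string must have exactly one entry per non-gap base.
    assert(std::count(row1.begin(), row1.end(), '-') +
               static_cast<std::ptrdiff_t>(qual1.size()) ==
           static_cast<std::ptrdiff_t>(row1.size()));
    assert(std::count(row2.begin(), row2.end(), '-') +
               static_cast<std::ptrdiff_t>(qual2.size()) ==
           static_cast<std::ptrdiff_t>(row2.size()));

    MergedRead out;
    std::size_t p1 = 0, p2 = 0;  // positions in the ungapped reads
    for (std::size_t col = 0; col < row1.size(); ++col) {
        const bool has1 = row1[col] != '-';
        const bool has2 = row2[col] != '-';
        if (has1 && has2) {
            // Both reads cover this column: keep the higher-quality base
            // (assumption: ties go to read 1). If the bases agree, this
            // just keeps that base with the better of the two qualities.
            const bool take1 = qual1[p1] >= qual2[p2];
            out.bases += take1 ? row1[col] : row2[col];
            out.quals += take1 ? qual1[p1] : qual2[p2];
        } else if (has1) {
            out.bases += row1[col];
            out.quals += qual1[p1];
        } else if (has2) {
            out.bases += row2[col];
            out.quals += qual2[p2];
        }
        if (has1) ++p1;
        if (has2) ++p2;
    }
    return out;
}
```

For the reversed alignment above, the only disagreement is the T/C column in the overlap, so the quality strings alone decide whether the merged read contains T or C at that position.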

Basically I don't know how to traverse the Alignment Graph to pull out the information I need. I need to keep track of which sequence is which in the alignment graph so that I can deal with the above two cases properly. Any help or at least a link to a good resource on traversing an alignment graph and doing something similar would be greatly appreciated. I need to keep track of the indices of the bases I trim so that I can also output the trimmed or merged quality string.

Thanks everyone for your time,

-John

Follow-Ups:
- Re: [Seqan-dev] Read trimming question
  - From: Manuel Holtgrewe <manuel.holtgrewe@fu-berlin.de>
