Hey Sam,
wrote:
Thanks for your help! However the alignment is not right. The second sequence lines up about 100 residues from where it should, whereas the "Simple" scoring scheme works reasonably. See the "LMIS", which should line up at residue 251 on the first sequence.
I'm using -1 for both gap and gapOpen:
200 . : . : . : . : . :
NVAHPASSTKVDKKIEPRGPTIKPCPPCKCPAPNLLGGPSVFIFPPKIKD
--------------------------------------------------
250 . : . : . : . : . :
VLMISLSPIVTCVVVDVSEDDPDVQISWFVNNVEVHTAQTQTHREDYNST
--------------------------------------------------
300 . : . : . : . : . :
LRVVSALPIQHQDWMSGKEFKCKVNNKDLPAPIERTISKPKGSVRAPQVY
| |
---------------------------------PELLGGPSVFLFPPKPK
350 . : . : . : . : . :
VLPPPEEEMTKKQVTLTCMVTDFMPEDIYVEWTNNGKTELNYKNTEPVLD
| | |
DTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNS
400 . : . : . : . :
SDGSYFMYSKLRVEKKNWVERNSYSCSVVHEGLHNHHTTKSFSR
| | | |
TYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQ
Here's the code snippet:
1002 seqan::Score<int, seqan::ScoreMatrix<seqan::AminoAcid, seqan::Pam120_> > scorePam120(gap,gapOpenPenalty);
1003 int score = globalAlignment(align, scorePam120 );
It is really hard to tell what went wrong without a proper code example. Could you please provide all the code associated with the alignment computation?
Maybe I need to specify the algorithm as well?
Also, how can I retrieve the aligned residue numbers? Right now I'm just retrieving the two aligned sequences and assuming they are aligned if neither character is a '-':
1022 if ((String(seqan::row (align,0)[i]).compare("-") != 0 ) &&
1023 (String(seqan::row (align,1)[i]).compare("-") != 0 )) { /* aligned residues*/ }
At the moment we don't store any state of the alignment. Thus it is not possible to receive such information. However, we have the function isGap(gaps, viewPosition) or isGap(GapsIterator). In this case the code could look like this, avoiding the string comparisons:
typedef Row<Align<DnaString> >::Type TRow;
typedef Iterator<TRow, Standard>::Type TRowIterator;
TRowIterator itRow0 = begin(row(align, 0), Standard());
TRowIterator itRow0End = end(row(align, 0), Standard());
TRowIterator itRow1 = begin(row(align, 1), Standard());
for (; itRow0 != itRow0End; ++itRow0, ++itRow1)
{
if (!isGap(itRow0) && !isGap(itRow1))
// Count aligned residues.
}
Since this is only a linear scan over the aligned sequences, this won't add much to the overall runtime considering the quadratic runtime for the alignments you have to do before.
Kind regards,
René Rahn
This works well enough for very simple, high sequence identity alignments. However I'd really like to know which residues are positively aligned ( "|" in the above alignment).
Thanks
Sam
Hey Sam,
if you want to use scoring matrices you need to use a different specialization of our Score class. To use scoring schemes like the Pam or Blosum matrices you need to use the following code:
Score<int, ScoreMatrix<AminoAcid, Pam120_> > scorePam120(gap,
gapOpen);
For the example above this would look like:
Pam120 scorePam120(gap, gapOpen);
I hope that will solve your problem!
Kind regards,
René Rahn
Guys,
I can't seem to get globalAlignment to use any scoring scheme otehr than seqan::Simple:
TAlign align;
seqan::resize(rows(align), 2);
assignSource(row(align,0),seqA);
assignSource(row(align,1),seqB);
int score = globalAlignment(align, seqan::Score<int,seqan::Simple>(0,-1,-1,gapOpenPenalty));
I tried changing the last line to just:
int score = globalAlignment(align, seqan::Score<int,Pam>(0,-1,-1,gapOpenPenalty));
and every conceivable variant thereof. None seem to be recognized, e.g.:
/Users/Sam/svn/RNAToolbox/trunk/src/ParameterReader.cpp:1000:65: error: use of undeclared identifier 'Pam'
Can someone tell me how to switch to Blossum, PAM, or some other?
Thanks
Sam
_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev
---
René Rahn
Ph.D. Student
---------------------------------
Phone: +49 (0)30 838 75 277
---------------------------------
Algorithmic Bioinformatics (ABI)
Department of Computer Science
Room 018
---------------------------------
Freie Universität Berlin
Takustraße 9
14195 Berlin
---------------------------------
_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev
_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev
---
René Rahn
Ph.D. Student
---------------------------------
Phone: +49 (0)30 838 75 277
---------------------------------
Algorithmic Bioinformatics (ABI)
Department of Computer Science
Room 018
---------------------------------
Freie Universität Berlin
Takustraße 9
14195 Berlin
---------------------------------
|