FU Logo
  • Startseite
  • Kontakt
  • Impressum
  • Home
  • Listenauswahl
  • Anleitungen

Re: [Seqan-dev] Two Alignment Questions

<-- thread -->
<-- date
  • From: Rahn, René <rene.maerker@fu-berlin.de>
  • To: SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Date: Thu, 22 Jan 2015 11:44:09 +0100
  • Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Subject: Re: [Seqan-dev] Two Alignment Questions

Hi Brett,

On 22 Jan 2015, at 08:47, Brett Bowman <bnbowman@gmail.com> wrote:

Given an Alignment, how do I get the "TGapSpec" of the underlying rows?

I've submitted a pull request with the first half of the features I'd like, now I'm trying to get match strings to work in a generalized way.

Thank you for the PR. I will merge it into develop today, as we plan to release very soon this not going into the current master.


I'm trying to implement a functions called "alignmentIdentityString", which takes either an Align object, or just two Gaps objects representing rows, and generations a match string with specified characters for matches / mismatches / gaps.  It works just fine when I work directly from a supplied Alignment or pair of Rows, but I can't translate an Alignment object down into its component Rows so I can avoid duplicating code.

The second template argument of Align specifies the underlying Gaps structure. 
So typename Row<Align<DnaString, TGapsSpec> >::Type == Gaps<DnaString, TGapsSpec>. 

Source<TGaps> gets me the alphabet, and I assume there is a Metafunction to get the Spec of a Gaps/Row object, but the documentation isn't clear and neither Value<> nor GetValue<> appear to give me what I need. 

Source<TGaps> hopefully gives you the type of the source, otherwise please raise an issue on github.com/seqan/seqan/issues. :)
Apparently there is no Metafunction to receive the Spec type of the Gaps nor the Align structure.

"""
template <typename TAlphabet, typename TAlignSpec>
String<char> alignmentIdentityString(Align<TAlphabet, TAlignSpec> const & align,
                                       char const matchChar,
                                       char const mismatchChar,
                                       char const gapChar)
{
    // Check that the supplied alignment is pairwise
    SEQAN_ASSERT_EQ_MSG(length(rows(align)), 2u, "Only works with pairwise alignments.");

    typedef Align<TAlphabet, TAlignSpec> const TAlign;
    typedef typename Row<TAlign>::Type TGaps;
    typedef typename Value<TGaps>::Type TGapSpec;
Remove the one line above.

    // Try #1 
    //TGaps queryRow = row(align, 0);
    //TGaps referenceRow = row(align, 1);
This one would be correct.

    // Try #2
    Gaps<TAlphabet, TGapSpec> queryRow = row(align, 0);
    Gaps<TAlphabet, TGapSpec> referenceRow = row(align, 1);
Value of TGaps would give you the Value of the Source of Gaps which is the alphabet type used for the source.

    return alignmentSimilarityString(queryRow, referenceRow, scoringScheme,
                                     matchChar, mismatchChar, gapChar);
}
"""

What am I missing?
From the code above I don’t see, why exactly you need the Gaps type.
Wouldn’t this here be sufficient?
BTW it should read Align<TSource, TGapsSpec>. The first template argument of Align is the type of the source.

```
template <typename TSource, typename TGapsSpec>
String<char> alignmentIdentityString(Align<TSource, TGapsSpec> const & align,
                                     char const matchChar,
                                     char const mismatchChar,
                                     char const gapChar)
{
    // Check that the supplied alignment is pairwise
    SEQAN_ASSERT_EQ_MSG(length(rows(align)), 2u, "Only works with pairwise alignments.”);
  
  return alignmentSimilarityString(row(align,0), row(align,1), scoringScheme,
                                   matchChar, mismatchChar, gapChar);
```

cheers,

René



-Brett

On Mon, Jan 12, 2015 at 1:11 AM, Holtgrewe, Manuel <manuel.holtgrewe@fu-berlin.de> wrote:
Hi Brett,

would you be willing to perform this extension yourself? You could send a patch or a Github pull request with the results.

Cheers,
Manuel


From: Brett Bowman [bnbowman@gmail.com]
Sent: Sunday, January 11, 2015 6:47 PM

To: SeqAn Development
Subject: Re: [Seqan-dev] Two Alignment Questions

Manuel,

I saw the AlignmentStats class when I was digging around in the API for what I need, but I'm afraid it doesn't quite fit:
-It doesn't differentiate between Insertions and Deletions (the ratio of which is important for diagnosing some sequencing problems and bad alignments)
-It doesn't appear to provide access to the alignment string, meaning I still need to traverse the whole alignment to generate alternate representations.

-B

On Sat, Jan 10, 2015 at 2:21 AM, Holtgrewe, Manuel <manuel.holtgrewe@fu-berlin.de> wrote:
Does this help?

http://docs.seqan.de/seqan/develop/?p=computeAlignmentStatshttp://docs.seqan.de/seqan/develop/?p=computeAlignmentStats

*m


From: Brett Bowman [bnbowman@gmail.com]
Sent: Saturday, January 10, 2015 3:16 AM
To: SeqAn Development
Subject: Re: [Seqan-dev] Two Alignment Questions

Thanks Rene,

>What do you mean by accuracy? What do you need exactly? 

What I am trying to do is write a tool to align single-molecule sequencing data (PacBio, Oxford Nanopore) to reference genomes.  For which the alignment accuracy and the break-down of the types of errors is very useful, as the per-read quality on both platforms varies substantially.

What I'd like are member methods or functions-on-alignment-objects to give the following:
-The number of base-pair matches in the alignment
-The number of mismatches in the alignment
-The number of insertions in the alignment
-The number of deletions in the alignment
-The alignment string as represented by the Align object's toString() method

The first four I need to summarize the similarity of an alignment (i.e. indels are likely sequencing error, where as mismatches are likely not), while the latter is helpful in creating alternate representations for alignments

Does that help?

-B

On Fri, Jan 9, 2015 at 12:51 AM, Rahn, René <rene.maerker@fu-berlin.de> wrote:
Hey Brett, 


On 30 Dec 2014, at 21:29, Brett Bowman <bnbowman@gmail.com> wrote:

1)  Is there an easy way to get the accuracy of an Alignment object, or is that something I need to calculate myself?

What do you mean by accuracy? What do you need exactly? 
The Align data structure simply represents the computed alignment. The score is returned by the global align interface. 
There are several ways to manipulate the alignment of gaps and to compute all best traceback paths. However, those options are not supported on the high level interfaces but only in the core of the alignment engine. 
If you tell me what you need exactly we can think of a way to make those interfaces public which is planned for the future anyway.
But there is no method telling you the accuracy of the “computed" alignment compared to another “real” alignment. Please, give me some specifics and we see if we can incorporate this.
...
Well, there is a method in the tool seqan-tcoffee, which checks if the computed alignment is correct given the score model and the alignment, but I believe this is not what you are looking for?


2)  bandedChainAlignment fails messily when given a Vector of Seeds instead of a String, where as other functions work fine (e.g. chainSeedsGlobally).  Why does it work for the one and not the other?  Excerpt from error message below.

"""
/usr/include/seqan/seeds/banded_chain_alignment_impl.h:1195:57: error: call to 'end' is ambiguous
    SEQAN_ASSERT(itEnd != static_cast<TSeedSetIterator>(end(seedSet)));
                                                        ^~~
/usr/include/seqan/basic/debug_test_system.h:2146:44: note: expanded from macro 'SEQAN_ASSERT'
                                          (_arg1), # _arg1)) {           \
                                           ^
/usr/include/seqan/seeds/banded_chain_alignment_profile.h:254:16: note: in instantiation of function template specialization

...

/usr/include/seqan/seeds/banded_chain_alignment.h:207:12: note: in instantiation of function template specialization
      'seqan::bandedChainAlignment<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc<void> >, seqan::Tag<seqan::ArrayGaps_>,
      std::vector<seqan::Seed<seqan::Tag<seqan::Simple_>, seqan::DefaultSeedConfig>, std::allocator<seqan::Seed<seqan::Tag<seqan::Simple_>, seqan::DefaultSeedConfig> > >, long,
      seqan::Tag<seqan::Simple_>, seqan::Tag<seqan::Simple_>, false, false, false, false, seqan::Tag<seqan::Default_> >' requested here
    return bandedChainAlignment(align, seedSet, scoreScheme, scoreScheme, alignConfig, bandExtension);
           ^
/home/bbowman/git/SRSLI/src/C++/SparseAlignment2.cpp:72:25: note: in instantiation of function template specialization
      'seqan::bandedChainAlignment<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc<void> >, seqan::Tag<seqan::ArrayGaps_>,
      std::vector<seqan::Seed<seqan::Tag<seqan::Simple_>, seqan::DefaultSeedConfig>, std::allocator<seqan::Seed<seqan::Tag<seqan::Simple_>, seqan::DefaultSeedConfig> > >, long,
      seqan::Tag<seqan::Simple_>, false, false, false, false, seqan::Tag<seqan::Default_> >' requested here
        long alnScore = bandedChainAlignment(alignment, *seedChain, scoring, globalConfig);
                        ^
/home/bbowman/git/SRSLI/src/C++/main.cpp:95:27: note: in instantiation of function template specialization 'RefChainsToAlignments<seqan::AlignConfig<false, true, true, false,
      seqan::Tag<seqan::Default_> > ()>' requested here
        auto alignments = RefChainsToAlignments(record->Seq,
                          ^
/usr/include/seqan/seeds/banded_chain_alignment_impl.h:583:1: note: candidate template ignored: substitution failure [with TSeedSet = std::vector<seqan::Seed<seqan::Tag<seqan::Simple_>,
      seqan::DefaultSeedConfig>, std::allocator<seqan::Seed<seqan::Tag<seqan::Simple_>, seqan::DefaultSeedConfig> > >]
_findFirstAnchor(TSeedSet const & seedSet, int bandExtension)
"”" 

Ok, this seems to be a bug to me.
I’ll have a look into it.


Sincerely,
-Brett
_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev

Bests, 

René 

---

René Rahn
Ph.D. Student
--------------------------------
Tel:  (+49) 30 838 75277
Mail: rene.rahn@fu-berlin.de
--------------------------------
Institute of Computer Science
Algorithmic Bioinformatics (ABI)
--------------------------------
Freie Universität Berlin
Takustraße 9
14195 Berlin
--------------------------------


_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev



_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev



_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev


_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev

---

René Rahn
Ph.D. Student
--------------------------------
Tel:  (+49) 30 838 75277
Mail: rene.rahn@fu-berlin.de
--------------------------------
Institute of Computer Science
Algorithmic Bioinformatics (ABI)
--------------------------------
Freie Universität Berlin
Takustraße 9
14195 Berlin
--------------------------------

<-- thread -->
<-- date
  • References:
    • Re: [Seqan-dev] Two Alignment Questions
      • From: Rahn, René <rene.maerker@fu-berlin.de>
    • Re: [Seqan-dev] Two Alignment Questions
      • From: Brett Bowman <bnbowman@gmail.com>
    • Re: [Seqan-dev] Two Alignment Questions
      • From: "Holtgrewe, Manuel" <manuel.holtgrewe@fu-berlin.de>
    • Re: [Seqan-dev] Two Alignment Questions
      • From: Brett Bowman <bnbowman@gmail.com>
    • Re: [Seqan-dev] Two Alignment Questions
      • From: "Holtgrewe, Manuel" <manuel.holtgrewe@fu-berlin.de>
    • Re: [Seqan-dev] Two Alignment Questions
      • From: Brett Bowman <bnbowman@gmail.com>
  • seqan-dev - January 2015 - Archives indexes sorted by:
    [ thread ] [ subject ] [ author ] [ date ]
  • Complete archive of the seqan-dev mailing list
  • More info on this list...

Hilfe

  • FAQ
  • Dienstbeschreibung
  • ZEDAT Beratung
  • postmaster@lists.fu-berlin.de

Service-Navigation

  • Startseite
  • Listenauswahl

Einrichtung Mailingliste

  • ZEDAT-Portal
  • Mailinglisten Portal