Re: [Seqan-dev] Getting a sequence as a char *
- From: Nick Mapsy <nmapsy@gmail.com>
- To: SeqAn Development <seqan-dev@lists.fu-berlin.de>
- Date: Fri, 22 Sep 2017 18:44:30 -0400
- Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
- Subject: Re: [Seqan-dev] Getting a sequence as a char *
Hi René,
Thanks for the suggestion. I'd certainly like to get the gapped sequence directly. Unfortunately, I'm not sure exactly which function could do that, and what its input would be.
Nick
On Sep 18, 2017 5:16 AM, "Rahn, René" <Rene.Rahn@fu-berlin.de> wrote:
Hi Nick,
you should be able to simply copy the gapped sequence into a CharString.
Cheers,
René
On 9. Sep 2017, at 00:55, Nick Mapsy <nmapsy@gmail.com> wrote:
Hi Hannes,
Thank you so much for your reply! I'm really lost here so I appreciate it a lot.
Unfortunately I do need a char **, since I'm passing the data back to C code (actually, to Python ctypes).
Thank you for the tip on going from the row to a CharString. Is there a function which can take my TRow and return a CharString? I couldn't find anything in the documentation. Instead, I found out that my TRow can act as a Gaps, which allowed me to use all the Gaps functions. That allowed me to reconstruct the alignment using isGap() and the unaligned sequences.
Thank you,
Nick
P.S. My solution, in case anyone else runs into the same trouble:
(I'm using SeqAn 2.2.0 now)
#include <iostream>
#include <stdlib.h>
#include <seqan/align.h>
#include <seqan/score.h>
#include <seqan/sequence.h>
#include <seqan/graph_msa.h>
using namespace seqan;

char **align(int nseq, char *seqs[]) {
    Align<String<Dna5>> align;
    resize(rows(align), nseq);
    for (int i = 0; i < nseq; i++) {
        assignSource(row(align, i), seqs[i]);
    }
    globalMsaAlignment(align, EditDistanceScore());
    // Convert the Align rows to char *'s and store back in seqs.
    typedef typename Row<Align<String<Dna5>>>::Type TRow;
    for (int i = 0; i < nseq; i++) {
        // Each row is type TRow, but also functions as a Gaps. This is why isGap accepts it.
        TRow arow = row(align, i);
        int len = (int)length(arow);
        char *new_seq = (char *)malloc(sizeof(char) * (len + 1));
        int offset = 0;
        for (int j = 0; j < len; j++) {
            if (isGap(arow, j)) {
                new_seq[j] = '-';
                offset--;
            } else {
                new_seq[j] = seqs[i][j+offset];
            }
        }
        new_seq[len] = '\0';
        seqs[i] = new_seq;
    }
    return seqs;
}

int main(int argc, char *argv[]) {
    for (int i = 1; i < argc; i++) {
        argv[i-1] = argv[i];
    }
    char **aligned_seqs = align(argc-1, argv);
    for (int i = 0; i < argc-1; i++) {
        std::cout << aligned_seqs[i] << std::endl;
    }
    return 0;
}
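For anyone who wants to check the gap-reinsertion step on its own, here is a minimal sketch of the same idea without SeqAn. It assumes the gap positions are already known as a vector of booleans (what isGap(arow, j) reports in the program above) and, instead of a negative offset, counts how many source characters have been consumed. The name insertGaps is made up for illustration and is not a SeqAn function.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Rebuild a gapped row from the unaligned sequence and a per-column gap mask.
// gapMask[j] == true means alignment column j is a gap.
// Hypothetical helper for illustration only; not part of SeqAn.
std::string insertGaps(const std::string &source,
                       const std::vector<bool> &gapMask) {
    std::string gapped;
    gapped.reserve(gapMask.size());
    std::size_t next = 0;  // index of the next unconsumed source character
    for (bool isGapColumn : gapMask) {
        if (isGapColumn) {
            gapped.push_back('-');
        } else {
            // Guard against a mask that expects more residues than exist.
            if (next >= source.size()) {
                throw std::out_of_range("gap mask does not match source length");
            }
            gapped.push_back(source[next++]);
        }
    }
    return gapped;
}
```

With the source "ACGT" and a mask that marks only the third column as a gap, this returns "AC-GT". Like the program above, it copies characters from the original input, so it only matches the alignment if the row covers the whole source sequence.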
______________________________
On Fri, Sep 8, 2017 at 8:18 AM, Hannes Hauswedell <hannes.hauswedell@fu-berlin.de> wrote:
Hi Nick,
On Wednesday, 6 September 2017, 03:38:54, Nick Mapsy wrote:
> Hi, I'm just getting started with SeqAn (and C++), so I'm sure I'm missing
> something simple here.
>
> I've got a multiple sequence alignment working and producing an
> Align<String<Dna5> > object. Now all I need is to return the aligned
> sequences (with gaps) as C strings (char *) from the function.
Are you sure you want to be passing around these char** ? This is C++ after
all and we have references :D
> It seems like a simple thing, but after hours reading through the
> documentation of all the types and functions (and yes, Language Entity
> Types), I can't find the path from Align to char *.
>
> I found toCString(), but it takes a String, and I don't know how to get
> (gapped) Strings out of an Align.
>
> Thank you for any help, and hopefully I'm able to make use of this great
> library!
You can create a CharString from the alignment row and then call toCString()
on the CharString. But, like I said, I would really recommend working with
Strings and StringSets instead of pointers and [].
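One caveat if the result really has to cross into C or Python ctypes: as far as I understand, the pointer returned by toCString() belongs to the string it was called on, so it stops being valid once that string goes away. A rough sketch of copying the rows into buffers the caller owns is below. It uses std::string as a stand-in for CharString so that it compiles without SeqAn, and copyRowsToC and freeRows are hypothetical names, not library functions.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Release an array produced by copyRowsToC. Declared extern "C" so that C code
// or Python ctypes can call it by its unmangled name. Hypothetical helper.
extern "C" void freeRows(char **rows, int n) {
    if (rows == nullptr) return;
    for (int i = 0; i < n; ++i) {
        std::free(rows[i]);
    }
    std::free(rows);
}

// Copy each row into its own malloc'd, NUL-terminated buffer so the memory
// outlives the C++ strings. Returns nullptr if an allocation fails.
// std::string stands in for a SeqAn CharString here (an assumption).
char **copyRowsToC(const std::vector<std::string> &rows) {
    // Allocate at least one slot so an empty input is not mistaken for failure.
    std::size_t slots = rows.empty() ? 1 : rows.size();
    char **out = static_cast<char **>(std::malloc(sizeof(char *) * slots));
    if (out == nullptr) return nullptr;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        out[i] = static_cast<char *>(std::malloc(rows[i].size() + 1));
        if (out[i] == nullptr) {
            freeRows(out, static_cast<int>(i));  // undo the partial work
            return nullptr;
        }
        // c_str() is NUL-terminated, so size() + 1 bytes includes the '\0'.
        std::memcpy(out[i], rows[i].c_str(), rows[i].size() + 1);
    }
    return out;
}
```

The caller then has to hand the array back to freeRows when done; from ctypes that means calling it through the same shared library, because the buffers were allocated by that library's malloc.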
> P.S. Here's what I've written so far:
> (I'm using SeqAn 1.4.1 on Ubuntu 16.04.)
Please update to SeqAn2 as SeqAn1 has been deprecated for a while now. The
ubuntu package is called libseqan2-dev. It is available since Ubuntu 17.04,
but can also be installed manually:
http://seqan.readthedocs.io/en/master/Infrastructure/Use/Install.html#library-package
Best regards,
Hannes
--
Hannes Hauswedell
Scientific staff & PhD candidate
Freie Universität Berlin / Max Planck Institute for Molecular Genetics
address Institut für Informatik
Takustraße 9
Room 019
14195 Berlin
telephone +49 (0)30 838-75241
fax +49 (0)30 838-75218
e-mail hannes.hauswedell@[molgen.mpg.de |fu-berlin.de]
_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev
---
René Rahn
Ph.D. Student (de.NBI - CIBI)
--------------------------------
Institute of Computer Science
Algorithmic Bioinformatics (ABI)
--------------------------------
Freie Universität Berlin
Takustraße 9
14195 Berlin
--------------------------------