From sbdk82@gmail.com Thu Oct 10 05:19:42 2019
MIME-Version: 1.0
From: SR B
Date: Wed, 9 Oct 2019 22:19:27 -0500
To: seqan-dev@lists.fu-berlin.de
Subject: [Seqan-dev] Consensus protein sequence
List-Id: SeqAn Development
Content-Type: multipart/alternative; boundary="000000000000223067059485df6b"

--000000000000223067059485df6b
Content-Type: text/plain; charset="UTF-8"

I am trying to generate a consensus sequence from a list of protein sequences. There is a nice tutorial that does this for nucleotide sequences here:

https://seqan.readthedocs.io/en/seqan-v2.0.2/Tutorial/MultipleSequenceAlignment.html

Below is the code I came up with, but it produces a lot of errors. Could anyone take a look and let me know what is wrong with it?

    globalMsaAlignment(align, Blosum80(-1, -11));
    std::cout << align << std::endl;

    String<ProfileChar<AminoAcid> > profile;
    resize(profile, length(row(align, 0)));
    for (unsigned rowNo = 0; rowNo < 20u; ++rowNo)
        for (unsigned i = 0; i < length(row(align, rowNo)); ++i)
            profile[i].count[ordValue(row(align, rowNo)[i])] += 1;

    // call consensus from this string
    String<AminoAcid> consensus;
    for (unsigned i = 0; i < length(profile); ++i)
    {
        int idx = getMaxIndex(profile[i]);
        if (idx < 20)  // is not gap
            appendValue(consensus, AminoAcid(getMaxIndex(profile[i])));
    }
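
For comparison, the column-wise majority vote that the profile loop is meant to compute can be sketched in plain C++ without SeqAn. This is only an illustration under stated assumptions: the alignment rows are already available as equal-length strings with '-' marking gaps, and the helper name `majorityConsensus` is made up for this sketch and is not a SeqAn function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of SeqAn): majority vote per alignment column.
// Assumes every row has the same length and '-' marks a gap.
std::string majorityConsensus(std::vector<std::string> const & rows)
{
    std::string consensus;
    if (rows.empty())
        return consensus;

    std::size_t const columns = rows.front().size();
    for (std::string const & r : rows)
        assert(r.size() == columns);  // aligned rows share one length

    for (std::size_t col = 0; col < columns; ++col)
    {
        // One counter per possible char value; the gap '-' is counted too.
        std::array<unsigned, 256> counts{};
        for (std::string const & r : rows)
            ++counts[static_cast<unsigned char>(r[col])];

        // Most frequent symbol in this column; ties go to the smaller char code.
        std::size_t best = 0;
        for (std::size_t c = 1; c < counts.size(); ++c)
            if (counts[c] > counts[best])
                best = c;

        // Columns where the gap wins contribute nothing to the consensus.
        if (static_cast<char>(best) != '-')
            consensus += static_cast<char>(best);
    }
    return consensus;
}
```

Two things in this sketch differ from the SeqAn code above and may be worth checking there: the number of rows comes from the container rather than a fixed 20, and the gap is recognised by its symbol rather than by a fixed index. In SeqAn the gap slot of a `ProfileChar<AminoAcid>` is, as far as I can tell, the entry after the last alphabet value, so its index would follow `ValueSize<AminoAcid>::VALUE` and need not be 20.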

--000000000000223067059485df6b--