Re: [Seqan-dev] CheckStreamFormat for FastQ

  • From: Manuel Holtgrewe <manuel.holtgrewe@fu-berlin.de>
  • To: SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Date: Wed, 21 Dec 2011 18:31:15 +0100
  • Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Subject: Re: [Seqan-dev] CheckStreamFormat for FastQ

Felix,

The documentation of checkStreamFormat() was misleading. I fixed it in [10948].

http://docs.seqan.de/seqan/dev2/?i=Function.checkStreamFormat

(The documentation is regenerated every hour, so you might have to wait a bit before the change shows up.)

The following is a simple example program that I compiled and tested. Please write again if the problem persists.

HTH,
Manuel

#include <fstream>
#include <iostream>

#include <seqan/sequence.h>
#include <seqan/stream.h>

int main(int argc, char ** argv)
{
    using namespace seqan;

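    if (argc != 2)
    {
        std::cerr << "Usage: " << argv[0] << " <sequence file>" << std::endl;
        return 1;
    }
    // Open read-only; the default std::fstream mode (in | out) fails on
    // files that are not writable.
    std::fstream in(argv[1], std::ios::in | std::ios::binary);
    if (!in.good())
    {
        std::cerr << "Could not open " << argv[1] << std::endl;
        return 1;
    }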

    RecordReader<std::fstream, SinglePass<> > reader(in);
    AutoSeqStreamFormat tagSelector;
    bool b = checkStreamFormat(reader, tagSelector);
    if (!b)
    {
        std::cerr << "Could not detect file format!" << std::endl;
        return 1;
    }

    // b is true if any format was detected successfully.
    if (tagSelector.tagId == 1)
        std::cerr << "Detected FASTA." << std::endl;
    else if (tagSelector.tagId == 2)
        std::cerr << "Detected FASTQ." << std::endl;
    else
        std::cerr << "Unknown file format!" << std::endl;
    return 0;
}
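
For comparison, here is a minimal sketch in plain C++ (no SeqAn) of the "detect, then dispatch" idea. It only looks at the first non-whitespace character ('>' for FASTA, '@' for FASTQ), so it is much weaker than checkStreamFormat(), which tries to parse records. The names SeqFileFormat and guessFormat() are hypothetical and exist only for this illustration.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical names for illustration; not part of SeqAn.
enum SeqFileFormat { FORMAT_UNKNOWN, FORMAT_FASTA, FORMAT_FASTQ };

// Guess the format from the first non-whitespace character. The character
// itself is only peeked at, so the caller can still parse from that point.
inline SeqFileFormat guessFormat(std::istream & in)
{
    // Skip leading whitespace and blank lines.
    while (in.peek() != std::char_traits<char>::eof() &&
           std::isspace(static_cast<unsigned char>(in.peek())))
        in.get();

    int c = in.peek();
    if (c == '>')
        return FORMAT_FASTA;
    if (c == '@')
        return FORMAT_FASTQ;
    return FORMAT_UNKNOWN;
}
```

A caller would switch on the returned value and invoke the FASTA or FASTQ code path accordingly, treating FORMAT_UNKNOWN as an error.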


On 12/21/2011 05:15 PM, Felix Heeger wrote:
Hi,

I have two different functions that I want to call depending on whether an
input file is in FASTA or FASTQ format.

My approach to this is:

RecordReader<std::ifstream, SinglePass<> > reader(inFile);
if (checkStreamFormat(reader, Fasta()))
{
     std::cerr << "Input file format is fasta." << std::endl;
     [call function for fasta]
}
else if (checkStreamFormat(reader, Fastq()))
{
     std::cerr << "Input file format is fastq." << std::endl;
     [call function for fastq]
}
else
{
     std::cerr << "ERROR: Input file format is not fasta or fastq." << std::endl;
     return -1;
}

This works fine for FASTA. However, my FASTQ file is not recognized.
I looked into the code of checkStreamFormat, and the file is not
recognized because the iterator in the readRecord function reaches
atEnd before the quality metadata of the 35th record has been read
completely (l. 392). This happens with two different FASTQ files.

So my theory is the following:
The checkStreamFormat function uses LimitRecordReaderInScope.
The documentation states that this prevents the stream from
"rebuffering". This probably keeps the reader from reading the last
record in the buffer completely, so the recognition of the file fails.
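
To illustrate the theory, here is a minimal sketch in plain C++ of a format check that only sees a fixed-size prefix of the file and therefore has to expect that the last record is cut off. It is not how checkStreamFormat is implemented; the names PrefixCheck and checkFastqPrefix() are hypothetical, and the sketch assumes simple four-line FASTQ records.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type for illustration; not part of SeqAn.
enum PrefixCheck { PREFIX_MISMATCH, PREFIX_MATCH };

// Read one line and report success only if it ended with a newline, i.e.
// it was not cut off by the end of the buffer.
inline bool readFullLine(std::istream & in, std::string & line)
{
    return std::getline(in, line) && !in.eof();
}

// Check whether `prefix` (the first bytes of a file) looks like FASTQ.
// A record that is cut off by the end of the prefix is ignored instead of
// being treated as a format error. Assumes four-line records.
inline PrefixCheck checkFastqPrefix(std::string const & prefix,
                                    std::size_t & completeRecords)
{
    completeRecords = 0;
    std::istringstream in(prefix);
    std::string header, seq, plus, qual;
    while (true)
    {
        if (!readFullLine(in, header))
            break;  // no more data, or header cut off
        if (header.empty() || header[0] != '@')
            return PREFIX_MISMATCH;
        if (!readFullLine(in, seq))
            break;  // sequence line cut off
        if (!readFullLine(in, plus))
            break;  // separator line cut off
        if (plus.empty() || plus[0] != '+')
            return PREFIX_MISMATCH;
        if (!readFullLine(in, qual))
            break;  // quality line cut off: the case described above
        if (qual.size() != seq.size())
            return PREFIX_MISMATCH;
        ++completeRecords;
    }
    // Accept only if at least one complete record was seen.
    return completeRecords > 0 ? PREFIX_MATCH : PREFIX_MISMATCH;
}
```

A check that instead reported a mismatch whenever the final record ended early would reject any FASTQ file larger than the buffer, which matches the behaviour described here.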

I hope I have made myself clear. I can also provide my code and a sample
FASTQ file if that would be helpful.

Cheers,
felix



_______________________________________________
seqan-dev mailing list
seqan-dev@lists.fu-berlin.de
https://lists.fu-berlin.de/listinfo/seqan-dev



  • References:
    • [Seqan-dev] CheckStreamFormat for FastQ
      • From: Felix Heeger <fheeger@mi.fu-berlin.de>