Re: [Seqan-dev] CheckStreamFormat for FastQ

  • From: Felix Heeger <fheeger@mi.fu-berlin.de>
  • To: SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Date: Mon, 09 Jan 2012 13:02:52 +0100
  • Reply-to: SeqAn Development <seqan-dev@lists.fu-berlin.de>
  • Subject: Re: [Seqan-dev] CheckStreamFormat for FastQ

Hi,

It's working fine now. Thank you very much for your help.

felix
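
The quoted reply below attributes part of the confusion to the attachment being converted from Unix to Windows line endings in transit. A minimal sketch, using only the C++ standard library, of how one might check which line endings a file actually contains before handing it to a parser. The names `LineEndingCounts`, `countLineEndings` and `countLineEndingsInText` are hypothetical helpers for illustration; they are not part of SeqAn.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Number of line endings of each style found in the input.
struct LineEndingCounts
{
    std::size_t lf = 0;    // bare "\n" (Unix)
    std::size_t crlf = 0;  // "\r\n" (Windows)
};

// Hypothetical helper (not a SeqAn function): scan a stream byte by byte and
// classify every '\n' by whether it is preceded by '\r'.
LineEndingCounts countLineEndings(std::istream & in)
{
    LineEndingCounts counts;
    char prev = '\0';
    char c;
    while (in.get(c))
    {
        if (c == '\n')
        {
            if (prev == '\r')
                ++counts.crlf;
            else
                ++counts.lf;
        }
        prev = c;
    }
    return counts;
}

// Convenience wrapper for data that is already in memory.
LineEndingCounts countLineEndingsInText(std::string const & text)
{
    std::istringstream in(text);
    return countLineEndings(in);
}
```

For a file on disk, one would pass a `std::ifstream` opened with `std::ios::binary`, so that the platform does not translate "\r\n" before the bytes are counted.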

On Mon, 2012-01-09 at 11:09 +0100, Manuel Holtgrewe wrote:
> OK, apparently the file was converted from Unix line endings to Windows
> line endings when it passed through the list.
> 
> r1018 has a fix for the bug. Please update and report back if the problem
> persists.
> 
> HTH
> 
> On 01/09/2012 10:35 AM, Felix Heeger wrote:
> > Hi,
> >
> > My configuration:
> > Ubuntu 11.04 64bit
> > 64 bit processor
> > gcc version 4.5.2
> >
> > felix
> >
> > On Fri, 2012-01-06 at 16:53 +0100, Manuel Holtgrewe wrote:
> >> What is your configuration (OS, 32/64 bit, compiler, version)?
> >>
> >> On 01/06/2012 03:53 PM, Felix Heeger wrote:
> >>> Hi Manuel,
> >>>
> >>> I did a fresh checkout, but I still get the same problem.
> >>>
> >>> However if I remove the last 6 records from the file it will be
> >>> recognized. I also removed the first 6 records to make sure it is the
> >>> file size that is causing the issue and not a specific record. Same
> >>> result.
> >>>
> >>> In short: it works for me if the file size is <= 8 KB.
> >>>
> >>> felix
> >>>
> >>> On Fri, 2012-01-06 at 15:06 +0100, Manuel Holtgrewe wrote:
> >>>> I tested the program on the file that you attached and it worked. Does
> >>>> the program detect the format of the small file, too?
> >>>>
> >>>> $ make file_detect
> >>>> [...]
> >>>> $ ./sandbox/holtgrew/demos/file_detect /tmp/lane_5_p1.fastq
> >>>> Detected FASTQ.
> >>>>
> >>>> Could you try again with a fresh checkout?
> >>>>
> >>>> On 01/06/2012 02:35 PM, Felix Heeger wrote:
> >>>>> Hi Manuel,
> >>>>>
> >>>>> thank you for your effort. I tried your suggestion today and it did
> >>>>> not fix my problem. Your example program cannot identify my FASTQ
> >>>>> file either. I am pretty sure it is valid FASTQ, as other programs work
> >>>>> fine on it. I attached the first part of the file, if you want to have
> >>>>> a look at it.
> >>>>>
> >>>>> felix
> >>>>>
> >>>>> On Wed, 2011-12-21 at 18:31 +0100, Manuel Holtgrewe wrote:
> >>>>>> Felix,
> >>>>>>
> >>>>>> The documentation of checkStreamFormat() was misleading. I fixed it in
> >>>>>> [10948].
> >>>>>>
> >>>>>> http://docs.seqan.de/seqan/dev2/?i=Function.checkStreamFormat
> >>>>>>
> >>>>>> (The documentation is regenerated every hour, so you might wait for a
> >>>>>> bit to see it).
> >>>>>>
> >>>>>> The following is a simple example program I compiled and tested. Please
> >>>>>> write another email, if the problem persists.
> >>>>>>
> >>>>>> HTH,
> >>>>>> Manuel
> >>>>>>
> >>>>>> #include <fstream>
> >>>>>> #include <iostream>
> >>>>>>
> >>>>>> #include <seqan/sequence.h>
> >>>>>> #include <seqan/stream.h>
> >>>>>>
> >>>>>> int main(int argc, char ** argv)
> >>>>>> {
> >>>>>>     using namespace seqan;
> >>>>>>
> >>>>>>     if (argc != 2)
> >>>>>>         return 1;
> >>>>>>     std::fstream in(argv[1]);
> >>>>>>
> >>>>>>     RecordReader<std::fstream, SinglePass<> > reader(in);
> >>>>>>     AutoSeqStreamFormat tagSelector;
> >>>>>>     bool b = checkStreamFormat(reader, tagSelector);
> >>>>>>     if (!b)
> >>>>>>     {
> >>>>>>         std::cerr << "Could not detect file format!" << std::endl;
> >>>>>>         return 1;
> >>>>>>     }
> >>>>>>
> >>>>>>     // b is true if any format was detected successfully.
> >>>>>>     if (tagSelector.tagId == 1)
> >>>>>>         std::cerr << "Detected FASTA." << std::endl;
> >>>>>>     else if (tagSelector.tagId == 2)
> >>>>>>         std::cerr << "Detected FASTQ." << std::endl;
> >>>>>>     else
> >>>>>>         std::cerr << "Unknown file format!" << std::endl;
> >>>>>>     return 0;
> >>>>>> }
> >>>>>>
> >>>>>>
> >>>>>> On 12/21/2011 05:15 PM, Felix Heeger wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I have two different functions that I want to call depending on
> >>>>>>> whether an input file is in FASTA or FASTQ format.
> >>>>>>>
> >>>>>>> My approach to this is:
> >>>>>>>
> >>>>>>>> RecordReader<std::ifstream, SinglePass<> > reader(inFile);
> >>>>>>>> if (checkStreamFormat(reader, Fasta()))
> >>>>>>>> {
> >>>>>>>>     std::cerr << "Input file format is fasta." << std::endl;
> >>>>>>>>     [call function for fasta]
> >>>>>>>> }
> >>>>>>>> else if (checkStreamFormat(reader, Fastq()))
> >>>>>>>> {
> >>>>>>>>     std::cerr << "Input file format is fastq." << std::endl;
> >>>>>>>>     [call function for fastq]
> >>>>>>>> }
> >>>>>>>> else
> >>>>>>>> {
> >>>>>>>>     std::cerr << "ERROR: Input file format is not fasta or fastq." << std::endl;
> >>>>>>>>     return -1;
> >>>>>>>> }
> >>>>>>>
> >>>>>>> This works fine for FASTA. However, my FASTQ file is not recognized.
> >>>>>>> I looked into the code for checkStreamFormat a bit, and the file is not
> >>>>>>> recognized because the iterator in the readRecord function reaches
> >>>>>>> atEnd before the quality meta data for the 35th record is finished (l. 392).
> >>>>>>> This happens with two different FASTQ files.
> >>>>>>>
> >>>>>>> So my theory is the following:
> >>>>>>> The checkStreamFormat function uses LimitRecordReaderInScope. The
> >>>>>>> documentation states that this prevents the stream from
> >>>>>>> "rebuffering". This probably prevents the reader from reading the
> >>>>>>> complete record, so the recognition of the file fails.
> >>>>>>>
> >>>>>>> I hope I have made myself clear. I can also provide my code and a
> >>>>>>> sample FASTQ file if that would help.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> felix
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> seqan-dev mailing list
> >>>>>>> seqan-dev@lists.fu-berlin.de
> >>>>>>> https://lists.fu-berlin.de/listinfo/seqan-dev
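
The hypothesis in the original message above (a detector that may only look at one fixed-size buffer and cannot refill it will reject a file whose last visible record is cut off at the buffer boundary) can be illustrated with a minimal sketch. It uses only the C++ standard library and assumes simple four-line FASTQ records; `looksLikeFastq` and its `allowTruncatedTail` parameter are hypothetical and do not reproduce SeqAn's implementation or the fix that was committed.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical illustration, not SeqAn code. Returns true if `window` (the
// part of the file that is visible to the detector) looks like four-line
// FASTQ. With allowTruncatedTail == false, a final record that is cut off by
// the end of the window makes the whole check fail.
bool looksLikeFastq(std::string const & window, bool allowTruncatedTail)
{
    std::istringstream in(window);
    std::string header, seq, plus, qual;
    std::size_t records = 0;

    while (std::getline(in, header))
    {
        if (header.empty() || header[0] != '@')
            return false;  // not a FASTQ header line

        bool complete = std::getline(in, seq) &&
                        std::getline(in, plus) &&
                        std::getline(in, qual) &&
                        !plus.empty() && plus[0] == '+' &&
                        qual.size() == seq.size();
        if (complete)
        {
            ++records;
            continue;
        }

        // The record is incomplete. Accept it only if the window ended in
        // the middle of it and at least one full record was seen before.
        return allowTruncatedTail && in.eof() && records > 0;
    }
    return records > 0;
}
```

Called on the first N bytes of a file (for example `text.substr(0, N)`), the strict variant fails as soon as N falls inside a record, while the tolerant variant still accepts the prefix. That matches the reported symptom that only small files were recognized, but the thread does not confirm this as the actual root cause.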