Re: [Seqan-dev] CheckStreamFormat for FastQ
Hi Manuel,
I did a fresh checkout, but I still see the same problem.
However, if I remove the last 6 records from the file, it is
recognized. I also tried removing the first 6 records instead, to make
sure it is the file size that is causing the issue and not a specific
record. Same result.
In short: it works for me if the file size is <= 8KB.
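
To illustrate what I suspect is going on, here is a small sketch that does
not use SeqAn at all. It is not the SeqAn code: looksLikeFastq,
detectWithLimit and makeFastq are names I made up, and the 8192-byte limit
is only my guess from the file sizes above. The check accepts a buffer only
if it contains complete four-line records, so a buffer that ends inside a
record is rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if `data` consists only of complete 4-line FASTQ records
// ('@' header, sequence, '+' line, quality string of the same length).
// A record that is cut off makes the check fail.
bool looksLikeFastq(std::string const & data)
{
    std::size_t pos = 0;
    std::size_t records = 0;
    while (pos < data.size())
    {
        std::string lines[4];
        for (int i = 0; i < 4; ++i)
        {
            std::size_t end = data.find('\n', pos);
            if (end == std::string::npos)
                return false;  // buffer ended inside a record
            lines[i] = data.substr(pos, end - pos);
            pos = end + 1;
        }
        if (lines[0].empty() || lines[0][0] != '@')
            return false;
        if (lines[2].empty() || lines[2][0] != '+')
            return false;
        if (lines[1].size() != lines[3].size())
            return false;
        ++records;
    }
    return records > 0;
}

// Hypothetical stand-in for a reader that is not allowed to refill its
// buffer: only the first `limit` bytes are visible to the check.
bool detectWithLimit(std::string const & file, std::size_t limit)
{
    return looksLikeFastq(file.substr(0, limit));
}

// Builds a synthetic FASTQ file with `n` records of `readLength` bases.
std::string makeFastq(std::size_t n, std::size_t readLength)
{
    assert(readLength > 0);
    std::string out;
    for (std::size_t i = 0; i < n; ++i)
    {
        out += "@read" + std::to_string(i) + "\n";
        out += std::string(readLength, 'A') + "\n";
        out += "+\n";
        out += std::string(readLength, 'I') + "\n";
    }
    return out;
}
```

With 100 bp reads, detectWithLimit(makeFastq(30, 100), 8192) succeeds
because the whole file fits into the limit, while the same call with 50
records fails because the 8192nd byte falls inside a quality line. Passing
the full file size as the limit makes the 50-record file pass again.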
felix
On Fri, 2012-01-06 at 15:06 +0100, Manuel Holtgrewe wrote:
> I tested the program on the file that you attached and it worked. Does
> the program detect the format of the small file, too?
>
> $ make file_detect
> [...]
> $ ./sandbox/holtgrew/demos/file_detect /tmp/lane_5_p1.fastq
> Detected FASTQ.
>
> Could you try again with a fresh checkout?
>
> On 01/06/2012 02:35 PM, Felix Heeger wrote:
> > Hi Manuel,
> >
> > thank you for your effort. I checked your suggestion today and it did
> > not fix my problem. Also, your example program cannot identify my FASTQ
> > file. I am pretty sure it is valid FASTQ as other programs work fine on
> > it. I attached the first part of the file, if you want to have a look at
> > it.
> >
> > felix
> >
> > On Wed, 2011-12-21 at 18:31 +0100, Manuel Holtgrewe wrote:
> >> Felix,
> >>
> >> The documentation of checkStreamFormat() was misleading. I fixed it in
> >> [10948].
> >>
> >> http://docs.seqan.de/seqan/dev2/?i=Function.checkStreamFormat
> >>
> >> (The documentation is regenerated every hour, so you might have to
> >> wait a bit to see it.)
> >>
> >> The following is a simple example program I compiled and tested. Please
> >> write again if the problem persists.
> >>
> >> HTH,
> >> Manuel
> >>
> >> #include <fstream>
> >> #include <iostream>
> >>
> >> #include <seqan/sequence.h>
> >> #include <seqan/stream.h>
> >>
> >> int main(int argc, char ** argv)
> >> {
> >>     using namespace seqan;
> >>
> >>     if (argc != 2)
> >>         return 1;
> >>     std::fstream in(argv[1]);
> >>
> >>     RecordReader<std::fstream, SinglePass<> > reader(in);
> >>     AutoSeqStreamFormat tagSelector;
> >>     bool b = checkStreamFormat(reader, tagSelector);
> >>     if (!b)
> >>     {
> >>         std::cerr << "Could not detect file format!" << std::endl;
> >>         return 1;
> >>     }
> >>
> >>     // b is true if any format was detected successfully.
> >>     if (tagSelector.tagId == 1)
> >>         std::cerr << "Detected FASTA." << std::endl;
> >>     else if (tagSelector.tagId == 2)
> >>         std::cerr << "Detected FASTQ." << std::endl;
> >>     else
> >>         std::cerr << "Unknown file format!" << std::endl;
> >>     return 0;
> >> }
> >>
> >>
> >> On 12/21/2011 05:15 PM, Felix Heeger wrote:
> >>> Hi,
> >>>
> >>> I have two different functions that I want to call depending on
> >>> whether an input file is in fasta or fastq format.
> >>>
> >>> My approach to this is:
> >>>
> >>>> RecordReader<std::ifstream, SinglePass<> > reader(inFile);
> >>>> if (checkStreamFormat(reader, Fasta()))
> >>>> {
> >>>>     std::cerr << "Input file format is fasta." << std::endl;
> >>>>     [call function for fasta]
> >>>> }
> >>>> else if (checkStreamFormat(reader, Fastq()))
> >>>> {
> >>>>     std::cerr << "Input file format is fastq." << std::endl;
> >>>>     [call function for fastq]
> >>>> }
> >>>> else
> >>>> {
> >>>>     std::cerr << "ERROR: Input file format is not fasta or fastq." << std::endl;
> >>>>     return -1;
> >>>> }
> >>>
> >>> This works fine for fasta. However, my fastq file is not recognized.
> >>> I looked into the code for checkStreamFormat a bit and the file is not
> >>> recognized because the iterator in the readRecord function reaches
> >>> atEnd before the quality meta data for the 35th record is finished (l. 392).
> >>> This happens with two different fastq files.
> >>>
> >>> So my theory is the following:
> >>> In the checkStreamFormat function, LimitRecordReaderInScope
> >>> is used. The documentation states that this prevents the stream from
> >>> "rebuffering". This probably prevents the reader from reading the
> >>> complete record, so the recognition of the file fails.
> >>>
> >>> I hope I made myself clear. I can also provide my code and a sample
> >>> fastq file if that would be helpful.
> >>>
> >>> Cheers,
> >>> felix
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> seqan-dev mailing list
> >>> seqan-dev@lists.fu-berlin.de
> >>> https://lists.fu-berlin.de/listinfo/seqan-dev