Hi Theo,
we already have nightly tests.
Are you using SequenceStream? What does your source code look like?
What are you reading the sequences into? DnaString? CharString? Can you give more details here?
Your snippet parses nicely with SequenceStream.
Currently, there is a limitation: when reading a sequence into a Dna5String, any character other than C, G, A, T, or N causes an error. In the future, we will resolve this with a configuration object for the readRecord function that lets you switch between raising an error and coercing to N
for other characters (e.g. when there are IUPAC characters indicating an A-C ambiguity).
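
To make the error/coerce-to-N idea concrete, here is a rough sketch in plain standard C++. It is not SeqAn code and not the planned readRecord interface: the names UnknownCharPolicy, isDna5 and toDna5 are made up for illustration, and the sketch only shows the kind of switch such a configuration object could expose.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical policy switch (assumption, not part of the SeqAn API).
enum class UnknownCharPolicy { Error, CoerceToN };

// True for the five Dna5 letters A, C, G, T, N (case-insensitive).
inline bool isDna5(char c)
{
    switch (std::toupper(static_cast<unsigned char>(c)))
    {
    case 'A': case 'C': case 'G': case 'T': case 'N':
        return true;
    default:
        return false;
    }
}

// Returns an upper-case copy of `raw` restricted to the Dna5 alphabet.
// Any other character either throws or becomes 'N', depending on `policy`.
inline std::string toDna5(std::string const & raw, UnknownCharPolicy policy)
{
    std::string out;
    out.reserve(raw.size());
    for (char c : raw)
    {
        if (isDna5(c))
            out.push_back(static_cast<char>(std::toupper(static_cast<unsigned char>(c))));
        else if (policy == UnknownCharPolicy::CoerceToN)
            out.push_back('N');
        else
            throw std::invalid_argument(std::string("non-Dna5 character: ") + c);
    }
    return out;
}
```

With this sketch, toDna5("ACMGT", UnknownCharPolicy::CoerceToN) turns the IUPAC code M (A or C) into N, while the same call with UnknownCharPolicy::Error throws std::invalid_argument, which roughly corresponds to the current behaviour.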
*m
From: Theodore Omtzigt [theo@stillwater-sc.com]
Sent: Saturday, March 02, 2013 12:00 AM
To: SeqAn Dev List
Subject: [Seqan-dev] Fastq test files

I just got a set of FASTQ test files from Illumina BaseSpace and SeqAn is barfing on them reporting INVALID_FORMAT.

s_G1_L001_I1_001.fastq.1,
s_G1_L001_I1_002.fastq.1,
s_G1_L001_R1_001.fastq.1,
s_G1_L001_R1_002.fastq.1,
s_G1_L001_R2_001.fastq.1,
s_G1_L001_R2_002.fastq.1

Here is a quick snippet of the first file:

@:89:A0172:1:1:12008:1323 1:N:0:1
TTAGGC
+
;B@FFF
@:89:A0172:1:1:15627:1329 1:N:0:1
TTAGGC
+
@CCFFF
@:89:A0172:1:1:19263:1331 1:N:0:1
TTAGGC
+
@@CDDF
@:89:A0172:1:1:24249:1331 1:N:0:1
TTAGGC
+
BCCFFF
@:89:A0172:1:1:15721:1332 1:N:0:1
TTAGGC
+
<@<DAD
@:89:A0172:1:1:15433:1333 1:N:0:1
TTAGGC

Would it be possible to include a couple of very short test files in the SeqAn src tree, say under seqan/data, that do pass successfully through readRecord() so that software development can continue while I/O issues are sorted out? Would also be nice to know why these Illumina BaseSpace files don't pass.