Hi,

actually it should automatically convert every non-ACGT character to N (or to A for Dna targets). Have you already tried reading your files into strings over the Dna5 alphabet?

Cheers,
David

--
David Weese                      weese@inf.fu-berlin.de
Freie Universität Berlin         http://www.inf.fu-berlin.de/
Institut für Informatik          Phone: +49 30 838 75246
Takustraße 9                     Algorithmic Bioinformatics
14195 Berlin                     Room 021

On 25.04.2012 at 11:08, Bernd Jagla wrote:

> Hi,
>
> I have a couple of genome sequences that contain characters other than ACTGN (e.g. Y, M, ...).
>
> Is there a way to read those sequences in as well and automatically convert those non-conforming letters to N?
>
> Thanks,
>
> Bernd
>
> PS:
>
> I am using:
>
> RecordReader<String<char, MMap<> >, DoublePass<Mapped> > refReader(seqMMapString);
> int read2out = read2(seqIds, faSeqs, refReader, Fasta());
>
> for reading in the data, and I get an INVALID_FORMAT error...
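
For illustration only, the conversion described in the reply (every non-ACGT character becomes N, or A for a four-letter Dna target) can be sketched in plain C++ without SeqAn. The helper names `normalizeBase` and `normalizeSequence` are hypothetical and are not part of the SeqAn API. This shows the intended mapping only, not how the library implements it.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not a SeqAn function): map one character to
// A, C, G or T (case-insensitive); anything else becomes `fallback`.
char normalizeBase(char c, char fallback)
{
    switch (std::toupper(static_cast<unsigned char>(c)))
    {
        case 'A': return 'A';
        case 'C': return 'C';
        case 'G': return 'G';
        case 'T': return 'T';
        default:  return fallback;
    }
}

// Hypothetical helper: apply the mapping to a whole sequence.
// fallback = 'N' mimics a five-letter (Dna5-like) target,
// fallback = 'A' mimics a four-letter (Dna-like) target.
std::string normalizeSequence(std::string seq, char fallback = 'N')
{
    for (char & c : seq)
        c = normalizeBase(c, fallback);
    return seq;
}
```

With these assumptions, `normalizeSequence("ACGTYMNacgt")` yields `ACGTNNNACGT`, and `normalizeSequence("AYG", 'A')` yields `AAG`.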