On Sunday, 10.04.2011, at 23:27 +0200, manner.moe@gmx.de wrote:
> gconvert has following syntax:
> gchar * g_convert (const gchar *str, gssize len, const gchar *to_codeset,
> const gchar *from_codeset, gsize *bytes_read, gsize *bytes_written, GError
> **error);
> "to_codeset" and "from_codeset" have to be changed.
> I think your patch tries to convert utf8 into a valid himd encoding format.

You are right. That function has a confusing parameter order.

> My previous patch uses "length = strlen(string)" inside himd_add_string() and
> doesn't need the string length as an argument of himd_add_string().

Right.

> Your current patch uses "gsize length" uninitialized.

Wrong. It is initialized by g_convert. And you *cannot* achieve what you
want using strlen. You need the length of the converted string in bytes.
The converted string might be in UTF-16, and strlen cannot work on
UTF-16 strings, because they routinely contain zero bytes.

> Compilation fails with error "too many arguments to function "himd_add_string".

Right. I forgot to add the patches to himddump.c.

> If I understand your patch correctly your patch depends on strings encoded in
> utf8.

Yes. This is intended, because the user of libhimd should not need to
care about which encodings are understood by the Hi-MD format. As we can
store UTF-16 on Hi-MD, all Unicode characters can be represented (which
does not mean the Hi-MD Walkman can necessarily display them). UTF-8 can
also represent all Unicode characters, and UTF-8 is the more common
character set in the Linux community.

> This is true for strings taken from get_songinfo() in himddump.c which
> reads the strings from the id3 tag and converts them to utf8 automatically.

Right.

> For example, if the user computers encoding is SHIFT_JIS we shouldn't convert
> it to utf8 and then try to reconvert it to make himd_add_string work.

Why is that?
In the general case, the user's encoding might be something completely
different, like Big5 (a Chinese character set), Latin-5 (ISO-8859-9, a
Turkish character set) or Codepage 1251 (the default Cyrillic character
set on Windows). Passing data in an arbitrary character set does not
make sense to me.

Also note that with Windows NT and later, the preferred character set is
Unicode (the wide character functions all use UTF-16LE), the internal
character set of Qt is Unicode (using UTF-16 with native endianness),
and the default character set of Gtk is also Unicode (represented as
UTF-8). So if you get the string from a GUI, you can most likely get it
as Unicode. The conversions Unicode->Shift_JIS and Shift_JIS->Unicode
(especially if both directions are performed by glib) should be
completely reversible and thus do no harm.

Regards,
  Michael Karcher
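
PS: To illustrate the strlen point, here is a minimal sketch in plain C.
It is not code from the patch and not part of libhimd or glib. Both
helper names (ascii_to_utf16be, naive_length) are made up for this
example, the input is restricted to ASCII to keep the encoder trivial,
and the big-endian byte order is picked only for illustration. The
encoder returns the byte count of the converted string, which is the
figure g_convert hands back through bytes_written.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (illustration only): encode a plain ASCII string
 * as UTF-16BE into out. Returns the number of bytes written, excluding
 * the two-byte terminator, or 0 if the buffer is too small or the input
 * contains non-ASCII bytes. */
static size_t ascii_to_utf16be(const char *in, unsigned char *out,
                               size_t outsize)
{
    size_t n = strlen(in);   /* fine here: the input is a byte string */
    size_t i;

    if (outsize < 2 * n + 2)
        return 0;
    for (i = 0; i < n; i++) {
        if ((unsigned char)in[i] > 0x7f)
            return 0;
        out[2 * i] = 0x00;                       /* high byte */
        out[2 * i + 1] = (unsigned char)in[i];   /* low byte */
    }
    out[2 * n] = 0x00;       /* UTF-16 terminator, not counted */
    out[2 * n + 1] = 0x00;
    return 2 * n;
}

/* What strlen reports when pointed at the converted buffer: it stops at
 * the first zero byte, which for ASCII text in UTF-16BE is the very
 * first byte. */
static size_t naive_length(const unsigned char *buf)
{
    return strlen((const char *)buf);
}
```

Encoding "Abc" this way writes 6 bytes, while naive_length on the same
buffer stops at the leading zero byte. That is why the byte count has to
come from the conversion routine itself.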