Wrong charset in console messages (Cyrillic, Windows)
Closed, Duplicate (Public)

Description

Russian messages are printed in the wrong character set.

Details

Version
2.0.17

Event Timeline

kiav added projects: gnupg, Bug Report.
kiav added a subscriber: kiav.

We always output plain UTF-8 on Windows.

But why is it unreadable? I see Cyrillic letters (gpgconsole.jpg) but this is
not Russian text!

I wrote a small test program for .NET to see what the console charset is:

Console.WriteLine("Original console charset is 866")
Console.WriteLine("User readable name and codepage of charset: {0} {1}",
Console.OutputEncoding.EncodingName, Console.OutputEncoding.CodePage)
Console.WriteLine()

Console.WriteLine("Set console charset to ANSI (on Russian systems it is 1251)")
Console.OutputEncoding = System.Text.Encoding.Default
Console.WriteLine()

Console.WriteLine("User readable name and codepage of charset: {0} {1}",
Console.OutputEncoding.EncodingName, Console.OutputEncoding.CodePage)
Console.WriteLine("Pay attention to unreadable name of charset in this case.")

The output screenshot is attached.

As far as I can see, the right charset for Russian systems is 866 (CP866).

We use what the system tells us. See jnlib/utf8conv.c:set_native_charset. An
alias for CP866 might be missing. We don't switch the console charset but use
libiconv to translate between charsets.
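
The translation step mentioned here can be sketched roughly as follows. This is only an illustration and not GnuPG's code: utf8_to_native is a hypothetical helper, and it assumes a POSIX iconv() that accepts the charset name "CP866".

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not GnuPG code): convert a NUL-terminated UTF-8
   string to the charset named by TO, for example "CP866".  Returns the
   number of bytes written to OUT (not counting the terminating NUL),
   or -1 on error.  */
static long
utf8_to_native (const char *to, const char *utf8, char *out, size_t outsize)
{
  iconv_t cd;
  char *inptr;
  char *outptr;
  size_t inleft, outleft;

  assert (to && utf8 && out && outsize > 0);

  inptr = (char *) utf8;        /* iconv() takes a non-const pointer */
  inleft = strlen (utf8);
  outptr = out;
  outleft = outsize - 1;        /* reserve room for the terminator */

  cd = iconv_open (to, "UTF-8");
  if (cd == (iconv_t) -1)
    return -1;                  /* charset name not known to iconv */

  if (iconv (cd, &inptr, &inleft, &outptr, &outleft) == (size_t) -1)
    {
      iconv_close (cd);
      return -1;                /* invalid input or buffer too small */
    }
  iconv_close (cd);

  *outptr = '\0';
  return (long) (outptr - out);
}
```

With such a helper the console itself stays untouched; only the bytes written to it change, so the target charset name has to match what the console actually uses.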

Ok, I wrote a small test program in Visual C:

unsigned int cpno;

cpno = GetConsoleOutputCP ();
printf ("CP%u", cpno );

The output is CP866. So GnuPG should correctly recognize the console charset.
See russianconsolecharset_vc.jpg

Perhaps the reason is the absence of CP866 in the aliases ...

According to the libiconv documentation, it recognizes CP866 as a valid charset name.
http://www.gnu.org/s/libiconv/

So GnuPG does not need to convert an alias for it and

sprintf (codepage, "CP%u", cpno );
newset = codepage;

prepares the right value for newset in

int set_native_charset (const char *newset);
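
A portable sketch of that name-building step, for illustration only: codepage_to_charset_name is a hypothetical helper, not a GnuPG function. On Windows the number would come from GetConsoleOutputCP (); here it is passed in as a parameter so the snippet compiles anywhere, and snprintf is used so that a short buffer is reported instead of overflowing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: build the charset name "CP<number>" for a console
   code page number.  Returns 0 on success, or -1 if BUF is too small to
   hold the name plus its terminating NUL.  */
static int
codepage_to_charset_name (unsigned int cpno, char *buf, size_t buflen)
{
  int n;

  assert (buf && buflen > 0);

  n = snprintf (buf, buflen, "CP%u", cpno);
  if (n < 0 || (size_t) n >= buflen)
    return -1;                  /* encoding error or truncated output */
  return 0;
}
```

For cpno = 866 the buffer ends up holding "CP866", which is the string that would then be handed on as newset.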

Rather strange error:

1.) CP866 is supported by libiconv and successfully recognized by GnuPG,
2.) Russian messages are correct Russian text in UTF-8 (po/ru.po opened in
Notepad++ as "ANSI as UTF-8")

but GnuPG prints unreadable text!

I copied some text output from the console to Notepad.exe (standard Windows program).
See the first line of gpg-consoleoutput-unicode.txt.

It looks exactly like the screenshot (see the line after 'Home:
C:/Users/akir/AppData/Roaming/gnupg' in gpgconsole.jpg).

I decoded this output at http://2cyr.com/decode/ (see the second line of
gpg-consoleoutput-unicode.txt). The site detects windows-1251 as the source
encoding and ibm866 as 'displayed as'. I do not know what that means.

windows-1251 is ANSI charset on Russian Windows
ibm866 is an alias for CP866 (OEM charset on Russian Windows)

A test program compiled with MinGW under Windows gives the same result - CP866.
I used just 'gcc console.c' and ran a.exe in MinGW Shell.

#include <windows.h>
#include <stdio.h>

int main()
{
  unsigned int cpno;

  cpno = GetConsoleOutputCP ();
  printf ("CP%u", cpno );

  return 0;
}

I can't build all of GnuPG under Windows to verify this. I suppose nobody can,
because GnuPG for Windows can only be built by cross-compiling.

A fix for this has been included in gpg4win 2.2.2.

GnuPG was already converting the output, but to CP_ACP (the ANSI code page)
instead of the code page returned by GetConsoleOutputCP, which was wrong.
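
To see why that mismatch still shows Cyrillic letters, just the wrong ones, here is a small self-contained sketch. It is not GnuPG code: the function names are made up, and the mappings are hand-written and cover only the basic Russian alphabet (U+0410..U+044F) in windows-1251 and the Cyrillic rows of CP866.

```c
/* Encode one code point of the basic Russian alphabet (U+0410..U+044F)
   as windows-1251, the ANSI code page on Russian Windows.
   Returns 0 for code points outside that range.  */
static unsigned char
unicode_to_cp1251 (unsigned int cp)
{
  if (cp >= 0x0410 && cp <= 0x044F)
    return (unsigned char) (0xC0 + (cp - 0x0410));
  return 0;
}

/* Decode one CP866 byte the way a Russian console would display it.
   Only the Cyrillic ranges are covered; everything else (box drawing
   characters and so on) is reported as 0 in this demo.  */
static unsigned int
cp866_to_unicode (unsigned char b)
{
  if (b >= 0x80 && b <= 0xAF)   /* U+0410..U+043F */
    return 0x0410 + (b - 0x80);
  if (b >= 0xE0 && b <= 0xEF)   /* U+0440..U+044F */
    return 0x0440 + (b - 0xE0);
  if (b == 0xF0)
    return 0x0401;              /* capital IO */
  if (b == 0xF1)
    return 0x0451;              /* small io */
  return 0;
}

/* The character a CP866 console shows when the text was converted to
   windows-1251 instead of CP866.  */
static unsigned int
shown_on_cp866_console (unsigned int cp)
{
  return cp866_to_unicode (unicode_to_cp1251 (cp));
}
```

For example, U+0438 (и) is encoded as 0xE8 in windows-1251, and a CP866 console renders 0xE8 as U+0448 (ш): still Cyrillic, but not the intended letter.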

Does this now also work for you? I've only tested it with the code page for Germany.

Yes. Thank you.

aheinecke claimed this task.
aheinecke removed a project: Restricted Project.