Russian messages are printed in wrong character set.
Version: 2.0.17
But why is it unreadable? I see Cyrillic letters (gpgconsole.jpg) but this is
not Russian text!
I wrote a small .NET test program to see what the console charset is:
Console.WriteLine("Original console charset is 866")
Console.WriteLine("User readable name and codepage of charset: {0} {1}",
Console.OutputEncoding.EncodingName, Console.OutputEncoding.CodePage)
Console.WriteLine()
Console.WriteLine("Set console charset to ANSI (on Russian systems it is 1251)")
Console.OutputEncoding = System.Text.Encoding.Default
Console.WriteLine()
Console.WriteLine("User readable name and codepage of charset: {0} {1}",
Console.OutputEncoding.EncodingName, Console.OutputEncoding.CodePage)
Console.WriteLine("Pay attention to unreadable name of charset in this case.")
The output screenshot is attached.
As far as I can see, the right charset for Russian systems is 866 (CP866).
We use what the system tells us. See set_native_charset in jnlib/utf8conv.c. An
alias for CP866 might be missing. We don't switch the console charset; we use
libiconv to translate between charsets.
Ok, I wrote a small test program in Visual C:
unsigned int cpno;
cpno = GetConsoleOutputCP ();
printf ("CP%u", cpno );
The output is CP866. So GnuPG should correctly recognize the console charset.
See russianconsolecharset_vc.jpg
Perhaps the reason is that CP866 is missing from the aliases ...
According to the libiconv documentation, it recognizes CP866 as a valid charset name.
http://www.gnu.org/s/libiconv/
So GnuPG does not need an alias for it, and
sprintf (codepage, "CP%u", cpno );
newset = codepage;
prepares the right value for newset in
int set_native_charset (const char *newset);
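As a minimal sketch of the naming step quoted above (not the actual GnuPG code), the code page number can be formatted into a bounded buffer. The helper name codepage_to_charset_name is hypothetical; only the "CP%u" format comes from the snippet in this report.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of GnuPG: format a Windows code page
   number as a "CP<number>" charset name such as "CP866".
   Returns 0 on success, -1 if the name does not fit into buf. */
int codepage_to_charset_name(unsigned int cpno, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "CP%u", cpno);

    /* snprintf reports the length it wanted to write; a value >= buflen
       means the output was truncated. */
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}
```

On Windows the number would come from GetConsoleOutputCP (), and the resulting string is what would be handed to set_native_charset.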
A rather strange error:
1.) CP866 is supported by libiconv and successfully recognized by GnuPG,
2.) the Russian messages are correct Russian text in UTF-8 (po/ru.po opened in
Notepad++ as "ANSI as UTF-8"),
but GnuPG prints unreadable text!
I copied some text output from the console to Notepad.exe (the standard Windows program).
See the first line of gpg-consoleoutput-unicode.txt.
It looks exactly like the screenshot (see the line after 'Home:
C:/Users/akir/AppData/Roaming/gnupg' in gpgconsole.jpg).
I decoded this output on http://2cyr.com/decode/ (see the second line of
gpg-consoleoutput-unicode.txt). This site detects windows-1251 as the source
encoding and ibm866 as 'displayed as'. I do not know what that means.
windows-1251 is ANSI charset on Russian Windows
ibm866 is an alias for CP866 (OEM charset on Russian Windows)
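One possible reading of "source windows-1251, displayed as ibm866" is that the bytes were produced in windows-1251 but the console interpreted them as CP866. The sketch below illustrates that effect for the basic Cyrillic letters only. All function names are hypothetical, and bytes outside the two CP866 letter ranges are reported as U+FFFD rather than mapped, which is a simplification of the real code page.

```c
#include <stdint.h>

/* Encode one basic Cyrillic letter (U+0410..U+044F) as a windows-1251
   byte. Returns 0 for anything else (out of scope for this sketch). */
unsigned char cp1251_encode(uint32_t cp)
{
    if (cp >= 0x0410 && cp <= 0x044F)
        return (unsigned char)(0xC0 + (cp - 0x0410));
    return 0;
}

/* Decode one CP866 byte. Only the letter ranges are handled:
   0x80..0xAF -> U+0410..U+043F and 0xE0..0xEF -> U+0440..U+044F.
   Every other byte is reported as U+FFFD in this sketch. */
uint32_t cp866_decode(unsigned char b)
{
    if (b >= 0x80 && b <= 0xAF)
        return 0x0410u + (uint32_t)(b - 0x80);
    if (b >= 0xE0 && b <= 0xEF)
        return 0x0440u + (uint32_t)(b - 0xE0);
    return 0xFFFD;
}

/* What a CP866 console would show for a letter whose byte was
   produced in windows-1251 instead of CP866. */
uint32_t shown_on_cp866_console(uint32_t cp)
{
    return cp866_decode(cp1251_encode(cp));
}
```

For example, shown_on_cp866_console (0x0430) (the letter а) gives 0x0440 (the letter р): still a Cyrillic letter, but the wrong one, which would be consistent with seeing Cyrillic letters that do not form Russian words.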
A test program compiled with MinGW under Windows gives the same result - CP866.
I used just 'gcc console.c' and ran a.exe in MinGW Shell.
#include <windows.h>
#include <stdio.h>
int main()
{
    unsigned int cpno;
    cpno = GetConsoleOutputCP ();
    printf ("CP%u", cpno );
    return 0;
}
I can't compile the whole of GnuPG under Windows to verify it. I suppose nobody can,
because GnuPG for Windows can only be built by cross-compiling.
A fix for this has been included in gpg4win 2.2.2.
GnuPG already converted the output, but to CP_ACP (the ANSI code page) instead of
the code page returned by GetConsoleOutputCP, which was wrong.
Does this now also work for you? I've only tested it with the code page for Germany.