Page MenuHome GnuPG

Error handling in libassuan
Closed, ResolvedPublic

Description

When I connect gpg-agent running with LANG=ja_JP, when an error, it can be like this with longer error message:

$ gpg-connect-agent "SCD getattr KEY-STATUS" /bye
ERR 100663364 オブジェクト内に項目が欠如してい\343 <SCD>

It's "Missing item in object" but Japanese translation text is 18 character long, 54 in bytes in UTF-8, it is truncated in the middle of character, resulted \343.
The message is truncated in libassuan/src/assuan-handler.c (in the function assuan_process_done), formatting with %.50s, thus, it is truncated at the middle of character in the UTF-8 representation.

There are two problems here:
(1) How the error message is truncated
(2) It would be good if an error message is generated at the client side (not server side), and it is client side which does the translation, if needed.

Related Objects

Event Timeline

re 1: Correct utf-8 truncation would be quite some work. In this case the message is in the Assuan interface is a debugging aid. Translation is not necessary so we can try to disable it.

re 2: Given that it is a debugging aid, I don't think that extra code in the cleint is justified

To implement this it would be best to have an gpg_strerror variant which does not call dgettext.

Sorry, I was wrong. It seems that GNU C library has a feature to avoid bad truncation.

It is the gpg_strerror_r which truncates the string.

This problem can be avoided with the following patch to libgpg-error:

diff --git a/configure.ac b/configure.ac
index 53a343b..f496729 100644
--- a/configure.ac
+++ b/configure.ac
@@ -82,6 +82,7 @@ AC_PROG_AWK
 AC_CHECK_TOOL(AR, ar, :)
 AC_USE_SYSTEM_EXTENSIONS
 
+AM_LANGINFO_CODESET
 
 # Taken from mpfr-4.0.1, then modified for LDADD_FOR_TESTS_KLUDGE
 dnl Under Linux, make sure that the old dtags are used if LD_LIBRARY_PATH
diff --git a/src/strerror.c b/src/strerror.c
index 4cce17f..a4bb36e 100644
--- a/src/strerror.c
+++ b/src/strerror.c
@@ -32,6 +32,10 @@
 #include "gettext.h"
 #include "err-codes.h"
 
+#if defined(ENABLE_NLS) && defined(HAVE_LANGINFO_CODESET)
+#include <langinfo.h>
+#endif
+
 /* Return a pointer to a string containing a description of the error
    code in the error value ERR.  This function is not thread-safe.  */
 const char *
@@ -169,9 +173,29 @@ _gpg_strerror_r (gpg_error_t err, char *buf, size_t buflen)
   errstr = dgettext (PACKAGE, msgstr + msgidx[msgidxof (code)]);
   errstr_len = strlen (errstr) + 1;
   cpy_len = errstr_len < buflen ? errstr_len : buflen;
+#if defined(ENABLE_NLS) && defined(HAVE_LANGINFO_CODESET)
+  /* Avoid truncation in the middle of "character" boundary.  */
+  if (buflen && errstr_len > buflen
+      && !strcasecmp (nl_langinfo (CODESET), "UTF-8"))
+    {
+      /* Skip to the boundary */
+      for (; cpy_len; cpy_len--)
+        if (((unsigned char)errstr[cpy_len-1] & 0xC0) != 0x80)
+          break;
+      memcpy (buf, errstr, cpy_len);
+      memset (buf+cpy_len, 0, buflen - cpy_len);
+    }
+  else
+    {
+      memcpy (buf, errstr, cpy_len);
+      if (buflen)
+        buf[buflen - 1] = '\0';
+    }
+#else
   memcpy (buf, errstr, cpy_len);
   if (buflen)
     buf[buflen - 1] = '\0';
+#endif
 
   return cpy_len == errstr_len ? 0 : ERANGE;
 }

But I'm not sure if it's worth to apply.

I guess the strcasecmp (nl_langinfo (CODESET), "UTF-8") results in some overhead, so if we do that what about kicking in only if a truncation is really to happen.

Updated:

diff --git a/configure.ac b/configure.ac
index 53a343b..f496729 100644
--- a/configure.ac
+++ b/configure.ac
@@ -82,6 +82,7 @@ AC_PROG_AWK
 AC_CHECK_TOOL(AR, ar, :)
 AC_USE_SYSTEM_EXTENSIONS
 
+AM_LANGINFO_CODESET
 
 # Taken from mpfr-4.0.1, then modified for LDADD_FOR_TESTS_KLUDGE
 dnl Under Linux, make sure that the old dtags are used if LD_LIBRARY_PATH
diff --git a/src/strerror.c b/src/strerror.c
index 4cce17f..fb1bebf 100644
--- a/src/strerror.c
+++ b/src/strerror.c
@@ -32,6 +32,10 @@
 #include "gettext.h"
 #include "err-codes.h"
 
+#if defined(ENABLE_NLS) && defined(HAVE_LANGINFO_CODESET)
+#include <langinfo.h>
+#endif
+
 /* Return a pointer to a string containing a description of the error
    code in the error value ERR.  This function is not thread-safe.  */
 const char *
@@ -169,9 +173,30 @@ _gpg_strerror_r (gpg_error_t err, char *buf, size_t buflen)
   errstr = dgettext (PACKAGE, msgstr + msgidx[msgidxof (code)]);
   errstr_len = strlen (errstr) + 1;
   cpy_len = errstr_len < buflen ? errstr_len : buflen;
+#if defined(ENABLE_NLS) && defined(HAVE_LANGINFO_CODESET)
+  /* Avoid truncation in the middle of "character" boundary.  */
+  if (buflen && errstr_len > buflen
+      && ((unsigned char)errstr[cpy_len-1] & 0xC0) == 0x80
+      && !strcasecmp (nl_langinfo (CODESET), "UTF-8"))
+    {
+      /* Go back to the boundary */
+      for (; cpy_len; cpy_len--)
+        if (((unsigned char)errstr[cpy_len-1] & 0xC0) != 0x80)
+          break;
+      memcpy (buf, errstr, cpy_len);
+      memset (buf+cpy_len, 0, buflen - cpy_len);
+    }
+  else
+    {
+      memcpy (buf, errstr, cpy_len);
+      if (buflen)
+        buf[buflen - 1] = '\0';
+    }
+#else
   memcpy (buf, errstr, cpy_len);
   if (buflen)
     buf[buflen - 1] = '\0';
+#endif
 
   return cpy_len == errstr_len ? 0 : ERANGE;
 }

(sorry, about my former comment, I only now realized that you did just that already in your original patch)

gniibe claimed this task.

Applied and pushed.