t-lock and random tests crash on SPARC 32-bit
Closed, ResolvedPublic

jf added a subscriber: jf.
jf added a comment. Jun 4 2016, 6:43 PM

The libgcrypt-1.7.0 t-lock and random test cases consistently crash on
the SPARCv7 platform. On other platforms such as x86, amd64, and SPARCv9
the tests succeed.

--- 8< ---

PASS: t-mpi-point
PASS: curves
/bin/bash: line 5: 6557 Bus Error (core dumped)
GCRYPT_IN_REGRESSION_TEST=1 ${dir}$tst
FAIL: t-lock
PASS: prime

[...]

PASS: fips186-dsa
PASS: aeswrap
PASS: pkcs1v2
random: running './random --in-recursion --early-rng-check' failed
FAIL: random
PASS: dsa-rfc6979

256 of 1026 tests done
512 of 1026 tests done
--- 8< ---
jf renamed this task from libgcrypt-1.7.0 t-lock test crashes on SPARC 32-bit to t-lock and random tests crash on SPARC 32-bit. Jun 4 2016, 6:43 PM
jf set Version to libgcrypt-1.7.0.
jf removed a project: libgcrypt. Jun 4 2016, 7:04 PM

Debugging a bit more in the source code suggests that the problem could
still be related to the misalignment resolved in bug 2144 [1].

There seem to be two problems with the fix [2].

  1. The fix causes two different gpg-error.h header files to be
     generated, depending on whether the current compilation target
     platform is 32-bit or 64-bit. Instead of generating two distinct
     header files, only one header file should be generated (or, better
     said, gen-posix-lock-obj can be called twice, once for the 32-bit
     platform and once for the 64-bit one, but it should always return
     the same content). The decision whether the alignment item is
     added to the struct should be made in the header file, not at
     header file generation time. In that case, the decision is made at
     the compilation time of the libgpg-error consumer (such as
     libgcrypt).

  2. The fix updates only the external gpgrt_lock_t; its internal
     counterpart _gpgrt_lock_t is not updated. As a result, functions
     working with the POSIX mutexes (gpgrt_lock_*()) could access
     misaligned addresses, which results in Bus Errors on SPARC.
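
To illustrate what "misaligned address" means here, the following is a minimal sketch and not code from libgpg-error. The names is_aligned, require_aligned and example_buf are hypothetical and exist only for this illustration. It checks whether an address satisfies a given power-of-two alignment, which is the property a strict-alignment CPU such as SPARC enforces in hardware:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte aligned buffer used to build test addresses. */
static alignas(8) char example_buf[16];

/* Return 1 if address P is a multiple of ALIGN, else 0.
   ALIGN must be a power of two. */
static int is_aligned(const void *p, size_t align)
{
  return ((uintptr_t)p & (align - 1)) == 0;
}

/* Abort early with a diagnostic instead of letting a strict-alignment
   CPU raise SIGBUS on the first load or store through P. */
static void require_aligned(const void *p, size_t align)
{
  assert(is_aligned(p, align));
}
```

An object that starts at example_buf + 4 is fine for a type that needs 4-byte alignment but not for one that needs 8-byte alignment; on x86 such an access usually goes unnoticed, which would be consistent with the tests passing there.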

References:
[1] T2144
[2] https://git.gnupg.org/cgi-bin/gitweb.cgi?p=libgpg-error.git;a=commit;h=f7a77c5c236ecec846de9be46703026f9b01008f

jf removed Version. Jun 4 2016, 7:04 PM
jf added a project: gpgrt.
jf added a comment. Jun 15 2016, 9:18 AM

Please find below a preliminary suggested fix:

--- ./src/gen-posix-lock-obj.c.orig Mon Jun 13 08:07:53 2016
+++ ./src/gen-posix-lock-obj.c Mon Jun 13 08:08:40 2016
@@ -42,21 +42,8 @@
 #endif
 #endif

-/* Special requirements for certain platforms. */
-# define USE_LONG_DOUBLE_FOR_ALIGNMENT 0
-#if defined(__sun) && !defined (__LP64__) && !defined(_LP64)
-/* Solaris on 32-bit architecture. */
-# define USE_DOUBLE_FOR_ALIGNMENT 1
-#else
-# define USE_DOUBLE_FOR_ALIGNMENT 0
-#endif
-#if defined(__hppa__)
-# define USE_16BYTE_ALIGNMENT 1
-#else
-# define USE_16BYTE_ALIGNMENT 0
-#endif

-#if USE_16BYTE_ALIGNMENT && !HAVE_GCC_ATTRIBUTE_ALIGNED
+#if defined(__hppa__) && !HAVE_GCC_ATTRIBUTE_ALIGNED
 # error compiler is not able to enforce a 16 byte alignment
 #endif

@@ -122,12 +109,14 @@
           "\n"
           "#define GPGRT_LOCK_INITIALIZER {%d,{{",
           SIZEOF_PTHREAD_MUTEX_T,
-# if USE_16BYTE_ALIGNMENT
+/* Special requirements for certain platforms. */
+# if defined(__hppa__)
           "    int _x16_align __attribute__ ((aligned (16)));\n",
-# elif USE_DOUBLE_FOR_ALIGNMENT
-          "    double _xd_align;\n",
-# elif USE_LONG_DOUBLE_FOR_ALIGNMENT
-          "    long double _xld_align;\n",
+# elif defined(__sun)
+          "#if (defined(__sparc) || defined(__sparc__)) && \\\n"
+          "    !defined (__LP64__) && !defined(_LP64)\n"
+          "    double _xd_align;\n"
+          "#endif\n",
 # else
           "",
 # endif

jf added a comment. Jun 15 2016, 9:24 AM

Note: the comment 2) in T2378 (jf on Jun 04 2016, 07:04 PM / Roundup) [https://bugs.gnupg.org/gnupg/msg8416]
is not correct. The original text says:

--- 8< ---
  2. The fix updates only the external gpgrt_lock_t; its internal
     counterpart _gpgrt_lock_t is not updated. As a result, functions
     working with the POSIX mutexes (gpgrt_lock_*()) could access
     misaligned addresses, which results in Bus Errors on SPARC.
--- 8< ---

The fact is that _gpgrt_lock_t already contains a pthread_mutex_t, so it
is correctly aligned (on an 8-byte boundary). The problem appears when
the outer gpgrt_lock_t is aligned on a 4-byte boundary while the
internal _gpgrt_lock_t is aligned on an 8-byte boundary.
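
The mismatch described above can be sketched with a few stand-in types. This is an illustration under assumptions, not the real libgpg-error definitions: every example_* name and EXAMPLE_PRIV_SIZE is hypothetical, the struct layouts are only loosely modeled on the generated header, and example_mutex_t merely imitates a pthread_mutex_t that requires the alignment of a double.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical size; the real one comes from SIZEOF_PTHREAD_MUTEX_T. */
#define EXAMPLE_PRIV_SIZE 24

/* Stand-in for pthread_mutex_t: the double member gives it the
   alignment requirement of a double. */
typedef union {
  double align_like_double;
  char bytes[EXAMPLE_PRIV_SIZE];
} example_mutex_t;

/* Stand-in for the internal lock object, which embeds the mutex. */
typedef struct {
  long vers;
  union {
    example_mutex_t mtx;
    long *dummy;
  } u;
} example_internal_lock_t;

/* Stand-in for the public lock object without an alignment member:
   its alignment is only that of long and of a pointer. */
typedef struct {
  long _vers;
  union {
    volatile char _priv[EXAMPLE_PRIV_SIZE];
    long _x_align;
    long *_xp_align;
  } u;
} example_public_plain_t;

/* Same object with the extra double member, as in the suggested fix. */
typedef struct {
  long _vers;
  union {
    volatile char _priv[EXAMPLE_PRIV_SIZE];
    double _xd_align;
    long _x_align;
    long *_xp_align;
  } u;
} example_public_padded_t;

/* With the double member the public type can never be less strictly
   aligned than the internal one, so this holds on every platform. */
static_assert(alignof(example_public_padded_t)
              >= alignof(example_internal_lock_t),
              "padded public lock must satisfy internal alignment");

/* Return 1 if the plain public layout already satisfies the internal
   alignment on the platform this is compiled for, else 0. */
static int example_plain_layout_is_safe(void)
{
  return alignof(example_public_plain_t)
         >= alignof(example_internal_lock_t);
}
```

On a 32-bit SPARC target, where long and pointers are 4 bytes wide but double requires 8-byte alignment, example_plain_layout_is_safe() would be expected to return 0; on typical 64-bit targets it returns 1 because long is already 8-byte aligned.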

werner added a subscriber: werner. Jun 15 2016, 12:00 PM

Re T2378 (jf on Jun 04 2016, 07:04 PM / Roundup): We consider a 32 bit and a 64 bit system different platforms and
thus you get different header files.

jf added a comment. Jul 26 2016, 7:00 PM

IMHO, it's pretty uncommon from a packaging point of view to deliver
different generic header files for different platforms (one encounters
platform-related ifdefs much more often). It necessarily moves the
platform decision rules into the source code of the library consumer.

gniibe set External Link to https://java.net/projects/solaris-userland/pages/Home. Jul 29 2016, 3:25 AM

Distinguishing between 32 and 64 bit Windows in the same development package
works on Windows but only because 64 bit Windows also supports 32 bit Windows.
On most other platforms this is not the case. For a different ABI it is quite
common to require the installation of a platform specific development package.

We won't change the design to support sloppy build systems, which would
only trigger hard-to-find bugs.

werner closed this task as Resolved.
werner claimed this task.