libgpg-error: assertion error on Solaris/sparc
Closed, ResolvedPublic

Description

In https://mail-index.netbsd.org/pkgsrc-users/2015/05/11/msg021538.html Ibraheem
Saleh report an assertion problem on Solaris 10 sparc with gcc 4.9.2

Tim Zingelman found a fix for this (see
https://mail-index.netbsd.org/pkgsrc-users/2015/11/07/msg022575.html). I'll
include this here completely:

On Mon, 11 May 2015, Ibraheem Saleh wrote:

Building libgpg-error 1.19 on Solaris 10 sparc with gcc 4.9.2 succeeds
without error but coredumps when trying to use it.

bmake test fails all 5 of the tests with the same assertion message:

gmake[2]: Entering directory
`/usr/pkgsrc/security/libgpg-error/work/libgpg-error-1.19/tests'
Assertion failed: !"sizeof lock obj", file posix-lock.c, line 119
Abort (core dumped)
FAIL: t-version

Anyone know whats going on or how to fix this?

I'm thinking that the problem is that in _gpgrt_lock_t the second
element ends up being 64 bit aligned (a long of padding after vers)
giving a total sizeof=32 and no such alignment happens for
gpgrt_lock_t so sizeof=28. I added a printf before the assert and it
reported those two exact different sizes, so that supports the theory.

typedef struct
{

long _vers;
union {
  volatile char _priv[24];
  long _x_align;
  long *_xp_align;
} u;

} gpgrt_lock_t;

typedef struct
{

long vers;

#if USE_POSIX_THREADS

union {
  pthread_mutex_t mtx;
  long *dummy;
} u;

#endif
} _gpgrt_lock_t;

I just tested on Solaris 10 Sparc and the following patch resolves the
problem. I'm not sure if this patch is appropriate for all platforms
or only Sparc. If someone else has the cycles to pass this fix
upstream that would be great.
(Also note I did NOT take out the patch from
patches/patch-src_estream.c when testing this fix... it seemed
unrelated to the assert failure.)

  • src/gen-posix-lock-obj.c.orig Thu Nov 5 16:44:59 2015

+++ src/gen-posix-lock-obj.c Thu Nov 5 16:49:19 2015
@@ -109,7 +109,7 @@

"  union {\n"
"    volatile char _priv[%d];\n"
"%s"
  • " long _x_align;\n"

+ " long long _x_align;\n"

"    long *_xp_align;\n"
"  } u;\n"
"} gpgrt_lock_t;\n"

Details

Version
1.19
wiz set Version to 1.19.
wiz added a subscriber: wiz.
werner claimed this task.Nov 11 2015, 4:31 PM
werner added a subscriber: werner.
wiz added a comment.Feb 27 2016, 9:33 AM

Thank you for the patch.
I don't have the environment, but I asked the original reporter to test.
Sadly, his reply is negative, see:

https://mail-index.netbsd.org/pkgsrc-users/2016/02/27/msg023071.html

Let me clarify/confirm. Does it work on Solaris? And now do you speak for NetBSD?
My fix is specific to Solaris (no matter if it's Sparc or not). It doesn't
handle any issues for NetBSD.

I seems that Sparc GNU/Linux doesn't have this alignment issue, but (for me) it
is highly likely that sparc architecutre requires the alignment of 8-byte.

wiz added a comment.Feb 29 2016, 9:06 AM

I'm working on pkgsrc, which is a portable packaging system origination on
NetBSD. I myself work mostly on NetBSD.

However, we have patches for non-NetBSD platforms in pkgsrc, and this patch was
worked on by the two people mentioned earlier. Since I can not test on Solaris,
I asked them to test on Solaris, and Ibraheem did that.

I hope that clears it up.

gniibe added a comment.Mar 1 2016, 2:32 AM

Thank you for clarification. I didn't know that pkgsrc supports other
platforms. Now, I understand.

I changed more:
http://git.gnupg.org/cgi-bin/gitweb.cgi?p=libgpg-error.git;a=commitdiff;h=34b07146bbb0863436fc9381a84314b18bdfb807;hp=49655fb6ef39b307787e1b6e00c996f9c7db64f7

The intention is that USE_DOUBLE_FOR_ALIGNMENT for Solaris 32-bit version.
I thought that checking ILP32 worked (but not, in fact). I believe that LP64
checking works (at least with GCC).

wiz added a comment.Mar 2 2016, 12:47 AM

Ibraheem very kindly tested again. However, it is still not working completely.
He writes:

It's still core dumping... Out of curiosity, I explicitly defined

'USE_DOUBLE_FOR_ALIGNMENT 1' and the checks were passing on Solaris with no more
core dumps. I guess that means they're on the right track, just have to get the
preprocessor directives right for gcc and Solaris.

Full details are in
https://mail-index.netbsd.org/pkgsrc-users/2016/03/01/msg023078.html

jf added a subscriber: jf.Apr 14 2016, 5:25 PM

The attached is an analysis from the Solaris/SPARC point of view.

One of the possible SPARC specific fixes:

  • ./src/posix-lock-obj.h.orig Wed Apr 13 08:24:20 2016

+++ ./src/posix-lock-obj.h Wed Apr 13 08:24:25 2016
@@ -29,7 +29,7 @@

typedef struct
{

  • long vers;

+ long long vers;
#if USE_POSIX_THREADS

union {
  pthread_mutex_t mtx;
  • ./src/gen-posix-lock-obj.c.orig Wed Apr 13 08:23:59 2016

+++ ./src/gen-posix-lock-obj.c Wed Apr 13 08:24:29 2016
@@ -66,7 +66,7 @@

int i;

#endif

struct {
  • long vers;

+ long long vers;
#ifdef USE_POSIX_THREADS

pthread_mutex_t mtx;

#endif
@@ -105,7 +105,7 @@

   union and include a long and a pointer to a long.  */
printf ("typedef struct\n"
        "{\n"
  • " long _vers;\n"

+ " long long _vers;\n"

"  union {\n"
"    volatile char _priv[%d];\n"
"%s"

@@ -138,7 +138,7 @@

printf ("/* Dummy object - no locking available.  */\n"
        "typedef struct\n"
        "{\n"
  • " long _vers;\n"

+ " long long _vers;\n"

           "} gpgrt_lock_t;\n"
           "\n"
           "#define GPGRT_LOCK_INITIALIZER {%d}\n",

Note, that this was not fully tested on other platforms and might need
additional changes in the header files. I did some minor tests on
Solaris amd64/SPARCv9/SPARCv7, Linux amd64/SPARCv9.

jf added a comment.Apr 14 2016, 5:25 PM

Thank you for your patch. Yes, we already located the issue is the alignment.
I think that it were good if the MTX were placed at the head. While your patch
works, it changes ABI of the lock object for existing archs, unfortunately.

I fixed the detection of Solaris in:
http://git.gnupg.org/cgi-bin/gitweb.cgi?p=libgpg-error.git;a=commit;h=f7a77c5c236ecec846de9be46703026f9b01008f

And I believe that the bug reported here had gone.
Please test current development master.

This particular problem of assertion error, it was fixed in 1.22. So, I close this.
We also have T2378 for a possible change for Solaris/sparc. Please continue
there.

gniibe closed this task as Resolved.