error: call to 'vec_vsx_ld' is ambiguous
Testing, Low, Public

Description

I'm working from libgcrypt master on ppc64le:

$ make -j 3
...
crc-ppc.c:212:23: error: call to 'vec_vsx_ld' is ambiguous
  vector2x_u64 my_p = CRC_VEC_U64_LOAD(0, &consts->my_p[0]);
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
crc-ppc.c:169:4: note: expanded from macro 'CRC_VEC_U64_LOAD'
          vec_vsx_ld((offs), (const unsigned long long *)(ptr))
          ^~~~~~~~~~
/usr/lib/llvm-8/lib/clang/8.0.0/include/altivec.h:11903:1: note: candidate function
vec_vsx_ld(int __a, const vector bool int *__b) {
^
/usr/lib/llvm-8/lib/clang/8.0.0/include/altivec.h:11908:1: note: candidate function
vec_vsx_ld(int __a, const vector signed int *__b) {
^
/usr/lib/llvm-8/lib/clang/8.0.0/include/altivec.h:11913:1: note: candidate function
vec_vsx_ld(int __a, const signed int *__b) {
^
/usr/lib/llvm-8/lib/clang/8.0.0/include/altivec.h:11918:1: note: candidate function
vec_vsx_ld(int __a, const vector unsigned int *__b) {
^
/usr/lib/llvm-8/lib/clang/8.0.0/include/altivec.h:11923:1: note: candidate function
vec_vsx_ld(int __a, const unsigned int *__b) {
^
/usr/lib/llvm-8/lib/clang/8.0.0/include/altivec.h:11928:1: note: candidate function
vec_vsx_ld(int __a, const vector float *__b) {
^
/usr/lib/llvm-8/lib/clang/8.0.0/include/altivec.h:11932:45: note: candidate function
static __inline__ vector float __ATTRS_o_ai vec_vsx_ld(int __a,
                                            ^
...

JW created this task. Apr 1 2020, 4:38 PM
JW created this object in space S1 Public.
JW updated the task description.
gniibe added a subscriber: gniibe. Apr 3 2020, 3:55 AM

Thanks for your report.

I can't reproduce the error (no problem for build). My (cross-)compiler is:

$ powerpc64le-linux-gnu-gcc --version
powerpc64le-linux-gnu-gcc (Debian 9.3.0-8) 9.3.0
...
gniibe claimed this task. Edited Apr 3 2020, 4:25 AM
gniibe closed this task as Invalid.
gniibe triaged this task as Normal priority.

I think this is a compiler issue in its AltiVec (now VSX) support.
The usage is not ambiguous; the ambiguity _is_ in the header file.

It looks to me like it was partially fixed in:
https://github.com/llvm/llvm-project/commit/2b36b15834e3589203b798c357ea032a35929d58

IIUC, the header still needs overloads for more types (such as unsigned long long *).

JW added a comment. Edited Apr 3 2020, 4:43 AM

Hi @gniibe,

I can't reproduce the error (no problem for build). My (cross-)compiler is:

Maybe try a native compile?

I can reproduce it with both GCC and Clang on PowerPC on Travis:

I can also reproduce it on GCC112 on the compile farm. GCC112 is ppc64le.

It looks like the recipe to build the source file is missing the necessary arch options. I.e., -mcpu=power7 -mvsx:

libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I.. -I../src -I../src -I../mpi -I../mpi -I/usr/local/include -g -O2 -fvisibility=hidden -fno-delete-null-pointer-checks -Wall -Wcast-align -Wshadow -Wstrict-prototypes -Wformat -Wno-format-y2k -Wformat-security -W -Wextra -Wbad-function-cast -Wwrite-strings -Wdeclaration-after-statement -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -c crc-ppc.c  -fPIC -DPIC -o .libs/crc-ppc.o

crc-ppc.c: In function 'crc32r_ppc8_ce_bulk':
crc-ppc.c:212:3: error: invalid parameter combination for AltiVec intrinsic __builtin_vec_ld
   vector2x_u64 my_p = CRC_VEC_U64_LOAD(0, &consts->my_p[0]);
   ^~~~~~~~~~~~
crc-ppc.c:213:3: error: invalid parameter combination for AltiVec intrinsic __builtin_vec_ld
   vector2x_u64 k1k2 = CRC_VEC_U64_LOAD(0, &consts->k[1 - 1]);
   ^~~~~~~~~~~~
crc-ppc.c:214:3: error: invalid parameter combination for AltiVec intrinsic __builtin_vec_ld
   vector2x_u64 k3k4 = CRC_VEC_U64_LOAD(0, &consts->k[3 - 1]);
   ^~~~~~~~~~~~
...

You can also test on GCC119 from the compile farm. GCC119 is AIX ppc64be with IBM XLC. You will need CC=xlc and -qarch=pwr7 -qvsx.

And one other thing... the GCC folks recommend _not_ using vec_vsx_ld. It is not part of the PowerPC64 ABI. Also see https://www.google.com/search?q=openpower+ABI+spec

JW added a comment. Edited Apr 3 2020, 4:51 AM

It looks like the recipe to build the source file is missing the necessary arch options. I.e., -mcpu=power7 -mvsx ...

My bad... The vector polynomial multiply is Power8. You should use -mcpu=power8. The VSX unit is part of the CPU spec for Power8, so you don't need -mvsx. VSX was optional in Power7.
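To check which of these features a given build actually enables, one option is to look at the compiler's predefined macros. The sketch below assumes the macros `_ARCH_PWR8`, `__VSX__` and `__CRYPTO__` as GCC and Clang define them for PowerPC targets (for example under -mcpu=power8); the enum, the function names and the flag layout are hypothetical and exist only for this illustration. On a non-PowerPC build every flag is simply zero.

```c
#include <assert.h>
#include <stdio.h>

/* Bit flags for this sketch only; the names are hypothetical. */
enum {
    PPC_FEAT_VSX    = 1 << 0,  /* __VSX__ was predefined     */
    PPC_FEAT_POWER8 = 1 << 1,  /* _ARCH_PWR8 was predefined  */
    PPC_FEAT_CRYPTO = 1 << 2   /* __CRYPTO__ was predefined  */
};

/* Report which target features the compiler was told to assume. */
static int ppc_compile_time_features(void)
{
    int feats = 0;
#if defined(__VSX__)
    feats |= PPC_FEAT_VSX;
#endif
#if defined(_ARCH_PWR8)
    feats |= PPC_FEAT_POWER8;
#endif
#if defined(__CRYPTO__)
    feats |= PPC_FEAT_CRYPTO;
#endif
    return feats;
}

/* Returns 1 when the build targets POWER8 with the crypto unit enabled,
   which is what the vector polynomial multiply needs. */
static int can_use_vpmsum(void)
{
    int feats = ppc_compile_time_features();
    return (feats & PPC_FEAT_POWER8) != 0 && (feats & PPC_FEAT_CRYPTO) != 0;
}

/* Guard for code paths that must not run without the POWER8 crypto unit. */
static void require_vpmsum(void)
{
    assert(can_use_vpmsum() && "build with -mcpu=power8");
}

/* Print a one-line summary, e.g. from a configure-time test program. */
static void print_ppc_features(void)
{
    int feats = ppc_compile_time_features();
    printf("vsx=%d power8=%d crypto=%d\n",
           (feats & PPC_FEAT_VSX) != 0,
           (feats & PPC_FEAT_POWER8) != 0,
           (feats & PPC_FEAT_CRYPTO) != 0);
}
```

Calling `print_ppc_features()` from a small test program built with the same flags as crc-ppc.c shows whether the recipe is passing the arch options at all.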

the GCC folks recommend _not_ using vec_vsx_ld

My bad again... I should have told you what to use... Use vec_xl for Power9 and above. Use vec_xlw4 for Power7 and above. Use vec_xl for Power7 with VSX.

For Power6 and below... use vec_ld for an _aligned_ load. For unaligned loads it gets ugly. See the Altivec Programming Environments Manual (PEM).
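The load advice above can be sketched as a small wrapper that hides the intrinsic choice behind one function. This is an illustration under stated assumptions, not the libgcrypt code: the names `u64x2`, `load_u64x2`, `store_u64x2` and `load_store_roundtrip_ok` are hypothetical, the VSX path assumes an altivec.h that provides `vec_xl` and `vec_xst` for `unsigned long long` pointers (older headers may not), and the non-VSX path is a plain memcpy fallback that exists only so the sketch builds on other targets.

```c
#include <assert.h>
#include <string.h>

#if defined(__VSX__)
# include <altivec.h>
/* Two 64-bit lanes in one vector register. */
typedef __vector unsigned long long u64x2;
#else
/* Stand-in type for targets without VSX (illustration only). */
typedef struct { unsigned long long lane[2]; } u64x2;
#endif

/* Load two 64-bit words; the pointer does not need 16-byte alignment. */
static u64x2 load_u64x2(const unsigned long long *src)
{
    assert(src != NULL);
#if defined(__VSX__)
    return vec_xl(0, src);           /* VSX load */
#else
    u64x2 v;
    memcpy(v.lane, src, sizeof v.lane);
    return v;
#endif
}

/* Store two 64-bit words back to memory. */
static void store_u64x2(u64x2 v, unsigned long long *dst)
{
    assert(dst != NULL);
#if defined(__VSX__)
    vec_xst(v, 0, dst);              /* VSX store */
#else
    memcpy(dst, v.lane, sizeof v.lane);
#endif
}

/* Round-trip check: what is loaded must come back unchanged. */
static int load_store_roundtrip_ok(void)
{
    const unsigned long long in[2] = { 0x0123456789abcdefULL,
                                       0xfedcba9876543210ULL };
    unsigned long long out[2] = { 0, 0 };
    store_u64x2(load_u64x2(in), out);
    return out[0] == in[0] && out[1] == in[1];
}
```

Keeping the intrinsic behind a single function means a compiler-specific workaround only has to be made in one place.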

And when using LLVM, you need Clang 7.1 and above due to this bug: https://bugs.llvm.org/show_bug.cgi?id=39704.


And you might find this header useful. It gives you the different intrinsics for the three PowerPC compilers (GCC, Clang, XLC): ppc_simd.h. For example, here is what an AES encryption abstraction looks like. Each compiler uses something different:

typedef __vector unsigned char   uint8x16_p;
typedef __vector unsigned long long uint64x2_p;
...

template <class T1, class T2>
inline T1 VecEncrypt(const T1 state, const T2 key)
{
#if defined(__ibmxl__) || (defined(_AIX) && defined(__xlC__))
    return (T1)__vcipher((uint8x16_p)state, (uint8x16_p)key);
#elif defined(__clang__)
    return (T1)__builtin_altivec_crypto_vcipher((uint64x2_p)state, (uint64x2_p)key);
#elif defined(__GNUC__)
    return (T1)__builtin_crypto_vcipher((uint64x2_p)state, (uint64x2_p)key);
#else
    CRYPTOPP_ASSERT(0);
#endif
}
gniibe reopened this task as Testing. Apr 3 2020, 5:25 AM
gniibe lowered the priority of this task from Normal to Low.

OK. I'm reopening this ticket to collect information.

You can test with a newer compiler.

JW added a comment. Edited Apr 3 2020, 5:45 AM

You can test with a newer compiler.

OK, let me see what the compile farm offers. Sometimes they provide something newer.

And since you are using the polynomial multiply, here is what you need for intrinsics:

// 32-bit words
inline uint32x4_p VecPolyMultiply(const uint32x4_p& a, const uint32x4_p& b)
{
#if defined(__ibmxl__) || (defined(_AIX) && defined(__xlC__))
    return __vpmsumw (a, b);
#elif defined(__clang__)
    return __builtin_altivec_crypto_vpmsumw (a, b);
#else
    return __builtin_crypto_vpmsumw (a, b);
#endif
}

And:

// 64-bit words
inline uint64x2_p VecPolyMultiply(const uint64x2_p& a, const uint64x2_p& b)
{
#if defined(__ibmxl__) || (defined(_AIX) && defined(__xlC__))
    return __vpmsumd (a, b);
#elif defined(__clang__)
    return __builtin_altivec_crypto_vpmsumd (a, b);
#else
    return __builtin_crypto_vpmsumd (a, b);
#endif
}

(I assume you are calculating the CRC using the 16-byte polynomials and then doing the Barrett Reductions).
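For checking results from the intrinsics above, a portable software model of the 64-bit polynomial multiply can help. The sketch below is not taken from libgcrypt or ppc_simd.h; the names `u128`, `clmul64`, `vpmsumd_model` and `clmul64_selfcheck` are hypothetical. It assumes the usual definition of vpmsumd: each pair of 64-bit lanes is multiplied as polynomials over GF(2) (no carries), and the two 128-bit products are XORed together.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit result split into two 64-bit halves. */
typedef struct { uint64_t hi, lo; } u128;

/* Carry-less product of two 64-bit polynomials over GF(2). */
static u128 clmul64(uint64_t a, uint64_t b)
{
    u128 r = { 0, 0 };
    for (int i = 0; i < 64; i++) {
        if ((b >> i) & 1) {
            r.lo ^= a << i;              /* low 64 bits of a * x^i  */
            if (i > 0)
                r.hi ^= a >> (64 - i);   /* bits shifted out of lo  */
        }
    }
    return r;
}

/* Software model of vpmsumd: XOR of the two per-lane products. */
static u128 vpmsumd_model(const uint64_t a[2], const uint64_t b[2])
{
    u128 p0 = clmul64(a[0], b[0]);
    u128 p1 = clmul64(a[1], b[1]);
    u128 r = { p0.hi ^ p1.hi, p0.lo ^ p1.lo };
    return r;
}

/* Self-check: (x + 1)^2 = x^2 + 1 over GF(2), so 3 * 3 = 5. */
static void clmul64_selfcheck(void)
{
    u128 r = clmul64(3, 3);
    assert(r.hi == 0 && r.lo == 5);
}
```

A bit-by-bit loop like this is far too slow for production use, but it gives a reference to compare against when the three compilers disagree.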

Attached patch should solve the issue for gcc 7.5 and clang 8.

gniibe added a comment. Apr 6 2020, 4:28 AM

@jukivili : Thank you. Please apply & push it.

JW added a comment. Apr 6 2020, 10:21 AM

@jukivili,

I'd be interested in seeing the results of testing the patch. Can you provide a link to the results?

In T4906#133954, @JW wrote:

@jukivili,

I'd be interested in seeing the results of testing the patch. Can you provide a link to the results?

No link, sorry. I reproduced the build issue on a fresh Ubuntu 18.04 ppc64le instance (at https://openpower.ic.unicamp.br/minicloud/).