Likely fixed by commit a4d1595a2638db63ac4c73e722c8ba95fdd85ff7 (rijndael-aesni: split assembly block to ease register pressure) in the 1.7 branch (included in 1.7.3 and later).
Jul 13 2017
Jul 6 2017
I did some experimenting and the clang SIGILL does not trigger with the commonly used, but non-conforming, variable-length-object idiom (the "struct hack"), as below:
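(A minimal sketch of that idiom, with illustrative names rather than the exact test code from the report:)

#include <stdlib.h>
#include <string.h>

struct msg
{
  size_t len;
  char data[1];   /* "struct hack": more than one byte is actually stored here */
};

static struct msg *
msg_new (const char *s)
{
  size_t len = strlen (s);
  /* Over-allocate so that writes past data[0] stay inside the allocation. */
  struct msg *m = malloc (sizeof (struct msg) + len);
  if (!m)
    return NULL;
  m->len = len;
  memcpy (m->data, s, len + 1);   /* non-conforming, but widely used */
  return m;
}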
Jun 18 2017
May 21 2017
Apr 11 2017
Feb 26 2017
How about this patch?
Does the attached patch fix the problem?
Feb 4 2017
Jan 25 2017
I have now learnt how GCC uses 'undefined behavior' for aggressive optimization
and that this could break code doing unaligned accesses even on x86. So this
needs to be fixed after all.
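For illustration (my own sketch, not code from libgcrypt): a cast like the one below is formally undefined whenever p + i is not 4-byte aligned, and an optimizing compiler may exploit that assumption, for example by vectorizing with aligned loads, even though a plain x86 MOV would tolerate the misalignment.

#include <stdint.h>
#include <stddef.h>

/* Illustrative only: XOR-fold a buffer word by word. */
uint32_t
xor_fold (const unsigned char *p, size_t n)
{
  uint32_t s = 0;
  size_t i;
  for (i = 0; i + sizeof (uint32_t) <= n; i += sizeof (uint32_t))
    s ^= *(const uint32_t *)(p + i);   /* UB if p + i is not 4-byte aligned */
  return s;
}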
Dec 21 2016
The attached patch should solve the LTO problems with rijndael-ssse3-amd64.c.
The 'memcpy' problem seems to be caused by a bad interaction between -flto and
#pragma "no-sse". Strangely, switching memcpy to buf_cpy solved the problem, even
though buf_cpy itself just uses memcpy (on x86).
With that issue solved, I ran into a problem with the rijndael-ssse3 assembly code
blocks going missing with -flto and the link failing. So the rest of the changes in
the patch are for fixing the LTO visibility of the assembly.
Jul 2 2016
Currently, there is no need for an alignmask API. The implementations that we have at
the moment can handle unaligned data, and some have fast paths for word-aligned
in/out buffers (which malloc can provide).
We could add a section to the documentation about appropriate memory alignment for best
performance, and recommend aligning buffers to the cache-line size.
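(As an illustration of what such a documentation note might suggest; the 64-byte cache line is an assumption, and the allocation call is plain POSIX posix_memalign:)

#include <stdlib.h>

/* Allocate a buffer aligned to an assumed 64-byte cache line. */
static void *
alloc_cacheline_aligned (size_t size)
{
  void *buf = NULL;
  if (posix_memalign (&buf, 64, size) != 0)
    return NULL;
  return buf;
}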
Hello,
I posted a fix for this issue to the mailing list. See:
http://marc.info/?l=gcrypt-devel&m=146732375910584&w=2
Mar 25 2016
The current code is perfectly fine, as crc-intel-pclmul.c is an i386/amd64-only source
file and those target architectures can handle unaligned loads.
Sep 7 2015
Fixed by commit 92fa5f16d69707e302c0f85b2e5e80af8dc037f1
Mar 11 2015
Unaligned memory accesses are enabled only on architectures that can handle
them. The buf_xor function that you partially copy-pasted to Stack Overflow
actually has alignment checks:
#if defined(__i386__) || defined(__x86_64__) || \
    defined(__powerpc__) || defined(__powerpc64__) || \
    (defined(__arm__) && defined(__ARM_FEATURE_UNALIGNED)) || \
    defined(__aarch64__)
/* These architectures are able of unaligned memory accesses and can
   handle those fast. */
# define BUFHELP_FAST_UNALIGNED_ACCESS 1
#endif

...

/* Optimized function for buffer xoring */
static inline void
buf_xor(void *_dst, const void *_src1, const void *_src2, size_t len)
{
  byte *dst = _dst;
  const byte *src1 = _src1;
  const byte *src2 = _src2;
  uintptr_t *ldst;
  const uintptr_t *lsrc1, *lsrc2;
#ifndef BUFHELP_FAST_UNALIGNED_ACCESS
  const unsigned int longmask = sizeof(uintptr_t) - 1;

  /* Skip fast processing if buffers are unaligned. */
  if (((uintptr_t)dst | (uintptr_t)src1 | (uintptr_t)src2) & longmask)
    goto do_bytes;
#endif

  ldst = (uintptr_t *)(void *)dst;
  lsrc1 = (const uintptr_t *)(const void *)src1;
  lsrc2 = (const uintptr_t *)(const void *)src2;

  for (; len >= sizeof(uintptr_t); len -= sizeof(uintptr_t))
    *ldst++ = *lsrc1++ ^ *lsrc2++;

  dst = (byte *)ldst;
  src1 = (const byte *)lsrc1;
  src2 = (const byte *)lsrc2;

#ifndef BUFHELP_FAST_UNALIGNED_ACCESS
do_bytes:
#endif
  /* Handle tail. */
  for (; len; len--)
    *dst++ = *src1++ ^ *src2++;
}
So, yes, we use unaligned memory accesses but only when it is known that they work.
Now, the solution to this issue (with the same code generation, but without undefined
behaviour) is to tell the compiler that we really want to do unaligned accesses. For
that we need to change the accesses to happen through a type that has one-byte
alignment but generates the same code (unaligned word-size memory accesses) on the
few architectures that enable 'BUFHELP_FAST_UNALIGNED_ACCESS':
#ifdef BUFHELP_FAST_UNALIGNED_ACCESS
/* Define type with one-byte alignment on architectures with fast unaligned
   memory accesses. */
typedef struct bufhelp_int_s
{
  uintptr_t a;
} __attribute__((packed, aligned(1))) bufhelp_int_t;
#else
/* Define type with default alignment for other architectures (unaligned
   accesses handled in per-byte loops). */
typedef struct bufhelp_int_s
{
  uintptr_t a;
} bufhelp_int_t;
#endif
Of course, BUFHELP_FAST_UNALIGNED_ACCESS now needs to be limited to compilers that
support GCC-style attributes.
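(To illustrate how this type would be used, here is a sketch of the word loop in
buf_xor rewritten against bufhelp_int_t; this is my illustration, not necessarily the
exact final patch:)

  bufhelp_int_t *ldst;
  const bufhelp_int_t *lsrc1, *lsrc2;
  ...
  ldst = (bufhelp_int_t *)(void *)dst;
  lsrc1 = (const bufhelp_int_t *)(const void *)src1;
  lsrc2 = (const bufhelp_int_t *)(const void *)src2;

  /* Same word-at-a-time loop, but the packed/aligned(1) type tells the
     compiler these may be unaligned accesses. */
  for (; len >= sizeof(bufhelp_int_t); len -= sizeof(bufhelp_int_t))
    (ldst++)->a = (lsrc1++)->a ^ (lsrc2++)->a;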