User Details
- User Since: Mar 27 2017, 4:48 PM (399 w, 2 d)
- Availability: Available
Sat, Nov 9
Aug 28 2024
Thanks. Test works in my nightly builds now.
Aug 22 2024
Aug 8 2024
Aug 7 2024
Do you have any way to test PAC/BTI on actual hardware that supports these extensions?
Aug 5 2024
This excludes 32-bit ARM assembly from Aarch64 builds:
Aug 4 2024
Here's the patch:
This patch should fix the issue:
OK, so aarch64 assembly would need PAC and BTI support. As far as I understand, PAC instructions are not needed with the current assembly, as none of it stores or loads the LR register (all aarch64 assembly functions are leaf functions). So only BTI is needed, and that is basically the same modification as CET on x86.
Jul 29 2024
Jul 27 2024
"rijndael-vaes-avx2-i386.S" should not be built for x86-64, but until now that has not had any effect, as the #ifdefs in that source file result in an empty object file on x86-64.
Jul 26 2024
Here are patches for adding CET support to x86-64 and i386 assembly.
OpenBSD carries a libgcrypt patch for CET which adds the endbr64 instruction to the CFI_STARTPROC() macro in "asm-common-amd64.h". We could do the same and also add endbr32 for i386. That would be the easiest way to add the required endbr instructions. OpenBSD also has a patch for arm64 that adds similar BTI instructions to the aarch64 variant of CFI_STARTPROC.
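To make the idea concrete, here is a minimal sketch of a procedure-start macro that also emits a landing pad. This is not the OpenBSD patch or the libgcrypt header: the SKETCH_* names are hypothetical, and the stringifying helper exists only so the expansion can be inspected from C. In a real .S file the macro body would be emitted directly by the preprocessor.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the macros discussed above.  The real
   header would define CFI_STARTPROC(); these names are assumptions. */
#define SKETCH_ENDBR endbr64
#define SKETCH_CFI_STARTPROC() .cfi_startproc; SKETCH_ENDBR

/* Two-level stringification so the argument is macro-expanded first. */
#define SKETCH_STR_(x) #x
#define SKETCH_STR(x) SKETCH_STR_(x)

/* Return the text that the macro would place at a function entry. */
static const char *sketch_startproc_text(void)
{
  return SKETCH_STR(SKETCH_CFI_STARTPROC());
}

/* Usage example: the landing pad must follow the CFI directive. */
static void sketch_startproc_selfcheck(void)
{
  const char *text = sketch_startproc_text();
  const char *cfi = strstr(text, "cfi_startproc");
  const char *endbr = strstr(text, "endbr64");

  assert(cfi != NULL && endbr != NULL);
  assert(cfi < endbr);
}
```

With this shape, every function that already starts with the macro gets the endbr instruction without touching the individual assembly files. An i386 variant would use endbr32 in the same position.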
There is -O flag munging for "tiger.o" in "cipher/Makefile.am", an old workaround for a broken compiler, I think. IMHO the tiger.o special case can and should be removed.
Jul 7 2024
Jun 24 2024
Jun 23 2024
Jun 22 2024
I tried to reproduce the issue with the clang/w32 toolchain from https://github.com/mstorsjo/llvm-mingw, but there the build worked even with CFI directives.
Hm, CFI directives should not be used on the WIN32 target. This patch should solve the issue for now:
Thanks for testing. I pushed this fix to libgcrypt master.
Jun 21 2024
Just to make sure, did you use the updated version of the patch? I edited the message with the fix candidate and changed the attachment.
Jun 20 2024
Here's a fix candidate (edited, new try):
Algos 329 and 330 are the new CSHAKE128 and CSHAKE256 digest algos. It looks like s390x only supports accelerating SHA3 and SHAKE, as only the SHA3 and SHAKE suffixes are supported (see keccak_final_s390x()). So s390x acceleration needs to be disabled for the CSHAKE algos.
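As a rough illustration of that kind of guard (not the actual fix candidate), the decision could be sketched as below. The function and enum names are hypothetical; only the algo numbers 329 and 330 come from the comment above, and the hardware flag is a stand-in for whatever feature detection the real code uses.

```c
#include <assert.h>

/* Algo IDs taken from the comment above; the names are hypothetical. */
enum
{
  SKETCH_MD_CSHAKE128 = 329,
  SKETCH_MD_CSHAKE256 = 330
};

/* Decide whether the accelerated Keccak path may be used.
   hw_has_keccak is an assumed flag from hardware feature detection. */
static int sketch_use_s390x_accel(int algo, int hw_has_keccak)
{
  assert(algo > 0);

  /* The accelerated finalization only knows the SHA3 and SHAKE
     suffixes, so CSHAKE has to stay on the generic code path. */
  if (algo == SKETCH_MD_CSHAKE128 || algo == SKETCH_MD_CSHAKE256)
    return 0;

  return hw_has_keccak != 0;
}
```

The point of the sketch is only that the check happens per algorithm, so SHA3 and SHAKE keep their acceleration while CSHAKE falls back to the generic implementation.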
May 29 2024
I left review comments in GitLab. One additional concern is the license for mpi-mul-cs.c, as the original code has no copyright information: "does not have any copyright information, assuming public domain".
May 9 2024
May 8 2024
Thanks for the report. I've applied this change to master.
Apr 30 2024
Mar 1 2024
Looks good to me. __CLOBBER_CC is needed as PA-RISC has carry/borrow bits in the status register for add/sub instructions.
Feb 28 2024
No, a hardware barrier is not needed here. A compiler barrier is used here to prevent the optimizer from removing the mask generation and its use in the following constant-time code.
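A minimal sketch of the pattern, assuming a GCC-compatible compiler for the empty inline asm statement. The sketch_* names are hypothetical and are not the helpers used in libgcrypt. The barrier emits no instruction at all; it only makes the value opaque to the optimizer so the mask arithmetic cannot be turned back into a branch.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier: the empty asm claims to modify v, so the
   optimizer cannot reason about its value.  No CPU fence is emitted.
   Without a GCC-compatible compiler this sketch has no barrier. */
static inline uint32_t sketch_opt_barrier(uint32_t v)
{
#if defined(__GNUC__)
  __asm__ volatile ("" : "+r" (v));
#endif
  return v;
}

/* Return 0xFFFFFFFF if x != 0, else 0, without branching on x. */
static inline uint32_t sketch_ct_nonzero_mask(uint32_t x)
{
  uint32_t bit = (x | (0u - x)) >> 31;  /* 1 if x != 0, else 0 */
  return 0u - sketch_opt_barrier(bit);
}

/* Select a when mask is all-ones, b when mask is zero. */
static inline uint32_t sketch_ct_select(uint32_t mask, uint32_t a, uint32_t b)
{
  return (a & mask) | (b & ~mask);
}

/* Usage example. */
static void sketch_ct_selfcheck(void)
{
  assert(sketch_ct_select(sketch_ct_nonzero_mask(7), 10, 20) == 10);
  assert(sketch_ct_select(sketch_ct_nonzero_mask(0), 10, 20) == 20);
}
```

A hardware memory barrier would order memory accesses between CPUs, which is unrelated to the problem here; the only concern is what the compiler is allowed to assume about the mask.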
Feb 4 2024
Dec 21 2023
Fix for i386 assembly pushed to master and 1.10 branch.
Dec 19 2023
It looks like this is a bit more problematic than I thought. Building i386 with the "-O2 -fsanitize=undefined" flags now fails. I need to think a little more about how to handle this.
Dec 18 2023
Dec 16 2023
The attached patch should work around the issue:
Nov 4 2023
Oct 23 2023
Yes, int8_t/int16_t/int32_t/uint8_t/uint16_t/uint32_t should not be used. There are size-specific integer types defined in src/types.h which can be used instead (byte/u16/u32). This header does not yet have signed integer types, but those can be added (for example, s8/s16/s32).
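One possible shape for such signed types, as a sketch only. The names s8/s16/s32 come from the suggestion above, but the underlying type choices and the compile-time checks are assumptions about common platforms, not what src/types.h does.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical signed counterparts to byte/u16/u32.  The underlying
   types are assumptions that hold on common ABIs; the static checks
   make the build fail where they do not. */
typedef signed char s8;
typedef short s16;
typedef int s32;

_Static_assert(CHAR_BIT == 8, "s8 must be exactly 8 bits");
_Static_assert(sizeof(s16) == 2, "s16 must be exactly 16 bits");
_Static_assert(sizeof(s32) == 4, "s32 must be exactly 32 bits");

/* Usage example: the types are signed and have the expected widths. */
static void sketch_types_selfcheck(void)
{
  assert((s8)-1 < 0);
  assert((s16)-1 < 0);
  assert((s32)-1 < 0);
  assert(sizeof(s8) == 1);
}
```

A real header would more likely pick the underlying types from configure-time size checks rather than hard-coding them as done here.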
Oct 17 2023
Oct 15 2023
A few comments on the patches:
- There are many functions that use buffers on the stack. Do those contain secrets? Should those buffers be wiped before returning from the function (with wipememory())? For example, "mlkem_check_secret_key" has two buffers, "shared_secret_1" and "shared_secret_2", which are not wiped.
- mlkem.c: mlkem_check_secret_key: "memcmp" is used to compare shared secrets. Should this use a constant-time comparison instead?
- mlkem-common.c: _gcry_mlkem_mlkem_shake256_rkprf:
  - _gcry_md_hash_buffers_extract can be used here instead of _gcry_md_open&write&extract&close.
- mlkem-symmetric.c: _gcry_mlkem_shake256_prf:
  - _gcry_md_hash_buffers_extract can be used here instead of _gcry_md_open&write&extract&close. Temporary buffer usage can be avoided by passing the input buffers as two IOVs to _gcry_md_hash_buffers_extract.
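The two points about wiping stack buffers and comparing secrets in constant time could look roughly like this. It is a standalone sketch: the sketch_* helpers are hypothetical replacements for wipememory() and for whatever constant-time comparison the library provides, and the 32-byte buffer size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Overwrite a buffer through a volatile pointer so the stores are not
   optimized away.  Hypothetical stand-in for wipememory(). */
static void sketch_wipe(void *p, size_t n)
{
  volatile unsigned char *vp = p;

  while (n--)
    *vp++ = 0;
}

/* Return 1 if the buffers are equal, 0 otherwise.  Every byte is
   always read, so timing does not depend on where they differ. */
static int sketch_ct_memequal(const void *a, const void *b, size_t n)
{
  const volatile unsigned char *pa = a;
  const volatile unsigned char *pb = b;
  unsigned int diff = 0;
  size_t i;

  for (i = 0; i < n; i++)
    diff |= (unsigned int)(pa[i] ^ pb[i]);

  /* diff is in 0..255; (diff - 1) >> 8 has its low bit set only
     when diff is 0. */
  return (int)(1u & ((diff - 1u) >> 8));
}

/* Compare two secrets via stack copies and wipe the copies before
   returning.  The 32-byte size is an assumption for this sketch. */
static int sketch_check_secrets(const unsigned char *in1,
                                const unsigned char *in2, size_t n)
{
  unsigned char secret_1[32];
  unsigned char secret_2[32];
  int equal;

  assert(n <= sizeof secret_1);
  memcpy(secret_1, in1, n);
  memcpy(secret_2, in2, n);

  equal = sketch_ct_memequal(secret_1, secret_2, n);

  sketch_wipe(secret_1, sizeof secret_1);
  sketch_wipe(secret_2, sizeof secret_2);
  return equal;
}
```

Unlike memcmp, the comparison does not stop at the first differing byte, and the wipe runs on every return path because there is only one.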
Sep 30 2023
Sep 15 2023
I just started wondering how much of this slowdown is because the MinGW libc does not have very well optimized memcpy/memmove/memchr/strlen/etc. Are there profiling tools like 'perf' on Linux that could be used for Windows builds?