blake2: avoid AVX/AVX2/AVX512 when CPU has high vector inst latency
e5bc3b28260e

Description

blake2: avoid AVX/AVX2/AVX512 when CPU has high vector inst latency

* cipher/blake2.c (blake2b_init_ctx, blake2s_init_ctx): Disable
AVX/AVX2/AVX512 implementation if x86 CPU prefers GPR implementation
over scalar integer vector.
* src/hwf-common.h (hwf_x86_cpu_details)
(_gcry_hwf_x86_cpu_details): New.
* src/hwf-x86.c (x86_cpu_details, x86_hw_features)
(x86_detect_done, _gcry_hwf_x86_cpu_details): New.
(detect_x86_gnuc): Detect Zen5 and add 'cpu_details'.
(_gcry_hwf_detect_x86): Add 'x86_cpu_details' setup.

The Blake2s/Blake2b AVX/AVX2/AVX512 implementations are slower than the
generic C implementation if the CPU has an integer vector latency higher
than 1 (for example, AMD Zen5 has an int-vector latency of 2) and powerful
GPR execution. Therefore, use the generic C implementation for Blake2
on Zen5.
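
The selection rule described above can be sketched as follows. This is a minimal illustration, not the code from this commit: every identifier prefixed with `sketch_` or `SKETCH_` is hypothetical, the single `prefer_gpr_over_int_vector` field is an assumed stand-in for whatever the new CPU details structure actually carries, and the real feature flags in libgcrypt differ.

```c
#include <stdbool.h>

/* Hypothetical feature bits, for illustration only. */
enum
{
  SKETCH_HWF_AVX    = 1u << 0,
  SKETCH_HWF_AVX2   = 1u << 1,
  SKETCH_HWF_AVX512 = 1u << 2
};

/* Hypothetical stand-in for the per-CPU details; the field name is an
   assumption and is not taken from src/hwf-common.h. */
struct sketch_cpu_details
{
  bool prefer_gpr_over_int_vector;
};

enum sketch_blake2_impl
{
  SKETCH_IMPL_GENERIC_C,
  SKETCH_IMPL_AVX,
  SKETCH_IMPL_AVX2,
  SKETCH_IMPL_AVX512
};

/* Pick an implementation: a CPU that prefers GPR code always gets the
   generic C path, regardless of which vector extensions it reports. */
static enum sketch_blake2_impl
sketch_select_blake2_impl (unsigned int features,
                           const struct sketch_cpu_details *details)
{
  if (details->prefer_gpr_over_int_vector)
    return SKETCH_IMPL_GENERIC_C;

  if (features & SKETCH_HWF_AVX512)
    return SKETCH_IMPL_AVX512;
  if (features & SKETCH_HWF_AVX2)
    return SKETCH_IMPL_AVX2;
  if (features & SKETCH_HWF_AVX)
    return SKETCH_IMPL_AVX;

  return SKETCH_IMPL_GENERIC_C;
}
```

Under these assumptions, a CPU that reports AVX512 but sets the preference flag resolves to `SKETCH_IMPL_GENERIC_C`, while the same feature mask without the flag resolves to `SKETCH_IMPL_AVX512`. Checking the preference before the feature bits keeps the override in one place.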

Generic C with AMD Zen5:

             |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 BLAKE2B_512 |     0.473 ns/B      2016 MiB/s      2.72 c/B      5750
 BLAKE2S_256 |     0.798 ns/B      1195 MiB/s      4.59 c/B      5750

AVX512 with AMD Zen5:

             |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 BLAKE2B_512 |     0.923 ns/B      1033 MiB/s      5.31 c/B      5750
 BLAKE2S_256 |      1.42 ns/B     672.4 MiB/s      8.15 c/B      5749
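
The columns in these tables are related by simple arithmetic, which also gives the size of the slowdown. The helpers below are a small sketch for checking the figures; the function names are hypothetical and are not part of the libgcrypt benchmark tool.

```c
/* cycles/byte = (nanoseconds/byte) * (clock in GHz); the tables report
   the clock in MHz, hence the division by 1000. */
static double
sketch_cycles_per_byte (double ns_per_byte, double clock_mhz)
{
  return ns_per_byte * clock_mhz / 1000.0;
}

/* Ratio of two per-byte costs; a value above 1 means the first
   implementation is that many times slower than the second. */
static double
sketch_slowdown (double slow_ns_per_byte, double fast_ns_per_byte)
{
  return slow_ns_per_byte / fast_ns_per_byte;
}
```

With the numbers above, `sketch_cycles_per_byte (0.473, 5750)` gives about 2.72, matching the generic C row for BLAKE2B_512. `sketch_slowdown (0.923, 0.473)` is about 1.95 and `sketch_slowdown (1.42, 0.798)` is about 1.78, so on this Zen5 system the AVX512 code is slower than generic C by roughly those factors for BLAKE2B_512 and BLAKE2S_256 respectively.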

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Provenance

jukivili

Authored on Dec 21 2025, 5:15 PM

Parents

rC8b538a8c7669: camellia-gfni-avx512: add 1-block constant-time implementation

Event Timeline

• Jan 2 2026, 3:01 PM: jukivili committed rCe5bc3b28260e: blake2: avoid AVX/AVX2/AVX512 when CPU has high vector inst latency (authored by jukivili).

• Jan 29 2026, 12:48 PM: werner mentioned this in T7643: Release Libgcrypt 1.12.0.

Changes (3)

cipher/blake2.c
src/hwf-common.h
src/hwf-x86.c
