AVX2 implementation of BLAKE2b

Authored by jukivili on Jan 14 2018, 3:48 PM.

Description

AVX2 implementation of BLAKE2b

* cipher/Makefile.am: Add 'blake2b-amd64-avx2.S'.
* cipher/blake2.c (USE_AVX2, ASM_FUNC_ABI, ASM_EXTRA_STACK)
(_gcry_blake2b_transform_amd64_avx2): New.
(BLAKE2B_CONTEXT) [USE_AVX2]: Add 'use_avx2'.
(blake2b_transform): Rename to ...
(blake2b_transform_generic): ... this.
(blake2b_transform): New.
(blake2b_final): Pass 'ctx' pointer to transform function instead of
'S'.
(blake2b_init_ctx): Check HW features and enable AVX2 implementation
if supported.
* cipher/blake2b-amd64-avx2.S: New.
* configure.ac: Add 'blake2b-amd64-avx2.lo'.
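
The dispatch pattern described by the blake2.c entries above (rename the old transform to a generic one, add a new wrapper, decide once at context init) can be sketched roughly as follows. This is a minimal illustration, not the committed code: every `sketch_*` and `SKETCH_*` name, the feature-bit value, and the `last_path` field are hypothetical, and the two transform bodies are placeholders that only record which path ran instead of doing any BLAKE2b compression.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical feature bit. The real code queries the library's own
 * hardware-feature detection; this constant is an assumption. */
#define SKETCH_HWF_AVX2 (1u << 0)

enum sketch_path { SKETCH_PATH_NONE = 0, SKETCH_PATH_GENERIC, SKETCH_PATH_AVX2 };

typedef struct {
    uint64_t h[8];              /* placeholder for the hash state */
    int use_avx2;               /* decided once in sketch_init_ctx() */
    enum sketch_path last_path; /* illustration only: which path ran last */
} sketch_ctx;

/* Stand-in for the portable C transform (the renamed original). */
unsigned int sketch_transform_generic(sketch_ctx *ctx, const uint8_t *blks,
                                      size_t nblks)
{
    (void)blks;
    (void)nblks;
    ctx->last_path = SKETCH_PATH_GENERIC;
    return 0;
}

/* Stand-in for the assembly routine; a real build would call into the
 * AVX2 object file here. */
unsigned int sketch_transform_avx2(sketch_ctx *ctx, const uint8_t *blks,
                                   size_t nblks)
{
    (void)blks;
    (void)nblks;
    ctx->last_path = SKETCH_PATH_AVX2;
    return 0;
}

/* New wrapper: takes the whole context so it can read the flag. */
unsigned int sketch_transform(sketch_ctx *ctx, const uint8_t *blks,
                              size_t nblks)
{
    assert(ctx != NULL);
    if (ctx->use_avx2)
        return sketch_transform_avx2(ctx, blks, nblks);
    return sketch_transform_generic(ctx, blks, nblks);
}

/* Check the (hypothetical) feature mask once and remember the result. */
void sketch_init_ctx(sketch_ctx *ctx, unsigned int hw_features)
{
    assert(ctx != NULL);
    memset(ctx, 0, sizeof *ctx);
    ctx->use_avx2 = (hw_features & SKETCH_HWF_AVX2) != 0;
}
```

Passing the context rather than the bare state is what lets the wrapper see the flag, which matches the blake2b_final entry above. Checking the feature once at init keeps the per-block cost to a single branch.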

Benchmark on Intel Core i7-4790K (4.0 GHz, no turbo):

Before:

                |  nanosecs/byte   mebibytes/sec   cycles/byte
 BLAKE2B_512    |      1.07 ns/B     887.8 MiB/s      4.30 c/B

After (~1.4x faster):

                |  nanosecs/byte   mebibytes/sec   cycles/byte
 BLAKE2B_512    |     0.771 ns/B    1236.8 MiB/s      3.08 c/B
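
As a sanity check on the figures above, the columns relate by simple arithmetic: cycles/byte is nanoseconds/byte times the clock in GHz, and the speedup is the ratio of the two ns/B values. The helper names below are hypothetical and exist only for this illustration. With the rounded inputs, 1.07 ns/B at 4.0 GHz gives about 4.28 c/B, slightly below the reported 4.30 c/B, which is presumably because the benchmark derives its columns from unrounded timings.

```c
#include <assert.h>

/* ns/byte * (cycles/ns) = cycles/byte; a clock in GHz is cycles per ns. */
double sketch_cycles_per_byte(double ns_per_byte, double clock_ghz)
{
    assert(clock_ghz > 0.0);
    return ns_per_byte * clock_ghz;
}

/* Ratio of the old per-byte time to the new per-byte time. */
double sketch_speedup(double ns_before, double ns_after)
{
    assert(ns_after > 0.0);
    return ns_before / ns_after;
}
```

For the rows above, `sketch_speedup(1.07, 0.771)` comes out near 1.39, in line with the "~1.4x" label.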

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Committed
jukivili on Feb 4 2018, 5:51 PM
Parents
rCffdc6f3623a0: Fix incorrect counter overflow handling for GCM