AVX implementation of BLAKE2s

Authored by jukivili on Feb 8 2018, 6:45 PM.

Description

AVX implementation of BLAKE2s

* cipher/Makefile.am: Add 'blake2s-amd64-avx.S'.
* cipher/blake2.c (USE_AVX, _gcry_blake2s_transform_amd64_avx): New.
(BLAKE2S_CONTEXT) [USE_AVX]: Add 'use_avx'.
(blake2s_transform): Rename to ...
(blake2s_transform_generic): ... this.
(blake2s_transform): New.
(blake2s_final): Pass 'ctx' pointer to transform function instead of
'S'.
(blake2s_init_ctx): Check HW features and enable AVX implementation
if supported.
* cipher/blake2s-amd64-avx.S: New.
* configure.ac: Add 'blake2s-amd64-avx.lo'.
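
The dispatch pattern described in the entries above (a generic transform, an AVX transform, and a wrapper that picks one based on a flag set at init time) can be sketched roughly as follows. This is only an illustration, not the code from cipher/blake2.c: every `demo_*` name is hypothetical, the transform bodies are placeholders that only record which path ran, and GCC/Clang's `__builtin_cpu_supports` stands in for libgcrypt's own HW feature detection.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLOCKBYTES   64  /* BLAKE2s processes 64-byte blocks */
#define DEMO_PATH_GENERIC 0
#define DEMO_PATH_AVX     1

/* Hypothetical context; not the real BLAKE2S_CONTEXT layout. */
typedef struct
{
  uint32_t h[8];       /* placeholder for the hash state */
  int use_avx;         /* decided once, at init time */
  size_t blocks_done;  /* demo bookkeeping only */
  int last_path;       /* demo bookkeeping only */
} demo_ctx;

/* Assumption: compiler builtin used in place of libgcrypt's HW feature
   query.  Reports "no AVX" on other compilers or architectures. */
static int
demo_cpu_has_avx (void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports ("avx") ? 1 : 0;
#else
  return 0;
#endif
}

/* Placeholder for the portable C transform. */
static void
demo_transform_generic (demo_ctx *ctx, const uint8_t *blks, size_t nblks)
{
  (void) blks;
  ctx->blocks_done += nblks;
  ctx->last_path = DEMO_PATH_GENERIC;
}

/* Placeholder for the assembly transform; a real one would live in a
   separate .S file and be declared here. */
static void
demo_transform_avx (demo_ctx *ctx, const uint8_t *blks, size_t nblks)
{
  (void) blks;
  ctx->blocks_done += nblks;
  ctx->last_path = DEMO_PATH_AVX;
}

/* Wrapper: takes the whole context so it can read the use_avx flag. */
static void
demo_transform (demo_ctx *ctx, const uint8_t *blks, size_t nblks)
{
  if (ctx->use_avx)
    demo_transform_avx (ctx, blks, nblks);
  else
    demo_transform_generic (ctx, blks, nblks);
}

/* Init: clear the context and check HW features once. */
static void
demo_init (demo_ctx *ctx)
{
  memset (ctx, 0, sizeof *ctx);
  ctx->use_avx = demo_cpu_has_avx ();
}
```

Passing the context rather than the bare state to the wrapper is what lets it read the flag, which is presumably why blake2s_final now passes 'ctx' instead of 'S'.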

Benchmark on Intel Core i7-4790K (4.0 GHz, no turbo):

Before:

                |  nanosecs/byte   mebibytes/sec   cycles/byte
 BLAKE2S_256    |      1.77 ns/B     538.2 MiB/s      7.09 c/B

After (~1.3x faster):

                |  nanosecs/byte   mebibytes/sec   cycles/byte
 BLAKE2S_256    |      1.34 ns/B     711.4 MiB/s      5.36 c/B
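
As a rough cross-check of how the three columns relate, the sketch below derives cycles/byte and MiB/s from the ns/byte figure at the stated 4.0 GHz clock. The helper names are made up for illustration. Because the printed ns/B values are rounded to two decimals, the derived numbers match the table only to within rounding.

```c
/* Hypothetical helpers relating the benchmark columns. */

/* cycles/byte = (ns/byte) * (cycles/ns), and 1 GHz is 1 cycle per ns. */
static double
demo_cycles_per_byte (double ns_per_byte, double clock_ghz)
{
  return ns_per_byte * clock_ghz;
}

/* MiB/s = bytes per second / 2^20, with bytes per second = 1e9 / (ns/byte). */
static double
demo_mib_per_sec (double ns_per_byte)
{
  return 1e9 / ns_per_byte / (1024.0 * 1024.0);
}

/* Speedup as the ratio of time per byte before and after. */
static double
demo_speedup (double ns_before, double ns_after)
{
  return ns_before / ns_after;
}
```

For example, `demo_speedup (1.77, 1.34)` is about 1.32, in line with the "~1.3x faster" figure, and `demo_cycles_per_byte (1.34, 4.0)` gives 5.36.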

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Committed
jukivili on Feb 16 2018, 6:28 PM
Parents
rCaf7fc732f9a7: AVX2 implementation of BLAKE2b