chacha20: add RISC-V vector intrinsics implementation

* cipher/Makefile.am: Add 'chacha20-riscv-v.c' and add
ENABLE_RISCV_VECTOR_INTRINSICS_EXTRA_CFLAGS handling for
'chacha20-riscv-v.o' and 'chacha20-riscv-v.lo'.
* cipher/chacha20-riscv-v.c: New.
* cipher/chacha20.c (USE_RISCV_V): New.
(CHACHA20_context_s): Add 'use_riscv_v'.
[USE_RISCV_V] (_gcry_chacha20_riscv_v_blocks)
(_gcry_chacha20_riscv_v_check_hw): New.
(chacha20_blocks) [USE_RISCV_V]: Add RISC-V vector code path.
(chacha20_do_setkey) [USE_RISCV_V]: Add HW feature detection for
RISC-V vector implementation.
* configure.ac: Add 'chacha20-riscv-v.lo'.

This patch adds a RISC-V vector extension implementation of ChaCha20.
A variable-length vector implementation is used for large inputs
(4 or more blocks), and a fixed-width 128-bit vector implementation
is used for shorter inputs.
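
The selection rule above can be sketched roughly as follows. This is
only an illustration of the 4-block threshold, not code from the patch:
the enum, its labels and select_rvv_path() are hypothetical names, and
the treatment of a zero block count is an assumption.

```c
/* Hypothetical labels for the two code paths; not names from the patch. */
enum chacha20_rvv_path
{
  RVV_PATH_NONE,          /* no full block to process (assumed case) */
  RVV_PATH_FIXED_128,     /* fixed-width 128-bit vector path */
  RVV_PATH_VARIABLE_LEN   /* variable-length vector path */
};

/* Threshold taken from the description: 4 or more blocks are "large". */
enum { RVV_LARGE_INPUT_MIN_BLOCKS = 4 };

/* Pick a code path from the number of 64-byte ChaCha20 blocks. */
static enum chacha20_rvv_path
select_rvv_path (unsigned long nblks)
{
  if (nblks == 0)
    return RVV_PATH_NONE;
  if (nblks >= RVV_LARGE_INPUT_MIN_BLOCKS)
    return RVV_PATH_VARIABLE_LEN;
  return RVV_PATH_FIXED_128;
}
```

For example, under these assumptions select_rvv_path (3) selects the
fixed-width path and select_rvv_path (4) selects the variable-length
path.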

Benchmark on SpacemiT K1 (1600 MHz):

Before:
 CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |     10.67 ns/B     89.37 MiB/s     17.07 c/B

After (3x faster):
 CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |      3.41 ns/B     279.9 MiB/s      5.45 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>