rijndael-aesni: small optimization for cbc-enc and cfb-enc
* cipher/rijndael-aesni.c (_gcry_aes_aesni_cfb_enc)
(_gcry_aes_aesni_cbc_enc): Copy contents of 'do_aesni_enc' here and
merge input/output and first/last round key xoring to shorten critical
path.
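The merging idea can be illustrated with a small portable sketch. This is
not the code from cipher/rijndael-aesni.c: 'toy_mix', 'toy_encrypt',
'cbc_enc_plain', 'cbc_enc_merged' and 'cbc_variants_agree' are hypothetical
names, 64-bit words stand in for 128-bit blocks, and the toy round function
only mimics the shape of AES (first key xor, middle rounds, last round
ending in a key xor). It assumes the last round can be written as
mix(state) ^ key, which holds for AESENCLAST. Under that assumption the
chaining xor, the next plaintext xor and the first/last round key xors
collapse into a single xor whose operand does not depend on the previous
block.

```c
#include <stddef.h>
#include <stdint.h>

#define NROUNDS 4 /* toy value for illustration; AES-128 uses 10 rounds */

/* Toy stand-in for the key-independent part of a round.
   Illustration only, NOT cryptographically meaningful. */
static uint64_t toy_mix(uint64_t x)
{
  x ^= x >> 29;
  x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 32;
  return x;
}

/* Toy cipher shaped like AES: first key xor, middle rounds, and a last
   round that ends with a key xor.  rk[] has NROUNDS + 1 entries. */
static uint64_t toy_encrypt(const uint64_t *rk, uint64_t in)
{
  uint64_t s = in ^ rk[0];
  int r;

  for (r = 1; r < NROUNDS; r++)
    s = toy_mix(s) ^ rk[r];
  return toy_mix(s) ^ rk[NROUNDS];
}

/* Straightforward CBC: C[i] = E(P[i] ^ C[i-1]). */
static void cbc_enc_plain(const uint64_t *rk, uint64_t iv,
                          const uint64_t *in, uint64_t *out, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      iv = toy_encrypt(rk, in[i] ^ iv);
      out[i] = iv;
    }
}

/* Merged variant: the only xor left on the block-to-block dependency
   chain uses an operand computed independently of the previous block. */
static void cbc_enc_merged(const uint64_t *rk, uint64_t iv,
                           const uint64_t *in, uint64_t *out, size_t n)
{
  const uint64_t k0_last = rk[0] ^ rk[NROUNDS]; /* merged round keys */
  uint64_t s;
  size_t i;
  int r;

  if (n == 0)
    return;

  s = iv ^ in[0] ^ rk[0]; /* first block: state after first key xor */
  for (i = 0; i < n; i++)
    {
      /* Does not depend on 's', so it is off the critical path. */
      uint64_t k_next = k0_last ^ ((i + 1 < n) ? in[i + 1] : 0);

      for (r = 1; r < NROUNDS; r++)
        s = toy_mix(s) ^ rk[r];
      s = toy_mix(s);           /* last round without its key xor */
      out[i] = s ^ rk[NROUNDS]; /* ciphertext, not on the chain */
      s ^= k_next;              /* equals C[i] ^ P[i+1] ^ rk[0] */
    }
}

/* Returns 1 if both variants produce identical ciphertext. */
static int cbc_variants_agree(void)
{
  static const uint64_t rk[NROUNDS + 1] = { 0x11, 0x22, 0x33, 0x44, 0x55 };
  static const uint64_t in[4] = { 1, 2, 3, 0xdeadbeefULL };
  uint64_t a[4], b[4];
  size_t i;

  cbc_enc_plain(rk, 0x1234, in, a, 4);
  cbc_enc_merged(rk, 0x1234, in, b, 4);
  for (i = 0; i < 4; i++)
    if (a[i] != b[i])
      return 0;
  return 1;
}
```

With AES-NI the remaining xor on the chain would fold into the key operand
of AESENCLAST, so the plain variant's two dependent xors per block (chaining
value with plaintext, then first round key) no longer sit between
consecutive blocks.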
Benchmark on AMD Ryzen 7 5800X:
Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
        CBC enc |     0.541 ns/B      1762 MiB/s      2.62 c/B      4850
        CFB enc |     0.541 ns/B      1762 MiB/s      2.63 c/B      4850
After (5% faster):
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
        CBC enc |     0.515 ns/B      1850 MiB/s      2.50 c/B      4850
        CFB enc |     0.515 ns/B      1851 MiB/s      2.50 c/B      4850
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>