AES-NI/OCB: Optimize last and first key XORing
* cipher/rijndael-aesni.c (aesni_ocb_enc, aesni_ocb_dec) [__x86_64__]:
Reorder and mix first and last key XORing with OCB offset XOR operations.
The OCB pre-XOR and post-XOR with the block offset can be mixed and
reordered with the first and last round-key XORs of the AES cipher,
since XOR is associative and commutative. This commit uses that to
further optimize AES-NI/OCB encryption and decryption.
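The identity behind this can be sketched in portable C. This is a minimal illustration, not the code from rijndael-aesni.c: `block_t`, `toy_middle`, `ocb_enc_plain` and `ocb_enc_merged` are hypothetical names, and `toy_middle` is a stand-in for everything AES does between the first and last round-key XORs. The sketch only assumes the cipher has the shape E(x) = F(x ^ k_first) ^ k_last, which holds for AES.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit block type used only for this illustration. */
typedef struct { uint8_t b[16]; } block_t;

static block_t xor_block(block_t a, block_t b)
{
  block_t r;
  for (int i = 0; i < 16; i++)
    r.b[i] = a.b[i] ^ b.b[i];
  return r;
}

/* Stand-in for the part of the cipher between the first and last
 * round-key XORs. It is NOT AES; any fixed function shows the identity. */
static block_t toy_middle(block_t x)
{
  block_t r;
  for (int i = 0; i < 16; i++)
    r.b[i] = (uint8_t)(x.b[(i + 5) % 16] * 3 + i);
  return r;
}

/* Straightforward OCB block: XOR offset, encrypt, XOR offset again.
 * Four block XORs per block. */
static block_t ocb_enc_plain(block_t p, block_t offset,
                             block_t k_first, block_t k_last)
{
  block_t x = xor_block(p, offset);          /* OCB pre-XOR        */
  x = xor_block(x, k_first);                 /* first round-key XOR */
  x = toy_middle(x);
  x = xor_block(x, k_last);                  /* last round-key XOR  */
  return xor_block(x, offset);               /* OCB post-XOR       */
}

/* Merged form: fold the offset into the first and last round keys,
 * so only one XOR touches the data on each side of the middle part. */
static block_t ocb_enc_merged(block_t p, block_t offset,
                              block_t k_first, block_t k_last)
{
  block_t in_mask = xor_block(offset, k_first);
  block_t out_mask = xor_block(offset, k_last);
  return xor_block(toy_middle(xor_block(p, in_mask)), out_mask);
}

static int blocks_equal(block_t a, block_t b)
{
  return memcmp(a.b, b.b, sizeof a.b) == 0;
}
```

Both functions return the same block for any input, because (p ^ offset) ^ k_first equals p ^ (offset ^ k_first), and likewise on the output side. With AES-NI the final round instruction already XORs in a round key, so a merged output mask can presumably be passed in place of the last round key at no extra cost.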
Benchmark on Intel Haswell:
Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
        OCB enc |     0.174 ns/B      5468 MiB/s     0.697 c/B      3998
        OCB dec |     0.170 ns/B      5617 MiB/s     0.679 c/B      3998
After (enc ~11% faster, dec ~6% faster):
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
        OCB enc |     0.157 ns/B      6065 MiB/s     0.629 c/B      3998
        OCB dec |     0.160 ns/B      5956 MiB/s     0.640 c/B      3998
For reference, CTR:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
        CTR enc |     0.157 ns/B      6090 MiB/s     0.626 c/B      3998
        CTR dec |     0.157 ns/B      6092 MiB/s     0.626 c/B      3998
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>