Camellia: Tweaks for AES-NI implementations

* cipher/camellia-aesni-avx-amd64.S: Align stack to 16 bytes; tweak
key-setup for small speed up.
* cipher/camellia-aesni-avx2-amd64.S: Use vmovdqu even with aligned
stack; reorder vinsert128 instructions; use rbp for stack frame.
--
Use of 'vmovdqa' with ymm registers produces noticeable scatter in timing
measurements. With 'vmovdqu' instead, repeated measurements produce more
stable results.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>