blowfish: add ARMv6 assembly implementation

* cipher/Makefile.am: Add 'blowfish-armv6.S'.
* cipher/blowfish-armv6.S: New file.
* cipher/blowfish.c (USE_ARMV6_ASM): New macro.
[USE_ARMV6_ASM] (_gcry_blowfish_armv6_do_encrypt)
(_gcry_blowfish_armv6_encrypt_block)
(_gcry_blowfish_armv6_decrypt_block, _gcry_blowfish_armv6_ctr_enc)
(_gcry_blowfish_armv6_cbc_dec, _gcry_blowfish_armv6_cfb_dec): New
prototypes.
[USE_ARMV6_ASM] (do_encrypt, do_encrypt_block, do_decrypt_block)
(encrypt_block, decrypt_block): New functions.
(_gcry_blowfish_ctr_enc) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(_gcry_blowfish_cbc_dec) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
(_gcry_blowfish_cfb_dec) [USE_ARMV6_ASM]: Use ARMv6 assembly function.
* configure.ac (blowfish) [arm]: Add 'blowfish-armv6.lo'.

Patch provides non-parallel implementations for a small speed-up and 2-way
parallel implementations (CTR, CBC decryption and CFB decryption) that get
accelerated on multi-issue CPUs (hand-tuned for the in-order dual-issue
Cortex-A8). Unaligned access handling is done in assembly.
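
As an illustration of why the decryption modes can be processed two blocks
at a time, here is a minimal C sketch of 2-way CBC decryption. It is not the
code from this patch: toy_encrypt/toy_decrypt are hypothetical stand-ins for
a 64-bit block cipher (not Blowfish), and blocks are held as uint64_t values
rather than byte buffers, so unaligned access is not modelled. The point is
only that the two block decryptions in each iteration do not depend on each
other, which is what lets a dual-issue core interleave them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical invertible 64-bit "block cipher"; NOT Blowfish. */
static uint64_t toy_encrypt(uint64_t key, uint64_t x)
{
  x ^= key;
  x = (x << 13) | (x >> 51);   /* rotate left by 13 */
  return x + key;
}

static uint64_t toy_decrypt(uint64_t key, uint64_t x)
{
  x -= key;
  x = (x >> 13) | (x << 51);   /* rotate right by 13 */
  return x ^ key;
}

/* CBC encryption is inherently serial: every block needs the previous
   ciphertext block before it can be encrypted. */
static void cbc_encrypt(uint64_t key, uint64_t iv,
                        const uint64_t *in, uint64_t *out, size_t n)
{
  uint64_t prev = iv;
  for (size_t i = 0; i < n; i++)
    {
      prev = toy_encrypt(key, in[i] ^ prev);
      out[i] = prev;
    }
}

/* CBC decryption, two blocks per iteration.  d0 and d1 are computed from
   ciphertext only, so the two calls are independent of each other. */
static void cbc_decrypt_2way(uint64_t key, uint64_t iv,
                             const uint64_t *in, uint64_t *out, size_t n)
{
  uint64_t prev = iv;
  size_t i = 0;

  for (; i + 2 <= n; i += 2)
    {
      uint64_t c0 = in[i];       /* read both inputs first so that */
      uint64_t c1 = in[i + 1];   /* in-place operation stays correct */
      uint64_t d0 = toy_decrypt(key, c0);
      uint64_t d1 = toy_decrypt(key, c1);

      out[i] = d0 ^ prev;        /* chain with IV / previous ciphertext */
      out[i + 1] = d1 ^ c0;
      prev = c1;
    }

  if (i < n)                     /* odd trailing block, handled alone */
    out[i] = toy_decrypt(key, in[i]) ^ prev;
}
```

Decrypting the output of cbc_encrypt with cbc_decrypt_2way, using the same
key and IV, should return the original blocks for both even and odd block
counts.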

For now, only enable this on little-endian systems, as big-endian correctness
has not been tested yet.

Old vs new (Cortex-A8, Debian Wheezy/armhf):

           ECB/Stream         CBC             CFB             OFB             CTR
         --------------- --------------- --------------- --------------- ---------------
BLOWFISH  1.28x   1.16x   1.21x   2.16x   1.26x   1.86x   1.21x   1.25x   1.89x   1.96x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>