Home GnuPG

cast5: add amd64 assembly implementation
0bdf26eea8cdUnpublished

Unpublished Commit · Learn More

Not On Permanent Ref: This commit is not an ancestor of any permanent ref.

Description

cast5: add amd64 assembly implementation

* cipher/Makefile.am: Add 'cast5-amd64.S'.
* cipher/cast5-amd64.S: New file.
* cipher/cast5.c (USE_AMD64_ASM): New macro.
(_gcry_cast5_s1tos4): Merge arrays s1, s2, s3, s4 to single array to
simplify access from assembly implementation.
(s1, s2, s3, s4): New macros pointing to subarrays in
_gcry_cast5_s1tos4.
[USE_AMD64_ASM] (_gcry_cast5_amd64_encrypt_block)
(_gcry_cast5_amd64_decrypt_block, _gcry_cast5_amd64_ctr_enc)
(_gcry_cast5_amd64_cbc_dec, _gcry_cast5_amd64_cfb_dec): New prototypes.
[USE_AMD64_ASM] (do_encrypt_block, do_decrypt_block, encrypt_block)
(decrypt_block): New functions.
(_gcry_cast5_ctr_enc, _gcry_cast5_cbc_dec, _gcry_cast5_cfb_dec)
(selftest_ctr, selftest_cbc, selftest_cfb): New functions.
(selftest): Call new bulk selftests.
* cipher/cipher.c (gcry_cipher_open) [USE_CAST5]: Register CAST5 bulk
functions for ctr-enc, cbc-dec and cfb-dec.
* configure.ac (cast5) [x86_64]: Add 'cast5-amd64.lo'.
* src/cipher.h (_gcry_cast5_ctr_enc, _gcry_cast5_cbc_dec)
(gcry_cast5_cfb_dec): New prototypes.

Provides non-parallel implementations for small speed-up and 4-way parallel
implementations that gets accelerated on `out-of-order' CPUs.

Speed old vs. new on AMD Phenom II X6 1055T:

   ECB/Stream         CBC             CFB             OFB             CTR
--------------- --------------- --------------- --------------- ---------------

CAST5 1.23x 1.22x 1.21x 2.86x 1.21x 2.83x 1.22x 1.17x 2.73x 2.73x

Speed old vs. new on Intel Core i5-2450M (Sandy-Bridge):

   ECB/Stream         CBC             CFB             OFB             CTR
--------------- --------------- --------------- --------------- ---------------

CAST5 1.00x 1.04x 1.06x 2.56x 1.06x 2.37x 1.03x 1.01x 2.43x 2.41x

  • Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Provenance
jukiviliAuthored on May 24 2013, 11:43 AM
wernerCommitted on May 24 2013, 12:51 PM
Parents
rCab8fc70b5f0c: cipher-selftest: make selftest work with any block-size
Branches
Unknown
Tags
Unknown

Event Timeline

Werner Koch <wk@gnupg.org> committed rC0bdf26eea8cd: cast5: add amd64 assembly implementation (authored by Jussi Kivilinna <jussi.kivilinna@iki.fi>).May 24 2013, 12:51 PM