3des: add amd64 assembly implementation for 3DES
b76b632a453b (Unpublished)


Not On Permanent Ref: This commit is not an ancestor of any permanent ref.

Description

3des: add amd64 assembly implementation for 3DES

* cipher/Makefile.am: Add 'des-amd64.S'.
* cipher/cipher-selftests.c (_gcry_selftest_helper_cbc)
(_gcry_selftest_helper_cfb, _gcry_selftest_helper_ctr): Handle failures
from 'setkey' function.
* cipher/cipher.c (_gcry_cipher_open_internal) [USE_DES]: Setup bulk
functions for 3DES.
* cipher/des-amd64.S: New file.
* cipher/des.c (USE_AMD64_ASM, ATTR_ALIGNED_16): New macros.
[USE_AMD64_ASM] (_gcry_3des_amd64_crypt_block)
(_gcry_3des_amd64_ctr_enc, _gcry_3des_amd64_cbc_dec)
(_gcry_3des_amd64_cfb_dec): New prototypes.
[USE_AMD64_ASM] (tripledes_ecb_crypt): New function.
(TRIPLEDES_ECB_BURN_STACK): New macro.
(_gcry_3des_ctr_enc, _gcry_3des_cbc_dec, _gcry_3des_cfb_dec)
(bulk_selftest_setkey, selftest_ctr, selftest_cbc, selftest_cfb): New
functions.
(selftest): Add call to CTR, CBC and CFB selftest functions.
(do_tripledes_encrypt, do_tripledes_decrypt): Use
TRIPLEDES_ECB_BURN_STACK.
* configure.ac [host=x86-64]: Add 'des-amd64.lo'.
* src/cipher.h (_gcry_3des_ctr_enc, _gcry_3des_cbc_dec)
(_gcry_3des_cfb_dec): New prototypes.

Add non-parallel functions for a small speed-up, and 3-way parallel functions
for the modes of operation that support parallel processing (CTR, CBC
decryption and CFB decryption).
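Why only some modes benefit can be illustrated with CBC. Encryption needs the previous ciphertext block before the next block can start, whereas decryption already has every ciphertext block at hand, so three block decryptions can be issued independently. The following is a minimal sketch under stated assumptions, not the code from des-amd64.S: `toy_encrypt`/`toy_decrypt` are insecure stand-ins for the 3DES block function, and `cbc_encrypt`, `cbc_decrypt_3way` and `cbc_roundtrip_ok` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 8 /* 3DES block size in bytes */

/* Toy, insecure stand-in for a block cipher (NOT 3DES): XOR with the key,
   then rotate the bytes by one position. */
static void toy_encrypt(uint8_t *out, const uint8_t *in, const uint8_t *key)
{
    uint8_t t[BLK];
    for (int i = 0; i < BLK; i++)
        t[(i + 1) % BLK] = in[i] ^ key[i];
    memcpy(out, t, BLK);
}

/* Inverse of toy_encrypt. */
static void toy_decrypt(uint8_t *out, const uint8_t *in, const uint8_t *key)
{
    uint8_t t[BLK];
    for (int i = 0; i < BLK; i++)
        t[i] = in[(i + 1) % BLK] ^ key[i];
    memcpy(out, t, BLK);
}

/* CBC encryption is serial: block b needs the ciphertext of block b-1. */
void cbc_encrypt(uint8_t *out, const uint8_t *in, size_t nblocks,
                 uint8_t *iv, const uint8_t *key)
{
    for (size_t b = 0; b < nblocks; b++) {
        uint8_t x[BLK];
        for (int i = 0; i < BLK; i++)
            x[i] = in[b * BLK + i] ^ iv[i];
        toy_encrypt(out + b * BLK, x, key);
        memcpy(iv, out + b * BLK, BLK);
    }
}

/* CBC decryption, three blocks per iteration: the block decryptions do not
   depend on each other, only the final XOR uses the previous ciphertext. */
void cbc_decrypt_3way(uint8_t *out, const uint8_t *in, size_t nblocks,
                      uint8_t *iv, const uint8_t *key)
{
    assert(out && in && iv && key);
    while (nblocks >= 3) {
        uint8_t c[3 * BLK], d[3 * BLK];
        memcpy(c, in, sizeof c);        /* keep ciphertext; out may alias in */
        for (int k = 0; k < 3; k++)     /* independent, could be interleaved */
            toy_decrypt(d + k * BLK, c + k * BLK, key);
        for (int i = 0; i < BLK; i++) {
            out[i]           = d[i]           ^ iv[i];
            out[BLK + i]     = d[BLK + i]     ^ c[i];
            out[2 * BLK + i] = d[2 * BLK + i] ^ c[BLK + i];
        }
        memcpy(iv, c + 2 * BLK, BLK);   /* last ciphertext becomes next IV */
        in += 3 * BLK;
        out += 3 * BLK;
        nblocks -= 3;
    }
    while (nblocks > 0) {               /* leftover blocks, one at a time */
        uint8_t c[BLK], d[BLK];
        memcpy(c, in, BLK);
        toy_decrypt(d, c, key);
        for (int i = 0; i < BLK; i++)
            out[i] = d[i] ^ iv[i];
        memcpy(iv, c, BLK);
        in += BLK;
        out += BLK;
        nblocks--;
    }
}

/* Round trip over 7 blocks (two 3-block groups plus one leftover block).
   Returns 1 if the plaintext and the final IV both match. */
int cbc_roundtrip_ok(void)
{
    uint8_t key[BLK] = {1, 2, 3, 4, 5, 6, 7, 8};
    uint8_t iv_e[BLK] = {0}, iv_d[BLK] = {0};
    uint8_t pt[7 * BLK], ct[7 * BLK], back[7 * BLK];
    for (size_t i = 0; i < sizeof pt; i++)
        pt[i] = (uint8_t)(i * 37 + 11);
    cbc_encrypt(ct, pt, 7, iv_e, key);
    cbc_decrypt_3way(back, ct, 7, iv_d, key);
    return memcmp(pt, back, sizeof pt) == 0 && memcmp(iv_e, iv_d, BLK) == 0;
}
```

The same reasoning applies to CFB decryption and to CTR in both directions, which matches the rows in the tables below that show a speed-up of roughly 2.3x to 2.6x; ECB, OFB and the encryption direction of CBC and CFB show only the smaller gain.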

Old vs new (Intel Core i5-4570):

        enc     dec
ECB     1.17x   1.17x
CBC     1.17x   2.51x
CFB     1.16x   2.49x
OFB     1.17x   1.17x
CTR     2.56x   2.56x

Old vs new (Intel Core i5-2450M):

        enc     dec
ECB     1.28x   1.28x
CBC     1.27x   2.33x
CFB     1.27x   2.34x
OFB     1.27x   1.27x
CTR     2.36x   2.35x

New (Intel Core i5-4570):

3DES | nanosecs/byte mebibytes/sec cycles/byte

ECB enc |     28.39 ns/B     33.60 MiB/s     90.84 c/B
ECB dec |     28.27 ns/B     33.74 MiB/s     90.45 c/B
CBC enc |     29.50 ns/B     32.33 MiB/s     94.40 c/B
CBC dec |     13.35 ns/B     71.45 MiB/s     42.71 c/B
CFB enc |     29.59 ns/B     32.23 MiB/s     94.68 c/B
CFB dec |     13.41 ns/B     71.12 MiB/s     42.91 c/B
OFB enc |     28.90 ns/B     33.00 MiB/s     92.47 c/B
OFB dec |     28.90 ns/B     33.00 MiB/s     92.48 c/B
CTR enc |     13.39 ns/B     71.20 MiB/s     42.86 c/B
CTR dec |     13.39 ns/B     71.21 MiB/s     42.86 c/B
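
The three columns are different views of the same measurement. As a rough sketch of the conversion (the function names are hypothetical, and the 3.2 GHz clock used in the usage note is an assumption inferred from the ratio of the cycles/byte and nanosecs/byte columns, not a figure stated in the commit):

```c
#include <assert.h>

/* Throughput in MiB/s from a cost in nanoseconds per byte. */
double mib_per_sec(double ns_per_byte)
{
    assert(ns_per_byte > 0.0);
    return 1e9 / ns_per_byte / (1024.0 * 1024.0);
}

/* Cycles per byte, given the clock in GHz (that is, cycles per nanosecond). */
double cycles_per_byte(double ns_per_byte, double ghz)
{
    assert(ghz > 0.0);
    return ns_per_byte * ghz;
}
```

For the ECB enc row, `mib_per_sec(28.39)` gives about 33.6 MiB/s and `cycles_per_byte(28.39, 3.2)` gives about 90.8 c/B, in line with the table up to rounding of the printed ns/B value.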

Old (Intel Core i5-4570):

3DES | nanosecs/byte mebibytes/sec cycles/byte

ECB enc |     33.24 ns/B     28.69 MiB/s     106.4 c/B
ECB dec |     33.26 ns/B     28.67 MiB/s     106.4 c/B
CBC enc |     34.45 ns/B     27.69 MiB/s     110.2 c/B
CBC dec |     33.45 ns/B     28.51 MiB/s     107.1 c/B
CFB enc |     34.43 ns/B     27.70 MiB/s     110.2 c/B
CFB dec |     33.41 ns/B     28.55 MiB/s     106.9 c/B
OFB enc |     33.79 ns/B     28.22 MiB/s     108.1 c/B
OFB dec |     33.79 ns/B     28.22 MiB/s     108.1 c/B
CTR enc |     34.27 ns/B     27.83 MiB/s     109.7 c/B
CTR dec |     34.27 ns/B     27.83 MiB/s     109.7 c/B

New (Intel Core i5-2450M):

3DES | nanosecs/byte mebibytes/sec cycles/byte

ECB enc |     42.21 ns/B     22.59 MiB/s     105.5 c/B
ECB dec |     42.23 ns/B     22.58 MiB/s     105.6 c/B
CBC enc |     43.70 ns/B     21.82 MiB/s     109.2 c/B
CBC dec |     23.25 ns/B     41.02 MiB/s     58.12 c/B
CFB enc |     43.71 ns/B     21.82 MiB/s     109.3 c/B
CFB dec |     23.23 ns/B     41.05 MiB/s     58.08 c/B
OFB enc |     42.73 ns/B     22.32 MiB/s     106.8 c/B
OFB dec |     42.73 ns/B     22.32 MiB/s     106.8 c/B
CTR enc |     23.31 ns/B     40.92 MiB/s     58.27 c/B
CTR dec |     23.35 ns/B     40.84 MiB/s     58.38 c/B

Old (Intel Core i5-2450M):

3DES | nanosecs/byte mebibytes/sec cycles/byte

ECB enc |     53.98 ns/B     17.67 MiB/s     134.9 c/B
ECB dec |     54.00 ns/B     17.66 MiB/s     135.0 c/B
CBC enc |     55.43 ns/B     17.20 MiB/s     138.6 c/B
CBC dec |     54.27 ns/B     17.57 MiB/s     135.7 c/B
CFB enc |     55.42 ns/B     17.21 MiB/s     138.6 c/B
CFB dec |     54.35 ns/B     17.55 MiB/s     135.9 c/B
OFB enc |     54.49 ns/B     17.50 MiB/s     136.2 c/B
OFB dec |     54.49 ns/B     17.50 MiB/s     136.2 c/B
CTR enc |     55.02 ns/B     17.33 MiB/s     137.5 c/B
CTR dec |     55.01 ns/B     17.34 MiB/s     137.5 c/B
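
The "Old vs new" factors can be reproduced from the nanosecs/byte columns of the matching old and new rows. A minimal sketch, with `speedup` as a hypothetical helper name:

```c
#include <assert.h>

/* Speed-up factor: how many times faster the new code is than the old. */
double speedup(double old_ns_per_byte, double new_ns_per_byte)
{
    assert(new_ns_per_byte > 0.0);
    return old_ns_per_byte / new_ns_per_byte;
}
```

For example, CBC dec on the i5-4570 is `speedup(33.45, 13.35)`, about 2.51, and on the i5-2450M it is `speedup(54.27, 23.25)`, about 2.33.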

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Provenance
jukivili authored on Mar 30 2014, 5:11 PM
Parents
rC50aeee51a0b1: tests: Print diagnostics for skipped tests.
Branches
Unknown
Tags
Unknown

Event Timeline

Jussi Kivilinna <jussi.kivilinna@iki.fi> committed rCb76b632a453b: 3des: add amd64 assembly implementation for 3DES (authored by Jussi Kivilinna <jussi.kivilinna@iki.fi>). Mar 30 2014, 5:11 PM