Home GnuPG

Add parallelized AES-NI CBC decryption
35aff0cd4388Unpublished

Unpublished Commit · Learn More

Not On Permanent Ref: This commit is not an ancestor of any permanent ref.

Description

Add parallelized AES-NI CBC decryption

* cipher/rijndael.c [USE_AESNI] (aesni_cleanup_5): New macro.
[USE_AESNI] (do_aesni_dec_vec4): New function.
(_gcry_aes_cbc_dec) [USE_AESNI]: Add parallelized CBC loop.
(_gcry_aes_cbc_dec) [USE_AESNI]: Change IV storage register from xmm3
to xmm5.

This gives ~60% improvement in CBC decryption speed on sandy-bridge (x86-64).
Overall speed improvement with this and previous CBC patches is over 400%.

Before:

$ tests/benchmark --cipher-repetitions 1000 cipher aes aes192 aes256
Running each test 1000 times.

   ECB/Stream         CBC             CFB             OFB             CTR
--------------- --------------- --------------- --------------- ---------------

AES 670ms 770ms 2920ms 720ms 1900ms 660ms 2260ms 2250ms 480ms 500ms
AES192 860ms 930ms 3250ms 870ms 2210ms 830ms 2580ms 2580ms 570ms 570ms
AES256 1020ms 1080ms 3580ms 1030ms 2550ms 970ms 2880ms 2870ms 660ms 660ms

After:

$ tests/benchmark --cipher-repetitions 1000 cipher aes aes192 aes256
Running each test 1000 times.

   ECB/Stream         CBC             CFB             OFB             CTR
--------------- --------------- --------------- --------------- ---------------

AES 670ms 770ms 2130ms 450ms 1880ms 670ms 2250ms 2280ms 490ms 490ms
AES192 880ms 920ms 2460ms 540ms 2210ms 830ms 2580ms 2570ms 580ms 570ms
AES256 1020ms 1070ms 2800ms 620ms 2560ms 970ms 2880ms 2880ms 660ms 650ms

  • Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>

Details

Provenance
jukiviliAuthored on Nov 23 2012, 6:22 PM
wernerCommitted on Nov 26 2012, 9:16 AM
Parents
rC5acd0e5ae2a5: Clear xmm5 after use in AES-NI CTR mode
Branches
Unknown
Tags
Unknown

Event Timeline

Werner Koch <wk@gnupg.org> committed rC35aff0cd4388: Add parallelized AES-NI CBC decryption (authored by Jussi Kivilinna <jussi.kivilinna@mbnet.fi>).Nov 26 2012, 9:16 AM