
libgcrypt: bulk AES-GCM acceleration for ppc64le
Closed, Resolved (Public)

Description

Implement an AES-GCM bulk function in Power assembly to improve AES-GCM performance.

Event Timeline

dannytsen created this object in space S1 Public.
dannytsen updated the task description.

The implementation is for Power 10 and above. The improvement for AES128 is as follows:

Current:

GCM enc |     0.240 ns/B      3969 MiB/s         - c/B
GCM dec |     0.240 ns/B      3966 MiB/s         - c/B

After:

GCM enc |     0.192 ns/B      4971 MiB/s         - c/B
GCM dec |     0.193 ns/B      4951 MiB/s         - c/B

The benchmark was run on a P10 at 3.447 GHz.
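
The c/B column above is empty, but it can be derived from the ns/B figures. The following is a minimal sketch, assuming the 3.447 GHz clock quoted above held for the whole run; the function names are illustrative and not part of bench-slope.

```c
/* Convert a "ns/B" figure to cycles per byte.  A clock of N GHz
   means N cycles per nanosecond, so the product is cycles/byte.  */
double
cycles_per_byte (double ns_per_byte, double clock_ghz)
{
  return ns_per_byte * clock_ghz;
}

/* Convert "ns/B" to MiB/s: 1e9 ns per second, 1048576 bytes per MiB.  */
double
mib_per_second (double ns_per_byte)
{
  return 1e9 / ns_per_byte / 1048576.0;
}

/* Relative speedup between two ns/B figures (before / after).  */
double
speedup (double ns_per_byte_before, double ns_per_byte_after)
{
  return ns_per_byte_before / ns_per_byte_after;
}
```

With the encryption figures above, 0.240 ns/B works out to roughly 0.83 c/B and 0.192 ns/B to roughly 0.66 c/B, a speedup of about 1.25x. The MiB/s values computed this way differ slightly from the reported ones because the ns/B column is rounded.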

Also, this is the first time I am using this platform, so I would need advice on how to submit the patch. Thanks.

I have uploaded my patch here.

Files added and changed.

  1. configure.ac
  2. cipher/rijndael.c
  3. cipher/rijndael-p10le.c - new
  4. cipher/Makefile.am
  5. cipher/rijndael-gcm-p10le.s - new
  6. src/g10lib.h
  7. src/hwf-ppc.c
werner triaged this task as Normal priority. Nov 23 2021, 9:06 AM
werner added projects: libgcrypt, ppc, patch.
werner added a subscriber: werner.

FWIW: We need a DCO; see doc/HACKING.

Hi Werner, Here is the DCO. Thanks.

Please read doc/HACKING carefully on how to send the DCO the right way.

I sent a copy to gcrypt-devel@gnupg.org. Hope this is the right process. Thanks.

Thanks, however I didn't see your email on the mailing list. Maybe the email got stuck on the way.

A few comments on the patch. Currently the implementation is a plain assembly file, without a preprocessor pass. The complication with this approach is that the assembly file is selected in configure.ac for all powerpc64le-*-* architectures/compilers/environments, without checking whether the assembler supports PPC arch 3.10 instructions, whether assembly macros are supported, etc. With C-preprocessed assembly, you can include config.h and do additional checks.
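
The kind of config.h check meant here can be sketched in C as follows. This is only an illustration: the two HAVE_... macro names are assumed to be the config.h counterparts of the configure cache variables, and USE_PPC_GCM_BULK and the function name are hypothetical. In the library the macros would come from including config.h; here they are simply left undefined so the sketch compiles anywhere.

```c
/* In libgcrypt these macros would be provided by <config.h>.
   Their names are an assumption for this sketch.  */
#if defined(HAVE_GCC_INLINE_ASM_PPC_ALTIVEC) && \
    defined(HAVE_GCC_INLINE_ASM_PPC_ARCH_3_00)
# define USE_PPC_GCM_BULK 1   /* hypothetical guard macro */
#endif

/* Report whether the guarded implementation was compiled in.
   A preprocessed assembly file would wrap its whole body in the
   same #ifdef, so unsupported toolchains assemble an empty file.  */
int
ppc_gcm_bulk_compiled_in (void)
{
#ifdef USE_PPC_GCM_BULK
  return 1;
#else
  return 0;
#endif
}
```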

Does rijndael-gcm-p10le.s use any PPC arch 3.10 instructions? Is there any reason other than instruction set support for limiting aes-gcm-ppc to just arch 3.10? It seems to run on QEMU (arch 3.00 supported) without invalid instruction faults. However, when running the tests, I get test failures:

libgcrypt/build-power64le$ QEMU_LD_PREFIX=/usr/powerpc64le-linux-gnu/ tests/basic --verbose
Starting Cipher checks.
  checking BLOWFISH [4]
  checking DES [302]
  checking 3DES [2]
  checking CAST5 [3]
  checking AES [7]
basic: pass 0, algo 7, mode 9, in-place, tag mismatch
basic: pass 0, algo 7, mode 9, split-buffer, encrypt mismatch
basic: pass 0, algo 7, mode 9, split-buffer (pos: 2032, piecelen: 1995), gcry_cipher_checktag failed: Checksum error
  checking AES192 [8]
basic: pass 0, algo 8, mode 9, in-place, tag mismatch
basic: pass 0, algo 8, mode 9, split-buffer, encrypt mismatch
basic: pass 0, algo 8, mode 9, split-buffer (pos: 2032, piecelen: 1995), gcry_cipher_checktag failed: Checksum error
  checking AES256 [9]
basic: pass 0, algo 9, mode 9, in-place, tag mismatch
basic: pass 0, algo 9, mode 9, split-buffer, encrypt mismatch
basic: pass 0, algo 9, mode 9, split-buffer (pos: 2032, piecelen: 1995), gcry_cipher_checktag failed: Checksum error
  checking TWOFISH [10]
  checking TWOFISH128 [303]
  checking SERPENT128 [304]
  checking SERPENT192 [305]
  checking SERPENT256 [306]
  checking RFC2268_40 [307]
  checking SEED [309]
  checking CAMELLIA128 [310]
  checking CAMELLIA192 [311]
  checking CAMELLIA256 [312]
  checking IDEA [1]
  checking GOST28147 [315]
  checking GOST28147_MESH [317]
  checking SM4 [318]
  checking ARCFOUR
  checking SALSA20
  checking SALSA20R12
  checking CHACHA20
Completed Cipher checks.
Starting Cipher Mode checks.
  Starting ECB checks.
    checking ECB mode for BLOWFISH [4]
    checking ECB mode for BLOWFISH [4]
    checking ECB mode for DES [302]
    checking ECB mode for DES [302]
    checking ECB mode for SM4 [318]
    checking ECB mode for SM4 [318]
  Completed ECB checks.
  Starting AES128 CBC CTS checks.
    checking encryption for length 17
    checking decryption for length 17
    checking encryption for length 31
    checking decryption for length 31
    checking encryption for length 32
    checking decryption for length 32
    checking encryption for length 47
    checking decryption for length 47
    checking encryption for length 48
    checking decryption for length 48
    checking encryption for length 64
    checking decryption for length 64
  Completed AES128 CBC CTS checks.
  Starting CBC MAC checks.
    checking CBC MAC for AES [7]
    checking CBC MAC for 3DES [2]
    checking CBC MAC for DES [302]
  Completed CBC MAC checks.
  Starting CTR cipher checks.
    checking CTR mode for AES [7]
    checking CTR mode for AES192 [8]
    checking CTR mode for AES256 [9]
    checking CTR mode for AES [7]
    checking CTR mode for AES [7]
    checking CTR mode for AES [7]
    checking CTR mode for AES [7]
    checking CTR mode for AES [7]
    checking CTR mode for AES [7]
    checking CTR mode for AES [7]
    checking CTR mode for AES [7]
    checking CTR mode for AES [7]
    checking CTR mode for AES [7]
    checking CTR mode for AES256 [9]
    checking CTR mode for AES256 [9]
    checking CTR mode for AES256 [9]
    checking CTR mode for AES256 [9]
    checking CTR mode for CAMELLIA256 [312]
    checking CTR mode for CAMELLIA256 [312]
    checking CTR mode for CAMELLIA256 [312]
    checking CTR mode for CAMELLIA256 [312]
    checking CTR mode for CAST5 [3]
    checking CTR mode for SM4 [318]
    checking CTR mode for SM4 [318]
  Completed CTR cipher checks.
  Starting CFB checks.
    checking CFB mode for AES [7]
    checking CFB mode for AES192 [8]
    checking CFB mode for AES256 [9]
    checking CFB mode for AES [7]
    checking CFB mode for AES192 [8]
    checking CFB mode for AES256 [9]
    checking CFB mode for AES [7]
    checking CFB mode for AES192 [8]
    checking CFB mode for AES256 [9]
    checking CFB mode for 3DES [2]
    checking CFB mode for 3DES [2]
    checking CFB mode for GOST28147_MESH [317]
    checking CFB mode for SM4 [318]
    checking CFB mode for SM4 [318]
  Completed CFB checks.
  Starting OFB checks.
    checking OFB mode for AES [7]
    checking OFB mode for AES192 [8]
    checking OFB mode for AES256 [9]
    checking OFB mode for SM4 [318]
    checking OFB mode for SM4 [318]
  Completed OFB checks.
  Starting CCM checks.
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for AES [7]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
    checking CCM mode for CAMELLIA128 [310]
  Starting GCM checks.
    checking GCM mode for AES [7]
    checking GCM mode for AES [7]
    checking GCM mode for AES [7]
    checking GCM mode for AES [7]
    checking GCM mode for AES [7]
    checking GCM mode for AES [7]
    checking GCM mode for AES [7]
    checking GCM mode for AES [7]
    checking GCM mode for AES [7]
    checking GCM mode for AES [7]
    checking GCM mode for AES [7]
    checking GCM mode for AES [7]
basic: aes-gcm, encrypt mismatch entry 11 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 11
    checking GCM mode for AES [7]
basic: aes-gcm, encrypt mismatch entry 12 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 12
    checking GCM mode for AES [7]
basic: aes-gcm, encrypt mismatch entry 13 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 13
    checking GCM mode for AES192 [8]
basic: aes-gcm, encrypt mismatch entry 14 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 14
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 15 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 15
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 16 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 16
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 17 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 17
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 18 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 18
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 19 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 19
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 20 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 20
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 21 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 21
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 22 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 22
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 23 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 23
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 24 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 24
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 25 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 25
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 26 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 26
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 27 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 27
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 28 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 28
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 29 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 29
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 30 (step -1)
basic: aes-gcm, encrypt tag mismatch entry 30
    checking GCM mode for AES256 [9]
basic: aes-gcm, encrypt mismatch entry 31 (step -1)
basic: stopped after 50 errors.
libgcrypt/build-power64le$

Thanks jukivili for the review.

  1. I did get an email saying that the list moderator will check my email before posting it.
  2. The check for arch 3.0/3.1 is done in rijndael.c and hwf-ppc.c. I am not sure how to check the arch in configure.ac yet, since this is not specifically an arch matter; it is more about P9 vs. P10. I did not use any arch 3.1 instructions; it is all arch 3.0. The reason this is for P10 and above is that P9 performance degrades a little when using this bulk function.
  3. I have to check why the basic tests failed. I used my modified bench-slope.c to verify the encoded output and tag.
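
For illustration, a runtime P9 vs. P10 decision of the kind described in item 2 could look roughly like the sketch below. It is not the code from the patch: the function name is hypothetical, and the bit values are assumed to match the PPC_FEATURE2_ARCH_3_00 and PPC_FEATURE2_ARCH_3_1 definitions in the Linux headers.

```c
/* Feature bits as reported in the Linux AT_HWCAP2 auxiliary vector
   entry.  The values are assumptions for this sketch.  */
#define SKETCH_PPC_FEATURE2_ARCH_3_00 0x00800000UL
#define SKETCH_PPC_FEATURE2_ARCH_3_1  0x00040000UL

/* Decide whether to use the P10 bulk path.  The code only needs
   arch 3.00 instructions, but it is enabled for arch 3.1 CPUs only
   because it is reported to be slightly slower on P9.  */
int
use_p10_gcm_bulk (unsigned long hwcap2)
{
  return (hwcap2 & SKETCH_PPC_FEATURE2_ARCH_3_1) != 0;
}
```

On Linux the argument would typically be obtained with getauxval (AT_HWCAP2) from <sys/auxv.h>.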

Hi jukivili,
I ran the basic tests and they did show the errors. I am in the process of investigating what went wrong. In the meantime, I have also included the test results from bench-slope that I used in my testing. In this test, I captured the message with a 272-byte buffer from both the original libgcrypt repo and my optimized repo. Note that the bulk version of my code does 8x unrolling (128 bytes per iteration), and the remainder is processed in 16-byte blocks. So the first two 128-byte chunks ran through gcry_ppc_aes_gcm_encrypt and the remaining 16 bytes through gcm_ctr_encrypt (cipher-gcm.c).

Please review. In the meantime, I'll continue checking the basic tests.

Also, the bulk function is always passed whole 16-byte blocks, so my function does not need to handle partial blocks, even though I did implement that.
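
The split of the 272-byte buffer described above can be written down as a small sketch. The struct and function names are hypothetical; only the 16-byte block size and the 8x unrolling are taken from the description.

```c
#include <stddef.h>

#define SKETCH_BLOCK_SIZE  16                        /* AES block size */
#define SKETCH_BULK_BYTES  (8 * SKETCH_BLOCK_SIZE)   /* 8x unrolled loop */

struct gcm_split
{
  size_t bulk_bytes;     /* handled by the 8x unrolled bulk loop */
  size_t block_bytes;    /* handled one 16-byte block at a time */
  size_t partial_bytes;  /* trailing bytes shorter than one block */
};

/* Split an input length into the three regions.  */
struct gcm_split
split_gcm_input (size_t len)
{
  struct gcm_split s;

  s.bulk_bytes = len - len % SKETCH_BULK_BYTES;
  len -= s.bulk_bytes;
  s.block_bytes = len - len % SKETCH_BLOCK_SIZE;
  s.partial_bytes = len - s.block_bytes;
  return s;
}
```

For a length of 272 this gives 256 bulk bytes (two 128-byte iterations), 16 block bytes and no partial bytes.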

Thanks.
-Danny

Hi jukivili,

I have fixed the counter overflow problems that the basic tests revealed. Here is the new patch.
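
For context, GCM increments only the last 32 bits of the 16-byte counter block, as a big-endian integer that wraps modulo 2^32. The thread does not say what exactly went wrong in the patch, so the following is just a generic reference sketch of that increment with a hypothetical function name, not the fix itself.

```c
/* GCM "inc32": increment the last 4 bytes of the counter block as a
   big-endian 32-bit integer.  On overflow the value wraps to zero
   and the first 12 bytes are left untouched.  */
void
sketch_gcm_inc32 (unsigned char ctr[16])
{
  int i;

  for (i = 15; i >= 12; i--)
    {
      ctr[i]++;
      if (ctr[i] != 0)
        break;  /* no carry into the next byte */
    }
}
```

An unrolled implementation that adds 1 to 8 to the counter for eight blocks at once has to produce the same wrap-around as eight calls to this function.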

Thanks.
-Danny

Hello,

A few comments on the new patch:

  1. general:
    • Patch/commit needs commit log. See "doc/HACKING" section "Commit log requirements" and see commit history for examples.
  2. cipher/Makefile.am:
    • Wrong file name here, "rijndael-gcm-ppc9le.s".
  3. cipher/rijndael-p10le.c:
    • Please, use GNU C coding style like rest of the library.
    • Instead of bcopy use buf_cpy (see cipher/bufhelp.h) and/or cipher_block_cpy (see cipher/cipher-internal.h).
  4. src/hwf-ppc.c:
    • Also add "ppc-arch_3_10" to hwflist table in "src/hwfeatures.c".
  5. configure.ac:
    • Please, rebase patch on top of the latest master branch.
    • Building of "rijndael-gcm-p10le.s" should be limited somehow to avoid build failures on legacy compilers/assemblers without arch 3.00 instruction set support. For example, "$gcry_cv_gcc_inline_asm_ppc_altivec" and "$gcry_cv_gcc_inline_asm_ppc_arch_3_00" could be used for this:
powerpc64le-*-*)
   # Build with the crypto extension implementation
   GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS rijndael-ppc.lo"
   GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS rijndael-ppc9le.lo"

   if test "$gcry_cv_gcc_inline_asm_ppc_altivec" = "yes" &&
      test "$gcry_cv_gcc_inline_asm_ppc_arch_3_00" = "yes" ; then
     # Build with AES-GCM bulk implementation for P10
     GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS rijndael-gcm-p10le.lo"
     GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS rijndael-p10le.lo"
   fi
;;

-Jussi
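
One pitfall with the bcopy replacement suggested in item 3: bcopy takes (src, dst, n), while memcpy-style helpers take the destination first. A minimal sketch follows, using a memcpy-based stand-in because cipher/bufhelp.h cannot be included outside the library. The stand-in's (dst, src, len) argument order is assumed to match buf_cpy, and both function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for libgcrypt's buf_cpy; assumed (dst, src, len) order.  */
void
sketch_buf_cpy (void *dst, const void *src, size_t len)
{
  memcpy (dst, src, len);
}

/* Copy one 16-byte cipher block.  The old call would have been
   bcopy (src, dst, 16), with the first two arguments swapped.  */
void
sketch_copy_block (unsigned char *dst, const unsigned char *src)
{
  sketch_buf_cpy (dst, src, 16);
}
```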

Hi Jussi,

I have fixed what you commented on. Here is the new patch, with the commit log message attached. I hope it's all in the correct order.

Thanks.
-Danny

I did some finishing touches on coding style:

Now we'd just need to get the DCO through.

Thanks Jussi, I did not receive the list moderator's email, so I am not sure whether it has been posted on gcrypt-devel@gnupg.org. If not, I can resend the DCO. Thanks.

Ok, I have subscribed to the mailing list. I have resent the DCO.

Seen. @jukivili can you please add it to the AUTHORS file?