
Support for PowerPC's AES acceleration.
Abandoned, Public

Authored by slandden on May 24 2019, 6:03 AM.

Details

Summary

Parent D490

This generates the S-Boxes on the fly, and thus
is more resistant to side-channel attacks.
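
To illustrate what computing S-box values "on the fly" can look like, here is a minimal sketch in portable C. It is not the code from this patch (which presumably relies on the CPU's vcrypto instructions); the function names are hypothetical. It derives each AES S-box byte from its definition, the multiplicative inverse in GF(2^8) followed by the affine transform, so there is no table lookup indexed by secret data.

```c
#include <stdint.h>

/* Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1, without
   data-dependent branches (masks are used instead of "if"). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
  uint8_t r = 0;
  for (int i = 0; i < 8; i++)
    {
      r ^= (uint8_t)(-(b & 1)) & a;          /* add a if low bit of b is set */
      uint8_t hi = (uint8_t)(-(a >> 7));     /* 0xff if top bit of a is set */
      a = (uint8_t)((uint8_t)(a << 1) ^ (hi & 0x1b));
      b >>= 1;
    }
  return r;
}

/* Inverse as x^254 (square-and-multiply); maps 0 to 0 as AES requires.
   The exponent is a public constant, so the branch below does not
   depend on the input. */
static uint8_t gf_inv(uint8_t x)
{
  uint8_t r = 1;
  for (int bit = 7; bit >= 0; bit--)
    {
      r = gf_mul(r, r);
      if ((254 >> bit) & 1)
        r = gf_mul(r, x);
    }
  return r;
}

static uint8_t rotl8(uint8_t v, int n)
{
  return (uint8_t)((v << n) | (v >> (8 - n)));
}

/* Hypothetical helper: compute one AES S-box entry instead of looking it up. */
uint8_t aes_sbox_compute(uint8_t x)
{
  uint8_t s = gf_inv(x);
  return (uint8_t)(s ^ rotl8(s, 1) ^ rotl8(s, 2) ^ rotl8(s, 3)
                   ^ rotl8(s, 4) ^ 0x63);
}
```

Calling aes_sbox_compute(0x53) yields 0xed, matching the standard AES S-box. A software version like this is far slower than a table; hardware instructions make the same idea practical.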

I get an approximately 2-3x speed-up with vcrypto support.
However, I saw no benefit from additionally using assembly for
the block mode code, so it is disabled to avoid adding unnecessary
complexity and code size.

Before:
Cipher:
AES | nanosecs/byte mebibytes/sec cycles/byte auto Mhz

 ECB enc |      6.27 ns/B     152.2 MiB/s      7.92 c/B      1263
 ECB dec |      7.10 ns/B     134.4 MiB/s      8.97 c/B      1264
 CBC enc |      4.52 ns/B     211.2 MiB/s      5.71 c/B      1264
 CBC dec |      4.59 ns/B     207.9 MiB/s      8.69 c/B      1895
 CFB enc |      4.34 ns/B     219.6 MiB/s      8.23 c/B      1895
 CFB dec |      4.14 ns/B     230.6 MiB/s      7.84 c/B      1895
 OFB enc |      6.09 ns/B     156.7 MiB/s     11.54 c/B      1895
 OFB dec |      6.03 ns/B     158.2 MiB/s     11.43 c/B      1895
 CTR enc |      4.17 ns/B     228.9 MiB/s      7.90 c/B      1895
 CTR dec |      4.17 ns/B     228.6 MiB/s      7.91 c/B      1895
 XTS enc |      4.53 ns/B     210.4 MiB/s      8.59 c/B      1895
 XTS dec |      5.00 ns/B     190.8 MiB/s      9.47 c/B      1895
 CCM enc |      8.51 ns/B     112.0 MiB/s     16.13 c/B      1895
 CCM dec |      8.51 ns/B     112.0 MiB/s     16.13 c/B      1895
CCM auth |      4.35 ns/B     219.1 MiB/s      8.25 c/B      1895
 EAX enc |      8.51 ns/B     112.1 MiB/s     16.13 c/B      1895
 EAX dec |      8.55 ns/B     111.5 MiB/s     16.21 c/B      1895
EAX auth |      4.34 ns/B     219.5 MiB/s      8.23 c/B      1895
 GCM enc |      7.49 ns/B     127.3 MiB/s     14.20 c/B      1895
 GCM dec |      7.49 ns/B     127.3 MiB/s     14.20 c/B      1895
GCM auth |      3.33 ns/B     286.2 MiB/s      6.31 c/B      1895
 OCB enc |      4.33 ns/B     220.1 MiB/s      8.21 c/B      1895
 OCB dec |      5.69 ns/B     167.5 MiB/s      9.32 c/B      1638
OCB auth |      5.05 ns/B     189.0 MiB/s      8.26 c/B      1638

After:
Cipher:
AES | nanosecs/byte mebibytes/sec cycles/byte auto Mhz

 ECB enc |      2.14 ns/B     445.7 MiB/s      4.06 c/B      1895
 ECB dec |      2.41 ns/B     396.0 MiB/s      4.54 c/B      1887
 CBC enc |      2.11 ns/B     451.9 MiB/s      4.00 c/B      1895
 CBC dec |      2.06 ns/B     462.7 MiB/s      3.91 c/B      1895
 CFB enc |      2.09 ns/B     455.9 MiB/s      3.96 c/B      1895
 CFB dec |      2.09 ns/B     456.2 MiB/s      3.96 c/B      1895
 OFB enc |      2.17 ns/B     439.9 MiB/s      4.11 c/B      1895
 OFB dec |      2.12 ns/B     449.6 MiB/s      4.02 c/B      1895
 CTR enc |      2.10 ns/B     454.6 MiB/s      3.98 c/B      1895
 CTR dec |      2.09 ns/B     456.7 MiB/s      3.96 c/B      1895
 XTS enc |      2.30 ns/B     415.3 MiB/s      4.35 c/B      1895
 XTS dec |      2.29 ns/B     415.8 MiB/s      4.35 c/B      1895
 CCM enc |      4.67 ns/B     204.2 MiB/s      7.65 c/B      1638
 CCM dec |      4.83 ns/B     197.3 MiB/s      7.92 c/B      1638
CCM auth |      2.43 ns/B     391.9 MiB/s      3.99 c/B      1638
 EAX enc |      4.84 ns/B     197.2 MiB/s      7.92 c/B      1638
 EAX dec |      4.83 ns/B     197.3 MiB/s      7.92 c/B      1638
EAX auth |      2.42 ns/B     394.2 MiB/s      3.96 c/B      1638
 GCM enc |      5.42 ns/B     176.0 MiB/s     10.27 c/B      1895
 GCM dec |      5.42 ns/B     176.1 MiB/s     10.27 c/B      1895
GCM auth |      3.33 ns/B     286.2 MiB/s      6.32 c/B      1895
 OCB enc |      2.10 ns/B     454.7 MiB/s      3.98 c/B      1895
 OCB dec |      2.11 ns/B     452.8 MiB/s      3.99 c/B      1895
OCB auth |      2.10 ns/B     453.6 MiB/s      3.98 c/B      1895
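
As a quick check of the speed-up factors, the sketch below divides the Before ns/B figure by the After ns/B figure for a few rows copied from the two tables. The struct and function names are made up for this example.

```c
#include <stdio.h>

/* A few rows copied from the Before/After tables (ns/B column). */
struct bench_row
{
  const char *mode;
  double before_ns_per_byte;
  double after_ns_per_byte;
};

static const struct bench_row rows[] = {
  { "ECB enc",  6.27, 2.14 },
  { "CBC dec",  4.59, 2.06 },
  { "CTR enc",  4.17, 2.10 },
  { "GCM enc",  7.49, 5.42 },
  { "GCM auth", 3.33, 3.33 },
};

/* Speed-up factor: how many times faster the "after" run is. */
double speedup(double before, double after)
{
  return before / after;
}

/* Print one line per row; call this from a small main(). */
void print_speedups(void)
{
  for (size_t i = 0; i < sizeof rows / sizeof rows[0]; i++)
    printf("%-8s  %.2fx\n", rows[i].mode,
           speedup(rows[i].before_ns_per_byte, rows[i].after_ns_per_byte));
}
```

For these rows the factors come out at roughly 2.9x for ECB enc, 2.2x for CBC dec, 2.0x for CTR enc and 1.4x for GCM enc, while GCM auth is unchanged. Some Before rows were measured at a lower reported clock (for example 1263 MHz for ECB enc), so the cycles/byte column gives a somewhat different ratio for those rows.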

Fixes T4529

GCM remains slow because of a lack of CPU support.

Test Plan

This is integrated into the existing tests, and they all pass.

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

slandden edited the summary of this revision.
slandden retitled this revision from "Add PowerPC crypto acceleration support for SHA2." to "Support for PowerPC's AES acceleration."
slandden edited the summary of this revision.

Actually include modified perlasm file.

Consider using tests/bench-slope to get cycles/byte results so they can be compared with https://github.com/dot-asm/cryptogams/blob/master/ppc/aesp8-ppc.pl#L34

The command "tests/bench-slope --cpu-mhz auto cipher aes" should do the trick, assuming cpu-mhz auto-detection works on PowerPC.

We are trying to apply patches in order to conduct internal testing. They did apply successfully. However, we can't get the result to link because _gcry_hwf_detect_ppc is undefined. Is there a hwf-ppc.c somewhere?
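
For readers unfamiliar with this kind of routine: on Linux, PowerPC hardware-feature detection is commonly done by reading the auxiliary vector. The sketch below is not the contents of hwf-ppc.c, and its function names are hypothetical. It checks the AT_HWCAP2 word for the vector-crypto bit, falling back to a local definition of PPC_FEATURE2_VEC_CRYPTO when the system headers do not provide one.

```c
#include <stdbool.h>
#if defined(__linux__)
# include <sys/auxv.h>   /* getauxval(), AT_HWCAP2 */
#endif

/* Bit in AT_HWCAP2 that signals the vector crypto facility on PowerPC.
   Defined locally only if the headers do not already supply it. */
#ifndef PPC_FEATURE2_VEC_CRYPTO
# define PPC_FEATURE2_VEC_CRYPTO 0x02000000UL
#endif

/* Pure helper (hypothetical name): test the bit in a given HWCAP2 word. */
bool hwcap2_has_vec_crypto(unsigned long hwcap2)
{
  return (hwcap2 & PPC_FEATURE2_VEC_CRYPTO) != 0;
}

/* Hypothetical wrapper: query the running system.  On anything other
   than Linux/PowerPC it conservatively reports "not available". */
bool cpu_has_vec_crypto(void)
{
#if defined(__linux__) && defined(__powerpc__) && defined(AT_HWCAP2)
  return hwcap2_has_vec_crypto(getauxval(AT_HWCAP2));
#else
  return false;
#endif
}
```

Keeping the bit test separate from the getauxval() call makes it possible to exercise the logic on non-PowerPC build machines.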

It turns out that upstream cryptogams is broken on ppc64 big-endian ELFv1. I reported this upstream at https://github.com/dot-asm/cryptogams/issues/5 (the OpenSSL version works fine).

slandden edited the summary of this revision.

rebase

Thanks for the hwf-ppc.c. I've pulled the latest from upstream, applied the patches, and gotten the updated library built. Will let you know of any feedback from our performance team.

fix running with hardware acceleration off.

This has been committed.