@werner Could these two patches be backported to 2.2? These changes give the same level of performance increase in 2.2 as seen in 2.3.
Jul 28 2022
Jul 27 2022
Jul 25 2022
Jul 21 2022
Jul 20 2022
Jul 7 2022
Jul 6 2022
Jun 12 2022
Patch applied to master with small changes.
Jun 3 2022
Thanks for the updated patch. I'm travelling next week and will have time to check it closely only after I'm back. At a quick glance, it looks good. What is also needed is the changelog for the git commit log.
Jun 1 2022
I meant interleaving integer-register-based 1xPoly1305 with 8xChaCha20, as is done for 4xChaCha20 in cipher/chacha20-ppc.c (interleaved so that for each 4xChaCha20 processed, 4 blocks of 1xPoly1305 are executed). Quite often microarchitectures have separate execution units for integer registers and vector registers, and then it makes sense to interleave integer-poly1305 with vector-chacha20, as the algorithms do not end up competing for the same execution resources. Interleaving vector-poly1305 and vector-chacha20 is not likely to give a performance increase (and is likely to run into problems with running out of vector registers).
May 28 2022
The problem is that the new assembly uses VSX registers vs14-vs31, which overlap with floating-point registers f14-f31. f14-f31 are callee-saved under the ABI, so they need to be stored and restored.
Tested the patch with a small change so that HWF_PPC_ARCH_3_00 is used instead of HWF_PPC_ARCH_3_10. Building bench-slope with "-O3 -flto" makes a bug in the new implementation visible. Without the new implementations bench-slope is ok (testing with QEMU):
$ tests/bench-slope --disable-hwf ppc-arch_3_00 cipher chacha20
Cipher:
 CHACHA20       |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |      2.35 ns/B     405.0 MiB/s         - c/B
     STREAM dec |      2.32 ns/B     410.7 MiB/s         - c/B
   POLY1305 enc |      2.46 ns/B     388.0 MiB/s         - c/B
   POLY1305 dec |      2.34 ns/B     408.1 MiB/s         - c/B
  POLY1305 auth |     0.238 ns/B      4003 MiB/s         - c/B
May 27 2022
The -O2 problem with bench-slope seems strange. Does the problem appear after this patch is applied?
May 15 2022
May 11 2022
May 9 2022
Apr 30 2022
Apr 19 2022
Apr 6 2022
Apr 4 2022
Apr 1 2022
Fixed in master. I rechecked that bulk implementation passes tests with qemu-ppc64le.
Looks like that line went missing in the third/final version of the AES-GCM patch at https://dev.gnupg.org/T5700
Mar 29 2022
Mar 12 2022
Mar 11 2022
Mar 9 2022
Mar 8 2022
Mar 7 2022
Is such a large change to the cipher API really needed (new open/encrypt with less flexibility)? How would that affect performance? Would the following new interfaces to the gcry_cipher API work instead?
- gcry_cipher_setup_geniv(hd, int ivlen, int method): sets up the IV generator with parameters such as IV length and method id (RFC5116, TLS 1.3, SSH, etc.) (other parameters?)
- gcry_cipher_geniv(hd, byte *outiv): generates a new IV using the selected method, sets the IV internally and outputs the generated IV to 'outiv'.
- gcry_cipher_genkey(hd, byte *outkey, int keylen, int method): generates a key internally with the given parameters (method id, other?), sets up the key internally and outputs the generated key to 'outkey'. (How would keys from a key exchange protocol be handled? Using the existing setkey?)
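To make the setup/generate split concrete, here is a minimal standalone sketch of what a counter-based IV generator behind such an interface could look like. It is not libgcrypt code: struct ivgen, ivgen_setup and ivgen_next are hypothetical names, and the IV layout (fixed field followed by a big-endian invocation counter) is an assumption modelled on the nonce formation recommended in RFC 5116, section 3.2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IVGEN_MAX_IVLEN 16

/* Hypothetical IV generator state; not part of the libgcrypt API. */
struct ivgen
{
  unsigned char fixed[IVGEN_MAX_IVLEN]; /* fixed (per-sender) field */
  size_t fixedlen;                      /* length of the fixed field */
  size_t ivlen;                         /* total IV length in bytes */
  uint64_t counter;                     /* next counter value to emit */
  int exhausted;                        /* set once the counter space is used up */
};

/* Set up the generator.  The counter field takes the ivlen - fixedlen
   trailing bytes and must be 1..8 bytes long.  Returns 0 or -1. */
static int
ivgen_setup (struct ivgen *g, const void *fixed, size_t fixedlen,
             size_t ivlen)
{
  if (ivlen > IVGEN_MAX_IVLEN || fixedlen >= ivlen || ivlen - fixedlen > 8)
    return -1;

  memset (g, 0, sizeof *g);
  if (fixedlen)
    memcpy (g->fixed, fixed, fixedlen);
  g->fixedlen = fixedlen;
  g->ivlen = ivlen;
  return 0;
}

/* Write the next IV (g->ivlen bytes) to OUTIV.  Returns -1 instead of
   ever repeating an IV once the counter field has wrapped. */
static int
ivgen_next (struct ivgen *g, unsigned char *outiv)
{
  size_t ctrlen = g->ivlen - g->fixedlen;
  uint64_t max;
  size_t i;

  assert (ctrlen >= 1 && ctrlen <= 8);
  if (g->exhausted)
    return -1;

  max = ctrlen == 8 ? UINT64_MAX : (((uint64_t)1 << (8 * ctrlen)) - 1);

  memcpy (outiv, g->fixed, g->fixedlen);
  /* Store the counter big-endian in the trailing bytes. */
  for (i = 0; i < ctrlen; i++)
    outiv[g->fixedlen + i]
      = (unsigned char)(g->counter >> (8 * (ctrlen - 1 - i)));

  if (g->counter == max)
    g->exhausted = 1;
  else
    g->counter++;
  return 0;
}
```

In the proposed API this state would live inside the cipher handle, and the geniv call would additionally set the generated IV on the handle before returning it to the caller.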
I went through my test files and found that --enarmor on a zero-length input file no longer worked. I made a separate patch to fix that issue, which then also needs another approach for handling the compress issue noticed earlier: