twofish-avx2-amd64: replace VPGATHER with manual gather
ded3a1ec2ec6

Description

twofish-avx2-amd64: replace VPGATHER with manual gather

* cipher/twofish-avx2-amd64.S (do_gather): New.
(g16): Switch to use 'do_gather' instead of VPGATHER instruction.
(__twofish_enc_blk16, __twofish_dec_blk16): Prepare stack
for 'do_gather'.
* cipher/twofish.c (twofish) [USE_AVX2]: Remove now unneeded
HWF_INTEL_FAST_VPGATHER check.
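
The effect of the cipher/twofish.c change can be pictured as a simplification of the feature test that enables the AVX2 code path. The following is only a sketch under stated assumptions: the flag constants and function names are hypothetical stand-ins, not libgcrypt's actual HWF_* definitions or the code in twofish.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only; libgcrypt defines its
 * own HWF_* constants elsewhere. */
#define SKETCH_HWF_AVX2           (1u << 0)
#define SKETCH_HWF_FAST_VPGATHER  (1u << 1)

/* Before: the AVX2 path is taken only when the CPU is also known to have
 * a fast VPGATHER. */
static bool sketch_use_avx2_before(uint32_t features)
{
  return (features & SKETCH_HWF_AVX2) != 0
         && (features & SKETCH_HWF_FAST_VPGATHER) != 0;
}

/* After: the implementation no longer issues VPGATHER, so AVX2 support
 * alone is sufficient. */
static bool sketch_use_avx2_after(uint32_t features)
{
  return (features & SKETCH_HWF_AVX2) != 0;
}
```

With this shape, a CPU that reports AVX2 but not fast VPGATHER is rejected by the old test and accepted by the new one.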

As VPGATHER is now slow on the majority of CPUs (because of the
"Downfall" mitigations), switch the twofish-avx2 implementation to
manual memory gathering instead.
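
To illustrate what "manual gathering" means in general, here is a minimal portable C sketch. It does not reproduce the 'do_gather' assembly macro from this commit; the function name and the byte-index layout are assumptions made for the example. The idea is simply that each lane's table entry is fetched with an ordinary scalar load rather than with a single hardware gather instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from libgcrypt): fetch n 32-bit table entries,
 * one per lane, using one ordinary load per lane instead of a hardware
 * gather instruction.  The table is assumed to have 256 entries, so any
 * byte index is in range. */
static void manual_gather_u32(const uint32_t *table, const uint8_t *idx,
                              uint32_t *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = table[idx[i]];  /* scalar load for lane i */
}
```

In a vectorised implementation the indices would first have to be moved out of the vector register and the loaded values moved back in, which is presumably why the changelog mentions preparing the stack for 'do_gather'.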

Benchmark on Intel Core i3-1115G4 (Tiger Lake, with "Downfall"-mitigated
microcode):

Before:
TWOFISH |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
ECB enc |      7.00 ns/B     136.3 MiB/s     28.62 c/B      4089
ECB dec |      7.00 ns/B     136.2 MiB/s     28.64 c/B      4090

After (~3.2x faster):
TWOFISH |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
ECB enc |      2.19 ns/B     435.5 MiB/s      8.95 c/B      4089
ECB dec |      2.19 ns/B     436.2 MiB/s      8.94 c/B      4089

Benchmark on AMD Ryzen 9 7900X (Zen 4, not affected by "Downfall"):

Before:
TWOFISH |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
ECB enc |      1.91 ns/B     499.0 MiB/s      8.98 c/B      4700
ECB dec |      1.90 ns/B     500.7 MiB/s      8.95 c/B      4700

After (~9% faster):
TWOFISH |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
ECB enc |      1.74 ns/B     547.9 MiB/s      8.18 c/B      4700
ECB dec |      1.74 ns/B     547.8 MiB/s      8.18 c/B      4700

[v2]:
 - Reorder memory operations in do_gather for a small performance
   increase.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Details

Provenance

jukivili

Authored on Aug 12 2023, 8:19 PM

Parents

rCf2bf9997d465: Avoid VPGATHER usage for most of Intel CPUs

Event Timeline

jukivili committed rCded3a1ec2ec6: twofish-avx2-amd64: replace VPGATHER with manual gather (authored by jukivili). Aug 20 2023, 5:31 PM

Changes (2)

cipher/twofish-avx2-amd64.S
cipher/twofish.c

cipher/twofish-avx2-amd64.S