Serpent: faster S-box implementation
* cipher/serpent.c (SBOX0, SBOX1, SBOX2, SBOX3, SBOX4, SBOX5, SBOX6) (SBOX7, SBOX0_INVERSE, SBOX1_INVERSE, SBOX2_INVERSE, SBOX3_INVERSE) (SBOX4_INVERSE, SBOX5_INVERSE, SBOX6_INVERSE, SBOX7_INVERSE): Replace with new definitions.
These new S-box definitions are from paper:
D. A. Osvik, “Speeding up Serpent,” in Third AES Candidate Conference,
(New York, New York, USA), p. 317–329, National Institute of Standards and
Technology, 2000. Available at http://www.ii.uib.no/~osvik/pub/aes3.ps.gz
Although these were optimized for two-operand instructions on i386 and for
old Pentium-1 processors, they are slightly faster on current processors
on i386 and x86-64. On ARM, the performance of these S-boxes is about the
same as with the old S-boxes.
new vs old speed ratios (AMD K10, x86-64):
ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- ---------------
SERPENT128 1.06x 1.02x 1.06x 1.02x 1.06x 1.06x 1.06x 1.05x 1.07x 1.07x
new vs old speed ratios (Intel Atom, i486):
ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- ---------------
SERPENT128 1.12x 1.15x 1.12x 1.15x 1.13x 1.11x 1.12x 1.12x 1.12x 1.13x
new vs old speed ratios (ARM Cortex A8):
ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- ---------------
SERPENT128 1.04x 1.02x 1.02x 0.99x 1.02x 1.02x 1.03x 1.03x 1.01x 1.01x
- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>