sha512: reduce stack use in transform function by 512 bytes
* cipher/sha512.c (transform): Change 'u64 w[80]' to 'u64 w[16]' and
inline input expansion to first 64 rounds.
(sha512_write, sha512_final): Reduce burn_stack depth by 512 bytes.
The expansion of the input into the w[] array can be inlined into the
rounds, which lets the array shrink from u64[80] (640 bytes) to u64[16]
(128 bytes). On Cortex-A8, this change gives a small speedup, possibly
thanks to the reduced burn_stack depth.
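As a rough illustration of why 16 words are enough, the sketch below
compares a full 80-word schedule with a 16-word circular buffer that is
updated in place during the first 64 rounds. It is a standalone sketch,
not the code from cipher/sha512.c: the names schedule_full,
schedule_rolling, schedules_match and self_check are hypothetical, and
the round function itself is left out, so only the schedule words are
compared.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t u64;

#define ROTR(x, n) (((x) >> (n)) | ((x) << (64 - (n))))
/* SHA-512 message schedule functions (small sigma 0 and 1). */
#define S0(x) (ROTR ((x), 1) ^ ROTR ((x), 8) ^ ((x) >> 7))
#define S1(x) (ROTR ((x), 19) ^ ROTR ((x), 61) ^ ((x) >> 6))

/* Reference: expand all 80 schedule words up front (640 bytes). */
static void
schedule_full (const u64 in[16], u64 out[80])
{
  int t;

  for (t = 0; t < 16; t++)
    out[t] = in[t];
  for (t = 16; t < 80; t++)
    out[t] = S1 (out[t - 2]) + out[t - 7] + S0 (out[t - 15]) + out[t - 16];
}

/* Hypothetical rolling variant: a 16-word circular buffer (128 bytes).
   At the start of round t, slot (t + i) & 15 holds W[t + i] for
   i = 0..15.  out[] only records the words for comparison.  */
static void
schedule_rolling (const u64 in[16], u64 out[80])
{
  u64 w[16];
  int t;

  for (t = 0; t < 16; t++)
    w[t] = in[t];
  for (t = 0; t < 80; t++)
    {
      out[t] = w[t & 15];  /* Word consumed by round t. */
      /* Expansion inlined into the first 64 rounds: overwrite W[t]
         with W[t + 16], which is needed 16 rounds later.  */
      if (t < 64)
        w[t & 15] += S1 (w[(t + 14) & 15]) + w[(t + 9) & 15]
                     + S0 (w[(t + 1) & 15]);
    }
}

/* Return 1 if both variants yield the same 80 words for an input
   block derived from SEED (arbitrary filler, not real message data). */
static int
schedules_match (u64 seed)
{
  u64 in[16], a[80], b[80];
  int i;

  for (i = 0; i < 16; i++)
    {
      seed = seed * 6364136223846793005ULL + 1442695040888963407ULL;
      in[i] = seed;
    }
  schedule_full (in, a);
  schedule_rolling (in, b);
  return memcmp (a, b, sizeof a) == 0;
}

/* Convenience self-check for one arbitrary seed. */
static void
self_check (void)
{
  assert (schedules_match (0x0123456789abcdefULL));
}
```

Calling self_check () from a test program exercises the comparison.
Rounds 64..79 only read the buffer, because the words they consume were
already produced by the updates in rounds 48..63.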
New vs old (tests/benchmark md sha512 sha384):
SHA512 1.09x 1.11x 1.06x 1.09x 1.08x
SHA384 1.09x 1.11x 1.06x 1.09x 1.09x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>