TODO list
- ARMv8 32-bit & 64-bit implementations
  - Port CRC PCLMUL implementations to ARM-CE PMULL, 32/64-bit
  - Stitched ChaCha20-Poly1305 implementations, 32/64-bit
  - Port Serpent ARMv7/NEON implementation to 64-bit
  - Port Camellia AES-NI/AVX implementation to ARM-CE AES, 32/64-bit
- Support for more crypto instruction sets on different architectures
  - POWER8 (PowerPC) vector crypto
  - SPARC crypto instruction set
- Intel ADX implementation of large-integer multiplication
- Performance optimizations for Curve25519
  - See https://marc.info/?l=gcrypt-devel&m=153295947908984&w=2
  - Maybe use a mixed asm/C approach, as used in poly1305.c