TODO list
- ARMv8 32-bit & 64-bit implementations
  - Stitched ChaCha20-Poly1305 implementations, 64-bit (/32-bit)
  - Port Camellia AES-NI/AVX implementation to ARM-CE AES, 64-bit (/32-bit)
  - Port Serpent ARMv7/NEON implementation to 64-bit
  - Port CRC PCLMUL implementations to ARM-CE PMULL 32-bit (64-bit done)
- x86_64 / i386 implementations
  - AES-NI XTS 8-way for 64-bit (currently only 4-way)
  - ADX implementation of large-integer multiplication
- Support for more crypto instruction sets on different architectures
  - POWER8 vector crypto
  - SPARC crypto instruction set
- Performance optimizations for Curve25519
  - https://marc.info/?l=gcrypt-devel&m=153295947908984&w=2
  - Maybe use a mixed asm/C approach, as used in poly1305.c