Status of the PQC implementation in LIbgcrypt
Few comments on the patches.
- First, I needed following to patches to make tests run (fixes building & testing outside source tree, and fixes memory leaks):
- There's few files missing license text at beginning of file (cipher/consttime.*, tests/test-utils.h).
- // C++ style comment lines are used in few places. These need to be changed to C comments /* ... */
- "_gcry_consttime_bytes_differ": there is already "buf_eq_const" in bufhelp.h which does the same.
- "_gcry_consttime_cmov": this could be hardened against nonce@once style attacks. _gcry_mpih_set_cond in mpi/mpih-const-time.c can be used as example for this (uses two masks and AND/OR instead of single mask with XOR).
- "load32_littleendian": there is "buf_get_le32" in bufhelp.h for reading 32-bit in little-endian order.
- cipher/tests-utils.h: not sure about this file. Where these functions copied over from other tests sources? Should other tests switch to use tests-utils.h?
- tests/benchmark and tests/bench-slope are missing cSHAKE and KMAC benchmarking support
- Should there be benchmarking support for Kyber in tests/benchmark?
- There's many functions that use buffers on stack. Do those contain secrets? Should those buffers be wiped before returning from function (with wipememory())? For example, "mlkem_check_secret_key" has two buffers "shared_secret_1" and "shared_secret_2" which are not wiped.
- mlkem.c: mlkem_check_secret_key: "memcmp" is used to compare shared secrets. Should this use constant time comparison instead?
- mlkem-common.c: _gcry_mlkem_mlkem_shake256_rkprf:
- _gcry_md_hash_buffers_extract can be used here instead of _gcry_md_open&write&extract&close.
- mlkem-symmetric.c: _gcry_mlkem_shake256_prf:
- _gcry_md_hash_buffers_extract can be used here instead of _gcry_md_open&write&extract&close. Temporary buffer usage can be avoided by passing input buffers through two IOV to _gcry_md_hash_buffers_extract.
thanks for your feedback. Before I go into the technical discussion, please allow me to propse that we use our own GitHub fork to keep track of the detailed technical discussion. In GitHub, we can easily create a separate issue for each item to discuss it and track its status. Your find your issues replicated in this top level list: to https://github.com/pqc-thunderbird/libgcrypt/issues/24.
@fse: Github is not an option here. We don't use it and thus everything relevant to Libgcrypt needs to be documented here and not at some external platform.
Regarding benchmarks, we should have some for the new algorithms.
Regarding test-utils.h, I already taked with @gniibe about that and we can eventually discuss this over here. It might even be useful to have a separate task to streamline some of the test code.
OK, fine, however, in order to be able keep an overview of our tasks I would still keep track of them in our GitHub, where I can create a sub-issue from the list of tasks with one click. But we will post our comments and results here as well as far relevant for the purpose of documentation. I think most of the points Jussi raised are more or less clear to me anyway.
Regarding test-utils.h, I think it might even make sense to have a test-utils library which the tests could link to.
@jukivilli I have addressed a number of your comments now. You find my comments inline.
First, I needed following to patches to make tests run (fixes building & testing outside source tree, and fixes memory leaks)
C++ style comment lines
Fixed for both our branches (mlkem/kyber, kmac)
cipher/consttime.c:"_gcry_consttime_bytes_differ": there is already "buf_eq_const" in bufhelp.h which does the same.
Removed that function again.
"_gcry_consttime_cmov": this could be hardened against nonce@once style attacks.
EM / power analysis countermeasures are out of scope for our code. We only address constant time, as do the reference implementations.
cipher/mlkem-cbd.c: "load32_littleendian": there is "buf_get_le32" in bufhelp.h for reading 32-bit in little-endian order.
Removed the duplicated function
cipher/tests-utils.h: not sure about this file. Where these functions copied over from other tests sources? Should other tests switch to use tests-utils.h?
Yes, taken from the existing test files which currently duplicate the code verbatim each by itself. Werner now suggested to create a .c file for this. If this is the idea, we could take the existing test-utils.h and turn it into test-utils.c together with the corresponding header file. I think we would do that in a separate branch that could be incorporated (successively) by our other branches. Other tests could then successively be adapted to use the functions from that compilation unit.
tests/benchmark and tests/bench-slope are missing cSHAKE and KMAC benchmarking support
Should there be benchmarking support for Kyber in tests/benchmark?
We'll add benchmark tests for all our new algorithms. However, we currently have to pay respect to the order of the work packages in our project and thus will implement the benchmark tests after completing the implementation of all the algorithms. This will happen the latest in Q1/2024.
There's many functions that use buffers on stack. Do those contain secrets? Should those buffers be wiped before returning from function (with wipememory())? For example, "mlkem_check_secret_key" has two buffers "shared_secret_1" and "shared_secret_2" which are not wiped.
The use of stack buffers for secrets is indeed not intended by us. Our solution for this is to allocate secure memory. However, I found a number of instances that had slipped my attention. The example that you refer to is not relevant though, as the shared secret generated in this function is
not used for anything else than checking the "correctness" of the key. So there is no need to burden this function with the use of secure memory.
Generally, wiping the memory doesn't solve the main problem that is addressed by the secure memory, as the process may be interrupted at any point during the processing and its memory pages be written to disk. Wiping the buffers may reduce the probability for that, but it doesn't provide a guarantee. Especially not when considering a dedicated attack that may provoke the swapping intentionally and repeatedly.
mlkem.c: mlkem_check_secret_key: "memcmp" is used to compare shared secrets. Should this use constant time comparison instead?
No actual need for this, as, as stated above, these buffers do not contain actual secrets. But for "educational" purposes (in case someone uses that code as a template) I changed it to use the constant time comparison.
mlkem-common.c: _gcry_mlkem_mlkem_shake256_rkprf:_gcry_md_hash_buffers_extract can be used here instead of _gcry_md_open&write&extract&close.
mlkem-symmetric.c: _gcry_mlkem_shake256_prf:_gcry_md_hash_buffers_extract can be used here instead of _gcry_md_open&write&extract&close
In the code of _gcry_md_hash_buffers_extract it says:
The only supported flag in FLAGS is GCRY_MD_FLAG_HMAC which turns this function into a HMAC function; the first item in IOV is then used as the key."
The flag GCRY_MD_FLAG_SECURE that we need is thus not supported. Also I don't see much saving in code lines, as also the buffer object would have to be initialized.
I also saw now that actually our code is using uint8_t quite a lot (something we inherited from the reference implementation). You haven't commented on that so far. However, as I understand, uint8_t is not desired because of potential portability problems. Do you want us to change all occurrences of uint8_t to unsigned char?
Once we have agreed on the items still under discussion here and the KEM API has been finalized, I plan to provide a new patch. In the meantime, if you are interested, you can check my fixes already at https://github.com/pqc-thunderbird/libgcrypt in the kyber branch
Yes, int8_t/int16_t/int32_t/uint8_t/uint16_t/uint32_t should not be used. There is size-specific integer types defined in src/types.h which can be used instead (byte/u16/u32). This header does not yet have signed integer types, but those can be added (for example, s8/s16/s32).
Hi, please respond to me privately at email@example.com (to keep the ticket clean) why our phabricator (this dev.gnupg.org) instance does not work for that, you can freely create subtasks and ticket relations here, too. Just so we know what we might want to improve here.
OK, done in our code, will come in the next patch.
However, I am not sure how to exactly correctly refer to the sizes of the signed types determined by the configure script. For now I used the sizes of the unsigned types here also, which shouldn't be a problem but of course isn't entirely correct. Maybe that can be addressed by someone who has more experience with the configure scripts.
In master, when fixing padding issue, libgcrypt/src/const-time.h is just introduced.
I will replace your functions.
- ct_not_memequal (for _gcry_consttime_bytes_differ)
- This is originally in NetBSD as constanttime_memequal, and modified return value and type
- ct_memmov_cond (for _gcry_constime_cmov
- This follows existing convention of _gcry_mpih_set_cond
- thank you for your implementation, I confirmed yours are better with XOR by examining compiler outputs by Compiler Explorer
- My choice for the name is memmov (instead of memmove), because it supports overlapping memory areas when DST < SRC.
We have addressed all comments regarding ML-KEM (Kyber) and KMAC. Currently I am working on the GnuPG integration of the the ML-KEM composites. For that purpose I will need a branch of libgcrypt with both ML-KEM and KMAC. I am not sure if you are considering to integrate the ML-KEM version already now before the final NIST standards are release. Some libraries do it, for instance Botan. Appropriate naming of the algorithms can ensure that there arises no confusion which version of the algorithm one is using.
In any case, it might be possible that I still make changes to ML-KEM during the work on the GnuPG integration in case that becomes necessary.
Regarding KMAC, there were only very minor feedback comments from you. If you prefer the integration of KMAC now, then I would prepare a patch for it. The benchmarking tests are still not implemented and will be delivered by us only in the beginning (first 3 months) of the next year. I hope this is not hindering the integration of KMAC.
What do you think about this? Does it make sense to think about the KMAC integration now? Should I provide a new patch for KMAC in the next days anyhow?
And another question: in the GnuPG code on the master branch I saw that algorithm identifiers for ML-KEM with Ed25519 and Ed448 are already defined in the code base. Do I understand correctly that the maintainers prefer the inclusion of these two algorithms and not necessarily the inclusion of the ones based on ML-KEM with ECDH using NIST or Brainpool curves?