libgcrypt: KEM API
Closed, Resolved (Public)

Description

It would be good to add an API for KEM (Key Encapsulation Mechanism).

Unfortunately, even within the standardization efforts, there is no consensus on the abstraction (yet).

Here are some references from the IETF.

RFC 5990: Use of RSA-KEM Key Transport Algorithm in the Cryptographic Message Syntax (CMS):
https://www.rfc-editor.org/rfc/rfc5990.html

  • It also includes a key-wrapping process

RFC 9180: Hybrid Public Key Encryption: https://www.rfc-editor.org/rfc/rfc9180.html

IETF draft: Using Key Encapsulation Mechanism (KEM) Algorithms in the Cryptographic Message Syntax (CMS):
https://www.ietf.org/archive/id/draft-ietf-lamps-cms-kemri-05.html

IETF draft: Streamlined NTRU Prime: sntrup761
https://www.ietf.org/archive/id/draft-josefsson-ntruprime-streamlined-00.html

IETF draft: Kyber Post-Quantum KEM
https://www.ietf.org/archive/id/draft-cfrg-schwabe-kyber-03.html

As an API candidate, we could consider three functions that are more or less common to the standards above.
Citing from draft-ietf-lamps-cms-kemri-05:

  • KeyGen() -> (pk, sk):

    Generate the public key (pk) and a private key (sk).
  • Encapsulate(pk) -> (ct, ss):

    Given the recipient's public key (pk), produce a ciphertext (ct) to be passed to the recipient and shared secret (ss) for the originator.
  • Decapsulate(sk, ct) -> ss:

    Given the private key (sk) and the ciphertext (ct), produce the shared secret (ss) for the recipient.
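
To make the shape of these three functions concrete, here is a minimal C sketch. It is only an illustration: the toy_* names are hypothetical, the "KEM" is a toy Diffie-Hellman construction over a 31-bit prime, and rand() stands in for a real random number generator. None of it is secure, and none of it is libgcrypt API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* TOY parameters for illustration only: a 31-bit group is NOT secure.  */
#define TOY_P 2147483647ULL     /* the prime 2^31 - 1 */
#define TOY_G 7ULL

/* Square-and-multiply; operands stay below 2^31, so every product
   fits into 64 bits.  */
static uint64_t
toy_powmod (uint64_t base, uint64_t exp)
{
  uint64_t result = 1;

  base %= TOY_P;
  while (exp)
    {
      if (exp & 1)
        result = result * base % TOY_P;
      base = base * base % TOY_P;
      exp >>= 1;
    }
  return result;
}

/* Insecure exponent in [1, TOY_P - 2]; real code needs a CSPRNG.  */
static uint64_t
toy_random_exponent (void)
{
  return (uint64_t) rand () % (TOY_P - 2) + 1;
}

/* KeyGen() -> (pk, sk) */
void
toy_kem_keygen (uint64_t *pk, uint64_t *sk)
{
  assert (pk != NULL && sk != NULL);
  *sk = toy_random_exponent ();
  *pk = toy_powmod (TOY_G, *sk);
}

/* Encapsulate(pk) -> (ct, ss) */
void
toy_kem_encapsulate (uint64_t pk, uint64_t *ct, uint64_t *ss)
{
  uint64_t e = toy_random_exponent ();   /* ephemeral secret */

  assert (ct != NULL && ss != NULL);
  *ct = toy_powmod (TOY_G, e);
  *ss = toy_powmod (pk, e);
}

/* Decapsulate(sk, ct) -> ss */
uint64_t
toy_kem_decapsulate (uint64_t sk, uint64_t ct)
{
  return toy_powmod (ct, sk);
}

/* Returns 1 when originator and recipient derive the same secret.  */
int
toy_kem_roundtrip (void)
{
  uint64_t pk, sk, ct, ss_originator, ss_recipient;

  toy_kem_keygen (&pk, &sk);
  toy_kem_encapsulate (pk, &ct, &ss_originator);
  ss_recipient = toy_kem_decapsulate (sk, ct);
  return ss_originator == ss_recipient;
}
```

The point of the sketch is the data flow: only pk and ct cross the wire, and both sides end up with the same ss without the originator ever choosing it directly.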

Event Timeline

gniibe triaged this task as Wishlist priority. Oct 10 2023, 8:23 AM
gniibe created this task.

The API that you quote at the end is indeed what is commonly understood as how a KEM functions, and it is exactly what fits ML-KEM.

In which file(s) should this API be implemented?

@fse Thank you for your comment (quick ! :-).

I looked through your patch set for ML-KEM and Simon's work for SNTRU761 (https://gitlab.com/jas/libgcrypt/-/commit/d1a376ab7251e62c2f1922524d7c8d3f69139b4d).
I'm considering putting those functions in cipher/kem.c, just like Simon did.

I can see the internal KEM functions in your patch set:

gcry_err_code_t _gcry_mlkem_kem_keypair (uint8_t *pk,
                                         uint8_t *sk,
                                         gcry_mlkem_param_t *param);

gcry_err_code_t _gcry_mlkem_kem_enc (uint8_t *ct,
                                     uint8_t *ss,
                                     const uint8_t *pk,
                                     gcry_mlkem_param_t *param);

gcry_err_code_t _gcry_mlkem_kem_dec (uint8_t *ss,
                                     const uint8_t *ct,
                                     const uint8_t *sk,
                                     gcry_mlkem_param_t *param);

... while Simon proposed the internal functions (and public functions) of:

gcry_err_code_t _gcry_kem_keypair (int algo,
                                   void *pubkey,
                                   void *seckey);

gcry_err_code_t _gcry_kem_enc (int algo,
                               const void *pubkey,
                               void *ciphertext,
                               void *key);

gcry_err_code_t _gcry_kem_dec (int algo,
                               const void *ciphertext,
                               const void *seckey,
                               void *key);

... which have similar/common (mostly same) API.

Currently, I'm considering adding this kind of API for both ML-KEM and SNTRU761.

I have two points for improvements:

  • Renaming would be good (encap rather than enc, and decap rather than dec), so that people do not confuse them with encrypt/encode and decrypt/decode.
  • Adding a context (gcry_ctx_t) would be good, and will probably be required for actual use cases, because a KEM can be complex: it may have many parameters (key sizes for the public-key and symmetric parts, the hash function to be used, KDF parameters, etc.) and may need to support AuthEncap/AuthDecap of RFC 9180.

It might be OK to exclude the key-wrapping support (found in RSA-KEM), assuming that it is handled later, in an upper layer of the application.

Our own internal function signatures are not necessarily a good reference. The main objection to everything you list above is the lack of explicit length information. For each uint8_t* there should also be a size_t ...len in my opinion. Otherwise the API will be highly prone to memory access errors.
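
As an illustration of that point, the following sketch pairs every buffer with a length and rejects mismatches before any memory is touched. All example_* names and error codes are hypothetical, and the encapsulation itself is a placeholder that only zeroes the outputs. The byte sizes are the published encoded sizes for sntrup761 and ML-KEM-768.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical identifiers and error codes, for illustration only.  */
enum example_kem_algo { EXAMPLE_KEM_SNTRUP761, EXAMPLE_KEM_MLKEM768 };
enum example_err { EXAMPLE_OK = 0, EXAMPLE_ERR_ALGO, EXAMPLE_ERR_LENGTH };

struct example_kem_sizes { size_t pk, sk, ct, ss; };

/* Encoded sizes in bytes: public key, secret key, ciphertext, secret.  */
static const struct example_kem_sizes example_sizes[] = {
  [EXAMPLE_KEM_SNTRUP761] = { 1158, 1763, 1039, 32 },
  [EXAMPLE_KEM_MLKEM768]  = { 1184, 2400, 1088, 32 },
};

/* Placeholder for the real algorithm: it only zeroes the outputs.  */
static void
example_encap_stub (uint8_t *ct, size_t ctlen, uint8_t *ss, size_t sslen)
{
  memset (ct, 0, ctlen);
  memset (ss, 0, sslen);
}

/* Each object is followed by its length; any mismatch is an error.  */
enum example_err
example_kem_encap (enum example_kem_algo algo,
                   const uint8_t *pk, size_t pklen,
                   uint8_t *ct, size_t ctlen,
                   uint8_t *ss, size_t sslen)
{
  const struct example_kem_sizes *sz;

  if ((unsigned int) algo >= sizeof example_sizes / sizeof example_sizes[0])
    return EXAMPLE_ERR_ALGO;
  sz = &example_sizes[algo];

  if (!pk || !ct || !ss
      || pklen != sz->pk || ctlen != sz->ct || sslen != sz->ss)
    return EXAMPLE_ERR_LENGTH;

  example_encap_stub (ct, ctlen, ss, sslen);
  return EXAMPLE_OK;
}

/* Usage: a ciphertext buffer that is one byte short is rejected.  */
int
example_length_check_demo (void)
{
  static uint8_t pk[1184], ct[1088], ss[32];

  assert (example_kem_encap (EXAMPLE_KEM_MLKEM768, pk, sizeof pk,
                             ct, sizeof ct - 1, ss, sizeof ss)
          == EXAMPLE_ERR_LENGTH);
  return example_kem_encap (EXAMPLE_KEM_MLKEM768, pk, sizeof pk,
                            ct, sizeof ct, ss, sizeof ss);
}
```

Without the length arguments, the short buffer in the demo would be written past its end by any real implementation.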

Then I am not convinced that the gcry_ctx_t is a good solution for the parameter issue, as this is, as far as I can see, a completely generic type. Rather, I would define a KEM parameter struct with a KEM-type field and a union for all the different supported KEM types (analogous to the existing gcry_mac_handle).
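
A parameter struct of that kind could look roughly like the sketch below. The type and field names are hypothetical and not taken from any patch. The ML-KEM ciphertext sizes (768, 1088 and 1568 bytes for ranks 2, 3 and 4) and the 32-byte encapsulated key of DHKEM with X25519 are the published values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical KEM-type tag; selects the active union member.  */
enum example_kem_type
{
  EXAMPLE_KEM_TYPE_MLKEM,
  EXAMPLE_KEM_TYPE_DHKEM_X25519
};

/* One struct for all KEMs: a type field plus per-type parameters.  */
struct example_kem_params
{
  enum example_kem_type type;
  union
  {
    struct { int k; } mlkem;                       /* rank: 2, 3 or 4 */
    struct { int hash_algo; int kdf_algo; } dhkem; /* hypothetical ids */
  } u;
};

/* Convenience constructor for the ML-KEM variant.  */
struct example_kem_params
example_params_mlkem (int k)
{
  struct example_kem_params p = { .type = EXAMPLE_KEM_TYPE_MLKEM };

  p.u.mlkem.k = k;
  return p;
}

/* Ciphertext length in bytes, or 0 for unsupported parameters.  */
size_t
example_kem_ciphertext_len (const struct example_kem_params *p)
{
  assert (p != NULL);
  switch (p->type)
    {
    case EXAMPLE_KEM_TYPE_MLKEM:
      switch (p->u.mlkem.k)
        {
        case 2: return 768;
        case 3: return 1088;
        case 4: return 1568;
        default: return 0;
        }
    case EXAMPLE_KEM_TYPE_DHKEM_X25519:
      return 32;
    }
  return 0;
}
```

Unlike a fully generic context, the compiler sees every field here, so a caller cannot pass parameters that belong to a different KEM type without it being visible in the code.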

And note that key wrapping has nothing to do with the KEM itself. Key wrapping has to be addressed by the protocol.

For length information, note that Simon's patch (let me call it v1) has length arguments:
https://gitlab.com/jas/libgcrypt/-/commit/3af635afca052a9575912b257fe7518a58bfe810

I prefer the v2 patch.

In the API I proposed above, it's void * (or const void * when read-only) to express pointer to structure where the structure is defined by ALGO.

I have reconsidered adding a context. gcry_ctx_t would not fit an API using void * (it fits APIs that use SEXP, that is, the higher-level API). For a low-level API, it is better to make the context argument a void * as well.

With respect to the function signatures, I see the following issues with the API you reference via the provided link:

gcry_err_code_t _gcry_kem_keypair (gcry_kem_hd_t hd, size_t pklen, void *pubkey, size_t sklen, void *seckey)

The main problem I see is the void*. It should be a const uint8_t* to indicate an encoded key. For this interpretation let me quote Simon:

This is a trade-off, and my rationale was that I prefer doing byte-oriented APIs since that seems to what all modern KEM's are using (including Kyber?). And for some reason byte-strings are passed as 'void*' in libgcrypt, so I followed that style. There should be documentation explaining this.

So I don't think you are right when you say "it's void * (or const void * when read-only) to express pointer to structure". From what I understand from Simon, it is an encoded key and thus should be a uint8_t. I cannot confirm at all what Simon says about void* being generally used in Libgcrypt for byte arrays. In various places in the API, I would even say clearly in the majority of cases (based on my impression; I did not count, though), uint8_t is used for byte arrays, which in my opinion is also the correct choice. In the same mail to the list, Simon himself suggests using uint8_t:

However you make me believe we could use uint8_t here? My KEM API is not similar to other parts of libgcrypt anyway, so we don't have to repeat using 'void*' for data.

Maybe Simon can say something about this, too, to clarify how his proposed API is to be understood.

Also, the order of the arguments clearly should be that the object comes first and then its length. To the best of my knowledge this is common practice everywhere else, not only in Libgcrypt; I am convinced it is a general coding convention.

But in any case, I agree, the API should contain length information for the encoded objects.

Regarding your statement:

I have reconsidered adding a context. gcry_ctx_t would not fit an API using void * (it fits APIs that use SEXP, that is, the higher-level API). For a low-level API, it is better to make the context argument a void * as well.

I still don't understand what information you want to pass in the context. Could you elaborate on this, please? In any case, we have already proposed an API on the SEXP level, which re-uses the existing functions for key generation and decryption and only adds the function

gcry_err_code_t
_gcry_pk_encap (gcry_sexp_t *r_ciph, gcry_sexp_t *r_shared_key, gcry_sexp_t s_pkey)

which is necessary as the function signature for KEM encapsulation does not match that of a public key encryption function. Couldn't we just leave it at that API for SEXP?

Actually we never use uint8_t* because that is C99 and very uncommon except for some MCU projects. Instead we use unsigned char *. void* is often used because it allows passing arbitrary types to a function without requiring ugly and error-prone casting at the call site.
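
The casting argument can be seen in a small stand-alone sketch. The function below is hypothetical and merely sums bytes; what matters is that a const void * parameter accepts an unsigned char array, a char string and the address of an integer alike, and is converted to unsigned char * once, inside the function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: treats any object as a string of bytes.  */
unsigned int
example_sum_bytes (const void *buf, size_t len)
{
  const unsigned char *p = buf;   /* implicit conversion, no cast */
  unsigned int sum = 0;

  assert (buf != NULL || len == 0);
  while (len--)
    sum += *p++;
  return sum;
}

/* All three calls compile without a cast at the call site.  */
unsigned int
example_void_ptr_demo (void)
{
  unsigned char raw[3] = { 0x61, 0x62, 0x63 };
  const char *text = "abc";
  uint32_t word = 0x01020304;

  return example_sum_bytes (raw, sizeof raw)
         + example_sum_bytes (text, 3)
         + example_sum_bytes (&word, sizeof word);
}
```

Had the parameter been declared as const unsigned char *, the second and third calls would each need an explicit cast, because C does not convert char * or uint32_t * to unsigned char * implicitly.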

The context is a way to pass additional information to those functions. For example, the context can be used to add state that is useful for pre-computations. We have a system to work with such contexts.

Yes, apparently I confused uint8_t and unsigned char here because the former appears in Simon's comments. We also kept to the use of unsigned char* in our implementations (that is even part of the GNU coding guidelines if I remember correctly).

So then, if I understand correctly, the context would take over the function that the gcry_kem_hd_t has in Simon's API. I think it would be cleaner to use a KEM-specific type here like Simon proposes. Since we are designing a new API it seems we are free to use a new type here.

Although we don't use our usual s-expressions, we need to add a way to derive a keygrip from Kyber et al. and also to wrap the key into an s-expression so that it can be stored by gpg-agent in its usual files. An exported new API to get the keygrip of a KEM key would be good to avoid encapsulation, but for other purposes an encapsulation is still required.

BTW: In gpg we need to combine two KEMs which should be done within gpg and with two keygrips. This also gives us a nice opportunity to keep the ECC part on card. There will be questions on how the keygrip is communicated to users of gpg; this is not needed for everyday use, though.

A way to generate keys in the usual s-expression way has been added. This allows us to get the keygrip for the key.

It might now be useful to also have an interface to turn a KEM API specified key into an s-expression. It is actually easy to do with our s-expression functions but given that we have that API, why not add a way to do this. Or should we add an optional arg to gcry_kem_encap and decap which takes an sexp instead of kem-algo and key buffer?