ECC support in OpenPGP requires clarification for the interpretation of "MPI".
Better to paste directly:
    # SOS representation
    #
    # Initially, it was intended as "Simply, Octet String", but
    # it is actually "Strange" Octet String.
    #
    # Input is an integer
    def sos_from_int(x):
        if x < 0:
            raise ValueError("Negative integer is not allowed in MPI")
        strange_nbits = x.bit_length()
        return strange_nbits.to_bytes(2, byteorder='big') \
            + x.to_bytes((strange_nbits+7)//8, byteorder='big')
    # NOTE: Nothing strange if it represents an integer

    # Input can be representation of an EC point, scalar value, or opaque octets
    def sos_from_octetstring(o_s):
        if len(o_s) == 0:
            return b'\x00\x00'
        strange_nbits = (len(o_s) - 1) * 8
        if o_s[0] == 0:
            # A leading zero octet is counted as a full 8 bits
            strange_nbits += 8
        else:
            strange_nbits += o_s[0].bit_length()
        return strange_nbits.to_bytes(2, byteorder='big') + o_s

    # The semantics of SOS is determined by its context,
    # specifically, by the curve OID.
    def dump_sos(sos):
        if len(sos) < 2:
            raise ValueError("Length of SOS must be >= 2")
        strange_nbits = (sos[0] << 8) + sos[1]
        nbytes = (strange_nbits+7)//8
        if len(sos) != nbytes + 2:
            raise ValueError("Malformed SOS")
        print("This SOS may represent an octet string:")
        print("\t" + sos[2:].hex())
        print("")
        print("This SOS may represent an integer:")
        # str() is needed: an int cannot be concatenated to a str
        print("\t" + str(int.from_bytes(sos[2:], byteorder='big')))
        print("")
        print("It is the curve OID which decides the interpretation.")
        print("The semantics is determined by the curve OID.")
        print("Currently, it is like:")
        print("\tNIST P-256: integer (or integer which encodes an EC point)")
        print("\tNIST P-384: Likewise")
        print("\tNIST P-521: Likewise")
        print("\tbrainpoolP256r1: Likewise")
        print("\tbrainpoolP512r1: Likewise")
        print("\tEd25519: octet string (of an EC point) for public key")
        print("\tEd25519: octet string (of an EC point) for signature R")
        print("\tEd25519: octet string for signature value S")
        print("\tEd25519: octet string (of opaque binary) for secret part")
        print("\tCurve25519: octet string (of an EC point) for public key")
        print("\tCurve25519: octet string (of an EC point) for ephemeral key")
        print("\tCurve25519: integer for secret part")
        print("In future, for newer curves, it's simply octet string of native representation:")
        print("\tX448: non-prefixed octet string (of an EC point) for public key")
        print("\tX448: non-prefixed octet string (of an EC point) for ephemeral key")
        print("\tX448: octet string (of opaque binary) for secret part")
        print("\tEd448: non-prefixed octet string (of an EC point) for public key")
        print("\tEd448: non-prefixed octet string (of an EC point) for signature R")
        print("\tEd448: octet string for signature value S")
        print("\tEd448: octet string (of opaque binary) for secret part")

    # In the case of Ed25519, special care is needed to recover actual value.
    # When the length of octet string for signature value S is less than 32,
    # zero octet should be padded at the front to make 32-octet string.
    # When the length of octet string for secret part is less than 32,
    # zero octet should be padded at the front to make 32-octet string.

    # In the case of Curve25519, while an EC point is represented in
    # native representation (nearly), secret part is not (but big-endian integer)
    # This is considered ugly. This way should be avoided for newer curve(s).

    # The intention of defining SOS is to clarify existing practice of Ed25519 key
    # so that we can improve/avoid implementation issues in future.
Important interoperability issue:
OpenPGP implementations should implement:
- Recovery of leading zero octets for Ed25519 key handling (secret part) and Ed25519 signature
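The recovery step can be sketched as below. This is only an illustration, not GnuPG code: the helper name `recover_leading_zeros` and the constant `ED25519_OCTETS` are made up here, and the sketch assumes the 2-octet bit count has already been removed from the value.

```python
ED25519_OCTETS = 32  # size of an Ed25519 secret part and of signature value S


def recover_leading_zeros(octets: bytes, size: int = ED25519_OCTETS) -> bytes:
    """Left-pad `octets` with zero octets up to `size` (hypothetical helper)."""
    if len(octets) > size:
        raise ValueError("value is longer than the fixed size")
    return octets.rjust(size, b"\x00")


# A 31-octet S whose leading zero octet was dropped by an MPI-style writer.
short_s = bytes(range(1, 32))
assert recover_leading_zeros(short_s) == b"\x00" + short_s
```

A value that already has 32 octets passes through unchanged, so the helper can be applied unconditionally.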
New things:
OpenPGP implementations are expected to accept:
- Malformed MPI (with leading zero octet(s)), which is valid in SOS
  - For the secret part of an Ed25519/Curve25519/X448/Ed448 key
  - For signature value S
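What "accepting" means here can be shown with a small reader. This is a sketch under assumptions: `parse_sos` and its `strict_mpi` flag are hypothetical names, and the strict mode is just one way to express "well-formed MPI" (the bit count must equal the bit length of the unsigned value).

```python
def parse_sos(buf: bytes, strict_mpi: bool = False) -> bytes:
    """Return the payload of one SOS/MPI field at the start of `buf`.

    Hypothetical helper. Data after the field is ignored.
    """
    if len(buf) < 2:
        raise ValueError("missing 2-octet bit count")
    nbits = int.from_bytes(buf[:2], "big")
    nbytes = (nbits + 7) // 8
    body = buf[2:2 + nbytes]
    if len(body) != nbytes:
        raise ValueError("truncated field")
    # A well-formed MPI has no leading zero octets, so its bit count
    # matches the bit length of the value exactly.
    if strict_mpi and int.from_bytes(body, "big").bit_length() != nbits:
        raise ValueError("not a well-formed MPI")
    return body


# Bit count 16 with a leading zero octet: malformed as MPI, valid as SOS.
field = b"\x00\x10\x00\x01"
assert parse_sos(field) == b"\x00\x01"
```

With `strict_mpi=True` the same field is rejected, which is the behaviour that breaks interoperability for the cases listed above.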
There are more places in GnuPG that need cleaning up.
While "MPI" in the OpenPGP specification is based on an unsigned integer, the default "MPI" handling of GnuPG/Libgcrypt is signed. This difference matters internally.
When a value is formatted with "%m" in libgcrypt, the result may be prefixed by 0x00 (so that it represents an unsigned value even if it is scanned as signed).
Because of this, existing private keys in private-keys-v1.d may have this leading zero byte.
But the bit count does not include this byte.
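The effect can be modelled in a few lines. This is a model of the behaviour described above, not libgcrypt code; both function names are made up for the illustration.

```python
def format_signed_style(value: int) -> bytes:
    """Big-endian octets, prefixed by 0x00 when the top bit is set (model only)."""
    if value < 0:
        raise ValueError("only non-negative values are modelled")
    body = value.to_bytes((value.bit_length() + 7) // 8, "big")
    if body and body[0] & 0x80:
        # Keep the value positive for a reader that scans it as signed.
        body = b"\x00" + body
    return body


def counted_bits(body: bytes) -> int:
    """Bit count of the unsigned value; a leading zero octet adds nothing."""
    return int.from_bytes(body, "big").bit_length()


stored = format_signed_style(0x80)
print(stored.hex(), len(stored), counted_bits(stored))  # prints: 0080 2 8
```

So the stored value is two octets long while the bit count says eight bits, which is the mismatch that later code has to cope with.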
Confusingly, the SSH specification uses a signed MPI.
See RFC 4251 for the definition of "mpint": https://tools.ietf.org/html/rfc4251#page-8
mpint Represents multiple precision integers in two's complement format, stored as a string, 8 bits per byte, MSB first. Negative numbers have the value 1 as the most significant bit of the first byte of the data partition. If the most significant bit would be set for a positive number, the number MUST be preceded by a zero byte. Unnecessary leading bytes with the value 0 or 255 MUST NOT be included. The value zero MUST be stored as a string with zero bytes of data. By convention, a number that is used in modular computations in Z_n SHOULD be represented in the range 0 <= x < n.
But... it seems that negative numbers are not actually used, because the number is always used in modular computations.
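For comparison, the mpint rule quoted above can be written as follows. The function name `ssh_mpint` is made up here; the encoding follows the quoted text (uint32 length, then minimal two's complement, MSB first).

```python
def ssh_mpint(x: int) -> bytes:
    """Encode x as an RFC 4251 mpint: uint32 length, then two's complement."""
    if x == 0:
        body = b""  # zero is stored with zero bytes of data
    else:
        # Bits needed for the magnitude; one more bit is needed for the sign.
        magnitude_bits = x.bit_length() if x > 0 else (~x).bit_length()
        body = x.to_bytes(magnitude_bits // 8 + 1, "big", signed=True)
    return len(body).to_bytes(4, "big") + body


# Examples from RFC 4251, section 5.
assert ssh_mpint(0x80).hex() == "000000020080"
assert ssh_mpint(-0x1234).hex() == "00000002edcc"
```

For values in the range 0 <= x < n, the only visible difference from an unsigned encoding is the extra zero byte when the top bit is set.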
In libgcrypt, we have another problem with GCRYSEXP_FMT_ADVANCED formatting, which is used by gpg-agent of GnuPG 2.3 with the name-value list.
I should concentrate on the case of ECC, in particular, ECC with modern curves.
Removing the leading zero from RSA/ECC/ElGamal values, assuming unsigned integers, would result in more work.
In the SOS branch, rG1c4291c3951d ("ecc-sos: Add special leading zero octet removal.") should be reverted.
Instead, the S_KEY should be fixed up in read_key_file in findkey.c and in merge_lists in protect.c.
(Then it does not need to be fixed up in extract_private_key.)
Do the following for ECC:
(1) Locate the private key part 'd'
(2) Determine the size of the private part
(3) Fix up the private part to that fixed size
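The three steps can be sketched in Python, with the key modelled as nested lists standing in for the S-expression. Everything here is an assumption made for illustration: the names `SECRET_SIZE`, `find_param` and `fixup_private_part`, the list layout, and the restriction to the modern curves. It is not the code in findkey.c or protect.c.

```python
# Native secret sizes in octets for the modern curves (illustrative table).
SECRET_SIZE = {"Ed25519": 32, "Curve25519": 32, "X448": 56, "Ed448": 57}


def find_param(sexp, name):
    """Return the first sub-list whose head is `name`, searching depth-first."""
    if isinstance(sexp, list):
        if sexp and sexp[0] == name:
            return sexp
        for item in sexp:
            found = find_param(item, name)
            if found is not None:
                return found
    return None


def fixup_private_part(key):
    """Normalize 'd' in place to the fixed size of the key's curve."""
    d_node = find_param(key, "d")                    # (1) locate 'd'
    size = SECRET_SIZE[find_param(key, "curve")[1]]  # (2) determine the size
    stripped = d_node[1].lstrip(b"\x00")
    if len(stripped) > size:
        raise ValueError("private part is longer than the curve allows")
    d_node[1] = stripped.rjust(size, b"\x00")        # (3) fix up to that size
    return key


# 'd' carrying the extra 0x00 that signed-style formatting may have added.
key = ["private-key", ["ecc", ["curve", "Ed25519"], ["flags", "eddsa"],
                       ["q", b"\x40" + bytes(32)],
                       ["d", b"\x00\x80" + bytes(31)]]]
fixup_private_part(key)
assert find_param(key, "d")[1] == b"\x80" + bytes(31)
```

Stripping first and padding afterwards handles both cases with one code path: an extra leading zero byte, and leading zero octets that went missing.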