ECC support in OpenPGP requires clarification for the interpretation of "MPI".
Better to paste directly:
# SOS representation
#
# Initially, it was intended as "Simply, Octet String", but
# it is actually "Strange" Octet String.
#
# Input is an integer
def sos_from_int(x):
    if x < 0:
        raise ValueError("Negative integer is not allowed in MPI")
    strange_nbits = x.bit_length()
    return strange_nbits.to_bytes(2, byteorder='big') \
        + x.to_bytes((strange_nbits+7)//8, byteorder='big')
# NOTE: Nothing strange if it represents an integer
# Input can be representation of an EC point, scalar value, or opaque octets
def sos_from_octetstring(o_s):
    if len(o_s) == 0:
        return b'\x00\x00'
    strange_nbits = (len(o_s) - 1) * 8
    if o_s[0] == 0:
        strange_nbits += 8
    else:
        strange_nbits += o_s[0].bit_length()
    return strange_nbits.to_bytes(2, byteorder='big') + o_s
# The semantic of SOS is determined by its context,
# specifically, by the curve OID.
def dump_sos(sos):
    if len(sos) < 2:
        raise ValueError("Length of SOS must be >= 2")
    strange_nbits = (sos[0] << 8) + sos[1]
    nbytes = (strange_nbits+7)//8
    if len(sos) != nbytes + 2:
        raise ValueError("Malformed SOS")
    print("This SOS may represent an octet string:")
    print("\t" + sos[2:].hex())
    print("")
    print("This SOS may represent an integer:")
    # str() is needed here: concatenating str and int raises TypeError
    print("\t" + str(int.from_bytes(sos[2:], byteorder='big')))
    print("")
    print("It is the curve OID which decides the interpretation.")
    print("The semantics is determined by the curve OID.")
    print("Currently, it is like:")
    print("\tNIST P-256: integer (or integer which encodes an EC point)")
    print("\tNIST P-384: Likewise")
    print("\tNIST P-521: Likewise")
    print("\tbrainpoolP256r1: Likewise")
    print("\tbrainpoolP512r1: Likewise")
    print("\tEd25519: octet string (of an EC point) for public key")
    print("\tEd25519: octet string (of an EC point) for signature R")
    print("\tEd25519: octet string for signature value S")
    print("\tEd25519: octet string (of opaque binary) for secret part")
    print("\tCurve25519: octet string (of an EC point) for public key")
    print("\tCurve25519: octet string (of an EC point) for ephemeral key")
    print("\tCurve25519: integer for secret part")
    print("In future, for newer curves, it's simply octet string of native representation:")
    print("\tX448: non-prefixed octet string (of an EC point) for public key")
    print("\tX448: non-prefixed octet string (of an EC point) for ephemeral key")
    print("\tX448: octet string (of opaque binary) for secret part")
    print("\tEd448: non-prefixed octet string (of an EC point) for public key")
    print("\tEd448: non-prefixed octet string (of an EC point) for signature R")
    print("\tEd448: octet string for signature value S")
    print("\tEd448: octet string (of opaque binary) for secret part")
# In the case of Ed25519, special care is needed to recover the actual value.
# When the length of the octet string for signature value S is less than 32,
# zero octets should be padded at the front to make a 32-octet string.
# When the length of the octet string for the secret part is less than 32,
# zero octets should be padded at the front to make a 32-octet string.
# In the case of Curve25519, while an EC point is represented in
# (nearly) native representation, the secret part is not (it is a big-endian integer).
# This is considered ugly. This way should be avoided for newer curve(s).
# The intention of defining SOS is to clarify the existing practice for Ed25519 keys
# so that we can improve/avoid implementation issues in future.
Important interoperability issue:
OpenPGP implementations should implement:
- Recovery of leading zero octets for Ed25519 key handling (secret part) and Ed25519 signature
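A minimal sketch of that recovery, assuming the value has already been extracted from the SOS as raw octets. The helper name `recover_leading_zeros` is hypothetical and not taken from any implementation; 32 is the Ed25519 length mentioned in the comments above.

```python
ED25519_LEN = 32  # length of the Ed25519 secret part and of signature value S

def recover_leading_zeros(octets: bytes, size: int = ED25519_LEN) -> bytes:
    """Left-pad with zero octets that were dropped by MPI-style encoding.

    Hypothetical helper for illustration only.
    """
    if len(octets) > size:
        raise ValueError("value is longer than the expected fixed size")
    # bytes.rjust pads on the left with the given fill byte
    return octets.rjust(size, b"\x00")
```

For example, a 31-octet S read from a signature comes back as 32 octets with one zero octet in front, while a 32-octet value is returned unchanged.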
New things:
OpenPGP implementations are expected to accept:
- Malformed MPI (with leading zero octet(s)), which is valid in SOS
  - For secret part of Ed25519/Curve25519/X448/Ed448 key
  - For signature value S
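To make the difference concrete, here is a rough sketch of a reader that can either insist on a well-formed MPI or accept the SOS form. `parse_sos` and its `strict_mpi` flag are names made up for this illustration; the "well-formed" test assumes the usual MPI rule that the bit count equals the exact bit length of the value.

```python
def parse_sos(buf: bytes, strict_mpi: bool = False) -> bytes:
    """Return the octets carried by a 2-octet bit count plus body.

    Hypothetical helper: with strict_mpi=True, leading zero octets
    (or any other bit count mismatch) are rejected; otherwise accepted.
    """
    if len(buf) < 2:
        raise ValueError("Length of SOS must be >= 2")
    nbits = int.from_bytes(buf[:2], byteorder="big")
    nbytes = (nbits + 7) // 8
    body = buf[2:]
    if len(body) != nbytes:
        raise ValueError("Malformed SOS")
    if strict_mpi and body:
        # A well-formed MPI counts bits from the most significant set bit
        expected_top_bits = nbits - 8 * (nbytes - 1)
        if body[0].bit_length() != expected_top_bits:
            raise ValueError("Not a well-formed MPI")
    return body
```

With `strict_mpi=False`, the input `00 10 00 7f` (bit count 16, leading zero octet) is returned as `00 7f`; with `strict_mpi=True` it is rejected.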
There are more places for clean up in GnuPG.
While "MPI" in the OpenPGP specification is based on unsigned integers, the default "MPI" handling of GnuPG/Libgcrypt is signed. This difference matters internally.
When formatting with "%m" in libgcrypt, the result may be prefixed by 0x00 (so that it represents an unsigned value, even if scanned as signed).
Because of this, existing private keys in private-keys-v1.d may have this leading zero byte.
But the bit count does not include this byte.
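The effect can be shown without libgcrypt. The following Python sketch only mimics the two conventions for non-negative values; `unsigned_octets`, `signed_octets` and `mpi_bit_count` are hypothetical names for this illustration, not libgcrypt functions.

```python
def unsigned_octets(x: int) -> bytes:
    """Minimal big-endian encoding of a non-negative integer."""
    return x.to_bytes((x.bit_length() + 7) // 8, byteorder="big")

def signed_octets(x: int) -> bytes:
    """Minimal two's complement encoding of a non-negative integer.

    One spare bit is reserved for the sign, so a value whose top bit
    is set gets a 0x00 octet in front.
    """
    return x.to_bytes(x.bit_length() // 8 + 1, byteorder="big", signed=True)

def mpi_bit_count(octets: bytes) -> int:
    """Bit count as an unsigned MPI would state it (leading zeros ignored)."""
    return int.from_bytes(octets, byteorder="big").bit_length()
```

Here `signed_octets(0x80)` is `00 80` while `unsigned_octets(0x80)` is `80`, and `mpi_bit_count` gives 8 for both, which is the mismatch between stored bytes and counted bits described above.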
Confusingly, in the SSH specification, it is signed MPI.
See RFC4251, for the definition of "mpint": https://tools.ietf.org/html/rfc4251#page-8
mpint
Represents multiple precision integers in two's complement format,
stored as a string, 8 bits per byte, MSB first. Negative numbers
have the value 1 as the most significant bit of the first byte of
the data partition. If the most significant bit would be set for
a positive number, the number MUST be preceded by a zero byte.
Unnecessary leading bytes with the value 0 or 255 MUST NOT be
included. The value zero MUST be stored as a string with zero
bytes of data.
By convention, a number that is used in modular computations in
Z_n SHOULD be represented in the range 0 <= x < n.
But... it seems that negative numbers are not actually used, because the number is always used in modular computations.
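For comparison with the OpenPGP MPI, the mpint rule quoted above can be sketched as follows. The function name `ssh_mpint` is made up for this illustration; the 4-octet length prefix is the RFC 4251 "string" framing.

```python
def ssh_mpint(x: int) -> bytes:
    """Encode an integer as an RFC 4251 mpint (two's complement, MSB first)."""
    if x == 0:
        data = b""  # zero is stored as a string with zero bytes of data
    else:
        # Bits needed for the magnitude; ~x == -x - 1 handles negatives
        magnitude_bits = x.bit_length() if x > 0 else (~x).bit_length()
        # One extra bit for the sign gives the minimal two's complement length
        nbytes = magnitude_bits // 8 + 1
        data = x.to_bytes(nbytes, byteorder="big", signed=True)
    return len(data).to_bytes(4, byteorder="big") + data
```

So 0x80 is encoded with data `00 80`, unlike the OpenPGP MPI where the body would be the single octet `80`.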
In libgcrypt, we have another problem of GCRYSEXP_FMT_ADVANCED formatting, which is used by gpg-agent of GnuPG 2.3 with name-value list.
I should concentrate on the case of ECC, in particular, ECC with modern curves.
Removing leading zeros from RSA/ECC/ElGamal values, assuming unsigned integers, would result in more work.
In the SOS branch, rG1c4291c3951d (ecc-sos: Add special leading zero octet removal.) should be reverted.
Instead, the S_KEY should be fixed up in read_key_file in findkey.c,
and merge_lists in protect.c.
(Then, no need to be fixed up in extract_private_key.)
Do the following for ECC:
(1) Locate the place of private key part 'd'
(2) Determine the size of the private part
(3) Fix up the private part to the fixed size
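The three steps can be sketched in Python as below. This is only an illustration of the idea, not the code for findkey.c or protect.c: the key is modeled as a plain dict standing in for the S-expression, and `PRIVATE_SIZE` and `fixup_private_part` are hypothetical names. The sizes are the native secret lengths of the modern curves (32, 32, 56 and 57 octets).

```python
# Assumed table for illustration: native size of the private part per curve
PRIVATE_SIZE = {"Ed25519": 32, "Curve25519": 32, "X448": 56, "Ed448": 57}

def fixup_private_part(key: dict) -> dict:
    """Return a copy of the key with 'd' normalized to the curve's fixed size."""
    d = key["d"]                          # (1) locate the private part 'd'
    size = PRIVATE_SIZE[key["curve"]]     # (2) determine its size from the curve
    stripped = d.lstrip(b"\x00")          # drop any 0x00 prefix from signed formatting
    if len(stripped) > size:
        raise ValueError("private part is longer than the curve allows")
    fixed = stripped.rjust(size, b"\x00") # (3) pad back to exactly the fixed size
    return {**key, "d": fixed}
```

A 33-octet 'd' with a leading 0x00 is cut down to 32 octets for Ed25519, and a 31-octet 'd' gets one zero octet put back in front.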