# Description

ECC support in OpenPGP requires clarification for the interpretation of "MPI".

gniibe created this task. May 21 2020, 6:50 AM

I wrote this:

Better to paste directly:

```
# SOS representation
#
# Initially, it was intended as "Simply, Octet String", but
# it is actually "Strange" Octet String.
#

# Input is an integer
def sos_from_int(x):
    if x < 0:
        raise ValueError("Negative integer is not allowed in MPI")
    strange_nbits = x.bit_length()
    # NOTE: Nothing strange if it represents an integer
    return strange_nbits.to_bytes(2, byteorder='big') \
        + x.to_bytes((strange_nbits+7)//8, byteorder='big')

# Input can be representation of an EC point, scalar value, or opaque octets
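def sos_from_octetstring(o_s):
    if len(o_s) == 0:
        return b'\x00\x00'

    # Count all octets after the first as 8 bits each.
    strange_nbits = (len(o_s) - 1) * 8
    if o_s[0] == 0:
        # A leading zero octet is counted as a full 8 bits (the "strange" part).
        strange_nbits += 8
    else:
        strange_nbits += o_s[0].bit_length()
    return strange_nbits.to_bytes(2, byteorder='big') + o_s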

# The semantics of an SOS are determined by its context,
# specifically, by the curve OID.
def dump_sos(sos):
    if len(sos) < 2:
        raise ValueError("Length of SOS must be >= 2")
    # The first two octets are the big-endian bit count.
    strange_nbits = (sos[0] << 8) + sos[1]
    nbytes = (strange_nbits+7)//8
    if len(sos) != nbytes + 2:
        raise ValueError("Malformed SOS")

    print("This SOS may represent an octet string:")
    print("\t" + sos[2:].hex())
    print("")
    print("This SOS may represent an integer:")
    print("\t" + str(int.from_bytes(sos[2:], byteorder='big')))
    print("")
    print("It is the curve OID which decides the interpretation.")
    print("The semantics is determined by the curve OID.")
    print("Currently, it is like:")
    print("\tNIST P-256: integer (or integer which encodes an EC point)")
    print("\tNIST P-384: Likewise")
    print("\tNIST P-521: Likewise")
    print("\tbrainpoolP256r1: Likewise")
    print("\tbrainpoolP512r1: Likewise")
    print("\tEd25519: octet string (of an EC point) for public key")
    print("\tEd25519: octet string (of an EC point) for signature R")
    print("\tEd25519: octet string for signature value S")
    print("\tEd25519: octet string (of opaque binary) for secret part")
    print("\tCurve25519: octet string (of an EC point) for public key")
    print("\tCurve25519: octet string (of an EC point) for ephemeral key")
    print("\tCurve25519: integer for secret part")
    print("In future, for newer curves, it's simply octet string of native representation:")
    print("\tX448: non-prefixed octet string (of an EC point) for public key")
    print("\tX448: non-prefixed octet string (of an EC point) for ephemeral key")
    print("\tX448: octet string (of opaque binary) for secret part")
    print("\tEd448: non-prefixed octet string (of an EC point) for public key")
    print("\tEd448: non-prefixed octet string (of an EC point) for signature R")
    print("\tEd448: octet string for signature value S")
    print("\tEd448: octet string (of opaque binary) for secret part")

# In the case of Ed25519, special care is needed to recover actual value.
# When the length of octet string for signature value S is less than 32,
# zero octet should be padded at the front to make 32-octet string.
# When the length of octet string for secret part is less than 32,
# zero octet should be padded at the front to make 32-octet string.

# In the case of Curve25519, while an EC point is represented in
# native representation (nearly), secret part is not (but big-endian integer)
# This is considered ugly.  This way should be avoided for newer curve(s).

# The intention of defining SOS is to clarify existing practice of Ed25519 key
# so that we can improve/avoid implementation issues in future.
```

gniibe added a comment. Edited May 21 2020, 7:01 AM

Important interoperability issue:
OpenPGP implementations should implement:

• Recovery of leading zero octets for Ed25519 key handling (secret part) and Ed25519 signature
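
A minimal sketch of that recovery step, assuming the value arrives as the octet payload of an SOS with its leading zero octets stripped. The helper name `recover_leading_zeros` and the constant `ED25519_LEN` are hypothetical and are not taken from GnuPG or Libgcrypt:

```python
ED25519_LEN = 32  # assumed fixed octet length of an Ed25519 secret part or signature value


def recover_leading_zeros(octets: bytes, length: int = ED25519_LEN) -> bytes:
    """Left-pad `octets` with zero octets until it is `length` octets long."""
    if len(octets) > length:
        raise ValueError("value is longer than the expected length")
    # bytes.rjust pads on the left with the given fill octet.
    return octets.rjust(length, b"\x00")
```

A value that is already 32 octets long is returned unchanged, so the helper can be applied unconditionally.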

New things:
OpenPGP implementations are expected to accept:

• Malformed MPI (with leading zero octet(s)), which is valid in SOS:
  • For secret part of Ed25519/Curve25519/X448/Ed448 key
  • For signature value S
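
As an illustration only, the difference between a strict MPI check and SOS-style acceptance might look like the sketch below. The function names `read_sos` and `is_wellformed_mpi` are hypothetical, and the sketch assumes the two-octet big-endian bit count used in the snippet posted earlier in this task:

```python
def read_sos(buf: bytes) -> bytes:
    """Return the payload, checking only that its length matches the bit count."""
    if len(buf) < 2:
        raise ValueError("need a two-octet bit count")
    nbits = int.from_bytes(buf[:2], "big")
    nbytes = (nbits + 7) // 8
    if len(buf) != 2 + nbytes:
        raise ValueError("length does not match bit count")
    # Leading zero octets in the payload are accepted as-is.
    return buf[2:]


def is_wellformed_mpi(buf: bytes) -> bool:
    """True only if the bit count is exact for the integer value (strict MPI rule)."""
    payload = read_sos(buf)
    nbits = int.from_bytes(buf[:2], "big")
    return int.from_bytes(payload, "big").bit_length() == nbits
```

With these definitions, `b"\x00\x10\x00\x01"` (bit count 16, payload `00 01`) is readable by `read_sos` but is not a well-formed MPI, because the integer 1 has a bit length of 1.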

There are more places to clean up in GnuPG.
While "MPI" in the OpenPGP specification is based on unsigned integers, the default "MPI" handling of GnuPG/Libgcrypt is signed. This difference matters internally.
When formatting with "%m" in libgcrypt, the result may be prefixed by 0x00 (so that it represents an unsigned value, even if scanned as signed).
Because of this, existing private keys in private-keys-v1.d may have this leading zero byte.
However, the bit count does not include this byte.
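
The mismatch can be emulated in plain Python. This is only a sketch of the idea, not a call into libgcrypt; the function names are hypothetical, and the 0x00 prefix rule is modeled as "prepend a zero octet when the top bit is set":

```python
def unsigned_octets(x: int) -> bytes:
    """Minimal big-endian encoding of a non-negative integer (OpenPGP MPI payload)."""
    return x.to_bytes((x.bit_length() + 7) // 8, "big")


def signed_safe_octets(x: int) -> bytes:
    """Emulated signed-safe encoding: prefix 0x00 if the value would otherwise look negative."""
    raw = unsigned_octets(x)
    if raw and raw[0] & 0x80:
        return b"\x00" + raw
    return raw


value = 0x80
stored = signed_safe_octets(value)   # two octets: 00 80
bit_count = value.bit_length()       # 8, which covers one octet only
```

Here the stored form is two octets long while the bit count describes a single octet, which is the discrepancy described above.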

gniibe added a comment. Edited May 26 2020, 3:59 AM

Confusingly, the SSH specification uses a signed MPI.
See RFC 4251 for the definition of "mpint": https://tools.ietf.org/html/rfc4251#page-8

```
mpint

Represents multiple precision integers in two's complement format,
stored as a string, 8 bits per byte, MSB first.  Negative numbers
have the value 1 as the most significant bit of the first byte of
the data partition.  If the most significant bit would be set for
a positive number, the number MUST be preceded by a zero byte.
Unnecessary leading bytes with the value 0 or 255 MUST NOT be
included.  The value zero MUST be stored as a string with zero
bytes of data.

By convention, a number that is used in modular computations in
Z_n SHOULD be represented in the range 0 <= x < n.
```

But... it seems that negative numbers are not actually used, because the number is always used in modular computations.
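
For comparison with the OpenPGP MPI, here is a small sketch of the "mpint" encoding quoted above, assuming the RFC 4251 "string" framing (a four-octet big-endian length followed by the data). The function name `ssh_mpint` is hypothetical:

```python
def ssh_mpint(x: int) -> bytes:
    """Encode an integer as an SSH mpint: uint32 length + minimal two's complement."""
    if x == 0:
        data = b""  # zero is stored as a string with zero bytes of data
    else:
        # Bits needed for the magnitude; one extra sign bit is added by "// 8 + 1".
        magnitude_bits = x.bit_length() if x > 0 else (x + 1).bit_length()
        data = x.to_bytes(magnitude_bits // 8 + 1, "big", signed=True)
    return len(data).to_bytes(4, "big") + data
```

For the positive value 0x80 the data part is `00 80`: the zero octet is required there, whereas an OpenPGP MPI for the same value would carry the single octet `80` with a bit count of 8.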

In libgcrypt, there is another problem with GCRYSEXP_FMT_ADVANCED formatting, which gpg-agent in GnuPG 2.3 uses with name-value lists.

I should concentrate on the case of ECC, in particular, ECC with modern curves.
Removing leading zeros from RSA/ECC/ElGamal values, assuming unsigned integers, would result in more work.

In the SOS branch, commit rG1c4291c3951d ("ecc-sos: Add special leading zero octet removal.") should be reverted.