Page MenuHome GnuPG

gcry_sexp_canon_len() documentation claims that valid S-expressions will never return 0, but it returns 0 if an empty string is found in a valid S-expression
Closed, WontfixPublic

Description

gcry_sexp_canon_len() is documented like this:

Often canonical encoding is used in the external representation.  The
following function can be used to check for valid encoding and to learn
the length of the S-expression.

 -- Function: size_t gcry_sexp_canon_len (const unsigned char *BUFFER,
          size_t LENGTH, size_t *ERROFF, int *ERRCODE)

     Scan the canonical encoded BUFFER with implicit length values and
     return the actual length this S-expression uses.  For a valid
     S-expression it should never return 0.  If LENGTH is not 0, the
     maximum length to scan is given; this can be used for syntax checks
     of data passed from outside.  ERRCODE and ERROFF may both be passed
     as 'NULL'.

However, if you pass it the valid S-expression ("") it returns 0.

On gcrypt-devel @werner justifies this choice with a statement from Rivest's S-expression documentation:

10. Utilization of S-expressions

This note has described S-expressions in general form.  Application writers
may wish to restrict their use of S-expressions in various ways.  Here are
some possible restrictions that might be considered:

	-- no display-hints
	-- no lengths on hexadecimal, quoted-strings, or base-64 encodings
	-- no empty lists
	-- no empty octet-strings
	-- no lists having another list as its first element
	-- no base-64 or hexadecimal encodings
	-- fixed limits on the size of octet-strings

However, libgcrypt is a library, not an application, and nothing in Rivest's document suggests that an S-expression with an empty octet-string is strictly invalid. and gcry_sexp_canon_len doesn't enforce some of these other rules (e.g. empty lists are allowed).

Furthermore, gcrypt's own dumpsexp doesn't have a problem with empty octet strings in non-canonicalized form, but only has it in canonicalized form:

0 dkg@alice:~/src/nettle/nettle/tools$ echo '("")' |  dumpsexp 
00000000  28 22 22 29 0a                                    |("").|
0 dkg@alice:~/src/nettle/nettle/tools$ echo '(0:)' |  dumpsexp 
00000000  28 30                                             |(0|
             ^                                                ^
          Error: zero prefixed length
00000002        3a                                          |  :|
                ^                                              ^
          Error: no data length
00000003           29 0a                                    |   ).|
0 dkg@alice:~/src/nettle/nettle/tools$

Either the gcrypt documentation should be amended to state exactly which constraints it intends to enforce, so that users of the API can have a solid guarantee (or can abandon the library if it doesn't meet their application's needs) or it should avoid arbitrary constraints where possible.

Details

Version
1.8.4

Event Timeline

werner claimed this task.

By marking this as "wontfix", you appear to be saying that you won't even fix the documentation to describe the constraints that gcrypt intends to enforce. This is surprising to me.

I don't see why the documentation needs to be fixed. gcry_sexp_canon_len returns 0 for certain and s-expressions, meaning tha the s-expression is not valid. After all the s-expression code in libgcrypt does not claim to be a general purpose parser for s-expression but is targeted towards Libgcrypt needs.