Page MenuHome GnuPG

gpgme: add a faster variant of gpgme_get_key()
Open, HighPublic

Description

While debugging some slow gpgme code in libalpm I noticed that gpgme_get_key() is quite slow (I'm on Windows here, so that might exaggerate the issue).

And I noticed that 40% of the time is spend due to creating a new context, which, given the user serializes the operations, is not strictly needed from what I see.

See the attached patch for a proposal for a faster gpgme_get_key() variant that doesn't do that (missing tests and documentation).

If that's something that would makes sense in gpgme and would be considered, let me know!

Event Timeline

The context cloning should not be that expensive compared to invoking gpg. Thus let us first see how to speed up this in the common case.

yeah, I'd guess it's creating a new gpg instance with it. strace shows extra clone/pipe/read/fcntl syscalls with the new context.

I can provide my (hacky) benchmark snippet if needed.

Benchmark script:

Just asking the obvious: You are using an optimized release build for your benchmarks, right?

Moreover, if you have performance problems on Windows, then it's not the best idea to strace the code on Linux.

Finally, what's your use case? gpgme_get_key() is meant to be used for getting individual keys. It's not meant to be used to get 1000 keys in a loop.

If you mean gcc optimization flags, then yes.

I'd still guess the main culprit is creating new processes on Windows being slow.

For extra context, here is my downstream issue: https://github.com/msys2/msys2-pacman/issues/26

Finally, what's your use case? gpgme_get_key() is meant to be used for getting individual keys. It's not meant to be used to get 1000 keys in a loop.

In my case I'm getting 8 keys, which takes 400ms without this patch and 260ms with this patch.

If you got a limited list of, say, fingerprints, you should put them into an array and use gpgme_op_keylist_ext_start tolist only those keys. This will be much faster.

That's what I was initially trying to do, but then I saw https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gpgme.git;a=blob;f=src/keylist.c;h=1c01bd42b8497932d218e4d525794ed98e712bf5;hb=HEAD#l1362 and I wasn't sure if I needed to copy that logic to avoid introducing any regressions.

(Note that I'd totally understand if this was considered too much of an edge case... I just saw the FIXME and thought I'd give it a try)

The context cloning should not be that expensive compared to invoking gpg. Thus let us first see how to speed up this in the common case.

I agree with that. The expensive thing of listing a key is invoking gpg / reading the keyring. This would be solved by passing the 8 keys to list them at once, or do a full key listing and then take from that the needed keys in client code.

While cloning a context should just be some memory allocations. My strong suspicion is that the expensive call here is gpgme_set_engine_info which would do an engine_get_version internally that then calls gpg --version so we have two calls to gpg instead of one that we would have by reusing the same context. That would match to your results of roughly half the time needed with your patch.

I had the same suspicion andIchecked the code. afaics all values are taken from a cache (see dirinfo.c). Thus no real overhead.

@lazka: This edge case we try to detect works only if the result is returned sorted. That is not guaranteed. Thus if you throw in several fingerprints you better do your own sorting to be safe. The processing model of gpgme and gpg does not allow to return the results in a sorted order. However, if you use the new keyboxd you should be pretty safe becuase the fingerprint is used as a primary key in an SQL data base and thus it should be unique. See gnupg/kbx/backend-sqlite.c

@werner I saw the call in _gpgme_set_engine_info at line 452 https://dev.gnupg.org/source/gpgme/browse/master/src/engine.c$452 which I think leads down to _gpgme_get_program_version in version.c which does a spawn and uses no cache.

Indeed. The called function dates back to 2004. We really need to rework this and cache the value - it might be required to take the file_name into account.