Order of magnitude degradation in performance in gpg2 cf gpg
Open, NormalPublic

Description

gpg2 is roughly ten times slower at running --check-trustdb than gpg. During
this time gpg2 runs at 100% cpu. I have also noticed performance problems with
listing keys.

This is unrelated to T1938, as I am not using a keybox.

$ time gpg --check-trustdb; time gpg2 --check-trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0 valid: 6 signed: 9 trust: 0-, 0q, 0n, 0m, 0f, 6u
gpg: depth: 1 valid: 9 signed: 19 trust: 0-, 0q, 0n, 1m, 8f, 0u
gpg: depth: 2 valid: 18 signed: 147 trust: 7-, 0q, 0n, 5m, 6f, 0u
gpg: depth: 3 valid: 17 signed: 300 trust: 4-, 0q, 0n, 8m, 5f, 0u
gpg: depth: 4 valid: 158 signed: 616 trust: 49-, 0q, 0n, 91m, 18f, 0u
gpg: next trustdb check due at 2015-07-04

real 0m27.898s
user 0m24.932s
sys 0m0.696s
gpg: Note: signatures using the MD5 algorithm are rejected
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0 valid: 6 signed: 9 trust: 0-, 0q, 0n, 0m, 0f, 6u
gpg: depth: 1 valid: 9 signed: 19 trust: 0-, 0q, 0n, 1m, 8f, 0u
gpg: depth: 2 valid: 18 signed: 147 trust: 7-, 0q, 0n, 5m, 6f, 0u
gpg: depth: 3 valid: 17 signed: 300 trust: 4-, 0q, 0n, 8m, 5f, 0u
gpg: depth: 4 valid: 158 signed: 613 trust: 50-, 0q, 0n, 90m, 18f, 0u
gpg: next trustdb check due at 2015-07-04

real 5m29.332s
user 4m30.520s
sys 0m58.216s

$ gpg --version
gpg (GnuPG) 1.4.18
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Home: ~/.gnupg
Supported algorithms:
Pubkey: RSA, RSA-E, RSA-S, ELG-E, DSA
Cipher: IDEA, 3DES, CAST5, BLOWFISH, AES, AES192, AES256, TWOFISH,

CAMELLIA128, CAMELLIA192, CAMELLIA256

Hash: MD5, SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224
Compression: Uncompressed, ZIP, ZLIB, BZIP2

$ gpg2 --version
gpg (GnuPG) 2.0.26
libgcrypt 1.6.3
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Home: ~/.gnupg
Supported algorithms:
Pubkey: RSA, RSA, RSA, ELG, DSA
Cipher: IDEA, 3DES, CAST5, BLOWFISH, AES, AES192, AES256, TWOFISH,

CAMELLIA128, CAMELLIA192, CAMELLIA256

Hash: MD5, SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224
Compression: Uncompressed, ZIP, ZLIB, BZIP2

Details

Version
2.0.26
andrewgdotcom set Version to 2.0.26.
andrewgdotcom added a subscriber: andrewgdotcom.

I'll second this report, note that I noticed this change in behaviour
when migrating from Wheezy to Jessie and from gnupg2
2.0.25-99intevation2 to 2.0.26-6

More details copied from gnupg-devel@:
----------Original Message----------
From: Bernhard Reiter <bernhard@intevation.de>
Sent: Tuesday 23 June 2015, 16:21:54
To: gnupg-devel@gnupg.org
Subject: trustdb calculation very slow (gpg2.0 vs gpg1)

Hi
on a Debian GNU/Linux Jessie system I have a defect that
the trust calculation of a cert is taking a lot of time.
Almost 2 minutes with 100%cpu.
This used to be different with early gnupg2 versions I've used
and it is different to gpg (1), where it were 20 seconds.

No report found, thus seem to depend on my trustdb.
Any ideas on how to create a good report/debug?

Details:
How to recreate:
gpg2 --delete-key 52D717F3
time LANG=C gpg2 -v --recv-keys 52D717F3
real 1m57.144s
user 1m46.560s
sys 0m9.296s

as compared to gpg verison 1

gpg2 --delete-key 52D717F3
time LANG=C gpg --recv-keys 52D717F3
real 0m19.840s
user 0m18.192s
sys 0m0.900s

ii gnupg 1.4.18-7 i386 GNU privacy guard - a free PGP re
ii gnupg2 2.0.26-6 i386 GNU privacy guard - a free PGP re

With 2.0.25-99intevation2 on Wheezy I did not notice the issue,
but of course, my trustdb changed over time.

----------Original Message----------
From: "Dr. Peter Voigt" <pvoigt@uos.de>
Sent: Tuesday 23 June 2015, 23:35:14
To: gnupg-devel@gnupg.org
[..]

I am just having a fully configured VirtualBox VM of Jessie with Gnome
3 available, but I am not having this problem:

% time LANG=C gpg2 -v --recv-keys 52D717F3
...
LANG=C gpg2 -v --recv-keys 52D717F3 0.11s user 0.06s system 7% cpu
2.304 total

% dpkg -l gnupg2
Desired=Unknown/Install/Remove/Purge/Hold

Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend

/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)/

Name Version Architecture
Description

+++-

ii gnupg2 2.0.26-6 amd64 GNU
privacy guard - a free PGP replacement (new v2.x)

---------Original Message----------
From: Bernhard Reiter <bernhard@intevation.de>
Sent: Wednesday 24 June 2015, 09:17:35
To: gnupg-devel@gnupg.org

On Tuesday 23 June 2015 at 23:35:14, Dr. Peter Voigt wrote:

I am just having a fully configured VirtualBox VM of Jessie with Gnome
3 available, but I am not having this problem:

thanks for trying to reproduce!
As your result is showing as well:
It probably depends on my .gnupg data like the contents of my trustdb,
private keys and settings.

Of course I cannot just make this available,
so I am grateful for hints to find out more about the issue.

I could try to export, delete and reimport the trustdb, but because
gpg 1.4 works on it a lot better I assume that it is not a simple structural
problem of my trustdb. (Which I hope both gpg versions would detect.) So I am
not sure, if this is a good step to take.

gpg2 is unusable under jessie. Just browsing my inbox in icedove/thunderbird
causes the following:

top - 11:08:34 up 17 days, 35 min, 7 users, load average: 4.47, 3.24, 1.93
Tasks: 243 total, 4 running, 239 sleeping, 0 stopped, 0 zombie
%Cpu(s): 60.9 us, 16.3 sy, 0.0 ni, 21.7 id, 1.0 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem: 8072496 total, 7827000 used, 245496 free, 117576 buffers
KiB Swap: 7688188 total, 1291184 used, 6397004 free. 1675144 cached Mem

PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND

3420 agallag+ 20 0 78616 57736 3572 R 99.5 0.7 6:06.75 gpg2

3494 agallag+ 20 0 33040 11912 3532 R 99.5 0.1 4:52.95 gpg2

3424 agallag+ 20 0 31940 10796 3468 R 99.2 0.1 3:31.33 gpg2

8743 agallag+ 20 0 5129560 2.326g 44372 S 5.3 30.2 1262:07 iceweasel

7743 agallag+ 20 0 1898320 198764 26908 S 2.3 2.5 225:11.77 gnome-shell

3407 root 20 0 0 0 0 S 0.7 0.0 0:00.50 kworker/3:0

4344 jira 20 0 3547040 581920 5216 S 0.7 7.2 84:02.57 java

3610 agallag+ 20 0 25944 3168 2428 R 0.3 0.0 0:00.01 top

3718 root 20 0 364320 63932 41096 S 0.3 0.8 110:25.72 Xorg

5899 agallag+ 20 0 2750304 86372 13620 S 0.3 1.1 2:40.06 dropbox

7858 agallag+ 20 0 419240 25884 9216 S 0.3 0.3 12:40.70
gnome-terminal-
10159 agallag+ 20 0 2155424 602372 46804 S 0.3 7.5 116:15.04 icedove

                                                
    1 root      20   0   15484   1084   1048 S   0.0  0.0   0:06.36 init

I can't use gpg1 in the meantime because I have a smartcard.

werner added a subscriber: werner.Jul 30 2015, 10:01 PM

To test whether there is a problem with the key signature verifcation test, you
may run your test cases again with the added option --no-sig-cache. These
things are not easy to replicate because they depend on the keys in your keyring
and your ownertrust settings.

neal added a subscriber: neal.Oct 29 2015, 11:31 AM

I've implemented a new db format. It's still incomplete and experimental, but
it's available from the neal/next branch. Importing
/usr/share/keyrings/debian-keyring.gpg, which contains 751 keys is much faster
using this format:

$ rm pubring.kdb; time gpg2 --no-default-keyring --primary-keyring
gnupg-kdb:pubring.kdb --import debian-keyring.gpg >/dev/null
gpg: Total number processed: 751
gpg: imported: 751

real 0m7.729s
user 0m5.404s
sys 0m0.332s
$ rm pubring.kdx; time gpg2 --no-default-keyring --primary-keyring
gnupg-kbx:pubring.kdx --import debian-keyring.gpg >/dev/null
gpg: Total number processed: 751
gpg: imported: 751
gpg: public key of ultimately trusted key 2183839A not found
gpg: public key of ultimately trusted key BC15C85A not found
gpg: public key of ultimately trusted key EE37CF96 not found

real 1m52.560s
user 0m6.268s
sys 0m31.604s

Running --check-trustdb is almost an order of magnitude faster:

$ time gpg2 --no-default-keyring --primary-keyring gnupg-kdb:pubring.kdb
--check-trustdb
real 0m0.158s
user 0m0.004s
sys 0m0.004s
$ time gpg2 --no-default-keyring --primary-keyring gnupg-kbx:pubring.kbx
--check-trustdb
real 0m0.975s
user 0m0.012s
sys 0m0.032s

Doing a sequential read is a bit slower:

$ time gpg2 --no-default-keyring --primary-keyring gnupg-kdb:pubring.kdb -k |
grep ^pub | wc -l
751

real 0m2.515s
user 0m2.432s
sys 0m0.088s
$ time gpg2 --no-default-keyring --primary-keyring gnupg-kbx:pubring.kdx -k |
grep ^pub | wc -l
gpg: NOTE: THIS IS A DEVELOPMENT VERSION!
gpg: It is only intended for test purposes and should NOT be
gpg: used in a production environment or with production keys!
751

real 0m1.245s
user 0m1.168s
sys 0m0.076s

This is because the interface for doing a full scan of the DB is unsuitable. If
we decide to use the new format, it shouldn't be hard to improve this.

I'd be interested in any feedback and perhaps some more measurements in real
conditions.

Thanks,

Neal

bernhard added a subscriber: wk.Oct 29 2015, 3:40 PM

@werner
Running with --no-sig-cache took 30 Minutes.
gpg2 --delete-key 52D717F3
time LANG=C gpg2 -v --no-sig-cache --recv-keys 52D717F3
real 29m38.897s

While time LANG=C gpg2 -v --recv-keys 52D717F3 took 2 minutes.
Debian gnupg2 Version: 2.0.26-6 i386.

@neal:
Thanks for working on this, if you think it may may sense to test this
with real data, can you point to the steps required to do this?
(I guess building gpg-2.1-from your git branch, ...)

@All,
any idea what the change between 2.0.25-99intevation2 on Wheezy
and 2.0.26-6 on Jessie could be that would cause this problem?
(Or is it just a few small certs or trust settings more that will cause
this one magnitude higher load)

neal added subscribers: dkg, guilhem.Nov 23 2015, 3:13 PM

@guilhem, @dkg I've cc'd you on this since you seem to be interested in this code.

I've just updated the branch with a few small bug fixes. Most importantly, I
fixed the memory problem by limiting the read-ahead cache to 20 MB. @guilhem:
I'd be interested to hear whether this fixes the problem that you observed.

@bernhard Sorry for not getting back to you sooner. If you checkout neal/kdb,
you'll get the latest code for the kdb format. Set --homedir or GNUPGHOME
appropriately, import your keyring and then try some operations:

$ mkdir /tmp/gnupg-kbx

$ gpg2 --export | time gpg2 --homedir /tmp/gnupg-kbx --no-default-keyring

--keyring gnupg-kdb:pubring.kdb --import /tmp/keys 2>/dev/null

This makes 6.5 seconds on my box using the kdb format and just under 2 min using
kdb. See
https://lists.gnupg.org/pipermail/gnupg-devel/2015-November/030525.html for some
example benchmarks.

I'm particularly interested to hear whether this fixes the major performance
problem that you're experiencing.

Thanks.

marcus added a subscriber: marcus.Jul 5 2017, 1:53 PM

As the @neal branch has not been updated anymore, I wonder what the status of this report is. Do we have a canonical test case and a performance goal, or anything else that let's us evaluate this? @werner ?

neal added a comment.Jul 6 2017, 1:43 PM

The sqlite backend was a little experiement that I did and it will not be merged.

In terms of poor performance of --check-trustdb, the canonical advice, AIUI, is that users should disable automatically running --check-trustdb (by adding no-auto-check-trustdb to their gpg.conf file) and instead use cron to schedule it in the background.

marcus edited projects, added Stalled; removed gnupg (gpg20).Jul 10 2017, 10:23 AM
dkg added a comment.Mar 27 2018, 10:40 PM

The severe delay caused by check-trustdb continues to cause problems elsewhere in the ecosystem. It would be great to try to address this so that GnuPG was more responsive for routine tasks like importing a single key.

@dkg thanks for the link.

Looking at the report, I think the next step would be to try to reproduce the problems with GnuPG 2.2 (as this is the stable version now and 2.0 is end-of-life). The original report and my confirmation were done with 2.0.

If it then is still important to someone to fix 2.0, they'd probably have to look at the changes in the version range (or host operating system) that I had given for noticing the problems.

georg added a subscriber: georg.Fri, Jun 28, 3:38 PM