For T7224, I look around possible (root) cause of the problem.
Multiple gpg.exe and gpgsm.exe remained would be by an issue of keyboxd race condition.
I think that kbx/backend-sqlite.c needs some clean up.
For T7224, I look around possible (root) cause of the problem.
Multiple gpg.exe and gpgsm.exe remained would be by an issue of keyboxd race condition.
I think that kbx/backend-sqlite.c needs some clean up.
| rG GnuPG | |||
| rGa698adbb533f kbx: Fix a race condition on DATABASE_HD. | |||
| rGb804378f183f kbx: Fix a race condition on DATABASE_HD. | |||
| Status | Assigned | Task | ||
|---|---|---|---|---|
| Resolved | • aheinecke | T7224 Kleopatra: broken in Testversion beta-41 | ||
| Resolved | • gniibe | T7294 keyboxd: Possible race conditions (and clean up) | 
I found one problem. This problem may result lock-up on Windows, I suppose.
The variable database_hd is a shared object among multiple threads, which needs to be protected.
Here is the patch to focus the issue:
diff --git a/kbx/backend-sqlite.c b/kbx/backend-sqlite.c index 2398aa77f..4c67c3ef7 100644 --- a/kbx/backend-sqlite.c +++ b/kbx/backend-sqlite.c @@ -568,11 +568,14 @@ create_or_open_database (ctrl_t ctrl, const char *filename) int dbversion; int setdbversion = 0; - if (database_hd) - return 0; /* Already initialized. */ - acquire_mutex (); + if (database_hd) + { + release_mutex (); + return 0; /* Already initialized. */ + } + /* To avoid races with other temporary instances of keyboxd trying * to create or update the database, we run the database with a lock * file held. */
The possible race condition is:
(1) one thread get a request from gpg for SEARCH, and goes into create_or_open_database.
(2) it goes to npth_mutex_lock and context switch there.
(3) another thread get a request from gpg for SEARCH, and it goes into create_or_open_database.
(4) two threads are now race on database_hd.
(5) one thead takes the dotlock.
(6) another thread cannot take the dotlock while timeout.
(7) after the timeout, it wrongly release the lock and call dotlock_destroy.
(8) something wrong may occur here.
With a single keyboxd process, dotlock is a kind of overkill here, so clean up is needed.
I applied rGb804378f183f: kbx: Fix a race condition on DATABASE_HD. in master. Let us see how behavior changes.
Sounds very reasonable. Maybe the initial idea was to open the database directly after keyboxd start and before and connections are accepted. My usual try to optimize a mutex away - I should not do this.
Given that create_or_open_database is called from just one place and that will then take the mutex again; it might be easier to move the mutex stuff out of the function and declare that the function must be called with the mutex held.
Things look good on Windows. A quick test using gnupg24 with backported patches did not show any hangs. More testing will follow next week.