Page MenuHome GnuPG

Kleopatra: Loop in DeviceInfoWatcher with GnuPG 2.3 on Windows
Closed, ResolvedPublic

Description

I did notice this issue on linux when I ran with GnuPG 2.3 against a gpg-agent from 2.2 but on Windows for me this is also reproducible when I am sure that the right agent version from 2.3 is used.

I'll try to understand why SCD behaves differently on Windows but @ikloecker could you maybe double check the logic in the device info watcher to ensure that this does not go in a 100% CPU consuming loop?

The log reads:

[3900] org.kde.pim.kleopatra: Startup timing: 1800 ms elapsed: SelfCheck completed
[3900] org.kde.pim.kleopatra: openOrRaiseMainWindow
[3900] org.kde.pim.kleopatra: ReaderStatusThread[2nd]: new iteration command= "__update__"  ; nullSlot= true
[3900] org.kde.pim.kleopatra: update_cardinfo()
[3900] org.kde.pim.kleopatra: gpgagent_transact( SCD SERIALNO --all )
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::start: Assuan transaction for SCD DEVINFO --watch started
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::status: DEVINFO_START 
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::status: DEVINFO_END 
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::poll: context finished with Erfolg (code: 0, source: Quelle nicht angegeben)
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::start: Assuan transaction for SCD DEVINFO --watch started
[3900] org.kde.pim.kleopatra: gpgagent_transact( SCD SERIALNO --all ): Error: Service ist nicht aktiv (code: 318, source: SCD)
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::status: DEVINFO_START 
[3900] org.kde.pim.kleopatra: agent_getattr_status( SCD SERIALNO --all ): t == NULL
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::status: DEVINFO_END 
[3900] org.kde.pim.kleopatra: Running SCD SERIALNO --all failed: Service ist nicht aktiv (code: 318, source: SCD)
[3900] org.kde.pim.kleopatra: ReaderStatusThread: Card "" with app "" was added
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::poll: context finished with Erfolg (code: 0, source: Quelle nicht angegeben)
[3900] org.kde.pim.kleopatra: ReaderStatusThread[2nd]: waiting for commands
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::start: Assuan transaction for SCD DEVINFO --watch started
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::status: DEVINFO_START 
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::status: DEVINFO_END 
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::poll: context finished with Erfolg (code: 0, source: Quelle nicht angegeben)
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::start: Assuan transaction for SCD DEVINFO --watch started
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::status: DEVINFO_START 
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::status: DEVINFO_END 
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::poll: context finished with Erfolg (code: 0, source: Quelle nicht angegeben)
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::start: Assuan transaction for SCD DEVINFO --watch started
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::status: DEVINFO_START 
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::status: DEVINFO_END 
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::poll: context finished with Erfolg (code: 0, source: Quelle nicht angegeben)

The last 4 lines continue to loop then.

Details

Version
master

Event Timeline

aheinecke created this task.

The difference seems to be that "scd devinfo --watch" returns immediately with "Service not active"

There are two things happening at the same time:

  1. ReaderStatusThread looks for smart cards by running SCD SERIALNO --all. This fails with "Service not active".
  2. DeviceInfoWatcher starts SCD DEVINFO --watch and then polls for results. Polling results in "finished with success". If this happens then DeviceInfoWatcher starts another SCD DEVINFO --watch. This is done because SCD DEVINFO --watch finishes with success when the last smartcard is removed. I have no idea why it exists immediately with success. Could this be related the "service not active" message? If yes, then I guess SCD DEVINFO --watch should probably also fail with "service not active" instead of starting without error and then finishing with success.

Yes, I think the service not active is the cause of the issue. But I don't really understand where this error is lost, I think this should be investigated because I would also expect it not to have a success on this line:
[3900] org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::poll: context finished with Erfolg (code: 0, source: Quelle nicht angegeben)

As mentioned you can reproduce that by using a 2.2 gpg-agent with 2.3 or probably also by building gnupg 2.3 with PC/SCD (although I'm not sure what the configure flag is for this)

aheinecke raised the priority of this task from High to Unbreak Now!.Apr 30 2021, 10:33 AM

Hi Ingo,

this keeps happening for me. Please give this highest priorty. As we start more and more tests with 2.3 this is a definite blocker.

Loops look like:

10:30:56.298 org.kde.pim.kleopatra: Kleo::DeviceInfoWatcher::Worker::start DeviceInfoWatcher::Worker::start: Assuan transaction for SCD DEVINFO --watch started
10:30:56.298 org.kde.pim.kleopatra: Kleo::DeviceInfoWatcher::Worker::poll DeviceInfoWatcher::Worker::poll: context finished with Success (code: 0, source: Unspecified source)
10:30:56.298 org.kde.pim.kleopatra: Kleo::DeviceInfoWatcher::Worker::start DeviceInfoWatcher::Worker::start: Assuan transaction for SCD DEVINFO --watch started
10:30:56.298 org.kde.pim.kleopatra: Kleo::DeviceInfoWatcher::Worker::poll DeviceInfoWatcher::Worker::poll: context finished with Success (code: 0, source: Unspecified source)
10:30:56.298 org.kde.pim.kleopatra: Kleo::DeviceInfoWatcher::Worker::start DeviceInfoWatcher::Worker::start: Assuan transaction for SCD DEVINFO --watch started

CPU goes to 100% and nothing works anymore.

ikloecker moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.

I have disabled this on Windows. Once "SCD DEVINFO --watch" works reliably on Windows, we can reenable the DeviceInfoWatcher on Windows.

I haven't looked into the problem with using a 2.2 gpg-agent with 2.3. I'd create a separate task for this.

ikloecker reassigned this task from ikloecker to aheinecke.
ikloecker moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.

Can you please clarify this point: If you run on Unix with --disable-ccid-driver, do you get the same behavior as on Windows?

Can you please clarify this point: If you run on Unix with --disable-ccid-driver, do you get the same behavior as on Windows?

I'm not sure. I have added disable-ccid to scdaemon.conf, did gpgconf --kill all, and then started Kleopatra. It didn't start an endless loop:

org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::start: Assuan transaction for SCD DEVINFO --watch started
org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::status: DEVINFO_START 
org.kde.pim.kleopatra: DeviceInfoWatcher::Worker::status: DEVINFO_END

And that's all. Inserting a GnuK and pressing reload in Kleopatra didn't make Kleopatra show the GnuK. In scdaemon logged

2021-05-03 09:24:57 scdaemon[7302] DBG: chan_7 <- DEVINFO --watch
2021-05-03 09:24:57 scdaemon[7302] DBG: chan_7 -> S DEVINFO_START
2021-05-03 09:24:57 scdaemon[7302] DBG: chan_7 -> S DEVINFO_END
2021-05-03 09:24:57 scdaemon[7302] pcsc_list_readers failed: unknown PC/SC error code (0x8010002e)

Maybe I'm missing something on my system.

The error code is: No Readers Available. With the latest version you should have seen that string.

werner lowered the priority of this task from Unbreak Now! to Normal.Jun 8 2021, 11:36 AM

Sorry, I have been confused and it took time to understand issues.
Indeed, there are (at least) four issues.

(1) DeviceInfoWatcher expects scdaemon DEVINFO loops forever, watching devices.
(2) DeviceInfoWatcher expects scdaemon DEVINFO detects new device when reloaded.
(3) There is a feature request detecting newly inserted device by scdaemon.
(4) Race between detecting removal and reporting device information.

I didn't understand (1) very well, and the old behavior was DEVINFO returns when no device available.
That must be the cause of trouble.

I pushed the scdaemon change rG48e824b6ea69: scd: Modify DEVINFO behavior to support looping forever. to solve those issues above. Now, DEVINFO keeps looping forever.
Underlying function app_send_devinfo still supports old behavior (loop until no device detected) when it's called with KEEP_LOOPING=-1.

The issue (3) is only solved partially. User needs to "reload" kleo so that scdaemon can scan the devices. It's not yet automatic.

gniibe moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Oct 29 2021, 4:18 AM
gniibe added projects: scd, Restricted Project.

Fixed quite some time ago.