Page MenuHome GnuPG

npth:w32: npth_cond_broadcast only wakes up the first waiter
Closed, ResolvedPublic

Description

npth_cond_broadcast is buggy on Windows.

When there are multiple waiters, it only wakes up the first waiter.

npth_cond_signal is buggy, too, for multiple waiters.

Basically, deque_thread and/or queue handling need to be fixed.

Let us write a test and fix the implementation.

Event Timeline

gniibe triaged this task as High priority.
gniibe created this task.
gniibe changed the task status from Open to Testing.Nov 12 2024, 6:18 AM
ebo moved this task from Backlog to WiP on the vsd33 board.
ebo added a project: Restricted Project.Nov 12 2024, 8:45 AM
ebo moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.
ebo removed projects: Restricted Project, vsd33.Nov 13 2024, 3:36 PM
gniibe renamed this task from npth:w32: npth_cond_broadcast no return to npth:w32: npth_cond_broadcast only wakes up the first waiter.Nov 14 2024, 3:10 AM
gniibe updated the task description. (Show Details)

After I fixed the problem, I realized that the description of this ticket was not accurate, so, modified.

The minimum fix for npth 1.7 might be following:

diff --git a/w32/npth.c b/w32/npth.c
index ab4596f..ec27d10 100644
--- a/w32/npth.c
+++ b/w32/npth.c
@@ -205,6 +205,8 @@ struct npth_impl_s
 static void
 dequeue_thread (npth_impl_t thread)
 {
+  npth_impl_t next = thread->next;
+
   /* Unlink the thread from any condition waiter queue.  */
   if (thread->next)
     {
@@ -213,7 +215,7 @@ dequeue_thread (npth_impl_t thread)
     }
   if (thread->prev_ptr)
     {
-      *(thread->prev_ptr) = thread->next;
+      *(thread->prev_ptr) = next;
       thread->prev_ptr = NULL;
     }
 }

(I confirmed that the patch above succeeded for t-cond.c.)

The fix in 1.8 has better fix (also fixing enque_thread, so that no looping is needed to find the last entry).

The symptom of this bug was:

  • there are multiple waiters for COND.
  • COND is fired by npth_cond_broadcast, all waiters should be waken up, but only one wakes up by the old code of 1.7.
  • other waiters keep waiting forever.

The situation of npth_cond_signal was that:

  • COND is fired by npth_cond_signal, one waiter should be waken up and other keep waiting for next occurrence.
  • one waiter is surely waken up, but the queue is lost by the bug, and those were keep waiting forever.