Because https://bugs.debian.org/972525 has been reopened, I reviewed the code path in question again.
I found that the dotlock implementation is not perfect: there is a race condition in the handling of stale lock files.
- Process A gets the lock.
- Process B waits for the lock and examines the lockfile with read_lockfile; the pid it reads is that of Process A.
- Process A releases the lock, finishes its work, and terminates.
- Process C gets the lock and starts some work while holding it.
- Process B wrongly considers the lockfile stale (it checks kill(pid,0) with the pid of A, which no longer exists) and removes the lockfile by mistake.
- Process D can now get the lock, although Process C still holds it.
I confirmed that the failure is reproducible by adding a usleep call at the end of read_lockfile.
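The interleaving above can be replayed deterministically in one small program. This is only a sketch under stated assumptions, not the actual dotlock code: the helpers sketch_read_pid, sketch_write_pid and sketch_stale_race are hypothetical names, the lockfile is assumed to hold just the owner's pid as decimal text, and a single process plays both B and C while a forked child plays A.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for read_lockfile: return the pid stored in
   PATH, or -1 if the file cannot be read or parsed. */
static pid_t sketch_read_pid(const char *path)
{
    FILE *fp = fopen(path, "r");
    long pid = -1;

    if (!fp)
        return -1;
    if (fscanf(fp, "%ld", &pid) != 1)
        pid = -1;
    fclose(fp);
    return (pid_t)pid;
}

/* Hypothetical helper: (re)create PATH containing PID. Returns 0 on success. */
static int sketch_write_pid(const char *path, pid_t pid)
{
    FILE *fp = fopen(path, "w");

    if (!fp)
        return -1;
    fprintf(fp, "%ld\n", (long)pid);
    return fclose(fp);
}

/* Replay the interleaving. Returns 1 if B removed a lockfile that
   belonged to a live process, 0 if the lockfile survived, -1 on error. */
static int sketch_stale_race(const char *path)
{
    pid_t a, seen;

    a = fork();
    if (a < 0)
        return -1;
    if (a == 0)
        _exit(0);                       /* Process A: does no real work. */

    /* A gets the lock (written by the parent on A's behalf). */
    if (sketch_write_pid(path, a) != 0)
        return -1;

    /* B examines the lockfile and remembers A's pid. */
    seen = sketch_read_pid(path);
    if (seen <= 0)
        return -1;

    /* A releases the lock and terminates; reap it so the pid is gone. */
    unlink(path);
    waitpid(a, NULL, 0);

    /* C (this process, which is alive) gets the lock. */
    if (sketch_write_pid(path, getpid()) != 0)
        return -1;

    /* B resumes with the pid it remembered, not the one now in the file. */
    if (kill(seen, 0) == -1 && errno == ESRCH)
        unlink(path);                   /* Removes C's lockfile by mistake. */

    return access(path, F_OK) != 0;     /* 1: C's lock has vanished. */
}
```

Calling sketch_stale_race with a path in a writable directory is expected to return 1, assuming the kernel does not reuse A's pid in the short interval: the stale check is made against a pid read earlier, while the file that gets unlinked is whatever lockfile exists at that later moment.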