Build of documentation from tarball not deterministic
Closed, Resolved, Public

Description

To build the documentation reproducibly, the build scripts rely on a timestamp derived from the most recent git commit, which is written to doc/defsincdate. If no .git folder is present in the top-level source directory, or the git command fails, this file is empty.
When building from the tarball, mkdefsinc cannot read a valid timestamp from this file and thus falls back to scanning the documentation source files for the most recent modification timestamp. That is always the timestamp of doc/defsincdate, so the documentation is stamped with the date of the actual build. The patch below should deliver the intended behaviour of mkdefsinc, which presumably is to use a date related to the latest source modification.

The issue might be present in all versions since 2015-06-09 (commit 25331bba5554a39d226d32433add7784b2e170b8), but the versions I worked with were 2.2.20 and 2.2.19.

Candidate patch (needs testing across time and space):

--- doc/mkdefsinc.c	2017-08-28 12:22:54.000000000 +0200
+++ doc/mkdefsinc.c.new	2020-05-15 22:54:13.974749616 +0200
@@ -109,7 +109,7 @@
 
   for (; (file = *files); files++)
     {
-      if (!*file || !strcmp (file, ".") || !strcmp (file, ".."))
+      if (!*file || !strcmp (file, ".") || !strcmp (file, "..") || !strcmp (file, "defsincdate"))
         continue;
       if (stat (file, &sb))
         {

Credit for spotting the non-determinism in the first place goes to https://r13y.com/.
A workaround is currently being discussed for Nixpkgs (see external link), though an upstream fix would be much appreciated.

werner claimed this task. May 17 2020, 5:22 PM
werner added a project: gnupg.
werner added a subscriber: werner.

I think an option to ignore certain files is a better way to do this. I'll give it a try.

Looking at the rules, I do not understand why we have a problem here. The rule

  defsincdate: $(gnupg_TEXINFOS)
	  : >defsincdate ; \
	  if test -e $(top_srcdir)/.git; then \
	    (cd $(srcdir) && git log -1 --format='%ct' \
                 -- $(gnupg_TEXINFOS) 2>/dev/null) >>defsincdate; \
	  fi

should rebuild the distributed defsincdate file only if any of the distributed texi sources are newer. Well, unless they carry the same timestamp. The latter could be fixed by adding a sleep to the rule above.

Well, I had simply accepted that the rule for defsincdate is always triggered. I looked a bit more into it, and the cause for triggering is that Nixpkgs patches dirmngr.texi, hence defsincdate is cleared by the rule above and the fallback behaviour is triggered.
But this also means my suggested patch wouldn't help here as the modification date of dirmngr.texi would be picked up.

I don't see this as a bug any more, but a request for improvement. Instead of simply blanking out the contents of defsincdate, an else clause could be added:

else \
    echo "$$SOURCE_DATE_EPOCH" >>defsincdate; \

Is SOURCE_DATE_EPOCH NixOS-specific?

No, it is widely understood as a means for reproducible builds and specified at: https://reproducible-builds.org/docs/source-date-epoch/
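
For illustration, a consumer of the variable could look roughly like the following C sketch. Per the specification linked above, SOURCE_DATE_EPOCH holds an integer number of seconds since the Unix epoch. The helper names parse_epoch and pick_doc_date, and the order of preference (stored value first, then the environment, then a caller-supplied fallback), are assumptions made for this example and not a description of what mkdefsinc or the Makefile rule actually does.

```c
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: parse VALUE as a non-negative decimal number
   of seconds since the Unix epoch.  Trailing spaces and newlines are
   accepted so that the content of a one-line file can be passed in.
   Returns 0 and stores the result at R_TIME on success, -1 if VALUE
   is NULL, empty or malformed.  */
int
parse_epoch (const char *value, time_t *r_time)
{
  char *endp;
  long long n;

  if (!value || !*value)
    return -1;
  errno = 0;
  n = strtoll (value, &endp, 10);
  if (errno || endp == value || n < 0)
    return -1;  /* Out of range, no digits at all, or negative.  */
  while (*endp == ' ' || *endp == '\n')
    endp++;
  if (*endp)
    return -1;  /* Garbage after the number.  */
  *r_time = (time_t)n;
  return 0;
}

/* Hypothetical helper: pick the date to stamp into the documentation.
   STORED is the content of a timestamp file (may be empty or NULL),
   FALLBACK is used if neither STORED nor SOURCE_DATE_EPOCH is usable.  */
time_t
pick_doc_date (const char *stored, time_t fallback)
{
  time_t t;

  if (!parse_epoch (stored, &t))
    return t;
  if (!parse_epoch (getenv ("SOURCE_DATE_EPOCH"), &t))
    return t;
  return fallback;
}
```

The point of the ordering is that an empty timestamp file no longer leads straight to file modification times when the packager has exported a fixed date.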

Okay, makes sense.

werner closed this task as Resolved. Jun 3 2020, 5:17 PM

Done.