
sha256sum on Windows 32bit calculates wrong values for files > 256 GiByte
Closed, Resolved (Public)

Description

According to a report from
https://forum.gnupg.org/t/are-there-known-file-size-limits-with-hash-programs-on-windows/6407
sha256sum creates wrong values (compared to 7zip and certutil) when the files are bigger than 256 GiByte.

User AustinFastER writes:

So after a lot of testing I have found that the hashes match if the file is less than 256GB but when you make it 256GB things fall apart. So I am going to GUESS that there is a design limit of 256GB on a 32-bit program running on a 64-bit OS on the Windows 10 platform.

The reproduction instructions were given as:
"""
254GB per Windows Explorer
H:>fsutil file createnew file10 273650000000
File H:\file10 is created

H:>certutil -hashfile file10 SHA256
SHA256 hash of file10:
3804fda9b711bc2f3e9a60e1de221158e5e97f63be4c4d83dbaa7db84176bf80
CertUtil: -hashfile command completed successfully.
H:>"c:\Program Files (x86)\Gpg4win"\bin\sha256sum.exe file10
3804fda9b711bc2f3e9a60e1de221158e5e97f63be4c4d83dbaa7db84176bf80 file10

256 GB per Windows Explorer
H:>fsutil file createnew file9 274900000000
File H:\file9 is created

H:>certutil -hashfile file9 SHA256
SHA256 hash of file9:
24b88b164c430b7a1cdb06ad9c3ec495d018e2a21e727a2ec343d1ea33b5c605
CertUtil: -hashfile command completed successfully.

H:>"c:\Program Files (x86)\Gpg4win"\bin\sha256sum.exe file9
00d9d416b93f0b93da63620300393dbf8c6272d404036de351cfde6aabd0272d file9
"""
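
The two file sizes in this reproduction sit on either side of 2^38 bytes (256 GiB), which is exactly 2^32 SHA-256 blocks of 64 bytes each. One plausible explanation, which this report does not confirm, is a block counter held in a 32-bit integer that wraps at that point. The Python sketch below only checks that arithmetic; `COUNTER_BITS` and `blocks_seen` are hypothetical names chosen for illustration and do not come from the sha256sum source.

```python
BLOCK_SIZE = 64    # SHA-256 consumes its input in 64-byte blocks
COUNTER_BITS = 32  # assumption: the block counter is a 32-bit integer

# Smallest input size at which such a counter would wrap around.
WRAP_BYTES = (1 << COUNTER_BITS) * BLOCK_SIZE


def blocks_seen(size_bytes):
    """Return (true block count, count a wrapping 32-bit counter would hold)."""
    true_count = size_bytes // BLOCK_SIZE
    return true_count, true_count & ((1 << COUNTER_BITS) - 1)


FILE10 = 273_650_000_000  # hashes matched in the report
FILE9 = 274_900_000_000   # hashes differed in the report

for label, size in (("file10", FILE10), ("file9", FILE9)):
    true_count, wrapped = blocks_seen(size)
    print(label, size >= WRAP_BYTES, true_count, wrapped)
```

Under this assumption the counter is still exact for file10 and has already wrapped for file9, which matches the observed behavior but does not prove the cause.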

Event Timeline

werner added a project: Bug Report.
werner added a subscriber: werner.

The included tools are intended to bootstrap things and are not optimized in any way. We don't run large data tests either. Someone will look into it, though. A better way is to use

gpg --print-md sha256 <myfile

which has no restrictions or bugs. It does not support the sha1sum output formatting, though. From the man page:

--print-md algo
--print-mds
       Print message digest of algorithm algo for all given files or STDIN.  With the second form (or a deprecated  "*"  for
       algo) digests for all available algorithms are printed.
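
For an independent cross-check of a digest, a streaming hash can also be computed with Python's standard `hashlib` module. This is a minimal sketch and not part of GnuPG or Gpg4win; the function name `sha256_file` and the 1 MiB chunk size are arbitrary choices made here.

```python
import hashlib


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so memory use stays constant."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # read() returns b"" at end of file, which stops the iteration.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_file("file9")` returns the digest as a lowercase hex string that can be compared with the certutil output.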

werner claimed this task.

Fixed.

I did not run the full tests because those would take some hours, but one test case using the genhashdata tool from the libgcrypt test suite gives the correct value (see the genhashdata.c source):

$ ~/b/libgcrypt/tests/genhashdata --gigs 256 | ./sha256sum  -
2d0723878cb2c3d5c59dfad910cdb857f4430a6ba2a7d687938d7a20e63dde47  -

The fix will also work for md5sum and sha1sum.
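
MD5, SHA-1, and SHA-256 all process their input in 64-byte blocks, so tools that count blocks in the same way would reach the same 256 GiB threshold. The sketch below reads the block sizes from Python's `hashlib` as a reference and assumes a 32-bit block counter; it says nothing about how the Gpg4win tools are actually implemented, and `wrap_threshold_bytes` is a hypothetical helper.

```python
import hashlib


def wrap_threshold_bytes(name, counter_bits=32):
    """Input size at which a counter of `counter_bits` bits, counting
    blocks of the named algorithm, would wrap (assumed design)."""
    block_size = hashlib.new(name).block_size
    return (1 << counter_bits) * block_size


for name in ("md5", "sha1", "sha256"):
    print(name, wrap_threshold_bytes(name) // 2**30, "GiB")
```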