wipememory relies upon volatile qualifier
Closed, ResolvedPublic

Description

The wipememory macro relies upon the volatile qualifier, which appears to be an
abuse of the use of the qualifier.

According to Ian Lance Taylor, GCC interprets the qualifier to mean memory that
can be changed by hardware (memory mapped hardware). GCC's interpretation is a
different interpretation than Microsoft, et al. See
https://gcc.gnu.org/ml/gcc-help/2012-03/msg00242.html for Taylor's comments.

Unfortunately, memset_s is not available on GNU systems, so its not an
alternative. Its unfortunate because its guaranteed to *not* be removed by the C
standard. Also see "[Bug libc/17879] Library is missing memset_s",
https://sourceware.org/bugzilla/show_bug.cgi?id=17879.


$ grep -R wipememory * | grep define
libgcrypt-1.6.3/src/g10lib.h:#define wipememory2(_ptr,_set,_len) do { \
libgcrypt-1.6.3/src/g10lib.h:#define wipememory(_ptr,_len) wipememory2(_ptr,0,_len)
...

#define wipememory2(_ptr,_set,_len) do { \

volatile char *_vptr=(volatile char *)(_ptr); \
size_t _vlen=(_len); \
unsigned char _vset=(_set); \
fast_wipememory2(_vptr,_vset,_vlen); \
while(_vlen) { *_vptr=(_vset); _vptr++; _vlen--; } \
    } while(0)

#define wipememory(_ptr,_len) wipememory2(_ptr,0,_len)


On Win32 systems under MinGW/Cygwin, it might be advisable to call
SecureZeroMemory if available. Its also guaranteed to *not* be removed, and its
available on Windows 2000 and higher. Also see "SecureZeroMemory function",
https://msdn.microsoft.com/en-us/library/windows/desktop/aa366877%28v=vs.85%29.aspx.

The best alternative I've been able to find for GCC and Intel x86's is the
following. Ironically, __asm__ __volatile__ is guaranteed to not be removed, too.

    string s("Hello world");
    cout << "S: " << s << endl;
    
    char* ptr = &s[0];
    size_t size = s.length();
    
    if(ptr && size)
    {
        /* Needed because we can't just say to GCC, */
        /*   "give me a register that you choose".  */
        void* dummy;

        __asm__ __volatile__
        (
         "%=:\n\t"                /* generate a unique label for TOP */

#if (__WORDSIZE == 64)

"subq $1, %2\n\t"        /* 0-based index */

#elif (__WORDSIZE == 32)

"subl $1, %2\n\t"        /* 0-based index */

#elif (__WORDSIZE == 16)

"subw $1, %2\n\t"        /* 0-based index */

#else

error Unknown machine word size

#endif

    
     "lea (%1, %2), %0\n\t"   /* calcualte ptr[idx] */
     "movb $0, (%0)\n\t"      /* 0 -> ptr[size - 1] .. ptr[0] */
     "jnz %=b\n\t"            /* Back to TOP if non-zero */
     
     : "=&r" (dummy)
     :  "r" (ptr), "r" (size)
     : "0", "1", "2", "cc"
     );
}

Details

Due Date
Aug 31 2015, 2:00 AM
JW added a subscriber: JW.
werner added a subscriber: werner.Mar 10 2015, 9:08 AM

So what is the bug here? See also the long discussions every few years on
cryptography regading disabling of optimization.

neal added a subscriber: neal.Jun 12 2015, 4:07 PM

FWIW, I believe I saw this bug while debugging pinentry-curses. I committed
1d3583a2562e83496ac515276e9bd63a7f1abbc7 to pinentry to work around this.

Neal: Repalcing malloc by calloc is ano replacement for wipememory.

Ian's comments are not about the way we use wipememory but about cache
syncronization - this is not the issue here. FWIW, the STACK tool does not
complain about how we use wipememory.

JW, can you please provide the compile command and the disassembly (or object
file) of a place where wipememory was optmized away?

gniibe added a subscriber: gniibe.Jun 16 2015, 10:31 AM

I think that JW had some confusion. I believe that his argument is irrelevant
for libgcrypt's implementation of wipememory.

(1) Ian is right about multi-processors and memory access. We should use atomic
operations, memory barrier, and mutual exclusion things like lock/unlock.
(2) Compiler can optimize away memory access if computation should be same (as
defined by the language).

libgcrypt's use of volatile qualifier for wipememory is *not* for shared access
by multi-processors.
It is to make sure memory is wiped by its write access.

Certainly, it's not "memory mapped hardware", and it would be abuse. But, this
interpretation does not conflict with current wipememory implementation. Write
access to memory mapped hardware never removed. For correctness of code, I
believe current implementation is OK.

If this were the suggestion for fast wipememory, I would agree for non-use of
volatile qualifier. I think that current implementation is OK, considering use
cases.

Hello,

Thank you for clarification.

On 06/17/2015 02:24 PM, Jeffrey Walton wrote:

Actually, here's what I said:

The wipememory macro relies upon the volatile qualifier,
which appears to be an abuse of the use of the qualifier.

Its still an abuse because GCC does not interpret volatile the way the
library is using it.

The way the library is using volatile qualifier is to make sure
writing to memory. In my understanding, it is a valid use of volatile
qualifier, if we are speaking about the correctness of the code.

You refereed Ian's post as a reference. For that point, GCC's
interpretation has no problem with the use of volatile qualifier by
libgcrypt.

If we discuss about performance of wiping memory, your suggestion
would make sense, but I believe it's better to have "memory" clobber
in the asm statement.

The library appears to be using volatile in an attempt to suppress
removal of what the optimizer would deem as dead code.

Not exactly. The intention of the code is that we want to write to
the specific address of memory, so that data cannot be accessed later.

You shouldn't use volatile for that when compiling with GCC.

Any references which support this opinion of yours, please?

Especially, are there any examples that GCC (or any compilers) removes
the code of wipememory?

Or do you know any examples that GCC removes the code of writing

memory which address is qualified as volatile?

JW added a comment.Jun 17 2015, 9:15 AM

On Tue, Jun 16, 2015 at 4:31 AM, NIIBE Yutaka via BTS
<gnupg@bugs.g10code.com> wrote:

NIIBE Yutaka <gniibe@fsij.org> added the comment:

I think that JW had some confusion. I believe that his argument is irrelevant
for libgcrypt's implementation of wipememory.

Actually, here's what I said:

The wipememory macro relies upon the volatile qualifier,
which appears to be an abuse of the use of the qualifier.

Its still an abuse because GCC does not interpret volatile the way the
library is using it.

The library appears to be using volatile in an attempt to suppress
removal of what the optimizer would deem as dead code. You shouldn't
use volatile for that when compiling with GCC.

You can use volatile to ensure something is not optimized away when
compiling with MSVC because Microsoft interprets volatile differently,
and it includes the use case of another thread or the Operating System
changing the value. Confer,
http://msdn.microsoft.com/en-us/library/12a04hfd%28v=vs.100%29.aspx.

I'm not sure if/how/where/when memory barriers factor into it. I
usually write the code to ensure its not removed. I don't usually
worry about flushing cache lines and such. I can't remember the last
time I paid it any mind.

Jeff

JW added a comment.Jun 17 2015, 11:13 AM

You shouldn't use volatile for that when compiling with GCC.

Any references which support this opinion of yours, please?

It not my opinion. Its what the GCC folks told me.

Ian is one of the GCC devs. This is his blog discussing it:
http://www.airs.com/blog/archives/154.

You can find other discussions about it through out the web. Its easy
to find once you know what to look for:
http://www.google.com/search?q=ian+lance+taylor+volatile+memory+mapped+hardware.

Especially, are there any examples that GCC (or any compilers) removes
the code of wipememory?

That's synonymous with asking everyone to very the code generation.
Aside from Walfield, I'm not aware of anyone who looks at code
generation. I don't even look at it when auditing software.

When I encounter it in the field, I flag it as a blocking security bug
and ask the devs to fix it so I don't have to look at the generated
code.

Jeff

Hello,

Thank you for your auditing libgcrypt. Your effort helps our
development.

When I encounter it in the field, I flag it as a blocking security
bug and ask the devs to fix

We really appreciate your reports (not only this one, but also other
issue in the tracking system).

My point and the discussion is: This particular case would not be
"a blocking security bug", but a false alarm.

I think that this alarm and your suggestion is useful.

On 06/17/2015 06:13 PM, Jeffrey Walton wrote:

Ian is one of the GCC devs. This is his blog discussing it:
http://www.airs.com/blog/archives/154.

Yes, I know Ian and GCC. Although it was more than ten years ago, I
did some work for GCC in the field of embedded processor support.

I understand the issue of volatile qualifier with regards to shared
memory access by multiple threads.

Please note that the use of volatile qualifier in wipememory is not
the case of shared memory access by multiple threads.

Thus, the relevant part in the article is:

One relatively unimportant misunderstanding is due to the fact that
the standard only talks about accesses to volatile objects. It does
not talk about accesses via volatile qualified pointers. Some
programmers believe that using a pointer-to-volatile should be
handled as though it pointed to a volatile object. That is not
guaranteed by the standard and is therefore not portable. However,
this is relatively unimportant because gcc does in fact treat a
pointer-to-volatile as though it pointed to a volatile object.

It says that it's not guaranteed and it's not portable by the C
language itself.

So, you are right that volatile qualifier to a pointer should be
avoided (from viewpoint of portability).

I think that I am also right that it works with GCC implementation

(in 2008, at least).

JW added a comment.Jun 17 2015, 4:02 PM

One relatively unimportant misunderstanding is due to the fact that
the standard only talks about accesses to volatile objects. It does
not talk about accesses via volatile qualified pointers. Some
programmers believe that using a pointer-to-volatile should be
handled as though it pointed to a volatile object. That is not
guaranteed by the standard and is therefore not portable. However,
this is relatively unimportant because gcc does in fact treat a
pointer-to-volatile as though it pointed to a volatile object.

It says that it's not guaranteed and it's not portable by the C
language itself.

So, you are right that volatile qualifier to a pointer should be
avoided (from viewpoint of portability).

I think that I am also right that it works with GCC implementation
(in 2008, at least).

I'm actually more concerned that the optimizer will remove the code
because it surmises its a dead store. That's the issue I am trying to
articulate.

I assumed the use of volatile was an attempt to tame the optimizer. My
apologies for inferring that.

I'm fairly certain the issue of the optimizer removing the dead stores
is present regardless of cache consistency and memory barriers.

Jeff

On 06/17/2015 11:01 PM, Jeffrey Walton wrote:

I'm actually more concerned that the optimizer will remove the code
because it surmises its a dead store. That's the issue I am trying to
articulate.

You are right. We are happy that contributors like you audit the
code.

If memset_s is available now, we should use that. It takes time.
Please see the discussion here, for a reference:

https://cygwin.com/ml/libc-alpha/2014-12/msg00506.html

Please note that I never deny your suggestion (in the scope of future
implementation of libgcrypt, and future development environment).

I was making sure the point if the problem is there now or not, to
triage your report.

In practice, for now, the use of volatile for the intention would be
OK.

Please see the following document.

https://www.securecoding.cert.org/confluence/display/c/MSC06-C.+Beware+of+compiler+optimizations

In the section "Compliant Solution (C99)" it says:

If memset_s() function is not yet available on your implementation,
this compliant solution is the best alternative, and can be
discarded once supported by your implementation.

I agree that if libgcrypt is compiled by Windows development
environment (other than cross compiled by GCC, currently recommended),
it is better to use SecureZeroMemory.

Your suggestion never will be thrown away. Please pardon if I gave

you impression like that.

Note that our wipememory does not use memset and thus a compiler can't
apply the C specification of memset.

volatile is used to make sure the writes actually hit the
memory. gcc is not allowed to remove that for the simple reason, it
can't know whether this plain RAM or a device mapped into the address
space. That is the whole point of using volatile and it has been
introduced back in the 80ies for just this reason (back than to write
to video memory).

The use of SecureZeroMemory under Windows is not useful because it
does exactly the same what we do with wipememory.

JW added a comment.Jul 6 2015, 9:36 AM

volatile is used to make sure the writes actually hit the
memory. gcc is not allowed to remove that for the simple reason, it
can't know whether this plain RAM or a device mapped into the address
space. That is the whole point of using volatile and it has been
introduced back in the 80ies for just this reason (back than to write
to video memory).

GCC 3x used to regularly remove that kind of code. GCC 4.x on Cygwin
optimizes away some of the assignments. (I don't recall which version
of GCC and which version of Cygwin.).

The use of SecureZeroMemory under Windows is not useful because it
does exactly the same what we do with wipememory.

Not true.

That's as simple as a compliance item. You either follow the
platform's best practices, or you don't. In this case, the best
practice is to use SecureZeroMemory. Cf., Howard and LeBlanc's Writing
Secure Code, 2ed, p. 325 (http://www.amazon.com/dp/0735617228). Howard
and LeBlanc work for Microsoft, and they literally wrote the book for
the Windows platform.

Jeff

werner set Due Date to Aug 31 2015, 2:00 AM.Jul 21 2015, 3:44 PM

Please show me the disassembly of an example along with the commands and program
versions you used to create an object file with removed wipememory code.

Following a propriteary platforms best practices without checking is not always
the Right Thing.

JW added a comment.Jul 21 2015, 3:50 PM

Werner Koch <wk@gnupg.org> added the comment:

Please show me the disassembly of an example along with the commands and program
versions you used to create an object file with removed wipememory code.

That is downright arrogant.

Following a propriteary platforms best practices without checking is not always
the Right Thing.

That is borderline negligent.

GnuPG is high integrity software. The project needs to be doing it
cleaner and tighter than most junk in a distro's repo.

Why do you think language rules and compiler behaviors don't apply to
the project?

Jeff

I asked you for an example to replicate the problem. You claimed:

  GCC 3x used to regularly remove that kind of code.

Why do you think it is arrogant to ask for an example in a case where we have
different opinions on how to interpret a standard? I explained why your
arguments can't be applied to our code. Your counter argument may be valid but
I do not see it as my job to tests dozens of compiler versions to find a
compiler bug.

We discussed and improved the wipememory code over many years and I see no new
datapoints which require another change. Iff there is a concrete compiler bug
or a different specs interpretation which leads to the remove of wipememory code
I would like to see an example to find a way to work around it.

JW also posted a response as T2076 (but without the example ;-)

werner closed this task as Resolved.Sep 9 2015, 4:06 PM
werner claimed this task.