memset_explicit
C Document number: N2897
Date: 2021-12-27
Author: Miguel Ojeda <ojeda@ojeda.dev>
Project: ISO/IEC JTC1/SC22/WG14: Programming Language C
Sensitive data, like passwords or keying data, should be cleared from memory as soon as they are not needed. This requires ensuring the compiler will not optimize the memory overwrite away. This proposal adds a memset_explicit function that allows users to express that a memory area needs to be actually cleared.
… 0 as second argument to the C++ function template to match the current C interface.
secure_clear is now imported from C; proposal changed accordingly (e.g. removed noexcept).
… is_trivially_copyable_v instead of using the trivially_copyable pseudo-concept (since it does not exist in C++20 on its own); and changed pointer to is_pointer_v.
… memset_s wording suggestion into the C side.
… TriviallyCopyable pseudo-concept to trivially_copyable following the change in the rest of the standard for all other concepts.
… secure_val class template as polled and changed paper title to secure_clear to reflect that.
… secure_clear, similar to C11’s memset_s wording.
… access member function into read_access, write_access and modify_access.
Remove the C++ template interface for memset_explicit from P1315?
SF WF N WA SA
4 6 6 2 0 => Consensus.
Does the committee favor intent, as in Alternative 1, or implementation-defined, as in Alternative 2 or calls-memset-and-implementation-defined (Alternative 3) as Ojeda suggested, in N2682?
A1 A2 A3
15 7 14 => Unclear, but people do not want Alternative 2 by itself.
Does the committee favor intent, as in Alternative 1, or calls-memset-and-implementation-defined (Alternative 3) as Ojeda suggested, in N2682?
A1 A3
10 9 => Still not clear.
Does the committee favor something along the lines of “The implementation might clear other copies of the data (e.g.: intermediate values, stack frame, cache lines, spilled registers, swapped out pages, etc.) or it might avoid their creation (e.g.: reducing copies, locking/pinning pages, etc.).” in N2682?
Y N A
8 7 6 => No clear consensus.
Does the committee wish to replace the c parameter with a specific value from memset_explicit in N2631?
Y N A
4 8 11 => Does not pass.
Does the committee wish to replace the c parameter with “implementation-defined values” from memset_explicit in N2631?
Y N A
4 9 8 => Does not pass.
Does the committee prefer Alternatives 1, 2, or 3 as is in N2631?
A1 A2 A3
13 4 0 => Alternative 1 wins.
Does the committee prefer removing the Recommended Practice section from Alternative 1 in N2631?
Y N A
15 0 7 => Passes.
Does the committee prefer adding memset_explicit to the exception list in <string.h> freestanding implementations?
Y N A
3 8 10 => Does not pass.
Should P1315 memset_explicit use Alternative 1 plus a statement that the semantics be implementation-defined as a statement of intent rather than an effect on the abstract machine?
SF F N A SA
WG21 5 5 0 0 1 => Consensus.
WG14 2 2 0 0 0 => Consensus.
Does the committee wish to adopt something along the lines of alternative 3 of N2599 into C23?
Y N A
16 1 6 => Clear direction.
Would the Committee like to see a non-elidable, non-optional memory-erasing function added to C2x?
Y N A
14 0 2 => Clear direction.
Would the Committee like the non-elidable, non-optional memory-erasing function not to specify a value in its interface?
Y N A
6 5 6 => Unclear direction.
Would the Committee like to be able to specify a value in the interface to the non-elidable, non-optional memory-erasing function?
Y N A
7 4 6 => Clearer direction.
Would the Committee like to have both no-value and value-specifying interfaces to the non-elidable, non-optional function available?
Y N A
5 6 7 => Unclear direction.
We don’t see a specific SG1 concern for what we presume is a single-threaded API. If this paper intends to make secure_clear resilient to UB in C++ then data-race UB is but one of the kinds.
No objection to unanimous consent.
Spend committee time on this versus other proposals, given that time is limited?
SF F N A SA
2 9 2 1 0 => Consensus.
Send the paper to SG1 for input on abstract machine integration and wording (similar to volatile_load/volatile_store). Send it back to us after.
SF F N A SA
4 5 4 0 0 => Consensus.
Remove all cache related things from the proposal.
SF F N A SA
3 1 3 0 0 => Consensus.
Remove encrypting at rest from the proposal.
SF F N A SA
4 1 1 1 0 => Consensus.
Want secure_clear to write indeterminate values (as opposed to memset_s).
SF F N A SA
4 1 2 0 0 => Consensus.
Want to work with WG14 on secure_clear (e.g. salvage memset_s from Annex K).
SF F N A SA
2 3 2 0 0 => Consensus.
We want something along the lines of secure_val (with compiler support).
SF F N A SA
0 0 2 2 3 => Consensus to reject.
When manipulating sensitive data, like passwords in memory or keying data, there is a need for library and application developers to clear the memory after the data is not needed anymore [1][2][3][4], in order to minimize the time window where it is possible to capture it (e.g., ending in a core dump or probed by a malicious actor). Recent vulnerabilities (e.g., Meltdown, Spectre-class, Rowhammer-class…) have made this requirement even more prominent.
To ensure that the memory is cleared, the developer needs to inform the compiler that it must not optimize away the memory write, even if it can prove it has no observable behavior. For C++, extra care is needed to consider all exceptional return paths.
For instance, the following C function may be vulnerable, since the compiler may optimize the memset call away because the password buffer is not read from before it goes out of scope:
void f(void)
{
    char password[SIZE];

    // Acquire some sensitive data
    get_password_from_user(password, SIZE);

    // Use it for some operations
    use_password(password, SIZE);

    // Attempt to clear the sensitive data
    memset(password, 0, SIZE);
}
On top of that, in C++, use_password could throw an exception so memset is never called (i.e., assume the stack is not overwritten and/or that the memory is held in the free store).
There are many ways that developers may use to try to ensure the memory is cleared as expected (i.e., avoiding the optimizer):
Calling memset_s from Annex K, if available.
Writing through a volatile pointer (e.g., decaf_bzero [5] from OpenSSL).
Calling memset through a volatile function pointer (e.g., OPENSSL_cleanse C implementation [6]).
… extern variable into it (e.g., CRYPTO_malloc implementation [7] from OpenSSL).
Writing the clearing code in assembly (e.g., OPENSSL_cleanse SPARC implementation [8]).
Adding a compiler barrier after the write (e.g., memzero_explicit implementation [9] from the Linux Kernel).
Disabling the relevant compiler built-in or optimization (e.g., -fno-builtin-memset [10] in GCC).
Or they may use a pre-existing solution whenever available:
An operating-system-provided function (e.g., explicit_bzero [11] from OpenBSD & FreeBSD, SecureZeroMemory [12] from Windows).
A library- or project-provided function (e.g., memzero_explicit [13][9] from the Linux Kernel, OPENSSL_cleanse [14][6]).
Regardless of how it is done, none of these ways is, at the same time, portable, easy to recognize the intent of (and/or grep for), readily available, and free of compiler implementation details. The topic may generate discussions in major projects on which is the best way to proceed and whether a specific approach ensures that the memory is actually cleansed (e.g., [15][16][17][18][19]). Sometimes, such a way is not effective for a particular compiler (e.g., [20]). In the worst case, bugs happen (e.g., [21][22]).
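Two of the hand-rolled approaches listed above can be sketched as follows. This is only a minimal illustration, not a recommendation: the names clear_via_volatile_pointer, clear_via_volatile_call and memset_ptr are hypothetical, and whether a given compiler keeps the stores still depends on implementation details, which is the problem this section describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zero n bytes by writing through a
 * volatile-qualified pointer. Compilers commonly treat each such store
 * as a volatile access, even though the underlying object is not
 * itself declared volatile. */
static void clear_via_volatile_pointer(void *s, size_t n)
{
    volatile unsigned char *p = (volatile unsigned char *)s;

    assert(s != NULL || n == 0);
    while (n--)
        *p++ = 0;
}

/* Hypothetical helper: call memset through a volatile function pointer.
 * The pointer has to be loaded at run time, so the compiler cannot
 * assume which function is being called. */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

static void clear_via_volatile_call(void *s, size_t n)
{
    assert(s != NULL || n == 0);
    if (n != 0)
        memset_ptr(s, 0, n);
}
```

Both helpers only zero the buffer; neither lets the caller choose the fill value.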
C11 (and C++17 as it was based on it) added memset_s (K.3.7.4.1) to give a standard solution to this problem [4][23]. However, it is an optional extension (Annex K) and, at the time of writing, several major compiler vendors do not define __STDC_LIB_EXT1__
(GCC [24], Clang [25], MSVC [26], icc [27]). Therefore, in practical terms, there is no widely-available, portable solution yet for C nor C++. A 2015 paper on this topic, WG14’s N1967 “Field Experience With Annex K — Bounds Checking Interfaces” [28], concludes that “Annex K should be removed from the C standard”. On the other hand, a 2019 paper, WG14’s N2336 “Bounds-checking Interfaces: Field Experience and Future Directions” [29], examines the arguments both for and against the bounds-checked interfaces, evaluates possible solutions for actual problems with Annex K, and propounds the repair and retention of Annex K for C2x.
Other languages offer similar facilities in their standard libraries or as external projects:
secrecy, secstr and zeroize libraries in Rust [32][33][34].
SecureString class in .NET Core, .NET Framework and Xamarin [35].
securemem package in Haskell [36].
We can standardize current practice by providing a memset_explicit function that takes a memory range (a pointer and a size) to be erased, plus a value, guaranteeing that it won’t be optimized away:
memset_explicit(password, 0, size);
There are several reasonable design alternatives for such a function with respect to its interface, semantics, wording, naming, etc. Both WG14 and WG21 discussed them in several instances on previous revisions of this proposal. In the end, the C committee decided to go for a function with the same interface as memset for simplicity and familiarity.
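As a rough illustration of the chosen interface, the earlier f() example could end with such a call. In this sketch, memset_explicit_sketch is a hypothetical stand-in (implemented with volatile byte stores) for a library-provided memset_explicit, the two helper functions have placeholder bodies, and the buffer is passed in by the caller only so that the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIZE 32

/* Hypothetical stand-in with the memset interface. A real
 * memset_explicit would come from the C library; this version relies on
 * volatile byte stores, which is an assumption of the sketch. */
static void *memset_explicit_sketch(void *s, int c, size_t n)
{
    volatile unsigned char *p = (volatile unsigned char *)s;

    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)c;
    return s;
}

/* Placeholder: pretend to read a password into buf (n must be >= 1). */
static void get_password_from_user(char *buf, size_t n)
{
    strncpy(buf, "hunter2", n - 1);
    buf[n - 1] = '\0';
}

/* Placeholder: "use" the password by computing its length. */
static size_t use_password(const char *buf, size_t n)
{
    const char *end = memchr(buf, '\0', n);

    return end ? (size_t)(end - buf) : n;
}

/* Same shape as f() above, with the final memset replaced. */
static size_t f(char *password, size_t size)
{
    size_t len;

    get_password_from_user(password, size);
    len = use_password(password, size);

    /* Clear the sensitive data; the call has the memset signature. */
    memset_explicit_sketch(password, 0, size);
    return len;
}
```

Because the signature matches memset, switching an existing clearing call over is a one-word change at the call site.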
While it is good practice to ensure that the memory is cleared as soon as possible, there are other potential improvements when handling sensitive data in memory:
Most of these extra improvements require either non-portable code, operating-system-specific calls, or compiler support as well, which also makes them a good target for standardization. A subset (e.g., encryption-at-rest and locking/pinning) may be relatively easy to tackle in the near future since libraries/projects and other languages handle them already. Other improvements, however, are well in the research stage on compilers, e.g.:
Furthermore, discussing the addition of any security-related features to the C and C++ languages is a complex task. Therefore, this paper attempts only to spearhead the work in this area, giving users access to the well-known “guaranteed memory clearing” function already used in the industry, as explained in the previous sections. For context, it is important to mention that there seems to be a recent resurgence of interest in these topics: for C, N2485 proposed to add similar functionality [42]; while within C++ there are ongoing discussions about creating a safety and security SG or SIG.
Some other related improvements based on the basic building blocks can also be considered:
… memset_explicit is called on destruction/move, plus some other features (explicit read/modify/write access, locking/pinning, encryption-at-rest, etc.). This wrapper is a good approach to ensure memory is cleared even when dealing with exceptions. During the WG21 LEWGI review, it was acknowledged that similar types are in use by third parties. Some libraries and languages feature similar types, as mentioned in the previous section.
… secure_string) may be considered useful (e.g., for passwords).
There are, as well, other related library features that some languages provide for handling sensitive data:
… getpass module in Python [44]).
This proposal adds the memset_explicit function, which follows the memset interface, semantics and naming.
Two main wording alternatives are proposed (1 and 4).
… memset would provide.
The proposed wordings are with respect to the N2596 C2x Working Draft.
After “7.24.6.1 The memset function”, add:
7.24.6.2 The memset_explicit function
Synopsis
#include <string.h>
void *memset_explicit(void *s, int c, size_t n);
Description
The memset_explicit function copies the value of c (converted to an unsigned char) into each of the first n characters of the object pointed to by s. The purpose of this function is to make sensitive information stored in the object inaccessible. (Footnote 1)
(Footnote 1) The intention is that the memory store is always performed (i.e., never elided), regardless of optimizations. This is in contrast to calls to the memset function (7.24.6.1).
Returns
The memset_explicit function returns the value of s.
In “B.23 String handling <string.h>”, after the memset line, add:
void *memset_explicit(void *s, int c, size_t n);
In “J.6.1 Rule based identifiers”, after the memset line, add:
memset_explicit
After “7.24.6.1 The memset function”, add:
7.24.6.2 The memset_explicit function
Synopsis
#include <string.h>
void *memset_explicit(void *s, int c, size_t n);
Description
The memset_explicit function copies the value of c (converted to an unsigned char) into each of the first n characters of the object pointed to by s. The purpose of this function is to make sensitive information stored in the object inaccessible. (Footnote 1)
(Footnote 1) The intention is that the memory store is always performed (i.e., never elided), regardless of optimizations. This is in contrast to calls to the memset function (7.24.6.1). The implementation might clear other copies of the data (e.g., intermediate values, stack frame, cache lines, spilled registers, swapped out pages, etc.) or it might avoid their creation (e.g., reducing copies, locking/pinning pages, etc.).
Returns
The memset_explicit function returns the value of s.
In “B.23 String handling <string.h>”, after the memset line, add:
void *memset_explicit(void *s, int c, size_t n);
In “J.6.1 Rule based identifiers”, after the memset line, add:
memset_explicit
After “7.24.6.1 The memset function”, add:
7.24.6.2 The memset_explicit function
Synopsis
#include <string.h>
void *memset_explicit(void *s, int c, size_t n);
Description
The memset_explicit function copies the value of c (converted to an unsigned char) into each of the first n characters of the object pointed to by s. The purpose of this function is to make sensitive information stored in the object inaccessible, possibly through additional implementation-defined behavior(s) unobservable by the abstract machine (5.1.2.3). (Footnote 1)
(Footnote 1) The intention is that the memory store is always performed (i.e., never elided), regardless of optimizations. This is in contrast to calls to the memset function (7.24.6.1). Additional implementation-defined behavior can include clearing other implementation-defined copies of the data (e.g., intermediate values, stack frame, cache lines, spilled registers, swapped out pages, etc.), or avoiding the creation of such implementation-defined copies (e.g., reducing copies, locking/pinning pages, etc.).
Returns
The memset_explicit function returns the value of s.
In “B.23 String handling <string.h>”, after the memset line, add:
void *memset_explicit(void *s, int c, size_t n);
In “J.6.1 Rule based identifiers”, after the memset line, add:
memset_explicit
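The wordings above share the same interface, so the visible result of a call is the same as for memset: c is converted to unsigned char, only the first n characters are written, and s is returned. The sketch below illustrates those three points, assuming 8-bit bytes and using plain memset as a stand-in; fill_like_memset_explicit and all_bytes_equal are hypothetical names. Unlike the proposed function, this stand-in gives no guarantee against elision.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the first n bytes of buf all equal expected, else 0. */
static int all_bytes_equal(const unsigned char *buf, size_t n,
                           unsigned char expected)
{
    for (size_t i = 0; i < n; i++)
        if (buf[i] != expected)
            return 0;
    return 1;
}

/* Assumption: memset stands in for memset_explicit here, purely to show
 * the shared interface semantics (value conversion, byte count, return
 * value). It does NOT provide the "never elided" property. */
static void *fill_like_memset_explicit(void *s, int c, size_t n)
{
    assert(s != NULL);
    return memset(s, c, n);
}
```

For example, filling the first 4 bytes of a 6-byte buffer with c = 0x1FF stores 0xFF in those 4 bytes, leaves the last 2 untouched, and returns the buffer's address.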
Thanks to Aaron Ballman, JF Bastien, David Keaton and Billy O’Neal for providing guidance about the WG14 and WG21 standardization processes. Thanks to Aaron Ballman, Peter Sewell, David Svoboda, Hubert Tong, Martin Uecker, Ville Voutilainen, Rajan Bhakta, JeanHeyd Meneide, Jens Gustedt and others for their input on wording and the abstract machine. Thanks to Aaron Peter Bachmann for withdrawing his parallel proposal [42]. Thanks to Robert C. Seacord for his review and proofreading of a previous revision. Thanks to Ryan McDougall for presenting an early revision at WG21 Kona 2019. Thanks to Graham Markall for his input regarding the SECURE project and the current state of compiler support for related features. Thanks to Martin Sebor for pointing out the SECURE project. Thanks to BSI for suggesting constraining the template to non-pointers. Thanks to Philipp Klaus Krause for raising the discussion in the OpenBSD list. Thanks to everyone else that joined all the different discussions.
“… memset function call (…)”, https://www.viva64.com/en/w/v597/
“… memset_s() to clear memory, without fear of removal”, http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1381.pdf
[5] “openssl/crypto/ec/curve448/utils.c (old code)”, https://github.com/openssl/openssl/blob/f8385b0fc0215b378b61891582b0579659d0b9f4/crypto/ec/curve448/utils.c
[6] “OPENSSL_cleanse (implementation)”, https://github.com/openssl/openssl/blob/master/crypto/mem_clr.c
[7] “openssl/crypto/mem.c (old code)”, https://github.com/openssl/openssl/blob/104ce8a9f02d250dd43c255eb7b8747e81b29422/crypto/mem.c#L143
[8] “openssl/crypto/sparccpuid.S (example of assembly implementation)”, https://github.com/openssl/openssl/blob/master/crypto/sparccpuid.S#L363
[9] “memzero_explicit (implementation)”, https://elixir.bootlin.com/linux/v4.18.5/source/lib/string.c#L706
[11] “bzero, explicit_bzero - zero a byte string”, http://man7.org/linux/man-pages/man3/bzero.3.html
[12] “SecureZeroMemory function”, https://msdn.microsoft.com/en-us/library/windows/desktop/aa366877(v=vs.85).aspx
[13] “memzero_explicit”, https://www.kernel.org/doc/htmldocs/kernel-api/API-memzero-explicit.html
[14] “OPENSSL_cleanse”, https://www.openssl.org/docs/man1.1.1/man3/OPENSSL_cleanse.html
“… OPENSSL_cleanse() #455”, https://github.com/openssl/openssl/pull/455
“… SecureZeroMemory / RtlSecureZeroMemory?”, https://stackoverflow.com/questions/13299420/
“… memset() calls?”, https://gcc.gnu.org/ml/gcc-help/2014-10/msg00047.html
“… memset optimized out in random.c”, https://bugzilla.kernel.org/show_bug.cgi?id=82041
[24] “memset_s in gcc 8.2 at Godbolt”, https://godbolt.org/g/M7MyRg
[25] “memset_s in clang 6.0.0 at Godbolt”, https://godbolt.org/g/ZwbkgY
[26] “memset_s in MSVC 19 2017 at Godbolt”, https://godbolt.org/g/FtrVJ8
[27] “memset_s in icc 18.0.0 at Godbolt”, https://godbolt.org/g/vHZNrW
[32] “secrecy Rust library”, https://crates.io/crates/secrecy
[33] “secstr Rust library”, https://crates.io/crates/secstr
[34] “zeroize Rust library”, https://crates.io/crates/zeroize
[35] “SecureString Class”, https://docs.microsoft.com/en-us/dotnet/api/system.security.securestring
[36] “securemem: abstraction to an auto scrubbing and const time eq, memory chunk.”, https://hackage.haskell.org/package/securemem
“secure_val: a secure-clear-on-move type”, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1315r1.html