In April-September 2015 we distributed a web survey to investigate what C is, in current mainstream practice: the behaviour that programmers assume they can rely on, the behaviour provided by mainstream compilers, and the idioms used in existing code, especially systems code. We were not asking what the ISO C standard permits, which is often more restrictive, or about obsolete or obscure hardware or compilers. We focussed on the behaviour of memory and pointers. This is a step towards an unambiguous and mathematically precise definition of the de facto standards: the C dialects that are actually used by systems programmers and implemented by mainstream compilers.
This document analyses the results. For many questions the outcome seems clear, but for some, especially 1, 2, 9, 10, and 11, major open questions about current compiler behaviour remain; we'd greatly appreciate comments from the relevant compiler developers or other experts.
We would like to thank all those who responded to the survey, those who distributed it, and especially those who helped us tune earlier versions, including members of the Cambridge Systems Research Group. This work is funded by the EPSRC REMS (Rigorous Engineering for Mainstream Systems) Programme Grant, EP/K008528/1.
Aiming for a modest-scale but technically expert audience, we distributed the survey at the University of Cambridge systems research group, at EuroLLVM 2015, via John Regehr's blog, and via various mailing lists: gcc, llvmdev, cfe-dev, libc-alpha, xorg, a FreeBSD list, xen-devel, a Google C users list, and a Google C compilers list. It was then distributed second-hand via Facebook, Twitter, and Reddit. It was also sent to some Linux and MSVC people, but not widely advertised there.
In all there were 323 responses, received between 2015/04/10 and 2015/09/29. Of those, 223 included a name and/or an email address, while 100 were anonymous. The responses included a few duplicate submissions from non-anonymous people; the earlier submissions are not included in these numbers. There may also be a small number of duplicates from anonymous people; it's hard to be certain exactly which those are, so we left them in the data, but their small number means they shouldn't significantly affect the results. There were also a few responses sent directly to the mailing lists, which we include in the text discussion but not in the numbers below.
The responses include around 100 printed pages of textual comments, which are often more meaningful than the numerical survey results. Below we include just a few representative examples for each question, not all the comments.
C applications programming : 255
C systems programming : 230
Linux developer : 160
Other OS developer : 111
C embedded systems programming : 135
C standard : 70
C or C++ standards committee member : 8
Compiler internals : 64
GCC developer : 15
Clang developer : 26
Other C compiler developer : 22
Program analysis tools : 44
Formal semantics : 18
no response : 6
other : 18
Most have expertise in C systems programming and significant numbers report expertise in compiler internals and in the C standard.
Comparing the numbers for the set of all responses against those from respondents reporting one or other of those two kinds of expertise, the results seem broadly similar. On the whole the "C standard" expertise people are a little more pessimistic.
If you zero all bytes of a struct and then write some of its members, do reads of the padding return zero? (e.g. for a bytewise CAS or hash of the struct, or to know that no security-relevant data has leaked into them.)
Will that work in normal C compilers?
yes : 116 (36%)
only sometimes : 95 (29%)
no : 21 ( 6%)
don't know : 82 (25%)
I don't know what the question is asking : 3 ( 1%)
no response : 6
Do you know of real code that relies on it?
yes : 46 (14%)
yes, but it shouldn't : 31 ( 9%)
no, but there might well be : 158 (49%)
no, that would be crazy : 58 (18%)
don't know : 25 ( 7%)
no response : 5
If it won't always work, is that because [check all that apply]:
you've observed compilers write junk into padding bytes : 31
you think compilers will assume that padding bytes contain unspecified values and optimise away those reads : 120
no response : 150
other : 80
It remains unclear what behaviour compilers currently provide (or should provide) for this. On one side, arguing for a relatively tight semantics:
A modest but significant number of respondents say they know real code that relies on this.
In some circumstances it seems important to provide systems programmers with a mechanism to ensure that no information is leaked via padding. Rewriting structure definitions to make all padding into explicit fields may not be practicable, especially if one wants to do so in a platform-independent way, and so option (d) (of the candidate semantics listed below) is not compatible with this. Option (c) makes it possible but awkward to prevent leakage, as padding must then be re-zero'd after each member write.
In some circumstances programmers may rely on predictable padding values, at least in the absence of structure member writes, e.g. for memcmp, hashing, or compare-and-swap of struct values. Again (d) is not compatible with this, and (a) or (b) are preferable. But it's not clear whether any of those usages are common or essential.
More deterministic semantics is in general desirable for debugging.
For MSVC, one respondent suggests the compiler provides (a).
On the other side, looking at the optimisations that compilers actually do (which may force a relatively loose semantics):
Structure assignments observably sometimes do copy padding.
Some respondents expect that writes to a single member might overwrite adjacent padding with zeros, in a wide write. But we don't yet have concrete cases on modern mainstream architectures where this or any of the following three actually happen.
Some respondents expect that writes to a single member might overwrite adjacent padding with arbitrary values, in a wide write.
Many respondents suggest that padding bytes could be deemed by the compiler as holding unspecified values irrespective of any source-code writes of those bytes, and hence that such writes could be omitted and later reads of the padding bytes be given arbitrary (and unstable) values. But this would mean that there is no way for the programmer to avoid leakage or provide deterministic padding values. It's unclear whether this actually happens at present.
One respondent (Joseph Myers) suggests (for GCC) that a "plausible sequence of optimizations is to apply SRA (scalar replacement of aggregates), replacing the memset with a sequence of member assignments (discarding assignments to padding) in order to do so." This could require something equivalent to the above to make the existing compiler behaviour admissible, but it's similarly unclear to us whether it actually does at present.
For Clang, one respondent (David Chisnall) suggests that by the time the optimisation passes operate, padding has been replaced by explicit fields, so neither over-wide writes nor permanently-undefined-value behaviour will occur.
The above suggests four possible semantics, listed with the strongest first (in order of decreasing predictability for the programmer and increasing looseness, and hence increasing permissiveness, for optimisers):
a) Structure copies might copy padding, but structure member writes never touch padding.
b) Structure member writes might write zeros over subsequent padding.
c) Structure member writes might write arbitrary values over subsequent padding, with reads seeing stable results.
d) Padding bytes are regarded as always holding unspecified values, irrespective of any byte writes to them, and so reads of them might return arbitrary and unstable values. (But note that for structs stored to malloc'd regions, this is at odds with the idea that malloc'd regions can be reused, so perhaps we could only really have this semantics for the other storage-duration kinds.)
For each compiler (GCC, Clang, MSVC, ICC, ...), the question is which of these it provides on mainstream platforms. If the answer is not (a), to what extent is it feasible to provide compiler flags that force the behaviour to be stronger?
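To make the options concrete, here is a minimal sketch of the bytewise-comparison idiom from the question (our example, not from a respondent), assuming a typical ABI on which struct s has interior padding between its members:

    #include <string.h>

    struct s { char c; int i; };   /* on common 32/64-bit ABIs: 3 padding
                                      bytes between c and i */

    _Bool eq(const struct s *x, const struct s *y) {
        struct s a, b;
        memset(&a, 0, sizeof a);   /* zero all bytes, including padding */
        memset(&b, 0, sizeof b);
        a.c = x->c; a.i = x->i;    /* do these member writes disturb padding? */
        b.c = y->c; b.i = y->i;
        /* reliable under semantics (a) or (b); under (c) or (d) the memcmp
           may see arbitrary padding bytes and report spurious inequality */
        return memcmp(&a, &b, sizeof a) == 0;
    }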
Is reading an uninitialised variable or struct member (with a current mainstream compiler):
(This might either be due to a bug or be intentional, e.g. when copying a partially initialised struct, or to output, hash, or set some bits of a value that may have been partially initialised.)
a) undefined behaviour (meaning that the compiler is free to arbitrarily miscompile the program, with or without a warning) : 139 (43%)
b) ( * ) going to make the result of any expression involving that value unpredictable : 42 (13%)
c) ( * ) going to give an arbitrary and unstable value (maybe with a different value if you read again) : 21 ( 6%)
d) ( * ) going to give an arbitrary but stable value (with the same value if you read again) : 112 (35%)
e) don't know : 3 ( 0%)
f) I don't know what the question is asking : 2 ( 0%)
g) no response : 4
If you clicked any of the starred options, do you know of real code that relies on it (as opposed to the looser options above the one you clicked)?
yes : 27 (11%)
yes, but it shouldn't : 52 (22%)
no, but there might well be : 63 (27%)
no, that would be crazy : 80 (34%)
don't know : 10 ( 4%)
no response : 91
Here also it remains unclear what compilers currently provide and what they should provide. The survey responses are dominated by the "undefined behaviour" and "arbitrary but stable" options.
It's not clear whether people are actually depending on the latter, beyond the case of copying a partially initialised struct, which it seems must be supported, and comparing against a partially initialised struct, which it seems is done sometimes. Many respondents mention historical uses to attempt to get entropy, but that seems now widely regarded as a mistake. There is a legitimate general argument that the more determinacy that can be provided the better, for debugging.
But it seems clear that GCC, Clang, and MSVC do not at present exploit the licence the ISO standard gives (in defining this to be undefined behaviour) to arbitrarily miscompile code. Clang seems to be the most aggressive, propagating undef in many cases, though one respondent (Richard Smith) said "LLVM is moving towards treating this as UB in the cases where the standards allow it to do so". But there are special cases where LLVM is a bit stronger (cf. the LLVM undef documentation); it's unclear why those are considered useful. For GCC, Joseph Myers said
"Going to give arbitrary, unstable values (that is, the variable assigned from the uninitialised variable itself acts as uninitialised and having no consistent value). (Quite possibly subsequent transformations will have the effect of undefined behavior.) Inconsistency of observed values is an inevitable consequence of transformations PHI (undefined, X) -> X (useful in practice for programs that don't actually use uninitialised variables, but where the compiler can't see that)."
For MSVC, one respondent said:
"I am aware of a significant divergence between the LLVM community and MSVC here; in general LLVM uses "undefined behaviour" to mean "we can miscompile the program and get better benchmarks", whereas MSVC regards "undefined behaviour" as "we might have a security vulnerability so this is a compile error / build break". First, there is reading an uninitialized variable (i.e. something which does not necessarily have a memory location); that should always be a compile error. Period. Second, there is reading a partially initialised struct (i.e. reading some memory whose contents are only partly defined). That should give a compile error/warning or static analysis warning if detectable. If not detectable it should give the actual contents of the memory (be stable). I am strongly with the MSVC folks on this one - if the compiler can tell at compile time that anything is undefined then it should error out. Security problems are a real problem for the whole industry and should not be included deliberately by compilers."
For each compiler we ask which of these four semantics it provides (weakest first, as in the question):
a) undefined behaviour (meaning that the compiler is free to arbitrarily miscompile the program, with or without a warning).
b) the result of any expression involving that value unpredictable.
c) an arbitrary and unstable value (maybe with a different value if you read again).
d) an arbitrary but stable value (with the same value if you read again).
It looks as if several compiler writers are saying (b), while a significant number of programmers are relying on (d) (which may also be what MSVC supports).
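As a concrete illustration (our sketch), the partially-initialised-struct copy that it seems must be supported:

    struct pair { int used, unused; };

    struct pair copy_partial(void) {
        struct pair a;
        a.used = 1;          /* a.unused is never written */
        struct pair b = a;   /* the copy reads the uninitialised member */
        /* under (d), b.unused holds an arbitrary but stable value; under (c)
           repeated reads might differ; under (a) the ISO licence would
           permit arbitrary miscompilation of this function */
        return b;
    }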
If you calculate an offset between two separately allocated C memory objects (e.g. malloc'd regions or global or local variables) by pointer subtraction, can you make a usable pointer to the second by adding the offset to the address of the first?
Will that work in normal C compilers?
a) yes : 154 (48%)
b) only sometimes : 83 (26%)
c) no : 42 (13%)
d) don't know : 36 (11%)
e) I don't know what the question is asking : 3 ( 0%)
f) no response : 5
Do you know of real code that relies on it?
yes : 61 (19%)
yes, but it shouldn't : 53 (16%)
no, but there might well be : 99 (31%)
no, that would be crazy : 73 (23%)
don't know : 27 ( 8%)
no response : 10
If it won't always work, is that because [check all that apply]:
you know compilers that optimise based on the assumption that that is undefined behaviour : 51
no response : 228
other : 51
Most respondents expect this to work, and a significant number know of real code that relies on it. For example:
It's used in both Linux and FreeBSD for per-CPU variables. (Robert Watson, David Chisnall, Paul McKenney)
It's used for calculating a fingerprint of bytes in memory, for FIPS validation. The OpenSSL FIPS canister is one example. (Jonathan Lennox)
QEMU relies heavily on pointer arithmetic working in the "obvious" way on the set of machines/OSes we target. I know this isn't strictly standards compliant but it would break so much real code to enforce it that I trust that gcc/clang won't do something dumb here. (IIRC there was a research project that tried to enforce no buffer overruns by being strict to the standards text here and they found that an enormous amount of real world code did not work under their setup.) (Peter Maydell)
The MPI Forum (which includes me) recognizes the problems of address arithmetic in C and has utility functions to make it possible to do things that are necessary, but in a portable way (of course, the implementation is platform specific). (Jeff Hammond)
It's undefined behavior, but an implementation is permitted to use undefined behavior in its own code since it ostensibly has control over it. An example of this is the glibc strcpy source (generic C version) using a ptrdiff_t between src and dest to create a single offset and then walking through only one pointer. (Chris Young)
I've seen this done in an OS to link system function calls into ELF binaries (anon)
For example, coreboot contains a mechanism to relocate part of its data segment from one base address to another during execution. All accesses to globals in that segment go through a wrapper which after the migration uses arithmetic like this to find the new address (e.g. something like return !migration_done ? addr : addr - old_base + new_base;). (anon)
Historically, the main reason to disallow it seems to have been segmented architectures, especially 8086. There are still some embedded architectures with distinct address spaces, and several respondents mention such targets, but it's not clear that "mainstream" C should be concerned with them; those cases could be identified as a language dialect or implementation-defined choice.
Semantically, it's straightforward to identify language dialects in which this is or is not allowed.
Then there is the possibility of exotic implementations in which pointers are represented by hash-map entries (Nick Lewycky), but again that seems outwith "mainstream" C.
On the other hand, current compilers sometimes do optimise based on an assumption (in a points-to analysis) that this doesn't occur (c.f. comments from Joseph Myers and Dan Gohman). How could these be reconciled?
One could argue that the use cases should be rewritten, but that seems unlikely to actually happen in practice.
One could turn off such optimisations (e.g. with -fno-tree-pta for GCC?).
The analysis could treat inter-object pointer subtractions as giving integer offsets that have the power to move between objects (though occurrences split across compilation units might mean one had to be too pessimistic).
One could add additional annotated pointer or integer types to identify in the source where this might occur.
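For reference, a minimal sketch (ours) of the idiom the question asks about:

    #include <stdlib.h>
    #include <stddef.h>

    void inter_object_offset(void) {
        char *a = malloc(16);
        char *b = malloc(16);
        if (a == NULL || b == NULL) return;
        ptrdiff_t off = b - a;   /* undefined behaviour per ISO: the two
                                    pointers are to separate allocations */
        char *p = a + off;       /* numerically equal to b */
        *p = 'x';                /* the question: is p usable to access
                                    b's object? */
        free(a); free(b);
    }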
For two pointers derived from the addresses of two separate allocations, will equality testing (with ==) of them just compare their runtime values, or might it take their original allocations into account and assume that they do not alias, even if they happen to have the same runtime value? (for current mainstream compilers)
a) it will just compare the runtime values : 141 (44%)
b) pointers will compare nonequal if formed from pointers to different allocations : 20 ( 6%)
c) either of the above is possible : 101 (31%)
d) don't know : 40 (12%)
e) I don't know what the question is asking : 16 ( 5%)
f) no response : 5
If you clicked either of the first two answers, do you know of real code that relies on it?
yes : 60 (26%)
yes, but it shouldn't : 16 ( 7%)
no, but there might well be : 68 (29%)
no, that would be crazy : 46 (20%)
don't know : 37 (16%)
no response : 96
The responses are roughly bimodal: many believe "it will just compare the runtime values", while a similar number believe that the comparison might take the allocation provenance into account. Of the former, 41 "know of real code that relies on it".
In practice we see that GCC does sometimes take allocation provenance into account, with the result of a comparison (in an n+1 case, comparing &p+1 and &q) sometimes varying depending on whether the compiler can see the provenance, e.g. on whether it's done in the same compilation unit as the allocation. We don't see any reason to forbid that, especially as this n+1 case seems unlikely to arise in practice, though it does complicate the semantics, effectively requiring a nondeterministic choice at each comparison of whether to take provenance into account. But for comparisons between pointers formed by more radical pointer arithmetic from pointers originally from different allocations, as in [3/15], it's not so clear.
Conclusion:
The best "mainstream C" semantics here seems to be to make a nondeterministic choice at each comparison of whether to take provenance into account or just compare the runtime pointer value, option (c). In the vast majority of cases the two will coincide.
Can you make a usable copy of a pointer by copying its representation bytes with code that indirectly computes the identity function on them, e.g. writing the pointer value to a file and then reading it back, and using compression or encryption on the way?
Will that work in normal C compilers?
a) yes : 216 (68%)
b) only sometimes : 50 (15%)
c) no : 18 ( 5%)
d) don't know : 24 ( 7%)
e) I don't know what the question is asking : 9 ( 2%)
f) no response : 6
Do you know of real code that relies on it?
yes : 101 (33%)
yes, but it shouldn't : 24 ( 7%)
no, but there might well be : 100 (33%)
no, that would be crazy : 54 (17%)
don't know : 23 ( 7%)
no response : 21
The responses are overwhelmingly positive, with many specific use cases in the comments, e.g.:
Marshalling data between guest and hypervisor (Jon)
You can go much stronger than that. Many security mitigation techniques rely on being able to XOR a pointer with one or more values and recover the pointer later by again XORing with one or more possible different values, (whose total XOR is the same as the original set). (Richard Black)
Windows /GS stack cookies do this all the time to protect the return address. The return address is encrypted on the stack, and decrypted as part of the function epilogue. (Austin Donnelly)
I've written code for a JIT that stores 64-bit virtual ptrs as their hardware based 48-bits. This is a valuable optimisation, even if it's not strictly OK. (anon)
I've also worked on 64-bit ports of 32-bit code that purposefully keep 32-bit pointer-like ints to keep their memory footprint low (with appropriate calls to tell the system exactly where we want our data).
The current Julia task-scheduler does this, by way of copying a task's stack into a buffer, and copying the buffer back to the stack later. (Arch D. Robison)
BLOSC (http://blosc.org/) does something like this. It compresses data stored in RAM with the goal of reading compressed data from RAM into L1 cache faster than an uncompressed memcpy. If pointer values can't be copied indirectly, then BLOSC users are in trouble. (Alan Somers)
The responses about current compiler behaviour are clear that in simple cases, with direct data-flow from original to computed pointer, both GCC and Clang support this. But for computation via control-flow, it's not so clear:
[wrt GCC] Yes, it is valid to copy any object that way (of course, the original pointer must still be valid at the time it is read back in). It is not, however, valid or safe to manufacture a pointer value out of thin air by, for example, generating random bytes and seeing if the representation happens to compare equal to that of a pointer. See DR#260. Practical safety may depend on whether the compiler can see through how the pointer representation was generated. (Joseph Myers)
[wrt Clang] Pretty sure this is valid behaviour. We go out of our way to support this. Well, okay, it depends how indirectly. If you want to be completely loopy, this won't work in our compiler:
    bool isThisIt(uintptr_t i) { return i == 0x12341234; }
    void *launderpointer() {
      int stackobj;
      for (uintptr_t i = 0; ; ++i) {
        if (isThisIt(&stackobj + i)) {
          return (void*)(i - 0x12341234);
        }
      }
    }
because we may return false for every call to isThisIt() even though I think it's technically valid. We generally forbid guessing the addresses of values where we're allowed to pick the address (ie., we fold &stackobj == (void*)rand() to false), but we didn't account for the case someone tries the entire address space in a loop. Don't care. Taking the pointer and capturing/escaping it is supported, we assume it may come back in from anywhere in the future, including by being typed in at the console. (Nick Lewycky)
See Gil Hur's GCC bugreport thread and email discussion from 19 May 2015: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752 (essentially the same as Nick Lewycky's example above?)
Some compilers require the computation of the pointer to somehow depend on the original pointer -- you can round-trip through a file, but you can't just guess the address, even if you guess right (for instance, if you ask the user to type in a number and assume it's the pointer, and the user gets the number from a debugger, that will not work in practice). (Richard Smith)
Conclusion:
It looks as if a reasonable "mainstream C" semantics should allow indirect pointer copying at least whenever there's a data-flow provenance path, perhaps not when there's only a control-flow provenance path. It should allow pointers to be marshalled to and read back in, and the simplest way of doing that is to allow any pointer value to be read in, with the compiler making no aliasing/provenance assumptions on such, and with the semantics checking the numeric pointer values points to a suitable live object only when and if it is dereferenced.
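A sketch (ours) of an indirect copy with a clear data-flow provenance path, in the spirit of the XOR-based mitigations mentioned above:

    #include <stdint.h>

    int obj;

    void roundtrip(void) {
        uintptr_t key = 0x5a5a5a5a;               /* illustrative mask */
        uintptr_t hidden = (uintptr_t)&obj ^ key; /* pointer hidden in an integer */
        int *back = (int *)(hidden ^ key);        /* data flows from the original */
        *back = 1;   /* respondents report GCC and Clang support this case;
                        merely guessing the same numeric value, with no
                        data-flow path, would not be supported */
    }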
Can one do == comparison between pointers to objects of different types (e.g. pointers to int, float, and different struct types)?
Will that work in normal C compilers?
a) yes : 175 (55%)
b) only sometimes : 67 (21%)
c) no : 44 (13%)
d) don't know : 29 ( 9%)
e) I don't know what the question is asking : 2 ( 0%)
f) no response : 6
Do you know of real code that relies on it?
yes : 111 (35%)
yes, but it shouldn't : 47 (15%)
no, but there might well be : 107 (34%)
no, that would be crazy : 27 ( 8%)
don't know : 17 ( 5%)
no response : 14
The question should have been clearer about whether the pointers are first cast to void* or char*. With those casts, the responses seem clear that it should be allowed, modulo now-unusual architectures with segmented memory or where the pointer representations are different.
Then there's a question, which we would hope applies only in the case without those casts, about whether -fstrict-aliasing will treat comparisons with type mismatches as nonequal, e.g.
Depends on strict-aliasing flags? I think LLVM TBAA might optimise this sort of check away? (Chris Smowton)
There are a lot of examples of this, in particular in libc, or possibly implementations of vtables. (anon)
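For concreteness, a sketch (ours) of the comparison with the casts mentioned above:

    struct a { int x; };
    struct b { float y; };

    _Bool same_object(struct a *pa, struct b *pb) {
        /* casting both sides to void * sidesteps the type-mismatch question;
           without the casts, a strict-aliasing-based analysis might treat
           the pointers as never equal */
        return (void *)pa == (void *)pb;
    }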
Can one do < comparison between pointers to separately allocated objects?
Will that work in normal C compilers?
a) yes : 191 (60%)
b) only sometimes : 52 (16%)
c) no : 31 ( 9%)
d) don't know : 38 (12%)
e) I don't know what the question is asking : 3 ( 0%)
f) no response : 8
Do you know of real code that relies on it?
yes : 101 (33%)
yes, but it shouldn't : 37 (12%)
no, but there might well be : 89 (29%)
no, that would be crazy : 50 (16%)
don't know : 27 ( 8%)
no response : 19
This seems to be widely used for lock ordering and collections.
As for Q3, there's a potential issue for segmented memory systems (where the implementation might only compare the offset) which seems not to be relevant for current "mainstream" C.
Apart from that, there doesn't seem to be any reason from compiler implementation to forbid it.
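The lock-ordering idiom, as a sketch (ours) using POSIX mutexes:

    #include <pthread.h>

    /* Acquire two locks in a globally consistent order by comparing their
       addresses, so that all threads take them in the same order and avoid
       deadlock; this relies on < between pointers into separately allocated
       objects. Assumes a != b. */
    void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
        if (a < b) { pthread_mutex_lock(a); pthread_mutex_lock(b); }
        else       { pthread_mutex_lock(b); pthread_mutex_lock(a); }
    }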
Can you inspect (e.g. by comparing with ==) the value of a pointer to an object after the object itself has been free'd or its scope has ended?
Will that work in normal C compilers?
a) yes : 209 (66%)
b) only sometimes : 52 (16%)
c) no : 30 ( 9%)
d) don't know : 23 ( 7%)
e) I don't know what the question is asking : 1 ( 0%)
f) no response : 8
Do you know of real code that relies on it?
yes : 43 (14%)
yes, but it shouldn't : 55 (18%)
no, but there might well be : 102 (33%)
no, that would be crazy : 86 (28%)
don't know : 18 ( 5%)
no response : 19
The responses mostly say that this will work (the ISO standard notwithstanding), and include various use cases:
The pointer itself is still valid, and can be compared. Dereferencing the pointer can't. (Warner Losh)
A pointer is a value which does not cease to have a value because you happened to pass that value to a function called free (or any other function annotated with _Frees_ptr_) but the set of things that it would be reasonable to do with such a pointer would be extremely limited. (Richard Black)
Where I've seen this is code like this:
free(myptr);
release_extra_data_keyed_by_pointer(myptr);
(Jorg Brown)
A common pattern that relies on this is calling realloc and "checking whether it moved" to decide whether to update other copies of the pointer. (Nick Lewycky)
As discussed in Q4, the current stable version of ntpd does this. (Pascal Cuoq)
You can't dereference the pointer, but the value remains valid. The only good use for it I can think of is to log a debugging message (which would only be useful if one also logged the allocation). In fact, I have logged such messages myself when unloading a loadable kernel driver (because all evidence of what had been at those pages was gone; so, anything faulting referencing the unloaded driver would be a complete mystery). (Herbie Robinson)
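The realloc pattern Nick Lewycky describes, as a sketch (ours; the rekey callback is hypothetical):

    #include <stdlib.h>

    void *grow(void *p, size_t n, void (*rekey)(void *from, void *to)) {
        void *old = p;
        void *q = realloc(p, n);
        if (q != NULL && q != old)   /* inspects the free'd pointer value */
            rekey(old, q);           /* update other copies of the pointer */
        return q;
    }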
There are debugging environments that will warn of this, however; and for GCC, one respondent suggests it won't always work, but doesn't say what might go wrong.
Can we either establish that current mainstream compilers will support this or identify more specifically where and how they will fail to do so?
Can you (transiently) construct an out-of-bounds pointer value (e.g. before the beginning of an array, or more than one-past its end) by pointer arithmetic, so long as later arithmetic makes it in-bounds before it is used to access memory?
Will that work in normal C compilers?
a) yes : 230 (73%)
b) only sometimes : 43 (13%)
c) no : 13 ( 4%)
d) don't know : 27 ( 8%)
e) I don't know what the question is asking : 2 ( 0%)
f) no response : 8
Do you know of real code that relies on it?
yes : 101 (33%)
yes, but it shouldn't : 50 (16%)
no, but there might well be : 123 (40%)
no, that would be crazy : 18 ( 5%)
don't know : 14 ( 4%)
no response : 17
It seems clear that this is often assumed to work, e.g.:
All the time. All the time. (anon)
The Numerical Recipes in C rely on it; that's widely-used code in the physics community with some pretty horrible (and probably illegal) C code. This code explicitly stores and passes out-of-bounds pointers. If I index a multi-dimensional array manually, then there's a chain of arithmetic like p + i * di + j * dj + k * dk or so, where p is a pointer and the others are integers, and I don't pay attention to the order in which these are evaluated. This just may temporarily lead to out-of-bounds pointers, depending on the order of evaluation. (Erik Schnetter)
Tcpdump does a bit of this where they create a variable from an array and then check it is in bounds (Brooks Davis)
Yeah, we didn't even bother with this one in clang -fsanitize=undefined. (Nick Lewycky)
But on the other hand, compilers may sometimes assume otherwise:
This is not safe; compilers may optimise based on pointers being within bounds. In some cases, it's possible such code might not even link, depending on the offsets allowed in any relocations that get used in the object files. (Joseph Myers)
The situation has not gotten friendlier to old-school pointer manipulations since https://lwn.net/Articles/278137/ was written in 2008.
Pretty sure this one I've seen buggy code optimised away by real compilers. (David Jones)
Here the prevalence of transiently out-of-bounds pointer values in real code suggests it's worth seriously asking the cost of disabling whatever compiler optimisation is done based on this, to provide a simple predictable semantics.
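A minimal sketch (ours) of a transiently out-of-bounds value:

    int a[10];

    void transient(void) {
        int *q = a - 1;   /* out of bounds: undefined behaviour per ISO,
                             even though q is not dereferenced here */
        q = q + 1;        /* later arithmetic brings it back in bounds */
        *q = 0;           /* only dereferenced once in bounds */
    }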
Given two structure types that have the same initial members, can you use a pointer of one type to access the initial members of a value of the other?
Will that work in normal C compilers?
a) yes : 219 (69%)
b) only sometimes : 54 (17%)
c) no : 17 ( 5%)
d) don't know : 22 ( 6%)
e) I don't know what the question is asking : 4 ( 1%)
f) no response : 7
Do you know of real code that relies on it?
yes : 157 (50%)
yes, but it shouldn't : 54 (17%)
no, but there might well be : 59 (19%)
no, that would be crazy : 22 ( 7%)
don't know : 18 ( 5%)
no response : 13
It's clear that this is used very commonly:
LLVM's hand rolled rtti does this! (JF Bastien)
The FreeBSD kernel and many other things do this. Most anything that uses structs to access IPv4 and IPv6 header data. (Brooks Davis)
This is very common. It is often achieved by simply making the first member of the second structure an instance of the first structure, but in some cases (e.g., the Berkeley socket address types) even dissimilar views to the same representation data are used at different times. (Ethan Blanton)
Lots of code uses this type punning. (Warner Losh)
This happens all the time. Not just restricted to initial members, using the CONTAINING_RECORD() macro. (Austin Donnelly)
Yes, this is permitted by the std. (Nick Lewycky)
[in fact it isn't]
Guaranteed by the standard only if the structures are members of the same union (clause 6.5.2.3, structure and union members) but it will normally work for bare structures. Very common for implementing object-oriented polymorphism, e.g. in bytecode interpreters. (Tony Finch)
I can swear I've seen this in both Windows headers and the Linux kernel.
This is a common idiom in X11 event handling code - you are forced into it by the Xlib API which assumes that you can read the event type from the first member of the XEvent union regardless of which subtype of the union will be used to read the rest of the data. (Peter Benie)
Half of the Win32 API, BSD sockets and most OOP done in C would break. (anon)
This is used so commonly that no compiler would dare to do anything than what you expect. (anon)
This is used all over the place. (Herbie Robinson)
On the other hand, some respondents caution that strict aliasing may break this, and one (w.r.t. GCC) says it won't always work, though without saying why, or whether that's specific to strict-aliasing.
At least with -fno-strict-aliasing, it seems this should be guaranteed to work.
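The idiom, as a sketch (ours) in the style of the Xlib and sockets examples above:

    struct event     { int type; };
    struct key_event { int type; int keycode; };

    int event_type(struct event *e) {
        return e->type;   /* reads the common initial member */
    }

    int f(void) {
        struct key_event k = { 1, 42 };
        /* widely relied on; ISO only guarantees the common-initial-sequence
           inspection when the structs are members of a visible union */
        return event_type((struct event *)&k);
    }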
Can an unsigned character array be used (in the same way as a malloc'd region) to hold values of other types?
Will that work in normal C compilers?
a) yes : 243 (76%)
b) only sometimes : 49 (15%)
c) no : 7 ( 2%)
d) don't know : 15 ( 4%)
e) I don't know what the question is asking : 2 ( 0%)
f) no response : 7
Do you know of real code that relies on it?
yes : 201 (65%)
yes, but it shouldn't : 30 ( 9%)
no, but there might well be : 55 (17%)
no, that would be crazy : 6 ( 1%)
don't know : 16 ( 5%)
no response : 15
Here again it's clear that it's very often relied on for statically allocated (non-malloc'd) character arrays, and it should work, with due care about alignment. For example:
BSD kernels use the caddr_t typedef for allocations that will be manipulated as bytes. (Brooks Davis)
Encoder/Decoders do this all the time. They read bytes from a file into an unsigned char buffer, then cast a struct * on top of it to pick out the relevant fields and move on. (Austin Donnelly)
But the ISO standard disallows it, as a violation of its strict-aliasing text, and one respondent says it won't always work in practice, though without saying why.
With -fno-strict-aliasing it seems clear that it should be allowed.
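A sketch (ours) of the statically allocated case, handling alignment explicitly with C11 alignas:

    #include <stdalign.h>

    static alignas(double) unsigned char buf[sizeof(double)];

    void store(double v) {
        double *d = (double *)buf;   /* use the array as storage for a double;
                                        a strict-aliasing violation per ISO,
                                        but widely relied on, and unproblematic
                                        with -fno-strict-aliasing */
        *d = v;
    }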
Can you make a null pointer by casting from an expression that isn't a constant but that evaluates to 0?
Will that work in normal C compilers?
a) yes : 178 (56%)
b) only sometimes : 38 (12%)
c) no : 22 ( 6%)
d) don't know : 67 (21%)
e) I don't know what the question is asking : 11 ( 3%)
f) no response : 7
Do you know of real code that relies on it?
yes : 56 (18%)
yes, but it shouldn't : 21 ( 6%)
no, but there might well be : 113 (37%)
no, that would be crazy : 63 (20%)
don't know : 50 (16%)
no response : 20
This is very often assumed to work. The only exception seems to be some (unidentified) embedded systems.
NULL was until maybe C99 or so only conventionally zero, and on some embedded platforms it in practice had a nonzero value. I have not seen this in a very long time. (Ethan Blanton)
Some embedded compilers use a non-zero null pointer so they can point it at unaddressable memory, when the zero page is addressable. (Richard Smith)
A "mainstream C" semantics should permit it.
Can null pointers be assumed to be represented with 0?
Will that work in normal C compilers?
a) yes : 201 (63%)
b) only sometimes : 50 (15%)
c) no : 54 (17%)
d) don't know : 7 ( 2%)
e) I don't know what the question is asking : 4 ( 1%)
f) no response : 7
Do you know of real code that relies on it?
yes : 187 (60%)
yes, but it shouldn't : 61 (19%)
no, but there might well be : 42 (13%)
no, that would be crazy : 7 ( 2%)
don't know : 12 ( 3%)
no response : 14
Basically an unequivocal "yes" for mainstream systems. For example:
For all targets supported by GCC, yes. (Joseph Myers)
My understanding is that (1) memset-ing a pointer to zero is NOT guaranteed by the spec to produce a null pointer, but that (2) it does on all systems that most people care about, and that there is real code that relies on that. Being able to memset a struct to zero and have all the fields come out null/zero is convenient enough that I kind of wish the spec would change in this regard. (Matthew Steele)
Note that the POSIX committee is currently discussing a requirement that a pointer value with all bits zero be treated as a null pointer (the requirement is specifically that memset() on a structure containing pointers initialize those pointers to nulls). (anon)
Again there is a potential exception for segmented memory, but that's not relevant for "mainstream" current practice.
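The memset idiom mentioned by several respondents, as a sketch (ours):

    #include <string.h>
    #include <assert.h>

    struct node { struct node *next; char *name; int n; };

    void init(struct node *s) {
        memset(s, 0, sizeof *s);   /* all bytes zero */
        assert(s->next == NULL);   /* relied on very widely; holds wherever
                                      null pointers are represented as all
                                      bits zero, i.e. on mainstream targets */
    }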
Can one read the byte representation of a struct as aligned words without regard for the fact that its extent might not include all of the last word?
Will that work in normal C compilers?
a) yes : 107 (33%)
b) only sometimes : 81 (25%)
c) no : 44 (13%)
d) don't know : 47 (14%)
e) I don't know what the question is asking : 36 (10%)
f) no response : 8
Do you know of real code that relies on it?
yes : 40 (13%)
yes, but it shouldn't : 39 (13%)
no, but there might well be : 103 (35%)
no, that would be crazy : 42 (14%)
don't know : 67 (23%)
no response : 32
This is sometimes used in practice and believed to work, modulo alignment, page-boundary alignment, and valgrind/MSan/etc.
The C version of strcmp() in FreeBSD is a good example (Brooks Davis)
Lots of code assumes that if you can read any part of a word, you can read the full word. It won't always use the bits that aren't valid, but some crazy code does. Often you'd see this expressed as a variation on a theme of using bcopy where you might see a length computed by &a[1] - &a[0] rather than sizeof(*a) or sizeof(a[0]). (Warner Losh)
Incidentally, LLVM will do this to stack accesses in its optimizer. (Nick Lewycky)
If nothing else it requires the compiler to support something like GCC's __attribute__((__may_alias__)); otherwise the read is undefined already due to aliasing violations. (Rich Felker)
In practice this is safe with GCC except for possibly generating errors with sanitizers, valgrind etc. (but should be avoided except in special cases such as vectorized string operations). (Joseph Myers)
A "mainstream C" semantics could either forbid this entirely (slightly limiting the scope of the semantics) or could allow it, for sufficiently aligned cases, if some switch is set.
When is type punning - writing one union member and then reading it as a different member, thereby reinterpreting its representation bytes - guaranteed to work (without confusing the compiler analysis and optimisation passes)?
There's widespread doubt, disagreement and confusion here, e.g.:
always (anon)
Never (anon)
According to the standard never; in practice always. (Chris Smowton)
As long as all accesses are via the union, and not, say, by taking separate pointers to the union's fields. (anon)
Type punning always works. The compiler knows very well which fields in a union have what offsets so it knows what writes to one union impact which fields in another member of the union. It should not be confused. (Richard Black)
only when one of the types is a char type. otherwise, never guaranteed to work. (David Jones)
GCC and Clang try to allow it when it's sufficiently obvious that you're doing type punning (for instance, when you're directly accessing a block-scope union variable). GCC documents this, Clang does not (and only really does it for GCC compatibility). (Richard Smith)
You are allowed to pun the prefixes of structure types when the struct members in the prefix have the same types. Unsigned char [] and other types is probably OK. Otherwise, you are getting into strict aliasing problems. (Tony Finch)
Per the standard? Never. The conforming way to do this is with memcpy to a local, and the compiler is plenty smart enough to not actually emit the memcpy or the local. GCC's documentation claims that they support this as long as you've declared the union in advance. This is pretty scary because it means lexically in advance. So two identical function bodies before and after an unrelated declaration introducing a union may change the generated code for the two functions. In practice these unions go into header files and come before the rest of your code, so people tend not to notice. (Nick Lewycky)
Here the minimal thing it seems one should support is, broadly following the GCC documentation, type punning via a union whose definition is in scope and which is accessed via l-values that manifestly involve the union type.
Or, in the -fno-strict-aliasing case, one could allow it everywhere. For "mainstream C", it's not yet clear.
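The union idiom, together with the conforming memcpy alternative Nick Lewycky mentions, as sketches (ours, assuming 32-bit float):

    #include <stdint.h>
    #include <string.h>

    union pun { float f; uint32_t u; };

    uint32_t float_bits(float x) {
        union pun p;
        p.f = x;      /* write one member */
        return p.u;   /* read another: GCC documents this as supported when
                         the access is via the union type */
    }

    uint32_t float_bits_conforming(float x) {
        uint32_t u;
        memcpy(&u, &x, sizeof u);   /* the conforming route; compilers
                                       typically optimise the copy away */
        return u;
    }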
If you know of other areas where the C used in practice differs from that of the ISO standard, or where compiler optimisation limits the behaviour you can rely on, please list them.
There were many comments here which are hard to summarise. Many mention integer overflow and the behaviour of shift operators.