malloc
'd region) to hold values of other types?==========================================================
The C Memory Object Model Study Group was created at the April 2018 Brno WG14 meeting, with charter: This group will study the memory object model issues of C, including pointer provenance, uninitialised values, and related questions, aiming to reconcile mainstream usage, implementation practice, the ISO C standard text, and harmonisation with ISO C++. It will produce papers and proposals for C2x. Membership is not restricted to WG14.
The SG has had phone meetings roughly every two weeks from mid-July (2018.07.16, 2018.07.23, 2018.08.06, 2018.08.20, 2018.08.30, 2018.09.17), with various subsets of: Andrew Banks, Hal Finkel, Victor Gomes, Jens Gustedt, Stephen Kell, Stella Lau, Kayvan Memarian, Martin Sebor, Peter Sewell, Hubert Tong, Martin Uecker.
There has also been detailed discussion on the mailing list, https://lists.cam.ac.uk/mailman/listinfo/cl-c-memory-object-model
(moderated, public archives). There is also a publicly readable github repository, not yet much used: https://github.com/C-memory-object-model-study-group/c-mom-s
. Contact PS to be added to the mailing list and github.
This note summarises our progress since the Brno meeting, first with a brief summary (updated with some of the 2018.09.17 meeting comments), in this section, and then going into more details with examples. We have tried to capture the sometimes-disparate views of the SG members; apologies for any errors or omissions.
Discussions have focussed on provenance, subobjects, and effective types. They have not yet reached a clear overall conclusion (a detailed concrete proposal for the desired semantics), but they have clarified some issues and identified a number of new questions that need to be resolved.
There has also been extensive meta-discussion on the balance to be struck between supporting existing C code and imposing tight bounds on behaviour (to improve error reporting and optimisation). We hope to avoid getting bogged down in this by focussing on clarifying the current ISO and de facto standards, and by enumerating possible options, deferring discussion of what should be the default.
The WG21 members of the SG highlighted recent related work there, especially the C++ object creation paper P0593r2.
Some of us attended the GNU Tools Cauldron, 2018.09.07-09 (Kayvan Memarian, Martin Sebor, Peter Sewell). PS gave a talk exploring some of the questions about provenance, subobjects, and uninitialised reads, and we had a lengthy and informative discussion with some of the key GCC developers.
Our earlier proposal treats provenance only on a per-allocation granularity. That's attractively simple, but it would let one (for example) use pointer arithmetic to move between adjacent int members of a struct, which one would probably prefer to regard as a reportable error. There seems to be a consensus that some tighter subobject semantics would be preferable.
Hubert: in C++, for objects with a lifetime, the pointer arithmetic rules (not the aliasing rules) say you can't move between members. They recognise there's a need for moving between bytes in allocated regions; this is still in development. Maybe they will say one can't recover (through casts) back to a pointer to the non-character/byte type after having done char* arithmetic.
Hubert: XL C/C++ for subobjects operates under the assumption that things accessed in a way that suggests they ought to be subobjects are indeed subobjects.
Jens: agrees with the need for some tighter semantics.
A key question is whether one can cast from a pointer to the first member of a struct to the struct as a whole, and then use that to access other members. We discussed it previously in N2222 Q34 Can one move among the members of a struct using representation-pointer arithmetic and casts?, N2222 Q37 Are usable pointers to a struct and to its first member interconvertable?, N2013, and N2012. Some of us had thought that that was uncontroversially allowed in ISO C, by 6.7.2.1p15 ...A pointer to a structure object, suitably converted, points to its initial member..., and vice versa..., but others disagree. In practice, this seems to be common in real code, in the "container-of" idiom.
Though someone suggested that the IBM XL C/C++ compiler does not support it. Clarification from WG14 and compiler teams would be very helpful on this point.
Jens: we know code does it, but how much does it have to? Some cases don't involve up-casts, but casting the whole void pointer to the struct pointer.
Hubert: think the idiom is used. C++ committee is convinced that C has intended that this is supposed to work. (There's some subtlety around the case where the first member is an array). It's meant to work for C++. Maybe XL doesn't get this right all the time right now - would call that a bug.
With careful reading, the standard seems reasonably clear that multidimensional arrays are recursively structured arrays-of-arrays. It seems uncontroversially desirable to maintain this for accesses via explicit indexing (i.e., requiring each index to be within the corresponding bounds). For access via pointer arithmetic, the ISO text forbids (for example) linear traversals of a multidimensional array, but the situation in practice and the best semantics for C2X are not so clear.
Hubert: C++ not likely to entertain flat pointer arithmetic view, because there's extensive compiler-time evaluation, using a symbolic representation, and they're wary of exposing incidental layout facts to that. On the other hand, it's highly unlikely that an implementation would break a flat address space traversal. Not sure the standard has to specifically bless it, so long as people know that they shouldn't break it.
Jens: array access is formally defined using pointer arithmetic, so we'd have to change that to differentiate the two. Possible.
The basic idea of our previous provenance semantics, tracking provenance through dataflow but not through control flow, seems relatively uncontroversial, and to match what GCC does (including the treatment of the representation bytes of pointers).
Our previous semantics tracks provenance through casts to integer types and back, based on optimisations seen in GCC on some tests and on the GCC documentation: "When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined. That is, one may not use integer arithmetic to avoid the undefined behavior of pointer arithmetic as proscribed in C99 and C11 6.5.6/8.". However, discussions at the GNU Tools Cauldron suggest instead that they regard the result of casting via integer types as aliasing with almost anything, at least in GIMPL, and those test results as long-standing bugs in the RTL backend.
Shifting to this would considerably simplify the provenance semantics. Again, comments from other compiler teams and WG14 would be very helpful.
See also Reconciling High-Level Optimizations and Low-Level Code in LLVM, Juneyoung Lee, Chung-Kil Hur, Ralf Jung, Zhengyang Liu, John Regehr, Nuno P. Lopes, OOPSLA 2018, which discusses the closely related but distinct question of the memory object model for the LLVM internal language.
Hubert: the visibility of the cast and where you're doing the read or write matters greatly. Flow-sensitive form: requires you to see the source of the value. If you knew you took the address, cast to int, and cast back, Hubert's reading of the GCC statement above is that the flow-sensitive form is no longer aware of the source of the pointer. But the result will be subject to the type-based aliasing rules. Not sure this would be similar to LLVM, because it's aggressive about removing non-value-changing casts, so unlikely that that would be preserved, and so flow-sensitive analysis would go through casts to and from integer type. So for GCC and XL, it's just this flow-sensitive analysis for which such casts would introduce opacity (XL just fixed an issue). But for LLVM, they wouldn't.
Jens: but then you need to nail down what's escaped.
Hubert: would guess that the "taken address and escaped local scope" would mostly work.
Hubert: When talking about escaped, subobject provenance is also in play. Suppose you took the address of the second member of a structure. Should you then regard the whole structure as escaped?
Peter: what about the offsetof case?
Hubert: that case is extremely controversial. C++ has no intention of allowing it. WG14 asked what the use of offsetof is if you can't do this, but there's nothing to say it works. Most people vaguely think that once upon a time it was supposed to work? But unclear whether it's supposed to work now.
Victor: FreeBSD has > 5000 offsetofs. Maybe Stephen's tool could analyse how many are used for inter-member shifts.
Martin Uecker + Hubert: [discussion of how one can define "escaped"]
There seemed to be some consensus among the GNU Tools Cauldron audience that reads of uninitialised scalars have to uniformly be regarded as UB as far as the GCC implementation is concerned, the existence of code that does incremental bitwise initialisation notwithstanding. That matches the intent of several WG14 members from the Brno meeting, though not (in our reading) the text of the standard. The case of structure reads of partially or wholly uninitialised structs was not so clear.
Hubert: making all uninitialised reads UB is very very close to what C++ does. Some corner cases via accessing via char type. memcpy is almost the only thing you do - if you implement memcpy yourself, the promotion would cause UB.
Peter: shouldn't the language be strong enough to let the user write a working memcpy?
Jens: yes - in the standard, they're often the same; they should be. Character-wise access should allow this.
Peter: should people be allowed to do bitwise ops to partially initialise?
Jens: wouldn't be a fan of that except at char type.
Hubert: agree it'd be nice and C++ would like it to be possible to implement memcpy, perhaps, but C++ is imbuing more properties on memcpy - might come to viewing memcpy as really special.
Jens: strcpy even more problematic, because of accessing after the 0 for efficient wider accesses.
Hubert: that would be UB in C++
Hubert: I believe C is rather clear that structures cannot have trap representations, thus whole-structure copies by assignment are okay with uninitialized structs. There is an active issue that I reported against C++ that the undefined behaviour on whole-structure copies (due to the member-wise nature of C++ copy assignment) is a C compatibility issue.
Martin Sebor: You're right, that is unfortunately true since C11 (it was explicitly undefined before then). The change was introduced in response to DR 222:
http://std.dkuug.dk/jtc1/sc22/wg14/www/docs/dr_222.htm
I think the change was based on an invalid premise [...]
The ISO text for effective types, 6.5p{6,7}, is all in terms of the types of the lvalues used for access. This seems not to match GCC behaviour, which involves the construction of the lvalues, not just the types of the lvalues as a whole.
At least some compilers also seem to differ from ISO in requiring syntactic visibility of union definitions in order to allow accesses to structures with common prefixes inside unions.
The ISO text also leaves several questions unclear, e.g. relating to memory initialised piece-by-piece and then read as a struct or array, or vice versa.
Hubert: we almost all agree that the TBAA that compilers implement is almost as if everything was declared explicitly, and it just happens to be the case that malloc works.
Peter: GCC people said they mostly ignored declared types, because of pervasive strong updates.
Hubert: XL also looks at the type of the maximal thing in the lvalue.
Our reading of C11 and proposal for C2x: C11: the text does not clearly specify. C2x proposal: yes. C2x proposal (no-provenance option): n/a
Our previous proposal (N2219 and before) had a simple per-allocation notion of provenance, allowing access via a pointer anywhere within the memory footprint of the original allocation (though note that without '-fno-strict-aliasing' there would be some additional effective-type restrictions). In some cases this flexibility is needed, e.g. for any bytewise computation of the representation of a struct, e.g. via a user-code 'memcpy', as below, or to serialise/unserialise it, where an unsigned char *
pointer has to be able to traverse the entire allocation footprint.
Example: pointer_copy_user_dataflow_direct_bytewise_struct.c
#include <stdio.h>
#include <string.h>
typedef struct { int i; float f; } st;
st s1 = {.i=1, .f=1.0 };
st s2;
void user_memcpy(unsigned char* dest,
unsigned char *src, size_t n) {
while (n > 0) {
*dest = *src;
src += 1;
dest += 1;
n -= 1;
}
}
int main() {
st *p = &s1;
st *q = &s2;
user_memcpy((unsigned char*)q,
(unsigned char*)p, sizeof(st));
// is this free of undefined behaviour?
printf("p->i = %d q->i = %d\n",p->i,q->i);
}
It is similarly needed for the N2222 2.5.4 Q34 Can one move among the members of a struct using representation-pointer arithmetic and casts?, if one believes that should be allowed.
But in other cases per-allocation provenance allows intra-allocation examples which we don't think should be supported, e.g. as below, and which will unnecessarily impede alias analysis. Existing compilers may well do this (according to Nuno Lopes, GCC and ICC already support some subobject analysis, and people are working on it for LLVM. We don't know whether or not these analyses are turned on even with -fno-strict-aliasing
.)
Example: provenance_intra_object_1.c
#include <stdio.h>
#include <string.h>
typedef struct { int x; int y; } st;
int main() {
st s = { .x=1, .y=2 };
int *p = &s.x + 1;
int *q = &s.y;
printf("Addresses: p=%p q=%p\n",(void*)p,(void*)q);
if (memcmp(&p, &q, sizeof(p)) == 0) {
*p = 11; // is this free of undefined behaviour?
printf("s.x=%d s.y=%d *p=%d *q=%d\n",s.x,s.y,*p,*q);
}
}
A possible reconcilation of all this would be to have a per-subobject provenance restriction by default, but relax this (to per-allocation provenance) for pointers that have been formed by an explicit cast. Perhaps only for casts to void *
, unsigned char *
, intptr_t
, or uintptr_t
, or perhaps (for simplicity) for all casts. This semantics was suggested by documentation from the RV-match group.
To enforce subobject provenance, we could adapt the previous proposal to keep, with every pointer value (and with integer value if tracking provenance via integers), both a provenance ID and a subrange of the original allocation footprint. When constructing a pointer to a struct member, by &(x.p)
, we'd take the subrange of that member. When constructing a pointer to a union member, we'd take the size of that member. For a pointer constructed as the address of a specific array element, it's unclear whether one should take just that element or the whole array. Then on any cast, we'd expand the subrange to that of the original allocation. Similarly on any representation-byte write, except that we'd like a user-memcopy to reconstruct the original limited pointer value.
Against this, MS argues that casts should have no effect on the memory one is allowed to access via a pointer (with the possible exception of unsigned char*
casts).
Example: cast_struct_and_first_member_1.c
#include <stdio.h>
typedef struct { int i; float f; } st;
int main() {
st s = {.i = 1, .f = 1.0};
int *pi = &(s.i);
st* p = (st*) pi; // free of undefined behaviour?
p->f = 2.0; // and this?
printf("s.f=%f p->f=%f\n",s.f,p->f);
}
The variant below is a case where one can imagine that an aggressive compiler that assumes this is not permitted might do visible optimisation.
Example: cast_struct_and_first_member_2.c
#include <stdio.h>
typedef struct { int i; float f; } st;
void f(int *pi) {
st *pst = (st*)pi;
pst->f = 2.0; // free of undefined behaviour?
}
int main() {
st s = {.i = 1, .f = 1.0};
int *pi = &(s.i);
f(pi);
float f = s.f;
printf("f=%f\n",f); // prints 2.0 not 1.0?
}
We discussed this previously in N2222 Q34 Can one move among the members of a struct using representation-pointer arithmetic and casts?, N2222 Q37 Are usable pointers to a struct and to its first member interconvertable?, N2013, and N2012.
In practice, the "container-of" idiom, which requires this, seems to be common in many codebases.
In our reading of the standard, this is clearly allowed: 6.7.2.1p15 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning. (bold emphasis added).
Martin Sebor disagrees, writing The intent of the text is to require implementations to avoid inserting padding before the first member. It is not to also require them to treat such pointers interchangeably for the purposes of pointer arithmetic or accesses. I suspect you base your reading on the requirement of equality of the two pointers. Equality between two pointers doesn't imply that they are necessarily interchangeable for accesses to the underlying object. [...]
Our reading of C11 and proposal for C2x: C11: no
Example: multidimensional_array_1.c
#include <stdio.h>
int main() {
int a[2][2] = {{0,0},{0,0}};
a[0][2] = 1; // undefined behaviour?
int x = a[1][0];
printf("x=%d\n",x);
}
This seems uncontroversially "no": 6.5.2.1 Array Subscripting speaks of a multidimensional array object both as a single object and, for the purpose of array indexing by expressions in square brackets, as a recursively structured array of arrays. It says The definition of the subscript operator []
is that E1[E2]
is identical to ( *((E1)+(E2))))
. Together with the 6.5.6p8 discussion of additive operators for pointers, that makes this example illegal: the a[0][2]
points to one-past the a[0]
array (of arrays), and 6.5.6p8's If the result points one past the last element of the array object, it shall not be used as the operand of a unary *
operator that is evaluated makes the access undefined behaviour.
Moreover, it seems likely that such uses are usually programmer errors, and hence desirable for compilers to be allowed to report this as such.
Our reading of C11 and proposal for C2x: C11: no. C2x proposal: debatable
For accesses within a multidimensional array via explicit pointer arithmetic, rather than by array indexing, the situation is not so clear. Examples such as that below are UB according to C11, by the same reasoning as for the previous question, but they seem likely to be common and intentional in practice, and to be regarded as reasonable code by practitioners.
Example: multidimensional_array_2.c
#include <stdio.h>
int main() {
int a[2][2] = {{1,2},{3,4}};
int *p00 = &(a[0][0]);
int *p11 = &(a[1][1]);
int sum = 0;
for (int *p = p00; p <= p11; ++p) // defined behaviour?
sum = sum + *p; // defined behaviour?
printf("sum=%d\n",sum);
}
Example: multidimensional_array_3.c
#include <stdio.h>
int main() {
int a[2][2] = {{1,2},{3,4}};
int *p00 = &(a[0][0]);
int *p11 = p00 + 1*2 + 1; // defined behaviour?
*p11 = 44;
printf("a[1][1]=%d\n",a[1][1]);
}
(The first is Stephen Kell's 2018-08-20 Example 1, adapted)
Forbidding these, while still forbidding multidimensional_array_1.c
, would require differentiating the 6.5.2.1 subscript operator from pointer arithmetic.
Perhaps one could regard a multidimensional array object as a single flat array for the purposes of pointer arithmetic, while changing the abstract-machine definition of the subscript operator to include explicit bounds checks based on the type - i.e., regarding E1[E2][E3]
as ( assert(...E2 in bounds && E3 in bounds...), *((E1)+(E2)*size3+(E3)) )
wherever E1
has an array type.
From a compiler alias-analysis point of view, in the current standard, compilers can assume that accesses via any pointer derived from &(a[0])
and any pointer derived from &(a[1])
never alias. If we allowed the above two examples, that would no longer hold. Do compilers currently depend on such assumptions?
The basic idea of our previous provenance semantics, tracking provenance through dataflow but not through control flow, seems relatively uncontroversial, and to match what GCC does (including the treatment of the representation bytes of pointers).
Our previous proposal tracks provenance through casts to integer types and back, based in part on optimisations seen in GCC on some tests and on the GCC documentation: "When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined. That is, one may not use integer arithmetic to avoid the undefined behavior of pointer arithmetic as proscribed in C99 and C11 6.5.6/8.". However, discussions at the GNU Tools Cauldron suggest instead that they regard the result of casting via integer types as aliasing with almost anything, at least in GIMPL, and those test results as long-standing bugs in the RTL backend.
Shifting to this would considerably simplify the provenance semantics. Comments from other compiler teams and WG14 would be very helpful.
This would change the situation for several questions from our previous note n2263: Clarifying Pointer Provenance (Q1-Q20) v4:
Example: provenance_basic_using_intptr_t_global_xy.c
Example: provenance_basic_using_intptr_t_global_yx.c
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <inttypes.h>
int y = 2, x = 1;
int main() {
intptr_t ux = (intptr_t)&x;
intptr_t uy = (intptr_t)&y;
intptr_t offset = 4;
int *p = (int *)(ux + offset);
int *q = &y;
printf("Addresses: &x=%"PRIiPTR" p=%p &y=%"PRIiPTR\
"\n",ux,(void*)p,uy);
if (memcmp(&p, &q, sizeof(p)) == 0) {
*p = 11; // does this have undefined behaviour?
printf("x=%d y=%d *p=%d *q=%d\n",x,y,*p,*q);
}
}
Given the type intptr_t
, this question asks whether one can return to a concrete view of pointers as simple numbers, by casting to intptr_t
followed by integer arithmetic and casting back to a pointer type.
Our previous semantics tracks provenance through casts to integer types and back, based on optimisations seen in GCC on some tests and on the GCC documentation: "When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined. That is, one may not use integer arithmetic to avoid the undefined behavior of pointer arithmetic as proscribed in C99 and C11 6.5.6/8.". However, discussions at the GNU Tools Cauldron suggest instead that they regard the result of casting via integer types as aliasing with almost anything, at least in GIMPL, and that those test results as long-standing bugs in the RTL backend.
If we followed that, the answer to this question would be "no", and the above example would not have undefined behaviour.
The following casts a pointer to intptr_t
and back, using bitwise logical operations on the integer value to store some tag bits. The example is a strong form of this, storing the address and tag bit combination as a pointer (which thereby creates a misaligned pointer value, though one not used for accesses); a weaker form would store the combined value only as an integer.
Example: provenance_tag_bits_via_uintptr_t_1.c
#include <assert.h>
#include <stdio.h>
#include <stdint.h>
int x=1;
int main() {
int *p = &x;
// cast &x to an integer
uintptr_t i = (uintptr_t) p;
// check the bottom two bits of an int* are not used
assert(_Alignof(int) >= 4);
assert((i & 3u) == 0u);
// construct an integer like &x with low-order bit set
i = i | 1u;
// cast back to a pointer
int *q = (int *) i; // does this have defined behaviour?
// cast to integer and mask out the low-order two bits
uintptr_t j = ((uintptr_t)q) & ~((uintptr_t)3u);
// cast back to a pointer
int *r = (int *) j;
// are r and p now equivalent?
*r = 11; // does this have defined behaviour?
_Bool b = (r==p); // is this true?
printf("x=%i *r=%i (r==p)=%s\n",x,*r,b?"true":"false");
}
The standard leaves conversions between integer and pointer types implementation-defined (6.3.2.3p{5,6}), but it is common practice to use unused pointer bits (either low-order bits from alignment requirements or high-order bits beyond the maximum address range). We suggest that the set of unused bits for pointer types of each alignment should be made implementation-defined, to make this practice legal where necessary.
In our previous proposal, this example was allowed through careful definition of the provenance results of the intermediate operations. In a semantics allowing pointers cast through integer types to alias with (perhaps) any object whose address has been taken, it would be allowed for simpler reasons.
Note also that where the standard does give a guarantee about round-trip casts, e.g. for round-trips through intptr_t (7.20.1.4p1), it says only that the result "will compare equal". In a provenance-aware semantics, that may not be enough to make the result usable to reference memory; the standard text should be strengthened here to guarantee that by giving the result a usable provenance - the same as that of the source. And similarly for casts to void *
and back, and adding and removing qualifiers (c.f. 6.3.2.3p{1,2}).
The classic XOR linked list algorithm (implementing a doubly linked list with only one pointer per node, by storing the XOR of two pointers) also makes essential use of pointers constructed from pointer values with multiple distinct provenances. In this example we XOR the integer values from two pointers and XOR the result again with one of them.
Example: pointer_offset_xor_auto.c
Example: pointer_offset_xor_global.c
#include <stdio.h>
#include <inttypes.h>
int x=1;
int y=2;
int main() {
int *p = &x;
int *q = &y;
uintptr_t i = (uintptr_t) p;
uintptr_t j = (uintptr_t) q;
uintptr_t k = i ^ j;
uintptr_t l = k ^ i;
int *r = (int *)l;
// are r and q now equivalent?
*r = 11; // does this have defined behaviour?
_Bool b = (r==q);
printf("x=%i y=%i *r=%i (r==p)=%s\n",x,y,*r,
b?"true":"false");
}
It appears (though it is not completely clear) that this idiom is not important in modern practice. One respondent remarks that the XOR list implementation interacts badly with modern pipelines and the space saving is not a big win; another doesn't know of modern usages, though suspects that it is probably still used in places. We don't know whether current compiler alias analysis permits it or not.
Our previous proposal would not allow this, while the new proposal would.
We discussed uninitialised reads at the Brno meeting, based on the note N2221: Clarifying Unspecified Values (Q47-Q59) v3. Summarising the main points briefly:
ISO C11 (following C99) has the concept of trap representations, for which merely reading a trap representation (except by an lvalue of character type), is undefined behaviour. See 3.19.4, 6.2.6.1p5, 6.2.6.2p2, DR338. The ISO text is clear that trap representations are particular object representations that do not represent values of the object type (as opposed to, e.g., ghost state present only in the abstract machine). We observe that in many mainstream implementations, most types do not have unused object representations that could be trap representations; the principal exception being _Bool
.
For types that (in the particular implementation in question) do not have any trap representations, in our reading of the standard, reading uninitialised values is undefined iff "the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken)" (6.3.2.1p2 and DR338). This clause seems to have been added to cope with the Itanium NaT bit, but it's not clear that it correctly does so.
Despite the above, several WG14 members said that the intent of the standard was to make all reading of uninitialised values (perhaps except at character type) undefined behaviour.
When asking whether it is or should be allowed to read partially initialised structs, e.g. in a struct copy, opinions differed. One can also ask this for memcpy
and user code accessing representation bytes.
In some codebases it seems to be an accepted idiom to incrementally initialise values with bitwise operations, which involves reading uninitialised integer values.
There are also a number of related questions about padding, in Section 3.3 of our revised N2013, which we did not discuss.
At the GNU Tools Cauldron, there seemed to be consensus among the audience that reads of uninitialised scalars have to uniformly be regarded as UB as far as the GCC implementation is concerned, the existence of code that does incremental bitwise initialisation notwithstanding. That matches the intent of several WG14 members from the Brno meeting, though not (in our reading) the text of the standard. The case of structure reads of partially or wholly uninitialised structs was not so clear.
Paragraphs 6.5p{6,7} of the standard introduce effective types. These were added to C in C99 to permit compilers to do optimisations driven by type-based alias analysis, by ruling out programs involving unannotated aliasing of references to different types (regarding them as having undefined behaviour). This is one of the less clear, less well-understood, and more controversial aspects of the standard, as one can see from various GCC and Linux Kernel mailing list threads https://gcc.gnu.org/ml/gcc/2010-01/msg00013.html
https://lkml.org/lkml/2003/2/26/158
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg01647.html
and blog postings http://blog.regehr.org/archives/959
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
http://davmac.wordpress.com/2010/02/26/c99-revisited/
http://dbp-consulting.com/tutorials/StrictAliasing.html
http://stackoverflow.com/questions/2958633/gcc-strict-aliasing-and-horror-stories
. The type-based aliasing question of our preliminary survey was the only one which received a unanimous response: "don't know".
See also earlier committee discussion http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1409.htm
and http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1422.pdf
(p14).
The C11 standard says:
6.5p6 The effective type of an object for an access to its stored value is the declared type of the object, if any.87) If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy
or memmove
, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
6.5p7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:88)
Footnote 87) Allocated objects have no declared type.
Footnote 88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.
As Footnote 87 says, allocated objects (from malloc
, calloc
, and presumably any fresh space from realloc
) have no declared type, whereas objects with static, thread, or automatic storage durations have some declared type.
For the latter, 6.5p{6,7} say that the effective types are fixed and that their values can only be accessed by an lvalue that is similar ("compatible", modulo signedness and qualifiers), an aggregate or union containing such a type, or (to access its representation) a character type.
For the former, the effective type is determined by the type of the last write, or, if that is done by a memcpy
, memmove
, or user-code char array copy, the effective type of the source.
Several major systems software projects, including the Linux Kernel, the FreeBSD Kernel, and PostgreSQL (though not Apache) disable type-based alias analysis with the -fno-strict-aliasing
compiler flag.
The ISO text for effective types, 6.5p{6,7}, is all in terms of the types of the lvalues used for access. This seems not to match GCC behaviour, which involves the construction of the lvalues, not just the types of the lvalues as a whole.
Suppose we write a single member of a structure into a fresh allocated region, with an lvalue (e.g. s->m
) that involves the structure type but which as a whole has only the member type, then does
- (i) the footprint of the member take on an effective type as the type of that struct member, or
- (ii) the footprint of the member take on an effective type of the type of that structure member annotated as coming from that member of that structure type, or
- (iii) the footprint of the whole structure take on the structure type as its effective type?
Our reading of C11 and proposal for C2x: C11: no
Example: effective_type_1.c
#include <stdio.h>
#include <inttypes.h>
#include <assert.h>
void f(uint32_t *p1, float *p2) {
*p1 = 2;
*p2 = 3.0; // does this have defined behaviour?
printf("f: *p1 = %" PRIu32 "\n",*p1);
}
int main() {
assert(sizeof(uint32_t)==sizeof(float));
uint32_t i = 1;
uint32_t *p1 = &i;
float *p2;
p2 = (float *)p1;
f(p1, p2);
printf("i=%" PRIu32 " *p1=%" PRIu32
" *p2=%f\n",i,*p1,*p2);
}
With -fstrict-aliasing
(the default for GCC), GCC assumes in the body of f
that the write to *p2
cannot affect the value of *p1
, printing 2
(instead of the integer value of the representation of 3.0
that would the most recent write in a concrete semantics): while with -fno-strict-aliasing
(as used in the Linux kernel, among other places) it does not assume that. The former behaviour can be explained by regarding the program as having undefined behaviour, due to the write of the uint32_t
i
with a float*
lvalue.
We give another basic effective type example below, here just involving integer types and without the function call.
Example: effective_type_10.c
#include <stdio.h>
#include <stdint.h>
int main() {
int32_t x;
uint16_t y;
x = 0x44332211;
y = *(uint16_t *)&x; // defined behaviour?
printf("x=%i y=0x%x\n",x,y);
}
Our reading of C11 and proposal for C2x: C11: TODO (presumably yes...)
Example: effective_type_2c.c
#include <stdio.h>
typedef struct { int i; } st;
void f(st* sp, int* p) {
sp->i = 2;
*p = 3;
printf("f: sp->i=%i *p=%i\n",sp->i,*p); // prints 3,3 not 2,3 ?
}
int main() {
st s = {.i = 1};
st *sp = &s;
int *p = &(s.i);
f(sp, p);
printf("s.i=%i sp->i=%i *p=%i\n", s.i, sp->i, *p);
}
Our reading of C11 and proposal for C2x: C11: no
Similar compiler behaviour occurs with pointers to two distinct but isomorphic structure types:
Example: effective_type_2.c
#include <stdio.h>
typedef struct { int i1; } st1;
typedef struct { int i2; } st2;
void f(st1* s1p, st2* s2p) {
s1p->i1 = 2;
s2p->i2 = 3;
printf("f: s1p->i1 = %i\n",s1p->i1);
}
int main() {
st1 s = {.i1 = 1};
st1 * s1p = &s;
st2 * s2p;
s2p = (st2*)s1p;
f(s1p, s2p); // defined behaviour?
printf("s.i1=%i s1p->i1=%i s2p->i2=%i\n",
s.i1,s1p->i1,s2p->i2);
}
Our reading of C11 and proposal for C2x: C11: no
Example: effective_type_2b.c
#include <stdio.h>
#include <stdlib.h>
typedef struct { int i1; } st1;
typedef struct { int i2; } st2;
int main() {
void *p = malloc(sizeof(st1));
st1 *p1 = (st1 *)p;
*p1 = (st1){.i1 = 1};
st2 *p2 = (st2 *)p;
st2 s2 = *p2; // undefined behaviour?
printf("s2.i2=%i\n",s2.i2);
}
This is essentially Stephen Kell's 2018-08-20 Example 2, except with a final read instead of a write (the write would be a strong update, which I think is allowed even in ISO C11, and with that final read at the struct type. This test discriminates between a notion of effective type that only applies to the leaves, and one which takes struct/union types into account.
Example: effective_type_2d.c
#include <stdio.h>
#include <stdlib.h>
typedef struct { int i1; } st1;
typedef struct { int i2; } st2;
int main() {
void *p = malloc(sizeof(st1));
st1 *p1 = (st1 *)p;
*p1 = (st1){.i1 = 1};
st2 *p2 = (st2 *)p;
int *pi = &(p2->i2); // defined behaviour?
int i = *pi; // defined behaviour?
printf("i=%i\n",i);
}
This variation does a read via an lvalue merely at type int
, albeit with that lvalue constructed via a pointer of type st2 *
. This is perhaps more debatable, but in my (PS) reading of the standard there's nothing to forbid it.
See also Q80.
malloc
'd region) to hold values of other types?Our reading of C11 and proposal for C2x: C11: no (though this is used in practice)
(Question 11 of our survey relates to this)
A literal reading of the effective type rules prevents the use of an unsigned character array as a buffer to hold values of other types (as if it were an allocated region of storage). For example, the following has undefined behaviour due to a violation of 6.5p7 at the access to *fp
. (This reasoning presumes that the conversion of the (float * )c
cast gives a usable result - the conversion is permitted by 6.3.2.3p7 but the standard text only guarantees a roundtrip property.)
Example: effective_type_3.c
#include <stdio.h>
#include <stdalign.h>
int main() {
_Alignas(float) unsigned char c[sizeof(float)];
float *fp = (float *)c;
*fp=1.0; // does this have defined behaviour?
printf("*fp=%f\n",*fp);
}
In a future semantics we imagine there has to be some (local or global) mechanism to allow this.
Even bytewise copying of a value via such a buffer leads to unusable results in the standard:
Example: effective_type_4.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdalign.h>
int main() {
_Alignas(float) unsigned char c[sizeof(float)];
// c has effective type char array
float f=1.0;
memcpy((void*)c, (const void*)(&f), sizeof(float));
// c still has effective type char array
float *fp = (float *) malloc(sizeof(float));
// the malloc'd region initially has no effective type
memcpy((void*)fp, (const void*)c, sizeof(float));
// does the following have defined behaviour?
// (the ISO text says the malloc'd region has effective
// type unsigned char array, not float, and hence that
// the following read has undefined behaviour)
float g = *fp;
printf("g=%f\n",g);
}
This seems to be unsupportable for a systems programming language: a character array and malloc'd region should be interchangeably usable, either on-demand or by default.
GCC developers commented that they essentially ignore declared types in alias analysis because of this.
Our reading of C11 and proposal for C2x: C11: the text does not clearly specify (?)
Example: effective_type_5.c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
typedef struct { char c1; float f1; } st1;
int main() {
void *p = malloc(sizeof(st1)); assert (p != NULL);
st1 s1 = { .c1='A', .f1=1.0};
*((st1 *)p) = s1;
float *pf = &(((st1 *)p)->f1);
// is this free of undefined behaviour?
float f = *pf;
printf("f=%f\n",f);
}
Our reading of C11 and proposal for C2x: C11: yes (?)
In the first test below, the lvalues used for the member writes are constructed using char *
arithmetic, and then cast to the member types, while in the second, they are constructed from p
cast to a pointer to the struct type. The types of the lvalues used for the member writes are the same, so by the 6.5p{6,7} text this should make no difference, but it might be that a definition of effective types that makes current TBAA practice sound would need to take lvalue construction into account.
Example: effective_type_5b.c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
typedef struct { char c1; float f1; } st1;
int main() {
void *p = malloc(sizeof(st1)); assert (p != NULL);
char *pc = (char*)((char*)p + offsetof(st1, c1));
*pc = 'A';
float *pf = (float *)((char*)p + offsetof(st1,f1));
*pf = 1.0;
st1 *pst1 = (st1 *)p;
st1 s1;
s1 = *pst1; // is this free of undefined behaviour?
printf("s1.c1=%c s1.f1=%f\n", s1.c1, s1.f1);
}
Example: effective_type_5c.c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
typedef struct { char c1; float f1; } st1;
int main() {
void *p = malloc(sizeof(st1)); assert (p != NULL);
char *pc = &((*((st1*)p)).c1);
*pc = 'A';
float *pf = &((*((st1*)p)).f1);
*pf = 1.0;
st1 *pst1 = (st1 *)p;
st1 s1;
s1 = *pst1; // is this free of undefined behaviour?
printf("s1.c1=%c s1.f1=%f\n", s1.c1, s1.f1);
}
Our reading of C11 and proposal for C2x: C11: no
Example: effective_type_6.c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
int main() {
void *p = malloc(sizeof(float)); assert (p != NULL);
// is this free of undefined behaviour?
float f = *((float *)p);
printf("f=%f\n",f);
}
The effective type rules seem to deem this undefined behaviour.
Our reading of C11 and proposal for C2x: C11: no
Example: effective_type_7.c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
typedef struct { char c1; float f1; } st1;
int main() {
void *p = malloc(sizeof(st1)); assert (p != NULL);
((st1 *)p)->c1 = 'A';
// is this free of undefined behaviour?
float f = ((st1 *)p)->f1;
printf("f=%f\n",f);
}
If the write should be considered as affecting the effective type of the footprint of the entire structure, then it would change the answer to effective_type_5.c
here. It seems unlikely but not impossible that such an interpretation is desirable.
There is a defect report (which?) about copying part of a structure and effective types.
Our reading of C11 and proposal for C2x: C11: the text does not clearly specify
Example: effective_type_8.c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
typedef struct { char c1; float f1; } st1;
typedef struct { char c2; float f2; } st2;
int main() {
assert(sizeof(st1)==sizeof(st2));
assert(offsetof(st1,c1)==offsetof(st2,c2));
assert(offsetof(st1,f1)==offsetof(st2,f2));
void *p = malloc(sizeof(st1)); assert (p != NULL);
((st1 *)p)->c1 = 'A';
// is this free of undefined behaviour?
((st2 *)p)->f2 = 1.0;
printf("((st2 *)p)->f2=%f\n",((st2 *)p)->f2);
}
Again this is exploring the effective type of the footprint of the structure type used to form the lvalue.
Example: effective_type_9.c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
typedef struct { char c1; float f1; } st1;
typedef struct { char c2; float f2; } st2;
int main() {
assert(sizeof(st1)==sizeof(st2));
assert(offsetof(st1,c1)==offsetof(st2,c2));
assert(offsetof(st1,f1)==offsetof(st2,f2));
void *p = malloc(sizeof(st1)); assert (p != NULL);
st1 s1 = { .c1='A', .f1=1.0};
*((st1 *)p) = s1;
// is this free of undefined behaviour?
float f = ((st2 *)p)->f2;
printf("f=%f\n",f);
}
Example: effective_type_9b.c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
typedef struct { char c1; float f1; } st1;
typedef struct { char c2; float f2; } st2;
int main() {
assert(sizeof(st1)==sizeof(st2));
assert(offsetof(st1,c1)==offsetof(st2,c2));
assert(offsetof(st1,f1)==offsetof(st2,f2));
void *p = malloc(sizeof(st1)); assert (p != NULL);
char *pc = (char*)((char*)p + offsetof(st1, c1));
*pc = 'A';
float *pf = (float *)((char*)p + offsetof(st1,f1));
*pf = 1.0;
// is this free of undefined behaviour?
float f = ((st2 *)p)->f2;
printf("f=%f\n",f);
}
Here there is nothing specific to st1
or st2
about the initialisation writes, so the read of f
should be allowed.
Example: effective_type_9c.c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
typedef struct { char c1; float f1; } st1;
typedef struct { char c2; float f2; } st2;
int main() {
assert(sizeof(st1)==sizeof(st2));
assert(offsetof(st1,c1)==offsetof(st2,c2));
assert(offsetof(st1,f1)==offsetof(st2,f2));
void *p = malloc(sizeof(st1)); assert (p != NULL);
st1 *pst1 = (st1*)p;
pst1->c1 = 'A';
pst1->f1 = 1.0;
float f = ((st2 *)p)->f2; // is this free of undefined behaviour?
printf("f=%f\n",f);
}
Here the construction of the lvalues used to write the structure members involves st1
, but the lvalue types do not. The 6.5p{6,7} text is all in terms of the lvalue types, not their construction, so in our reading of C11 this is similarly allowed.
Our reading of C11 and proposal for C2x: C11: the text does not clearly specify
Robbert Krebbers asks on the GCC list https://gcc.gnu.org/ml/gcc/2015-03/msg00083.html
whether "GCC uses 6.5.16.1p3 of the C11 standard as a license to perform certain optimizations. If so, could anyone provide me an example program. In particular, I am interested about the 'then the overlap shall be exact' part of 6.5.16.1p3: If the value being stored in an object is read from another object that overlaps in any way the storage of the first object, then the overlap shall be exact and the two objects shall have qualified or unqualified versions of a compatible type; otherwise, the behavior is undefined." Richard Biener replies with this example (rewritten here to print the result), saying that it will be optimised to print 1 and that this is basically effective-type reasoning.
Example: krebbers_biener_1.c
#include <stdlib.h>
#include <assert.h>
#include <stdio.h>
struct X { int i; int j; };
int foo (struct X *p, struct X *q) {
// does this have defined behaviour?
q->j = 1;
p->i = 0;
return q->j;
}
int main() {
assert(sizeof(struct X) == 2 * sizeof(int));
unsigned char *p = malloc(3 * sizeof(int));
printf("%i\n", foo ((struct X*)(p + sizeof(int)),
(struct X*)p));
}