Document: N1111
Date: 09-Mar-05

`An Analysis of the Issue Raised by DR 236`

`Introduction`

This is a revision to N1090 in the post-Redmond mailing.

It is in general expensive to do pointer analysis in a program. A pointer can point to different address locations at different time; and two pointers can be used to access overlapping objects. This limits the amount of code motion an optimizer can safely do, and in turn limits optimization opportunities. On the other hand, if we restrict pointer usage, the language may not express the programming logic as fluently as it should. The C aliasing model attempts to strike a balance between expressiveness of the language and ease of optimization. Note that good optimization in the end also benefits the program.

The C aliasing model is mainly defined by 6.5 paragraph 7 (and paragraph 6). In essence, it limits the address locations that can be pointed to by a pointer -- that is, with a couple of exceptions, a pointer can only be deferenced to access objects with the same type. (Sign and type qualifiers are ignored in this consideration.) The exceptions to this rule are provided to make the language more usable. However, as pointed out by DR 236, these provisions make the model ineffective in providing optimization information.

`The issue raised by DR 236`

The "exception" referred to in the above is basically bullet 7 of 6.5p7. This bullet allows a union to be referenced via both the lvalue of the union type and lvalues of the member types. For example:

union U {
int a;
float b;
} u;

The memory occupied by u can be accessed using an lvalue of type union U, int or float. Putting it in another way, a U pointer and an int pointer can point to the same address; likewise for a U pointer and a float pointer. Both of these are fine as the optimizer knows about this aliasing possibility through the union's definition. However, due to transitivity, an int pointer and a float pointer can now point to the same (or overlapping) memory locations. DR 236 suggested this was not the intention of the aliasing model.

int *p = &u.a
float *q = &u.b;
...
*p=1;
++*p;
*q=2.0f; // Can the optimizer assume q and p point to different locations ?
++*q;

The code separated by "..." can be in different parts of the program. They can be in different CUs.

The intent of the aliasing model is to provide sufficient information to the optimizer so that it can make assertions about a pointer's value by just looking locally at the location where the pointer is used (basing on the pointer's type) without going through extensive analysis. However, the code samples in DR 236 show that 6.5p7 does not provide sufficient restrictions for the optimizer to make such assertions.

`Intention of the alising model`

As a first step in resolving this DR, let us reexamine the intention of the C aliasing model, and the balance it attempts to strike between helping the programmer and the optimizer. We need to identify those cases that are supposed to be allowed and those that are not. For the purpose of this discussion, we will introduce rules as though they are added to the Standard. These rules are not meant to be the ultimate resolution of this DR nor meant to be suggested text to the Standard. (Further discussion is needed to come up with those text.) It is meant to help understanding the issue.

6.5p7 describes the aliasing restriction by using lvalues. If P is a pointer, the lvalue of the object pointed to by P is *P. 6.5p7 specifies the group of objects accessible by *P. This approach attacks the problem at the location where the pointer is being used. However, as alluded to in the above discussion, the problem may lie at the location where the pointer gets its value -- the pointers point to overlapping address locations. Talking about P itself directly may offer some hints of what is causing the problem. So let us try to rewrite the rules in 6.5p7 in terms of pointers. (Ignore the implication of effective type for the time being.)

Let P, Q and R be pointer objects.

a/ The basic restriction we want to impose is that a pointer P can only point to objects with types compatible to *P, ignoring sign and type qualifiers. This covers bullets 1,2 and 3 in 6.5p7.

b/ A character pointer can point to any address locations. This covers bullet 6.

c/ Now add aggregates and union types. If *P is an aggregate or union type, and *Q a member type, then P and Q can point to overlapping memory locations. This covers bullet 5.

A side effect of point c is that if *R is another member type different from *Q, R and Q can point to overlapping memory locations. This happens when a union type is involved. This is the crust of the problem. DR 236 suggests that this is not the original intend of the aliasing model. We need to find a way to restrict this transitive side effect.

Note: A pointer can gets an address but never used in any accesses. Such pointer need not be restricted by the above rules. 6.5p7 talks about lvalues, which avoids this issue altogether.

The dilemma we face in trying to resolve this DR is whether point a or c above should take precedence. If point a takes precedence, then the code sample in the previous section becomes invalid (regardless of whether the pieces separated by the "..." are in the same CU or not). If point c takes precedence, then the information provided by point a (which is the major information relied on by optimizers) becomes useless. This is an all or nothing choice.

`Where should we draw the line`

Discussions within the committee has been trying to find a compromise between helping the optimizer and helping the programmer. Proposals in N980 and N1090 were attempts along this line. However, it is perhaps fair to say that the proposals we have so far are not entirely satisfactory. For example, one issue with the suggestion in N1090 (i.e. addition of 6.5.5.2p10a) is that when optimizers do inlining, the aliasing restriction imposed at function boundary would be broken. (Refer to http://gcc.gnu.org/ml/gcc/2004-12/msg00164.html, and msg00166.html)

If the analysis in the previous section is correct, the problem cannot be fixed by rules on lvalues alone. This is because the issue is not caused by the relationship between an object and its lvalue, but between the object (its type) and the piece of memory it occupies. In cases where only one type of object can occupy a particular piece of memory, specifying the aliasing rules via lvalues creates no ambiguity. But when different object types can occupy the same memory location, the one-to-one correspondence between lvalues and memory locations is gone. If P and Q are pointers, by just telling the optimizer that *P and *Q are different lvalue types provides little information.

If we accept that the three examples in N1090 (repeated below) are invalid, then the following variation of the suggestion in N1090 might be considered.

A type alias set is a set of types. The type alias set of a pointer P to a non-aggregate non-union type contains the type of *P, together with all the sign and type qualifier variations. The type alias set of a pointer P to an aggregate or union type contains the type of *P, together with all the type qualifier variations, and all member types, taken recursively for subaggregate or contained union, with all type qualifier and sign variations (if applicable). The type alias set of a pointer to char is the set of all types.

Two type alias set are distinct if neither is a subset of the other. If two pointers have distinct type alias set, they shall point to non-overlapping address locations. Note that this effectively fixed the type of allocated storage to the one that is used in the first access. A union type would be needed if the allocated storage is used as a buffer for different object types.

Below repeats the three code examples in N1090.

Example 1

The following code is invalid because p and q have distinct type alias set but point to the same address.

union U {
int a;
float b;
};

void foo(union U *v) {
int *p = &v->a; // v is used
float *q = &v->b;
...
*p=1;
++*p;
*q=2.0f;
++*q;
}

void main() {
union U u = {0};
foo(&u);
}

Example 2

The following is invalid for the same reason as above. p and q cannot point to the same address location.

union U {
int a;
float b;
};

void foo(int *p, float *q) { // p & q have distinct type alias set but point to the same address
*p=1;
++*p;
*q=2.0f;
++*q;
}

void main() {
union U u = {0};
foo(&u.a, &u.b);
}

Example 3

The following is invalid. It would be valid if p is removed and the code is rewritten as in example 4 below.

union U {
int a;
float b;
};

void foo(union U *v, float *q) { // q's type alias set is a subset of v's.
int *p = &v->a;
...
*p=1;
++*p;
*q=2.0f;
++*q;
}

void main() {
union U u = {0};
foo(&u, &u.b);
}

Example 4

The following is valid. q's type alias set is a subset of v's. They can
point to overlapping address location.

union U {
int a;
float b;
};

void foo(union U *v, float *q) { // q's type alias set is a subset of v's.
v->a =1;
v->a +=1;
*q=2.0f;
++*q;
}

void main() {
union U u = {0};
foo(&u, &u.b);
}

`Concluding Remarks`

Comparing with N980 and N1090, the proposal in this paper provides the most information to the optimizer. Arguably this is also the easiest to understand from a programming perspective.