JTC1/SC22/WG14
N866
Document number: WG14 N866 (J11/99-001)
Title: Proposals for restricted pointer issues
Author: Bill Homer
Date: 10-Jan-99
URL: http://reality.sgi.com/homer_craypark/c9x/n866.txt
http://reality.sgi.com/homer_craypark/c9x/n866.html
1 ---------- Summary -------------------------------------------------
2
3 This document addresses four issues with the restrict qualifier,
4 three from PC-UK0118, and one a side effect of the recent change
5 to the specification for realloc.
6
7 It supersedes the changes proposed in:
8 SC22 N2794 Final CD Ballot for FCD 9899 Comment #9-4
9 and it supplies the details of the change proposed in:
10 SC22 N2794 Final CD Ballot for FCD 9899 Comment #9-5
11
12 The primary goal of these changes is to limit undefined behavior
13 to those cases in which it promotes optimization. This allows a
14 programmer to use the restrict qualifier in more contexts, but
15 does not impose additional burdens on a translator.
16
17 What follows is a brief description of the issues, then the
18 proposed changes to the specification that resolve these issues,
19 and finally some examples that illustrate the effects of the
20 changes.
21
22 ---------- Description of the issues -------------------------------
23
24 1. The FCD specification of restrict forbids aliasing of
25 unmodified objects. Doing so does not promote optimization,
26 and has other disadvantages, which are discussed in examples
27 A-E below. It is also contrary to the prior art in Fortran.
28
29 2. If a restricted pointer points to an object with const-qualified
30 type, the FCD specification allows casting away the const
31 qualifier, and modifying the object. Disallowing such
32 modifications promotes optimization as illustrated in example F
33 below.
34
35 3. The FCD specification does not address the effect of accessing
36 objects through pointers of various types, all based on one
37 restricted pointer, as in Example G below. In particular,
38 these objects are supposed to determine an array, but the
39 element type is not specified.
40
41 4. The specification of realloc now states that the old object is
42 freed, and a new object is allocated. The old and new objects
43 cannot, in general, be viewed as being members of an array of
44 such objects. With the FCD specification, this appears to
45 prohibit the use of the restrict qualifier for a pointer that
46 points to an object that is realloc'd. There are also related
47 issues for dynamically allocated linked lists. See examples
48 H-K below.
49
50 ---------- Proposed edits to FCD 6.7.3.1 ---------------------------
51
52 === Append to paragraph #1
53 the text marked with > at left:
54
55 [#1] Let D be a declaration of an ordinary identifier that provides a
56 means of designating an object P as a restrict-qualified pointer
57 > to objects of type T.
58
59
60 === Change paragraph #4 from
61 (text being changed marked with < at left):
62
63 [#4] < During each execution of B, let A be the array object that
64 < is determined dynamically by all references through pointer
65 < expressions based on P. Then all references to values of A
66 < shall be through pointer expressions based on P. Furthermore,
67 if P is assigned the value of a pointer expression E that is
68 based on another restricted pointer object P2, associated with
69 block B2, then either the execution of B2 shall begin before
70 the execution of B, or the execution of B2 shall end prior to
71 the assignment. If these requirements are not met, then the
72 behavior is undefined.
73
74 === to
75 (new text indicated by > at left):
76
77 [#4] > During each execution of B, let L be any lvalue that has &L based
78 > on P. If L is used to access the value of the object X that it
79 > designates, and X is also modified (by any means), then the
80 > following requirements apply. T shall not be const-qualified.
81 > Every other lvalue used to access the value of X shall also have
82 > its address based on P. Every access that modifies X shall be
83 > considered also to modify P, for the purposes of this section.
84 If P is assigned the value of a pointer expression E that is
85 based on another restricted pointer object P2, associated with
86 block B2, then either the execution of B2 shall begin before
87 the execution of B, or the execution of B2 shall end prior to
88 the assignment. If these requirements are not met, then the
89 behavior is undefined.
90
91 === Change example 1 from
92 (text being changed marked with < at left):
93
94 [#7] EXAMPLE 1 The file scope declarations
95
96 int * restrict a;
97 int * restrict b;
98 extern int c[];
99
100 < assert that if an object is accessed using the value of one
101 < of a, b, or c, then it is never accessed using the value of
102 < either of the other two.
103
104 === to
105 (modified text indicated by > at left):
106
107 [#7] EXAMPLE 1 The file scope declarations
108
109 int * restrict a;
110 int * restrict b;
111 extern int c[];
112
113 > assert that if an object is accessed using one of a, b, or c,
114 > and that object is modified anywhere in the program,
115 > then it is never accessed using either of the other two.
116
117 === Change example 3 from
118 (text being changed marked with < at left):
119
120 [#10] EXAMPLE 3 The function parameter declarations
121
122 < void h(int n, int * const restrict p,
123 < int * const q, int * const r)
124 {
125 int i;
126 for (i = 0; i < n; i++)
127 p[i] = q[i] + r[i];
128 }
129
130 < show how const can be used in conjunction with restrict.
131 < The const qualifiers imply, without the need to examine the
132 < body of h, that q and r cannot become based on p. The fact
133 < that p is restrict-qualified therefore implies that an
134 < object accessed through p is never accessed through either
135 < of q or r. This is the precise assertion required to
136 < optimize the loop. Note that a call of the form h(100, a,
137 < b, b) has defined behavior, which would not be true if all
138 < three of p, q, and r were restrict-qualified.
139
140 === to
141 (modified text indicated by > at left):
142
143 [#10] EXAMPLE 3 The function parameter declarations
144
145 > void h(int n, int * restrict p,
146 > int * restrict q, int * restrict r)
147 {
148 int i;
149 for (i = 0; i < n; i++)
150 p[i] = q[i] + r[i];
151 }
152
153 > illustrate how an unmodified object can be aliased through two
154 > restricted pointers. In particular, if a and b are disjoint
155 > arrays, a call of the form h(100, a, b, b) has defined behavior,
156 > because array b is not modified within function h.
157
158 ---------- End of proposed edits to FCD 6.7.3.1 --------------------
159
160 The following examples are included here to illustrate the benefits
161 of the proposed changes, but are not themselves proposed as
162 additions to the standard.
163
164 --------- Examples for Issue 1 -------------------------------------
165
166 Example A: restrict in standard library
167
168 extern in printf(const char * restrict format, ...);
169
170 printf("%s", "s");
171
172 Under C89, a translator could "pool" unmodifiable string
173 literals so that "s" refers to a subobject of "%s". Such
174 pooling is not allowed by the restrict qualifier on the
175 first parameter of printf with the FCD specification of
176 restrict, but is allowed with the proposed specification
177 (because there are no requirements for unmodified objects).
178
179 Example B: restricted pointer structure members
180
181 typedef struct { int n; double * restrict v; } vector;
182
183 void addB(vector a, vector b, vector c)
184 {
185 int i;
186 for(i=0; i<a.n; i++)
187 a.v[i] = b.v[i] + c.v[i];
188 }
189
190 Allowing calls of the form addB(x,y,y) does not inhibit
191 optimization of the loop, but the behavior of such calls is
192 undefined under the FCD specification. (This is particularly
193 unfortunate because there is no convenient way to rewrite the
194 function and eliminate the qualifier from the second and third
195 arguments of addB.) In contrast, such calls are defined under
196 the proposed specification. In particular, the objects that
197 are modified in this function are those designated by a.v[i],
198 and &(a.v[i]), or a.v+i, is based on the restricted pointer a.v.
199 Therefore the only requirement imposed by the proposed
200 specification is that all lvalues used to access a.v[i]
201 during an execution of addB shall have addresses based on a.v.
202
203 Example C: two levels of indirection
204
205 void addC(vector * restrict p, vector * restrict q)
206 {
207 int i;
208 for(i=1; i<p->n; i++)
209 p->v[i] += q->v[i-1];
210 }
211
212 Calls of the form addC(&x, &x) should be undefined to promote
213 optimization of the loop, and they are undefined under both the
214 FCD and the proposed specifications. The analysis for the
215 proposed specification is as follows. Let X be p->v[i], and
216 let B be the block of addC. Then for a call addC(&x, &x) with
217 x.n>1, &(p->v[i]) is based on the restricted pointer p->v, and
218 so the modification of p->v[i] is considered also to modify
219 p->v. Recursively, &(p->v) is based on the restricted pointer
220 p, and the same object is also accessed through the lvalue q->v
221 (because p and q have the same value), but &(q->v) is not based
222 on p, so the behavior is undefined.
223
224 Example D: array arguments with unmodified overlap
225
226 void addD(int n, int * restrict x, int * restrict y)
227 {
228 int i;
229 for(i=0; i<n; i++) {
230 x[i] += x[i+n];
231 y[i+2*n] += y[i+n];
232 }
233 }
234
235 If z is an array of 300 int's, then for calls of the form
236 addD(100,z,z), the last half of the array accessed through x
237 coincides with the first half of the array accessed through y,
238 but these common elements are not modified. The FCD
239 specification needlessly gives such calls undefined behavior,
240 but the proposed specification gives them defined behavior.
241
242 Example E: comparison with "property M" of
243 SC22 N2794 Final CD Ballot for FCD 9899 Comment #9-4
244
245 typedef struct {
246 int * restrict p;
247 int * restrict q;
248 } T;
249
250 int f(T * restrict a, T * restrict b)
251 {
252 *a->p += 1;
253 return *b->q;
254 }
255
256 In this example, even if *a and *b happen to refer to the same
257 object of type T, the members a->p and b->q will be distinct
258 restricted pointers. A translator can infer that *a->p and
259 *b->q are distinct objects, and so there is no potential
260 aliasing to inhibit optimization of the body of f().
261
262 Note that this means that the restrict qualifiers in the
263 declarations of parameters a and b are not needed to promote
264 optimization of f, and so the question in this example is
265 whether they "do harm" by giving calls of the form f(x, x)
266 undefined behavior. The answer is clearly yes for the FCD
267 specification.
268
269 The answer is also yes for the specification proposed in
270 SC22 N2794 Final CD Ballot for FCD 9899 Comment #9-4. In
271 particular, *a has property M because it contains a subobject,
272 (*a).p, or a->p, that is a restricted pointer with a target,
273 *a->p, that is modified. This requires that every lvalue used
274 to access *a must have its address based on a. But in a call
275 of the form f(x, x), *b will designate the same object as *a,
276 and b is not based on a.
277
278 In contrast, the answer is no for the proposed specification.
279 In particular, during an execution of f, the only object that
280 is actually modified is *a->p. Now &(*a->p) is a->p, a
281 restricted pointer, so a->p is also considered to be modified.
282 Recursively, &(a->p) is based on a, another restricted pointer.
283 Because no other lvalues are used to access either *a->p or
284 a->p, all the requirements of the proposed specification are
285 met (and they do not involve b).
286
287 For this example, the proposed specification can be said to
288 be more precise that the one based on property M because its
289 "bottom up" recursion better limits the extent of objects
290 considered to be modified as a side effect of modifying the
291 target of a restricted pointer.
292
293 --------- Examples for Issue 2 -------------------------------------
294
295 Example F: asserting unmodifiability
296
297 extern void g();
298
299 T f(const T * restrict p)
300 {
301 g(p);
302 return *p;
303 }
304
305 Under the proposed specification, but not under the FCD
306 specification, the use of const and restrict in this example
307 implies that function g only accesses the value of *p, and does
308 not modify it. On some architectures, this would allow the
309 load from *p to be initiated prior to the function call, so
310 that its value would be available sooner both within that call
311 and for the return value. The potential benefit could be large
312 if T is a large aggregate type.
313
314 --------- Example for Issue 3 --------------------------------------
315
316 Example G: casting restricted pointer-to-double to pointer-to-char
317
318 The following example from PC-UK0118 illustrated uncertainty
319 about how to determine the type and extent of the array
320 accessed through a restricted pointer, under the FCD
321 specification.
322
323 void fred (double restrict *a, double restrict *b) {
324 /* Assume sizeof(double) > 1 */
325 ((char *)b)[sizeof(double)+2] =
326 ((char *)a)[sizeof(double)+1];
327 a[0] = b[2];
328 }
329
330 double array[3];
331 fred(array,array);
332
333 Under the proposed specification, the requirements implied by
334 the restrict qualifiers apply separately to each modified
335 object, and it follows that this example has defined behavior.
336 In particular, the 1-byte object ((char *)b)[sizeof(double)+2]
337 is accessed only through b, and the disjoint object a[0] is
338 accessed only through a.
339
340 Here it is assumed that the modification of the 1-byte object
341 ((char *)b)[sizeof(double)+2] is not considered to be a
342 modification of the containing object b[1]. By the way, the
343 same issue arises if the expressions appear as operands of
344 a comma-operator:
345
346 ((char *)b)[sizeof(double)+1]++,
347 ((char *)b)[sizeof(double)+2]++
348
349 Thus the extent of the object modified by a particular
350 operation is an issue that is important for the correct
351 interpretation of the restrict qualifier, but is separate from
352 its specification.
353
354 --------- Examples for Issue 4 -------------------------------------
355
356 Example H: realloc of a restricted pointer
357
358 void f(T * restrict q) {
359 T * restrict p = (T *)malloc(sizeof(T));
360 p[0] = q[0];
361 ...
362 p = (T *)realloc(p, 2*sizeof(T));
363 p[1] = p[0] + q[1];
364 ...
365 }
366
367 With recent changes to the specification of realloc, the two
368 instances of the lvalue p[0] refer to two different objects
369 (with the value copied from the old object to the new object by
370 realloc). Because there is no guarantee that there is an array
371 containing both these objects, this use of restrict does not
372 conform to the FCD specification. It does conform to the
373 proposed specification, which does not require the existence of
374 such an array.
375
376 Example I: disjoint linked lists
377
378 typedef struct item_t {
379 double d;
380 struct item_t * next;
381 } item;
382
383 void f(item * restrict list1, item * restrict list2)
384 {
385 while(list1 != 0 && list2 != 0) {
386 list1->d += list2->d;
387 list1 = list1->next;
388 list2 = list2->next;
389 }
390 }
391
392 This use of restrict does not conform to the FCD specification
393 unless each list is contained within an array. It does conform
394 to the proposed specification, if the items that are modified
395 through list1 are disjoint from the items that are accessed
396 through list2. This allows the translator greater flexibility
397 in scheduling the loads and stores of the d members of the
398 items.
399
400 Example J: disjoint linked lists, with a recursion
401
402 void g(item * list1, item * restrict list2)
403 {
404 if(list1 != 0) {
405 while(list1->next != 0 && list2 != 0) {
406 list1->next->d += list1->d + list2->d;
407 list1 = list1->next;
408 list2 = list2->next;
409 }
410 }
411 }
412
413 In this example, the modified objects, list1->next->d,
414 are accessed though the 'next' pointers in the first list.
415 The values of these next pointers are not based on either
416 list1 or list2. This implies that
417 1) list1 could not be restrict-qualified, because the modified
418 objects are also accessed through list1->d, and
419 &(list1->d) is based on list1.
420 2) list2->d is disjoint from the modified objects, because
421 &(list2->d) is based on the restrict-qualified pointer
422 list2.
423
424 Example K: even/odd index values
425
426 void f(int n, char * restrict p, char * restrict q)
427 {
428 int i;
429 for(i=0; i<n; i+=2) {
430 ++p[i];
431 ++q[i];
432 }
433 }
434
435 With the FCD specification, a call of the form f(100, a, a+1)
436 has undefined behavior, but under the proposed specification,
437 it does not.
438
439 There is an objection that can be raised to the change for this
440 example. Although there are no dependencies in the loop caused
441 by aliasing, there may be "architectural" dependencies that
442 limit the ways in which the loads and stores of p[i] and q[i]
443 can be reordered. In particular, p[i] and q[i] may both be
444 contained within the same smallest loadable unit of storage (on
445 systems that cannot load individual bytes), or within the same
446 cache line (a concern for parallel execution on multiprocessor
447 systems). The FCD semantics do have the advantage of limiting
448 potential sharing to the ends of the arrays being accessed.
449
450 On balance, the advantage of the FCD specification in this
451 example, which affects only some systems, seems outweighed by
452 the advantages of the proposed specification in all the
453 previous examples.