| Doc. No.: | WG21/N2052 J16/06-0122 | |
|---|---|---|
| Date: | 2006-09-07 | |
| Reply to: | Clark Nelson | Hans-J. Boehm | 
| Phone: | +1-503-712-8433 | +1-650-857-3406 | 
| Email: | clark.nelson@intel.com | Hans.Boehm@hp.com | 
This paper is a successor to, but not a revision of, N1944. N1944 was basically an exploratory paper, despite the amount of nearly-WD-ready text proposed; its style of presentation was very heavy on explanation and motivation. Consequently, it is certain to be useful as a tutorial introduction and/or rationale for this paper.
But based on the amount of positive feedback received, the exploratory phase could hopefully be considered complete. Furthermore, some of the feedback received would have been difficult to address in a document organized as N1944 was. It now seems highly desirable to have a cohesive presentation of the changed WD text, emphasizing the result rather than the process. This paper also presents work on aspects of sequencing explicitly related to concurrency, addressing other feedback on N1944.
This paper should also be viewed as a successor to N1942, the memory model proposal. Again, much of the explanatory material from N1942 is not repeated here. In an attempt to simplify, some of the terminology has changed from N1942.
The WD text proposed in N1944 introduced ambiguity in the use of the term "evaluation". Most new uses of that term were intended to reflect usage in mathematics, as in the computation of a value, without side effects. This usage is inconsistent with C/C++ tradition, and the way the term is used in the standard. So when it is necessary to talk about evaluations that do not have side effects, the term "value computation" is now used.
There is a new paragraph defining and explaining the "sequenced before" relation; see 1.9p14.
To reflect the consensus from the discussion in Berlin, a note has been added clearly stating that there is no requirement of consistency for operations whose sequencing is not constrained; see 1.9p16.
The statement of the "no interleaving" rule for functions has been updated; see 1.9p17. Also, an example has been added pointing out a possibly-surprising interpretation of "unspecified behavior".
Resolutions are proposed for several questions raised but not answered in N1944, mostly in the section "Fixes for miscellaneous sequencing issues".
The changes proposed in N1944 were mainly in section 1.9 (Program execution) and various locations in clause 5 (Expressions), plus a couple of spots in clause 12 (Special member functions). The "undefined behavior" rule, a key paragraph in the understanding of sequencing, which basically describes what may be called an "intra-thread data race", is currently in 5p4, which is widely separated from the bulk of the discussion of the principles of sequencing in 1.9. Furthermore, it would seem logical to describe concurrency — and particularly inter-thread data races — in a new section building on and immediately following 1.9. Therefore we propose to move the "undefined behavior" rule from 5p4 to 1.9.
Within 1.9 with the changes proposed in N1944, the bulk of the discussion of sequencing is in p15-16. Paragraph 8, which currently contains the "no overlap" rule for function execution, should be merged into p16, which discusses many other sequencing constraints on function calls. And if, as proposed, the references to sequence points and evaluation are removed from p11 (the "least requirements"), then the definitions in p7 are not needed until p15; moving paragraph 7 down would result in a more cohesive presentation.
Finally, it could be argued that cohesiveness would be increased still further by moving the discussion of reassociation (concerning implications of the "as-if" rule) to immediately follow the "least requirements" (which is basically the normative statement of the "as-if" rule), instead of showing up in the middle of the discussion of expressions and sequencing.
This table shows the proposed shifting of content, assuming regular paragraph (re-)numbering. The letters in the central columns are just tags, intended to illustrate how text moves around (in lieu of arrows): the tag stays with the content.
| Paragraph number | Old content | Old tag | New tag | New content |
|---|---|---|---|---|
| 1.9p7 | Definitions of "side effect", "sequence point" | A | C | Effect of asynchronous signal | 
| 1.9p8 | "No overlap" rule for function execution | B | C | Allocation of automatic objects | 
| 1.9p9 | Effect of asynchronous signal | C | C | The "least requirements" | 
| 1.9p10 | Allocation of automatic objects | C | E | Note concerning reassociation | 
| 1.9p11 | The "least requirements" | C | D | Definition of "full-expression" | 
| 1.9p12 | Definition of "full-expression" | D | D | Note concerning default arguments | 
| 1.9p13 | Note concerning default arguments | D | A | Definition of "side effect", "evaluation" | 
| 1.9p14 | Note concerning reassociation | E | [new] | Definition of "sequenced before" | 
| 1.9p15 | Sequencing between full-expressions | F | F | Sequencing between full-expressions | 
| 1.9p16 | Sequencing constraints on function calls | G | 5p4 | The "undefined behavior" rule | 
| 1.9p17 | Operators that impose a sequence point | [delete] | G+B | Sequencing constraints on function calls, including the "no overlap" rule | 
So here is the proposed reading of section 1.9, beginning with p6 (just for the sake of context). Each paragraph is introduced with its proposed paragraph number, and an explanation of its source. Text from the current working draft to be replaced or deleted is stricken through. Replacement or added text is underlined. Footnotes are presented here in the same style as examples and notes. If the introductory paragraphs and stricken text were deleted, the result would be a longish block of consecutive paragraphs, as proposed for the standard.
1.9p6 (unchanged):
The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions. An implementation can offer additional library I/O functions as an extension. [ Footnote: Implementations that do so should treat calls to those functions as "observable behavior" as well. —end footnote ]
1.9p7 (unchanged from the current p9, except for the addition of an omitted word):
When the processing of the abstract machine is interrupted by receipt of a signal, the values of objects with type other than volatile std::sig_atomic_t are unspecified, and the value of any object not of type volatile std::sig_atomic_t that is modified by the handler becomes undefined.
1.9p8 (unchanged from the current p10):
An instance of each object with automatic storage duration (3.7.2) is associated with each entry into its block. Such an object exists and retains its last-stored value during the execution of the block and while the block is suspended (by a call of a function or receipt of a signal).
1.9p9 (original text from p11):
The least requirements on a conforming implementation are:
- At sequence points, volatile objects are stable in the sense that previous evaluations are complete and subsequent evaluations have not yet occurred. Accesses to volatile objects are initiated strictly according to the rules of the abstract machine.
- At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.
- The input and output dynamics of interactive devices shall take place in such a fashion that prompting messages actually appear prior to a program waiting for input. What constitutes an interactive device is implementation-defined.
[ Note: more stringent correspondences between abstract and actual semantics may be defined by each implementation. —end note ]
1.9p10 (unchanged from p14):
[ Note: operators can be regrouped according to the usual mathematical rules only where the operators really are associative or commutative.11) For example, in the following fragment
[unchanged text omitted]
However on a machine in which overflows do not produce an exception and in which the results of overflows are reversible, the above expression statement can be rewritten by the implementation in any of the above ways because the same result will occur. —end note ]
1.9p11 (original text from p12):
A full-expression is an expression that is not a subexpression of another expression. If a language construct is defined to produce an implicit call of a function, a use of the language construct is considered to be an expression for the purposes of this definition. A call to a destructor generated at the end of the lifetime of an object other than a temporary object is an implicit full-expression. Conversions applied to the result of an expression in order to satisfy the requirements of the language construct in which the expression appears are also considered to be part of the full-expression. [ Example:
[unchanged example omitted]
1.9p12 (unchanged from p13):
[ Note: the evaluation of a full-expression can include the evaluation of subexpressions that are not lexically part of the full-expression. For example, subexpressions involved in evaluating default argument expressions (8.3.6) are considered to be created in the expression that calls the function, not the expression that defines the default argument. —end note ]
1.9p13 (original text from p7):
Accessing an object designated by a volatile lvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression might produce side effects. Evaluation of an expression (or sub-expression) in general includes both value computations (including fetching a value previously assigned to an object) and initiation of side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. [ Footnote: Note that some aspects of sequencing in the abstract machine are unspecified; the preceding restriction upon side effects applies to that particular execution sequence in which the actual code is generated. Also note that when a call to a library I/O function returns, the side effect is considered complete, even though some external actions implied by the call (such as the I/O itself) may not have completed yet. —end footnote ]
1.9p14 (new):
"Sequenced before" is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread, which induces a partial order among those evaluations. Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B. If A is not sequenced before B and B is not sequenced before A, then A and B are unsequenced. [ Note: The execution of unsequenced evaluations can overlap. —end note ] When A and B are indeterminately sequenced, then either A is sequenced before B, or B is sequenced before A, but which is unspecified. [ Note: Indeterminately sequenced evaluations can not overlap, but either could be executed first. —end note ]
1.9p15 (original text from p15):
There is a sequence point at the completion of evaluation of each full-expression. Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated. [ Footnote: As specified in 12.2, after the "end-of-full-expression" sequence point after a full-expression is evaluated, a sequence of zero or more invocations of destructor functions for temporary objects takes place, usually in reverse order of the construction of each temporary object. —end footnote ]
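As an informal illustration (not proposed WD text), the following sketch shows how the "sequenced before" relation and the full-expression rule might be observed in practice. The names `trace`, `first`, `second` and `run` are hypothetical and exist only for this example.

```cpp
#include <string>

// Hypothetical trace buffer; each call appends one letter when it runs.
static std::string trace;

static int first()  { trace += 'a'; return 1; }
static int second() { trace += 'b'; return 2; }

// Returns the observed call order for one evaluation of first() + second().
std::string run() {
    trace.clear();                 // this full-expression is sequenced before the next one
    int sum = first() + second();  // the two calls are indeterminately sequenced
    (void)sum;
    return trace;                  // "ab" or "ba", never an interleaving of the two bodies
}
```

Either call order is permitted, so a caller of `run()` should accept both `"ab"` and `"ba"`; what the rules exclude is any overlap between the two function bodies.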
1.9p16 (original text from clause 5 paragraph 4):
Except where noted, the order of evaluation evaluations of operands of individual operators, and of subexpressions of individual expressions, and the order in which side effects take place, is unspecified are unsequenced. [ Footnote: The precedence of operators is not directly specified, but it can be derived from the syntax. —end footnote ] [ Note: In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations. —end note ] Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined. If a side effect on a scalar object is not sequenced relative to either a different side effect on the same scalar object, or a value computation using the value of the same scalar object, the behavior is undefined. [ Example:
    i = v[i++];       // the behavior is undefined
    i = 7, i++, i++;  // i becomes 9
    i = ++i + 1;      // the behavior is undefined
    i = i + 1;        // the value of i is incremented
—end example ]
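For readers who want to see how the undefined cases can be avoided, here is a minimal sketch that splits each problematic expression into separate full-expressions. The function names are hypothetical, and the rewrites assume one plausible intent for each original expression; they are not part of the proposed wording.

```cpp
#include <vector>

// One possible well-defined counterpart of `i = v[i++];`, assuming the intent
// is "store the element at the old index into i".
int read_element(const std::vector<int>& v, int i) {
    int element = v[i];  // the read of i completes in its own full-expression
    i = element;         // a single side effect on i
    return i;
}

// One possible well-defined counterpart of `i = ++i + 1;`.
int add_two(int i) {
    ++i;        // first side effect, sequenced before the next full-expression
    i = i + 1;  // second side effect
    return i;
}

// The comma operator sequences its operands, matching the second line above.
int comma_chain() {
    int i = 0;
    i = 7, i++, i++;  // parsed as ((i = 7), i++), i++
    return i;         // 9
}
```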
1.9p17 (original text is p16 with p8 inserted):
When calling a function (whether or not the function is inline), there is a sequence point after the evaluation of all function arguments (if any) which takes place every value computation and side effect associated with with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of any expressions or statements expression or statement in the body of the called function body. [ Note: Value computations and side effects associated with different argument expressions are unsequenced. —end note ] There is also a sequence point after the copying of a returned value and before the execution of any expressions outside the function. [ Footnote: The sequence point at the function return is not explicitly specified in ISO C, and can be considered redundant with sequence points at full-expressions, but the extra clarity is important in C++. In C++, there are more ways in which a called function can terminate its execution, such as the throw of an exception. —end footnote ] Once the execution of a function begins, no expressions from the calling function are evaluated until execution of the called function has completed. Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function. [ Footnote: In other words, function executions do not "interleave" with each other. —end footnote ] Several contexts in C++ cause evaluation of a function call, even though no corresponding function call syntax appears in the translation unit. [ Example: evaluation of a new expression invokes one or more allocation and constructor functions; see 5.3.4. For another example, invocation of a conversion function (12.3.2) can arise in contexts in which no function call syntax appears. 
—end example ] The sequence points at function-entry and function-exit sequencing constraints on the execution of the called function (as described above) are features of the function calls as evaluated, whatever the syntax of the expression that calls the function might be. [ Example:
    int increment_x() { x++; }
    x++ + increment_x();            // Evaluation order unspecified; x may be incremented only once
    increment_x() + increment_x();  // x is incremented twice
—end example ]
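The "no interleaving" rule can be sketched with a compilable variant of the last line of the example. This is an illustration only: the global `x` and the helpers `call_twice` and `final_x` are assumptions, and unlike the fragment above, this `increment_x` returns the updated value so that the sum is observable.

```cpp
static int x = 0;  // hypothetical global

// Returns the value of x after the increment.
static int increment_x() { return ++x; }

int call_twice() {
    x = 0;
    // The two calls are indeterminately sequenced: either may run first,
    // but one body runs to completion before the other begins.
    int sum = increment_x() + increment_x();
    return sum;  // 1 + 2 in either order
}

int final_x() { return x; }
```

Because the function bodies cannot overlap, both increments take effect whichever call runs first.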
Deleted as redundant with descriptions of operators (original text from p17):
In the evaluation of each of the expressions
    a && b
    a || b
    a ? b : c
    a , b
using the built-in meaning of the operators in these expressions (5.14, 5.15, 5.16, 5.18), there is a sequence point after the evaluation of the first expression. [ Footnote: The operators indicated in this paragraph are the built-in operators, as described in clause 5. When one of these operators is overloaded (clause 13) in a valid context, thus designating a user-defined operator function, the expression designates a function invocation, and the operands form an argument list, without an implied sequence point between them. —end footnote ]
New paragraphs inserted as 1.7p3 et seq.:
A memory location is either an object of scalar type, or a maximal sequence of adjacent bit-fields all having non-zero width. Two threads of execution can update and access separate memory locations without interfering with each other.
[Note: Thus a bit-field and an adjacent non-bit-field are in separate memory locations, and therefore can be concurrently updated by two threads of execution without interference. The same applies to two bit-fields, if one is declared inside a nested struct declaration and the other is not, or if the two are separated by a zero-length bit-field declaration, or if they are separated by a non-bit-field declaration. It is not safe to concurrently update two bit-fields in the same struct if all fields between them are also bit-fields, no matter what the sizes of those intervening bit-fields happen to be.]
[Example: A structure declared as
    struct {char a; int b:5, c:11, :0, d:8; struct {int ee:8;} e;}
contains four separate memory locations: The field a and the bit-fields d and e.ee are each separate memory locations, and can be modified concurrently without interfering with each other. The bit-fields b and c together constitute the fourth memory location. The bit-fields b and c cannot be concurrently modified, but b and a, for example, can be. --end example.]
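The following sketch shows what the memory-location rule is intended to permit. It assumes a thread library along the lines of `std::thread` from later C++ standards, which this paper does not itself propose; the type name `S` and the function `update_concurrently` are likewise hypothetical.

```cpp
#include <thread>

// Named version of the structure from the example (the name S is an assumption).
struct S {
    char a;
    int b : 5, c : 11, : 0, d : 8;
    struct { int ee : 8; } e;
};

// Updates two distinct memory locations (a and d) from two threads.
S update_concurrently() {
    S s = {};
    std::thread t1([&s] { s.a = 'x'; });  // memory location: a
    std::thread t2([&s] { s.d = 42; });   // memory location: d (split from b, c by :0)
    t1.join();
    t2.join();
    return s;
}
```

Replacing the write to `d` with a write to `c` while another thread writes `b` would, by contrast, touch a single memory location from two threads and so would not be safe.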
Insert a new section between 1.9 and 1.10, titled "Multi-threaded executions and data races".
1.10p1:
Under a hosted implementation, a C++ program can have more than one thread of execution (a.k.a. thread) running concurrently. Each thread executes a single function according to the rules expressed in this standard. The execution of the entire program consists of an interleaved execution of all of its threads. Under a freestanding implementation, it is implementation-defined whether a program can have more than one thread of execution.
1.10p2:
The execution of each thread proceeds as defined by the remainder of this standard. The value of an object visible to a thread T at a particular point might be the initial value of the object, a value assigned to the object by T, or a value assigned to the object by another thread, according to the rules below.
1.10p3:
Two expression evaluations conflict if one of them modifies a memory location and the other one accesses or modifies the same memory location.
1.10p4:
The library defines a number of operations, such as operations on locks, that are specially identified as synchronization operations. These operations play a special role in making assignments in one thread visible to another. A synchronization operation is either an acquire operation or a release operation, or both, on one or more memory locations. [Note: For example, a call that acquires a lock will perform an acquire operation on the locations comprising the lock. Correspondingly, a call that releases the same lock will perform a release operation on those same locations. Informally, performing a release operation on A forces prior side effects on other memory locations to become visible to other threads that later perform an acquire operation on A.]
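As a rough illustration of lock operations acting as acquire and release operations, the sketch below uses `std::mutex`, `std::lock_guard` and `std::thread` from later C++ standards. Those library names are assumptions relative to this paper, which leaves the thread API to a separate proposal; `locked_count` is a hypothetical helper.

```cpp
#include <mutex>
#include <thread>

// Two threads each add `iterations` to a shared counter under a lock.
int locked_count(int iterations) {
    int counter = 0;
    std::mutex m;
    auto work = [&] {
        for (int i = 0; i < iterations; ++i) {
            std::lock_guard<std::mutex> guard(m);  // acquire here, release at scope exit
            ++counter;                             // made visible by the release
        }
    };
    std::thread t1(work);
    std::thread t2(work);
    t1.join();
    t2.join();
    return counter;
}
```

Each release of the lock makes the preceding increment visible to whichever thread acquires the lock next, so no update is lost.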
The following merges in the "depends-on" relation from the description in N1944. Hopefully this is easier to follow.
1.10p5:
An expression evaluation A is inter-thread-ordered-before another evaluation B if:
- A is sequenced before B and either A performs an acquire operation, or B performs a release operation; or
- A is an unordered atomic read and B is an unordered atomic write, and either the value written by B is computed using the value read by A, or the execution of B is conditioned on the value read by A.
1.10p6:
[Note: An evaluation A can only be inter-thread-ordered-before B if A is also sequenced before B. For race-free programs making conventional use of locks, the distinction between inter-thread-ordered-before and sequenced-before is unimportant. The distinction becomes important with very weakly ordered library synchronization primitives.]
This was rewritten in terms of "synchronizes-with", which is restricted to synchronization operations, instead of explicitly including store-load dependencies in a "communicates-with" relation as in N1944. This version is intended to be equivalent, since we insist that "happens-before" together with store-load dependencies remains acyclic. We need that for the race free implies sequential consistency proof, and for one of the examples.
1.10p7:
An evaluation A that performs a release operation on a location L synchronizes-with an evaluation B that performs an acquire operation on L and reads the value written by A. [Note: The specifications of the synchronization operations define when one reads the value written by another. For atomic variables, the definition is clear. For locks, we assume that all lock operations occur in a single total order. Each lock acquisition "reads the value written" by the last lock release.]
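A minimal sketch of synchronizes-with on an atomic variable follows. It assumes an atomics interface like `std::atomic` with `memory_order_release` and `memory_order_acquire` from later C++ standards; none of these names are defined by this paper, and `pass_message` is hypothetical.

```cpp
#include <atomic>
#include <thread>

// Passes an ordinary int from one thread to another through an atomic flag.
int pass_message() {
    int payload = 0;                 // ordinary (non-atomic) data
    std::atomic<bool> ready(false);  // plays the role of location L

    std::thread producer([&] {
        payload = 42;                                  // sequenced before the release
        ready.store(true, std::memory_order_release);  // evaluation A
    });

    // Evaluation B is the acquire load that finally reads the value written by A.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int seen = payload;  // no data race: the write to payload happens-before this read
    producer.join();
    return seen;
}
```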
1.10p8:
An evaluation A happens-before an evaluation B if:
- A is inter-thread-ordered-before B; or
- A synchronizes-with B; or
- for some evaluation X, A happens-before X, and X happens-before B.
1.10p9:
An evaluation A precedes an evaluation B if:
- A happens-before B; or
- A is an assignment, and B observes the value stored by A.
1.10p10:
A multi-threaded execution is consistent if each thread observes values of objects that obey the following constraints:
- No evaluation precedes itself.
- Each read access B to a scalar object observes the value assigned to that object by a side effect A only if there is no other side effect X to the same object such that
  - A is sequenced before or happens-before X, and
  - X is sequenced before or happens-before B.
[Note: The first condition implies that a read operation B cannot "see" an assignment A if B happens-before A. It also prevents a cyclic situation in which, for example, x and y are initially zero, one thread evaluates x = y; while another evaluates y = x;, each sees the result of the other thread, and both x and y obtain a value of 42. The second condition effectively asserts that later assignments hide earlier ones if there is a well-defined order between them.]
1.10p11:
An execution contains an inter-thread data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens-before the other. Any inter-thread data race results in undefined behavior. A multi-threaded program that does not contain a data race exhibits the behavior of a consistent execution. [Note: It can be shown that programs that correctly use simple locks to prevent all inter-thread data races, and use no other synchronization operations, behave as though the executions of their constituent threads were simply interleaved, with each observed value of an object being the last value assigned in that interleaving. This is normally referred to as "sequential consistency". However, this applies only to race-free programs, and race-free programs cannot observe most program transformations that do not change single-threaded program semantics. In fact, most single-threaded program transformations continue to be allowed, since any program that behaves differently as a result must perform an undefined operation.]
1.10p12:
[Note: Compiler transformations that introduce assignments to a potentially shared memory location that would not be modified by the abstract machine are generally precluded by this standard, since such an assignment might overwrite another assignment by a different thread in cases in which an abstract machine execution would not have encountered a data race.]
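To make the note concrete, here is a sketch of the kind of transformation it rules out. Both functions are hypothetical; the second is what a compiler might have produced under purely single-threaded reasoning.

```cpp
// Abstract-machine version: writes `shared` only when cond is true.
void store_if(bool cond, int& shared) {
    if (cond) shared = 1;
}

// Hypothetical "optimized" version: always writes, then undoes the write.
// It gives the same single-threaded result, but it stores to `shared` even
// when cond is false, which could overwrite a concurrent assignment made
// by another thread.
void store_if_speculative(bool cond, int& shared) {
    int saved = shared;
    shared = 1;
    if (!cond) shared = saved;
}
```

In a single thread the two are indistinguishable. With a second thread assigning to `shared` while `cond` is false, only the second version can lose that assignment, even though the source program contains no data race.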
Various other changes in the base language are no doubt needed, but not yet clear. I think there is somewhat of a consensus that thread-safety of static initialization should be explicitly indicated with a new keyword such as "async"? Exception issues should probably be deferred to the thread API proposal.
5.2.2p8 (function call); deleted as redundant with (new) 1.9p17:
The order of evaluation of arguments is unspecified. All side effects of argument expression evaluations take effect before the function is entered. The order of evaluation of the postfix expression and the argument expression list is unspecified.
5.2.6p1 (post-increment):
The value obtained by applying of a postfix ++ expression is the value that the of its operand had before applying the operator. [ Note: the value obtained is a copy of the original value —end note ] The operand shall be a modifiable lvalue. The type of the operand shall be an arithmetic type or a pointer to a complete object type. After the result is noted, the The value of the operand object is modified by adding 1 to it, unless the object is of type bool, in which case it is set to true. [ Note: this use is deprecated, see Annex D. —end note ] The value computation of the ++ expression is sequenced before the modification of the operand object. The result is an rvalue. The type of the result is the cv-unqualified version of the type of the operand. See also 5.7 and 5.17.
5.14p2 (logical AND operator), and also 5.15p2 (logical OR operator):
The result is a bool. All side effects of the first expression except for destruction of temporaries (12.2) happen before the second expression is evaluated. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.
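The practical consequence of this sequencing can be sketched as follows; the function names are hypothetical and the code is illustrative only.

```cpp
// Returns true if the element at `index` exists and is positive.
bool positive_at(const int* data, int size, int index) {
    // Each check is sequenced before the operand to its right, and the
    // array access is evaluated only if both bounds checks yield true.
    return index >= 0 && index < size && data[index] > 0;
}

// The side effect of i++ is complete before the right operand reads i.
int increment_then_read() {
    int i = 1;
    int seen = 0;
    bool taken = (i++ != 0) && ((seen = i) != 0);
    return taken ? seen : -1;  // seen is 2
}
```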
5.16p1 (conditional operator):
Conditional expressions group right-to-left. The first expression is implicitly converted to bool (clause 4). It is evaluated and if it is true, the result of the conditional expression is the value of the second expression, otherwise that of the third expression. All side effects of the first expression except for destruction of temporaries (12.2) happen before the second or third expression is evaluated. Only one of the second and third expressions is evaluated. Every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second or third expression.
5.17p1 (assignment and compound assignment operators):
The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue with the type and value of the left operand after the assignment has taken place an lvalue referring to the left operand. The result in all cases is a bit-field if the left operand is a bit-field. In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.
5.18p1 (comma operator):
A pair of expressions separated by a comma is evaluated left-to-right and the value of the left expression is discarded. The lvalue-to-rvalue (4.1), array-to-pointer (4.2), and function-to-pointer (4.3) standard conversions are not applied to the left expression. All side effects (1.9) of the left expression, except for the destruction of temporaries (12.2), are performed before the evaluation of the right expression. Every value computation and side effect associated with the left expression is sequenced before every value computation and side effect associated with the right expression. The type and value of the result are the type and value of the right operand; the result is an lvalue if its right operand is an lvalue, and is a bit-field if its right operand is an lvalue and a bit-field.
12.2p3:
When an implementation introduces a temporary object of a class that has a non-trivial constructor (12.1, 12.8), it shall ensure that a constructor is called for the temporary object. Similarly, the destructor shall be called for a temporary with a non-trivial destructor (12.4). Temporary objects are destroyed as the last step in evaluating the full-expression (1.9) that (lexically) contains the point where they were created. This is true even if that evaluation ends in throwing an exception. The value computations and side effects of destroying a temporary object are associated only with the full-expression, not with any specific subexpression.
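A small sketch of the rule that temporaries live until the end of the full-expression follows. The class `Noisy`, the buffer `events` and the function `temporaries_trace` are hypothetical names introduced for this illustration.

```cpp
#include <string>

static std::string events;  // hypothetical trace of constructions and destructions

struct Noisy {
    char tag;
    explicit Noisy(char t) : tag(t) { events += t; }
    ~Noisy() { events += '~'; events += tag; }
    int value() const { return 1; }
};

std::string temporaries_trace() {
    events.clear();
    // Both temporaries persist until the end of this full-expression.
    int sum = Noisy('a').value() + Noisy('b').value();
    events += ';';  // appended only after both destructors have run
    (void)sum;
    return events;
}
```

The construction order of the two temporaries is not fixed, so the trace may read `ab~b~a;` or `ba~a~b;`. In both cases the destructors run in reverse order of construction and before the next full-expression.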
12.2p4:
There are two contexts in which temporaries are destroyed at a different point than the end of the full-expression. The first context is when a default constructor is called to initialize an element of an array. If the constructor has one or more default arguments, the destruction of any temporaries temporary created in the a default argument expressions are destroyed immediately after return from the constructor expression is sequenced before the construction of the next array element, if any.
12.2p5:
The second context is when a reference is bound to a temporary. The temporary to which the reference is bound or the temporary that is the complete object of a subobject to which the reference is bound persists for the lifetime of the reference except as specified below. A temporary bound to a reference member in a constructor’s ctor-initializer (12.6.2) persists until the constructor exits. A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full expression containing the call. A temporary bound to the returned value in a function return statement (6.6.3) persists until the function exits. In all these cases, the temporaries created during the evaluation of the expression initializing the reference, except the temporary to which the reference is bound, are destroyed at the end of the full-expression in which they are created and in the reverse order of the completion of their construction. The destruction of a temporary whose lifetime is not extended by being bound to a reference is sequenced before the destruction of any temporary which is constructed earlier in the same full-expression. If the lifetime of two or more temporaries to which references are bound ends at the same point, these temporaries are destroyed at that point in the reverse order of the completion of their construction. In addition, the destruction of temporaries bound to references shall take into account the ordering of destruction of objects with static or automatic storage duration (3.7.1, 3.7.2); that is, if obj1 is an object with the same storage duration as the temporary and created before the temporary is created the temporary shall be destroyed before obj1 is destroyed; if obj2 is an object with the same storage duration as the temporary and created after the temporary is created the temporary shall be destroyed after obj2 is destroyed. [ Example:
3.6.2p1 (initialization of non-local objects):
Objects with static storage duration (3.7.1) shall be zero-initialized (8.5) before any other initialization takes place. A reference with static storage duration and an object of POD type with static storage duration can be initialized with a constant expression (5.19); this is called constant initialization. Together, zero-initialization and constant initialization are called static initialization; all other initialization is dynamic initialization. Static initialization shall be performed before any dynamic initialization takes place. Dynamic initialization of an object is either ordered or unordered. Definitions of explicitly specialized class template static data members have ordered initialization. Other class template static data members (i.e., implicitly or explicitly instantiated specializations) have unordered initialization. Other objects defined in namespace scope have ordered initialization. Objects defined within a single translation unit and with ordered initialization shall be initialized in the order of their definitions in the translation unit. The order of initialization is unspecified for objects with unordered initialization and for objects defined in different translation units. An unordered initialization is indeterminately sequenced with respect to every other dynamic initialization. [ Note: 8.5.1 describes the order in which aggregate members are initialized. The initialization of local static objects is described in 6.7. —end note ]
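Ordered initialization within one translation unit can be sketched as below. All names are hypothetical; the log is kept in a function-local static so that the example does not itself depend on namespace-scope initialization order.

```cpp
#include <string>

// Function-local static: initialized on first use, before any note() appends to it.
static std::string& init_log() {
    static std::string log;
    return log;
}

static int note(char c) { init_log() += c; return 0; }

// Ordered initialization: same translation unit, so definition order applies.
int first_object  = note('1');
int second_object = note('2');
int third_object  = note('3');

std::string initialization_order() { return init_log(); }
```

Had the three objects been defined in different translation units, their relative order would be unspecified, which is the case the new sentence on indeterminate sequencing addresses.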
8.5.1p17 (aggregate initialization); new paragraph:
The full-expressions in an initializer-clause are evaluated in the order in which they appear.
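A minimal sketch of what this paragraph guarantees, assuming an implementation that follows the proposed rule. The names `next_id`, `Triple`, `make_triple`, and `check_order` are hypothetical and serve only to make the evaluation order observable.

```cpp
#include <cassert>

// Hypothetical counter: each call returns the next integer, so the values
// stored in the aggregate reveal the order in which the calls were made.
int next_id() {
    static int counter = 0;
    return ++counter;
}

struct Triple {
    int a;
    int b;
    int c;
};

Triple make_triple() {
    // Each initializer-clause below is a full-expression. Under the proposed
    // wording they are evaluated in the order in which they appear.
    Triple t = { next_id(), next_id(), next_id() };
    return t;
}

// With in-order evaluation, the members receive strictly increasing values.
void check_order() {
    Triple t = make_triple();
    assert(t.a < t.b);
    assert(t.b < t.c);
}
```

By contrast, the same three calls written as function arguments, as in `f(next_id(), next_id(), next_id())`, would carry no such ordering guarantee.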
12.6.2p3 (mem-initializers):
The expression-list in a mem-initializer is used to initialize the base class or non-static data member subobject denoted by the mem-initializer-id. The semantics of a mem-initializer are as follows:
- if the expression-list of the mem-initializer is omitted, the base class or member subobject is value-initialized (see 8.5);
- otherwise, the subobject indicated by mem-initializer-id is direct-initialized using expression-list as the initializer (see 8.5).
[unchanged example omitted]
The initialization of each base and member constitutes a full-expression. Any expression in a mem-initializer is evaluated as part of the full-expression that performs the initialization. (This wording replaces: "There is a sequence point (1.9) after the initialization of each base and member. The expression-list of a mem-initializer is evaluated as part of the initialization of the corresponding base or member.")
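One observable consequence of treating each base or member initialization as a full-expression is that temporaries created in one mem-initializer are destroyed before the next one starts. The sketch below illustrates this under that reading; `event_log`, `Temp`, `Pair`, and `trace_pair` are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log of construction and destruction events.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical type that records when a temporary is created and destroyed.
struct Temp {
    std::string name;
    explicit Temp(const std::string& n) : name(n) {
        event_log().push_back("construct " + name);
    }
    ~Temp() { event_log().push_back("destroy " + name); }
    int value() const { return 1; }
};

struct Pair {
    int x;
    int y;
    // Each mem-initializer is its own full-expression, so the temporary used
    // for x is destroyed before the temporary used for y is constructed.
    Pair() : x(Temp("for x").value()), y(Temp("for y").value()) {}
};

// Constructs a Pair and returns the events recorded while doing so.
std::vector<std::string> trace_pair() {
    event_log().clear();
    Pair p;
    assert(p.x == 1 && p.y == 1);
    return event_log();
}
```

Calling `trace_pair()` is expected to yield the four events in the order: construct for x, destroy for x, construct for y, destroy for y.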
14.2 (template arguments):
- template-argument:
- constant-expression (replacing assignment-expression)
- type-id
- id-expression
Concern has been expressed about whether it is safe and legal for a compiler to optimize based on the assumption that a loop will terminate. The canonical example:
    for (T * p = q; p != 0; p = p->next)
        ++count;
    x = 42;
Is it valid for the compiler to move the assignment to x above the loop? If the loop terminates, clearly yes, because the overall effect of the code doesn't change, and, in the absence of synchronization, there is no guarantee that the assignment to x will not be visible to a different thread before any assignments to count. But what if the loop doesn't terminate? For example, may a user assume that a non-terminating loop effects synchronization, and may therefore be used to prevent a data race? Clearly, a loop that contains any explicit synchronizations must be assumed to interact with a different thread, and a loop that contains a volatile access or a call to an I/O function must be assumed to interact with the environment, so optimization opportunities for such a loop are already limited. But what about a simple loop, as above?
If such a loop does not terminate, then clearly neither the loop itself nor any code following the loop can have any observable behavior. Moreover, as the "least requirements" refer to data written to files "at program termination", the presence of a non-terminating loop may even nullify observable behavior preceding entry to the loop (for example, because of buffered output). For these reasons, there are problems with concluding that a strictly-conforming program can contain any non-terminating loop. We therefore conclude that a compiler is free to assume that a simple loop will terminate, and to optimize based on that assumption.
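For concreteness, the canonical example can be placed in a compilable setting. This is only a sketch: `Node` is a hypothetical stand-in for `T`, and `count_nodes` is a hypothetical wrapper around the loop. It contains no synchronization, no volatile access, and no I/O, so it is a "simple loop" in the sense discussed above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type standing in for T in the canonical example.
struct Node {
    Node* next;
};

int x = 0;

// Counts the nodes reachable from q, then assigns to x.
int count_nodes(Node* q) {
    int count = 0;
    for (Node* p = q; p != 0; p = p->next)
        ++count;
    // Under the conclusion above, a compiler may assume the loop terminates
    // and is therefore free to perform this store before the loop runs.
    x = 42;
    return count;
}
```

For a null-terminated list the reordering is unobservable. If the list were circular, the loop would never terminate, and a program relying on `x` remaining unmodified in that case would be relying on behavior the conclusion above does not guarantee.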
Add a new section after 17.4.4.8, entitled "Thread safety":
Unless otherwise specified:
- Every data type (e.g. container) implemented by the library shall be thread-safe in the same sense as an ordinary scalar object: The client must ensure that an operation that logically updates an object is not executed concurrently with another operation that reads or writes the same object. The implementation must protect against accesses to shared data that do not correspond to conflicting accesses at the abstract level, i.e. updates that occur in response to logical "read" operations, or against accesses to a data structure shared by multiple abstract objects. For example, implementations of "read operations" that maintain an internal shared cache must use internal synchronization mechanisms to protect that cache, as will any implementations that maintain other forms of per class, as opposed to per object, data.
- Library calls do not introduce synchronizes-with relationships.
- Operations that allocate memory, such as allocator<T>::allocate(), do not modify shared data. Hence they can be invoked concurrently from different threads without introducing a data race.
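The obligation placed on implementations by the first item (protecting updates that occur in response to logical "read" operations) might look like the following sketch. It is an illustration only: `SquareTable` is a hypothetical class, and `std::mutex` and `std::lock_guard` are assumed to be available, although no particular synchronization facility is named in this paper.

```cpp
#include <cassert>
#include <map>
#include <mutex>

// Hypothetical class: a logically read-only lookup that memoizes results.
class SquareTable {
public:
    SquareTable() : computations_(0) {}

    // Logical "read": const for the client, yet it updates an internal cache.
    // The implementation, not the client, must therefore synchronize it.
    long square(int n) const {
        std::lock_guard<std::mutex> guard(mutex_);
        std::map<int, long>::const_iterator it = cache_.find(n);
        if (it != cache_.end())
            return it->second;      // served from the cache
        ++computations_;
        long result = static_cast<long>(n) * n;
        cache_[n] = result;         // hidden update, guarded by mutex_
        return result;
    }

    // Number of results actually computed rather than served from the cache.
    int computations() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return computations_;
    }

private:
    mutable std::mutex mutex_;
    mutable std::map<int, long> cache_;
    mutable int computations_;
};
```

With the internal lock in place, two threads may call `square` on the same object concurrently, just as they could concurrently read an ordinary scalar. An operation that logically updates the object would still require the client to exclude concurrent readers and writers.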