2. Expression Evaluation
3. The Sequence Point Concept
3.1 Consistency of the definition
3.2 Tools for doing the analysis
4. Interpretation of the Standard
5. Function Call
6. Floating-Point Status Flags
7. Conclusion
A number of attempts has been made to address this problem. These include an effort during the drafting of C99 to clarify the specification, a formal model described in N925 that can be used to determine the status of any expression, and an analysis of the problem in N926 and N927. This effort is ongoing.
This technical report provides a conceptual framework to study the
problem, and applies the framework to the specification in the Standard.
This analysis can be used to further understand the model in N925, and
algorithms in N926 and N927.
An expression is consisted of operands and optionally operators. In order to evaluate it, an expression has to be broken down into individual operations - one operation at a time. This process is defined by the Standard recursively following the syntactic construct of an expression. The Standard implicitly imposes a partial ordering on the sequence of operations. We use the following example to illustrate.
a++ + b;
The above is a full expression with two operands and two operators. The sequence of operations can be broken down as follows:
1) evaluate a
2) evaluate a + 1
3) evaluate a + b (using the value in 1)
x) update a (using
the value in 2)
The order (or relative timing) of steps 1-3 is completely specified. But the timing of step x is not, not completely. It can take place anytime after step 2 and before the end of the evaluation of the expression. This gives implementations the freedom to exploit instruction level parallelism. But this also means that the evaluation could run into trouble if 'a' is also updated by some other means. Consider a variation of the above:
a = a++ + b;
There is an additional step due to the assignment:
4) update a (using the value in 3)
Step 4 can happen anytime after step 3 and before the end of the whole expression. This period of time overlaps the corresponding one for step x. The final value of 'a' becomes undefined.
In the same token, the evaluation could also run into trouble if the value of 'a' is read within the same questionable time period. We will examine this further later.
Note that we are interested in the relative timing of events, not the absolute timing. We are interested in whether one operation has to happen before another one, or whether multiple operations could happen concurrently. This relationship can best be represented in a graph as follows:
o
|
1
|
2
/ \
/ 3
|
\
x
4
\ /
\ /
$
The '$' marks the end of the whole expression and the 'o' marks
the beginning. The graph depicts a partial ordering. Nodes connected by
an edge are comparable, with the lower one greater. The key to our analysis
is to identify the sets of nodes (operations) that could potentially occur
concurrently.
1
|
2
/ \
x 3
\ /
|
|
4
|
$
the evaluation would become well defined. Our question here is therefore twofold: 1) Whether the Standard's definition of sequence point is internally consistent. 2) Whether the definition produces evaluation results that match our expectation as programmers.
The Standard defines sequence point in two steps. The first is the specification in 5.1.2.3 paragraph 2:
"... At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place."
And also in 6.5 paragraph 2:
"Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored."
If we treat events during the evaluation of an expression as happening within specific time intervals, the above imposes a restriction on the end points of these intervals. That is, it defines the relationship between side effect and sequence point, but does not define the relationship between sequence point and the rest of the expression. This latter is accomplished in the second step.
Within the specification of the semantics of expressions, the Standard
inserts sequence points at certain locations. One such example is in the
Logical-And operator (6.5.13). These specifications fix the positions of
sequence points within the expression. The relative positions of side effects
are therefore indirectly established.
int x;
(++x && x) + (++x && x);
The syntax parse tree can be drawn as follows:
+
/ \
/
\
/
\
/
\
&&
&&
/ \
/\
S \
S \
/ x
/ x
++
++
\
\
x
x
An 'S' is used to indicate a sequence point. It is treated like an operator in the tree. In the abstract machine, when a sequence point node is "evaluated", all outstanding side effects must be complete. The questions in this example is whether the sequence points under the two &&-operators sufficient to nail down the relevant side effects. Or, to put it in another way, what are the outstanding side effects with respect to a sequence point.
Listed below are the operations. They can be derived by going through all the nodes in the syntax tree:
1) evaluate (x+1) (in left subexpression of +)
2) update x (using
value in 1)
3) sequence point
4) evaluate (x) (in left subexpression
of +)
5) evaluate (x+1) (in right subexpression of +)
6) update x (using
value in 5)
7) sequence point
8) evaluate (x) (in right subexpression
of +)
9) evaluate +
The partial ordering of these steps can be determined as follows:
o
/ \
1 5
| |
2 6
| |
3 7
| |
4 8
\ /
9
|
$
There is a sequence point before the evaluation starts, and one when the full expression ends. We add dummy nodes 'o' and '$" to start and end the graph. Let us examine the application of 5.1.2.3p2.
5.1.2.3p2 requires all side effects of evaluations prior to a sequence point be complete at the sequence point. To determine if an evaluation satisfies this criteria, we need to determine all predecessors of a particular sequence point. Since the set of predecessors of a node in a partial ordering is defined, 5.1.2.3p2 is decidable for any well formed expression. For example, the predecessors of step 3 are 1 and 2; and of step 7 are 5 and 6. Therefore these two updates must be complete by the time we reach 3 and 7 respectively. But we cannot tell the relative timing of the two updates themselves.
It can be argued that 5.1.2.3p2 should be applied to steps 2 and 7 as well. Since step 2 could potentially start before 7 while step 3 end after, step 2 is a "potential predecessor" of 7, but its side effect might not be complete until after 7. This is a violation of 5.1.2.3p2.
This last point is the source of debate and confusion about sequence point. When a sequence point is too far deep inside a subexpression, it is sometimes not easy to determine if it provides sufficient coverage to protect the side effects. This difficulty is rooted at the partial ordering.
Other than this weakness, and another one to be discussed in section
4, there seems to be no inconsistency in the sequence point definition.
This is because the Standard does not explicitly provide a procedure for
determining the status of an expression (i.e. whether it is well defined).
An expression is well defined if it has no constraint or semantics violation.
This specifies what are undefined expressions. By its negative nature,
this cannot lead to an assertion where an expression is both well defined
and not well defined. The above weakness is therefore not an internal inconsistency.
Depending on how we interpret the "predecessor of a sequence point", we
may end up with a larger or smaller set of well defined expressions. Whether
this set matches the programmers' expectation is the question of external
consistency.
One important step in the analysis is in determining the set of predecessors of a give node. The usual definition is the set of all nodes that are compared smaller than a given one. But this excludes nodes that cannot be compared. As noted in the above discussion, the concept of "potential predecessors" is important. This is independent of how we interpret 5.1.2.3p2; in fact, the potential predecessors will shred light on how we should read this paragraph.
The set of potential predecessors for a given node is the union of the set of predecessors and the set of nodes that have no ordering with the given node. This gives a slightly larger set than we need; we are not interested in updates that are already committed before the previous sequence point(s). So we form a subset from the potential predecessors as follows:
For completeness, we also define successors of a given node (to
be those that compare greater), and potential successors (to be the union
of successors and the set of nodes that have no ordering with the given
node). The union of successors and potential predecessors is the same as
the union of predecessors and potential successors, which is the set of
all nodes minus the give node.
The list of nodes is as follows:
1) evaluate a
2) evaluate a+1
3) update a
S) sequence point
5) evalute a
6) evalute b
7) evalue a+b
The partial ordering:
1
|
2
|
3
|
S
/ \
4 5
\ /
6
And the relevant sets with respect to node 5 are,
In 5.1.2.3p2, previous evaluations of a sequence point are the potential predecessors of a sequence point. And subsequent evaluations are the successors of a sequence point. 5.1.2.3p2 therefore requires that when the abstract machine reaches a sequence point, there is no outstanding side effect.
In 6.5p2, the evaluations falling "between the previous and next sequence point" are those in the set of non committed predecessors of the next sequence point. So if an object is updated more than once within this set, the expression is undefined. We can design an algorithm to test the status of an expression basing on this property (together with other properties discussed below).
The resolution of DR087 implied there are operations that do not overlap. Function calls are such operations. An object can be updated more than once without causing trouble if all such updates are done by non-overlapping operations. But the resolution of this DR did not specify whether a function call can overlap with other operations. Consider the following piece of code:
int i = 1;
int foo() { i++; }
...
foo() + i++; /* line A, inside a function body */
Line A would not be undefined (but unspecified) if a function call
cannot overlap other operations in the rest of the expression.
"... Furthermore, the prior value shall be read only to determine the value to be stored."
The use of the word "only" seems to convey the wrong meaning. Notice that not all intermediate values from subexpressions are used in side effects (i.e. in stores). For example, a value can be used to compute the resulting value of the full expression. As noted in section 2, the evaluation could run into trouble if a read overlaps a write to the same object. The interpretation we have so far only talks about stores, not reads; a piece is therefore still missing. It seems that this sentence is trying to fill that gap. Consider the following expression:
++i + i;
Repeating the procedures above, we can come up with the following sequence of operations and partial ordering.
1) evaluate i
(first i)
2) evaluate (i+1)
3 update i
(using the value in 2)
4) evaluate i
(second i)
5) evaluate (2) + (4)
o
/ \
1 \
| |
2 |
/ \ |
3 \ 4
\ \|
\ 5
\ |
\ |
$
Since 3 is a non committed predecessor of 4, the second read of i and the store could potentially overlap. The expression should be undefined according to our intuition. But it doesn't violate 5.1.2.3p2 nor the first sentence of 6.5p2. This is because so far we only restrict writes but not reads.
A read can potentially overlap a non committed write if it follows the write or if it has no ordering relationship with the write. It is safe to have a read lying strictly before the write; that is, the read is a predecessor (not a potential predecessor). So it seems that the wordings of the second sentence of 6.5p2 should be changed.
If we examine the above expression more carefully, we will find that even though the value of the full expression is undetermined, it does not really affect the computation of the overall program. This is because the value is not used. The computation can still be defined provided that the overlap access doesn't cause any hardware execeptions. On the other hand, the following is a problem:
a = ++i + i;
as the value is used to modify 'a'. In the second sentence of 6.5p2, if the prior value is read but not used in any stores, the implementation can throw away the read (or as-if it does so) to produce an evaluation that is well behaved. To make the meaning explicit, we can revise the second sentence of 6.5p2 is as follows:
6.5p2
"... Furthermore, the prior value shall be read only to determine
side effects."
Regardless of which interpretation we choose, we cannot derive it from existing text in the Standard. One possible way to fix this is:
6.5p2:
"Between the previous and next sequence point an object shall have
its stored value modified at most once by the evaluation of an expression
unless
the modification is done by non-overlapping operations."
6.5.2.2p10, add a sentence at the end:
"The actual call shall not be overlapped by any other operations."
6.5p2 first sentence:
"Between the previous and next sequence point an object shall have
its stored value modified at most once by the evaluation of an expression
unless
the modification is done by non-overlapping operations or the object is
a floating-point status flag."
However there is still one issue.
A revised attempt is as follows:
6.5p2 first sentence:
"Between the previous and next sequence point an object shall have
its stored value modified at most once by the evaluation of an expression
unless
the modification is done by non-overlapping operations or the object is
a floating-point flag set by the modification."
The central idea in the analysis is that sequence points are not related to each other in a linear manner. The underlying order is a partial order. The phrase "between the previous and next sequence point" has to be interpreted with this understanding in mind. After this is addressed, the rest of the analysis falls into place fairly easily.
There has been effort to provide algorithms and formal model to
determine the status of an expression. These are useful; and the analysis
presented here can used to help. It is not a good idea to incorporate such
algorithms as a normative part of the Standard. If the Standard is already
consistent, any addition can either add nothing new or make the specification
inconsistent. These models and algorithms are best presented in the Rationale.