org: | ISO/IEC JTC1/SC22/WG14 | document: | N2838 |
target: | IS 9899:2023, TS 6010:2023 | version: | 1 |
date: | 2021-10-11 | license: | CC BY |
Paper number | Title | Changes |
---|---|---|
N2820 | Types and sizes | Initial version |
N2838 | Types and sizes v1 | Supersedes N2820; don’t use the term declarator; insist that cast expressions may lead to UB |
In 6.2.5 of the C standard, sizes are primarily defined for types. Although this is not stated explicitly, it is commonly assumed that such sizes cannot exceed `SIZE_MAX`. Sizes of objects (in contrast to storage instances as of TS 6010) are only a deduced property that in most cases is defined through the type that an object has. This proposal attempts to make this approach consistent throughout the standard, and to reduce the number of marginal cases where the interpretation of `sizeof` differs between C and C++.
For the latter, observe that code as in
int const n = 23;
int const m = 24;
double A[n][m];
int j = 0;
printf("sizeof is %zu\n", sizeof A[++j]);
is interpreted quite differently in C and C++, since the two languages have different definitions of integer constant expressions. In both languages the declaration of `A` is valid, but in C++ it is an array with lengths fixed at compile time, whereas in C it is a VLA. Therefore the `sizeof` operator may evaluate the increment operator in C, but not in C++.
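The effect can be observed within C itself by comparing an array with constant lengths against a true VLA. The following is a minimal sketch and not part of the proposed text. The function names `count_fixed` and `count_vla` are hypothetical, and the lengths are passed as parameters so that the second array is a VLA regardless of how an implementation treats `const`-qualified variables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the operand has the constant-size type double[24],
   so sizeof does not evaluate it and j is left unchanged. */
int count_fixed(void) {
    double A[23][24];
    int j = 0;
    size_t s = sizeof A[++j];
    (void)s;
    return j;
}

/* Hypothetical helper: the operand has the VLA type double[m], so under the
   current C rules sizeof evaluates it and j is incremented. */
int count_vla(int n, int m) {
    assert(n > 1 && m > 0);   /* A[1] must designate an existing row */
    double A[n][m];
    int j = 0;
    size_t s = sizeof A[++j];
    (void)s;
    return j;
}
```

With an implementation that follows the current rules, `count_fixed()` returns 0 and `count_vla(23, 24)` returns 1.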
A complete type shall have a size that is less than or equal to `SIZE_MAX`. A type has known constant size if ~~the type is not incomplete~~ it is complete and is not a variable length array type.
In view of our recent discussion about overflow in `calloc`, we searched existing implementations and asked on the WG14 reflector and in some other media whether objects could be defined with a size that exceeds `SIZE_MAX`. It turned out that all of them interpret the current standard such that huge objects make the behavior implicitly undefined. This change makes that explicit. In particular, it makes explicit that even requesting such a huge object has no defined behavior.
This change is only a clarification and should not have an impact on existing programs or implementations.
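Programs that compute an array size before requesting storage can check that the request stays within this limit. The sketch below is only an illustration of such a check. The helper name `array_bytes` is hypothetical and is not taken from the standard or from any particular implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores count * size in *total and returns true if the
   product is representable, that is, if it does not exceed SIZE_MAX. */
bool array_bytes(size_t count, size_t size, size_t *total) {
    assert(total != NULL);
    if (size != 0 && count > SIZE_MAX / size) {
        return false;   /* the array would be larger than SIZE_MAX */
    }
    *total = count * size;
    return true;
}
```

A caller would request storage only when `array_bytes` returns `true`, so that no object with a size above `SIZE_MAX` is ever asked for.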
The `sizeof` operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If ~~the type of~~ the operand is a type name representing a variable length array type (possibly enclosed in a nested set of `typeof` declarators), the operand is evaluated;FNT1 otherwise, the operand is not evaluated and the behavior is undefined if the operand is an expression for which the size is not known before the sizeof operator is met. The result is an integer constant if the type has a known constant size, see 6.2.5.
FNT1 The evaluation of such declarators is specified in 6.7.6.2, below.
This changes the specification when the operand of `sizeof` is an lvalue of VLA type. Such an object cannot be declared as a compound literal, so it must either have been declared before the `sizeof` expression, or be designated by dereferencing a pointer `p`. For the first, the type of such an object, and thereby its size, is already fixed before execution reaches the `sizeof` expression. The second already implicitly defines the type and is better expressed with a type name variant that avoids the evaluation and dereferencing of the variable `p`, both operations potentially having undefined behavior.
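As an illustration of the second case, a size that might otherwise be taken from a dereferenced pointer to an array of `n` elements can be obtained from a type name instead. This is a minimal sketch under the assumption that the length `n` is available where the size is needed. The function name `row_bytes` and the element type `double` are chosen for the example only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size in bytes of an array of n doubles, obtained from
   a type name. No pointer is read or dereferenced to get the result. */
size_t row_bytes(size_t n) {
    assert(n > 0);   /* the length of a VLA must be greater than zero */
    return sizeof(double[n]);
}
```

Because the operand is a type name, only the length expression `n` is evaluated, and that holds under the current rules as well as under the proposed ones.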
Therefore we think that evaluation should be restricted to the case where the operand is a type name for which the evaluation makes a difference in the type, namely the type name of a VLA that occurs either directly in the `sizeof` or with an intermediate `typeof`.
Consider the following code snippet
double A[n][m];
int j = 0;
int i = 0;
printf("sizeof is %zu\n", sizeof A[++j][++i]);
printf("now j is %d, i is %d\n", j, i);
printf("sizeof is %zu\n", sizeof A[++j]);
printf("now j is %d\n", j); // what is printed here?
where `n` and `m` are supposed to be integer variables with values greater than `2`. In a non-representative survey among C and C++ enthusiasts we asked for the value that is printed for `j`. Their answers were distributed as follows:
j | % |
---|---|
0 | 37.8 |
1 | 7.8 |
2 | 54.3 |
So over 90% gave a wrong answer. For us this clearly shows the need to reform that marginal property of the `sizeof` operator.
With the current standard, the two increment operators in the first `sizeof` expression are not evaluated because that expression has type `double` (so `i` stays `0`, for example), but the last one is evaluated because its expression has type `double[*]`, a VLA. As a consequence, `j` is `1` for the `printf` in line 7. With the proposed change, none of the increment operators would be evaluated.
By that, this proposal changes the status of some `sizeof` expressions, namely those that concern lvalues of VLA type.
This change also has an impact on the new `typeof` feature (N2724), which follows the same strategy as `sizeof` currently does. If this change is agreed, we will coordinate a similar change for `typeof` with its author.
Shall we integrate Change 3.1 into C23?
Shall we integrate Change 3.2 into C23?
Shall we integrate the same changes into TS 6010?
We would like to thank Joseph Myers for his feedback on the first version of this paper. The idea for Change 3.2 came from Tomasz Stanislawski.