This is a follow-up to the now closed DR 423, which resulted in the clarification of the status of qualifications of rvalues.
This defect report aims to clarify the status of the controlling expression of a _Generic primary expression:
Does the controlling expression of a _Generic primary expression undergo any type of conversion to calculate the type that is used to do the selection?
Implementers have given different answers to this question; gcc (choice 1 in the following) on one side and clang and IBM (choice 2) on the other side went quite opposite ways, resulting in severe incompatibility for _Generic expressions that use qualifiers or arrays.
    char const* a = _Generic("bla", char*: "blu");                  // clang error
    char const* b = _Generic("bla", char[4]: "blu");                // gcc error
    char const* c = _Generic((int const){ 0 }, int: "blu");         // clang error
    char const* d = _Generic((int const){ 0 }, int const: "blu");   // gcc error
    char const* e = _Generic(+(int const){ 0 }, int: "blu");        // both ok
    char const* f = _Generic(+(int const){ 0 }, int const: "blu");  // both error
The last two lines, where gcc and clang agree, point to the nature of the problem: gcc treats all such expressions as rvalues and does all applicable conversions of 6.3.2.1, that is, lvalue to rvalue and array to pointer conversions. clang treats them as lvalues.
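The divergence can be observed from within a program. The following is only a sketch and not part of the original report: the macro names PROBE_QUALIFIER, PROBE_ARRAY and PORTABLE_INT are hypothetical, and it assumes a compiler that accepts both associations in each selection (some compilers may warn that one of them can never be selected). Each probe evaluates to 1 if the controlling expression is converted (choice 1) and to 0 if it is not (choice 2).

```c
/* 1 if the controlling expression undergoes lvalue conversion
   (int const -> int), 0 if its qualified type is used as is. */
#define PROBE_QUALIFIER \
    _Generic((int const){ 0 }, int: 1, int const: 0)

/* 1 if the controlling expression undergoes array to pointer
   conversion (char[4] -> char*), 0 if the array type is used. */
#define PROBE_ARRAY \
    _Generic("bla", char*: 1, char[4]: 0)

/* Same under both readings: unary + yields an rvalue of type int,
   so only the int association can match. */
#define PORTABLE_INT \
    _Generic(+(int const){ 0 }, int: 1, default: 0)
```

Under either reading the two probes should agree with each other, and PORTABLE_INT should be 1.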
The question is whether or not the conversions of 6.3 apply to the controlling expression.
_Generic is not an operator, but a primary expression. The wording in 6.5.1.1 is "has a type" and doesn't make any reference to type conversion.
_Generic either, which are listed in 6.5.1.1.
Applying promotions would have the effect that we wouldn't be able to distinguish narrow integer types from int. There is no indication that the text implies that form of conversion, nor that anybody has proposed to use _Generic like this.
All conversions in 6.3.2.1 p2 describe what would in normal CS language be called the evaluation of an object. The paragraph has no provision to apply them to types alone.
In particular, it includes the special clause that uninitialized register variables lead to undefined behavior when undergoing lvalue conversion. As a consequence:
Any lvalue conversion of an uninitialized register variable leads to undefined behavior.
And thus:
Under the hypothesis that the controlling expression undergoes lvalue conversion, any _Generic primary expression that uses an uninitialized register variable as controlling expression leads to undefined behavior.
In view of the resolution of DR 423 (rvalues drop qualifiers), using _Generic primary expressions with objects in the controlling expression may have results that appear surprising.
    #define F(X) _Generic((X), char const: 0, char: 1, int: 2)
    char const strc[] = "";
    F(strc[0])   // -> 0
    F(""[0])     // -> 1
    F(+strc[0])  // -> 2
So the problem here is that there is no type-agnostic operator that results in a simple lvalue conversion of char const objects to char; all such operators also promote char to int.
Under the hypothesis that the controlling expression doesn't undergo conversion, any _Generic primary expression that uses a qualified lvalue of narrow type T can't directly trigger the association for T itself.
For many areas the two approaches are feature equivalent, that is, both allow the same semantic concepts to be implemented, but with different syntax. Rewriting code that was written with one of the choices in mind to the other choice is in general not straightforward and probably can't be automated.
If X is only a wide integer type or an array or pointer type, a macro such as
    #define bla(X) _Generic((X), ... something ... )
would have to become
    #define bla(X) _Generic((X)+0, ... something ... )
Writing code that takes care of narrow integer types is a bit more difficult, but can be done with 48 extra case selections, taking care of all narrow types (6) and all their possible qualifications (8; restrict is not possible here). Code that uses struct or union types must use bizarre things like 1 ? (X) : (X) to enforce lvalue conversion.
    #define blaOther(X) _Generic((X), \
        char: blub, char const: blub, ..., \
        short: ..., \
        default: _Generic(1 ? (X) : (X), struct toto: ... ))

    #define bla(X) _Generic((X)+0, ... something ... , \
        default: blaOther(X))
&
.
    #define blu(X) _Generic((X), \
        char const: blub, \
        char[4]: blob, \
        ...)
has to be changed to something like
    #define blu(X) _Generic(&(X), \
        char const*: blub, \
        char(*)[4]: blob, \
        ...)
That is, each individual type selection has to be transformed, and the syntactical change that is to be applied is no simple textual replacement.
Since C implementations have already taken different paths for this feature, applications that use _Generic should take care to remain in the intersection of these two interpretations. A certain number of design questions should be answered when implementing a type-generic macro:
struct
types?
This is, e.g., the case for the C library interfaces in <tgmath.h>. If we know that the possible type of the argument is restricted in such a way, the easiest is to apply the unary plus operator +, as in
    #define F(X) _Generic(+(X), \
        default: doubleFunc, \
        int: intFunc, \
        ... \
        _Complex long double: cldoubleFunc)(X)

    #define fabs(X) _Generic(+(X), \
        default: fabs, \
        float: fabsf, \
        long double: fabsl)(X)
This + sign ensures an lvalue to rvalue conversion, and also that the macro errors out at compilation time for pointer types or arrays. It also forcibly promotes narrow integer types, usually to int. In the latter case of fabs, all integer types will map to the double version of the function, and the argument will eventually be converted to double before the call is made.
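As an illustration of the unary plus technique, here is a self-contained sketch that avoids the math library. The names abs_d, abs_f, abs_l and my_abs are hypothetical stand-ins for fabs, fabsf, fabsl and the type-generic macro; they are assumptions made for this example only.

```c
/* Hypothetical stand-ins for fabs, fabsf and fabsl. */
static double      abs_d(double x)      { return x < 0 ? -x : x; }
static float       abs_f(float x)       { return x < 0 ? -x : x; }
static long double abs_l(long double x) { return x < 0 ? -x : x; }

/* Unary + forces lvalue conversion and integer promotion of the
   controlling expression, so qualifiers on X play no role and all
   integer arguments fall through to the default (double) case. */
#define my_abs(X) _Generic(+(X), \
    default:     abs_d,          \
    float:       abs_f,          \
    long double: abs_l)(X)
```

With these definitions, my_abs(-3) and my_abs applied to a short const object both call abs_d, whereas my_abs(-1.5f) calls abs_f. A pointer or array argument is rejected at compilation time, because unary + requires an arithmetic operand.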
If we also want to capture pointer types and convert arrays to pointers, we should use +0.
    #define F(X) _Generic((X)+0, \
        default: doubleFunc, \
        char*: stringFunc, \
        char const*: stringFunc, \
        int: intFunc, \
        ... \
        _Complex long double: cldoubleFunc)(X)
This binary + ensures that any array is first converted to a pointer; the properties of 0 ensure that this constant works well with all the types that are to be captured here. It also forcibly promotes narrow integer types, usually to int.
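A small sketch of the +0 technique follows. The macro name ARG_KIND and the strings it yields are hypothetical and chosen for this example only; the point is that arrays, string literals and pointers all arrive at the selection as pointer rvalues, and narrow integers arrive as int.

```c
/* Hypothetical classifier: (X)+0 converts arrays to pointers and
   promotes narrow integer types before the type is inspected. */
#define ARG_KIND(X) _Generic((X)+0, \
    char*:       "string",          \
    char const*: "string",          \
    int:         "int",             \
    default:     "other")
```

For a char buf[8], ARG_KIND(buf) and ARG_KIND("bla") both select the char* association, and ARG_KIND((short)7) selects int, under either interpretation of the controlling expression.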
If we know that a macro will only be used for array and pointer types, we can use the [] operator:
    #define F(X) _Generic(&((X)[0]), \
        char*: stringFunc, \
        char const*: stringFunc, \
        wchar_t*: wcsFunc, \
        ... \
        )(X)
This operator only applies to array or pointer types and would error if presented with any integer type.
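The [] technique can be sketched as follows. The macro name STR_IS_WIDE is hypothetical, and the example assumes that wchar_t is a type distinct from char, as it is on common platforms.

```c
#include <stddef.h>  /* wchar_t */

/* Hypothetical classifier: &((X)[0]) always has pointer type,
   whether X is an array, a string literal or a pointer.
   Yields 0 for narrow strings and 1 for wide strings. */
#define STR_IS_WIDE(X) _Generic(&((X)[0]), \
    char*:          0,                     \
    char const*:    0,                     \
    wchar_t*:       1,                     \
    wchar_t const*: 1)
```

Here STR_IS_WIDE("abc") yields 0 and STR_IS_WIDE(L"abc") yields 1; an integer argument such as STR_IS_WIDE(5) does not compile, since [] cannot be applied to two integer operands.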
If we want a macro that selects differently according to type qualification or according to different array size, we can use the & operator:
    #define F(X) _Generic(&(X), \
        char**: stringFunc, \
        char(*)[4]: string4Func, \
        char const**: stringFunc, \
        char const(*)[4]: string4Func, \
        wchar_t**: wcsFunc, \
        ... \
        )(X)
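The following sketch shows how the & technique keeps qualifiers and array sizes visible to the selection. The macro name DESCRIBE and the strings it yields are hypothetical; the argument must be an lvalue whose address can be taken, so register objects are excluded.

```c
/* Hypothetical classifier: taking the address moves the qualifiers
   and the array size of X into the pointed-to type, where neither
   lvalue conversion nor array to pointer conversion can remove them. */
#define DESCRIBE(X) _Generic(&(X),      \
    char(*)[4]:       "char[4]",        \
    char const(*)[4]: "char const[4]",  \
    char**:           "char*",          \
    char const**:     "char const*",    \
    default:          "other")
```

For char a4[4] and char const c4[4], DESCRIBE(a4) and DESCRIBE(c4) select different associations, and an array of a different size such as char a5[5] falls through to the default.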
The above discussion describes what can be read from the text of C11 alone, and not the intent of the committee. I think that if the committee had wanted choice 2, the standard text would not look much different from what we have now. Since the intent of the committee to go for choice 1 is also not very clear from any additional text (e.g., the minutes of the meetings), I think the reading of choice 2 should be the preferred one.
Suggested Technical Corrigendum (any choice)
Amend the list in footnote 121 for objects with register storage class. Change
Thus, the only operators that can be applied to an array declared with storage-class specifier register are sizeof and _Alignof.
to
Thus, an identifier with array type and declared with storage-class specifier register may only appear in primary expressions and as operand to sizeof and _Alignof.
Suggested Technical Corrigendum (Choice 2)
Change 6.5.1.1 p3, first sentence
Add _Generic to the exception list in 6.3.2.1 p3 to make it clear that array to pointer conversion applies to none of the controlling or association expressions if they are lvalues of array type.
_Generic primary expression, or is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
Also add a forward reference to _Generic in 6.3.2.
Suggested Technical Corrigendum (Choice 1)
If the intent of the committee had been choice 1 or similar, bigger changes to the standard would be indicated. I only list some of the areas that would need changes:
Move _Generic from primary expressions to a proper subsection, and rename the feature to _Generic operator.
Also, add _Generic to the exception list in 6.3.2.1 p3 to make it clear that array to pointer conversion applies to none of the association expressions if they are lvalues of array type.
_Generic expression, or is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
Suggested Technical Corrigendum (Status quo)
A third possibility would be to leave this leeway to implementations. I strongly object to that, but if the committee decides so, I would suggest adding a phrase to 6.5.1.1 p3 like:
... in the default generic association. Whether or not the type of the controlling expression is determined as if any of the conversions described in Section 6.3 are applied is implementation-defined. None of the expressions ...