December 4th, 2020
Document: n2619
Previous Revisions: None
Audience: WG14
Proposal Category: New Features
Target Audience: General Developers, Compiler/Tooling Developers
Latest Revision: https://thephd.github.io/_vendor/future_cxx/papers/source/C%20-%20typeof.html
Getting the type of an expression in Standard C code.
_Typeof and _Remove_quals.
decltype identifier for this and other compatibility issues.
typeof is an extension featured in many implementations of the C standard to get the type of an expression. It works similarly to sizeof, which processes its operand in an “unevaluated context” to determine the final type and thus produce a size. typeof stops before producing a byte size and instead just yields a type name, usable in all the places a type currently is in the C grammar.
There are many uses for typeof that have come up over the intervening decades since its first introduction in a few compilers, most notably GCC. It can, for example, help produce a type-safe generic printing function that even has room for user extension (see example implementation). It can also help write code that uses the expansion of a macro expression as the return type for a function, or be used within a macro itself to correctly cast to the type of a specific computation’s result (for width and precision purposes). The use cases are vast, and many people have been locking themselves into vendor-specific extensions to get them. This keeps their code out of other compilers (for example, Microsoft’s Visual C Compiler) and weakens the ISO C ecosystem overall.
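As a minimal sketch of the macro use cases described above, the following assumes the GCC/Clang extension spelling `__typeof__` as a stand-in for the keyword this paper proposes. The macro and function names (`SWAP`, `CAST_TO_TYPE_OF`, `narrow_to_short`) are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Assumption: __typeof__ is the GCC/Clang extension spelling; the keyword
   proposed in this paper would take its place in standard code. */

/* Swap two lvalues of the same type without ever naming that type. */
#define SWAP(a, b)                      \
    do {                                \
        __typeof__(a) swap_tmp_ = (a);  \
        (a) = (b);                      \
        (b) = swap_tmp_;                \
    } while (0)

/* Cast a computed value to the type of another expression, so the width
   and precision of the result follow the target. */
#define CAST_TO_TYPE_OF(target, value) ((__typeof__(target))(value))

/* Hypothetical helper: convert a double to whatever type `width` has. */
static inline short narrow_to_short(double d) {
    short width = 0;
    return CAST_TO_TYPE_OF(width, d);
}
```

Neither macro has to be rewritten when the types of its arguments change, which is the main attraction of this style.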
Every implementation in existence since C89 has an implementation of typeof. Some compilers (GCC, Clang, EDG, tcc, and many, many more) expose this with the implementation extension typeof. But the Standard already requires typeof to exist. Notably, with emphasis (not found in the standard) added:
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand.
– N2573, Programming Languages C - Working Draft, §6.5.3.4 The sizeof and _Alignof operators, Semantics
Any implementation that can process sizeof("foo") is already doing sizeof(typeof("foo")) internally. This feature is the most “existing practice”-iest feature to be proposed to the C Standard, possibly in the entire history of the C standard. The feature was also mentioned in an “extension round up” paper that went over the state of C Extensions in 2007[1]. typeof was also considered an important extension during the discussion of that paper, but nobody previously brought forth a paper to make it a reality[2].
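The equivalence claimed above can be checked at translation time. This is a sketch only: it assumes the GCC/Clang extension spelling `__typeof__` in place of the proposed keyword, and `size_via_typeof` is a hypothetical helper name.

```c
#include <assert.h>

/* Assumption: __typeof__ (GCC/Clang extension) stands in for the proposed
   keyword. */

/* sizeof on an expression and sizeof on that expression's type agree. */
static_assert(sizeof("foo") == sizeof(__typeof__("foo")),
              "sizeof(expr) matches sizeof(typeof(expr))");

/* The string literal keeps its array type: three characters plus the
   terminating null. */
static_assert(sizeof(__typeof__("foo")) == sizeof(char[4]),
              "string literal has type char[4]");

/* Hypothetical helper: the operand 1 + 1L is never evaluated; only its
   type (long) matters. */
static inline unsigned long size_via_typeof(void) {
    return (unsigned long)sizeof(__typeof__(1 + 1L));
}
```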
Notes:
Putting a normal or VLA-type computation in the operand results in an idempotent type computation that simply yields that type in most implementations that support the feature. If the compiler supports Variable Length Arrays and its __typeof is similar to that of GCC, Clang, tcc, and others, then VLAs are already supported with these semantics. These semantics also match how sizeof would behave (computing the expression or having an internal placeholder “VLA” type), so we propagate that same ability in an identical manner.
Notably, this is how current implementations evaluate the semantics as well. Note that the standard says it is unspecified whether computation done for Variably Modified Types, side effects included, is actually performed, so there are no additional guarantees about the evaluation for such types.
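A small sketch of the VLA behavior described above, assuming a compiler that supports Variable Length Arrays and the GCC/Clang extension spelling `__typeof__`. The function name `vla_bytes` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: __typeof__ (GCC/Clang extension) stands in for the proposed
   keyword, and the implementation supports Variable Length Arrays. */
size_t vla_bytes(int n) {
    char buffer[n + 3];          /* variable length array */
    __typeof__(buffer) copy;     /* same variably modified type: char[n + 3] */
    (void)buffer;
    return sizeof(copy);         /* size is computed at execution time */
}
```

The declaration of `copy` reuses the array bound of `buffer` without restating the size expression.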
C++ has a feature it calls decltype(...), which serves most of the same purpose. “Most” is because it has a subtle difference which would wreak havoc on C code if it was employed in shared header code:
int value = 20;
#define GET_TARGET_VALUE (value)
inline decltype(GET_TARGET_VALUE) g () {
    return value;
}
int main () {
    int& r = g();
    return r;
}
The return type of g would be int& in C++, and int in C, due to the parentheses in the expression. Other expressions, such as array indexing and pointer dereferencing, also have this same issue. Macros in both languages frequently see extra parentheses employed around expressions to prevent mixing of precedence or other shenanigans from token-based macro expansion and subsequent language parsing; this would be a footgun of large proportions for C and C++ users, and create a divergence in standard use that would rise to the level of a liaison issue that may become unfixable. This is also part of the reason why decltype was given that keyword in C++, and not typeof: they did not want this kind of subtle and brutal change to afflict C and C++ code. typeof does not have this problem because – if a Sister Paper ever proposes it for C++ – it will have identical behavior to std::remove_reference_t<decltype(T)>.
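The C side of this comparison can be sketched as follows, assuming the GCC/Clang extension spelling `__typeof__` in place of the proposed keyword. The extra parentheses from the macro do not change the resulting type, which the `_Generic` check confirms at translation time.

```c
#include <assert.h>

/* Assumption: __typeof__ (GCC/Clang extension) stands in for the proposed
   keyword. */
int value = 20;
#define GET_TARGET_VALUE (value)

/* The parenthesized operand still yields plain int as the return type. */
static inline __typeof__(GET_TARGET_VALUE) g(void) {
    return value;
}

/* A pointer to g has type int (*)(void), so the return type is int. */
static_assert(_Generic(&g, int (*)(void): 1, default: 0),
              "g returns int");
```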
This was also addressed when C++ was itself trying to introduce decltype and competing with typeof in WG21 for C++[3].
A similar feature should be proposed in C++, albeit it will likely take the keyword name typeof rather than _Typeof. This paper intends to have a similar paper brought before the C++ Committee – WG21 – through its Liaison Study Group, if this paper is successful.
There is some discussion about what happens with qualifiers, both standard and implementation-defined. For example, “Named Address Space” qualifiers are subject to issues with GCC’s typeof extension, as shown here[4]. The intention of one of the GCC maintainers from that thread is:
Well, I think we should fix typeof to not retain the address space. It’s probably our implementation detail of having those in TYPE_QUALS that exposes the issue… — Richard Biener, GCC Maintainer, November 5th, 2020
There is also some disagreement between implementations about which qualifiers are worth keeping, particularly with respect to _Atomic. Therefore, typeof as proposed does not strip qualifiers from the computed type result. The reason for this is that a user can add specifiers and qualifications to a type, but cannot take them away once they are part of the expression. For example, consider the specification of <complex.h> that contains macro-provided constants like _Imaginary_I. These constants have the type const float _Imaginary: should all typeof(_Imaginary_I) expressions therefore result in a const float _Imaginary, or a float _Imaginary? What about volatile? And so on, and so forth.
There is an argument for stripping all type qualifiers (_Atomic, const, restrict, and volatile) from the final type expression: they can be added back by the programmer easily. However, the opposite is not true: once the qualifiers are stripped, the original ones cannot be recovered, nor can macros be written that take those qualifiers as parameters and re-apply them. This does leave some room to be desired: some folk may want to deliberately propagate the const-ness, volatile-ness, or _Atomic-ness of an expression to its end users.
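The two directions discussed above (preserving a qualifier, and adding one) can be sketched as follows. This assumes the GCC/Clang extension spelling `__typeof__`, which preserves const on an lvalue operand; the macro `POINTS_TO_CONST_INT` and both function names are hypothetical.

```c
#include <assert.h>

/* Assumption: __typeof__ (GCC/Clang extension) stands in for the
   qualifier-preserving keyword proposed in this paper. */
const int meow = 1;

/* Yields 1 if the pointer's target is const-qualified int, 0 for plain int. */
#define POINTS_TO_CONST_INT(p) _Generic((p), const int *: 1, int *: 0)

/* The const on meow is carried through to the new declaration. */
static inline int typeof_keeps_const(void) {
    __typeof__(meow) copy = 2;                 /* const int */
    return POINTS_TO_CONST_INT(&copy);
}

/* A qualifier that was not on the operand can always be added afterwards. */
static inline int qualifier_can_be_added(void) {
    int plain = 3;
    const __typeof__(plain) frozen = plain;    /* const int */
    return POINTS_TO_CONST_INT(&frozen) && !POINTS_TO_CONST_INT(&plain);
}
```

Removing a qualifier has no equally direct spelling in this sketch.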
Originally, the idea of a _Typeof and an _Unqual_typeof was explored. This was a tempting direction but ultimately unsuitable, as it duplicated functionality with a slight caveat and did not have a targeted purpose. A much better set of names for the functionality is _Typeof and _Remove_quals. _Typeof is an all-qualifier-preserving type reproduction (or pass-through if a type is given) of the expression. It suitably envelops the total space of existing practice. The only reason _Unqual_typeof would exist is to… well, remove qualifiers. It only makes sense to just name it appropriately, then: _Remove_quals is the name! It has identical functionality to the previously talked about _Unqual_typeof but with the massive benefit that:
This means that we need not entertain the idea of a special keyword and can simply name it remove_quals directly in the code instead, saving ourselves a massive debate about what should and should not be a keyword.
Separately, we should consider a Macro Programming facility for C that can address larger questions. This paper strives to focus on the material gains from existing practice and the pitfalls of said existing practice. Therefore, this paper proposes only _Typeof and _Remove_quals.
After this paper is handled, further research should be given to handling qualifiers, function types, and arrays in Macros for generic programming. Further research should also be done in the area of conversions, which may aid in ABI issues from, e.g., certain function names creating ABI problems (c.f. the entire intmax_t discussion[5], currently in deadlock). _ExtInt[6] also brings up interesting consequences for _Generic, with respect to how to match various types of _ExtInt without having to write out a truly enormous list of explicit bit size associations, from 1 to some large N. Answering those questions may prove useful, but this paper does not explore any further than the existing practice.
There is a choice that affects the wording here, to be voted on during the Virtual March WG14 - Programming Languages C meeting, contained below:
There are 3 options for names. We have wording for the options using find-and-replace on the TYPEOF_KEYWORD_TOKEN. The option that provides the most consensus will be what is chosen:
Option 1: _Typeof keyword, <stdtypeof.h> header
This is the hyper conservative option that uses a _Typeof keyword plus <stdtypeof.h> to get access to the convenient spelling. It prevents implementations that have already settled on the typeof() keyword in their extension modes from having to warn users of breakage or otherwise deal with that problem. Many have raised issues with this, annoyed at the constant spelling of keywords in fundamentally awkward and strange ways while requiring headers to fix up the messed up usages. This is consistent with other new keywords introduced in the Standard to avoid breakage at all costs, but suffers from strong lamentations in needing a header to access a common spelling. This is the authors’ middle of the road option.
Option 2: typeof keyword
This is the relatively aggressive (but still somewhat milquetoast) option. It takes over the extension that is used in non-conforming C modes in a few compilers, such as XL C and GCC. Maintainers/implementers from GCC and Clang have noted their approval for this option, but XL C maintainers and implementers are less enthused. Some folks are against this change because there are “bugs” in the implementations where some qualifiers are preserved, but other implementation-defined qualifiers are not. An argument can be made that implementations can continue to do whatever they want with implementation-defined qualifiers as far as typeof is concerned, since all of them preserve almost all of the pre-existing standard qualifiers present. This is the authors’ overwhelmingly strong preference.
Option 3: novel keyword name
This uses a completely novel name to avoid the problem altogether. These names take no interesting space from users or implementers and it is the safest option, though it risks obscurity in what is a commonly anticipated feature. Names for this include:
- qual_typeof
- qualified_typeof
- decltypeof (not the same as decltype)
Choosing this option means picking one of these novel keywords and substituting it for the TYPEOF_KEYWORD_TOKEN spelling in the wording below. This is the authors’ least favorite option.
The following wording is relative to N2573.
Except when it is the operand of the sizeof, or typeof operators, or the unary & operator, or is a string literal used to initialize an array, an expression that has type “array of type” is converted to an expression with type “pointer to type” that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
A function designator is an expression that has function type. Except when it is the operand of the sizeof operator, a typeof operator,68) or the unary & operator, a function designator with type “function returning type” is converted to an expression that has type “pointer to function returning type”.
68) Because this conversion does not occur, the operand of the sizeof and typeof operators remains a function designator and violates the constraints in 6.5.3.4.
Add a keyword to the §6.4.1 Keywords:
_Thread_local
TYPEOF_KEYWORD_TOKEN
remove_quals
An integer constant expression125) shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, _Alignof expressions, and floating constants that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the typeof operators, sizeof operator, or _Alignof operator.
…
An arithmetic constant expression shall have arithmetic type and shall only have operands that are integer constants, floating constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, and _Alignof expressions. Cast operators in an arithmetic constant expression shall only convert arithmetic types to arithmetic types, except as part of an operand to the typeof operators, sizeof operator, or _Alignof operator.
type-specifier:
void
…
typedef-name
typeof-specifier
…
- enum specifier
- typedef name
- typeof specifier
Specifiers for structures, unions, enumerations, atomic types, and typeof specifiers are discussed in 6.7.2.1 through 6.7.2.5. Declarations of typedef names are discussed in 6.7.8. The characteristics of the other types are discussed in 6.2.5.
133) As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int, then it is implementation-defined whether the bit-field is signed or unsigned. This includes an int type specifier produced by the use of the typeof specifier (6.7.2.5).
§6.7.2.5 The Typeof specifiers
Syntax
typeof-specifier:
TYPEOF_KEYWORD_TOKEN ( expression )
TYPEOF_KEYWORD_TOKEN ( type-name )
remove_quals ( expression )
remove_quals ( type-name )
Constraints
The typeof-specifier shall not be applied to an expression that designates a bit-field member.
Semantics
The typeof-specifier applies the TYPEOF_KEYWORD_TOKEN or remove_quals operator (collectively, the typeof operators) to a unary-expression (6.5) or a type-specifier. If the typeof operators are applied to an expression, they yield the type-name representing the type of their operand110). Otherwise, they produce the type-name with any nested typeof-specifier evaluated111). If the type of the operand is a variably modified type, the operand is evaluated; otherwise, the operand is not evaluated.
All qualifiers (6.7.3) on the type from the result of a remove_quals operation are removed, including _Atomic. Otherwise, for TYPEOF_KEYWORD_TOKEN operations, all qualifiers are preserved.
110) When applied to a parameter declared to have array or function type, the TYPEOF_KEYWORD_TOKEN operator yields the adjusted (pointer) type (see 6.9.1).
111) If the operand is a typeof operator, the operand will be evaluated before evaluating the current typeof operation. This happens recursively until a typeof-specifier is no longer the operand.
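The footnote about parameters with array type can be illustrated with a short sketch. It assumes the GCC/Clang extension spelling `__typeof__` in place of TYPEOF_KEYWORD_TOKEN; the function name `param_typeof_size` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: __typeof__ (GCC/Clang extension) stands in for
   TYPEOF_KEYWORD_TOKEN. */
size_t param_typeof_size(int numbers[8]) {
    /* The parameter was adjusted to int *, so the typeof operator yields
       the pointer type rather than int[8]. */
    __typeof__(numbers) alias = numbers;
    (void)alias;
    return sizeof(__typeof__(numbers));
}
```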
5 EXAMPLE 1 Type of an expression.
The following program:
TYPEOF_KEYWORD_TOKEN(1+1) main () {
    return 0;
}
is equivalent to this program:
int main() {
    return 0;
}
6 EXAMPLE 2 Types and qualifiers.
The following program:
const _Atomic int purr = 0;
const int meow = 1;
const char* const mew[] = {
    "aardvark",
    "bluejay",
    "catte",
};
TYPEOF_KEYWORD_TOKEN(purr) main (int argc, char* argv[]) {
    remove_quals(purr) plain_purr;
    TYPEOF_KEYWORD_TOKEN(_Atomic TYPEOF_KEYWORD_TOKEN(meow)) atomic_meow;
    TYPEOF_KEYWORD_TOKEN(mew) mew_array;
    remove_quals(mew) mew2_array;
    return 0;
}
is equivalent to this program:
const _Atomic int purr = 0;
const int meow = 1;
const char* const mew[] = {
    "aardvark",
    "bluejay",
    "catte",
};
int main (int argc, char* argv[]) {
    int plain_purr;
    const _Atomic int atomic_meow;
    const char* const mew_array[3];
    const char* mew2_array[3];
    return 0;
}
7 EXAMPLE 3 Equivalence of sizeof and typeof.
int main (int argc, char* argv[]) {
    // this program has no constraint violations
    _Static_assert(sizeof(TYPEOF_KEYWORD_TOKEN('p')) == sizeof(int));
    _Static_assert(sizeof(TYPEOF_KEYWORD_TOKEN('p')) == sizeof('p'));
    _Static_assert(sizeof(TYPEOF_KEYWORD_TOKEN((char)'p')) == sizeof(char));
    _Static_assert(sizeof(TYPEOF_KEYWORD_TOKEN((char)'p')) == sizeof((char)'p'));
    _Static_assert(sizeof(TYPEOF_KEYWORD_TOKEN("meow")) == sizeof(char[5]));
    _Static_assert(sizeof(TYPEOF_KEYWORD_TOKEN("meow")) == sizeof("meow"));
    _Static_assert(sizeof(TYPEOF_KEYWORD_TOKEN(argc)) == sizeof(int));
    _Static_assert(sizeof(TYPEOF_KEYWORD_TOKEN(argc)) == sizeof(argc));
    _Static_assert(sizeof(TYPEOF_KEYWORD_TOKEN(argv)) == sizeof(const char**));
    _Static_assert(sizeof(TYPEOF_KEYWORD_TOKEN(argv)) == sizeof(argv));
    _Static_assert(sizeof(remove_quals('p')) == sizeof(int));
    _Static_assert(sizeof(remove_quals('p')) == sizeof('p'));
    _Static_assert(sizeof(remove_quals((char)'p')) == sizeof(char));
    _Static_assert(sizeof(remove_quals((char)'p')) == sizeof((char)'p'));
    _Static_assert(sizeof(remove_quals("meow")) == sizeof(char[5]));
    _Static_assert(sizeof(remove_quals("meow")) == sizeof("meow"));
    _Static_assert(sizeof(remove_quals(argc)) == sizeof(int));
    _Static_assert(sizeof(remove_quals(argc)) == sizeof(argc));
    _Static_assert(sizeof(remove_quals(argv)) == sizeof(const char**));
    _Static_assert(sizeof(remove_quals(argv)) == sizeof(argv));
    return 0;
}
8 EXAMPLE 4 Nested TYPEOF_KEYWORD_TOKEN(...).
The following program:
int main (int argc, char*[]) {
    float val = 6.0f;
    return (TYPEOF_KEYWORD_TOKEN(remove_quals(TYPEOF_KEYWORD_TOKEN(argc))))val;
}
is equivalent to this program:
int main (int argc, char*[]) {
    float val = 6.0f;
    return (int)val;
}
9 EXAMPLE 5 Variable Length Arrays and typeof operators.
#include <stddef.h>
size_t vla_size (int n) {
    typedef char vla_type[n + 3]; // variable length array
    vla_type b;
    return sizeof(
        remove_quals(b) // execution-time sizeof, translation-time typeof operation
    );
}
int main () {
    return (int)vla_size(10); // vla_size returns 13
}
10 EXAMPLE 6 Nested typeof operators, arrays, and pointers.
int main () {
    TYPEOF_KEYWORD_TOKEN(TYPEOF_KEYWORD_TOKEN(const char*)[4]) y = {
        "a", "b", "c", "d" // 4-element array of "pointer to const char"
    };
    return 0;
}
If the same qualifier appears more than once in the same specifier-qualifier list or as declaration specifiers, either directly, via one or more typeof specifiers, or via one or more typedefs, the behavior is the same as if it appeared only once. If other qualifiers appear along with the _Atomic qualifier the resulting type is the so-qualified atomic type.
If the size is an expression that is not an integer constant expression: if it occurs in a declaration at function prototype scope, it is treated as if it were replaced by *; otherwise, each time it is evaluated it shall have a value greater than zero. The size of each instance of a variable length array type does not change during its lifetime. Where a size expression is part of the operand of a typeof or sizeof operator and changing the value of the size expression would not affect the result of the operator, it is unspecified whether or not the size expression is evaluated. Where a size expression is part of the operand of an _Alignof operator, that expression is not evaluated.
There shall be no more than one external definition for each identifier declared with internal linkage in a translation unit. Moreover, if an identifier declared with internal linkage is used in an expression there shall be exactly one external definition for the identifier in the translation unit, unless it is:
- part of the operand of a sizeof operator whose result is an integer constant,
- part of the operand of a _Alignof operator whose result is an integer constant,
- or, part of the operand of any typeof operator whose result is not a variably modified type.
…
An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as a part of the operand of a typeof operator whose result is not a variably modified type, or a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.173)
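The external-definition wording above can be sketched as follows: an identifier with external linkage that appears only inside a typeof operand (whose result is not a variably modified type) does not need a definition anywhere in the program. The sketch assumes the GCC/Clang extension spelling `__typeof__`; the names `never_defined` and `uses_only_the_type` are hypothetical.

```c
#include <assert.h>

/* Assumption: __typeof__ (GCC/Clang extension) stands in for the proposed
   keyword. */

/* Declared with external linkage but intentionally never defined. */
extern int never_defined;

int uses_only_the_type(void) {
    /* The operand is not evaluated, so only its type (int) is used and the
       object itself is never referenced. */
    __typeof__(never_defined) local = 42;
    return local;
}
```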
<stdtypeof.h> (IF AND ONLY IF: Option 1 is not chosen):
The header <stdtypeof.h> defines two macros.
The macro typeof expands to TYPEOF_KEYWORD_TOKEN.
The macro STDC_TYPEOF_IS_DEFINED is suitable for use in #if preprocessing directives. It expands to the integer constant 1.
[1] Nick Stoughton. Potential Extensions For Inclusion In a Revision of ISO/IEC 9899. ISO/IEC SC22 WG14 - Programming Languages C. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1229.pdf
[2] ISO/IEC JTC1 SC22 WG14. Meeting Minutes April 2007. ISO/IEC SC22 WG14 - Programming Languages C. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1267.pdf
[3] Jaakko Järvi and Bjarne Stroustrup. Decltype and auto (revision 3). ISO/IEC JTC1 SC22 WG21 - Programming Languages C++. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1607.pdf
[4] Uros Bizjak. typeof and operands in named address spaces. GNU Compiler Collection. https://gcc.gnu.org/pipermail/gcc/2020-November/234119.html
[5] Martin Uecker. intmax_t, again. ISO/IEC JTC1 SC22 WG14 - Programming Languages C. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2498.pdf
[6] Aaron Ballman, Melanie Blower, Tommy Hoffner, and Erich Keane. Adding a Fundamental Type for N-bit integers. ISO/IEC JTC1 SC22 WG14 - Programming Languages C. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2590.pdf