May 15th, 2021
Document: n2724
Previous Revisions: None
Audience: WG14
Proposal Category: New Features
Target Audience: General Developers, Compiler/Tooling Developers
Latest Revision: https://thephd.dev/_vendor/future_cxx/papers/C%20-%20typeof.html
Getting the type of an expression in Standard C code.
typeof
in the Appendix.
Yes/No/Abstain to the given question / option.
Keyword Options:
- Use _Typeof keyword, with <stdtypeof.h> header. 6/7/5
- Use typeof keyword, no header. 16/2/1
- Use some other spelling (qualified_typeof, or similar). 1/14/3
This was very strong direction to use the keywords directly, and not use an alternate spelling.
On the subject of using Expressions / types within typeof/remove_quals.
- typeof with type names going in, in addition to expressions (voting "No" means no type names, just expressions). 17/1/4
- remove_quals applied to expressions, in addition to type names (voting "No" means no expressions are allowed). 11/2/5
This was very strong direction to allow both types and expressions in both constructs.
remove_quals
spelling.remove_quals
(to match the other declarations)_Typeof
(or appropriate flavor) and _Remove_quals
.decltype
identifier for this and other compatibility issues.
typeof is an extension featured in many implementations of the C standard to get the type of an expression. It works similarly to sizeof, which processes the expression in an "unevaluated context" to determine its final type, and thus produce a size. typeof stops before producing a byte size and instead just yields a type name, usable in all the places a type currently is in the C grammar.
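The parallel with sizeof can be sketched as follows. This is only an illustration under stated assumptions: it relies on a GCC- or Clang-compatible compiler and uses their extension spelling __typeof__ in place of the proposed keyword, and next_id and calls_made_by_typeof are hypothetical names introduced for this sketch.

```c
#include <assert.h>

/* Assumption: GCC/Clang-compatible compiler; __typeof__ is the extension
   spelling standing in for the proposed typeof keyword. */

static int calls = 0;

/* Hypothetical helper with a visible side effect. */
long next_id(void) {
    return ++calls;
}

/* Returns how many times next_id() ran while only its type was queried. */
int calls_made_by_typeof(void) {
    __typeof__(next_id()) id = 7;             /* id is declared as long */
    assert(_Generic(id, long: 1, default: 0));
    assert(sizeof(id) == sizeof(next_id()));  /* sizeof does not call it either */
    return calls;
}
```

Neither operator evaluates the call here, so the counter stays untouched until next_id() is invoked for real.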
There are many uses for typeof that have come up over the intervening decades since its first introduction in a few compilers, most notably GCC. It can, for example, help produce a type-safe generic printing function that even has room for user extension (see example implementation). It can also help write code that uses the expansion of a macro expression as the return type for a function, or be used within a macro itself to correctly cast to the type of a specific computation's result (for width and precision purposes). The use cases are numerous, and many people have been locking themselves into vendor-specific extensions. This keeps their code out of other compilers (for example, Microsoft's Visual C Compiler) and weakens the ISO C ecosystem overall.
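A few of these uses can be sketched briefly. The sketch assumes a GCC- or Clang-compatible compiler and uses the extension spelling __typeof__; the macro and function names (SWAP, HALF_STEP, half_step, CAST_TO_SUM_TYPE, swap_demo) are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Assumption: GCC/Clang-compatible compiler; __typeof__ stands in for typeof. */

/* Type-generic swap: the temporary takes the type of the first argument. */
#define SWAP(a, b)                    \
    do {                              \
        __typeof__(a) swap_tmp = (a); \
        (a) = (b);                    \
        (b) = swap_tmp;               \
    } while (0)

/* A macro whose expansion is an expression of type float. */
#define HALF_STEP (0.25f * 2)

/* The return type follows whatever type the macro expands to. */
__typeof__(HALF_STEP) half_step(void) {
    return HALF_STEP;
}

/* Casts x to the type of the computation (a) + (b) without naming that type. */
#define CAST_TO_SUM_TYPE(x, a, b) ((__typeof__((a) + (b)))(x))

/* Returns 1 if the swap exchanged both values. */
int swap_demo(void) {
    double x = 1.5, y = 2.5;
    SWAP(x, y);
    assert(_Generic(half_step(), float: 1, default: 0));
    return x == 2.5 && y == 1.5;
}
```

If HALF_STEP were later changed to a double expression, half_step would follow without any edit to its declaration.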
Every implementation in existence since C89 has an implementation of typeof. Some compilers (GCC, Clang, EDG, tcc, and many, many more) expose this with the implementation extension typeof. But the Standard already requires typeof to exist. Notably, with emphasis (not found in the standard) added:
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. — N2596, Programming Languages C - Working Draft, §6.5.3.4 The sizeof and _Alignof operators, Semantics
Any implementation that can process sizeof("foo") is already doing sizeof(typeof("foo")) internally. This feature is the most "existing practice"-iest feature to be proposed to the C Standard, possibly in the entire history of the C standard. The feature was also mentioned in an "extension round up" paper that went over the state of C Extensions in 2007 [1]. typeof was also considered an important extension during the discussion of that paper, but nobody previously brought forth a paper to make it a reality [2].
Passing a normal or VLA type to typeof results in an idempotent type computation that simply yields that type in most implementations that support the feature. If the compiler supports Variable Length Arrays and its __typeof is similar to that of GCC, Clang, tcc, and others, then VLAs are already supported with these semantics. These semantics also match how sizeof would behave (computing the expression or having an internal placeholder "VLA" type), so we propagate that same ability in an identical manner.
Notably, this is how current implementations evaluate the semantics as well. Note that, per the standard, whether any computation with side effects is actually performed for Variably Modified Types is unspecified, so there are no additional guarantees about the evaluation for such types.
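A minimal sketch of this behavior, assuming a GCC- or Clang-compatible compiler with Variable Length Array support and the extension spelling __typeof__; vla_typeof_bytes is a hypothetical function written for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC/Clang-compatible compiler with VLA support;
   __typeof__ stands in for the proposed typeof keyword. */

/* Returns the combined size in bytes of two arrays declared via __typeof__. */
size_t vla_typeof_bytes(int n) {
    double samples[n];                 /* variable length array */
    __typeof__(samples) mirror;        /* expression operand: double[n] */
    __typeof__(double[n + 1]) grown;   /* VLA type name as the operand */
    assert(sizeof(mirror) == sizeof(samples));
    return sizeof(mirror) + sizeof(grown);
}
```

As with sizeof on a VLA, the sizes here are only known at execution time; the sketch avoids side effects in the size expressions because their evaluation is not guaranteed.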
The goal was to be compatible with sizeof(...), which takes both expressions and types. Existing __typeof(...) implementations also make this design choice. We see this as a good thing, since it is compatible with the usage of typeof(...) extensions in existing macros and code, where programmers occasionally pass type names directly into these macros, knowing in advance that the argument will be used exclusively in __typeof(...) or sizeof(...) operations.
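The following sketch shows one macro accepting either form. It assumes a GCC- or Clang-compatible compiler and the extension spelling __typeof__; DECLARE_LIKE and accepts_types_and_expressions are hypothetical names used only here.

```c
#include <assert.h>

/* Assumption: GCC/Clang-compatible compiler; __typeof__ stands in for typeof. */

/* Declares `name` with the type denoted by a type name or by an expression. */
#define DECLARE_LIKE(name, type_or_expr) __typeof__(type_or_expr) name

/* Returns 1 if both declarations received the expected types. */
int accepts_types_and_expressions(void) {
    short sample = 3;
    DECLARE_LIKE(from_type, unsigned long) = 1UL;  /* operand is a type name */
    DECLARE_LIKE(from_expr, sample) = sample;      /* operand is an expression */
    assert(sizeof(from_expr) == sizeof(__typeof__(sample)));
    return _Generic(from_type, unsigned long: 1, default: 0)
        && _Generic(from_expr, short: 1, default: 0);
}
```

A macro written this way does not need separate variants for callers that have a value in hand and callers that only know the type.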
C++ has a feature it calls decltype(...), which serves most of the same purpose. "Most" is because it has a subtle difference which would wreak havoc on C code if it were employed in shared header code:
int value = 20;

#define GET_TARGET_VALUE (value)

inline decltype(GET_TARGET_VALUE) g () {
  return value;
}

int main () {
  int& r = g();
  return r;
}
The return type of g would be int& in C++, and int in C. Other expressions, such as array indexing and pointer dereferencing, also have this same issue. This is due to the parentheses in the expression. Macros in both languages frequently wrap expressions in extra parentheses to prevent precedence mix-ups or other surprises from token-based macro expansion and subsequent language parsing. This would be a footgun of large proportions for C and C++ users, and would create a divergence in standard use that would rise to the level of a liaison issue that may become unfixable. This is also part of the reason why decltype was given that keyword in C++, and not typeof: they did not want this kind of subtle and brutal change to afflict C and C++ code. typeof does not have this problem because, if a Sister Paper ever proposes it for C++, it will have identical behavior to std::remove_reference_t<decltype(T)>.
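The C side of this comparison can be checked with a short sketch. It assumes a GCC- or Clang-compatible compiler and the extension spelling __typeof__; g_returns_plain_int is a hypothetical helper added for the check.

```c
#include <assert.h>

/* Assumption: GCC/Clang-compatible compiler; __typeof__ stands in for typeof. */

int value = 20;

#define GET_TARGET_VALUE (value)

/* The parenthesized operand still has type int, so g returns a plain int. */
__typeof__(GET_TARGET_VALUE) g(void) {
    return value;
}

/* Returns 1 when g has type "function returning int". */
int g_returns_plain_int(void) {
    return _Generic(&g, int (*)(void): 1, default: 0);
}
```

The extra parentheses contributed by the macro make no difference to the resulting type here.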
This was also addressed when C++ was itself trying to introduce decltype and competing with typeof in WG21 for C++ [3].
A similar feature should be proposed in C++, although it will likely take the keyword name typeof rather than _Typeof. If this paper is successful, the intent is to bring a similar paper before the C++ Committee (WG21) through its Liaison Study Group.
There is some discussion about what happens with qualifiers, both standard and implementation-defined. For example, "Named Address Space" qualifiers are subject to issues with GCC's typeof extension, as shown here [4]. The intention of one of the GCC maintainers from that thread is:
Well, I think we should fix typeof to not retain the address space. It’s probably our implementation detail of having those in TYPE_QUALS that exposes the issue… — Richard Biener, GCC Maintainer, November 5th, 2020
There is also some disagreement between implementations about which qualifiers are worth keeping with respect to _Atomic. Therefore, typeof as proposed does not strip qualifiers from the computed type result. The reason for this is that a user can add specifiers and qualifications to a type, but cannot take them away once they are part of the expression. For example, consider the specification of <complex.h> that contains macro-provided constants like _Imaginary_I. These constants have the type const float _Imaginary: should all typeof(_Imaginary_I) expressions therefore result in a const float _Imaginary, or a float _Imaginary? What about volatile? And so on, and so forth.
There is an argument for stripping all type qualifiers (_Atomic, const, restrict, and volatile) from the final type expression, because they can be added back by the programmer easily. However, the opposite is not true: you cannot take qualifiers away once they are present, or create macros where those qualifiers can be taken in as parameters and re-applied to the function. Stripping does leave something to be desired: some folk may want to deliberately propagate the const-ness, volatile-ness, or _Atomic-ness of an expression to its end users.
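The asymmetry can be sketched with the extension as it exists today. This assumes a GCC- or Clang-compatible compiler, where __typeof__ keeps const on an lvalue operand; limit and qualifier_checks are hypothetical names for this illustration.

```c
#include <assert.h>

/* Assumption: GCC/Clang-compatible compiler, where __typeof__ keeps the
   const qualifier of an lvalue operand. */

const int limit = 10;

/* Returns how many of the three declarations have the expected type. */
int qualifier_checks(void) {
    __typeof__(limit) kept = limit;      /* const int: the qualifier is kept */
    const __typeof__(1 + 1) added = 2;   /* int, with const added by the user */
    __typeof__(1 + 1) plain = 3;         /* int, nothing to preserve */
    assert(kept == 10 && added == 2 && plain == 3);
    return _Generic(&kept, const int *: 1, default: 0)
         + _Generic(&added, const int *: 1, default: 0)
         + _Generic(&plain, int *: 1, default: 0);
}
```

Adding const is a one-word change at the point of use, whereas nothing in this sketch could turn the type of kept back into a plain int without a separate facility.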
Originally, the idea of a _Typeof and an _Unqual_typeof was explored. This was a tempting direction but ultimately unsuitable, as it duplicated functionality with a slight caveat and did not have a targeted purpose. A much better set of names for the functionality is typeof and remove_quals. typeof is an all-qualifier-preserving type reproduction of the expression (or pass-through if a type is given). It suitably envelops the total space of existing practice. The only reason _Unqual_typeof would exist is to… well, remove qualifiers. It only makes sense to name it appropriately by using remove_quals as a keyword. The benefits of choosing this name are also clear:
- remove_quals in searching the ACT database (catalogue of Debian/Fedora/etc. open source packages and their code) (December 5th, 2020); and,
- remove_quals in the entirety of GitHub save for 4 instances of Python Code (December 5th, 2020).
This means that we need not entertain the idea of needing a header or some other choice and can simply directly name remove_quals as a keyword in the code instead, saving ourselves a massive debate about what should and should not be a keyword.
Separately, we should consider a Macro Programming facility for C that can address larger questions. This paper strives to focus on the material gains from existing practice and the pitfalls of said existing practice. Therefore, this paper proposes only typeof and remove_quals.
After this paper is handled, further research should be given to handling qualifiers, function types, and arrays in Macros for generic programming. This paper focuses only on what we can find existing practice for.
The below changes are for adding the two keywords.
The following wording is relative to N2596.
Except when it is the operand of the sizeof, or typeof operators, or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
A function designator is an expression that has function type. Except when it is the operand of the sizeof operator, a typeof operator69) or the unary & operator, a function designator with type "function returning type" is converted to an expression that has type "pointer to function returning type".
69) Because this conversion does not occur, the operand of the sizeof operator remains a function designator and violates the constraints in 6.5.3.4.
Add a keyword to the §6.4.1 Keywords:
_Thread_local
typeof
remove_quals
An integer constant expression125) shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, _Alignof expressions, and floating constants that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the typeof operators, sizeof operator, or _Alignof operator.
…
An arithmetic constant expression shall have arithmetic type and shall only have operands that are integer constants, floating constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, and _Alignof expressions. Cast operators in an arithmetic constant expression shall only convert arithmetic types to arithmetic types, except as part of an operand to the typeof operators, sizeof operator, or _Alignof operator.
131) Thus, the only operator that can be applied to an array declared with storage-class specifier register is sizeof and the typeof operators.
type-specifier:
void
…
typedef-name
typeof-specifier
…
- enum specifier
- typedef name
- typeof specifier
Specifiers for structures, unions, enumerations, atomic types, and typeof specifiers are discussed in 6.7.2.1 through 6.7.2.5. Declarations of typedef names are discussed in 6.7.8. The characteristics of the other types are discussed in 6.2.5.
133) As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int, then it is implementation-defined whether the bit-field is signed or unsigned. This includes an int type specifier produced by the use of the typeof specifier (6.7.2.5).
§6.7.2.5 The Typeof specifiers
Syntax
typeof-specifier:
typeof ( typeof-specifier-argument )
remove_quals ( typeof-specifier-argument )
typeof-specifier-argument:
expression
type-name
The typeof and remove_quals tokens are collectively called the typeof operators.
Constraints
The typeof operators shall not be applied to an expression that designates a bit-field member.
Semantics
The typeof-specifier applies the typeof operators to an expression (6.5) or a type-name. If the typeof operators are applied to an expression, they yield the type-name representing the type of their operand 11�0). Otherwise, they produce the type-name with any nested typeof-specifier evaluated 11�1). If the type of the operand is a variably modified type, the operand is evaluated; otherwise, the operand is not evaluated.
All qualifiers (6.7.3) on the type from the result of a remove_quals operation are removed, including the _Atomic qualifier. _Atomic ( type-name ), with parentheses, is not a qualifier, and remains unaffected by remove_quals. Otherwise, for typeof operations, all qualifiers are preserved.
11�0) When applied to a parameter declared to have array or function type, the typeof operator yields the adjusted (pointer) type (see 6.9.1).
11�1) If the typeof-specifier-argument is itself a typeof-specifier, the operand will be evaluated before evaluating the current typeof operation. This happens recursively until a typeof-specifier is no longer the operand.
5 EXAMPLE 1 Type of an expression.
The following program:
typeof(1+1) main () {
  return 0;
}
is equivalent to this program:
int main() {
  return 0;
}
6 EXAMPLE 2 Types and qualifiers.
The following program:
const _Atomic int purr = 0;
const int meow = 1;
const char* const mew[] = {
  "aardvark",
  "bluejay",
  "catte",
};

remove_quals(meow) main (int argc, char* argv[]) {
  remove_quals(purr) plain_purr;
  typeof(_Atomic typeof(meow)) atomic_meow;
  typeof(mew) mew_array;
  remove_quals(mew) mew2_array;
  return 0;
}
is equivalent to this program:
const _Atomic int purr = 0;
const int meow = 1;
const char* const mew[] = {
  "aardvark",
  "bluejay",
  "catte",
};

int main (int argc, char* argv[]) {
  int plain_purr;
  const _Atomic int atomic_meow;
  const char* const mew_array[3];
  const char* mew2_array[3];
  return 0;
}
7 EXAMPLE 3 Equivalence of sizeof and typeof.
int main (int argc, char* argv[]) {
  // this program has no constraint violations
  _Static_assert(sizeof(typeof('p')) == sizeof(int));
  _Static_assert(sizeof(typeof('p')) == sizeof('p'));
  _Static_assert(sizeof(typeof((char)'p')) == sizeof(char));
  _Static_assert(sizeof(typeof((char)'p')) == sizeof((char)'p'));
  _Static_assert(sizeof(typeof("meow")) == sizeof(char[5]));
  _Static_assert(sizeof(typeof("meow")) == sizeof("meow"));
  _Static_assert(sizeof(typeof(argc)) == sizeof(int));
  _Static_assert(sizeof(typeof(argc)) == sizeof(argc));
  _Static_assert(sizeof(typeof(argv)) == sizeof(char**));
  _Static_assert(sizeof(typeof(argv)) == sizeof(argv));

  _Static_assert(sizeof(remove_quals('p')) == sizeof(int));
  _Static_assert(sizeof(remove_quals('p')) == sizeof('p'));
  _Static_assert(sizeof(remove_quals((char)'p')) == sizeof(char));
  _Static_assert(sizeof(remove_quals((char)'p')) == sizeof((char)'p'));
  _Static_assert(sizeof(remove_quals("meow")) == sizeof(char[5]));
  _Static_assert(sizeof(remove_quals("meow")) == sizeof("meow"));
  _Static_assert(sizeof(remove_quals(argc)) == sizeof(int));
  _Static_assert(sizeof(remove_quals(argc)) == sizeof(argc));
  _Static_assert(sizeof(remove_quals(argv)) == sizeof(char**));
  _Static_assert(sizeof(remove_quals(argv)) == sizeof(argv));
  return 0;
}
8 EXAMPLE 4 Nested typeof(...).
The following program:
int main (int argc, char*[]) {
  float val = 6.0f;
  return (typeof(remove_quals(typeof(argc))))val;
}
is equivalent to this program:
int main (int argc, char*[]) {
  float val = 6.0f;
  return (int)val;
}
9 EXAMPLE 5 Variable Length Arrays and typeof operators.
#include <stddef.h>

size_t vla_size (int n) {
  typedef char vla_type[n + 3];
  vla_type b; // variable length array
  return sizeof(
    remove_quals(b)
  ); // execution-time sizeof, translation-time typeof operation
}

int main () {
  return (int)vla_size(10); // vla_size returns 13
}
10 EXAMPLE 6 Nested typeof operators, arrays, and pointers.
int main () {
  typeof(typeof(const char*)[4]) y = {
    "a", "b", "c", "d"
  }; // 4-element array of "pointer to const char"
  return 0;
}
11 EXAMPLE 7 Function types, pointer types, and array types.
void f(int);

typeof(f(5)) g(double x) { // g has type "void(double)"
  printf("value %g\n", x);
}

typeof(g)* h; // h has type "void(*)(double)"
typeof(true ? g : NULL) k; // k has type "void(*)(double)"

void j(double A[5], typeof(A)* B); // j has type "void(double*, double**)"

extern typeof(double[]) D; // D has an incomplete type

typeof(D) C = { 0.7, 99 }; // C has type "double[2]"

typeof(D) D = { 5, 8.9, 0.1, 99 }; // D is now completed to "double[4]"

typeof(D) E; // E has type "double[4]" from D's completed type
If the same qualifier appears more than once in the same specifier-qualifier list or as declaration specifiers, either directly, via one or more typeof specifiers, or via one or more typedefs, the behavior is the same as if it appeared only once. If other qualifiers appear along with the _Atomic qualifier the resulting type is the so-qualified atomic type.
If the size is an expression that is not an integer constant expression: if it occurs in a declaration at function prototype scope, it is treated as if it were replaced by *; otherwise, each time it is evaluated it shall have a value greater than zero. The size of each instance of a variable length array type does not change during its lifetime. Where a size expression is part of the operand of a typeof or sizeof operator and changing the value of the size expression would not affect the result of the operator, it is unspecified whether or not the size expression is evaluated. Where a size expression is part of the operand of an _Alignof operator, that expression is not evaluated.
There shall be no more than one external definition for each identifier declared with internal linkage in a translation unit. Moreover, if an identifier declared with internal linkage is used in an expression there shall be exactly one external definition for the identifier in the translation unit, unless it is:
- part of the operand of a sizeof operator whose result is an integer constant,
- part of the operand of a _Alignof operator whose result is an integer constant,
- or, part of the operand of any typeof operator whose result is not a variably modified type.
…
An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as a part of the operand of a typeof operator whose result is not a variably modified type, or a sizeof, or _Alignof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.173)
The following are old sections or references related to older parts of the proposal that have since been superseded, along with other interesting, but not critical, information.
The C99 rationale states that:
A proposed typeof operator was rejected on the grounds of insufficient utility.
The times have since changed drastically: typeof(...) has become widely used and has proven its utility. Therefore, we are happy to include it. Another paper closer to the release of C11/C17 also came out: N1229, an omnibus that listed all of the different extensions and evaluated them. There, support was greater for typeof, but nobody came forward with a paper to follow up on Nick Stoughton's work.
This paper closes the loop on the request that Nick Stoughton made in that analysis, as well as on many user requests over the intervening decade and more.
There are 3 options for names. We have wording for the options using find-and-replace on the TYPEOF_KEYWORD_TOKEN as well as the REMOVE_QUALIFIERS_KEYWORD_TOKEN. The option that provides the most consensus will be what is chosen:
_Typeof keyword, <stdtypeof.h> header
- _Typeof for the type of keyword
- remove_quals for the remove qualifications keyword
This is the relatively conservative option that uses a _Typeof keyword plus <stdtypeof.h> to get access to the convenient spelling. It spares implementations that have already settled on the typeof() keyword in their extension modes from having to warn users of breakage or otherwise deal with that problem. Many have raised issues with this, annoyed at the constant spelling of keywords in fundamentally awkward and strange ways while requiring headers to fix up usage. This is consistent with other new keywords introduced in the Standard to avoid breakage at all costs, but suffers from strong lamentations about needing a header to access a common spelling.
This is the authors’ status quo and compromise position.
typeof keyword
- typeof for the type of keyword
- remove_quals for the remove qualifications keyword
This is the relatively aggressive (but still milquetoast, overall) option. It takes over the extension that is used in non-conforming C modes in a few compilers, such as XL C and GCC. Maintainers/implementers from GCC and Clang have noted their approval for this option, but e.g. XL C maintainers and implementers are less enthused.
The reason some folks are against this change is because there are "bugs" in the implementation where some qualifiers are preserved, but other implementation-defined qualifiers are not. Most implementations agree that things like _Atomic and volatile should be preserved (and the compiler that did not implement it this way acknowledged that it was, more or less, a mistake). There are also qualifiers that are dropped on some implementations for their vendor-specific extensions. An argument can be made that implementations can continue to do whatever they want with implementation-defined qualifiers as far as typeof is concerned, as long as they preserve the standard qualifiers.
This option is the authors’ overwhelmingly strong preference.
This uses a completely novel name to avoid the problem altogether. These names take no interesting space from users or implementers and it is the safest option, though it risks obscurity in what is a commonly anticipated feature. Names for this include:
qual_typeof
remove_quals
qualified_typeof
remove_qualifiers
typeof_qual
remove_quals
typeof_qualified
remove_qualifiers
Choosing this option means picking one of these novel keywords and substituting it for the TYPEOF_KEYWORD_TOKEN spelling in the wording above (not applicable any longer).
This is the authors’ least favorite option.
[1] Nick Stoughton. Potential Extensions For Inclusion In a Revision of ISO/IEC 9899. ISO/IEC SC22 WG14 - Programming Languages C. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1229.pdf
[2] ISO/IEC JTC1 SC22 WG14. Meeting Minutes April 2007. ISO/IEC SC22 WG14 - Programming Languages C. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1267.pdf
[3] Jaakko Järvi and Bjarne Stroustrup. Decltype and auto (revision 3). ISO/IEC JTC1 SC22 WG21 - Programming Languages C++. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1607.pdf
[4] Uros Bizjak. typeof and operands in named address spaces. GNU Compiler Collection. https://gcc.gnu.org/pipermail/gcc/2020-November/234119.html