register at file scope.
This proposal is inspired by N2067 and the desire for better optimizations even in simple implementations.
The storage class register forbids taking the address of an object.
However, its use is currently allowed at block scope only.
Many simple implementations optimize one function or translation unit at a time. Thus they cannot know if the address of an object shared across multiple translation units has been taken. Often they will have to assume the worst case that any non-restrict pointer of unknown origin could point to any such object; this is a serious obstacle to optimizations. Similarly, with a likely future PNVI approach to provenance, the implementation has to assume a worst case of a pointer to such an object having been cast to an integer in another translation unit, which also becomes an obstacle to optimization.
The language already offers the register storage class for objects that cannot have their address taken. Allowing the use of register at file scope is an obvious and straightforward way of letting the programmer promise to never take the address of an object.
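The aliasing obstacle described above can be sketched in a few lines. This is only an illustration under stated assumptions: the macro FILE_SCOPE_REGISTER and the names scale and scale_all are hypothetical and do not come from the proposal or from any benchmark. Because register is not permitted at file scope today, the macro expands to nothing so that the sketch compiles with current compilers.

```c
#include <stddef.h>

/* Hypothetical macro: under this proposal it could expand to `register`.
   Today that is not allowed at file scope, so it expands to nothing. */
#define FILE_SCOPE_REGISTER /* register */

/* A file-scope object whose address is never taken anywhere in the program. */
FILE_SCOPE_REGISTER int scale = 2;

/* `out` is a pointer of unknown origin. An implementation that sees only
   this translation unit cannot rule out that out[i] aliases `scale`, so it
   may have to reload `scale` from memory in every iteration. If `scale`
   were declared register, no pointer could refer to it, and the load could
   be hoisted out of the loop. */
void scale_all(int *out, const int *in, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * scale;
}
```

With the macro expanding to register, the store through out could no longer be assumed to modify scale, which is the kind of information redundancy elimination can use.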
To evaluate the impact this change would have, we created a variant of the Small Device C Compiler (SDCC) that supports register at file scope and uses the information in two places in redundancy elimination. We consider SDCC to be typical of many C compilers targeting embedded systems, which tend to have little or no link-time optimization and lack state-of-the-art alias analysis.
In the four benchmarks Coremark, Dhrystone, stdcbench and Whetstone, we made file-scope objects register where appropriate; we compiled the benchmarks with strong optimization (--max-allocs-per-node 100000) for the default balanced optimization goal.
We saw no change for Coremark and stdcbench, which is no surprise, since they both use very few file scope variables, and typically not in the inner loops.
Whetstone score increased by 0.8 %, code size was reduced by 0.3 %. Whetstone uses some file scope variables, but doesn't do much with pointers, so no surprise here, either.
Dhrystone score increased by 1.4 %, code size increased by 0.1 %. The increase in score is surprisingly large, considering that Dhrystone tends to spend most of its time in string functions from the standard library, which are not affected by the change at all.
In summary, the effect on benchmarks was much smaller than on synthetic code samples.
§6.7.1p2: Replace "..._Thread_local may appear with static or extern." by "..._Thread_local may appear with static, register or extern; register may appear with _Thread_local, static or extern."
§6.7.1p3: Replace "If _Thread_local appears in any declaration of an object, it shall be present in every declaration of that object." by "If _Thread_local appears in any declaration of an object, it shall be present in every declaration of that object. If register appears in any declaration of an object, it shall be present in every declaration of that object."
§6.7.1p6 footnote: Replace "The implementation can treat any register declaration simply as an auto declaration." by "The implementation can treat any register declaration simply as if register had been omitted."
§6.9p2: Replace "...the storage class specifiers auto and register shall not appear in the declaration specifiers in an external declaration." by "...the storage class specifier auto shall not appear in the declaration specifiers in an external declaration."
§6.9.2p2: Replace "...without the storage class specifier static constitutes a tentative definition." by "...without the storage class specifier static or register constitutes a tentative definition."
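As a rough illustration of how declarations might look under the proposed wording for §6.7.1p2 and §6.7.1p3, consider the following sketch. The macro FILE_SCOPE_REGISTER and the names tick_count, error_count and record_tick are hypothetical. The macro stands in for register and expands to nothing, since current compilers reject register at file scope.

```c
/* Hypothetical macro: under this proposal it could expand to `register`. */
#define FILE_SCOPE_REGISTER /* register */

/* As it might appear in a shared header: register combined with extern.
   Under the proposed 6.7.1p3, every declaration of tick_count would have
   to carry register once any of them does. */
extern FILE_SCOPE_REGISTER int tick_count;

/* The definition in one translation unit repeats the specifier. */
FILE_SCOPE_REGISTER int tick_count = 0;

/* register combined with static for an object with internal linkage. */
static FILE_SCOPE_REGISTER int error_count = 0;

/* Counts a tick, optionally counts an error, and returns the number of
   ticks that did not fail so far. */
int record_tick(int failed)
{
    tick_count++;
    if (failed)
        error_count++;
    /* int *p = &tick_count;
       With register in effect, taking the address like this would be a
       constraint violation, as it already is at block scope. */
    return tick_count - error_count;
}
```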