org: | ISO/IEC JCT1/SC22/WG14 | document: | N2825 | |||||
target: | IS 9899:2023 | version: | 3 | |||||
date: | 2021-10-12 | license: | CC BY |
Paper number | Title | Changes | |
---|---|---|---|
N2477 | Const functions | Initial version | |
N2539 | Unsequenced functions | Supersedes N2477 | WG14 poll: 15-0-1 |
new wording (extracted from N2522) | |||
no application to the C library | |||
N2825 | Unsequenced functions v3 | Supersedes N2539 | |
no attribute verification imposed | |||
support for function pointers | |||
optional text for inclusion of lambdas | |||
Multiple compilers on the market support the concept of const
and pure
functions (so coined by GCC), also sometimes called no-side-effects functions. In that terminology, a const
function is a function that does not depend on or modify the global context of the program: it only uses its input parameters to modify its return value (GCC doesn’t allow a const function to have output parameters). To avoid ambiguities with the properties of the const qualifier, the new attribute unsequenced is proposed which defines that a call to such a function can be executed unsequenced with respect to unrelated evaluations, namely as soon as its input is available and as late as its result is used by another statement. Although widely supported, this concept is currently absent from the C standard. Unsequenced property can be gradually built from sub-properties:
It is proposed to add support to all those functions attributes in C23 such that functions that aren’t unsequenced can still benefit some optimisations.
Unsequenced functions can be leveraged in multiple ways to enhance performance and reduce code size. Optimization is key in a compiler: it allows adding more features to a given system or conversely selecting lower-power CPUs to perform the same tasks.
Model-based systems define software using a series of graphical blocks such as confirmators, delays, digital filters, etc. Code generators are often used to produce source code. Some of these generators use global variables to implement the data flow between models. Each block is implemented by a call to a library function. Let’s take a model with two confirmator blocks:
extern bool bInput1, bInput2, bInit, bInitState1, bInitState2;
extern int uConfTime1, uConfTime2;
void foo() {
static int uCounter1, uCounter2;
...
bOutput1 = CONF(bInput1, uConfTime1, bInit, bInitState1, &uCounter1);
bOutput2 = CONF(bInput2, uConfTime2, bInit, bInitState2, &uCounter2);
...
}
where all these are global variables. bInit represents the initialization state of the whole system (for CONF
, it resets the counter output parameter to zero). Since the compiler must assume CONF
may be modifying bInit
(through a direct access to this global variable), it is forced to load it’s value once before the first CONF
call and a second time before the second CONF
call. The second load can be avoided if the compiler is told CONF
is an unsequenced function. It will then re-use bInit
value from a register or stack, thus reducing the number of assembly instructions, i.e. potentially saving code size and performance. Code generators are not always flexible enough to put bInit
in a local variable before calling blocks in order to work around compilers not supporting unsequenced functions.
In large model-based systems, there can be hundreds of such optimization opportunities, thus having a major overall positive impact.
Multiple use cases of potentially unsequenced function exist in the standard library, taking for example the cos
<math.h> function:
In this case, the compiler will invoke four times the cos function, with three different parameters. If cos would be declared unsequenced, the compiler could have merged the first and the third calls, reusing the result of the first call for the third, thus reducing CPU throughput and code size.
A large portion of the <math.h> functions fit the unsequenced objective for most of their input domain. However, in case of errors, these functions may also modify errno or the floating-point environment, which contradicts the strict unsequenced property as formulated above. Special care has to be taken to allow the functions to be declared unsequenced and in a manner where additional constraints can be spelled out. More work needs to be done on that topic and this paper does not propose any attribute addition to standard library functions, yet.
The const
attribute is implemented since GCC 2.5 and pure
since GCC 2.96.
GCC distinguishes const
from pure
:
const
: “functions that do not examine any values except their arguments, and have no effects except the return value. … const
functions should never take pointer … arguments.”pure
: “functions that have no effects except the return value and their return value depends only on the parameters and/or global variables.”We think that these definitions go a bit too short, because they are centered around interoperability restrictions and not around the effective properties that we want for the corresponding functions.
Below, we will propose features that extend these specifications, namely unsequenced
and idempotent
. These extensions are done in a way that the implementations of the GCC attributes are valid implementations, in the sense that they don’t check for validation of the requested properties, anyhow, and that the wider permissions that we give still make the targeted optimizations valid.
Surprisingly, then one of the attributes that we propose, independent
, also comes close to another GCC attribute, malloc
, namely for functions that have no pointer parameters but a pointer return value. For such an independent
function it can be concluded that, if repeatedly called with the same arguments, it will either return a null pointer, a pointer to always the same static
allocation, or a pointer to a freshly allocated object at each call.
Clang supports all GCC attributes.
#pragma pure_function function({global | n}, ...), ...
#pragma no_side_effects function({global | n}, ...), ...
These pragma exist in Diab compiler since at least version 4.4. Diab uses pure_function
for unsequenced functions and no_side_effects
for idempotent functions.
pure_function
: “function does not modify or use any global or static data.”no_side_effects
: “function does not modify any global variables (it may use global variables).”In addition to the basic idempotent
/unsequenced
, Diab allows functions to still access some specially marked global variable. This is achieved by passing parameters to the pragma. It is the only compiler to our knowledge that goes beyond the GCC const
/pure
concept. As proposed in N2644, this could be used to solve the errno
and floating-point environment challenge of standard library functions. Nevertheless, we think that the general idempotent and unsequenced properties are currently more widely supported and so we concentrate this paper on these.
Diab 5.x compiler is a WindRiver proprietary compiler, not based on GCC or LLVM. It is in particular used in the highly popular WindRiver VxWorks RTOS for aerospace, automotive and industrial. It supports C99 (C11 experimental) and C++14 (except thread_local
, some Unicode types and some library functions related to atomic, chron/signal, thread, filesystem and localization).
Not to be confused with Diab 7.x which is a branch based on LLVM.
MULTI supports GCC const and pure attributes since at least version 4.0.
AdaCore GNAT Ada uses Pure_Function
for const functions, i.e. functions where “the compiler can assume that there are no side effects, and in particular that two calls with identical arguments produce the same result.”
SPARK language also has the “Global” aspects which is similar. Efforts are in progress to incorporate const/pure support in the next revision of Ada language.
Fortran allows a function to be declared PURE
, meaning that the function has no side effect. That is the equivalent of GCC const.
Const/pure compiler-specific features should be promoted to language features in order to:
Unsequenced and its sub-properties fit well into the realm of the new attribute feature of C23, because conforming code that has these attributes would remain conforming if the attribute is removed or ignored. The use of the attribute feature is clearly advantageous over pragmas (attributes have well-defined applicability) and the introduction of new keywords which suggest a semantic change.
Therefore this paper proposes to use the new attribute system to implement the properties. The names “const” and “pure” are not used because their definition is split into 5 sub-properties, because “unsequenced” has not the exact same meaning as GCC “const” and because “pure” has been used in the industry for different semantics (Fortran, Ada, Diab compiler).
GCC 8.3.0 does not seem to raise warnings when const functions modify global variables.
AdaCore GNAT Ada allows pure/const functions to modify global variables. This is to support cases like instrumentation (e.g. call counter), table-caching implementations, debug traces, etc.
WindRiver Diab const/pure attribute allows to specify exceptions, i.e. global variables that a const/pure function might still read/write. That may be used to work around above mentioned GNAT use case. However, it is proposed not to introduce this for now.
Modifying global context includes calling non-idempotent functions. Reading global context includes calling non-unsequenced functions.
Having compiler errors or warnings when code violate its properties would avoid source code bugs. Indeed, wrongly declaring as unsequenced a function might produce functionally incorrect executable code. Initial version of that paper required compiler enforcement of these properties. However, as pointed out by WG14 on August 2020 meeting, making that propagated type check might increase drastically compilation time and memory usage. Also, it would require attribute addition to existing code, some of which might be hard to change (e.g. OS headers). For these reasons, this paper proposes an “assume” flavor of the attributes meaning that:
Hence, this puts the verification burden on the developer. However, depending on the quality of implementation, compiler diagnostics could reduce that burden and increase safety, at least when function definitions are visible to the translation unit. Static analysis tools might also be used to cover that outside of the compilation process.
For the same reasons, this paper does not ask for attribute inference, i.e. that the compiler automatically tags a function with these properties by looking at its code.
ISO/IEC 9899:202x “6.7.3 Type qualifiers” mentions that “If the specification of a function type includes any type qualifiers, the behavior is Undefined.” Hence, these properties do not conflict with const
, restrict
, volatile
or _Atomic
keywords.
This paper proposes these properties for function pointers as well. Indeed, function pointers would benefit from those attribute as much as functions.
inline
Although there is no conflict here, it may seem superficial to add any of these properties to an inline function since the translator sees the function’s body and can assess whether it has side effects or depends on the global context. However, it is proposed to allow that combination to allow the programmer to profit from the corresponding diagnosis and to explicitly enforce constraints where these properties can’t be automatically deduced.
_Noreturn
Although none of the two above mentioned use cases apply for a _Noreturn
function, it might still be useful to declare a _Noreturn
function with one of the properties: it allows the _Noreturn
function programmer to add static check that it’s function does not rely on or modifies global variables. This combination is more likely to be useful for idempotent than for unsequenced, though.
nodiscard
nodiscard
attribute requires function calls to use a function’s return value. This is compatible with the properties.
The current implementations of these features leave a lot of the properties to be defined by the user, in particular which accesses would constitute an access to global state, be it for reading or writing. Here, this paper proposes to make three dedicated choices:
call_once
have a special status, because they basically enforce that a certain state is reached before any execution may progress beyond such a call. The callbacks that are used for such calls usually change global state (otherwise there is not much point to them) and the change in the underlying once_flag
by itself constitutes a state change. But since the guarantee is that the call is only effected exactly once, it can be thought to take place before any execution of the function that we want to annotate.These properties seem compatible with C++. It would be beneficial for the whole C and C++ communities that they be implemented the same way in both languages. Discussions for this harmonization have started in the joint WG14/WG21 SG.
A first observation has been that the emphasis of this proposal on call_once
is very C specific, this C library function is not even integrated in C++. The reason is that C++ has other mechanisms to initialize static objects dynamically at startup. Integration of the proposed attributes into C++ should discuss these mechanisms instead.
The same observation holds for memory allocation; for C++ the required pairings for allocation and deallocation for the noleak attribute have to be extended by analogous properties for operators new and delete. Also the fact of returning a new object from a function should not be considered a leak, either.
Link-time optimization can achieve some of the optimization benefits these properties bring. However, in addition to lacking manually-enforced constraints addition capabilities, it is not always possible to enable LTO. For example, multiple products in safety-critical markets such as avionics, nuclear, automotive and railway explicitly disable LTO in order to keep testing credit obtained by testing individual object files. Moreover, not all linkers offer LTO.
GCC forbids usage of pointer arguments to const functions. There are two reasons why this paper wants to support that:
The requirements for programs are less restricted for [[unsequenced]]
than for gcc ((__attribute__(const)))
. Thus, besides the syntax, gcc’s implementation is a valid implementation of [[unsequenced]]
, only that it may perhaps not be able to do all optimizations that would be possible.
Programs that are specified with gcc const, can easily be switched to use [[unsequenced]]
, instead, and remain valid. Additional standard attributes could then be added for functions that would not fit under the gcc model.
Similarily, gcc’s pure is less restricted than the new standard attribute [[indempotent]]
, and an implementation of the gcc feature is, syntax aside, an implementation of the new standard attribute.
Because the main change is the insertion of a whole clause, the proposed changes can be based on any intermediate version of ISO/IEC 9899:2023. Only reference to other clauses would eventually have to be modified if indpendent changes for ISO/IEC 9899:2023 insert other clauses. The only change that effectively would modify existing text is the following:
Add the new attributes
idempotent
,noleak
,unsequenced
,independent
andstateless
to the list of standard attributes in 6.7.11.1 p2.
Otherwise, the text to be inserted is as given in the following sections. If lambdas would be incorporated into C23, the proposed attributes should also apply to them, so some adjustments would then be in order. Also N2840 proposes some changes to the availibility of the call_once
function.
This introduces the general framework of these new function attributes. Changes are merely straightforward for the syntax feature itself and its impact on the type system.
We also make the necessary connections to two parts of the C library, namely dynamic startup-initialization via call_once
and memory management.
6.7.11.6 Standard attributes for function and lambda types
Constraints
1 The identifier in a standard function type attribute shall be one of:
idempotent
independent
noleak
stateless
unsequenced
2 The attribute tokens of function attributes shall appear only in the attribute specifier sequence of a function declarator or a lambda expression.FNTα) The corresponding attribute is a property of the function or lambda type.FNTβ) The attribute tokens shall appear at most once in each attribute list. The attribute argument clause shall be omitted.
FNTα) That is, they appear in the attributes right after the closing parenthesis of the parameter list or right after the capture clause, if a lambda has no parameter list. If they appear in a function declarator, the function declarator may declare a function or a pointer to function.
FNTβ) If several declarations of the same function or function pointer are visible, regardless whether an attribute is present at several or just one of the declarators, it is attached to the type of the object.
Description
3 Each attribute defined in this clause asserts a specific property of a function or lambda. The main purpose is to provide the translator with information about the access of objects by such a function or lambda such that certain assumptions about function calls can be deduced.
4 Although syntactically attached to a function type, the attributes described are not part of the prototype of a such annotated function or lambda, and redeclarations and conversions that drop such an attribute are valid and constitute compatible types. Conversely, if a definition or lambda expression that does not have the asserted property is accessed by a function declaration with external linkage or a function pointer (originating from a function definition or from a function literal expression) that have the attribute, the behavior is undefined.
5 The library function
call_once
and the associated typeonce_flag
(see 7.22 and 7.26.2.1) play a special role in this clause. They provide initialization facilities that are designed to be executed exactly once before entering into the specific functionality of the function in question. The data dependencies that are described in this clause are generally only considered after any initialization of objects with static storage duration and calls tocall_once
callbacks have taken place. We say that a static or thread-local objectY
of typeonce_flag
is a once-guard for a static or thread-local non-volatile
objectX
(X
andY
possibly being the same) ifX
is never modified other then by a call tocall_once(&Y)
and ifY
is unique with that property. IfX
has a once-guardY
, then, after any call tocall_once(&Y)
terminates,X
will not be modified until the end of the execution.
6 A second set of C library functions that interact with the properties described in this clause are the memory management functions (7.22.3). These are only considered for their effects, namely of allocating or deallocating storage. The possible effect on a hidden thread-specific state of the memory management system is ignored for the purpose of the properties, here.
Recommended Practice
7 If possible, it is recommended that implementations diagnose if an attribute of this clause is applied to a function definition or lambda expression that does not have the corresponding property.
noleak
Having no memory leaks (not in the sense of just accessibility) is an property for which we think that users should be able express their expectation (for function declarations) or their intent (for function definitions or lambda expressions). It will be an important feature for other attributes that we describe, but may also ease the work of static or dynamic program analyzers.
6.7.11.6.1 The noleak attribute
Description
1 A storage leak in a function definition or lambda expression is an allocation (7.22.3) and a possible control flow such that the pointer to the allocation is not returned by the function and such that no corresponding deallocation is performed before the end of a function call with that control flow.FNT0) This not withstanding, an allocation without deallocation is not a leak if it is used to initialize a pointer object
X
of static storage duration that is once-guarded byY
, also with static storage duration.
FNT0) Thus allocations of objects where the address is not returned or deallocated are considered to be leaks, even if the allocated object may still be accessible by other means, for example because the address has been stored in a variable.
2 The
noleak
attribute asserts that the declared function, the corresponding lambda expression or the pointed-to function does not leak any storage.
Recommended Practice
3 It is recommended that for functions with a
noleak
attribute implementations diagnose if a pointer to newly allocated storage is not used as an argument tofree
orrealloc
, unless it is the return value of the function or lambda.
4 NOTE For this definitions, allocations that are effected by an initializer function or lambda that is guarded by a
once_flag
with static storage duration are not considered leaks. In contrast to that, because the number of threads that an application starts is in general not bounded, such allocations that are only guarded by aonce_flag
of thread storage duration are leaks. In both cases, applications are encouraged to deallocate storage that is such allocated by the appropriate means, that is byatexit
handlers,at_quick_exit
handlers ortss_t
destructors.
stateless
The stateless attribute ensures compile-time independence from local state. As such it is not sufficient to enable all the optimizations that are state of the art, but it is a first step that can be taken at compile-time without inspecting any function arguments.
6.7.11.6.2 The stateless attribute
Description
1 A function or lambda is stateless if in the function body any defined object
X
of static or thread storage duration is notvolatile
-qualified (including directly and indirectly in function calls) and has one of the following properties.
X
isconst
-qualified.
X
has a once-guardY
such thatcall_once(&Y)
is called before any access toX
,
2 The
stateless
attribute asserts that the declared function, the corresponding lambda expression or the pointed-to function is stateless.
Example
In the following,
get_buffer
only modifiesthestatic variables through a call tocall_once
. Thus the annotation of the function with thestateless
attribute is validif the program does not use the file scope identifiers.buffer
andallocate_buffer
other than indicated
#include <stdlib.h> #include <threads.h> static unsigned* buffer; static void allocate_buffer(void) { buffer = malloc(something); /* now condition the buffer with respect to some runtime properties */ } unsigned* get_buffer(void) [[stateless]] { static once_flag once = ONCE_FLAG_INIT; call_once(&once, allocate_buffer); return buffer; }
#include <stdlib.h> #include <threads.h> unsigned* get_buffer(void) [[stateless, noleak]] { static unsigned* buffer; static once_flag once = ONCE_FLAG_INIT; call_once(&once, [](void) { buffer = malloc(something); /* now condition the buffer with respect to some runtime properties */ }); return buffer; }
idempotent
6.7.11.6.3 The idempotent attribute
Description
1 An evaluation
E
is idempotent if it has the following properties.
E
can be replaced by the evaluation(E,E)
without changing the observable state of any execution.If
E
uses an identifierX
with linkage to modify the underlying object (even indirectly during a function call)X
shall have a once-guardY
such that the modification ofX
happens whenE
issues a call tocall_once(&Y)
.FNT2)
FNT2) Thus if it has access by other means to a file scope object
X
other than using the identifier,E
may still modifyX
provided it always stores a consistent value. Also, ifX
is guarded byY
,E
is never evaluated during a call tocall_once(&Y)
.
2 A function designator or lambda value
f
is idempotent if the function or lambda is stateless, has no storage leak and if any evaluationr = f(a1, ..., an)
is idempotent, wherea1, ..., an
are valid function call arguments, and wherer
is a variable with the non-void return type of the function or lambda. Analogously,f
with a return type ofvoid
is idempotent iff(a1, ..., an)
is idempotent witha1, ..., an
as above, andf
with a prototype with an empty parameter list is idempotent if the expressionsr = f()
orf()
are idempotent, respectively.FNT3)
FNT3) If the function call
f()
is idempotent it has no observable side effects.
3 The
idempotent
attribute asserts that the declared function, the corresponding lambda expression or the pointed-to function is idempotent.
Example
4 The attribute in the following function declaration asserts that two consecutive calls to the function will result in the same pointer return value. So if no change to the abstract state occurs between two calls the second is redundant and has no additional side effects.
Also, because the function has no storage leak the returned pointer value is either a null pointer, a pointer to a static object or a pointer to an object that has been allocated during a call to call_once. Thus application code and implementations may suppress repeated calls to the function as long as they retain the return value.
independent
6.7.11.6.4 The independent attribute
Description
1 A function call is independent if all lvalue conversions on an object
X
that happen during the call, inclusive calls into other functions or lambdas, refers to an object that is reachable through a pointer parameter of the function call or that has one of the following properties.
X
has automatic storage duration:
X
is the instance of a function parameter associated with the function call, or- its definition is reached during the call.
X
has allocated storage duration:
X
is allocated during the call.FNT4)X
has static or thread-local storage duration and is notvolatile
qualified:
X
isconst
qualified, orX
has a once-guardY
, and the called function callscall_once(&Y)
before any access toX
.
FNT4) An independent function call still change thread specific execution state when using memory management functions. As a consequence addresses of locally allocated storage depend on the exact moment in which an implementation effects the call.
2 A function definition or lambda expression is independent, if it has no storage leak, is stateless and if all function calls with the function designator or lambda value are independent.
3 The
independent
attribute asserts that the declared function, the corresponding lambda expression or the pointed-to function is independent.
Example
3 The attribute in the following function declaration asserts that it doesn’t depend on any particular state of the abstract machine. Calls to the function can be effected out of sequence before the return value is needed, but two calls to the function with the same argument may in general result in different pointer return values.
Because the function has no storage leak the returned pointer is either
- a null pointer,
- a pointer to an object of static storage duration or to an object that has been allocated during a call to
call_once
,- or refers to a new allocation that has not been seen before.
If repeating the call several times with the same argument returns distinct values, such return values refer to newly allocated storage that is not aliased by any other object.
unsequenced
6.7.11.6.5 The unsequenced attribute
Description
1 The
unsequenced
attribute implies theindependent
andidempotent
attributes. For all function declarations and lambda expressions that have both theindependent
andidempotent
attributes theunsequenced
attribute is implied.FNT5)
FNT5) That is, the
unsequenced
attribute is a short form for simultaneously applying these two attributes.
2 A function or lambda value is unsequenced if a function call can be executed as soon as the values of the arguments and all objects that are accessible through them have been determined, and if it can be executed as late as any of its return value or modified pointed-to arguments, even indirectly, are accessed.
3 The
unsequenced
attribute asserts that the declared function, the corresponding lambda expression or the pointed-to function is unsequenced.
4 NOTE 1 The
unsequenced
attribute asserts strong properties for the annotated function or lambda, in particular that certain sequencing requirements for function calls can be relaxed without affecting the state of the abstract machine. Thereby, calls to such functions are natural candidates for optimization techniques such as common subexpression elimination, local memoization or lazy evaluation.
5 NOTE 2 Whether a particular C library function is unsequenced depends on properties of the implementation. Many functions in the <math.h> header are unsequenced for an implementation that has
math_errhandling
evaluate to zero.
Example 1
6 The attribute in the following function declaration asserts that it doesn’t depend on any modifiable state of the abstract machine. Calls to the function can be executed out of sequence before the return value is needed and two calls to the function will result in the same return value.
Therefore such a call for a given argument value needs only to be executed once and the returned value can be reused when appropriate. For example, calls for all possible argument values can be executed during translation or program startup and tabulated.
Example 2
7 The attribute in the following function declaration asserts that it doesn’t depend on any modifiable state of the abstract machine. Calls to the function can be executed out of sequence before the return value is needed and two calls to the function will result in the same pointer return value. Therefore such a call needs only to be executed once and the returned pointer value can be reused when appropriate. For example, a single call can be executed during program startup and the result can be held in some hidden state.
Also, because the function has no storage leak the returned pointer value is either a null pointer or a pointer to a static object, but it does not refer to a new allocation because that would be different for each call. Since the function is also stateless, any local static object of the function is
const
qualified. As a consequence, if the return value is not a null pointer and does not refer to an object (or subobject thereof) with internal or external linkage, the translator can deduce that it refers to a local static object with aconst
-qualified but non-volatile
definition, and, that the pointed-to object does not alias any other object and will not change during the whole execution.