Explicit Initializers for Atomics

ISO/IEC JTC 1/SC 22/WG 14/N1482

ISO/IEC JTC1 SC22 WG14 N1482 - 2010-05-26

Paul E. McKenney, paulmck@linux.vnet.ibm.com
Mark Batty, mjb220@cl.cam.ac.uk
Clark Nelson, clark.nelson@intel.com
N.M. Maclaren, nmm1@cam.ac.uk
Hans Boehm, hans.boehm@hp.com
Anthony Williams, anthony@justsoftwaresolutions.co.uk
Peter Dimov, pdimov@mmltd.net
Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org

Introduction

Mark Batty recently undertook a partial formalization of the C++ memory model, which he summarized in N2955. This paper summarizes the discussions of that paper, both verbal and by email, and recommends appropriate actions for the Library Working Group. Core issues are dealt with in a companion paper, N3074.

This paper is based on N1445, which is in turn based on WG21 N3045, and has been updated to reflect discussions in the Concurrency subgroup of the Library Working Group in Pittsburgh. It also carries the C-language side of N3040, which was discussed in the same subgroup in Pittsburgh.

Library Issues

Library Issue 1: 29.3p1 Limits to Memory-Order Relaxation (Non-Normative)

Add a note stating that memory_order_relaxed operations must maintain indivisibility, as described in the discussion of 1.10p4. This must be considered in conjunction with the resolution to LWG 1151, which is expected to be addressed by Hans Boehm in N3040.

Library Issue 2: 29.3p11 Schedulers, Loops, and Atomics (Normative)

The second sentence of this paragraph, “Implementations shall not move an atomic operation out of an unbounded loop”, does not add anything to the first sentence, and, worse, can be interpreted as restricting the meaning of the first sentence. This sentence should therefore be deleted. The Library Working Group discussed this change during the Santa Cruz meeting in October 2009, and agreed with this deletion.
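The first sentence of 29.3p11, which asks implementations to make atomic stores visible to atomic loads within a reasonable amount of time, is the guarantee that a polling loop relies on. The following is a minimal sketch of such a loop, assuming a C11-style <stdatomic.h>; the names stop_requested, request_stop, and poll_until_stopped are hypothetical and do not come from either working draft.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Static storage, so the flag starts out false. */
static atomic_bool stop_requested;

/* Called by the thread that wants the poller to finish. */
void request_stop(void)
{
    atomic_store_explicit(&stop_requested, true, memory_order_release);
}

/* Spins until the stop request becomes visible and returns the number of
 * iterations spent waiting. The atomic load is performed on every
 * iteration, so the loop observes the store once it becomes visible. */
unsigned long poll_until_stopped(void)
{
    unsigned long spins = 0;

    while (!atomic_load_explicit(&stop_requested, memory_order_acquire))
        spins++;
    return spins;
}
```

In a real program request_stop() would run in a different thread from poll_until_stopped(); if the request is made first, the poller returns immediately with a count of zero.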

Library Issue 3: 29.5.1 Uninitialized Atomics and C/C++ Compatibility (Normative)

This topic was the subject of a spirited discussion among a subset of the participants in the C/C++-compatibility effort this past October and November.

Unlike C++, C has no mechanism to force a given variable to be initialized. Therefore, if C++ atomics are going to be compatible with those of C, either C++ needs to tolerate uninitialized atomic objects, or C needs to require that all atomic objects be initialized. There are a number of cases to consider:

  1. C static variables. The C standard specifies that these are initialized bitwise to zero. The C “={value}” syntax may be used to explicitly initialize these values; however, such initialization may not contain any statements executing at run time.
  2. C on-stack auto variables. The C standard does not require that these be initialized. On some machines, such variables might be initialized to an error value (for example, not-a-thing (NaT) for variables on Itanium that live only in a machine register). The C “={value}” syntax may be used to explicitly initialize these values, and may include statements executing at run time.
  3. C dynamically allocated variables, for example, via malloc(). The C standard does not require that these be initialized. The C “={value}” syntax may not be used to explicitly initialize these values.
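The three cases above can be illustrated with the initialization facilities proposed later in this paper. This is a minimal sketch, assuming a C11-style <stdatomic.h>; the names hits, limit, sum_on_stack, and make_counter are hypothetical. The fallback definition of ATOMIC_VAR_INIT is also an assumption, added only because C23 removed the macro and newer toolchains may no longer provide it.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Fallback (assumption): C23 removed ATOMIC_VAR_INIT, so define it here
 * if the toolchain's <stdatomic.h> no longer does. */
#ifndef ATOMIC_VAR_INIT
#define ATOMIC_VAR_INIT(value) (value)
#endif

/* Case 1: static storage. Zero-initialized by default, or given a
 * compile-time constant through ATOMIC_VAR_INIT. */
static atomic_int hits;                          /* starts at zero */
static atomic_int limit = ATOMIC_VAR_INIT(100);  /* constant expression only */

/* Case 2: automatic storage. The initializer may be computed at run time. */
int sum_on_stack(int seed)
{
    atomic_int local = ATOMIC_VAR_INIT(seed * 2);

    atomic_fetch_add(&local, atomic_load(&limit));
    return atomic_load(&local) + atomic_load(&hits);
}

/* Case 3: dynamic storage. No initializer syntax applies, so the object is
 * initialized with atomic_init() before any other thread can reach it. */
atomic_int *make_counter(int start)
{
    atomic_int *p = malloc(sizeof *p);

    if (p != NULL)
        atomic_init(p, start);
    return p;
}
```

The caller of make_counter() owns the returned object and releases it with free() once no thread can still access it.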

Of course, C on-stack auto variables and dynamically allocated variables are inaccessible to other threads until references to them are published. Such publication must ensure that any initialization happens before any access to the variable from another thread, for example, by use of store release or locking.
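The publication requirement might look as follows in practice. This is a sketch only: struct message, mailbox, publish, and try_receive are hypothetical names, and the mailbox is declared with the _Atomic qualifier of the C11 standard as eventually published, which is an assumption relative to the atomic_address type discussed in this paper.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct message {
    atomic_int refcount;   /* atomic member that needs explicit initialization */
    int payload;
};

/* Shared slot. Static storage, so it starts out as a null pointer, meaning
 * that nothing has been published yet. */
static struct message *_Atomic mailbox;

/* Producer: initialize the object privately, then publish it with a store
 * release. Returns 0 on success and -1 if allocation fails. */
int publish(int payload)
{
    struct message *m = malloc(sizeof *m);

    if (m == NULL)
        return -1;
    atomic_init(&m->refcount, 1);   /* non-atomic, but m is still private */
    m->payload = payload;
    atomic_store_explicit(&mailbox, m, memory_order_release);
    return 0;
}

/* Consumer: the load acquire pairs with the store release above, so a
 * non-null result refers to a fully initialized message. */
struct message *try_receive(void)
{
    return atomic_load_explicit(&mailbox, memory_order_acquire);
}
```

Because the initialization is sequenced before the store release, any thread that obtains the pointer through try_receive() sees refcount and payload already initialized.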

There are also a number of interesting constraints on these types:

  1. The C++0x Working Draft requires that the atomic integral type have standard layout (29.5.1p2).
  2. The C++0x Working Draft requires that the atomic pointer type have standard layout (29.5.2p1).
  3. The C++0x Working Draft requires that the atomic flag type have standard layout (29.7p3).

These constraints permit but four known ways for C++ to make use of non-generic atomic types defined in C-language translation units:

  1. The atomic type is a structure containing a single field of the underlying type, possibly followed by padding. There is an implementation-provided external lock table, and the implementation locates the lock corresponding to a given instance of an atomic type by hashing that instance's address. The implementation is of course responsible for correctly initializing the array of locks. This implementation permits C++ to tolerate an unspecified initial value for a given instance of an atomic type, but only in cases where every bit pattern corresponds to a valid value of the atomic type in question.
  2. The atomic type is a structure containing a single field of the underlying type, possibly followed by padding. If the atomic type is implemented in a non-lock-free manner, an external table is used to check whether a given instance of an atomic type has been initialized, allowing it to be initialized if required. Such initialization could include any locks that might be embedded in an instance of the atomic type. This external table would be accessed by both C and C++ code for each access to the atomic variable in question (although a clever optimizer might be able to elide some table accesses). This table would clearly need to be implemented so as to tolerate multithreaded access and modification. In addition, special handling might be required to ensure that any atomic variables residing in deallocated memory were removed from the external table. There are therefore serious concerns about the overhead of this approach.
  3. If the underlying hardware supports atomic operations that are large enough to cover the given non-generic atomic type, then those atomic operations can be used directly.
  4. Any instance of an atomic type that is defined in a C-language translation unit must be initialized by C code before the first C++ use of that instance. This approach requires two syntaxes for C-language initialization, one to be applied to static variables and another for dynamically allocated objects. Either syntax may be applied to auto variables.
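The first alternative above, an external lock table indexed by a hash of the object's address, can be sketched in C as follows. Everything here is hypothetical and chosen for illustration: the type sketch_atomic_llong, the table size, the hash, and the use of atomic_flag spinlocks are assumptions, not a description of any actual implementation.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical atomic type: a single field of the underlying type. */
typedef struct { long long value; } sketch_atomic_llong;

/* External lock table. The implementation, not the user, initializes it. */
#define LOCK_TABLE_SIZE 8u
static atomic_flag lock_table[LOCK_TABLE_SIZE] = {
    ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT,
    ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT
};

/* Locate the lock for an object by hashing its address. */
atomic_flag *lock_for(const volatile void *addr)
{
    uintptr_t bits = (uintptr_t)addr;

    return &lock_table[(bits >> 4) % LOCK_TABLE_SIZE];
}

void lock_acquire(atomic_flag *lock)
{
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;   /* spin until the current holder clears the flag */
}

void lock_release(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Load the value while holding the lock selected by the object's address. */
long long sketch_load(sketch_atomic_llong *obj)
{
    atomic_flag *lock = lock_for(obj);

    lock_acquire(lock);
    long long v = obj->value;
    lock_release(lock);
    return v;
}

/* Add delta under the same lock and return the previous value. */
long long sketch_fetch_add(sketch_atomic_llong *obj, long long delta)
{
    atomic_flag *lock = lock_for(obj);

    lock_acquire(lock);
    long long old = obj->value;
    obj->value = old + delta;
    lock_release(lock);
    return old;
}
```

Because the lock lives outside the object, the object itself carries no state that needs initializing; any bit pattern that is a valid long long is an acceptable starting value, which is the property the first alternative depends on.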

The wording below permits any of the above implementation alternatives. Note that WG14's C-language working draft also requires initializers for non-flag atomic types (initialization is already provided in the C++ working draft via constructors). These are listed in a subsequent section.

Library Issue 4: WG14 7.16.1p2 lock-free macros

The current WG14 wording assumes that all integral types will have the same degree of lock freedom. This assumption fails for long long on 32-bit machines, and would fail for other types on machines with smaller word sizes. WG14 therefore needs separate per-integral-type macros that indicate whether the corresponding type has a lock-free implementation.
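Per-type macros let a program choose a representation at compile time. The sketch below assumes the macro names and values of the C11 standard as eventually published, where each macro is 0 (never lock-free), 1 (sometimes lock-free), or 2 (always lock-free); the names fast_counter, bump, int_lock_free_level, and llong_lock_free_level are hypothetical.

```c
#include <stdatomic.h>

/* Report the lock-free level of two types separately; on a 32-bit machine
 * the two results may differ. */
int int_lock_free_level(void)
{
    return ATOMIC_INT_LOCK_FREE;
}

int llong_lock_free_level(void)
{
    return ATOMIC_LLONG_LOCK_FREE;
}

/* Use a 64-bit counter only where it is always lock-free; otherwise fall
 * back to a narrower type. */
#if ATOMIC_LLONG_LOCK_FREE == 2
typedef atomic_llong fast_counter;
#else
typedef atomic_int fast_counter;
#endif

/* Increment the counter and return its new value. */
long long bump(fast_counter *c)
{
    return (long long)atomic_fetch_add(c, 1) + 1;
}
```

A single ATOMIC_INTEGRAL_LOCK_FREE macro could not express this choice, because it would have to report one answer for both int and long long.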

Library Issue 5: WG14 7.16.2p13 read-modify-write operations

As noted in N3040, the current specification is not very clear on what it means for an “atomic read-modify-write operation” to be “atomic”. In particular, a given atomic read-modify-write operation is required to read the last value written before the write associated with that read-modify-write operation.
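A shared counter shows why this requirement matters: if each increment reads the last value in the modification order, no increment can be lost, even with memory_order_relaxed. The sketch below assumes POSIX threads and a C11-style <stdatomic.h>; the names counter, worker, and run_counter_demo, and the thread and iteration counts, are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>

#define NUM_THREADS 4
#define INCREMENTS_PER_THREAD 100000

static atomic_long counter;   /* static storage: starts at zero */

/* Each worker performs relaxed read-modify-write increments. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++)
        atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
    return NULL;
}

/* Runs the workers to completion and returns the final count, or -1 if
 * not every thread could be created. */
long run_counter_demo(void)
{
    pthread_t threads[NUM_THREADS];
    int started = 0;

    atomic_store(&counter, 0);
    while (started < NUM_THREADS &&
           pthread_create(&threads[started], NULL, worker, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return started == NUM_THREADS ? atomic_load(&counter) : -1;
}
```

When all threads start, the result is NUM_THREADS * INCREMENTS_PER_THREAD. A plain long incremented without atomics would be a data race and could lose updates.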

WG21 C++-Language Wording

Wording for Library Issue 1

Add a note after 29.3p1 as follows:

The enumeration memory_order specifies the detailed regular (non–atomic) memory synchronization order as defined in 1.10 and may provide for operation ordering. Its enumerated values and their meanings are as follows:

— memory_order_relaxed: no operation orders memory.
memory_order_release, memory_order_acq_rel, and memory_order_seq_cst: a store operation performs a release operation on the affected memory location.
memory_order_consume: a load operation performs a consume operation on the affected memory location.
memory_order_acquire, memory_order_acq_rel, and memory_order_seq_cst: a load operation performs an acquire operation on the affected memory location.

[ Note: Atomic operations specifying memory_order_relaxed are relaxed only with respect to memory ordering. Implementations must still guarantee that any given atomic access to a particular atomic object be indivisible with respect to all other atomic accesses to that object. — end note. ]

Wording for Library Issue 2

Remove the second sentence of 29.3p11, “Implementations shall not move an atomic operation out of an unbounded loop.”, so that the paragraph reads:

Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.

Wording for Library Issue 3

Add the following to WG21 29.5.1 (Integral Types) in locations corresponding to the existing atomic_is_lock_free() functions:

void atomic_init(volatile atomic_bool*, bool);

void atomic_init(atomic_bool*, bool);

void atomic_init(volatile atomic_itype*, itype);

void atomic_init(atomic_itype*, itype);

Note that ATOMIC_INIT is already in use, for example, in the Linux kernel. Google code search was unable to find ATOMIC_VAR_INIT or atomic_init.

Add the following to WG21 29.5.2 (Address Type) in the location corresponding to the existing atomic_is_lock_free() function:

void atomic_init(volatile atomic_address*, void *);

void atomic_init(atomic_address*, void *);

Add the following after WG21 29.6p4 (Operations on Atomic Types):

#define ATOMIC_VAR_INIT(value) see below

Remarks: A macro expanding to a token sequence suitable for initializing an atomic variable of a type that is initialization–compatible with value. Concurrent access to the variable being initialized, even via an atomic operation, constitutes a data race.

[ Example:

atomic_int v = ATOMIC_VAR_INIT(5);

— end example ]

Add the following after WG21 29.6p5 (Operations on Atomic Types):

void atomic_init(volatile A *object, C desired);

void atomic_init(A *object, C desired);

Effects: Non–atomically assigns the value desired to object. Concurrent access from another thread, even via an atomic operation, constitutes a data race.

WG14 C–Language Wording

Change WG14 7.16.1p1 as follows, replacing “three” with “several”:

The header <stdatomic.h> defines several macros and declares several types and functions for performing atomic operations on data shared between threads.

Change WG14 7.16.1p2 as follows:

The macros defined are

ATOMIC_INTEGRAL_LOCK_FREE
ATOMIC_CHAR_LOCK_FREE
ATOMIC_CHAR16_T_LOCK_FREE
ATOMIC_CHAR32_T_LOCK_FREE
ATOMIC_WCHAR_T_LOCK_FREE
ATOMIC_SHORT_LOCK_FREE
ATOMIC_INT_LOCK_FREE
ATOMIC_LONG_LOCK_FREE
ATOMIC_LLONG_LOCK_FREE
ATOMIC_ADDRESS_LOCK_FREE

which indicate the general lock-free property of the corresponding atomic types, with the signed and unsigned variants grouped together; and

ATOMIC_FLAG_INIT

which expands to an initializer for an object of type atomic_flag.

As called out in N3040 “More precise definition of atomic”, add a paragraph to WG14 following 7.16.2p9 as follows:

Atomic read-modify-write operations shall always read the last value (in the modification order) written before the write associated with the read-modify-write operation.

Add the following note after 7.16.2p6:

NOTE: Atomic operations specifying memory_order_relaxed are relaxed only with respect to memory ordering. Implementations must still guarantee that any given atomic access to a particular atomic object be indivisible with respect to all other atomic accesses to that object.

Change WG14 7.16.2p13 by deleting its second sentence, “Implementations shall not move an atomic operation out of an unbounded loop.”, so that the paragraph reads:

Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.

Add a new section to WG14 named “Initialization”:

7.16.N Initialization

7.16.N.1 The ATOMIC_VAR_INIT macro

#include <stdatomic.h>
#define ATOMIC_VAR_INIT(C value)

The macro ATOMIC_VAR_INIT expands to a token sequence suitable for initializing an atomic variable of a type that is initialization-compatible with value. Concurrent access to the variable being initialized, even via an atomic operation, constitutes a data race. However, the default zero–initialization is guaranteed to produce a valid object where it applies.

EXAMPLE

atomic_int guide = ATOMIC_VAR_INIT(42);

A non-static atomic variable that is not explicitly initialized with ATOMIC_VAR_INIT is initially in an indeterminate state.

Add a new section to WG14 named “7.16.N.2 The atomic_init generic function”:

Synopsis

#include <stdatomic.h>
void atomic_init(volatile A *obj, C value);

Description

Initializes the atomic object obj to the value value, while also initializing any additional state that the implementation might need to carry for the atomic object. Although this function initializes an atomic object, it does not avoid data races.

EXAMPLE

atomic_init(&p->a, 42);