strtok calls from different threads depends on whether the internal state of this function has thread storage duration or not.
localtime function.
There is a strong incentive for implementations to make additional library state thread-specific, because a lot of library code uses functions such as strtok or localtime in ways that assume that their effects are restricted to the current thread. (This is less of a problem for a single application, where the application developer generally knows whether a program is single-threaded or not.)
In many cases, these calls are the only source of thread safety issues in such library code.
Several C implementations already have made changes to better support such legacy libraries, but these implementations are currently non-conforming.
For example, Solaris documents such behavior for several functions:
The strtok() function is safe to use in multithreaded applications because it saves its internal state in a thread-specific data area. [Source]
The asctime(), ctime(), gmtime(), and localtime() functions are safe to use in multithread applications because they employ thread-specific data. [Source]
Similarly, the Bionic C library for Android uses a buffer of thread storage duration for the strerror result.
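A buffer of thread storage duration can be sketched as follows. This is only an illustration of the technique, not Bionic's actual code; the function name describe_error, the buffer size, and the message format are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not taken from any real C library): formats a
   message into a buffer with thread storage duration, so every thread
   that calls it gets its own independent buffer. */
const char *describe_error(int errnum)
{
    static _Thread_local char buf[64];   /* one instance per thread */
    int n = snprintf(buf, sizeof buf, "error %d", errnum);
    assert(n > 0 && (size_t)n < sizeof buf);
    /* The result stays valid until the next call on this thread,
       or until this thread exits. */
    return buf;
}
```

With this layout, a call on one thread cannot overwrite the string another thread is still reading, but the pointer still must not be used after the thread that obtained it has exited.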
For some of the affected functions (such as localtime), POSIX suggests that thread-safe implementations are possible, but does not go into details.
This proposal subsumes the following previous submissions:
N2225 | Thread-local state for getenv, strtok, and set_constraint_handler |
N2226 | Thread-local state for the program locale (setlocale) |
N2227 | Thread-local state for library functions with internal state (rand, srand, c16rtomb, c32rtomb, mbrlen, mbrtoc16, mbrtoc32, mbrtowc, mbsrtowcs, mbtowc, wcrtomb, wcsrtombs, wctomb) |
N2228 | Thread-local state for library functions returning pointers to internal state (strerror, asctime, ctime, gmtime, localtime) |
strtok: the wording is tailored to this function in particular. It is based on a suggestion Philipp Klaus Krause posted to the reflector on 2018-04-19 (SC22WG14.15088).
strerror and setlocale: it is made clear that a previously returned string may be deallocated by a subsequent call and subsequent access is therefore undefined.
In 7.11.1.1 (The setlocale function), change:
The pointer to string returned by the setlocale function is such that a subsequent call with that string value and its associated category will restore that part of the program’s locale. The string pointed to shall not be modified by the program[deleted: , but may be overwritten by a subsequent call to the setlocale function]. [inserted: The behavior is undefined if the returned value is used after a subsequent call to the setlocale function, or after the thread which called the function to obtain the returned value has exited.]
In J.2 (Undefined behavior), add:
— A pointer returned by the setlocale function is used after a subsequent call to the function, or after the calling thread has exited (7.11.1.1).
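Under this wording, a program that wants to restore a locale later has to copy the returned string before calling setlocale again. A minimal sketch, assuming two hypothetical helpers named save_locale and restore_locale (they are not part of the proposal or of any library):

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'ed copy of the current locale
   string for `category`, or NULL on failure. */
char *save_locale(int category)
{
    const char *name = setlocale(category, NULL);   /* query only */
    if (name == NULL)
        return NULL;
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, name, len);   /* copy before any later setlocale call */
    return copy;
}

/* Hypothetical helper: restores a locale saved by save_locale and frees
   the copy. Returns 1 on success, 0 on failure. */
int restore_locale(int category, char *saved)
{
    assert(saved != NULL);
    int ok = setlocale(category, saved) != NULL;
    free(saved);
    return ok;
}
```

The copy is owned by the caller, so it remains usable regardless of later setlocale calls or of which thread made the original query.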
In 7.22.7 (Multibyte/wide character conversion functions), change:
For a state-dependent encoding, each function is placed into its initial conversion state [deleted: at program startup] [inserted: prior to the first call to the function] and can be returned to that state by a call with a null pointer as its character pointer argument, s. Subsequent calls with s as other than a null pointer cause the internal conversion state of the function to be altered as necessary. [inserted: It is implementation-defined whether internal conversion state has thread storage duration, and whether a newly created thread has the same state as the current thread at the time of creation, or the initial conversion state.]
In J.3.12 (Library functions), add:
— Whether the internal state of multibyte/wide character conversion functions has thread-storage duration, and its initial value in newly created threads (7.22.7).
In 7.24.5.8 (The strtok function), change:
A sequence of calls to the strtok function breaks the string pointed to by s1 into a sequence of tokens, each of which is delimited by a character from the string pointed to by s2. The first call in the sequence has a non-null first argument; subsequent calls in the sequence have a null first argument. [inserted: If any of the subsequent calls in the sequence is made by a different thread than the first, the behavior is undefined.] The separator string pointed to by s2 may be different from call to call.
In J.2 (Undefined behavior), add:
— A sequence of calls of the strtok function is made from different threads (7.24.5.8).
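A call sequence that stays within the proposed rule can be sketched as below: the first call and all follow-up calls happen inside one function, and therefore on one thread. The helper name count_tokens is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in `text`, which strtok modifies
   in place. The complete strtok sequence runs on the calling thread. */
size_t count_tokens(char *text, const char *delims)
{
    assert(text != NULL && delims != NULL);
    size_t count = 0;
    /* First call: non-null first argument starts the sequence. */
    for (char *tok = strtok(text, delims);
         tok != NULL;
         tok = strtok(NULL, delims))   /* later calls: null first argument */
        count++;
    return count;
}
```

Handing the continuation of such a sequence (the calls with a null first argument) to a different thread is the case the added sentence makes undefined.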
In 7.24.6.2 (The strerror function), change:
The strerror function returns a pointer to the string, the contents of which are locale-specific. The array pointed to shall not be modified by the program[deleted: , but may be overwritten by a subsequent call to the strerror function]. [inserted: The behavior is undefined if the returned value is used after a subsequent call to the strerror function, or after the thread which called the function to obtain the returned value has exited.]
In J.2 (Undefined behavior), add:
— A pointer returned by the strerror function is used after a subsequent call to the function, or after the calling thread has exited (7.24.6.2).
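Code that needs an error message beyond the next strerror call can copy it immediately. The following is a sketch under that assumption; copy_error_message is a hypothetical helper name, and truncation to the caller's buffer size is a choice made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the message for `errnum` into the caller's
   buffer right away, so no pointer returned by strerror is kept across a
   later strerror call or past the exit of the calling thread.
   The message is truncated if it does not fit. Returns buf. */
char *copy_error_message(int errnum, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    snprintf(buf, size, "%s", strerror(errnum));
    return buf;
}
```

The caller's buffer has whatever lifetime the caller gives it, so the copied text is unaffected by the invalidation rule in the changed paragraph.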
In 7.27.3 (Time conversion functions), change:
Except for the strftime function, these functions each return a pointer to one of two types of [deleted: static] objects: a broken-down time structure or an array of char. Execution of any of the functions that return a pointer to one of these object types may overwrite the information in any object of the same type pointed to by the value returned from any previous call to any of them and the functions are not required to avoid data races with each other. 322) [inserted: Accessing the returned pointer after the thread that called the function that returned it has exited results in undefined behavior.] The implementation shall behave as if no other library functions call these functions.
In J.2 (Undefined behavior), add:
— An attempt is made to access the pointer returned by the time conversion functions after the thread that originally called the function to obtain it has exited (7.27.3).
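Because the returned object may be overwritten by later calls and must not be accessed after the calling thread exits, portable code usually copies the broken-down time by value. A minimal sketch, with utc_snapshot as a hypothetical helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: converts `t` to UTC and copies the result into
   `*out`. Returns false if gmtime fails. */
bool utc_snapshot(time_t t, struct tm *out)
{
    assert(out != NULL);
    const struct tm *tmp = gmtime(&t);   /* points to library-owned storage */
    if (tmp == NULL)
        return false;
    *out = *tmp;   /* struct copy: independent of later time conversion calls */
    return true;
}
```

After the copy, later calls to gmtime or localtime, on any thread, no longer affect the caller's struct tm.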
In 7.28.1 (Restartable multibyte/wide character conversion functions), change:
If ps is a null pointer, each function uses its own internal mbstate_t object instead, which is initialized [deleted: at program startup] [inserted: prior to the first call to the function] to the initial conversion state; the functions are not required to avoid data races with other calls to the same function in this case. [inserted: It is implementation-defined whether the internal mbstate_t object has thread storage duration; if it has thread storage duration, it is initialized to the initial conversion state prior to the first call to the function on the new thread.] The implementation behaves as if no library function calls these functions with a null pointer for ps.
In 7.29.6.4 (Restartable multibyte/wide string conversion functions), change:
If ps is a null pointer, each function uses its own internal mbstate_t object instead, which is initialized [deleted: at program startup] [inserted: prior to the first call to the function] to the initial conversion state; the functions are not required to avoid data races with other calls to the same function in this case. [inserted: It is implementation-defined whether the internal mbstate_t object has thread storage duration; if it has thread storage duration, it is initialized to the initial conversion state prior to the first call to the function on the new thread.] The implementation behaves as if no library function calls these functions with a null pointer for ps.
In J.3.12 (Library functions), add:
— Whether internal mbstate_t objects have thread storage duration (7.28.1, 7.29.6.4).
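Programs can sidestep the implementation-defined behavior entirely by passing their own mbstate_t object instead of a null ps. The sketch below assumes a hypothetical helper count_mb_chars and relies on the rule that a zero-valued mbstate_t describes an initial conversion state.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: counts the multibyte characters in `s` under the
   current LC_CTYPE locale. Returns (size_t)-1 if `s` contains an invalid
   or incomplete multibyte sequence. */
size_t count_mb_chars(const char *s)
{
    assert(s != NULL);
    mbstate_t state;
    memset(&state, 0, sizeof state);   /* zero-valued: initial conversion state */
    size_t remaining = strlen(s);
    size_t count = 0;
    while (remaining > 0) {
        /* Non-null ps: the function's internal mbstate_t object is not used. */
        size_t n = mbrlen(s, remaining, &state);
        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;
        if (n == 0)
            break;   /* defensive: a null character ends the scan */
        s += n;
        remaining -= n;
        count++;
    }
    return count;
}
```

Since the conversion state lives in a local variable, the function behaves the same whether or not the implementation gives its internal objects thread storage duration.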
getenv, rand, srand, set_constraint_handler. The changes were either controversial, or there is no implementation precedent for them.
setlocale is aligned with that for strerror. No changes for the program's locale are proposed anymore. This means the current proposed wording is compatible with a future standardization of uselocale from POSIX or a similar facility.
strerror is updated to make clear that a subsequent call to the function invalidates a previously returned pointer.
strtok proposal has been rewritten based on a discussion on the reflector.
mbstate_t objects per function.
mbtowc and wctomb are now covered.