Document Number: N2226
Submitter: Florian Weimer
Submission Date: 2018-03-26
Subject: Optional thread storage duration for the program's locale

Summary

This document proposes to allow implementations to use thread storage duration for the program's locale. It is related to papers N2225, N2227, and N2228.

Current implementations have a single global locale object shared between all threads, so a call to setlocale affects every thread in the process. As noted in 7.11.1.1, this can introduce data races.

POSIX allows the activation of a per-thread locale using the uselocale function. An alternative would be to standardize uselocale. However, few libraries written with POSIX in mind currently use it. For example, several JSON libraries currently call setlocale to switch to the C locale, which is suitable for parsing and formatting numbers in the JSON syntax. This is unsafe because it changes the locale of every thread and races with concurrent locale-sensitive calls in those threads.
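To illustrate the per-thread approach mentioned above, the following is a minimal sketch that assumes a POSIX.1-2008 system providing newlocale, uselocale, and freelocale in <locale.h>. The helper name format_json_number is hypothetical and not taken from any existing library. It formats a number in the C locale for the calling thread only, without calling setlocale.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper (an assumption for illustration): format a double
   using the "C" locale for the calling thread only.  Returns the value
   returned by snprintf, or -1 if the locale object cannot be created. */
int format_json_number(char *buf, size_t size, double value)
{
    /* Create a locale object for the "C" locale. */
    locale_t c_locale = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_locale == (locale_t)0)
        return -1;

    /* Activate it for this thread; other threads are not affected. */
    locale_t previous = uselocale(c_locale);

    int n = snprintf(buf, size, "%.17g", value);

    /* Restore the previous per-thread setting before freeing the object. */
    uselocale(previous);
    freelocale(c_locale);
    return n;
}
```

A library that currently brackets its formatting code with two setlocale calls could, under these assumptions, bracket it with two uselocale calls instead.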

POSIX uses the term global locale to refer to the program's locale. It does not say whether it is implemented with an object of thread storage duration.

With a locale object that has thread storage duration, the locale name returned by the setlocale function may be deallocated when the thread exits, so it is desirable to make subsequent access to the pointer undefined. (This aspect of the proposal is already part of POSIX.) As an alternative, an implementation could reuse the storage for a subsequent setlocale call, but this would only obfuscate questionable application behavior as far as memory debuggers are concerned.
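A program that needs the locale name beyond the lifetime of the calling thread can copy the string while the thread is still running. The following is a minimal sketch. The helper name copy_current_locale_name is hypothetical, and the choice of LC_ALL is an assumption made for illustration.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): return a
   malloc-allocated copy of the current LC_ALL locale name, or NULL on
   failure.  The caller owns the copy and releases it with free, so the
   copy stays valid after the thread that called setlocale has exited. */
char *copy_current_locale_name(void)
{
    /* A null locale argument queries the current locale without changing it. */
    const char *name = setlocale(LC_ALL, NULL);
    if (name == NULL)
        return NULL;

    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, name, len);
    return copy;
}
```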

Another alternative, partially implemented in POSIX, involves adding explicit locale arguments to all locale-sensitive functions. This would introduce numerous new interfaces, because many of the I/O functions are locale-sensitive on at least some systems.
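For comparison, the explicit-argument style looks as follows with one of the functions POSIX.1-2008 does provide, strcoll_l. This is a sketch under the assumption of a POSIX.1-2008 system. The helper name compare_in_c_locale is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): compare two
   strings using the collation rules of the "C" locale, regardless of the
   program's locale.  The locale is passed explicitly to strcoll_l, so no
   global or per-thread locale state is read or modified. */
int compare_in_c_locale(const char *a, const char *b)
{
    locale_t c_locale = newlocale(LC_COLLATE_MASK, "C", (locale_t)0);
    if (c_locale == (locale_t)0)
        return strcmp(a, b); /* fallback: "C" collation orders like strcmp */

    int result = strcoll_l(a, b, c_locale);
    freelocale(c_locale);
    return result;
}
```

Every locale-sensitive function would need a counterpart of this kind, which is the interface growth described above.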

Proposed Resolution

In 7.11.1.1 (The setlocale function), add:
A call to the setlocale function may introduce a data race with other calls to the setlocale function or with calls to functions that are affected by the current locale. It is implementation-defined whether the program's locale has thread storage duration. The program's locale in a new thread shall be the same as the program's locale for the current thread at the time of creation.
And:
The string pointed to shall not be modified by the program, but may be overwritten by a subsequent call to the setlocale function. If the returned pointer is accessed after the thread which has called the setlocale function has exited, the behavior is undefined.
In J.2 (Undefined behavior), add:
— Access to the pointer returned by the setlocale function after the thread that originally called the function has exited (7.11.1.1).
In J.3.12 (Library functions), add:
— Whether the program's locale has thread storage duration (7.11.1.1).