This paper proposes the addition of two new launch policies to std::async,
one synchronous (launch::sync) and one asynchronous (launch::task).
It also suggests changes to the default launch policy.
launch::task is an asynchronous execution policy that is
similar to the existing launch::async, except that it doesn't
require the creation of a new thread for each task.
The current asynchronous policy, launch::async, specifies that
execution occurs "as if in a new thread". The implementation is thus required
to create a new thread for each task. This is expensive.
The motivation for this imposed cost is that the task is guaranteed to start with fresh, default-constructed, thread local variables, and that those thread local variables are guaranteed to be destroyed immediately after completion.
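The guarantee can be observed with a thread local counter. The following is a minimal sketch, assuming a conforming implementation in which launch::async really runs each task as if in a new thread; the names calls_on_this_thread and count_call_in_async_task are illustrative only.

```cpp
#include <future>

// Per-thread counter; every new thread starts with its own zero-initialized copy.
thread_local int calls_on_this_thread = 0;

// Hypothetical helper: runs one task with launch::async and returns the
// counter value the task observed after incrementing it.
int count_call_in_async_task()
{
    std::future<int> f = std::async(std::launch::async, [] {
        return ++calls_on_this_thread;
    });
    return f.get();
}
```

On such an implementation every call returns 1, because no task ever sees the counter left behind by an earlier task, and the caller's own copy of the counter is never touched.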
A common use of thread local variables is to locally cache objects that are expensive to recreate. For such uses, destroying and reinitializing the thread local variables imposes an additional source of inefficiency on top of the mandated thread creation. Reuse of such thread locals is actually desirable.
In most other cases, reusing thread local variables across tasks is harmless.
Therefore, a launch policy that would allow the implementation to reuse a thread for more than one task execution would be a significant performance enhancement.
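The difference can be sketched with today's facilities. The code below is an illustration under stated assumptions, not the proposed implementation: the thread local "cache" is a stand-in, the reused thread is a plain std::thread that runs a batch of std::packaged_task objects, and all function names are hypothetical.

```cpp
#include <future>
#include <thread>
#include <vector>

// Stand-in for a thread local cache that is expensive to rebuild.
thread_local int  cache_builds = 0;
thread_local bool cache_ready  = false;

// Builds the cache on first use in a thread; returns the number of builds
// that have happened in the current thread so far.
int use_cache()
{
    if (!cache_ready) { ++cache_builds; cache_ready = true; }
    return cache_builds;
}

// Runs n tasks one after another on a single reused thread and returns the
// build count reported by the last task.
int cache_builds_with_reused_thread(int n)
{
    std::vector<std::packaged_task<int()>> tasks;
    std::vector<std::future<int>> results;
    for (int i = 0; i < n; ++i) {
        tasks.emplace_back(use_cache);
        results.push_back(tasks.back().get_future());
    }
    std::thread worker([&tasks] {
        for (auto& t : tasks) t();   // all tasks share this thread's cache
    });
    worker.join();
    int builds = 0;
    for (auto& r : results) builds = r.get();
    return builds;
}

// Runs the same n tasks with launch::async and returns the total number of
// cache builds across all of them.
int cache_builds_with_async(int n)
{
    int builds = 0;
    for (int i = 0; i < n; ++i)
        builds += std::async(std::launch::async, use_cache).get();
    return builds;
}
```

With a reused thread the cache is built once regardless of n; with launch::async, on a conforming implementation, it is rebuilt for every task.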
The common concerns about such thread reuse are:
The answers, in this proposal, are no and no.
Implementation-induced deadlocks are specifically disallowed, by introducing a requirement
that a task using the task (and async) policy shall be assigned a
thread no later than the first call to a wait function. The implementation may
avoid spawning too many threads and oversubscribing the CPU by taking advantage of its freedom
to use deferred or synchronous execution, if the user has included launch::deferred
or launch::sync as an allowed policy for the std::async call.
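The kind of program this requirement protects is one in which an earlier task waits for a later one. A minimal sketch, using only the existing launch::async policy (the function name is illustrative):

```cpp
#include <future>

// The outer task starts first and then blocks on a task launched later.
// With launch::async this cannot deadlock, since each task has its own thread.
int outer_waits_for_inner()
{
    std::future<int> outer = std::async(std::launch::async, [] {
        std::future<int> inner = std::async(std::launch::async, [] { return 20; });
        return inner.get() + 22;   // outer task waits here for the inner task
    });
    return outer.get();
}
```

An implementation that queued both tasks onto a single fixed worker thread would never run the inner task. Under the proposed requirement, the inner task must be given a thread by the time inner.get() is called.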
At program termination, completed or running tasks using the proposed launch::task
policy have the thread local variables of their corresponding threads destroyed before static
destruction takes place. This implies that exit may need to wait for the currently
running tasks to complete. Tasks that are launched after static destruction starts behave as if
launch::async has been used.
launch::sync is a synchronous policy that executes the task
directly in the std::async call.
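The intended behavior can be approximated today with std::packaged_task. The sketch below is a hypothetical stand-in, not the proposed specification: run_sync is an invented name, and it accepts only a callable taking no arguments.

```cpp
#include <future>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for std::async(std::launch::sync, f).
// Runs f immediately in the calling thread and returns a future that is
// already ready when the function returns.
template <class F>
auto run_sync(F&& f)
    -> std::future<decltype(std::declval<typename std::decay<F>::type&>()())>
{
    using R = decltype(std::declval<typename std::decay<F>::type&>()());
    std::packaged_task<R()> task(std::forward<F>(f));
    std::future<R> result = task.get_future();
    task();   // f runs here; its value or exception is stored in the shared state
    return result;
}
```

A call such as run_sync([] { return 42; }) yields a ready future holding 42, and an exception thrown by the callable is rethrown from get() rather than from run_sync itself.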
On its surface, a policy that executes the task immediately may seem superfluous; the user could
have just executed the task instead of going through the trouble of using std::async.
Its advantages become more apparent if we consider that a routine may take a launch policy as a
parameter, as in the following pseudocode:
void routine( std::launch policy, args...)
{
    /* ... */
    std::future<X> fx = std::async( policy, ... );
    /* ... */
}
Such parameterization is desirable, for example, if we want to be able to experiment with different launch policies and pick the one that delivers the best performance.
In such cases, it is very convenient to be able to tell routine to execute everything
synchronously, for the following reasons:
If routine does not work as intended, the problem may have something to
do with the asynchronous execution, or it may not. Switching to launch::sync allows us
to quickly determine which of these two is the case.
Running routine with launch::sync
can be very useful both as a sanity check (is it by chance faster than the supposedly parallel version?)
and as a baseline (how well does it scale?).
When routine is recursive, using launch::sync
for some of the recursive calls allows us finer control over which branches are executed in parallel
and which aren't.
In addition, launch::sync can be combined with other policies, to grant the implementation
the option to execute in the calling thread. This allows the implementation to better balance the load if,
for example, it detects that the task queue has grown too big.
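The existing combination launch::async | launch::deferred already gives the implementation a choice of this kind, and the choice can be observed. A minimal sketch (the outcome struct and run_with_choice are illustrative names):

```cpp
#include <chrono>
#include <future>

struct outcome {
    int  value;
    bool ran_deferred;   // true if the implementation picked launch::deferred
};

// Lets the implementation pick between the two allowed policies and
// reports which one it chose along with the result.
outcome run_with_choice()
{
    std::future<int> f = std::async(std::launch::async | std::launch::deferred,
                                    [] { return 42; });
    // wait_for returns future_status::deferred only for a deferred task.
    const bool deferred =
        f.wait_for(std::chrono::seconds(0)) == std::future_status::deferred;
    return outcome{ f.get(), deferred };
}
```

The result is the same either way; only ran_deferred depends on the implementation. Adding launch::sync to the set would presumably extend the same freedom to immediate execution in the calling thread.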
Half-seriously, the policy also allows one to obtain a ready future holding a specific value or
exception:
std::future<int> x = std::async( std::launch::sync, []{ return 42; } );
std::future<int> y = std::async( std::launch::sync, []() -> int { throw std::runtime_error( "Hello exceptional world!" ); } );
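For comparison, the same two ready futures can be obtained today through std::promise. The helper names ready_future and failed_future below are hypothetical and are not part of this proposal:

```cpp
#include <exception>
#include <future>
#include <utility>

// Returns a future that is already ready and holds the given value.
template <class T>
std::future<T> ready_future(T value)
{
    std::promise<T> p;
    p.set_value(std::move(value));
    return p.get_future();
}

// Returns a future that is already ready and holds the given exception.
template <class T, class E>
std::future<T> failed_future(E error)
{
    std::promise<T> p;
    p.set_exception(std::make_exception_ptr(std::move(error)));
    return p.get_future();
}
```

For example, ready_future(42) corresponds to x above, and failed_future<int>(std::runtime_error("Hello exceptional world!")) corresponds to y.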
The default launch policy is currently launch::async | launch::deferred and is unnamed.
This proposal suggests two changes. First, the default policy should be given a name, launch::default_.
Second, the default should be launch::sync | launch::async | launch::task | launch::deferred.
The default policy should be given a name both to simplify the specification and isolate any eventual changes to a single place, and to allow users to name it without spelling it out.
The plain std::async call, which implicitly uses the default policy, is, for many programmers,
their first encounter with parallelism in C++. It should make a good first impression, and good performance
is essential. The default policy should afford the implementation maximum flexibility in meeting the
performance expectations of a C++ programmer. That is why this paper suggests that the implementation should
be free to choose among all of the available policies.
Currently, there is still not much code that depends on the default, so the change will be relatively painless.
As more and more programmers take advantage of std::async, the default policy will progressively
become more entrenched and harder to change. The time for a change is now.
(All edits are relative to ISO/IEC 14882-2011.)
Change enum class launch in the synopsis of <future> in 30.6.1
[futures.overview] p1 as follows:
enum class launch : unspecified {
    async = unspecified,
    deferred = unspecified,
    task = unspecified,
    sync = unspecified,
    default_ = sync | async | task | deferred,
    implementation-defined
};
Change the first sentence of 30.6.1 [futures.overview] p2 as follows:
The enum type launch is an implementation-defined bitmask type (17.5.2.1.3) with launch::async, launch::deferred, launch::task, and launch::sync denoting individual bits.
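To illustrate how such a bitmask type is used, here is a hypothetical mirror of the proposed enumeration. The name launch_sketch, the concrete bit values, and the allows helper are all assumptions made for this example; the proposed wording leaves the values unspecified.

```cpp
// Hypothetical mirror of the proposed enum; the bit values are assumed.
enum class launch_sketch : unsigned {
    async    = 1u << 0,
    deferred = 1u << 1,
    task     = 1u << 2,
    sync     = 1u << 3,
    default_ = sync | async | task | deferred
};

constexpr launch_sketch operator|(launch_sketch a, launch_sketch b)
{
    return static_cast<launch_sketch>(static_cast<unsigned>(a) |
                                      static_cast<unsigned>(b));
}

constexpr launch_sketch operator&(launch_sketch a, launch_sketch b)
{
    return static_cast<launch_sketch>(static_cast<unsigned>(a) &
                                      static_cast<unsigned>(b));
}

// True if the given policy set includes the given individual bit.
constexpr bool allows(launch_sketch policy, launch_sketch bit)
{
    return static_cast<unsigned>(policy & bit) != 0u;
}
```

With these definitions, allows(launch_sketch::default_, launch_sketch::task) is true, while the combination async | deferred, which corresponds to the current default, does not allow sync.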
Change the first sentence of 30.6.8 [futures.async] p3 as follows:
Effects: The first function behaves the same as a call to the second function with a policy argument of launch::default_ and the same arguments for F and Args.
Add the following two bullets to 30.6.8 [futures.async] p3:
policy & launch::task is non-zero — equivalent to the
policy & launch::async case, except that the task may inherit the
thread_local variables from a previous completed task execution, and the
thread_local variables of the current execution are not necessarily destroyed immediately
after its completion. If the async call happens before a call to exit or return
from main, destructors for thread_local variables corresponding to the task's
thread will run before those for static duration objects. The call to exit or the return from
main may implicitly wait for currently running tasks using the launch::task policy to
complete. If the exit call or return from main happens before an std::async
call with launch::task policy then that call behaves as though it had used
launch::async policy. [Note: in a long-lived program, implementations are encouraged
to eventually destroy the thread_local variables of completed executions. — end note.]
policy & launch::sync is non-zero — calls
INVOKE(DECAY_COPY(std::forward<F>(f)), DECAY_COPY(std::forward<Args>(args))...).
Any return value is stored as the result in the shared state. Any exception propagated from the execution of
INVOKE(DECAY_COPY(std::forward<F>(f)), DECAY_COPY(std::forward<Args>(args))...)
is stored as the exceptional result in the shared state.
Add the following paragraph to 30.6.8 [futures.async] p3, after the bullets:
Tasks using the launch::async and launch::task policies shall be assigned a thread and begin execution no later than the first call to a wait function (30.6.4). [Note: In other words, the implementation is not allowed to deadlock if an earlier task waits for a later one. — end note.]
Change 30.6.8 [futures.async] p6 as follows:
Throws: system_error if policy is launch::async or launch::task and the implementation is unable to start a new thread.
Change 30.6.8 [futures.async] p7 as follows:
Error conditions:
resource_unavailable_try_again — if policy is launch::async or launch::task and the system is unable to start a new thread.
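From the caller's side, this error condition can be handled as in the following sketch, which uses only the existing launch::async and launch::deferred policies. The helper name launch_or_defer is hypothetical; launch::task would presumably be handled the same way.

```cpp
#include <future>
#include <system_error>

// Tries to start the work on a new thread; if the system cannot provide one,
// falls back to deferred execution in whichever thread calls get().
template <class F>
auto launch_or_defer(F work) -> std::future<decltype(work())>
{
    try {
        return std::async(std::launch::async, work);
    } catch (const std::system_error& e) {
        if (e.code() != std::errc::resource_unavailable_try_again)
            throw;   // some other failure; not ours to handle
        return std::async(std::launch::deferred, work);
    }
}
```

Only the launch itself is inside the try block, so a system_error thrown by the work is not mistaken for a failure to start a thread.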
Thanks to Hans Boehm, Herb Sutter, Niklas Gustafsson and Anthony Williams.
— end