| Doc. no. | P0776R0 | 
| Date: | 2017-09-08 | 
| Project: | Programming Language C++ | 
| Audience: | SG1, LEWG | 
| Reply to: | Alisdair Meredith <ameredith1@bloomberg.net> | 
Original version of the paper for the 2017 pre-Albuquerque mailing.
The current working draft for the Parallelism TS is based on C++14. Many of the features were significantly revised and adopted into the C++17 standard, so this paper proposes revising the draft to reflect the new standard.
The first Parallelism TS achieved its purpose, and its contents were adopted into C++17 by applying the paper P0024R2 out of the Jacksonville 2016 meeting. However, there were several issues with the original TS that were resolved in the C++17 process, incorporating a further 11 papers and 4 LWG issue resolutions that were not applied back to the TS. Future TSes should be based on that more recent specification.
There are two namespace conventions for use inside the experimental namespace intended for TSes. The simplest is to use an inline namespace nested directly inside namespace std::experimental named after the TS, and incorporating a version suffix. This is used when the intent is to land the TS features directly into namespace std, should they be incorporated into a future standard. The alternative is to have a named (non-inline) namespace nested inside namespace std::experimental, and an inline version-number namespace inside that. This form is intended for use when it is expected that the named namespace will wrap the TS facilities in namespace std in the event that they are incorporated into a future standard.
The Parallelism TS uses the second formulation, while the documented adoption plan corresponds to the first formulation, which is how the features were applied by P0024R2.
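The two namespace conventions described above can be sketched as follows. This is a minimal illustration, not text from any TS: the outer namespace `lib` is a hypothetical stand-in for `std` (which user code must not extend), and the functions `first_style` and `second_style` are placeholders invented for the example.

```cpp
#include <cassert>

// "lib" is a hypothetical stand-in for std, which user code must not extend.
namespace lib {
namespace experimental {

// Convention 1: an inline namespace named after the TS, with a version suffix.
inline namespace parallelism_v2 {
    constexpr int first_style() { return 1; }
}

// Convention 2: a named, non-inline namespace wrapping an inline version namespace.
namespace parallel {
inline namespace v2 {
    constexpr int second_style() { return 2; }
}
}

}  // namespace experimental
}  // namespace lib

// Convention 1: the inline namespace is transparent, so users name only
// lib::experimental. Promotion to lib:: would just drop "experimental::".
static_assert(lib::experimental::first_style() == 1, "found via inline namespace");

// Convention 2: users must spell "parallel"; only the version part is transparent.
static_assert(lib::experimental::parallel::second_style() == 2, "found via parallel::");
static_assert(lib::experimental::parallel::v2::second_style() == 2, "explicit version");
```

Under the first convention, user code that names `lib::experimental::first_style` would migrate by deleting one qualifier. Under the second, the `parallel` qualifier would be expected to survive standardization.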
There were a number of issues with the exception_list facility proposed by the original TS. One major concern was that there was no way for a user to populate an exception_list of their own, if they wanted to provide their own parallel algorithms following the standard interface. The intent was always to nail this down as part of adopting the specification into the standard, but doing so turned out to be more controversial than expected, and ultimately this form of error handling was deferred to a future set of parallel execution policies.
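The population problem can be sketched as follows. The class below mirrors the observers in the TS synopsis (`size`, `begin`, `end`, `what`), but the `push_back` member is a hypothetical addition that is not part of the TS; it stands for the kind of mutator a user-written algorithm would need. The namespace `sketch` and the function `for_each_collecting` are likewise invented for illustration, and the loop is serial to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <utility>
#include <vector>

namespace sketch {

// Observers follow the TS synopsis; push_back is a hypothetical extension.
class exception_list : public std::exception {
public:
    using iterator = std::vector<std::exception_ptr>::const_iterator;

    std::size_t size() const noexcept { return errors_.size(); }
    iterator begin() const noexcept { return errors_.begin(); }
    iterator end() const noexcept { return errors_.end(); }
    const char* what() const noexcept override { return "sketch::exception_list"; }

    // NOT in the TS: without something like this, only the implementation
    // can create a non-empty list.
    void push_back(std::exception_ptr e) { errors_.push_back(std::move(e)); }

private:
    std::vector<std::exception_ptr> errors_;
};

// A user-written algorithm (serial here) that gathers every exception thrown
// by f and reports them together, in the style of the original TS.
template <class It, class F>
void for_each_collecting(It first, It last, F f) {
    exception_list errors;
    for (; first != last; ++first) {
        try {
            f(*first);
        } catch (...) {
            errors.push_back(std::current_exception());
        }
    }
    if (errors.size() != 0) throw errors;
}

}  // namespace sketch
```

With only the read-only interface from the TS, `for_each_collecting` could not be written by a user, which is the concern raised above.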
The most prominent concerns are discussed in paper P0322R0, but there is still no clear sign of progress on using this error-reporting scheme, or even on whether the exception_list class should be immutable. There are important design issues to reconsider for its use in the new task_block facility, given that the class is no longer needed for the parallel algorithms. It may be possible to resolve the problematic design concerns if the context is a class intended only for use in the task_block interface.
The parallel algorithms in C++17 use a different error reporting strategy, which is deferred to the parallel execution policy (and by default, simply calls terminate). It would seem unhelpful to retain the rejected error reporting behavior in the TS. If it were to be retained, it would be specified differently, by adopting new execution policies that have that behavior, which would have no impact on the specification of the algorithms themselves, so the duplication within the TS would be redundant.
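The idea that error reporting belongs to the policy rather than to the algorithm can be sketched as follows. Everything here is hypothetical: `policy_sketch`, `terminating_policy`, `collecting_policy`, and `for_each_with` are names invented for this example and are not the C++17 execution policies. The loop is serial, so the sketch shows only the division of responsibility, not parallel execution.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <utility>
#include <vector>

namespace policy_sketch {

// Hypothetical policy with the C++17 default behavior: an exception that
// escapes an element access function ends the program.
struct terminating_policy {
    void on_error(std::exception_ptr) const noexcept { std::terminate(); }
};

// Hypothetical policy with TS-like behavior: remember each exception so the
// caller can inspect them afterwards.
struct collecting_policy {
    std::vector<std::exception_ptr>& sink;
    void on_error(std::exception_ptr e) const { sink.push_back(std::move(e)); }
};

// The algorithm is written once; what happens on error is decided entirely
// by the policy argument.
template <class Policy, class It, class F>
void for_each_with(const Policy& policy, It first, It last, F f) {
    for (; first != last; ++first) {
        try {
            f(*first);
        } catch (...) {
            policy.on_error(std::current_exception());
        }
    }
}

}  // namespace policy_sketch
```

Adding a new error-handling behavior in this sketch means adding a new policy type; `for_each_with` itself does not change, which is why duplicating the algorithm specifications in the TS would be redundant.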
There were concerns about adopting execution policies with runtime state (among other concerns) so the dynamic execution policy was deliberately dropped from the feature set adopted for C++17. Given the implementation concerns, and a deliberate poll to remove the feature as part of standardization, its ongoing presence in the TS is questionable.
The names for the remaining three policies went through several revisions, and do not match the policy names in the TS. Given the policies for the intended semantics have been adopted, we should not have duplicate policies with different names in the TS.
A variety of concerns came up with the specification for the algorithms, questioning the assumption that the serial specification was adequate for the parallel form of each algorithm. Several algorithms required explicit wording to get the complexity correct. One or two algorithms lost their parallel form, as it was unhelpful (due to mandatory serial stages) in the presence of the more general parallel algorithms added by the TS. Similarly, it was not clear what restrictions to impose on algorithms taking input and output iterators in order to support parallelization (through some kind of scatter/gather approach), and requiring support in all cases essentially serialized those algorithms, even where better approaches may be known to work. Hence, the iterator categories for many algorithms were changed, which required separate declarations for each algorithm, rather than the table with a list of names specified in this TS. Porting the revised specification for each algorithm into the TS is of dubious benefit when we could just defer to the standard directly.
Two papers, P0296R2 and P0299R1, provide the vocabulary essential to provide forward progress guarantees, and then apply that vocabulary to the standard library. This wording is found only in C++17, and not back-ported to the TS, so we need to at least incorporate the C++17 specification by reference.
The resolution proposed by this paper is to update the reference to the base C++ Standard to refer to ISO 14882:2017, rather than 2014, and then strike the text for all components that were adopted (in some form) into C++17.
Essentially two components remain: exception_list and task_block. The exception_list class remains underspecified, with no progress on understanding the needs or implementation details. However, the new task_block component is specified to report errors using this class, so it must remain unless task_block is respecified to use an error-handling strategy more compatible with C++17.
This paper further recommends opening library issues to address the concerns raised by the ongoing presence of exception_list in this TS.
Finally, this paper proposes using the inline namespace parallelism_v2 to contain the contents of this experimental TS, rather than an inline version namespace inside a nested parallel namespace, because the nested parallel namespace is not intended for integration with a future C++ standard.
Another option would be to apply all of the C++17 papers to the TS. However, that would not be as simple as applying the edits to the standard, as many of those papers adjust wording that is not present in the TS. It is also not entirely clear that there is a benefit in duplicating the whole specification in this way.
Another idea is that some of the changes for C++17 were the 'minimal' changes necessary to get a well-supported specification at this time, with the expectation that a number of issues would be revisited in time, with more generic (or specific) specification to follow, allowing more parallelism with better error-handling strategies. It may be worth retaining the original TS specification as a place to perform those ongoing experiments. This direction is rejected by this paper as holding over far too much of a pre-standard specification with a broad set of defects unrelated to that task. It would be preferred to add targeted new features as those proposals arise.
One last option is to entirely strike the specification for exception_list given the issues that arose trying to nail down a complete specification. However, it remains an essential part of the error-handling strategy for task_block, and this paper does not aim to review existing components that are not directly impacted by the adoption of C++17. While a review of this design in the light of C++17 is encouraged, this paper is not the place for that review.
Strike the sections deleted in the table of contents below entirely from the working paper:
- Scope
- Normative references
Terms and definitions
- General
- Namespaces and headers
- Feature-testing recommendations
Execution policies
In general
Header <experimental/execution_policy> synopsis
Execution policy type trait
Sequential execution policy
Parallel execution policy
Parallel+Vector execution policy
Dynamic execution policy
execution_policy construct/assign
execution_policy object access
Execution policy objects
- Parallel exceptions
Exception reporting behavior
- Header <experimental/exception_list> synopsis
Parallel algorithms
In general
Requirements on user-provided function objects
Effect of execution policies on algorithm execution
ExecutionPolicy algorithm overloads
Definitions
Non-Numeric Parallel Algorithms
Header <experimental/algorithm> synopsis
For each
Numeric Parallel Algorithms
Header <experimental/numeric> synopsis
Reduce
Exclusive scan
Inclusive scan
Transform reduce
Transform exclusive scan
Transform inclusive scan
- Task Block
- Header <experimental/task_block> synopsis
- Class task_cancelled_exception
- task_cancelled_exception member function what
- Class task_block
- task_block member function template run
- task_block member function wait
- Function template define_task_block
- Exception Handling
In addition, make the following edits to the clauses that remain. Original clause numbering is retained for clarity, although it is expected that the clauses will be renumbered when the edits above are applied.
2 Normative references [parallel.references]
- The following referenced document is indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
  - ISO/IEC 14882:2017, Programming Languages — C++ [was: the undated reference ISO/IEC 14882:— with footnote 1, here and below]
- ISO/IEC 14882:2017 is herein called the C++ Standard. The library described in ISO/IEC 14882:2017 clauses 20-33 [was: 17-30] is herein called the C++ Standard Library. The C++ Standard Library components described in ISO/IEC 14882:2017 clauses 28, 29.8 and 23.10.10 [was: 25, 26.7 and 20.7.2] are herein called the C++ Standard Algorithms Library.
- Unless otherwise specified, the whole of the C++ Standard's Library introduction (C++17 §20) [was: (C++14 §17)] is included into this Technical Specification by reference.
1) To be published. Section references are relative to N3937. [footnote to the undated reference, struck]
4.1 Namespaces and headers [parallel.general.namespaces]
- Since the extensions described in this Technical Specification are experimental and not part of the C++ Standard Library, they should not be declared directly within namespace std. Unless otherwise specified, all components described in this Technical Specification are declared in namespace std::experimental::parallelism_v2 [was: std::experimental::parallel::v2]. [ Note: Once standardized, the components described by this Technical Specification are expected to be promoted to namespace std. — end note ]
- Unless otherwise specified, references to such entities described in this Technical Specification are assumed to be qualified with std::experimental::parallelism_v2 [was: std::experimental::parallel::v2], and references to entities described in the C++ Standard Library are assumed to be qualified with std::.
- Extensions that are expected to eventually be added to an existing header <meow> are provided inside the <experimental/meow> header, which shall include the standard contents of <meow> as if by
    #include <meow>
4.2 Feature-testing recommendations [parallel.general.features]
- An implementation that provides support for this Technical Specification shall define the feature test macro(s) in Table 1.
Table 1 — Feature Test Macro(s)
| Name | Value | Header |
| __cpp_lib_experimental_parallel_algorithm [row struck] | 201505 | <experimental/algorithm> <experimental/exception_list> <experimental/execution_policy> <experimental/numeric> |
| __cpp_lib_experimental_parallel_task_block | 201510 | <experimental/exception_list> <experimental/task_block> |
6.2 Header <experimental/exception_list> synopsis [parallel.exceptions.synopsis]
    namespace std::experimental {
    inline namespace parallelism_v2 {  // was: std::experimental::parallel, inline namespace v2

      class exception_list : public exception
      {
        public:
          using iterator = unspecified;  // was: typedef unspecified iterator;

          size_t size() const noexcept;
          iterator begin() const noexcept;
          iterator end() const noexcept;

          const char* what() const noexcept override;
      };
    }
    }
- The class exception_list owns a sequence of exception_ptr objects.
The parallel algorithms may use the exception_list to communicate uncaught exceptions encountered during parallel execution to the caller of the algorithm.
8.1 Header <experimental/task_block> synopsis [parallel.task_block.synopsis]
    namespace std::experimental {
    inline namespace parallelism_v2 {  // was: std::experimental::parallel, inline namespace v2
      class task_cancelled_exception;

      class task_block;

      template<class F>
        void define_task_block(F&& f);

      template<class F>
        void define_task_block_restore_thread(F&& f);
    }
    }
8.2 Class task_cancelled_exception [parallel.task_block.task_cancelled_exception]
    namespace std::experimental {
    inline namespace parallelism_v2 {  // was: std::experimental::parallel, inline namespace v2

      class task_cancelled_exception : public exception
      {
        public:
          task_cancelled_exception() noexcept;
          virtual const char* what() const noexcept;
      };
    }
    }
8.3 Class task_block [parallel.task_block.class]
    namespace std::experimental {
    inline namespace parallelism_v2 {  // was: std::experimental::parallel, inline namespace v2

      class task_block
      {
        private:
          ~task_block();

        public:
          task_block(const task_block&) = delete;
          task_block& operator=(const task_block&) = delete;
          void operator&() const = delete;

          template<class F>
            void run(F&& f);

          void wait();
      };
    }
    }
Thanks to Jared Hoberock for the initial review of this paper.
The following papers and issues were applied to the C++ Working Paper as part of C++17:
The following papers informed the discussion of the papers and issues above: