| Document Number: | P0574r1 |
| Date: | 2017-03-02 |
| Author: | Anthony
Williams Just Software Solutions Ltd |
| Reply-To: | Anthony
Williams Just Software Solutions Ltd |
| Bryce Adelstein Lelbach Lawrence Berkeley National Laboratory | |
| Audience: | SG1, LWG |
This paper addresses NB comments CH 10.
The C++17 draft features overloads of a lot of the standard library
algorithms which take an ExecutionPolicy argument, in order to
allow parallel implementations. However, the requirements of many of the
algorithms are written in such a way as to require a sequential
implementation, or a sub-optimal parallel implementation, thus eliminating
some or all of the benefit in the new overloads.
P0523r1 attempts to address this
problem by a blanket relaxation of the constraints for overloads with
an ExecutionPolicy. I think this is an incorrect solution. It
simultaneously grants too much leeway for some algorithms, while not fixing
the problems with others. Instead, I think it is better to address each
algorithm individually.
In this paper I have gathered those algorithms where the requirements are specified in such a way as to rule out an efficient parallel implementation, and proposed alternative specifications for those algorithms.
This paper has been updated since P0574r0 to incorporate changes suggested by SG1 at Kona 2017. Notably:
std::execution::sequential_policy is no longer special;
it is treated as any other ExecutionPolicy from the point
of view of complexity constraints.exclusive_scan,
inclusive_scan, transform_exclusive_scan
and transform_inclusive_scan.It has also been updated to incorporate changes suggested by LWG at Kona 2017, notably:
BinaryOperation, etc are function objects.last iterator.unique_copy, unique_copy_if and partition_copy to P0467r2.adjacent_difference and move it to P0467r2.Modify the complexity for adjacent_find paragraph 2 as follows:
2 Complexity: For the overloads with noExecutionPolicyFor a nonempty range, exactlymin((i - first) + 1, (last - first) - 1)applications of the corresponding predicate, whereiisadjacent_find's return value. For the overloads with anExecutionPolicy, O(last - first) applications of the corresponding predicate.
Modify the requirements for copy_if in paragraph 12 as follows:
12 Requires: The ranges[first, last)and[result, result + (last - first))shall not overlap. [Note: For the overloads with anExecutionPolicy, there may be a performance cost ifiterator_traits<ForwardIterator1>::value_typeis notMoveConstructible. — end note]
Modify the requirements for remove_copy and remove_copy_if in paragraph 7 as follows:
7 Requires: The ranges[first, last)and[result, result + (last - first))shall not overlap. The expression*result = *firstshall be valid. [Note: For the overloads with anExecutionPolicy, there may be a performance cost ifiterator_traits<ForwardIterator1>::value_typeis notMoveConstructible. — end note]
Modify the complexity for nth_element in paragraph 3 as follows:
3 Complexity: For the overloads with noExecutionPolicy, lLinear on average. For the overloads with anExecutionPolicy, O(N) applications of the predicate, and O(N log N) swaps, where N =last - first.
Modify the complexity for partition in paragraph 7 as follows:
7 Complexity:IfLet N =ForwardIteratormeets the requirements for aBidirectionalIterator, at most(last - first) / 2swaps are done; otherwise at mostlast - firstswaps are done. Exactlylast - firstapplications of the predicate are done.last - first:
- For the overloads with no
ExecutionPolicy, exactly N applications of the predicate. At most N / 2 swaps ifForwardIteratormeets theBidirectionalIteratorrequirements and at most N swaps otherwise.- For the overloads with an
ExecutionPolicy, O(N log N) swaps and O(N) applications of the predicate.
Modify the complexity for stable_partition in paragraph 11 as follows:
11 Complexity:At most N log(N) swaps, where N =Let N =last - first, but only O(N) swaps if there is enough extra memory. Exactlylast - firstapplications of the predicate.last - first:
- For the overloads with no
ExecutionPolicy, at most N log N swaps, but only O(N) swaps if there is enough extra memory. Exactly N applications of the predicate.- For the overloads with an
ExecutionPolicy, O(N log N) swaps and O(N) applications of the predicate.
Modify the complexity for merge in paragraph 4 as follows:
4 Complexity: For the overloads with noExecutionPolicy, aAt most(last1 - first1) + (last2 - first2) - 1comparisons. For the overloads with anExecutionPolicy, O((last1 - first1) + (last2 - first2)) comparisons.
Modify the complexity for inplace_merge in paragraph 8 as follows:
8 Complexity:When enough additional memory is available,Let N =(last - first) - 1comparisons. If no additional memory is available, an algorithm with complexity N log(N) may be used, where N =last - first.last - first:
- For the overloads with no
ExecutionPolicy, if enough additional memory is available, exactly N- 1comparisons.- For the overloads with no
ExecutionPolicyif no additional memory is available, and for the overloads with anExecutionPolicy, O(N log N) comparisons.
<numeric> synopsis [numeric.ops.overview]Add the following paragraph after paragraph 1:
1 The requirements on the types of algorithms' arguments that are described in the introduction to Clause 25 also apply to the following algorithms.
2 Throughout this Clause, the
UnaryOperation,BinaryOperation,BinaryOperation1andBinaryOperation2parameters are used whenever an algorithm expects a function object (20.14).
Modify the requirements for reduce in paragraph 5 as follows:
5 Requires:
Tshall beMoveConstructible(Table 23).- All of
binary_op(init, *first),binary_op(*first, init),binary_op(init, init), andbinary_op(*first, *first)shall be convertible toT.binary_opshall neither invalidate iterators or subranges, nor modify elements in the range[first, last.)]
Modify the requirements for transform_reduce in paragraph 1 as follows:
1 Requires:
Tshall beMoveConstructible(Table 23).- All of
binary_op(init, unary_op(*first)),binary_op(unary_op(*first), init),binary_op(init, init), andbinary_op(unary_op(*first), unary_op(*first))shall be convertible toT.- Neither
unary_opnorbinary_opshall invalidate subranges, or modify elements in the range[first, last.)]
Modify the requirements for exclusive_scan in paragraph 3 as follows:
3 Requires:
Tshall beMoveConstructible(Table 23).- All of
binary_op(init, *first),binary_op(init, init), andbinary_op(*first, *first)shall be convertible toT.binary_opshall neither invalidate iterators or subranges, nor modify elements in the ranges[first, lastor)][result, result + (last - first).)]
Modify the requirements for inclusive_scan in paragraph 3 as follows:
3 Requires:
- If
initis provided,Tshall beMoveConstructible(Table 23); otherwise,ForwardIterator1's value type shall beMoveConstructible(Table 23).- If
initis provided, all ofbinary_op(init, *first),binary_op(init, init), andbinary_op(*first, *first)shall be convertible toT; otherwise,binary_op(*first, *first)shall be convertible toForwardIterator1's value type.binary_opshall neither invalidate iterators or subranges, nor modify elements in the ranges[first, lastor)][result, result + (last - first).)]
Modify the requirements for transform_exclusive_scan in paragraph 1 as follows:
1 Requires:
Tshall beMoveConstructible(Table 23).- All of
binary_op(init, unary_op(*first)),binary_op(init, init), andbinary_op(unary_op(*first), unary_op(*first))shall be convertible toT.- Neither
unary_opnorbinary_opshall invalidate subranges, or modify elements in the ranges[first, lastor)][result, result + (last - first).)]
Modify the requirements for transform_inclusive_scan in paragraph 1 as follows:
1 Requires:Direction to the Editor: Add a footnote to each use of fully-closed ranges in the changes above that apply to 26.8 [numeric.ops] that says "The use of fully closed ranges is intentional".
- If
initis provided,Tshall beMoveConstructible(Table 23); otherwise,ForwardIterator1's value type shall beMoveConstructible(Table 23).- If
initis provided, all ofbinary_op(init, unary_op(*first)),binary_op(init, init), andbinary_op(unary_op(*first), unary_op(*first))shall be convertible toT; otherwise,binary_op(unary_op(*first), unary_op(*first))shall be convertible toForwardIterator1's value type.- Neither
unary_opnorbinary_opshall invalidate iterators or subranges, or modify elements in the ranges[first, lastor)][result, result + (last - first).)]