| Document Number: | P0574R0 |
| Date: | 2017-02-04 |
| Author: | Anthony
Williams Just Software Solutions Ltd |
| Audience: | SG1, LWG |
The C++17 draft features overloads of a lot of the standard library
algorithms which take an ExecutionPolicy argument, in order to
allow parallel implementations. However, the requirements of many of the
algorithms are written in such a way as to require a sequential
implementation, or a sub-optimal parallel implementation, thus eliminating
some or all of the benefit in the new overloads.
P0523r0: Wording for CH 10: Complexity
of parallel algorithms attempts to address this problem by a blanket
relaxation of the constraints for overloads with
an ExecutionPolicy. I think this is an incorrect solution. It
simultaneously grants too much leeway for some algorithms, while not fixing
the problems with others. Instead, I think it is better to address each
algorithm individually.
In this paper I have gathered those algorithms where the requirements are specified in such a way as to rule out an efficient parallel implementation, and proposed alternative specifications for those algorithms.
adjacent_find [alg.adjacent.find]Change paragraph 2 as follows:
Complexity: For the overloads with noExecutionPolicy, or anExecutionPolicyofstd::execution::sequential_policy, fFor a nonempty range, exactlymin((i - first) + 1, (last - first) - 1)applications of the corresponding predicate, whereiisadjacent_find’s return value. For the overloads with anExecutionPolicyother thanstd::execution::sequential_policy, at most(last-first)-1applications of the corresponding predicate.
Modify the requirements for copy_if in paragraph 12 as
follows:
Requires: The ranges[first, last)and[result, result + (last - first))shall not overlap. For the overloads with anExecutionPolicyother thanstd::execution::sequential_policy,std::iterator_traits<InputIterator>::value_typeshall beMoveConstructible.
Modify the requirements for remove_copy
and remove_copy_if in paragraph 7 as follows:
Requires: The ranges[first, last)and[result, result + (last - first))shall not overlap. The expression*result = *firstshall be valid. For the overloads with anExecutionPolicyother thanstd::execution::sequential_policy, the type of*firstshall satisfy theCopyConstructiblerequirements.
Modify the requirements for unique_copy
and unique_copy_if in paragraph 5 as follows:
Requires: The comparison function shall be an equivalence relation. The ranges[first, last)and[result, result+(last-first))shall not overlap. The expression*result = *firstshall be valid. LetTbe the value type ofInputIterator. For the overloads with noExecutionPolicy, or anExecutionPolicyofstd::execution::sequential_policy, the following shall hold:— If
InputIteratormeets the forward iterator requirements, then there are no additional requirements forT. Otherwise, ifOutputIteratormeets the forward iterator requirements and its value type is the same asT, thenTshall beCopyAssignable(Table 26). Otherwise,Tshall be bothCopyConstructible(Table 24) andCopyAssignable.For the overloads with an
ExecutionPolicyother thanstd::execution::sequential_policy,Tshall be bothCopyConstructible(Table 24) andCopyAssignable.
Modify the complexity for partition in paragraph 7 as
follows:
Complexity: For the overloads with noExecutionPolicy, or anExecutionPolicyofstd::execution::sequential_policy, iIfForwardIteratormeets the requirements for aBidirectionalIterator, at most(last - first) / 2swaps are done; otherwise at mostlast - firstswaps are done. Exactlylast - firstapplications of the predicate are done. For the overloads with anExecutionPolicyother thanstd::execution::sequential_policy, no worse than O(N log N) swaps, where N =last - first. Exactlylast - firstapplications of the predicate are done.
Modify the complexity for stable_partition in paragraph 11 as
follows:
Complexity: For the overloads with noExecutionPolicy, or anExecutionPolicyofstd::execution::sequential_policy, aAt most N log(N ) swaps, where N =last - first, but only O(N ) swaps if there is enough extra memory. Exactlylast - firstapplications of the predicate. For the overloads with anExecutionPolicyother thanstd::execution::sequential_policy, O(N log N) swaps, where N =last - first. Exactlylast - firstapplications of the predicate are done.
Modify the requirements for partition_copy in paragraph 12
as follows:
Requires:InputIterator’s value type shall beCopyAssignable, and shall be writable (24.2.1) to theout_trueandout_falseOutputIterators, and shall be convertible toPredicate’s argument type. The input range shall not overlap with either of the output ranges. For the overloads with anExecutionPolicyother thanstd::execution::sequential_policy,InputIterator’s value type shall also beCopyConstructible.
Modify the complexity for nth_element in paragraph 3 as
follows:
Complexity: For the overloads with noExecutionPolicy, or anExecutionPolicyofstd::execution::sequential_policy, lLinear on average. For the overloads with anExecutionPolicyother thanstd::execution::sequential_policy, O(N) executions of the predicate, and no worse than O(N log N) swaps, where N =last - first.
Modify the complexity for merge in paragraph 4 as
follows:
Complexity: For the overloads with noExecutionPolicy, or anExecutionPolicyofstd::execution::sequential_policy, aAt most(last1 - first1) + (last2 - first2) - 1comparisons. For the overloads with anExecutionPolicyother thanstd::execution::sequential_policy, O((last1 - first1) + (last2 - first2) - 1) comparisons.
Modify the complexity for inplace_merge in paragraph 8 as
follows:
Complexity: For the overloads with noExecutionPolicy, or anExecutionPolicyofstd::execution::sequential_policy, ifWhenenough additional memory is available,(last - first) - 1comparisons.If no additional memory is availableOtherwise, an algorithm with complexity at most O(Nlog(N)) may be used, whereN = last - first.
Modify paragraph 5 as follows:
Requires:binary_opshall neither invalidate iterators or subranges, nor modify elements in the range[first, last). For the overloads with anExecutionPolicyother thanstd::execution::sequential_policy, bothTandBinaryOperationshall beCopyConstructible. All ofbinary_op(init,*first),binary_op(*first,init),binary_op(init,init), andbinary_op(*first,*first)shall be convertible toT.
Modify paragraph 1 as follows:
Requires: Neitherunary_opnorbinary_opshall invalidate subranges, or modify elements in the range[first, last). For the overloads with anExecutionPolicyother thanstd::execution::sequential_policy, all ofT,UnaryOperationandBinaryOperationshall beCopyConstructible. All ofbinary_op(init,unary_op(*first),binary_op(unary_op(*first),init),binary_op(init,init), andbinary_op(unary_op(*first),unary_op(*first))shall be convertible toT.
Change paragraph 2 as follows:
Effects: For the overloads with noExecutionPolicy, or anExecutionPolicyofstd::execution::sequential_policy, fFor a non-empty range, the function creates an accumulatoraccwhose type isInputIterator’s value type, initializes it with*first, and assigns the result to*result. For every iteratoriin [first + 1,last) in order, creates an objectvalwhose type isInputIterator’s value type, initializes it with*i, computesval - accorbinary_op(val, acc), assigns the result to*(result + (i - first)), and move assigns fromvaltoacc. For the overloads with anExecutionPolicyother thanstd::execution::sequential_policy, for a non-empty range, for every valuedin [1,last-first-1) the function computes*(first+d)-*(first+d-1)orbinary_op(*(first+d),*(first+d-1))and assigns the result to*(result + d)