[algorithms.parallel] - C++17 → C++20

tmp/tmppm29sv2y/{from.md → to.md} +105 -71

@@ -1,15 +1,15 @@
 ## Parallel algorithms <a id="algorithms.parallel">[[algorithms.parallel]]</a>
-This section describes components that C++programs may use to perform
-operations on containers and other sequences in parallel.
-### Terms and definitions <a id="algorithms.parallel.defns">[[algorithms.parallel.defns]]</a>
-A *parallel algorithm* is a function template listed in this
-International Standard with a template parameter named
-`ExecutionPolicy`.
 Parallel algorithms access objects indirectly accessible via their
 arguments by invoking the following functions:
 - All operations of the categories of the iterators that the algorithm
@@ -17,11 +17,11 @@ arguments by invoking the following functions:
 - Operations on those sequence elements that are required by its
   specification.
 - User-provided function objects to be applied during the execution of
   the algorithm, if required by the specification.
 - Operations on those function objects required by the specification.
-  \[*Note 1*: See  [[algorithms.general]]. — *end note*]
 These functions are herein called *element access functions*.
 [*Example 1*:
@@ -34,62 +34,123 @@ The `sort` function may invoke the following element access functions:
   preconditions specified in [[sort]]).
 - The user-provided `Compare` function object.
 — *end example*]
 ### Requirements on user-provided function objects <a id="algorithms.parallel.user">[[algorithms.parallel.user]]</a>
 Unless otherwise specified, function objects passed into parallel
 algorithms as objects of type `Predicate`, `BinaryPredicate`, `Compare`,
 `UnaryOperation`, `BinaryOperation`, `BinaryOperation1`,
 `BinaryOperation2`, and the operators used by the analogous overloads to
 these parallel algorithms that could be formed by the invocation with
 the specified default predicate or operation (where applicable) shall
 not directly or indirectly modify objects via their arguments, nor shall
-they rely on the identity of the provided objects..
 ### Effect of execution policies on algorithm execution <a id="algorithms.parallel.exec">[[algorithms.parallel.exec]]</a>
-Parallel algorithms have template parameters named `ExecutionPolicy` (
-[[execpol]]) which describe the manner in which the execution of these
 algorithms may be parallelized and the manner in which they apply the
 element access functions.
 Unless otherwise stated, implementations may make arbitrary copies of
 elements (with type `T`) from sequences where
 `is_trivially_copy_constructible_v<T>` and
 `is_trivially_destructible_v<T>` are `true`.
-[*Note 1*: This implies that user-supplied function objects should not
 rely on object identity of arguments for such input sequences. Users for
 whom the object identity of the arguments to these function objects is
 important should consider using a wrapping iterator that returns a
-non-copied implementation object such as `reference_wrapper<T>` (
-[[refwrap]]) or some equivalent solution. — *end note*]
 The invocations of element access functions in parallel algorithms
 invoked with an execution policy object of type
 `execution::sequenced_policy` all occur in the calling thread of
 execution.
-[*Note 2*: The invocations are not interleaved; see
 [[intro.execution]]. — *end note*]
 The invocations of element access functions in parallel algorithms
 invoked with an execution policy object of type
-`execution::parallel_policy` are permitted to execute in either the
 invoking thread of execution or in a thread of execution implicitly
 created by the library to support parallel algorithm execution. If the
-threads of execution created by `thread` ([[thread.thread.class]])
-provide concurrent forward progress guarantees ([[intro.progress]]),
-then a thread of execution implicitly created by the library will
-provide parallel forward progress guarantees; otherwise, the provided
-forward progress guarantee is *implementation-defined*. Any such
-invocations executing in the same thread of execution are
-indeterminately sequenced with respect to each other.
-[*Note 3*: It is the caller’s responsibility to ensure that the
 invocation does not introduce data races or deadlocks. — *end note*]
 [*Example 1*:
 ``` cpp
@@ -109,13 +170,13 @@ to the container `v`.
 ``` cpp
 std::atomic<int> x{0};
 int a[] = {1,2};
 std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int) {
-  x.fetch_add(1, std::memory_order_relaxed);
   // spin wait for another iteration to change the value of x
-  while (x.load(std::memory_order_relaxed) == 1) { } // incorrect: assumes execution order
 });
 ```
 The above example depends on the order of execution of the iterations,
 and will not terminate if both iterations are executed sequentially on
@@ -139,78 +200,51 @@ The above example synchronizes access to object `x` ensuring that it is
 incremented correctly.
 — *end example*]
 The invocations of element access functions in parallel algorithms
-invoked with an execution policy of type
 `execution::parallel_unsequenced_policy` are permitted to execute in an
 unordered fashion in unspecified threads of execution, and unsequenced
 with respect to one another within each thread of execution. These
 threads of execution are either the invoking thread of execution or
 threads of execution implicitly created by the library; the latter will
 provide weakly parallel forward progress guarantees.
-[*Note 4*: This means that multiple function object invocations may be
 interleaved on a single thread of execution, which overrides the usual
 guarantee from [[intro.execution]] that function executions do not
-interleave with one another. — *end note*]
-Since `execution::parallel_unsequenced_policy` allows the execution of
-element access functions to be interleaved on a single thread of
-execution, blocking synchronization, including the use of mutexes, risks
-deadlock. Thus, the synchronization with
-`execution::parallel_unsequenced_policy` is restricted as follows: A
-standard library function is *vectorization-unsafe* if it is specified
-to synchronize with another function invocation, or another function
-invocation is specified to synchronize with it, and if it is not a
-memory allocation or deallocation function. Vectorization-unsafe
-standard library functions may not be invoked by user code called from
-`execution::parallel_unsequenced_policy` algorithms.
-[*Note 5*: Implementations must ensure that internal synchronization
-inside standard library functions does not prevent forward progress when
-those functions are executed by threads of execution with weakly
-parallel forward progress guarantees. — *end note*]
-[*Example 4*:
-``` cpp
-int x = 0;
-std::mutex m;
-int a[] = {1,2};
-std::for_each(std::execution::par_unseq, std::begin(a), std::end(a), [&](int) {
-  std::lock_guard<mutex> guard(m); // incorrect: lock_guard constructor calls m.lock()
-  ++x;
-});
-```
-The above program may result in two consecutive calls to `m.lock()` on
-the same thread of execution (which may deadlock), because the
-applications of the function object are not guaranteed to run on
-different threads of execution.
-— *end example*]
-[*Note 6*: The semantics of the `execution::parallel_policy` or the
-`execution::parallel_unsequenced_policy` invocation allow the
-implementation to fall back to sequential execution if the system cannot
-parallelize an algorithm invocation due to lack of
-resources. — *end note*]
 If an invocation of a parallel algorithm uses threads of execution
 implicitly created by the library, then the invoking thread of execution
 will either
-- temporarily block with forward progress guarantee delegation (
-  [[intro.progress]]) on the completion of these library-managed threads
   of execution, or
 - eventually execute an element access function;
 the thread of execution will continue to do so until the algorithm is
 finished.
-[*Note 7*: In blocking with forward progress guarantee delegation in
 this context, a thread of execution created by the library is considered
 to have finished execution as soon as it has finished the execution of
 the particular element access function that the invoking thread of
 execution logically depends on. — *end note*]
@@ -248,7 +282,7 @@ says “at most *expr*” or “exactly *expr*” and does not specify the
 number of assignments or swaps, and *expr* is not already expressed with
 𝑂() notation, the complexity of the algorithm shall be
 𝑂(*expr*).
 Parallel algorithms shall not participate in overload resolution unless
-`is_execution_policy_v<decay_t<ExecutionPolicy>>` is `true`.

 ## Parallel algorithms <a id="algorithms.parallel">[[algorithms.parallel]]</a>
+### Preamble <a id="algorithms.parallel.defns">[[algorithms.parallel.defns]]</a>
+Subclause [[algorithms.parallel]] describes components that C++ programs
+may use to perform operations on containers and other sequences in
+parallel.
+A *parallel algorithm* is a function template listed in this document
+with a template parameter named `ExecutionPolicy`.
 Parallel algorithms access objects indirectly accessible via their
 arguments by invoking the following functions:
 - All operations of the categories of the iterators that the algorithm
 - Operations on those sequence elements that are required by its
   specification.
 - User-provided function objects to be applied during the execution of
   the algorithm, if required by the specification.
 - Operations on those function objects required by the specification.
+  \[*Note 1*: See  [[algorithms.requirements]]. — *end note*]
 These functions are herein called *element access functions*.
 [*Example 1*:
   preconditions specified in [[sort]]).
 - The user-provided `Compare` function object.
 — *end example*]
+A standard library function is *vectorization-unsafe* if it is specified
+to synchronize with another function invocation, or another function
+invocation is specified to synchronize with it, and if it is not a
+memory allocation or deallocation function.
+[*Note 2*: Implementations must ensure that internal synchronization
+inside standard library functions does not prevent forward progress when
+those functions are executed by threads of execution with weakly
+parallel forward progress guarantees. — *end note*]
+[*Example 2*:
+``` cpp
+int x = 0;
+std::mutex m;
+void f() {
+  int a[] = {1,2};
+  std::for_each(std::execution::par_unseq, std::begin(a), std::end(a), [&](int) {
+    std::lock_guard<std::mutex> guard(m);  // incorrect: lock_guard constructor calls m.lock()
+    ++x;
+  });
+}
+```
+The above program may result in two consecutive calls to `m.lock()` on
+the same thread of execution (which may deadlock), because the
+applications of the function object are not guaranteed to run on
+different threads of execution.
+— *end example*]
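One way to avoid the deadlock risk shown above is to drop the shared counter and the lock, and let the algorithm combine per-element results instead. The following is a minimal sketch and not text from the standard: `count_elements` and `count_elements_demo` are hypothetical names, and the policy is a template parameter so that the same code can be called with `std::execution::par_unseq` where the toolchain supports it.

```cpp
#include <cassert>
#include <execution>
#include <functional>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: counts the elements of v without blocking
// synchronization. Each element is mapped to 1 and the algorithm combines
// the partial results, so user code never locks a mutex.
template <class ExecutionPolicy>
int count_elements(ExecutionPolicy&& policy, const std::vector<int>& v) {
  return std::transform_reduce(std::forward<ExecutionPolicy>(policy),
                               v.begin(), v.end(),
                               0,                       // initial value
                               std::plus<>{},           // combine partial counts
                               [](int) { return 1; });  // one per element
}

// Usage sketch with the sequential policy.
inline void count_elements_demo() {
  const std::vector<int> a{1, 2};
  assert(count_elements(std::execution::seq, a) == 2);
}
```

The function object passed here calls no standard library function that synchronizes with another invocation, so it does not fall under the definition of vectorization-unsafe given above.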
 ### Requirements on user-provided function objects <a id="algorithms.parallel.user">[[algorithms.parallel.user]]</a>
 Unless otherwise specified, function objects passed into parallel
 algorithms as objects of type `Predicate`, `BinaryPredicate`, `Compare`,
 `UnaryOperation`, `BinaryOperation`, `BinaryOperation1`,
 `BinaryOperation2`, and the operators used by the analogous overloads to
 these parallel algorithms that could be formed by the invocation with
 the specified default predicate or operation (where applicable) shall
 not directly or indirectly modify objects via their arguments, nor shall
+they rely on the identity of the provided objects.
 ### Effect of execution policies on algorithm execution <a id="algorithms.parallel.exec">[[algorithms.parallel.exec]]</a>
+Parallel algorithms have template parameters named `ExecutionPolicy`
+[[execpol]] which describe the manner in which the execution of these
 algorithms may be parallelized and the manner in which they apply the
 element access functions.
+If an object is modified by an element access function, the algorithm
+will perform no other unsynchronized accesses to that object. The
+modifying element access functions are those which are specified as
+modifying the object.
+[*Note 1*: For example, `swap`, `++`, `--`, `@=`, and assignments
+modify the object. For the assignment and `@=` operators, only the left
+argument is modified. — *end note*]
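As an illustration of a modifying element access function, the sketch below applies `++` to each element. It is an assumption-laden example rather than standard text: `increment_all` and `increment_all_demo` are hypothetical names, and the function object is assumed to touch only the element it is handed. Under the paragraph above, the algorithm itself performs no other unsynchronized access to the element being incremented; keeping the function object free of shared state remains the caller's job.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <utility>
#include <vector>

// Hypothetical helper: the function object modifies its argument with ++
// and reads or writes no other object.
template <class ExecutionPolicy>
void increment_all(ExecutionPolicy&& policy, std::vector<int>& v) {
  std::for_each(std::forward<ExecutionPolicy>(policy), v.begin(), v.end(),
                [](int& x) { ++x; });
}

// Usage sketch with the sequential policy.
inline void increment_all_demo() {
  std::vector<int> v{1, 2, 3};
  increment_all(std::execution::seq, v);
  assert((v == std::vector<int>{2, 3, 4}));
}
```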
 Unless otherwise stated, implementations may make arbitrary copies of
 elements (with type `T`) from sequences where
 `is_trivially_copy_constructible_v<T>` and
 `is_trivially_destructible_v<T>` are `true`.
+[*Note 2*: This implies that user-supplied function objects should not
 rely on object identity of arguments for such input sequences. Users for
 whom the object identity of the arguments to these function objects is
 important should consider using a wrapping iterator that returns a
+non-copied implementation object such as `reference_wrapper<T>`
+[[refwrap]] or some equivalent solution. — *end note*]
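The note above can be sketched as follows. This is one possible "equivalent solution" rather than the wrapping iterator the note mentions: it builds a sequence of `reference_wrapper<const int>` and runs the algorithm over that sequence. The names `contains_object` and `contains_object_demo` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: returns true if target points at an element of v.
// The predicate compares addresses, so it needs the original objects rather
// than copies the implementation may make of trivially copyable elements.
template <class ExecutionPolicy>
bool contains_object(ExecutionPolicy&& policy, const std::vector<int>& v,
                     const int* target) {
  // Each wrapper refers to an original element; copying a wrapper does not
  // change the object that get() refers to.
  std::vector<std::reference_wrapper<const int>> refs(v.begin(), v.end());
  return std::any_of(std::forward<ExecutionPolicy>(policy),
                     refs.begin(), refs.end(),
                     [target](std::reference_wrapper<const int> r) {
                       return &r.get() == target;
                     });
}

// Usage sketch with the sequential policy.
inline void contains_object_demo() {
  const std::vector<int> v{1, 2, 3};
  const int other = 2;  // equal value, different object
  assert(contains_object(std::execution::seq, v, &v[1]));
  assert(!contains_object(std::execution::seq, v, &other));
}
```

The predicate relies on the identity of the referenced `int`, not on the identity of the wrapper it receives, so copies of the wrapper are harmless.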
 The invocations of element access functions in parallel algorithms
 invoked with an execution policy object of type
 `execution::sequenced_policy` all occur in the calling thread of
 execution.
+[*Note 3*: The invocations are not interleaved; see
 [[intro.execution]]. — *end note*]
 The invocations of element access functions in parallel algorithms
 invoked with an execution policy object of type
+`execution::unsequenced_policy` are permitted to execute in an unordered
+fashion in the calling thread of execution, unsequenced with respect to
+one another in the calling thread of execution.
+[*Note 4*: This means that multiple function object invocations may be
+interleaved on a single thread of execution, which overrides the usual
+guarantee from [[intro.execution]] that function executions do not
+overlap with one another. — *end note*]
+The behavior of a program is undefined if it invokes a
+vectorization-unsafe standard library function from user code called
+from an `execution::unsequenced_policy` algorithm.
+[*Note 5*: Because `execution::unsequenced_policy` allows the execution
+of element access functions to be interleaved on a single thread of
+execution, blocking synchronization, including the use of mutexes, risks
+deadlock. — *end note*]
+The invocations of element access functions in parallel algorithms
+invoked with an execution policy object of type
+`execution::parallel_policy` are permitted to execute either in the
 invoking thread of execution or in a thread of execution implicitly
 created by the library to support parallel algorithm execution. If the
+threads of execution created by `thread` [[thread.thread.class]] or
+`jthread` [[thread.jthread.class]] provide concurrent forward progress
+guarantees [[intro.progress]], then a thread of execution implicitly
+created by the library will provide parallel forward progress
+guarantees; otherwise, the provided forward progress guarantee is
+*implementation-defined*. Any such invocations executing in the same
+thread of execution are indeterminately sequenced with respect to each
+other.
+[*Note 6*: It is the caller’s responsibility to ensure that the
 invocation does not introduce data races or deadlocks. — *end note*]
 [*Example 1*:
 ``` cpp
 ``` cpp
 std::atomic<int> x{0};
 int a[] = {1,2};
 std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int) {
+  x.fetch_add(1, std::memory_order::relaxed);
   // spin wait for another iteration to change the value of x
+  while (x.load(std::memory_order::relaxed) == 1) { } // incorrect: assumes execution order
 });
 ```
 The above example depends on the order of execution of the iterations,
 and will not terminate if both iterations are executed sequentially on
 incremented correctly.
 — *end example*]
 The invocations of element access functions in parallel algorithms
+invoked with an execution policy object of type
 `execution::parallel_unsequenced_policy` are permitted to execute in an
 unordered fashion in unspecified threads of execution, and unsequenced
 with respect to one another within each thread of execution. These
 threads of execution are either the invoking thread of execution or
 threads of execution implicitly created by the library; the latter will
 provide weakly parallel forward progress guarantees.
+[*Note 7*: This means that multiple function object invocations may be
 interleaved on a single thread of execution, which overrides the usual
 guarantee from [[intro.execution]] that function executions do not
+overlap with one another. — *end note*]
+The behavior of a program is undefined if it invokes a
+vectorization-unsafe standard library function from user code called
+from an `execution::parallel_unsequenced_policy` algorithm.
+[*Note 8*: Because `execution::parallel_unsequenced_policy` allows the
+execution of element access functions to be interleaved on a single
+thread of execution, blocking synchronization, including the use of
+mutexes, risks deadlock. — *end note*]
+[*Note 9*: The semantics of invocation with
+`execution::unsequenced_policy`, `execution::parallel_policy`, or
+`execution::parallel_unsequenced_policy` allow the implementation to
+fall back to sequential execution if the system cannot parallelize an
+algorithm invocation, e.g., due to lack of resources. — *end note*]
 If an invocation of a parallel algorithm uses threads of execution
 implicitly created by the library, then the invoking thread of execution
 will either
+- temporarily block with forward progress guarantee delegation
+  [[intro.progress]] on the completion of these library-managed threads
   of execution, or
 - eventually execute an element access function;
 the thread of execution will continue to do so until the algorithm is
 finished.
+[*Note 10*: In blocking with forward progress guarantee delegation in
 this context, a thread of execution created by the library is considered
 to have finished execution as soon as it has finished the execution of
 the particular element access function that the invoking thread of
 execution logically depends on. — *end note*]
 number of assignments or swaps, and *expr* is not already expressed with
 𝑂() notation, the complexity of the algorithm shall be
 𝑂(*expr*).
 Parallel algorithms shall not participate in overload resolution unless
+`is_execution_policy_v<remove_cvref_t<ExecutionPolicy>>` is `true`.
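The constraint above can be imitated in user code. The sketch below is illustrative only and does not show how any library implements it: `strip_cvref_t`, `takes_policy`, and `takes_policy_demo` are hypothetical names, and `strip_cvref_t` is a C++17-compatible spelling of what C++20 provides as `std::remove_cvref_t`.

```cpp
#include <cassert>
#include <execution>
#include <type_traits>

// C++17-compatible spelling of what C++20 names std::remove_cvref_t.
template <class T>
using strip_cvref_t = std::remove_cv_t<std::remove_reference_t<T>>;

// Hypothetical overload that, like a parallel algorithm, is viable only when
// the stripped argument type is an execution policy.
template <class ExecutionPolicy,
          std::enable_if_t<
              std::is_execution_policy_v<strip_cvref_t<ExecutionPolicy>>,
              int> = 0>
constexpr bool takes_policy(ExecutionPolicy&&) { return true; }

// Fallback chosen for every other argument type.
template <class T,
          std::enable_if_t<!std::is_execution_policy_v<strip_cvref_t<T>>,
                           int> = 0>
constexpr bool takes_policy(T&&) { return false; }

inline void takes_policy_demo() {
  // std::execution::seq is a const lvalue, so ExecutionPolicy deduces to a
  // const reference type; stripping cv and reference yields the policy type.
  assert(takes_policy(std::execution::seq));
  assert(!takes_policy(42));
}
```

For class types such as the standard policies, `decay_t` and `remove_cvref_t` yield the same type; they differ only for array and function types, where `decay_t` additionally converts to a pointer.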
