From Jason Turner

[algorithms.parallel]

Diff to HTML by rtfpessoa

Files changed (1) hide show
  1. tmp/tmpocucftxj/{from.md → to.md} +175 -31
tmp/tmpocucftxj/{from.md → to.md} RENAMED
@@ -5,22 +5,31 @@
5
  Subclause [[algorithms.parallel]] describes components that C++ programs
6
  may use to perform operations on containers and other sequences in
7
  parallel.
8
 
9
  A *parallel algorithm* is a function template listed in this document
10
- with a template parameter named `ExecutionPolicy`.
 
11
 
12
- Parallel algorithms access objects indirectly accessible via their
 
 
 
 
 
 
 
 
13
  arguments by invoking the following functions:
14
 
15
- - All operations of the categories of the iterators that the algorithm
16
- is instantiated with.
17
  - Operations on those sequence elements that are required by its
18
  specification.
19
- - User-provided function objects to be applied during the execution of
20
  the algorithm, if required by the specification.
21
- - Operations on those function objects required by the specification.
22
  \[*Note 1*: See  [[algorithms.requirements]]. — *end note*]
23
 
24
  These functions are herein called *element access functions*.
25
 
26
  [*Example 1*:
@@ -37,11 +46,12 @@ The `sort` function may invoke the following element access functions:
37
  — *end example*]
38
 
39
  A standard library function is *vectorization-unsafe* if it is specified
40
  to synchronize with another function invocation, or another function
41
  invocation is specified to synchronize with it, and if it is not a
42
- memory allocation or deallocation function.
 
43
 
44
  [*Note 2*: Implementations must ensure that internal synchronization
45
  inside standard library functions does not prevent forward progress when
46
  those functions are executed by threads of execution with weakly
47
  parallel forward progress guarantees. — *end note*]
@@ -67,25 +77,26 @@ different threads of execution.
67
 
68
  — *end example*]
69
 
70
  ### Requirements on user-provided function objects <a id="algorithms.parallel.user">[[algorithms.parallel.user]]</a>
71
 
72
- Unless otherwise specified, function objects passed into parallel
73
- algorithms as objects of type `Predicate`, `BinaryPredicate`, `Compare`,
74
- `UnaryOperation`, `BinaryOperation`, `BinaryOperation1`,
75
- `BinaryOperation2`, and the operators used by the analogous overloads to
 
 
76
  these parallel algorithms that are formed by an invocation with the
77
  specified default predicate or operation (where applicable) shall not
78
  directly or indirectly modify objects via their arguments, nor shall
79
  they rely on the identity of the provided objects.
80
 
81
  ### Effect of execution policies on algorithm execution <a id="algorithms.parallel.exec">[[algorithms.parallel.exec]]</a>
82
 
83
- Parallel algorithms have template parameters named `ExecutionPolicy`
84
- [[execpol]] which describe the manner in which the execution of these
85
- algorithms may be parallelized and the manner in which they apply the
86
- element access functions.
87
 
88
  If an object is modified by an element access function, the algorithm
89
  will perform no other unsynchronized accesses to that object. The
90
  modifying element access functions are those which are specified as
91
  modifying the object.
@@ -257,32 +268,165 @@ During the execution of a parallel algorithm, if temporary memory
257
  resources are required for parallelization and none are available, the
258
  algorithm throws a `bad_alloc` exception.
259
 
260
  During the execution of a parallel algorithm, if the invocation of an
261
  element access function exits via an uncaught exception, the behavior is
262
- determined by the `ExecutionPolicy`.
263
 
264
- ### `ExecutionPolicy` algorithm overloads <a id="algorithms.parallel.overloads">[[algorithms.parallel.overloads]]</a>
265
 
266
  Parallel algorithms are algorithm overloads. Each parallel algorithm
267
- overload has an additional template type parameter named
268
- `ExecutionPolicy`, which is the first template parameter. Additionally,
269
- each parallel algorithm overload has an additional function parameter of
270
- type `ExecutionPolicy&&`, which is the first function parameter.
271
 
272
  [*Note 1*: Not all algorithms have parallel algorithm
273
  overloads. — *end note*]
274
 
275
- Unless otherwise specified, the semantics of `ExecutionPolicy` algorithm
276
- overloads are identical to their overloads without.
 
277
 
278
- Unless otherwise specified, the complexity requirements of
279
- `ExecutionPolicy` algorithm overloads are relaxed from the complexity
280
- requirements of the overloads without as follows: when the guarantee
281
- says “at most *expr*” or “exactly *expr*” and does not specify the
282
- number of assignments or swaps, and *expr* is not already expressed with
283
- 𝑂() notation, the complexity of the algorithm shall be
284
  𝑂(\placeholder{expr}).
285
 
286
- Parallel algorithms shall not participate in overload resolution unless
287
- `is_execution_policy_v<remove_cvref_t<ExecutionPolicy>>` is `true`.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
288
 
 
5
  Subclause [[algorithms.parallel]] describes components that C++ programs
6
  may use to perform operations on containers and other sequences in
7
  parallel.
8
 
9
  A *parallel algorithm* is a function template listed in this document
10
+ with a template parameter named `ExecutionPolicy` or constrained by the
11
+ following exposition-only concept:
12
 
13
+ ``` cpp
14
+ template<class Ep>
15
+ concept execution-policy = is_execution_policy_v<remove_cvref_t<Ep>>; // exposition only
16
+ ```
17
+
18
+ Such a template parameter is termed an
19
+ *execution policy template parameter*.
20
+
21
+ A parallel algorithm accesses objects indirectly accessible via its
22
  arguments by invoking the following functions:
23
 
24
+ - All operations of the categories of the iterators, sentinels, or
25
+ `mdspan` types that the algorithm is instantiated with.
26
  - Operations on those sequence elements that are required by its
27
  specification.
28
+ - User-provided invocable objects to be applied during the execution of
29
  the algorithm, if required by the specification.
30
+ - Operations on those invocable objects required by the specification.
31
  \[*Note 1*: See  [[algorithms.requirements]]. — *end note*]
32
 
33
  These functions are herein called *element access functions*.
34
 
35
  [*Example 1*:
 
46
  — *end example*]
47
 
48
  A standard library function is *vectorization-unsafe* if it is specified
49
  to synchronize with another function invocation, or another function
50
  invocation is specified to synchronize with it, and if it is not a
51
+ memory allocation or deallocation function or lock-free atomic
52
+ modify-write operation [[atomics.order]].
53
 
54
  [*Note 2*: Implementations must ensure that internal synchronization
55
  inside standard library functions does not prevent forward progress when
56
  those functions are executed by threads of execution with weakly
57
  parallel forward progress guarantees. — *end note*]
 
77
 
78
  — *end example*]
79
 
80
  ### Requirements on user-provided function objects <a id="algorithms.parallel.user">[[algorithms.parallel.user]]</a>
81
 
82
+ Unless otherwise specified, invocable objects passed into parallel
83
+ algorithms as objects of a type denoted by a template parameter named
84
+ `Predicate`, `BinaryPredicate`, `Compare`, `UnaryOperation`,
85
+ `BinaryOperation`, `BinaryOperation1`, `BinaryOperation2`,
86
+ `BinaryDivideOp`, or constrained by a concept that subsumes
87
+ `regular_invocable` and the operators used by the analogous overloads to
88
  these parallel algorithms that are formed by an invocation with the
89
  specified default predicate or operation (where applicable) shall not
90
  directly or indirectly modify objects via their arguments, nor shall
91
  they rely on the identity of the provided objects.
92
 
93
  ### Effect of execution policies on algorithm execution <a id="algorithms.parallel.exec">[[algorithms.parallel.exec]]</a>
94
 
95
+ An execution policy template parameter describes the manner in which the
96
+ execution of a parallel algorithm may be parallelized and the manner in
97
+ which it applies the element access functions.
 
98
 
99
  If an object is modified by an element access function, the algorithm
100
  will perform no other unsynchronized accesses to that object. The
101
  modifying element access functions are those which are specified as
102
  modifying the object.
 
268
  resources are required for parallelization and none are available, the
269
  algorithm throws a `bad_alloc` exception.
270
 
271
  During the execution of a parallel algorithm, if the invocation of an
272
  element access function exits via an uncaught exception, the behavior is
273
+ determined by the execution policy.
274
 
275
+ ### Parallel algorithm overloads <a id="algorithms.parallel.overloads">[[algorithms.parallel.overloads]]</a>
276
 
277
  Parallel algorithms are algorithm overloads. Each parallel algorithm
278
+ overload has an additional function parameter P of type `T&&` as the
279
+ first function parameter, where `T` is the execution policy template
280
+ parameter.
 
281
 
282
  [*Note 1*: Not all algorithms have parallel algorithm
283
  overloads. — *end note*]
284
 
285
+ Unless otherwise specified, the semantics of calling a parallel
286
+ algorithm overload are identical to calling the corresponding algorithm
287
+ overload without the parameter P, using all but the first argument.
288
 
289
+ Unless otherwise specified, the complexity requirements of a parallel
290
+ algorithm overload are relaxed from the complexity requirements of the
291
+ corresponding overload without the parameter P as follows: when the
292
+ guarantee says “at most *expr*” or “exactly *expr*” and does not specify
293
+ the number of assignments or swaps, and *expr* is not already expressed
294
+ with 𝑂() notation, the complexity of the algorithm shall be
295
  𝑂(\placeholder{expr}).
296
 
297
+ A parallel algorithm with a template parameter named `ExecutionPolicy`
298
+ shall not participate in overload resolution unless that template
299
+ parameter satisfies `execution-policy`.
300
+
301
+ ### Execution policies <a id="execpol">[[execpol]]</a>
302
+
303
+ #### General <a id="execpol.general">[[execpol.general]]</a>
304
+
305
+ Subclause  [[execpol]] describes classes that are *execution policy*
306
+ types. An object of an execution policy type indicates the kinds of
307
+ parallelism allowed in the execution of an algorithm and expresses the
308
+ consequent requirements on the element access functions. Execution
309
+ policy types are declared in header `<execution>`.
310
+
311
+ [*Example 1*:
312
+
313
+ ``` cpp
314
+ using namespace std;
315
+ vector<int> v = ...;
316
+
317
+ // standard sequential sort
318
+ sort(v.begin(), v.end());
319
+
320
+ // explicitly sequential sort
321
+ sort(execution::seq, v.begin(), v.end());
322
+
323
+ // permitting parallel execution
324
+ sort(execution::par, v.begin(), v.end());
325
+
326
+ // permitting vectorization as well
327
+ sort(execution::par_unseq, v.begin(), v.end());
328
+ ```
329
+
330
+ — *end example*]
331
+
332
+ [*Note 1*: Implementations can provide additional execution policies to
333
+ those described in this document as extensions to address parallel
334
+ architectures that require idiosyncratic parameters for efficient
335
+ execution. — *end note*]
336
+
337
+ #### Execution policy type trait <a id="execpol.type">[[execpol.type]]</a>
338
+
339
+ ``` cpp
340
+ template<class T> struct is_execution_policy;
341
+ ```
342
+
343
+ `is_execution_policy` can be used to detect execution policies for the
344
+ purpose of excluding function signatures from otherwise ambiguous
345
+ overload resolution participation.
346
+
347
+ `is_execution_policy<T>` is a *Cpp17UnaryTypeTrait* with a base
348
+ characteristic of `true_type` if `T` is the type of a standard or
349
+ *implementation-defined* execution policy, otherwise `false_type`.
350
+
351
+ [*Note 1*: This provision reserves the privilege of creating
352
+ non-standard execution policies to the library
353
+ implementation. — *end note*]
354
+
355
+ The behavior of a program that adds specializations for
356
+ `is_execution_policy` is undefined.
357
+
358
+ #### Sequenced execution policy <a id="execpol.seq">[[execpol.seq]]</a>
359
+
360
+ ``` cpp
361
+ class execution::sequenced_policy { unspecified };
362
+ ```
363
+
364
+ The class `execution::sequenced_policy` is an execution policy type used
365
+ as a unique type to disambiguate parallel algorithm overloading and
366
+ require that a parallel algorithm’s execution may not be parallelized.
367
+
368
+ During the execution of a parallel algorithm with the
369
+ `execution::sequenced_policy` policy, if the invocation of an element
370
+ access function exits via an exception, `terminate` is
371
+ invoked [[except.terminate]].
372
+
373
+ #### Parallel execution policy <a id="execpol.par">[[execpol.par]]</a>
374
+
375
+ ``` cpp
376
+ class execution::parallel_policy { unspecified };
377
+ ```
378
+
379
+ The class `execution::parallel_policy` is an execution policy type used
380
+ as a unique type to disambiguate parallel algorithm overloading and
381
+ indicate that a parallel algorithm’s execution may be parallelized.
382
+
383
+ During the execution of a parallel algorithm with the
384
+ `execution::parallel_policy` policy, if the invocation of an element
385
+ access function exits via an exception, `terminate` is
386
+ invoked [[except.terminate]].
387
+
388
+ #### Parallel and unsequenced execution policy <a id="execpol.parunseq">[[execpol.parunseq]]</a>
389
+
390
+ ``` cpp
391
+ class execution::parallel_unsequenced_policy { unspecified };
392
+ ```
393
+
394
+ The class `execution::parallel_unsequenced_policy` is an execution
395
+ policy type used as a unique type to disambiguate parallel algorithm
396
+ overloading and indicate that a parallel algorithm’s execution may be
397
+ parallelized and vectorized.
398
+
399
+ During the execution of a parallel algorithm with the
400
+ `execution::parallel_unsequenced_policy` policy, if the invocation of an
401
+ element access function exits via an exception, `terminate` is
402
+ invoked [[except.terminate]].
403
+
404
+ #### Unsequenced execution policy <a id="execpol.unseq">[[execpol.unseq]]</a>
405
+
406
+ ``` cpp
407
+ class execution::unsequenced_policy { unspecified };
408
+ ```
409
+
410
+ The class `unsequenced_policy` is an execution policy type used as a
411
+ unique type to disambiguate parallel algorithm overloading and indicate
412
+ that a parallel algorithm’s execution may be vectorized, e.g., executed
413
+ on a single thread using instructions that operate on multiple data
414
+ items.
415
+
416
+ During the execution of a parallel algorithm with the
417
+ `execution::unsequenced_policy` policy, if the invocation of an element
418
+ access function exits via an exception, `terminate` is
419
+ invoked [[except.terminate]].
420
+
421
+ #### Execution policy objects <a id="execpol.objects">[[execpol.objects]]</a>
422
+
423
+ ``` cpp
424
+ inline constexpr execution::sequenced_policy execution::seq{ unspecified };
425
+ inline constexpr execution::parallel_policy execution::par{ unspecified };
426
+ inline constexpr execution::parallel_unsequenced_policy execution::par_unseq{ unspecified };
427
+ inline constexpr execution::unsequenced_policy execution::unseq{ unspecified };
428
+ ```
429
+
430
+ The header `<execution>` declares global objects associated with each
431
+ type of execution policy.
432