- tmp/tmpdb8kobds/{from.md → to.md} +179 -175
tmp/tmpdb8kobds/{from.md → to.md}
RENAMED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
-
## Multi-threaded executions and data races <a id="intro.multithread">[[intro.multithread]]</a>
|
| 2 |
|
| 3 |
A *thread of execution* (also known as a *thread*) is a single flow of
|
| 4 |
control within a program, including the initial invocation of a specific
|
| 5 |
top-level function, and recursively including every function invocation
|
| 6 |
subsequently executed by the thread.
|
|
@@ -8,15 +8,15 @@ subsequently executed by the thread.
|
|
| 8 |
[*Note 1*: When one thread creates another, the initial call to the
|
| 9 |
top-level function of the new thread is executed by the new thread, not
|
| 10 |
by the creating thread. — *end note*]
|
| 11 |
|
| 12 |
Every thread in a program can potentially access every object and
|
| 13 |
-
function in a program.[^
|
| 14 |
can have more than one thread running concurrently. The execution of
|
| 15 |
-
each thread proceeds as defined by the remainder of this
|
| 16 |
-
|
| 17 |
-
|
| 18 |
|
| 19 |
[*Note 2*: Usually the execution can be viewed as an interleaving of
|
| 20 |
all its threads. However, some kinds of atomic operations, for example,
|
| 21 |
allow executions inconsistent with a simple interleaving, as described
|
| 22 |
below. — *end note*]
|
|
@@ -26,71 +26,67 @@ whether a program can have more than one thread of execution.
|
|
| 26 |
|
| 27 |
For a signal handler that is not executed as a result of a call to the
|
| 28 |
`std::raise` function, it is unspecified which thread of execution
|
| 29 |
contains the signal handler invocation.
|
| 30 |
|
| 31 |
-
### Data races <a id="intro.races">[[intro.races]]</a>
|
| 32 |
|
| 33 |
-
The value of an object visible to a thread
|
| 34 |
-
the initial value of the object, a value assigned to the object by
|
| 35 |
-
|
| 36 |
-
|
| 37 |
|
| 38 |
[*Note 1*: In some cases, there may instead be undefined behavior. Much
|
| 39 |
-
of this
|
| 40 |
-
with explicit and detailed visibility constraints. However,
|
| 41 |
-
implicitly supports a simpler view for more restricted
|
| 42 |
programs. — *end note*]
|
| 43 |
|
| 44 |
Two expression evaluations *conflict* if one of them modifies a memory
|
| 45 |
-
location
|
| 46 |
-
|
| 47 |
|
| 48 |
-
The library defines a number of atomic operations
|
| 49 |
-
|
| 50 |
-
|
| 51 |
-
|
| 52 |
-
|
| 53 |
-
|
| 54 |
-
|
| 55 |
-
|
| 56 |
-
fence,
|
| 57 |
-
|
| 58 |
-
|
| 59 |
-
which have special characteristics.
|
| 60 |
|
| 61 |
[*Note 2*: For example, a call that acquires a mutex will perform an
|
| 62 |
acquire operation on the locations comprising the mutex.
|
| 63 |
Correspondingly, a call that releases the same mutex will perform a
|
| 64 |
release operation on those same locations. Informally, performing a
|
| 65 |
-
release operation on
|
| 66 |
locations to become visible to other threads that later perform a
|
| 67 |
-
consume or an acquire operation on
|
| 68 |
not synchronization operations even though, like synchronization
|
| 69 |
operations, they cannot contribute to data races. — *end note*]
|
| 70 |
|
| 71 |
-
All modifications to a particular atomic object
|
| 72 |
-
particular total order, called the *modification order* of
|
| 73 |
|
| 74 |
[*Note 3*: There is a separate order for each atomic object. There is
|
| 75 |
no requirement that these can be combined into a single total order for
|
| 76 |
all objects. In general this will be impossible since different threads
|
| 77 |
may observe modifications to different objects in inconsistent
|
| 78 |
orders. — *end note*]
|
| 79 |
|
| 80 |
-
A *release sequence* headed by a release operation
|
| 81 |
-
|
| 82 |
-
modification order of
|
| 83 |
-
subsequent operation
|
| 84 |
-
|
| 85 |
-
- is performed by the same thread that performed `A`, or
|
| 86 |
-
- is an atomic read-modify-write operation.
|
| 87 |
|
| 88 |
Certain library calls *synchronize with* other library calls performed
|
| 89 |
by another thread. For example, an atomic store-release synchronizes
|
| 90 |
-
with a load-acquire that takes its value from the store
|
| 91 |
-
[[atomics.order]]
|
| 92 |
|
| 93 |
[*Note 4*: Except in the specified cases, reading a later value does
|
| 94 |
not necessarily ensure visibility as described below. Such a requirement
|
| 95 |
would sometimes interfere with efficient implementation. — *end note*]
|
| 96 |
|
|
@@ -98,55 +94,52 @@ would sometimes interfere with efficient implementation. — *end note*]
|
|
| 98 |
when one reads the value written by another. For atomic objects, the
|
| 99 |
definition is clear. All operations on a given mutex occur in a single
|
| 100 |
total order. Each mutex acquisition “reads the value written” by the
|
| 101 |
last mutex release. — *end note*]
|
| 102 |
|
| 103 |
-
An evaluation
|
| 104 |
|
| 105 |
-
- the value of
|
| 106 |
-
-
|
| 107 |
-
|
| 108 |
-
-
|
| 109 |
-
[[expr.log.and]]) or logical
|
| 110 |
-
|
| 111 |
-
-
|
| 112 |
operator, or
|
| 113 |
-
-
|
| 114 |
-
[[expr.comma]]
|
| 115 |
|
| 116 |
or
|
| 117 |
-
-
|
| 118 |
-
|
| 119 |
-
- for some evaluation
|
| 120 |
-
|
| 121 |
|
| 122 |
[*Note 6*: “Carries a dependency to” is a subset of “is sequenced
|
| 123 |
before”, and is similarly strictly intra-thread. — *end note*]
|
| 124 |
|
| 125 |
-
An evaluation
|
| 126 |
|
| 127 |
-
-
|
| 128 |
-
|
| 129 |
-
|
| 130 |
-
|
| 131 |
-
|
| 132 |
-
carries a dependency to *B*.
|
| 133 |
|
| 134 |
[*Note 7*: The relation “is dependency-ordered before” is analogous to
|
| 135 |
“synchronizes with”, but uses release/consume in place of
|
| 136 |
release/acquire. — *end note*]
|
| 137 |
|
| 138 |
-
An evaluation
|
| 139 |
|
| 140 |
-
-
|
| 141 |
-
-
|
| 142 |
-
- for some evaluation
|
| 143 |
-
-
|
| 144 |
-
-
|
| 145 |
-
|
| 146 |
-
- *A* inter-thread happens before *X* and *X* inter-thread happens
|
| 147 |
-
before *B*.
|
| 148 |
|
| 149 |
[*Note 8*: The “inter-thread happens before” relation describes
|
| 150 |
arbitrary concatenations of “sequenced before”, “synchronizes with” and
|
| 151 |
“dependency-ordered before” relationships, with two exceptions. The
|
| 152 |
first exception is that a concatenation is not permitted to end with
|
|
@@ -161,101 +154,111 @@ exception is that a concatenation is not permitted to consist entirely
|
|
| 161 |
of “sequenced before”. The reasons for this limitation are (1) to permit
|
| 162 |
“inter-thread happens before” to be transitively closed and (2) the
|
| 163 |
“happens before” relation, defined below, provides for relationships
|
| 164 |
consisting entirely of “sequenced before”. — *end note*]
|
| 165 |
|
| 166 |
-
An evaluation
|
| 167 |
-
*
|
| 168 |
|
| 169 |
-
-
|
| 170 |
-
-
|
| 171 |
|
| 172 |
The implementation shall ensure that no program execution demonstrates a
|
| 173 |
cycle in the “happens before” relation.
|
| 174 |
|
| 175 |
[*Note 9*: This cycle would otherwise be possible only through the use
|
| 176 |
of consume operations. — *end note*]
|
| 177 |
|
| 178 |
-
An evaluation
|
| 179 |
|
| 180 |
-
-
|
| 181 |
-
-
|
| 182 |
-
-
|
| 183 |
|
| 184 |
[*Note 10*: In the absence of consume operations, the happens before
|
| 185 |
-
and
|
| 186 |
-
before essentially excludes consume operations. — *end note*]
|
| 187 |
|
| 188 |
-
A *
|
| 189 |
-
respect to a value computation *B* of *M* satisfies the conditions:
|
| 190 |
|
| 191 |
-
-
|
| 192 |
-
-
|
| 193 |
-
|
| 194 |
|
| 195 |
-
|
| 196 |
-
|
| 197 |
-
*
|
| 198 |
|
| 199 |
-
|
| 200 |
non-atomic object or bit-field is visible, then the behavior is either
|
| 201 |
unspecified or undefined. — *end note*]
|
| 202 |
|
| 203 |
-
[*Note
|
| 204 |
visibly reordered. This is not actually detectable without data races,
|
| 205 |
but it is necessary to ensure that data races, as defined below, and
|
| 206 |
with suitable restrictions on the use of atomics, correspond to data
|
| 207 |
races in a simple interleaved (sequentially consistent)
|
| 208 |
execution. — *end note*]
|
| 209 |
|
| 210 |
-
The value of an atomic object
|
| 211 |
-
|
| 212 |
-
|
| 213 |
|
| 214 |
-
[*Note
|
| 215 |
of the rules described here, and in particular, by the coherence
|
| 216 |
requirements below. — *end note*]
|
| 217 |
|
| 218 |
-
If an operation
|
| 219 |
-
operation
|
| 220 |
-
|
| 221 |
|
| 222 |
-
[*Note
|
| 223 |
coherence. — *end note*]
|
| 224 |
|
| 225 |
-
If a value computation
|
| 226 |
-
|
| 227 |
-
|
| 228 |
-
|
| 229 |
-
|
| 230 |
|
| 231 |
-
[*Note
|
| 232 |
coherence. — *end note*]
|
| 233 |
|
| 234 |
-
If a value computation
|
| 235 |
-
operation
|
| 236 |
-
|
| 237 |
-
of *M*.
|
| 238 |
|
| 239 |
-
[*Note
|
| 240 |
coherence. — *end note*]
|
| 241 |
|
| 242 |
-
If a side effect
|
| 243 |
-
computation
|
| 244 |
-
from
|
| 245 |
-
order of *M*.
|
| 246 |
|
| 247 |
-
[*Note
|
| 248 |
coherence. — *end note*]
|
| 249 |
|
| 250 |
-
[*Note
|
| 251 |
disallow compiler reordering of atomic operations to a single object,
|
| 252 |
even if both operations are relaxed loads. This effectively makes the
|
| 253 |
-
cache coherence guarantee provided by most hardware available to
|
| 254 |
-
|
| 255 |
|
| 256 |
-
[*Note
|
| 257 |
“happens before” relation, which depends on the values observed by loads
|
| 258 |
of atomics. The intended reading is that there must exist an association
|
| 259 |
of atomic loads with modifications they observe that, together with
|
| 260 |
suitably chosen modification orders and the “happens before” relation
|
| 261 |
derived as described above, satisfy the resulting constraints as imposed
|
|
@@ -271,12 +274,12 @@ The execution of a program contains a *data race* if it contains two
|
|
| 271 |
potentially concurrent conflicting actions, at least one of which is not
|
| 272 |
atomic, and neither happens before the other, except for the special
|
| 273 |
case for signal handlers described below. Any such data race results in
|
| 274 |
undefined behavior.
|
| 275 |
|
| 276 |
-
[*Note
|
| 277 |
-
and `
|
| 278 |
no other synchronization operations behave as if the operations executed
|
| 279 |
by their constituent threads were simply interleaved, with each value
|
| 280 |
computation of an object being taken from the last side effect on that
|
| 281 |
object in that interleaving. This is normally referred to as “sequential
|
| 282 |
consistency”. However, this applies only to data-race-free programs, and
|
|
@@ -288,38 +291,38 @@ undefined operation. — *end note*]
|
|
| 288 |
|
| 289 |
Two accesses to the same object of type `volatile std::sig_atomic_t` do
|
| 290 |
not result in a data race if both occur in the same thread, even if one
|
| 291 |
or more occurs in a signal handler. For each signal handler invocation,
|
| 292 |
evaluations performed by the thread invoking a signal handler can be
|
| 293 |
-
divided into two groups
|
| 294 |
-
|
| 295 |
`volatile std::sig_atomic_t` objects take values as though all
|
| 296 |
-
evaluations in
|
| 297 |
-
|
| 298 |
-
|
| 299 |
|
| 300 |
-
[*Note
|
| 301 |
potentially shared memory location that would not be modified by the
|
| 302 |
-
abstract machine are generally precluded by this
|
| 303 |
-
|
| 304 |
-
|
| 305 |
-
|
| 306 |
-
|
| 307 |
-
|
| 308 |
-
|
| 309 |
-
|
| 310 |
|
| 311 |
-
[*Note
|
| 312 |
potentially shared memory location may not preserve the semantics of the
|
| 313 |
-
C++program as defined in this
|
| 314 |
-
|
| 315 |
-
|
| 316 |
-
|
| 317 |
hypothetical machine that is not tolerant of races or provides hardware
|
| 318 |
race detection. — *end note*]
|
| 319 |
|
| 320 |
-
### Forward progress <a id="intro.progress">[[intro.progress]]</a>
|
| 321 |
|
| 322 |
The implementation may assume that any thread will eventually do one of
|
| 323 |
the following:
|
| 324 |
|
| 325 |
- terminate,
|
|
@@ -329,17 +332,17 @@ the following:
|
|
| 329 |
|
| 330 |
[*Note 1*: This is intended to allow compiler transformations such as
|
| 331 |
removal of empty loops, even when termination cannot be
|
| 332 |
proven. — *end note*]
|
| 333 |
|
| 334 |
-
Executions of atomic functions that are either defined to be lock-free
|
| 335 |
-
[[atomics.flag]]
|
| 336 |
*lock-free executions*.
|
| 337 |
|
| 338 |
-
- If there is only one thread that is not blocked
|
| 339 |
-
|
| 340 |
-
|
| 341 |
progress of a lock-free execution. For example, this situation can
|
| 342 |
occur with load-locked store-conditional implementations. This
|
| 343 |
property is sometimes termed obstruction-free. — *end note*]
|
| 344 |
- When one or more lock-free executions run concurrently, at least one
|
| 345 |
should complete. \[*Note 3*: It is difficult for some implementations
|
|
@@ -359,13 +362,13 @@ termed an *execution step*:
|
|
| 359 |
- termination of the thread of execution,
|
| 360 |
- performing an access through a volatile glvalue, or
|
| 361 |
- completion of a call to a library I/O function, a synchronization
|
| 362 |
operation, or an atomic operation.
|
| 363 |
|
| 364 |
-
An invocation of a standard library function that blocks
|
| 365 |
-
|
| 366 |
-
|
| 367 |
|
| 368 |
[*Example 1*: A library I/O function that blocks until the I/O
|
| 369 |
operation is complete can be considered to continuously check whether
|
| 370 |
the operation is complete. Each such check might consist of one or more
|
| 371 |
execution steps, for example using observable behavior of the abstract
|
|
@@ -389,16 +392,17 @@ make progress for as long as it has not terminated.
|
|
| 389 |
of executions (if any) have been or are making progress. To eventually
|
| 390 |
fulfill this requirement means that this will happen in an unspecified
|
| 391 |
but finite amount of time. — *end note*]
|
| 392 |
|
| 393 |
It is *implementation-defined* whether the implementation-created thread
|
| 394 |
-
of execution that executes `main`
|
| 395 |
-
|
| 396 |
-
provide concurrent forward
|
|
|
|
| 397 |
|
| 398 |
-
[*Note 6*: General-purpose implementations
|
| 399 |
-
|
| 400 |
|
| 401 |
For a thread of execution providing *parallel forward progress
|
| 402 |
guarantees*, the implementation is not required to ensure that the
|
| 403 |
thread will eventually make progress if it has not yet executed any
|
| 404 |
execution step; once this thread has executed a step, it provides
|
|
@@ -430,38 +434,38 @@ parallel forward progress guarantees.
|
|
| 430 |
[*Note 9*: For example, some kinds of synchronization between threads
|
| 431 |
of execution may only make progress if the respective threads of
|
| 432 |
execution provide parallel forward progress guarantees, but will fail to
|
| 433 |
make progress under weakly parallel guarantees. — *end note*]
|
| 434 |
|
| 435 |
-
When a thread of execution
|
| 436 |
-
progress guarantee delegation* on the completion of a set
|
| 437 |
-
of execution, then throughout the whole time of
|
| 438 |
-
|
| 439 |
-
|
| 440 |
-
|
| 441 |
|
| 442 |
-
[*Note 10*: It is unspecified which thread or threads of execution in
|
| 443 |
-
|
| 444 |
-
|
| 445 |
-
of the
|
| 446 |
-
|
| 447 |
-
|
| 448 |
|
| 449 |
-
Once a thread of execution in
|
| 450 |
-
|
| 451 |
|
| 452 |
-
[*Note 11*: A thread of execution
|
| 453 |
effectively stronger forward progress guarantee for a certain amount of
|
| 454 |
-
time, due to a second thread of execution
|
| 455 |
-
forward progress guarantee delegation. In turn, if
|
| 456 |
-
forward progress guarantee delegation on
|
| 457 |
-
provide a stronger forward progress guarantee to
|
| 458 |
|
| 459 |
-
[*Note 12*: If all threads of execution in
|
| 460 |
they terminate and do not use blocking synchronization incorrectly),
|
| 461 |
-
then
|
| 462 |
-
guarantee delegation will not result in
|
| 463 |
effectively weakened. — *end note*]
|
| 464 |
|
| 465 |
[*Note 13*: This does not remove any constraints regarding blocking
|
| 466 |
synchronization for threads of execution providing parallel or weakly
|
| 467 |
parallel forward progress guarantees because the implementation is not
|
|
|
|
| 1 |
+
### Multi-threaded executions and data races <a id="intro.multithread">[[intro.multithread]]</a>
|
| 2 |
|
| 3 |
A *thread of execution* (also known as a *thread*) is a single flow of
|
| 4 |
control within a program, including the initial invocation of a specific
|
| 5 |
top-level function, and recursively including every function invocation
|
| 6 |
subsequently executed by the thread.
|
|
|
|
| 8 |
[*Note 1*: When one thread creates another, the initial call to the
|
| 9 |
top-level function of the new thread is executed by the new thread, not
|
| 10 |
by the creating thread. — *end note*]
|
| 11 |
|
| 12 |
Every thread in a program can potentially access every object and
|
| 13 |
+
function in a program.[^28] Under a hosted implementation, a C++ program
|
| 14 |
can have more than one thread running concurrently. The execution of
|
| 15 |
+
each thread proceeds as defined by the remainder of this document. The
|
| 16 |
+
execution of the entire program consists of an execution of all of its
|
| 17 |
+
threads.
|
| 18 |
|
| 19 |
[*Note 2*: Usually the execution can be viewed as an interleaving of
|
| 20 |
all its threads. However, some kinds of atomic operations, for example,
|
| 21 |
allow executions inconsistent with a simple interleaving, as described
|
| 22 |
below. — *end note*]
|
|
|
|
| 26 |
|
| 27 |
For a signal handler that is not executed as a result of a call to the
|
| 28 |
`std::raise` function, it is unspecified which thread of execution
|
| 29 |
contains the signal handler invocation.
|
| 30 |
|
| 31 |
+
#### Data races <a id="intro.races">[[intro.races]]</a>
|
| 32 |
|
| 33 |
+
The value of an object visible to a thread T at a particular point is
|
| 34 |
+
the initial value of the object, a value assigned to the object by T, or
|
| 35 |
+
a value assigned to the object by another thread, according to the rules
|
| 36 |
+
below.
|
| 37 |
|
| 38 |
[*Note 1*: In some cases, there may instead be undefined behavior. Much
|
| 39 |
+
of this subclause is motivated by the desire to support atomic
|
| 40 |
+
operations with explicit and detailed visibility constraints. However,
|
| 41 |
+
it also implicitly supports a simpler view for more restricted
|
| 42 |
programs. — *end note*]
|
| 43 |
|
| 44 |
Two expression evaluations *conflict* if one of them modifies a memory
|
| 45 |
+
location [[intro.memory]] and the other one reads or modifies the same
|
| 46 |
+
memory location.
|
| 47 |
|
| 48 |
+
The library defines a number of atomic operations [[atomics]] and
|
| 49 |
+
operations on mutexes [[thread]] that are specially identified as
|
| 50 |
+
synchronization operations. These operations play a special role in
|
| 51 |
+
making assignments in one thread visible to another. A synchronization
|
| 52 |
+
operation on one or more memory locations is either a consume operation,
|
| 53 |
+
an acquire operation, a release operation, or both an acquire and
|
| 54 |
+
release operation. A synchronization operation without an associated
|
| 55 |
+
memory location is a fence and can be either an acquire fence, a release
|
| 56 |
+
fence, or both an acquire and release fence. In addition, there are
|
| 57 |
+
relaxed atomic operations, which are not synchronization operations, and
|
| 58 |
+
atomic read-modify-write operations, which have special characteristics.
|
|
|
|
| 59 |
|
| 60 |
[*Note 2*: For example, a call that acquires a mutex will perform an
|
| 61 |
acquire operation on the locations comprising the mutex.
|
| 62 |
Correspondingly, a call that releases the same mutex will perform a
|
| 63 |
release operation on those same locations. Informally, performing a
|
| 64 |
+
release operation on A forces prior side effects on other memory
|
| 65 |
locations to become visible to other threads that later perform a
|
| 66 |
+
consume or an acquire operation on A. “Relaxed” atomic operations are
|
| 67 |
not synchronization operations even though, like synchronization
|
| 68 |
operations, they cannot contribute to data races. — *end note*]
|
| 69 |
|
| 70 |
+
All modifications to a particular atomic object M occur in some
|
| 71 |
+
particular total order, called the *modification order* of M.
|
| 72 |
|
| 73 |
[*Note 3*: There is a separate order for each atomic object. There is
|
| 74 |
no requirement that these can be combined into a single total order for
|
| 75 |
all objects. In general this will be impossible since different threads
|
| 76 |
may observe modifications to different objects in inconsistent
|
| 77 |
orders. — *end note*]
|
| 78 |
|
| 79 |
+
A *release sequence* headed by a release operation A on an atomic object
|
| 80 |
+
M is a maximal contiguous sub-sequence of side effects in the
|
| 81 |
+
modification order of M, where the first operation is A, and every
|
| 82 |
+
subsequent operation is an atomic read-modify-write operation.
|
|
|
|
|
|
|
|
|
|
| 83 |
|
| 84 |
Certain library calls *synchronize with* other library calls performed
|
| 85 |
by another thread. For example, an atomic store-release synchronizes
|
| 86 |
+
with a load-acquire that takes its value from the store
|
| 87 |
+
[[atomics.order]].
|
| 88 |
|
| 89 |
[*Note 4*: Except in the specified cases, reading a later value does
|
| 90 |
not necessarily ensure visibility as described below. Such a requirement
|
| 91 |
would sometimes interfere with efficient implementation. — *end note*]
|
| 92 |
|
|
|
|
| 94 |
when one reads the value written by another. For atomic objects, the
|
| 95 |
definition is clear. All operations on a given mutex occur in a single
|
| 96 |
total order. Each mutex acquisition “reads the value written” by the
|
| 97 |
last mutex release. — *end note*]
|
| 98 |
|
| 99 |
+
An evaluation A *carries a dependency* to an evaluation B if
|
| 100 |
|
| 101 |
+
- the value of A is used as an operand of B, unless:
|
| 102 |
+
- B is an invocation of any specialization of `std::kill_dependency`
|
| 103 |
+
[[atomics.order]], or
|
| 104 |
+
- A is the left operand of a built-in logical AND (`&&`, see
|
| 105 |
+
[[expr.log.and]]) or logical OR (`||`, see [[expr.log.or]]) operator,
|
| 106 |
+
or
|
| 107 |
+
- A is the left operand of a conditional (`?:`, see [[expr.cond]])
|
| 108 |
operator, or
|
| 109 |
+
- A is the left operand of the built-in comma (`,`) operator
|
| 110 |
+
[[expr.comma]];
|
| 111 |
|
| 112 |
or
|
| 113 |
+
- A writes a scalar object or bit-field M, B reads the value written by
|
| 114 |
+
A from M, and A is sequenced before B, or
|
| 115 |
+
- for some evaluation X, A carries a dependency to X, and X carries a
|
| 116 |
+
dependency to B.
|
| 117 |
|
| 118 |
[*Note 6*: “Carries a dependency to” is a subset of “is sequenced
|
| 119 |
before”, and is similarly strictly intra-thread. — *end note*]
|
| 120 |
|
| 121 |
+
An evaluation A is *dependency-ordered before* an evaluation B if
|
| 122 |
|
| 123 |
+
- A performs a release operation on an atomic object M, and, in another
|
| 124 |
+
thread, B performs a consume operation on M and reads the value
|
| 125 |
+
written by A, or
|
| 126 |
+
- for some evaluation X, A is dependency-ordered before X and X carries
|
| 127 |
+
a dependency to B.
|
|
|
|
| 128 |
|
| 129 |
[*Note 7*: The relation “is dependency-ordered before” is analogous to
|
| 130 |
“synchronizes with”, but uses release/consume in place of
|
| 131 |
release/acquire. — *end note*]
|
| 132 |
|
| 133 |
+
An evaluation A *inter-thread happens before* an evaluation B if
|
| 134 |
|
| 135 |
+
- A synchronizes with B, or
|
| 136 |
+
- A is dependency-ordered before B, or
|
| 137 |
+
- for some evaluation X
|
| 138 |
+
- A synchronizes with X and X is sequenced before B, or
|
| 139 |
+
- A is sequenced before X and X inter-thread happens before B, or
|
| 140 |
+
- A inter-thread happens before X and X inter-thread happens before B.
|
|
|
|
|
|
|
| 141 |
|
| 142 |
[*Note 8*: The “inter-thread happens before” relation describes
|
| 143 |
arbitrary concatenations of “sequenced before”, “synchronizes with” and
|
| 144 |
“dependency-ordered before” relationships, with two exceptions. The
|
| 145 |
first exception is that a concatenation is not permitted to end with
|
|
|
|
| 154 |
of “sequenced before”. The reasons for this limitation are (1) to permit
|
| 155 |
“inter-thread happens before” to be transitively closed and (2) the
|
| 156 |
“happens before” relation, defined below, provides for relationships
|
| 157 |
consisting entirely of “sequenced before”. — *end note*]
|
| 158 |
|
| 159 |
+
An evaluation A *happens before* an evaluation B (or, equivalently, B
|
| 160 |
+
*happens after* A) if:
|
| 161 |
|
| 162 |
+
- A is sequenced before B, or
|
| 163 |
+
- A inter-thread happens before B.
|
| 164 |
|
| 165 |
The implementation shall ensure that no program execution demonstrates a
|
| 166 |
cycle in the “happens before” relation.
|
| 167 |
|
| 168 |
[*Note 9*: This cycle would otherwise be possible only through the use
|
| 169 |
of consume operations. — *end note*]
|
| 170 |
|
| 171 |
+
An evaluation A *simply happens before* an evaluation B if either
|
| 172 |
|
| 173 |
+
- A is sequenced before B, or
|
| 174 |
+
- A synchronizes with B, or
|
| 175 |
+
- A simply happens before X and X simply happens before B.
|
| 176 |
|
| 177 |
[*Note 10*: In the absence of consume operations, the happens before
|
| 178 |
+
and simply happens before relations are identical. — *end note*]
|
|
|
|
| 179 |
|
| 180 |
+
An evaluation A *strongly happens before* an evaluation D if, either
|
|
|
|
| 181 |
|
| 182 |
+
- A is sequenced before D, or
|
| 183 |
+
- A synchronizes with D, and both A and D are sequentially consistent
|
| 184 |
+
atomic operations [[atomics.order]], or
|
| 185 |
+
- there are evaluations B and C such that A is sequenced before B, B
|
| 186 |
+
simply happens before C, and C is sequenced before D, or
|
| 187 |
+
- there is an evaluation B such that A strongly happens before B, and B
|
| 188 |
+
strongly happens before D.
|
| 189 |
|
| 190 |
+
[*Note 11*: Informally, if A strongly happens before B, then A appears
|
| 191 |
+
to be evaluated before B in all contexts. Strongly happens before
|
| 192 |
+
excludes consume operations. — *end note*]
|
| 193 |
|
| 194 |
+
A *visible side effect* A on a scalar object or bit-field M with respect
|
| 195 |
+
to a value computation B of M satisfies the conditions:
|
| 196 |
+
|
| 197 |
+
- A happens before B and
|
| 198 |
+
- there is no other side effect X to M such that A happens before X and
|
| 199 |
+
X happens before B.
|
| 200 |
+
|
| 201 |
+
The value of a non-atomic scalar object or bit-field M, as determined by
|
| 202 |
+
evaluation B, shall be the value stored by the visible side effect A.
|
| 203 |
+
|
| 204 |
+
[*Note 12*: If there is ambiguity about which side effect to a
|
| 205 |
non-atomic object or bit-field is visible, then the behavior is either
|
| 206 |
unspecified or undefined. — *end note*]
|
| 207 |
|
| 208 |
+
[*Note 13*: This states that operations on ordinary objects are not
|
| 209 |
visibly reordered. This is not actually detectable without data races,
|
| 210 |
but it is necessary to ensure that data races, as defined below, and
|
| 211 |
with suitable restrictions on the use of atomics, correspond to data
|
| 212 |
races in a simple interleaved (sequentially consistent)
|
| 213 |
execution. — *end note*]
|
| 214 |
|
| 215 |
+
The value of an atomic object M, as determined by evaluation B, shall be
|
| 216 |
+
the value stored by some side effect A that modifies M, where B does not
|
| 217 |
+
happen before A.
|
| 218 |
|
| 219 |
+
[*Note 14*: The set of such side effects is also restricted by the rest
|
| 220 |
of the rules described here, and in particular, by the coherence
|
| 221 |
requirements below. — *end note*]
|
| 222 |
|
| 223 |
+
If an operation A that modifies an atomic object M happens before an
|
| 224 |
+
operation B that modifies M, then A shall be earlier than B in the
|
| 225 |
+
modification order of M.
|
| 226 |
|
| 227 |
+
[*Note 15*: This requirement is known as write-write
|
| 228 |
coherence. — *end note*]
|
| 229 |
|
| 230 |
+
If a value computation A of an atomic object M happens before a value
|
| 231 |
+
computation B of M, and A takes its value from a side effect X on M,
|
| 232 |
+
then the value computed by B shall either be the value stored by X or
|
| 233 |
+
the value stored by a side effect Y on M, where Y follows X in the
|
| 234 |
+
modification order of M.
|
| 235 |
|
| 236 |
+
[*Note 16*: This requirement is known as read-read
|
| 237 |
coherence. — *end note*]
|
| 238 |
|
| 239 |
+
If a value computation A of an atomic object M happens before an
|
| 240 |
+
operation B that modifies M, then A shall take its value from a side
|
| 241 |
+
effect X on M, where X precedes B in the modification order of M.
|
|
|
|
| 242 |
|
| 243 |
+
[*Note 17*: This requirement is known as read-write
|
| 244 |
coherence. — *end note*]
|
| 245 |
|
| 246 |
+
If a side effect X on an atomic object M happens before a value
|
| 247 |
+
computation B of M, then the evaluation B shall take its value from X or
|
| 248 |
+
from a side effect Y that follows X in the modification order of M.
|
|
|
|
| 249 |
|
| 250 |
+
[*Note 18*: This requirement is known as write-read
|
| 251 |
coherence. — *end note*]
|
| 252 |
|
| 253 |
+
[*Note 19*: The four preceding coherence requirements effectively
|
| 254 |
disallow compiler reordering of atomic operations to a single object,
|
| 255 |
even if both operations are relaxed loads. This effectively makes the
|
| 256 |
+
cache coherence guarantee provided by most hardware available to C++
|
| 257 |
+
atomic operations. — *end note*]
|
| 258 |
|
| 259 |
+
[*Note 20*: The value observed by a load of an atomic depends on the
|
| 260 |
“happens before” relation, which depends on the values observed by loads
|
| 261 |
of atomics. The intended reading is that there must exist an association
|
| 262 |
of atomic loads with modifications they observe that, together with
|
| 263 |
suitably chosen modification orders and the “happens before” relation
|
| 264 |
derived as described above, satisfy the resulting constraints as imposed
|
|
|
|
| 274 |
potentially concurrent conflicting actions, at least one of which is not
|
| 275 |
atomic, and neither happens before the other, except for the special
|
| 276 |
case for signal handlers described below. Any such data race results in
|
| 277 |
undefined behavior.
|
| 278 |
|
| 279 |
+
[*Note 21*: It can be shown that programs that correctly use mutexes
|
| 280 |
+
and `memory_order::seq_cst` operations to prevent all data races and use
|
| 281 |
no other synchronization operations behave as if the operations executed
|
| 282 |
by their constituent threads were simply interleaved, with each value
|
| 283 |
computation of an object being taken from the last side effect on that
|
| 284 |
object in that interleaving. This is normally referred to as “sequential
|
| 285 |
consistency”. However, this applies only to data-race-free programs, and
|
|
|
|
| 291 |
|
| 292 |
Two accesses to the same object of type `volatile std::sig_atomic_t` do
|
| 293 |
not result in a data race if both occur in the same thread, even if one
|
| 294 |
or more occurs in a signal handler. For each signal handler invocation,
|
| 295 |
evaluations performed by the thread invoking a signal handler can be
|
| 296 |
+
divided into two groups A and B, such that no evaluations in B happen
|
| 297 |
+
before evaluations in A, and the evaluations of such
|
| 298 |
`volatile std::sig_atomic_t` objects take values as though all
|
| 299 |
+
evaluations in A happened before the execution of the signal handler and
|
| 300 |
+
the execution of the signal handler happened before all evaluations in
|
| 301 |
+
B.
[*Note 22*: Compiler transformations that introduce assignments to a
potentially shared memory location that would not be modified by the
abstract machine are generally precluded by this document, since such an
assignment might overwrite another assignment by a different thread in
cases in which an abstract machine execution would not have encountered
a data race. This includes implementations of data member assignment
that overwrite adjacent members in separate memory locations. Reordering
of atomic loads in cases in which the atomics in question may alias is
also generally precluded, since this may violate the coherence
rules. — *end note*]
[*Note 23*: Transformations that introduce a speculative read of a
potentially shared memory location may not preserve the semantics of the
C++ program as defined in this document, since they potentially
introduce a data race. However, they are typically valid in the context
of an optimizing compiler that targets a specific machine with
well-defined semantics for data races. They would be invalid for a
hypothetical machine that is not tolerant of races or provides hardware
race detection. — *end note*]
#### Forward progress <a id="intro.progress">[[intro.progress]]</a>

The implementation may assume that any thread will eventually do one of
the following:
- terminate,
- make a call to a library I/O function,
- perform an access through a volatile glvalue, or
- perform a synchronization operation or an atomic operation.

[*Note 1*: This is intended to allow compiler transformations such as
removal of empty loops, even when termination cannot be
proven. — *end note*]
Executions of atomic functions that are either defined to be lock-free
[[atomics.flag]] or indicated as lock-free [[atomics.lockfree]] are
*lock-free executions*.
- If there is only one thread that is not blocked [[defns.block]] in a
  standard library function, a lock-free execution in that thread shall
  complete. \[*Note 2*: Concurrently executing threads may prevent
  progress of a lock-free execution. For example, this situation can
  occur with load-locked store-conditional implementations. This
  property is sometimes termed obstruction-free. — *end note*]
- When one or more lock-free executions run concurrently, at least one
  should complete. \[*Note 3*: It is difficult for some implementations
  to provide absolute guarantees to this effect, since repeated and
  particularly inopportune interference from other threads could prevent
  forward progress, e.g., by repeatedly stealing a cache line for
  unrelated purposes between load-locked and store-conditional
  instructions. — *end note*]

During the execution of a thread of execution, each of the following is
termed an *execution step*:

- termination of the thread of execution,
- performing an access through a volatile glvalue, or
- completion of a call to a library I/O function, a synchronization
  operation, or an atomic operation.
An invocation of a standard library function that blocks [[defns.block]]
is considered to continuously execute execution steps while waiting for
the condition that it blocks on to be satisfied.
[*Example 1*: A library I/O function that blocks until the I/O
operation is complete can be considered to continuously check whether
the operation is complete. Each such check might consist of one or more
execution steps, for example using observable behavior of the abstract
machine. — *end example*]

For a thread of execution providing *concurrent forward progress
guarantees*, the implementation shall ensure that the thread will
eventually make progress for as long as it has not terminated.

[*Note 5*: This is required regardless of whether or not other threads
of execution (if any) have been or are making progress. To eventually
fulfill this requirement means that this will happen in an unspecified
but finite amount of time. — *end note*]
It is *implementation-defined* whether the implementation-created thread
of execution that executes `main` [[basic.start.main]] and the threads
of execution created by `std::thread` [[thread.thread.class]] or
`std::jthread` [[thread.jthread.class]] provide concurrent forward
progress guarantees.
[*Note 6*: General-purpose implementations should provide these
guarantees. — *end note*]
For a thread of execution providing *parallel forward progress
guarantees*, the implementation is not required to ensure that the
thread will eventually make progress if it has not yet executed any
execution step; once this thread has executed a step, it provides
concurrent forward progress guarantees.

For a thread of execution providing *weakly parallel forward progress
guarantees*, the implementation does not ensure that the thread will
eventually make progress.

[*Note 9*: For example, some kinds of synchronization between threads
of execution may only make progress if the respective threads of
execution provide parallel forward progress guarantees, but will fail to
make progress under weakly parallel guarantees. — *end note*]
When a thread of execution P is specified to *block with forward
progress guarantee delegation* on the completion of a set S of threads
of execution, then throughout the whole time of P being blocked on S,
the implementation shall ensure that the forward progress guarantees
provided by at least one thread of execution in S are at least as strong
as P’s forward progress guarantees.
[*Note 10*: It is unspecified which thread or threads of execution in S
are chosen and for which number of execution steps. The strengthening is
not permanent and not necessarily in place for the rest of the lifetime
of the affected thread of execution. As long as P is blocked, the
implementation has to eventually select and potentially strengthen a
thread of execution in S. — *end note*]
Once a thread of execution in S terminates, it is removed from S. Once S
is empty, P is unblocked.
[*Note 11*: A thread of execution B thus can temporarily provide an
effectively stronger forward progress guarantee for a certain amount of
time, due to a second thread of execution A being blocked on it with
forward progress guarantee delegation. In turn, if B then blocks with
forward progress guarantee delegation on C, this may also temporarily
provide a stronger forward progress guarantee to C. — *end note*]
[*Note 12*: If all threads of execution in S finish executing (e.g.,
they terminate and do not use blocking synchronization incorrectly),
then P’s execution of the operation that blocks with forward progress
guarantee delegation will not result in P’s progress guarantee being
effectively weakened. — *end note*]
[*Note 13*: This does not remove any constraints regarding blocking
synchronization for threads of execution providing parallel or weakly
parallel forward progress guarantees because the implementation is not