### Optional extended floating-point types <a id="basic.extended.fp">[[basic.extended.fp]]</a>

If the implementation supports an extended floating-point type
[[basic.fundamental]] whose properties are specified by the ISO/IEC/IEEE
60559 floating-point interchange format binary16, then the
*typedef-name* `std::float16_t` is defined in the header `<stdfloat>`
and names such a type, the macro `__STDCPP_FLOAT16_T__` is defined
[[cpp.predefined]], and the floating-point literal suffixes `f16` and
`F16` are supported [[lex.fcon]].
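
As an illustration of how a program might rely on this paragraph, the following minimal sketch tests `__STDCPP_FLOAT16_T__` before using `std::float16_t` and the `f16` suffix. The names `half_or_float`, `has_float16`, `one_and_a_half`, and `half_or_float_size` are hypothetical and chosen only for this example; the fallback to `float` is likewise an assumption about what a program might want, not something this subclause requires.

```cpp
#include <cstddef>
#if defined(__STDCPP_FLOAT16_T__)
#include <stdfloat>  // defines std::float16_t when the macro is defined
#endif

#if defined(__STDCPP_FLOAT16_T__)
// binary16 is supported: use the typedef-name and the f16 literal suffix.
using half_or_float = std::float16_t;  // hypothetical alias
constexpr bool has_float16 = true;
constexpr half_or_float one_and_a_half = 1.5f16;
#else
// Assumed fallback for implementations without a binary16 type.
using half_or_float = float;
constexpr bool has_float16 = false;
constexpr half_or_float one_and_a_half = 1.5f;
#endif

// Size in bytes of whichever type was selected above.
constexpr std::size_t half_or_float_size() { return sizeof(half_or_float); }
```

The same pattern applies to the other typedef-names in this subclause by substituting the corresponding macro and literal suffix.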

If the implementation supports an extended floating-point type whose
properties are specified by the ISO/IEC/IEEE 60559 floating-point
interchange format binary32, then the *typedef-name* `std::float32_t` is
defined in the header `<stdfloat>` and names such a type, the macro
`__STDCPP_FLOAT32_T__` is defined, and the floating-point literal
suffixes `f32` and `F32` are supported.

If the implementation supports an extended floating-point type whose
properties are specified by the ISO/IEC/IEEE 60559 floating-point
interchange format binary64, then the *typedef-name* `std::float64_t` is
defined in the header `<stdfloat>` and names such a type, the macro
`__STDCPP_FLOAT64_T__` is defined, and the floating-point literal
suffixes `f64` and `F64` are supported.

If the implementation supports an extended floating-point type whose
properties are specified by the ISO/IEC/IEEE 60559 floating-point
interchange format binary128, then the *typedef-name* `std::float128_t`
is defined in the header `<stdfloat>` and names such a type, the macro
`__STDCPP_FLOAT128_T__` is defined, and the floating-point literal
suffixes `f128` and `F128` are supported.

If the implementation supports an extended floating-point type with the
properties, as specified by ISO/IEC/IEEE 60559, of radix (b) of 2,
storage width in bits (k) of 16, precision in bits (p) of 8, maximum
exponent (emax) of 127, and exponent field width in bits (w) of 8, then
the *typedef-name* `std::bfloat16_t` is defined in the header
`<stdfloat>` and names such a type, the macro `__STDCPP_BFLOAT16_T__` is
defined, and the floating-point literal suffixes `bf16` and `BF16` are
supported.

[*Note 1*: A summary of the parameters for each type is given in
[[basic.extended.fp]]. The precision p includes the implicit 1 bit at
the beginning of the mantissa, so the storage used for the mantissa is
p-1 bits. ISO/IEC/IEEE 60559 does not assign a name for a type having
the parameters specified for `std::bfloat16_t`. — *end note*]
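
The arithmetic in the note can be sketched in code. The following example is an illustration only: `fp_layout` and the helper functions are hypothetical names, and the field order (sign bit, then exponent field, then stored mantissa, from most to least significant bit) is an assumption taken from the ISO/IEC/IEEE 60559 binary interchange encodings, which this subclause does not itself prescribe for `std::bfloat16_t`.

```cpp
#include <cstdint>

// Hypothetical description of a binary floating-point format.
struct fp_layout {
    int k;  // storage width in bits
    int p;  // precision in bits, including the implicit leading 1
    int w;  // exponent field width in bits
};

// Parameters given above for std::bfloat16_t.
constexpr fp_layout bfloat16_layout = {16, 8, 8};

// Bits actually stored for the mantissa: p - 1, as the note states.
constexpr int stored_mantissa_bits(fp_layout f) { return f.p - 1; }

// One sign bit, w exponent bits, and p - 1 mantissa bits fill all k bits.
constexpr bool fills_storage(fp_layout f) {
    return 1 + f.w + stored_mantissa_bits(f) == f.k;
}

// Field extraction under the assumed sign | exponent | mantissa order.
// Valid for formats with k <= 32.
constexpr std::uint32_t mantissa_field(std::uint32_t bits, fp_layout f) {
    return bits & ((std::uint32_t{1} << stored_mantissa_bits(f)) - 1);
}
constexpr std::uint32_t exponent_field(std::uint32_t bits, fp_layout f) {
    return (bits >> stored_mantissa_bits(f)) & ((std::uint32_t{1} << f.w) - 1);
}
constexpr std::uint32_t sign_field(std::uint32_t bits, fp_layout f) {
    return (bits >> (f.k - 1)) & 1u;
}
```

For example, under these assumptions the 16-bit pattern `0x3F80` splits into a sign of 0, an exponent field of 127, and a stored mantissa of 0.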

**Table: Properties of named extended floating-point types** <a id="basic.extended.fp">[basic.extended.fp]</a>

| Parameter                         | `float16_t` | `float32_t` | `float64_t` | `float128_t` | `bfloat16_t` |
| --------------------------------- | ----------- | ----------- | ----------- | ------------ | ------------ |
| ISO/IEC/IEEE 60559 name           | binary16    | binary32    | binary64    | binary128    |              |
| $k$, storage width in bits        | 16          | 32          | 64          | 128          | 16           |
| $p$, precision in bits            | 11          | 24          | 53          | 113          | 8            |
| $emax$, maximum exponent          | 15          | 127         | 1023        | 16383        | 127          |
| $w$, exponent field width in bits | 5           | 8           | 11          | 15           | 8            |
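
The table entries can be cross-checked at compile time. The sketch below is a hedged illustration, not part of the specification: `fp_params`, `rows_agree`, and `type_matches` are hypothetical names. It relies on two facts from outside this subclause: the rows of each column satisfy $k = w + p$ and $emax = 2^{w-1} - 1$, and `std::numeric_limits<T>::max_exponent` is defined to be one greater than $emax$.

```cpp
#include <climits>
#include <limits>
#if defined(__STDCPP_FLOAT32_T__) || defined(__STDCPP_FLOAT64_T__)
#include <stdfloat>
#endif

// Hypothetical record holding one column of the table.
struct fp_params { int k, p, emax, w; };

constexpr fp_params binary16_params  = {16, 11, 15, 5};
constexpr fp_params binary32_params  = {32, 24, 127, 8};
constexpr fp_params binary64_params  = {64, 53, 1023, 11};
constexpr fp_params binary128_params = {128, 113, 16383, 15};
constexpr fp_params bfloat16_params  = {16, 8, 127, 8};

// Relations among the rows: k = w + p and emax = 2^(w-1) - 1.
constexpr bool rows_agree(fp_params f) {
    return f.k == f.w + f.p && f.emax == (1 << (f.w - 1)) - 1;
}

// Compare a C++ type against one column using std::numeric_limits.
// digits corresponds to p; max_exponent corresponds to emax + 1.
template <class T>
constexpr bool type_matches(fp_params f) {
    return std::numeric_limits<T>::radix == 2
        && static_cast<int>(sizeof(T) * CHAR_BIT) == f.k
        && std::numeric_limits<T>::digits == f.p
        && std::numeric_limits<T>::max_exponent == f.emax + 1;
}

// Checked only where the implementation announces support for the type.
#if defined(__STDCPP_FLOAT32_T__)
static_assert(type_matches<std::float32_t>(binary32_params), "float32_t column");
#endif
#if defined(__STDCPP_FLOAT64_T__)
static_assert(type_matches<std::float64_t>(binary64_params), "float64_t column");
#endif
```

The `static_cast` on the storage width assumes the type has no padding bits, which is an assumption of this sketch rather than a guarantee stated here.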

*Recommended practice:* Any names that the implementation provides for
the extended floating-point types described in this subsection that are
in addition to the names defined in the `<stdfloat>` header should be
chosen to increase compatibility and interoperability with the
interchange types `_Float16`, `_Float32`, `_Float64`, and `_Float128`
defined in ISO/IEC TS 18661-3 and with future versions of the C
standard.