[lex.ccon] - C++23 → Trunk

tmp/tmpn0ex8ukm/{from.md → to.md} +21 -36

@@ -10,12 +10,11 @@ encoding-prefix: one of
     'u8' 'u' 'U' 'L'
 ```
 ``` bnf
 c-char-sequence:
-    c-char
-    c-char-sequence c-char
 ```
 ``` bnf
 c-char:
     basic-c-char
@@ -52,12 +51,11 @@ numeric-escape-sequence:
     hexadecimal-escape-sequence
 ```
 ``` bnf
 simple-octal-digit-sequence:
-    octal-digit
-    simple-octal-digit-sequence octal-digit
 ```
 ``` bnf
 octal-escape-sequence:
     '\' octal-digit
@@ -80,60 +78,47 @@ conditional-escape-sequence:
 ``` bnf
 conditional-escape-sequence-char:
     any member of the basic character set that is not an octal-digit, a simple-escape-sequence-char, or the characters 'N', 'o', 'u', 'U', or 'x'
 ```
-A *non-encodable character literal* is a *character-literal* whose
-*c-char-sequence* consists of a single *c-char* that is not a
-*numeric-escape-sequence* and that specifies a character that either
-lacks representation in the literal’s associated character encoding or
-that cannot be encoded as a single code unit. A *multicharacter literal*
-is a *character-literal* whose *c-char-sequence* consists of more than
-one *c-char*. The *encoding-prefix* of a non-encodable character literal
-or a multicharacter literal shall be absent. Such *character-literal*s
-are conditionally-supported.
 The kind of a *character-literal*, its type, and its associated
 character encoding [[lex.charset]] are determined by its
 *encoding-prefix* and its *c-char-sequence* as defined by
-[[lex.ccon.literal]]. The special cases for non-encodable character
-literals and multicharacter literals take precedence over the base kind.
-[*Note 1*: The associated character encoding for ordinary character
-literals determines encodability, but does not determine the value of
-non-encodable ordinary character literals or ordinary multicharacter
-literals. The examples in [[lex.ccon.literal]] for non-encodable
-ordinary character literals assume that the specified character lacks
-representation in the ordinary literal encoding or that encoding the
-character would require more than one code unit. — *end note*]
 **Table: Character literals** <a id="lex.ccon.literal">[lex.ccon.literal]</a>
-| | | | | |
-| ---- | -------------------------- | ---------- | ------------ | ------- |
-| none | ordinary character literal | `char`     | ordinary | `'v'`   |
 | `L`             | wide character literal     | `wchar_t`  | wide literal                    | `L'w'`  |
 |                 |                            |            | encoding                        |         |
 | `u8`            | UTF-8 character literal    | `char8_t`  | UTF-8                           | `u8'x'` |
 | `u`             | UTF-16 character literal   | `char16_t` | UTF-16                          | `u'y'`  |
 | `U`             | UTF-32 character literal   | `char32_t` | UTF-32                          | `U'z'`  |
 In translation phase 4, the value of a *character-literal* is determined
 using the range of representable values of the *character-literal*’s
-type in translation phase 7. A non-encodable character literal or a
-multicharacter literal has an *implementation-defined* value. The value
-of any other kind of *character-literal* is determined as follows:
 - A *character-literal* with a *c-char-sequence* consisting of a single
   *basic-c-char*, *simple-escape-sequence*, or
   *universal-character-name* is the code unit value of the specified
   character as encoded in the literal’s associated character encoding.
- \[*Note 2*: If the specified character lacks representation in the
- literal’s associated character encoding or if it cannot be encoded as
- a single code unit, then the literal is a non-encodable character
-  literal. — *end note*]
 - A *character-literal* with a *c-char-sequence* consisting of a single
   *numeric-escape-sequence* has a value as follows:
   - Let v be the integer value represented by the octal number
     comprising the sequence of *octal-digit*s in an
     *octal-escape-sequence* or by the hexadecimal number comprising the
@@ -144,20 +129,20 @@ of any other kind of *character-literal* is determined as follows:
     or `L`, and v does not exceed the range of representable values of
     the corresponding unsigned type for the underlying type of the
     *character-literal*’s type, then the value is the unique value of
     the *character-literal*’s type `T` that is congruent to v modulo 2ᴺ,
     where N is the width of `T`.
-  - Otherwise, the *character-literal* is ill-formed.
 - A *character-literal* with a *c-char-sequence* consisting of a single
   *conditional-escape-sequence* is conditionally-supported and has an
   *implementation-defined* value.
 The character specified by a *simple-escape-sequence* is specified in
 [[lex.ccon.esc]].
-[*Note 3*: Using an escape sequence for a question mark is supported
-for compatibility with ISO C++14 and ISO C. — *end note*]
 **Table: Simple escape sequences** <a id="lex.ccon.esc">[lex.ccon.esc]</a>
 | character |                      | *simple-escape-sequence* |
 | --------- | -------------------- | ------------------------ |

     'u8' 'u' 'U' 'L'
 ```
 ``` bnf
 c-char-sequence:
+    c-char c-char-sequenceₒₚₜ
 ```
 ``` bnf
 c-char:
     basic-c-char
     hexadecimal-escape-sequence
 ```
 ``` bnf
 simple-octal-digit-sequence:
+    octal-digit simple-octal-digit-sequenceₒₚₜ
 ```
 ``` bnf
 octal-escape-sequence:
     '\' octal-digit
 ``` bnf
 conditional-escape-sequence-char:
     any member of the basic character set that is not an octal-digit, a simple-escape-sequence-char, or the characters 'N', 'o', 'u', 'U', or 'x'
 ```
+A *multicharacter literal* is a *character-literal* whose
+*c-char-sequence* consists of more than one *c-char*. A multicharacter
+literal shall not have an *encoding-prefix*. If a multicharacter literal
+contains a *c-char* that is not encodable as a single code unit in the
+ordinary literal encoding, the program is ill-formed. Multicharacter
+literals are conditionally-supported.
 The kind of a *character-literal*, its type, and its associated
 character encoding [[lex.charset]] are determined by its
 *encoding-prefix* and its *c-char-sequence* as defined by
+[[lex.ccon.literal]].
 **Table: Character literals** <a id="lex.ccon.literal">[lex.ccon.literal]</a>
+| Encoding prefix | Kind                       | Type       | Associated character encoding   | Example |
+| --------------- | -------------------------- | ---------- | ------------------------------- | ------- |
+| none | ordinary character literal | `char`     | ordinary literal                | `'v'`   |
 | `L`             | wide character literal     | `wchar_t`  | wide literal                    | `L'w'`  |
 |                 |                            |            | encoding                        |         |
 | `u8`            | UTF-8 character literal    | `char8_t`  | UTF-8                           | `u8'x'` |
 | `u`             | UTF-16 character literal   | `char16_t` | UTF-16                          | `u'y'`  |
 | `U`             | UTF-32 character literal   | `char32_t` | UTF-32                          | `U'z'`  |
 In translation phase 4, the value of a *character-literal* is determined
 using the range of representable values of the *character-literal*’s
+type in translation phase 7. A multicharacter literal has an
+*implementation-defined* value. The value of any other kind of
+*character-literal* is determined as follows:
 - A *character-literal* with a *c-char-sequence* consisting of a single
   *basic-c-char*, *simple-escape-sequence*, or
   *universal-character-name* is the code unit value of the specified
   character as encoded in the literal’s associated character encoding.
+  If the specified character lacks representation in the literal’s
+  associated character encoding or if it cannot be encoded as a single
+  code unit, then the program is ill-formed.
 - A *character-literal* with a *c-char-sequence* consisting of a single
   *numeric-escape-sequence* has a value as follows:
   - Let v be the integer value represented by the octal number
     comprising the sequence of *octal-digit*s in an
     *octal-escape-sequence* or by the hexadecimal number comprising the
     or `L`, and v does not exceed the range of representable values of
     the corresponding unsigned type for the underlying type of the
     *character-literal*’s type, then the value is the unique value of
     the *character-literal*’s type `T` that is congruent to v modulo 2ᴺ,
     where N is the width of `T`.
+  - Otherwise, the program is ill-formed.
 - A *character-literal* with a *c-char-sequence* consisting of a single
   *conditional-escape-sequence* is conditionally-supported and has an
   *implementation-defined* value.
 The character specified by a *simple-escape-sequence* is specified in
 [[lex.ccon.esc]].
+[*Note 1*: Using an escape sequence for a question mark is supported
+for compatibility with C++14 and C. — *end note*]
 **Table: Simple escape sequences** <a id="lex.ccon.esc">[lex.ccon.esc]</a>
 | character |                      | *simple-escape-sequence* |
 | --------- | -------------------- | ------------------------ |
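
The rules on the `+` side of this diff can be illustrated with a small compile-time sketch. It checks the literal types listed in the character literals table and the "congruent to v modulo 2ᴺ" rule for numeric escape sequences. The names `code_unit` and `run_literal_checks` are hypothetical helpers introduced only for this example. The commented-out literals are cases the quoted wording describes as ill-formed, so they stay in comments to keep the snippet compilable.

```cpp
#include <cassert>
#include <type_traits>

// Types from the "Character literals" table, one assertion per row shown.
static_assert(std::is_same<decltype('v'), char>::value, "ordinary character literal");
static_assert(std::is_same<decltype(L'w'), wchar_t>::value, "wide character literal");
#ifdef __cpp_char8_t  // char8_t is only available from C++20 on
static_assert(std::is_same<decltype(u8'x'), char8_t>::value, "UTF-8 character literal");
#endif
static_assert(std::is_same<decltype(u'y'), char16_t>::value, "UTF-16 character literal");
static_assert(std::is_same<decltype(U'z'), char32_t>::value, "UTF-32 character literal");

// Hypothetical helper: view an ordinary character literal's value as an
// unsigned code unit, so the checks below do not depend on whether plain
// char is signed on the target.
constexpr unsigned code_unit(char c) {
    return static_cast<unsigned char>(c);
}

// Numeric escape sequences: v comes from the digits, not from the encoding.
static_assert(code_unit('\101') == 65u, "octal 101 is 65");
static_assert(code_unit('\x41') == 0x41u, "hexadecimal 41 is 65");
// v = 255 fits in unsigned char, so the char value is the one congruent to
// 255 modulo 2^8 (255 if char is unsigned, -1 if char is signed).
static_assert(code_unit('\xff') == 0xffu, "congruent to 255 modulo 2^8");
static_assert(u'\xffff' == 0xffff, "v fits in char16_t, so the value is v");

// Cases the quoted wording rejects (kept as comments on purpose):
//   u8'\xfff'    v = 4095 exceeds the range of char8_t
//   u8'\u00e9'   U+00E9 needs two UTF-8 code units
//   u'ab'        a multicharacter literal shall not have an encoding-prefix

// Runtime mirror of two of the checks above.
void run_literal_checks() {
    assert(code_unit('\101') == 65u);
    assert(code_unit('\xff') == 255u);
}
```

Going through `unsigned char` in `code_unit` is a deliberate choice: the standard fixes the value only up to congruence modulo 2ᴺ, so comparing the raw `char` against 255 would fail on implementations where `char` is signed.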
