From Jason Turner

[lex.ccon]

Files changed (1)
  1. tmp/tmpn0ex8ukm/{from.md → to.md} +21 -36
tmp/tmpn0ex8ukm/{from.md → to.md} RENAMED
@@ -10,12 +10,11 @@ encoding-prefix: one of
10
  'u8' 'u' 'U' 'L'
11
  ```
12
 
13
  ``` bnf
14
  c-char-sequence:
15
- c-char
16
- c-char-sequence c-char
17
  ```
18
 
19
  ``` bnf
20
  c-char:
21
  basic-c-char
@@ -52,12 +51,11 @@ numeric-escape-sequence:
52
  hexadecimal-escape-sequence
53
  ```
54
 
55
  ``` bnf
56
  simple-octal-digit-sequence:
57
- octal-digit
58
- simple-octal-digit-sequence octal-digit
59
  ```
60
 
61
  ``` bnf
62
  octal-escape-sequence:
63
  '\' octal-digit
@@ -80,60 +78,47 @@ conditional-escape-sequence:
80
  ``` bnf
81
  conditional-escape-sequence-char:
82
  any member of the basic character set that is not an octal-digit, a simple-escape-sequence-char, or the characters 'N', 'o', 'u', 'U', or 'x'
83
  ```
84
 
85
- A *non-encodable character literal* is a *character-literal* whose
86
- *c-char-sequence* consists of a single *c-char* that is not a
87
- *numeric-escape-sequence* and that specifies a character that either
88
- lacks representation in the literal’s associated character encoding or
89
- that cannot be encoded as a single code unit. A *multicharacter literal*
90
- is a *character-literal* whose *c-char-sequence* consists of more than
91
- one *c-char*. The *encoding-prefix* of a non-encodable character literal
92
- or a multicharacter literal shall be absent. Such *character-literal*s
93
- are conditionally-supported.
94
 
95
  The kind of a *character-literal*, its type, and its associated
96
  character encoding [[lex.charset]] are determined by its
97
  *encoding-prefix* and its *c-char-sequence* as defined by
98
- [[lex.ccon.literal]]. The special cases for non-encodable character
99
- literals and multicharacter literals take precedence over the base kind.
100
-
101
- [*Note 1*: The associated character encoding for ordinary character
102
- literals determines encodability, but does not determine the value of
103
- non-encodable ordinary character literals or ordinary multicharacter
104
- literals. The examples in [[lex.ccon.literal]] for non-encodable
105
- ordinary character literals assume that the specified character lacks
106
- representation in the ordinary literal encoding or that encoding the
107
- character would require more than one code unit. — *end note*]
108
 
109
  **Table: Character literals** <a id="lex.ccon.literal">[lex.ccon.literal]</a>
110
 
111
- | | | | | |
112
- | ---- | -------------------------- | ---------- | ------------ | ------- |
113
- | none | ordinary character literal | `char` | ordinary | `'v'` |
114
  | `L` | wide character literal | `wchar_t` | wide literal | `L'w'` |
115
  | | | | encoding | |
116
  | `u8` | UTF-8 character literal | `char8_t` | UTF-8 | `u8'x'` |
117
  | `u` | UTF-16 character literal | `char16_t` | UTF-16 | `u'y'` |
118
  | `U` | UTF-32 character literal | `char32_t` | UTF-32 | `U'z'` |
119
 
120
 
121
  In translation phase 4, the value of a *character-literal* is determined
122
  using the range of representable values of the *character-literal*’s
123
- type in translation phase 7. A non-encodable character literal or a
124
- multicharacter literal has an *implementation-defined* value. The value
125
- of any other kind of *character-literal* is determined as follows:
126
 
127
  - A *character-literal* with a *c-char-sequence* consisting of a single
128
  *basic-c-char*, *simple-escape-sequence*, or
129
  *universal-character-name* is the code unit value of the specified
130
  character as encoded in the literal’s associated character encoding.
131
- \[*Note 2*: If the specified character lacks representation in the
132
- literal’s associated character encoding or if it cannot be encoded as
133
- a single code unit, then the literal is a non-encodable character
134
- literal. — *end note*]
135
  - A *character-literal* with a *c-char-sequence* consisting of a single
136
  *numeric-escape-sequence* has a value as follows:
137
  - Let v be the integer value represented by the octal number
138
  comprising the sequence of *octal-digit*s in an
139
  *octal-escape-sequence* or by the hexadecimal number comprising the
@@ -144,20 +129,20 @@ of any other kind of *character-literal* is determined as follows:
144
  or `L`, and v does not exceed the range of representable values of
145
  the corresponding unsigned type for the underlying type of the
146
  *character-literal*’s type, then the value is the unique value of
147
  the *character-literal*’s type `T` that is congruent to v modulo 2ᴺ,
148
  where N is the width of `T`.
149
- - Otherwise, the *character-literal* is ill-formed.
150
  - A *character-literal* with a *c-char-sequence* consisting of a single
151
  *conditional-escape-sequence* is conditionally-supported and has an
152
  *implementation-defined* value.
153
 
154
  The character specified by a *simple-escape-sequence* is specified in
155
  [[lex.ccon.esc]].
156
 
157
- [*Note 3*: Using an escape sequence for a question mark is supported
158
- for compatibility with ISO C++14 and ISO C. — *end note*]
159
 
160
  **Table: Simple escape sequences** <a id="lex.ccon.esc">[lex.ccon.esc]</a>
161
 
162
  | character | | *simple-escape-sequence* |
163
  | --------- | -------------------- | ------------------------ |
 
10
  'u8' 'u' 'U' 'L'
11
  ```
12
 
13
  ``` bnf
14
  c-char-sequence:
15
+ c-char c-char-sequenceₒₚₜ
 
16
  ```
17
 
18
  ``` bnf
19
  c-char:
20
  basic-c-char
 
51
  hexadecimal-escape-sequence
52
  ```
53
 
54
  ``` bnf
55
  simple-octal-digit-sequence:
56
+ octal-digit simple-octal-digit-sequenceₒₚₜ
 
57
  ```
58
 
59
  ``` bnf
60
  octal-escape-sequence:
61
  '\' octal-digit
 
78
  ``` bnf
79
  conditional-escape-sequence-char:
80
  any member of the basic character set that is not an octal-digit, a simple-escape-sequence-char, or the characters 'N', 'o', 'u', 'U', or 'x'
81
  ```
82
 
83
+ A *multicharacter literal* is a *character-literal* whose
84
+ *c-char-sequence* consists of more than one *c-char*. A multicharacter
85
+ literal shall not have an *encoding-prefix*. If a multicharacter literal
86
+ contains a *c-char* that is not encodable as a single code unit in the
87
+ ordinary literal encoding, the program is ill-formed. Multicharacter
88
+ literals are conditionally-supported.
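
As a rough illustration of the paragraph above (not part of the wording): where multicharacter literals are supported, such a literal has type `int` rather than `char`, and its value is implementation-defined. The helper name `is_int_typed` is hypothetical, and the sketch assumes a compiler such as GCC or Clang that accepts `'ab'`, usually with a `-Wmultichar` warning.

```cpp
#include <type_traits>

// Hypothetical helper: true when the argument's type is exactly int.
template <class T>
constexpr bool is_int_typed(T) { return std::is_same<T, int>::value; }

// An ordinary character literal with a single c-char has type char.
static_assert(!is_int_typed('a'), "single c-char literal is a char");

// A multicharacter literal (two c-chars, no encoding-prefix) has type int
// where supported. Its value is implementation-defined, so none is asserted.
static_assert(is_int_typed('ab'), "multicharacter literal is an int");

// Under the wording above, a prefixed form such as u8'ab' is ill-formed,
// so it is deliberately left out of this translation unit.
```

Because the value is implementation-defined, portable code can rely only on the type, not on how the individual characters are packed into the `int`.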
 
 
 
89
 
90
  The kind of a *character-literal*, its type, and its associated
91
  character encoding [[lex.charset]] are determined by its
92
  *encoding-prefix* and its *c-char-sequence* as defined by
93
+ [[lex.ccon.literal]].
 
 
 
 
 
 
 
 
 
94
 
95
  **Table: Character literals** <a id="lex.ccon.literal">[lex.ccon.literal]</a>
96
 
97
+ | Encoding prefix | Kind | Type | Associated character encoding | Example |
98
+ | --------------- | -------------------------- | ---------- | ------------------------------- | ------- |
99
+ | none | ordinary character literal | `char` | ordinary literal | `'v'` |
100
  | `L` | wide character literal | `wchar_t` | wide literal | `L'w'` |
101
  | | | | encoding | |
102
  | `u8` | UTF-8 character literal | `char8_t` | UTF-8 | `u8'x'` |
103
  | `u` | UTF-16 character literal | `char16_t` | UTF-16 | `u'y'` |
104
  | `U` | UTF-32 character literal | `char32_t` | UTF-32 | `U'z'` |
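
The rows of the table can be checked at compile time. The following is a minimal sketch, not part of the wording: `has_type` is a hypothetical helper, and the `u8` row is guarded by the `__cpp_char8_t` feature-test macro because `char8_t` only exists in C++20 and later.

```cpp
#include <type_traits>

// Hypothetical helper: true when the argument's type is exactly Expected.
template <class Expected, class T>
constexpr bool has_type(T) { return std::is_same<Expected, T>::value; }

// One assertion per table row, using the example literals from the table.
static_assert(has_type<char>('v'), "no prefix: ordinary character literal");
static_assert(has_type<wchar_t>(L'w'), "L: wide character literal");
static_assert(has_type<char16_t>(u'y'), "u: UTF-16 character literal");
static_assert(has_type<char32_t>(U'z'), "U: UTF-32 character literal");

#if defined(__cpp_char8_t)
// Before C++20, u8'x' had type char, so this row is checked only when
// the implementation reports char8_t support.
static_assert(has_type<char8_t>(u8'x'), "u8: UTF-8 character literal");
#endif
```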
105
 
106
 
107
  In translation phase 4, the value of a *character-literal* is determined
108
  using the range of representable values of the *character-literal*’s
109
+ type in translation phase 7. A multicharacter literal has an
110
+ *implementation-defined* value. The value of any other kind of
111
+ *character-literal* is determined as follows:
112
 
113
  - A *character-literal* with a *c-char-sequence* consisting of a single
114
  *basic-c-char*, *simple-escape-sequence*, or
115
  *universal-character-name* is the code unit value of the specified
116
  character as encoded in the literal’s associated character encoding.
117
+ If the specified character lacks representation in the literal’s
118
+ associated character encoding or if it cannot be encoded as a single
119
+ code unit, then the program is ill-formed.
 
120
  - A *character-literal* with a *c-char-sequence* consisting of a single
121
  *numeric-escape-sequence* has a value as follows:
122
  - Let v be the integer value represented by the octal number
123
  comprising the sequence of *octal-digit*s in an
124
  *octal-escape-sequence* or by the hexadecimal number comprising the
 
129
  or `L`, and v does not exceed the range of representable values of
130
  the corresponding unsigned type for the underlying type of the
131
  *character-literal*’s type, then the value is the unique value of
132
  the *character-literal*’s type `T` that is congruent to v modulo 2ᴺ,
133
  where N is the width of `T`.
134
+ - Otherwise, the program is ill-formed.
135
  - A *character-literal* with a *c-char-sequence* consisting of a single
136
  *conditional-escape-sequence* is conditionally-supported and has an
137
  *implementation-defined* value.
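
The value rules in the list above can be illustrated with a few compile-time checks. This is a sketch under stated assumptions, not part of the wording: `low_byte` is a hypothetical helper, and the comment about the signed case assumes an 8-bit `char`.

```cpp
// A single universal-character-name yields its code unit value in the
// literal's associated encoding.
static_assert(u'\u00E9' == 0x00E9, "U+00E9 is one UTF-16 code unit");
static_assert(U'\U0001F600' == 0x1F600, "U+1F600 is one UTF-32 code unit");
// u8'\u00E9' is omitted: U+00E9 needs two UTF-8 code units, so under the
// wording above the program would be ill-formed.

// Octal and hexadecimal escapes that denote the same v give the same value.
static_assert('\101' == '\x41', "octal 101 and hex 41 both denote 65");
static_assert(L'\x41' == 65, "v fits wchar_t, so the value is v");

// Hypothetical helper: view a char's value through unsigned char.
constexpr unsigned low_byte(char c) { return static_cast<unsigned char>(c); }

// v = 0xFF may not fit a signed 8-bit char but does fit unsigned char, so
// the value is the char congruent to 255 modulo 2^N (-1 if char is signed).
static_assert(low_byte('\xff') == 0xFFu, "value is congruent to 255");
```

Comparing through `unsigned char` keeps the last check independent of whether plain `char` is signed on a given implementation.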
138
 
139
  The character specified by a *simple-escape-sequence* is specified in
140
  [[lex.ccon.esc]].
141
 
142
+ [*Note 1*: Using an escape sequence for a question mark is supported
143
+ for compatibility with C++14 and C. — *end note*]
144
 
145
  **Table: Simple escape sequences** <a id="lex.ccon.esc">[lex.ccon.esc]</a>
146
 
147
  | character | | *simple-escape-sequence* |
148
  | --------- | -------------------- | ------------------------ |