From Jason Turner

[lex.ext]

Files changed (1)
  1. tmp/tmp1d86r7_u/{from.md → to.md} +43 -30
tmp/tmp1d86r7_u/{from.md → to.md} RENAMED
@@ -18,10 +18,12 @@ user-defined-integer-literal:
18
 
19
  ``` bnf
20
  user-defined-floating-literal:
21
  fractional-constant exponent-partₒₚₜ ud-suffix
22
  digit-sequence exponent-part ud-suffix
 
 
23
  ```
24
 
25
  ``` bnf
26
  user-defined-string-literal:
27
  string-literal ud-suffix
@@ -35,15 +37,24 @@ user-defined-character-literal:
35
  ``` bnf
36
  ud-suffix:
37
  identifier
38
  ```
39
 
40
- If a token matches both *user-defined-literal* and another literal kind,
41
- it is treated as the latter. `123_km` is a *user-defined-literal*, but
42
- `12LL` is an *integer-literal*. The syntactic non-terminal preceding the
43
- *ud-suffix* in a *user-defined-literal* is taken to be the longest
44
- sequence of characters that could match that non-terminal.
 
 
 
 
 
 
 
 
 
45
 
46
  A *user-defined-literal* is treated as a call to a literal operator or
47
  literal operator template ([[over.literal]]). To determine the form of
48
  this call for a given *user-defined-literal* *L* with *ud-suffix* *X*,
49
  the *literal-operator-id* whose literal suffix identifier is *X* is
@@ -73,13 +84,14 @@ a call of the form
73
 
74
  ``` cpp
75
  operator "" X<'c₁', 'c₂', ... 'cₖ'>()
76
  ```
77
 
78
- where *n* is the source character sequence c₁c₂...cₖ. The sequence
79
- c₁c₂...cₖ can only contain characters from the basic source character
80
- set.
 
81
 
82
  If *L* is a *user-defined-floating-literal*, let *f* be the literal
83
  without its *ud-suffix*. If *S* contains a literal operator with
84
  parameter type `long double`, the literal *L* is treated as a call of
85
  the form
@@ -101,32 +113,35 @@ a call of the form
101
 
102
  ``` cpp
103
  operator "" X<'c₁', 'c₂', ... 'cₖ'>()
104
  ```
105
 
106
- where *f* is the source character sequence c₁c₂...cₖ. The sequence
107
- c₁c₂...cₖ can only contain characters from the basic source character
108
- set.
 
109
 
110
  If *L* is a *user-defined-string-literal*, let *str* be the literal
111
  without its *ud-suffix* and let *len* be the number of code units in
112
  *str* (i.e., its length excluding the terminating null character). The
113
  literal *L* is treated as a call of the form
114
 
115
  ``` cpp
116
- operator "" X(str{}, len{})
117
  ```
118
 
119
  If *L* is a *user-defined-character-literal*, let *ch* be the literal
120
  without its *ud-suffix*. *S* shall contain a literal operator
121
  ([[over.literal]]) whose only parameter has the type of *ch* and the
122
  literal *L* is treated as a call of the form
123
 
124
  ``` cpp
125
- operator "" X(ch{})
126
  ```
127
 
 
 
128
  ``` cpp
129
  long double operator "" _w(long double);
130
  std::string operator "" _w(const char16_t*, std::size_t);
131
  unsigned operator "" _w(const char*);
132
  int main() {
@@ -135,48 +150,47 @@ int main() {
135
  12_w; // calls operator "" _w("12")
136
  "two"_w; // error: no applicable literal operator
137
  }
138
  ```
139
 
 
 
140
  In translation phase 6 ([[lex.phases]]), adjacent string literals are
141
  concatenated and *user-defined-string-literal*s are considered string
142
  literals for that purpose. During concatenation, *ud-suffix*es are
143
  removed and ignored and the concatenation process occurs as described
144
  in [[lex.string]]. At the end of phase 6, if a string literal is the
145
  result of a concatenation involving at least one
146
  *user-defined-string-literal*, all the participating
147
  *user-defined-string-literal*s shall have the same *ud-suffix* and that
148
  suffix is applied to the result of the concatenation.
149
 
 
 
150
  ``` cpp
151
  int main() {
152
  L"A" "B" "C"_x; // OK: same as L"ABC"_x
153
  "P"_x "Q" "R"_y; // error: two different ud-suffixes
154
  }
155
  ```
156
 
157
- Some *identifier*s appearing as *ud-suffix*es are reserved for future
158
- standardization ([[usrlit.suffix]]). A program containing such a
159
- *ud-suffix* is ill-formed, no diagnostic required.
160
 
161
  <!-- Link reference definitions -->
162
  [basic.fundamental]: basic.md#basic.fundamental
163
  [basic.link]: basic.md#basic.link
164
  [basic.lookup.unqual]: basic.md#basic.lookup.unqual
165
  [basic.stc]: basic.md#basic.stc
166
  [basic.types]: basic.md#basic.types
167
- [charname.allowed]: charname.md#charname.allowed
168
- [charname.disallowed]: charname.md#charname.disallowed
169
  [conv.mem]: conv.md#conv.mem
170
  [conv.ptr]: conv.md#conv.ptr
171
  [cpp]: cpp.md#cpp
172
  [cpp.concat]: cpp.md#cpp.concat
173
  [cpp.cond]: cpp.md#cpp.cond
174
  [cpp.include]: cpp.md#cpp.include
175
  [cpp.stringize]: cpp.md#cpp.stringize
176
  [dcl.attr.grammar]: dcl.md#dcl.attr.grammar
177
- [global.names]: library.md#global.names
178
  [headers]: library.md#headers
179
  [lex]: #lex
180
  [lex.bool]: #lex.bool
181
  [lex.ccon]: #lex.ccon
182
  [lex.charset]: #lex.charset
@@ -196,23 +210,22 @@ standardization ([[usrlit.suffix]]). A program containing such a
196
  [lex.ppnumber]: #lex.ppnumber
197
  [lex.pptoken]: #lex.pptoken
198
  [lex.separate]: #lex.separate
199
  [lex.string]: #lex.string
200
  [lex.token]: #lex.token
201
- [lex.trigraph]: #lex.trigraph
202
  [over.literal]: over.md#over.literal
203
  [tab:alternative.representations]: #tab:alternative.representations
204
  [tab:alternative.tokens]: #tab:alternative.tokens
 
 
205
  [tab:escape.sequences]: #tab:escape.sequences
206
  [tab:identifiers.special]: #tab:identifiers.special
207
  [tab:keywords]: #tab:keywords
208
  [tab:lex.string.concat]: #tab:lex.string.concat
209
  [tab:lex.type.integer.literal]: #tab:lex.type.integer.literal
210
- [tab:trigraph.sequences]: #tab:trigraph.sequences
211
  [temp.explicit]: temp.md#temp.explicit
212
  [temp.names]: temp.md#temp.names
213
- [usrlit.suffix]: library.md#usrlit.suffix
214
 
215
  [^1]: Implementations must behave as if these separate phases occur,
216
  although in practice different phases might be folded together.
217
 
218
  [^2]: A partial preprocessing token would arise from a source file
@@ -227,16 +240,16 @@ standardization ([[usrlit.suffix]]). A program containing such a
227
  [^4]: The glyphs for the members of the basic source character set are
228
  intended to identify characters from the subset of ISO/IEC 10646
229
  which corresponds to the ASCII character set. However, because the
230
  mapping from source file characters to the source character set
231
  (described in translation phase 1) is specified as
232
- implementation-defined, an implementation is required to document
233
  how the basic source characters are represented in source files.
234
 
235
- [^5]: A sequence of characters resembling a universal-character-name in
236
- an *r-char-sequence* ([[lex.string]]) does not form a
237
- universal-character-name.
238
 
239
  [^6]: These include “digraphs” and additional reserved words. The term
240
  “digraph” (token consisting of two characters) is not perfectly
241
  descriptive, since one of the alternative preprocessing-tokens is
242
  `%:%:` and of course several primary tokens contain two characters.
@@ -253,14 +266,14 @@ standardization ([[usrlit.suffix]]). A program containing such a
253
  might result in an error, be interpreted as the character
254
  corresponding to the escape sequence, or have a completely different
255
  meaning, depending on the implementation.
256
 
257
  [^10]: On systems in which linkers cannot accept extended characters, an
258
- encoding of the universal-character-name may be used in forming
259
  valid external identifiers. For example, some otherwise unused
260
  character or sequence of characters may be used to encode the `\u`
261
- in a universal-character-name. Extended characters may produce a
262
  long external identifier, but C++ does not place a translation limit
263
  on significant characters for external identifiers. In C++, upper-
264
  and lower-case letters are considered different for all identifiers,
265
  including external identifiers.
266
 
@@ -270,7 +283,7 @@ standardization ([[usrlit.suffix]]). A program containing such a
270
  [^12]: The digits `8` and `9` are not octal digits.
271
 
272
  [^13]: They are intended for character sets where a character does not
273
  fit into a single byte.
274
 
275
- [^14]: Using an escape sequence for a question mark can avoid
276
- accidentally creating a trigraph.
 
18
 
19
  ``` bnf
20
  user-defined-floating-literal:
21
  fractional-constant exponent-partₒₚₜ ud-suffix
22
  digit-sequence exponent-part ud-suffix
23
+ hexadecimal-prefix hexadecimal-fractional-constant binary-exponent-part ud-suffix
24
+ hexadecimal-prefix hexadecimal-digit-sequence binary-exponent-part ud-suffix
25
  ```
26
 
27
  ``` bnf
28
  user-defined-string-literal:
29
  string-literal ud-suffix
 
37
  ``` bnf
38
  ud-suffix:
39
  identifier
40
  ```
41
 
42
+ If a token matches both *user-defined-literal* and another *literal*
43
+ kind, it is treated as the latter.
44
+
45
+ [*Example 1*:
46
+
47
+ `123_km`
48
+
49
+ is a *user-defined-literal*, but `12LL` is an *integer-literal*.
50
+
51
+ — *end example*]
52
+
53
+ The syntactic non-terminal preceding the *ud-suffix* in a
54
+ *user-defined-literal* is taken to be the longest sequence of characters
55
+ that could match that non-terminal.
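
The precedence rule above can be checked at compile time. The following is a minimal sketch, not part of the diffed text; the `Kilometers` type and the `_km` suffix are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical unit type, assumed only for this sketch.
struct Kilometers {
    unsigned long long value;
};

// Hypothetical cooked literal operator for the suffix _km.
constexpr Kilometers operator""_km(unsigned long long v) {
    return Kilometers{v};
}

// `12LL` could be read as the integer-literal `12` followed by the
// identifier `LL`, but it also matches integer-literal as a whole,
// so it is treated as an integer-literal of type long long.
static_assert(std::is_same<decltype(12LL), long long>::value,
              "12LL is an integer-literal");

// `123_km` matches no other literal kind, so it is a
// user-defined-literal and is treated as a call to operator""_km.
static_assert(std::is_same<decltype(123_km), Kilometers>::value,
              "123_km is a user-defined-literal");
static_assert((123_km).value == 123, "the operator receives the value 123");
```

Both checks are `static_assert`s, so a conforming compiler would reject the translation unit if either token were classified differently.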
56
 
57
  A *user-defined-literal* is treated as a call to a literal operator or
58
  literal operator template ([[over.literal]]). To determine the form of
59
  this call for a given *user-defined-literal* *L* with *ud-suffix* *X*,
60
  the *literal-operator-id* whose literal suffix identifier is *X* is
 
84
 
85
  ``` cpp
86
  operator "" X<'c₁', 'c₂', ... 'cₖ'>()
87
  ```
88
 
89
+ where *n* is the source character sequence c₁c₂...cₖ.
90
+
91
+ [*Note 1*: The sequence c₁c₂...cₖ can only contain characters from the
92
+ basic source character set. — *end note*]
93
 
94
  If *L* is a *user-defined-floating-literal*, let *f* be the literal
95
  without its *ud-suffix*. If *S* contains a literal operator with
96
  parameter type `long double`, the literal *L* is treated as a call of
97
  the form
 
113
 
114
  ``` cpp
115
  operator "" X<'c₁', 'c₂', ... 'cₖ'>()
116
  ```
117
 
118
+ where *f* is the source character sequence c₁c₂...cₖ.
119
+
120
+ [*Note 2*: The sequence c₁c₂...cₖ can only contain characters from the
121
+ basic source character set. — *end note*]
122
 
123
  If *L* is a *user-defined-string-literal*, let *str* be the literal
124
  without its *ud-suffix* and let *len* be the number of code units in
125
  *str* (i.e., its length excluding the terminating null character). The
126
  literal *L* is treated as a call of the form
127
 
128
  ``` cpp
129
+ operator "" X(str, len)
130
  ```
131
 
132
  If *L* is a *user-defined-character-literal*, let *ch* be the literal
133
  without its *ud-suffix*. *S* shall contain a literal operator
134
  ([[over.literal]]) whose only parameter has the type of *ch* and the
135
  literal *L* is treated as a call of the form
136
 
137
  ``` cpp
138
+ operator "" X(ch)
139
  ```
140
 
141
+ [*Example 2*:
142
+
143
  ``` cpp
144
  long double operator "" _w(long double);
145
  std::string operator "" _w(const char16_t*, std::size_t);
146
  unsigned operator "" _w(const char*);
147
  int main() {
 
150
  12_w; // calls operator "" _w("12")
151
  "two"_w; // error: no applicable literal operator
152
  }
153
  ```
154
 
155
+ — *end example*]
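
As a complement to the declarations in the example, the sketch below defines one operator for each of three call forms described above (raw, string, and character) so the rewritten calls can be observed. All suffix names and the helper function are hypothetical and are not taken from the diffed text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// All suffixes below are hypothetical names chosen for this sketch.

// Raw literal operator: receives the spelling of the literal (without
// its ud-suffix) as a null-terminated string.
std::size_t operator""_spelling_length(const char* spelling) {
    return std::strlen(spelling);
}

// String form: the literal is treated as operator""_length(str, len).
constexpr std::size_t operator""_length(const char*, std::size_t len) {
    return len;
}

// Character form: the literal is treated as operator""_code(ch).
constexpr int operator""_code(char ch) {
    return static_cast<int>(ch);
}

static_assert("two"_length == 3,
              "len excludes the terminating null character");
static_assert('A'_code == static_cast<int>('A'),
              "ch is passed to the operator unchanged");

// Helper that exercises the raw operator at run time. No cooked
// operator exists for this suffix, so both literals use the raw form.
inline bool raw_forms_behave_as_described() {
    return 12_spelling_length == 2     // called with "12"
        && 1.5_spelling_length == 3;   // called with "1.5"
}
```

Because `_spelling_length` has no overload taking `unsigned long long` or `long double`, both the integer and the floating literal fall through to the raw literal operator.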
156
+
157
  In translation phase 6 ([[lex.phases]]), adjacent string literals are
158
  concatenated and *user-defined-string-literal*s are considered string
159
  literals for that purpose. During concatenation, *ud-suffix*es are
160
  removed and ignored and the concatenation process occurs as described
161
  in [[lex.string]]. At the end of phase 6, if a string literal is the
162
  result of a concatenation involving at least one
163
  *user-defined-string-literal*, all the participating
164
  *user-defined-string-literal*s shall have the same *ud-suffix* and that
165
  suffix is applied to the result of the concatenation.
166
 
167
+ [*Example 3*:
168
+
169
  ``` cpp
170
  int main() {
171
  L"A" "B" "C"_x; // OK: same as L"ABC"_x
172
  "P"_x "Q" "R"_y; // error: two different ud-suffixes
173
  }
174
  ```
175
 
176
+ *end example*]
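
The well-formed cases of the concatenation rule can also be sketched with a suffix that reports the length it receives. The `_n` suffix is a hypothetical name, assumed only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical suffix for illustration: yields the number of code
// units passed to the literal operator.
constexpr std::size_t operator""_n(const char*, std::size_t len) {
    return len;
}

// The suffix is applied to the result of the concatenation, so the
// operator sees "abcd" with len == 4 in each case below.
static_assert("ab" "cd"_n == 4, "suffix on the last literal only");
static_assert("ab"_n "cd" == 4, "suffix on the first literal only");
static_assert("ab"_n "cd"_n == 4, "same suffix on both literals");
```

In each case the operator is called once, after phase 6, rather than once per participating literal.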
 
 
177
 
178
  <!-- Link reference definitions -->
179
  [basic.fundamental]: basic.md#basic.fundamental
180
  [basic.link]: basic.md#basic.link
181
  [basic.lookup.unqual]: basic.md#basic.lookup.unqual
182
  [basic.stc]: basic.md#basic.stc
183
  [basic.types]: basic.md#basic.types
 
 
184
  [conv.mem]: conv.md#conv.mem
185
  [conv.ptr]: conv.md#conv.ptr
186
  [cpp]: cpp.md#cpp
187
  [cpp.concat]: cpp.md#cpp.concat
188
  [cpp.cond]: cpp.md#cpp.cond
189
  [cpp.include]: cpp.md#cpp.include
190
  [cpp.stringize]: cpp.md#cpp.stringize
191
  [dcl.attr.grammar]: dcl.md#dcl.attr.grammar
 
192
  [headers]: library.md#headers
193
  [lex]: #lex
194
  [lex.bool]: #lex.bool
195
  [lex.ccon]: #lex.ccon
196
  [lex.charset]: #lex.charset
 
210
  [lex.ppnumber]: #lex.ppnumber
211
  [lex.pptoken]: #lex.pptoken
212
  [lex.separate]: #lex.separate
213
  [lex.string]: #lex.string
214
  [lex.token]: #lex.token
 
215
  [over.literal]: over.md#over.literal
216
  [tab:alternative.representations]: #tab:alternative.representations
217
  [tab:alternative.tokens]: #tab:alternative.tokens
218
+ [tab:charname.allowed]: #tab:charname.allowed
219
+ [tab:charname.disallowed]: #tab:charname.disallowed
220
  [tab:escape.sequences]: #tab:escape.sequences
221
  [tab:identifiers.special]: #tab:identifiers.special
222
  [tab:keywords]: #tab:keywords
223
  [tab:lex.string.concat]: #tab:lex.string.concat
224
  [tab:lex.type.integer.literal]: #tab:lex.type.integer.literal
 
225
  [temp.explicit]: temp.md#temp.explicit
226
  [temp.names]: temp.md#temp.names
 
227
 
228
  [^1]: Implementations must behave as if these separate phases occur,
229
  although in practice different phases might be folded together.
230
 
231
  [^2]: A partial preprocessing token would arise from a source file
 
240
  [^4]: The glyphs for the members of the basic source character set are
241
  intended to identify characters from the subset of ISO/IEC 10646
242
  which corresponds to the ASCII character set. However, because the
243
  mapping from source file characters to the source character set
244
  (described in translation phase 1) is specified as
245
+ *implementation-defined*, an implementation is required to document
246
  how the basic source characters are represented in source files.
247
 
248
+ [^5]: A sequence of characters resembling a *universal-character-name*
249
+ in an *r-char-sequence* ([[lex.string]]) does not form a
250
+ *universal-character-name*.
251
 
252
  [^6]: These include “digraphs” and additional reserved words. The term
253
  “digraph” (token consisting of two characters) is not perfectly
254
  descriptive, since one of the alternative preprocessing-tokens is
255
  `%:%:` and of course several primary tokens contain two characters.
 
266
  might result in an error, be interpreted as the character
267
  corresponding to the escape sequence, or have a completely different
268
  meaning, depending on the implementation.
269
 
270
  [^10]: On systems in which linkers cannot accept extended characters, an
271
+ encoding of the *universal-character-name* may be used in forming
272
  valid external identifiers. For example, some otherwise unused
273
  character or sequence of characters may be used to encode the `\u`
274
+ in a *universal-character-name*. Extended characters may produce a
275
  long external identifier, but C++ does not place a translation limit
276
  on significant characters for external identifiers. In C++, upper-
277
  and lower-case letters are considered different for all identifiers,
278
  including external identifiers.
279
 
 
283
  [^12]: The digits `8` and `9` are not octal digits.
284
 
285
  [^13]: They are intended for character sets where a character does not
286
  fit into a single byte.
287
 
288
+ [^14]: Using an escape sequence for a question mark is supported for
289
+ compatibility with ISO C++14 and ISO C.