From Jason Turner

[lex.literal]

Files changed (1)
  1. tmp/tmp8c3i9k84/{from.md → to.md} +85 -57
tmp/tmp8c3i9k84/{from.md → to.md} RENAMED
@@ -17,44 +17,58 @@ literal:
17
 
18
  ### Integer literals <a id="lex.icon">[[lex.icon]]</a>
19
 
20
  ``` bnf
21
  integer-literal:
22
- decimal-literal integer-suffixₒₚₜ
23
  octal-literal integer-suffixₒₚₜ
 
24
  hexadecimal-literal integer-suffixₒₚₜ
25
  ```
26
 
 
 
 
 
 
 
 
 
 
 
 
 
 
27
  ``` bnf
28
  decimal-literal:
29
  nonzero-digit
30
- decimal-literal digit
31
- ```
32
-
33
- ``` bnf
34
- octal-literal:
35
- '0'
36
- octal-literal octal-digit
37
  ```
38
 
39
  ``` bnf
40
  hexadecimal-literal:
41
  '0x' hexadecimal-digit
42
  '0X' hexadecimal-digit
43
- hexadecimal-literal hexadecimal-digit
44
  ```
45
 
46
  ``` bnf
47
- nonzero-digit: one of
48
- '1 2 3 4 5 6 7 8 9'
 
49
  ```
50
 
51
  ``` bnf
52
  octal-digit: one of
53
  '0 1 2 3 4 5 6 7'
54
  ```
55
 
 
 
 
 
 
56
  ``` bnf
57
  hexadecimal-digit: one of
58
  '0 1 2 3 4 5 6 7 8 9'
59
  'a b c d e f'
60
  'A B C D E F'
@@ -82,27 +96,31 @@ long-suffix: one of
82
  long-long-suffix: one of
83
  'll LL'
84
  ```
85
 
86
  An *integer literal* is a sequence of digits that has no period or
87
- exponent part. An integer literal may have a prefix that specifies its
88
- base and a suffix that specifies its type. The lexically first digit of
89
- the sequence of digits is the most significant. A *decimal* integer
90
- literal (base ten) begins with a digit other than `0` and consists of a
91
- sequence of decimal digits. An *octal* integer literal (base eight)
92
- begins with the digit `0` and consists of a sequence of octal
93
- digits.[^12] A *hexadecimal* integer literal (base sixteen) begins with
94
- `0x` or `0X` and consists of a sequence of hexadecimal digits, which
95
- include the decimal digits and the letters `a` through `f` and `A`
96
- through `F` with decimal values ten through fifteen. the number twelve
97
- can be written `12`, `014`, or `0XC`.
 
 
 
 
98
 
99
  The type of an integer literal is the first of the corresponding list in
100
- Table  [[tab:lex.type.integer.constant]] in which its value can be
101
  represented.
102
 
103
- **Table: Types of integer constants** <a id="tab:lex.type.integer.constant">[tab:lex.type.integer.constant]</a>
104
 
105
  | | | |
106
  | ---------------- | ------------------------ | ------------------------ |
107
  | none | `int` | `int` |
108
  | | `long int` | `unsigned int` |
@@ -181,15 +199,18 @@ hexadecimal-escape-sequence:
181
  A character literal is one or more characters enclosed in single quotes,
182
  as in `'x'`, optionally preceded by one of the letters `u`, `U`, or `L`,
183
  as in `u'y'`, `U'z'`, or `L'x'`, respectively. A character literal that
184
  does not begin with `u`, `U`, or `L` is an ordinary character literal,
185
  also referred to as a narrow-character literal. An ordinary character
186
- literal that contains a single *c-char* has type `char`, with value
187
- equal to the numerical value of the encoding of the *c-char* in the
188
- execution character set. An ordinary character literal that contains
189
- more than one *c-char* is a *multicharacter literal*. A multicharacter
190
- literal has type `int` and *implementation-defined* value.
 
 
 
191
 
192
  A character literal that begins with the letter `u`, such as `u'y'`, is
193
  a character literal of type `char16_t`. The value of a `char16_t`
194
  literal containing a single *c-char* is equal to its ISO 10646 code
195
  point value, provided that the code point is representable with a single
@@ -291,35 +312,37 @@ sign: one of
291
  ```
292
 
293
  ``` bnf
294
  digit-sequence:
295
  digit
296
- digit-sequence digit
297
  ```
298
 
299
  ``` bnf
300
  floating-suffix: one of
301
  'f l F L'
302
  ```
303
 
304
  A floating literal consists of an integer part, a decimal point, a
305
  fraction part, an `e` or `E`, an optionally signed integer exponent, and
306
  an optional type suffix. The integer and fraction parts both consist of
307
- a sequence of decimal (base ten) digits. Either the integer part or the
308
- fraction part (not both) can be omitted; either the decimal point or the
309
- letter `e` (or `E` ) and the exponent (not both) can be omitted. The
310
- integer part, the optional decimal point and the optional fraction part
311
- form the *significant part* of the floating literal. The exponent, if
312
- present, indicates the power of 10 by which the significant part is to
313
- be scaled. If the scaled value is in the range of representable values
314
- for its type, the result is the scaled value if representable, else the
315
- larger or smaller representable value nearest the scaled value, chosen
316
- in an *implementation-defined* manner. The type of a floating literal is
317
- `double` unless explicitly specified by a suffix. The suffixes `f` and
318
- `F` specify `float`, the suffixes `l` and `L` specify `long` `double`.
319
- If the scaled value is not in the range of representable values for its
320
- type, the program is ill-formed.
 
 
321
 
322
  ### String literals <a id="lex.string">[[lex.string]]</a>
323
 
324
  ``` bnf
325
  string-literal:
@@ -412,18 +435,21 @@ is equivalent to `"\n)\?\?=\"\n"`.
412
  After translation phase 6, a string literal that does not begin with an
413
  *encoding-prefix* is an ordinary string literal, and is initialized with
414
  the given characters.
415
 
416
  A string literal that begins with `u8`, such as `u8"asdf"`, is a UTF-8
417
- string literal and is initialized with the given characters as encoded
418
- in UTF-8.
419
 
420
  Ordinary string literals and UTF-8 string literals are also referred to
421
  as narrow string literals. A narrow string literal has type “array of
422
  *n* `const char`”, where *n* is the size of the string as defined below,
423
  and has static storage duration ([[basic.stc]]).
424
 
 
 
 
 
425
  A string literal that begins with `u`, such as `u"asdf"`, is a
426
  `char16_t` string literal. A `char16_t` string literal has type “array
427
  of *n* `const char16_t`”, where *n* is the size of the string as defined
428
  below; it has static storage duration and is initialized with the given
429
  characters. A single *c-char* may produce more than one `char16_t`
@@ -448,18 +474,18 @@ In translation phase 6 ([[lex.phases]]), adjacent string literals are
448
  concatenated. If both string literals have the same *encoding-prefix*,
449
  the resulting concatenated string literal has that *encoding-prefix*. If
450
  one string literal has no *encoding-prefix*, it is treated as a string
451
  literal of the same *encoding-prefix* as the other operand. If a UTF-8
452
  string literal token is adjacent to a wide string literal token, the
453
- program is ill-formed. Any other concatenations are conditionally
454
- supported with *implementation-defined* behavior. This concatenation is
455
- an interpretation, not a conversion. Because the interpretation happens
456
- in translation phase 6 (after each character from a literal has been
457
- translated into a value from the appropriate character set), a string
458
- literal’s initial rawness has no effect on the interpretation or
459
- well-formedness of the concatenation. Table  [[tab:lex.string.concat]]
460
- has some examples of valid concatenations.
461
 
462
  **Table: String literal concatenations** <a id="tab:lex.string.concat">[tab:lex.string.concat]</a>
463
 
464
  | | | | | | |
465
  | -------------------------- | ----- | -------------------------- | ----- | -------------------------- | ----- |
@@ -539,10 +565,11 @@ user-defined-literal:
539
  ``` bnf
540
  user-defined-integer-literal:
541
  decimal-literal ud-suffix
542
  octal-literal ud-suffix
543
  hexadecimal-literal ud-suffix
 
544
  ```
545
 
546
  ``` bnf
547
  user-defined-floating-literal:
548
  fractional-constant exponent-partₒₚₜ ud-suffix
@@ -587,11 +614,11 @@ call of the form
587
  operator "" X(nULL)
588
  ```
589
 
590
  Otherwise, *S* shall contain a raw literal operator or a literal
591
  operator template ([[over.literal]]) but not both. If *S* contains a
592
- raw literal operator, the *literal* *L* is treated as a call of the form
593
 
594
  ``` cpp
595
  operator "" X("n{"})
596
  ```
597
 
@@ -652,11 +679,11 @@ literal *L* is treated as a call of the form
652
  operator "" X(ch{})
653
  ```
654
 
655
  ``` cpp
656
  long double operator "" _w(long double);
657
- std::string operator "" _w(const char16_t*, size_t);
658
  unsigned operator "" _w(const char*);
659
  int main() {
660
  1.2_w; // calls operator "" _w(1.2L)
661
  u"one"_w; // calls operator "" _w(u"one", 3)
662
  12_w; // calls operator "" _w("12")
@@ -688,10 +715,11 @@ standardization ([[usrlit.suffix]]). A program containing such a
688
  <!-- Link reference definitions -->
689
  [basic.fundamental]: basic.md#basic.fundamental
690
  [basic.link]: basic.md#basic.link
691
  [basic.lookup.unqual]: basic.md#basic.lookup.unqual
692
  [basic.stc]: basic.md#basic.stc
 
693
  [charname.allowed]: charname.md#charname.allowed
694
  [charname.disallowed]: charname.md#charname.disallowed
695
  [conv.mem]: conv.md#conv.mem
696
  [conv.ptr]: conv.md#conv.ptr
697
  [cpp]: cpp.md#cpp
@@ -730,11 +758,11 @@ standardization ([[usrlit.suffix]]). A program containing such a
730
  [tab:alternative.tokens]: #tab:alternative.tokens
731
  [tab:escape.sequences]: #tab:escape.sequences
732
  [tab:identifiers.special]: #tab:identifiers.special
733
  [tab:keywords]: #tab:keywords
734
  [tab:lex.string.concat]: #tab:lex.string.concat
735
- [tab:lex.type.integer.constant]: #tab:lex.type.integer.constant
736
  [tab:trigraph.sequences]: #tab:trigraph.sequences
737
  [temp.explicit]: temp.md#temp.explicit
738
  [temp.names]: temp.md#temp.names
739
  [usrlit.suffix]: library.md#usrlit.suffix
740
 
 
17
 
18
  ### Integer literals <a id="lex.icon">[[lex.icon]]</a>
19
 
20
  ``` bnf
21
  integer-literal:
22
+ binary-literal integer-suffixₒₚₜ
23
  octal-literal integer-suffixₒₚₜ
24
+ decimal-literal integer-suffixₒₚₜ
25
  hexadecimal-literal integer-suffixₒₚₜ
26
  ```
27
 
28
+ ``` bnf
29
+ binary-literal:
30
+ '0b' binary-digit
31
+ '0B' binary-digit
32
+ binary-literal '''ₒₚₜ binary-digit
33
+ ```
34
+
35
+ ``` bnf
36
+ octal-literal:
37
+ '0'
38
+ octal-literal '''ₒₚₜ octal-digit
39
+ ```
40
+
41
  ``` bnf
42
  decimal-literal:
43
  nonzero-digit
44
+ decimal-literal '''ₒₚₜ digit
 
 
 
 
 
 
45
  ```
46
 
47
  ``` bnf
48
  hexadecimal-literal:
49
  '0x' hexadecimal-digit
50
  '0X' hexadecimal-digit
51
+ hexadecimal-literal '''ₒₚₜ hexadecimal-digit
52
  ```
53
 
54
  ``` bnf
55
+ binary-digit:
56
+ '0'
57
+ '1'
58
  ```
59
 
60
  ``` bnf
61
  octal-digit: one of
62
  '0 1 2 3 4 5 6 7'
63
  ```
64
 
65
+ ``` bnf
66
+ nonzero-digit: one of
67
+ '1 2 3 4 5 6 7 8 9'
68
+ ```
69
+
70
  ``` bnf
71
  hexadecimal-digit: one of
72
  '0 1 2 3 4 5 6 7 8 9'
73
  'a b c d e f'
74
  'A B C D E F'
 
96
  long-long-suffix: one of
97
  'll LL'
98
  ```
99
 
100
  An *integer literal* is a sequence of digits that has no period or
101
+ exponent part, with optional separating single quotes that are ignored
102
+ when determining its value. An integer literal may have a prefix that
103
+ specifies its base and a suffix that specifies its type. The lexically
104
+ first digit of the sequence of digits is the most significant. A
105
+ *binary* integer literal (base two) begins with `0b` or `0B` and
106
+ consists of a sequence of binary digits. An *octal* integer literal
107
+ (base eight) begins with the digit `0` and consists of a sequence of
108
+ octal digits.[^12] A *decimal* integer literal (base ten) begins with a
109
+ digit other than `0` and consists of a sequence of decimal digits. A
110
+ *hexadecimal* integer literal (base sixteen) begins with `0x` or `0X`
111
+ and consists of a sequence of hexadecimal digits, which include the
112
+ decimal digits and the letters `a` through `f` and `A` through `F` with
113
+ decimal values ten through fifteen. The number twelve can be written
114
+ `12`, `014`, `0XC`, or `0b1100`. The literals `1048576`, `1'048'576`,
115
+ `0X100000`, `0x10'0000`, and `0'004'000'000` all have the same value.
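
The equivalences stated in the paragraph above can be checked at compile time. The following is a minimal sketch, assuming a compiler in C++14 mode or later (binary prefixes and digit separators are not accepted before that); the variable names are illustrative and do not come from the standard text.

```cpp
// Four spellings of twelve: decimal, octal, hexadecimal, binary.
constexpr int twelve_dec = 12;
constexpr int twelve_oct = 014;
constexpr int twelve_hex = 0XC;
constexpr int twelve_bin = 0b1100;

static_assert(twelve_dec == twelve_oct, "octal 014 is twelve");
static_assert(twelve_dec == twelve_hex, "hexadecimal 0XC is twelve");
static_assert(twelve_dec == twelve_bin, "binary 0b1100 is twelve");

// Single quotes between digits are ignored when the value is determined,
// regardless of where they are placed or which base is used.
constexpr long grouped_dec = 1'048'576;
constexpr long grouped_hex = 0x10'0000;
constexpr long grouped_oct = 0'004'000'000;

static_assert(grouped_dec == 1048576, "separators do not change a decimal value");
static_assert(grouped_hex == 0X100000, "separators do not change a hexadecimal value");
static_assert(grouped_oct == 1048576, "separators do not change an octal value");
```

Because every check is a `static_assert`, a translation unit containing this block either compiles cleanly or reports which equivalence failed.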
116
 
117
  The type of an integer literal is the first of the corresponding list in
118
+ Table  [[tab:lex.type.integer.literal]] in which its value can be
119
  represented.
120
 
121
+ **Table: Types of integer literals** <a id="tab:lex.type.integer.literal">[tab:lex.type.integer.literal]</a>
122
 
123
  | | | |
124
  | ---------------- | ------------------------ | ------------------------ |
125
  | none | `int` | `int` |
126
  | | `long int` | `unsigned int` |
 
199
  A character literal is one or more characters enclosed in single quotes,
200
  as in `'x'`, optionally preceded by one of the letters `u`, `U`, or `L`,
201
  as in `u'y'`, `U'z'`, or `L'x'`, respectively. A character literal that
202
  does not begin with `u`, `U`, or `L` is an ordinary character literal,
203
  also referred to as a narrow-character literal. An ordinary character
204
+ literal that contains a single *c-char* representable in the execution
205
+ character set has type `char`, with value equal to the numerical value
206
+ of the encoding of the *c-char* in the execution character set. An
207
+ ordinary character literal that contains more than one *c-char* is a
208
+ *multicharacter literal*. A multicharacter literal, or an ordinary
209
+ character literal containing a single *c-char* not representable in the
210
+ execution character set, is conditionally-supported, has type `int`, and
211
+ has an *implementation-defined* value.
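
The types named in this paragraph and the next can be observed with `decltype`. This is a small sketch under the assumption of a C++11 or later compiler; `narrow_literal_is_char` is a hypothetical name used only for illustration. Multicharacter literals such as `'ab'` are deliberately left out of the executable checks because they are conditionally-supported and many compilers warn about them.

```cpp
#include <type_traits>

// An ordinary character literal with a single representable c-char has type char.
constexpr bool narrow_literal_is_char =
    std::is_same<decltype('x'), char>::value;
static_assert(narrow_literal_is_char, "'x' has type char");

// The prefixes u, U, and L select char16_t, char32_t, and wchar_t respectively.
static_assert(std::is_same<decltype(u'y'), char16_t>::value, "u'y' has type char16_t");
static_assert(std::is_same<decltype(U'z'), char32_t>::value, "U'z' has type char32_t");
static_assert(std::is_same<decltype(L'x'), wchar_t>::value, "L'x' has type wchar_t");

// A multicharacter literal such as 'ab' would have type int and an
// implementation-defined value where it is supported at all.
```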
212
 
213
  A character literal that begins with the letter `u`, such as `u'y'`, is
214
  a character literal of type `char16_t`. The value of a `char16_t`
215
  literal containing a single *c-char* is equal to its ISO 10646 code
216
  point value, provided that the code point is representable with a single
 
312
  ```
313
 
314
  ``` bnf
315
  digit-sequence:
316
  digit
317
+ digit-sequence '''ₒₚₜ digit
318
  ```
319
 
320
  ``` bnf
321
  floating-suffix: one of
322
  'f l F L'
323
  ```
324
 
325
  A floating literal consists of an integer part, a decimal point, a
326
  fraction part, an `e` or `E`, an optionally signed integer exponent, and
327
  an optional type suffix. The integer and fraction parts both consist of
328
+ a sequence of decimal (base ten) digits. Optional separating single
329
+ quotes in a *digit-sequence* are ignored when determining its value. The
330
+ literals `1.602'176'565e-19` and `1.602176565e-19` have the same value.
331
+ Either the integer part or the fraction part (not both) can be omitted;
332
+ either the decimal point or the letter `e` (or `E` ) and the exponent
333
+ (not both) can be omitted. The integer part, the optional decimal point
334
+ and the optional fraction part form the *significant part* of the
335
+ floating literal. The exponent, if present, indicates the power of 10 by
336
+ which the significant part is to be scaled. If the scaled value is in
337
+ the range of representable values for its type, the result is the scaled
338
+ value if representable, else the larger or smaller representable value
339
+ nearest the scaled value, chosen in an *implementation-defined* manner.
340
+ The type of a floating literal is `double` unless explicitly specified
341
+ by a suffix. The suffixes `f` and `F` specify `float`, the suffixes `l`
342
+ and `L` specify `long` `double`. If the scaled value is not in the range
343
+ of representable values for its type, the program is ill-formed.
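
As an illustration of the rules in this paragraph, the sketch below checks that separators in a *digit-sequence* do not affect the value, that either the integer part or the fraction part can be omitted, and that the suffix selects the type. It assumes C++14 or later; the constant names are illustrative only.

```cpp
#include <type_traits>

// The two spellings from the text denote the same value.
constexpr double charge_grouped = 1.602'176'565e-19;
constexpr double charge_plain = 1.602176565e-19;
static_assert(charge_grouped == charge_plain, "separators are ignored");

// Either the integer part or the fraction part may be omitted, and an
// exponent can stand in for the decimal point.
static_assert(.5 == 0.5, "integer part omitted");
static_assert(5. == 5.0, "fraction part omitted");
static_assert(5e0 == 5.0, "decimal point omitted, exponent present");

// No suffix means double; f/F mean float; l/L mean long double.
static_assert(std::is_same<decltype(1.0), double>::value, "unsuffixed is double");
static_assert(std::is_same<decltype(1.0f), float>::value, "f suffix is float");
static_assert(std::is_same<decltype(1.0L), long double>::value, "L suffix is long double");
```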
344
 
345
  ### String literals <a id="lex.string">[[lex.string]]</a>
346
 
347
  ``` bnf
348
  string-literal:
 
435
  After translation phase 6, a string literal that does not begin with an
436
  *encoding-prefix* is an ordinary string literal, and is initialized with
437
  the given characters.
438
 
439
  A string literal that begins with `u8`, such as `u8"asdf"`, is a UTF-8
440
+ string literal.
 
441
 
442
  Ordinary string literals and UTF-8 string literals are also referred to
443
  as narrow string literals. A narrow string literal has type “array of
444
  *n* `const char`”, where *n* is the size of the string as defined below,
445
  and has static storage duration ([[basic.stc]]).
446
 
447
+ For a UTF-8 string literal, each successive element of the object
448
+ representation ([[basic.types]]) has the value of the corresponding
449
+ code unit of the UTF-8 encoding of the string.
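
To make the code-unit wording concrete, the sketch below inspects a UTF-8 string literal holding U+00E9, whose UTF-8 encoding is the two code units 0xC3 0xA9. The element type is `char` in the revision shown here and `char8_t` from C++20 onward, so the code casts to `unsigned char` to work under either; the constant names are hypothetical.

```cpp
#include <cstddef>

// Two code units for U+00E9 plus the terminating null.
constexpr std::size_t e_acute_size = sizeof(u8"\u00e9");
static_assert(e_acute_size == 3, "two UTF-8 code units and a terminator");

// Each array element holds one code unit of the UTF-8 encoding.
constexpr unsigned char first_unit = static_cast<unsigned char>(u8"\u00e9"[0]);
constexpr unsigned char second_unit = static_cast<unsigned char>(u8"\u00e9"[1]);
static_assert(first_unit == 0xC3, "lead byte of U+00E9");
static_assert(second_unit == 0xA9, "continuation byte of U+00E9");
```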
450
+
451
  A string literal that begins with `u`, such as `u"asdf"`, is a
452
  `char16_t` string literal. A `char16_t` string literal has type “array
453
  of *n* `const char16_t`”, where *n* is the size of the string as defined
454
  below; it has static storage duration and is initialized with the given
455
  characters. A single *c-char* may produce more than one `char16_t`
 
474
  concatenated. If both string literals have the same *encoding-prefix*,
475
  the resulting concatenated string literal has that *encoding-prefix*. If
476
  one string literal has no *encoding-prefix*, it is treated as a string
477
  literal of the same *encoding-prefix* as the other operand. If a UTF-8
478
  string literal token is adjacent to a wide string literal token, the
479
+ program is ill-formed. Any other concatenations are
480
+ conditionally-supported with *implementation-defined* behavior. This
481
+ concatenation is an interpretation, not a conversion. Because the
482
+ interpretation happens in translation phase 6 (after each character from
483
+ a literal has been translated into a value from the appropriate
484
+ character set), a string literal’s initial rawness has no effect on the
485
+ interpretation or well-formedness of the concatenation. Table 
486
+ [[tab:lex.string.concat]] has some examples of valid concatenations.
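
The prefix rules for concatenation can also be observed through the type of the concatenated literal. The following is a sketch assuming C++11 or later; it covers only the well-defined cases (same prefix, or one operand without a prefix), and `joined_units` is an illustrative name.

```cpp
#include <cstddef>
#include <type_traits>

// An unprefixed operand takes the encoding-prefix of its neighbour,
// so the result is a char16_t string of four characters plus a null.
static_assert(std::is_same<decltype(u"ab" "cd"), const char16_t (&)[5]>::value,
              "u\"ab\" \"cd\" is a char16_t string literal");

// The same holds when the prefixed operand comes second.
static_assert(std::is_same<decltype("ab" L"cd"), const wchar_t (&)[5]>::value,
              "\"ab\" L\"cd\" is a wide string literal");

// Number of elements in the concatenated char16_t literal.
constexpr std::size_t joined_units = sizeof(u"ab" "cd") / sizeof(char16_t);
static_assert(joined_units == 5, "four characters and a terminator");
```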
487
 
488
  **Table: String literal concatenations** <a id="tab:lex.string.concat">[tab:lex.string.concat]</a>
489
 
490
  | | | | | | |
491
  | -------------------------- | ----- | -------------------------- | ----- | -------------------------- | ----- |
 
565
  ``` bnf
566
  user-defined-integer-literal:
567
  decimal-literal ud-suffix
568
  octal-literal ud-suffix
569
  hexadecimal-literal ud-suffix
570
+ binary-literal ud-suffix
571
  ```
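
With the added production, a binary literal can carry a *ud-suffix* like any other integer literal. The sketch below is an assumption-laden illustration, not text from the standard: the suffixes `_kib` and `_len` and the helper `spelling_length` are hypothetical. It shows a literal operator taking `unsigned long long` (receiving the value) next to a raw literal operator taking `const char*` (receiving the spelling of the literal without its suffix).

```cpp
#include <cstddef>

// Literal operator: receives the numeric value of the literal.
constexpr unsigned long long operator""_kib(unsigned long long n) {
    return n * 1024;
}

// Helper that counts characters up to the terminating null.
constexpr std::size_t spelling_length(const char* s) {
    return *s ? 1 + spelling_length(s + 1) : 0;
}

// Raw literal operator: receives the literal's spelling as a string.
constexpr std::size_t operator""_len(const char* s) {
    return spelling_length(s);
}

static_assert(0b100_kib == 4096, "0b100 is four, times 1024");
static_assert(0b100_len == 5, "the spelling \"0b100\" has five characters");
```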
572
 
573
  ``` bnf
574
  user-defined-floating-literal:
575
  fractional-constant exponent-partₒₚₜ ud-suffix
 
614
  operator "" X(nULL)
615
  ```
616
 
617
  Otherwise, *S* shall contain a raw literal operator or a literal
618
  operator template ([[over.literal]]) but not both. If *S* contains a
619
+ raw literal operator, the literal *L* is treated as a call of the form
620
 
621
  ``` cpp
622
  operator "" X("n{"})
623
  ```
624
 
 
679
  operator "" X(ch{})
680
  ```
681
 
682
  ``` cpp
683
  long double operator "" _w(long double);
684
+ std::string operator "" _w(const char16_t*, std::size_t);
685
  unsigned operator "" _w(const char*);
686
  int main() {
687
  1.2_w; // calls operator "" _w(1.2L)
688
  u"one"_w; // calls operator "" _w(u"one", 3)
689
  12_w; // calls operator "" _w("12")
 
715
  <!-- Link reference definitions -->
716
  [basic.fundamental]: basic.md#basic.fundamental
717
  [basic.link]: basic.md#basic.link
718
  [basic.lookup.unqual]: basic.md#basic.lookup.unqual
719
  [basic.stc]: basic.md#basic.stc
720
+ [basic.types]: basic.md#basic.types
721
  [charname.allowed]: charname.md#charname.allowed
722
  [charname.disallowed]: charname.md#charname.disallowed
723
  [conv.mem]: conv.md#conv.mem
724
  [conv.ptr]: conv.md#conv.ptr
725
  [cpp]: cpp.md#cpp
 
758
  [tab:alternative.tokens]: #tab:alternative.tokens
759
  [tab:escape.sequences]: #tab:escape.sequences
760
  [tab:identifiers.special]: #tab:identifiers.special
761
  [tab:keywords]: #tab:keywords
762
  [tab:lex.string.concat]: #tab:lex.string.concat
763
+ [tab:lex.type.integer.literal]: #tab:lex.type.integer.literal
764
  [tab:trigraph.sequences]: #tab:trigraph.sequences
765
  [temp.explicit]: temp.md#temp.explicit
766
  [temp.names]: temp.md#temp.names
767
  [usrlit.suffix]: library.md#usrlit.suffix
768