From Jason Turner

[lex]


Files changed (1)
  1. tmp/tmpk1cv93wi/{from.md → to.md} +108 -77
tmp/tmpk1cv93wi/{from.md → to.md} RENAMED
@@ -38,17 +38,18 @@ following phases.[^1]
38
  notation), are handled equivalently except where this replacement is
39
  reverted in a raw string literal.)
40
  2. Each instance of a backslash character (\\) immediately followed by a
41
  new-line character is deleted, splicing physical source lines to
42
  form logical source lines. Only the last backslash on any physical
43
- source line shall be eligible for being part of such a splice. If,
44
- as a result, a character sequence that matches the syntax of a
45
- universal-character-name is produced, the behavior is undefined. A
46
- source file that is not empty and that does not end in a new-line
47
- character, or that ends in a new-line character immediately preceded
48
- by a backslash character before any such splicing takes place, shall
49
- be processed as if an additional new-line character were appended to
 
50
  the file.
51
  3. The source file is decomposed into preprocessing tokens (
52
  [[lex.pptoken]]) and sequences of white-space characters (including
53
  comments). A source file shall not end in a partial preprocessing
54
  token or in a partial comment.[^2] Each comment is replaced by one
@@ -117,11 +118,11 @@ a b c d e f g h i j k l m n o p q r s t u v w x y z
117
 
118
  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
119
 
120
  0 1 2 3 4 5 6 7 8 9
121
 
122
- _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | \sim ! = , \" '
123
  ```
124
 
125
  The *universal-character-name* construct provides a way to name other
126
  characters.
127
 
@@ -293,16 +294,16 @@ characters.
293
 
294
  ## Comments <a id="lex.comment">[[lex.comment]]</a>
295
 
296
  The characters `/*` start a comment, which terminates with the
297
  characters `*/`. These comments do not nest. The characters `//` start a
298
- comment, which terminates with the next new-line character. If there is
299
- a form-feed or a vertical-tab character in such a comment, only
300
- white-space characters shall appear between it and the new-line that
301
- terminates the comment; no diagnostic is required. The comment
302
- characters `//`, `/*`, and `*/` have no special meaning within a `//`
303
- comment and are treated just like other characters. Similarly, the
304
  comment characters `//` and `/*` have no special meaning within a `/*`
305
  comment.
306
 
307
  ## Header names <a id="lex.header">[[lex.header]]</a>
308
 
@@ -340,11 +341,11 @@ of *header-name*s are mapped in an *implementation-defined* manner to
340
  headers or to external source file names as specified in 
341
  [[cpp.include]].
342
 
343
  The appearance of either of the characters `'` or `\` or of either of
344
  the character sequences `/*` or `//` in a *q-char-sequence* or an
345
- *h-char-sequence* is conditionally supported with implementation-defined
346
  semantics, as is the appearance of the character `"` in an
347
  *h-char-sequence*.[^9]
348
 
349
  ## Preprocessing numbers <a id="lex.ppnumber">[[lex.ppnumber]]</a>
350
 
@@ -352,20 +353,22 @@ semantics, as is the appearance of the character `"` in an
352
  pp-number:
353
  digit
354
  '.' digit
355
  pp-number digit
356
  pp-number identifier-nondigit
 
 
357
  pp-number 'e' sign
358
  pp-number 'E' sign
359
  pp-number '.'
360
  ```
361
 
362
- Preprocessing number tokens lexically include all integral literal
363
  tokens ([[lex.icon]]) and all floating literal tokens ([[lex.fcon]]).
364
 
365
  A preprocessing number does not have a type or a value; it acquires both
366
- after a successful conversion to an integral literal token or a floating
367
  literal token.
368
 
369
  ## Identifiers <a id="lex.name">[[lex.name]]</a>
370
 
371
  ``` bnf
@@ -404,13 +407,13 @@ into one of the ranges specified in  [[charname.disallowed]]. Upper- and
404
  lower-case letters are different. All characters are significant.[^10]
405
 
406
  The identifiers in Table  [[tab:identifiers.special]] have a special
407
  meaning when appearing in a certain context. When referred to in the
408
  grammar, these identifiers are used explicitly rather than using the
409
- *identifier* grammar production. any ambiguity as to whether a given
410
- *identifier* has a special meaning is resolved to interpret the token as
411
- a regular *identifier*.
412
 
413
  **Table: Identifiers with special meaning** <a id="tab:identifiers.special">[tab:identifiers.special]</a>
414
 
415
  | | |
416
  | ---------- | ------- |
@@ -489,44 +492,58 @@ literal:
489
 
490
  ### Integer literals <a id="lex.icon">[[lex.icon]]</a>
491
 
492
  ``` bnf
493
  integer-literal:
494
- decimal-literal integer-suffixₒₚₜ
495
  octal-literal integer-suffixₒₚₜ
 
496
  hexadecimal-literal integer-suffixₒₚₜ
497
  ```
498
 
499
  ``` bnf
500
  decimal-literal:
501
  nonzero-digit
502
- decimal-literal digit
503
- ```
504
-
505
- ``` bnf
506
- octal-literal:
507
- '0'
508
- octal-literal octal-digit
509
  ```
510
 
511
  ``` bnf
512
  hexadecimal-literal:
513
  '0x' hexadecimal-digit
514
  '0X' hexadecimal-digit
515
- hexadecimal-literal hexadecimal-digit
516
  ```
517
 
518
  ``` bnf
519
- nonzero-digit: one of
520
- '1 2 3 4 5 6 7 8 9'
 
521
  ```
522
 
523
  ``` bnf
524
  octal-digit: one of
525
  '0 1 2 3 4 5 6 7'
526
  ```
527
 
528
  ``` bnf
529
  hexadecimal-digit: one of
530
  '0 1 2 3 4 5 6 7 8 9'
531
  'a b c d e f'
532
  'A B C D E F'
@@ -554,27 +571,31 @@ long-suffix: one of
554
  long-long-suffix: one of
555
  'll LL'
556
  ```
557
 
558
  An *integer literal* is a sequence of digits that has no period or
559
- exponent part. An integer literal may have a prefix that specifies its
560
- base and a suffix that specifies its type. The lexically first digit of
561
- the sequence of digits is the most significant. A *decimal* integer
562
- literal (base ten) begins with a digit other than `0` and consists of a
563
- sequence of decimal digits. An *octal* integer literal (base eight)
564
- begins with the digit `0` and consists of a sequence of octal
565
- digits.[^12] A *hexadecimal* integer literal (base sixteen) begins with
566
- `0x` or `0X` and consists of a sequence of hexadecimal digits, which
567
- include the decimal digits and the letters `a` through `f` and `A`
568
- through `F` with decimal values ten through fifteen. the number twelve
569
- can be written `12`, `014`, or `0XC`.
 
 
 
 
570
 
571
  The type of an integer literal is the first of the corresponding list in
572
- Table  [[tab:lex.type.integer.constant]] in which its value can be
573
  represented.
574
 
575
- **Table: Types of integer constants** <a id="tab:lex.type.integer.constant">[tab:lex.type.integer.constant]</a>
576
 
577
  | | | |
578
  | ---------------- | ------------------------ | ------------------------ |
579
  | none | `int` | `int` |
580
  | | `long int` | `unsigned int` |
@@ -653,15 +674,18 @@ hexadecimal-escape-sequence:
653
  A character literal is one or more characters enclosed in single quotes,
654
  as in `'x'`, optionally preceded by one of the letters `u`, `U`, or `L`,
655
  as in `u'y'`, `U'z'`, or `L'x'`, respectively. A character literal that
656
  does not begin with `u`, `U`, or `L` is an ordinary character literal,
657
  also referred to as a narrow-character literal. An ordinary character
658
- literal that contains a single *c-char* has type `char`, with value
659
- equal to the numerical value of the encoding of the *c-char* in the
660
- execution character set. An ordinary character literal that contains
661
- more than one *c-char* is a *multicharacter literal*. A multicharacter
662
- literal has type `int` and *implementation-defined* value.
 
 
 
663
 
664
  A character literal that begins with the letter `u`, such as `u'y'`, is
665
  a character literal of type `char16_t`. The value of a `char16_t`
666
  literal containing a single *c-char* is equal to its ISO 10646 code
667
  point value, provided that the code point is representable with a single
@@ -763,35 +787,37 @@ sign: one of
763
  ```
764
 
765
  ``` bnf
766
  digit-sequence:
767
  digit
768
- digit-sequence digit
769
  ```
770
 
771
  ``` bnf
772
  floating-suffix: one of
773
  'f l F L'
774
  ```
775
 
776
  A floating literal consists of an integer part, a decimal point, a
777
  fraction part, an `e` or `E`, an optionally signed integer exponent, and
778
  an optional type suffix. The integer and fraction parts both consist of
779
- a sequence of decimal (base ten) digits. Either the integer part or the
780
- fraction part (not both) can be omitted; either the decimal point or the
781
- letter `e` (or `E`) and the exponent (not both) can be omitted. The
782
- integer part, the optional decimal point and the optional fraction part
783
- form the *significant part* of the floating literal. The exponent, if
784
- present, indicates the power of 10 by which the significant part is to
785
- be scaled. If the scaled value is in the range of representable values
786
- for its type, the result is the scaled value if representable, else the
787
- larger or smaller representable value nearest the scaled value, chosen
788
- in an *implementation-defined* manner. The type of a floating literal is
789
- `double` unless explicitly specified by a suffix. The suffixes `f` and
790
- `F` specify `float`, the suffixes `l` and `L` specify `long double`.
791
- If the scaled value is not in the range of representable values for its
792
- type, the program is ill-formed.
 
 
793
 
794
  ### String literals <a id="lex.string">[[lex.string]]</a>
795
 
796
  ``` bnf
797
  string-literal:
@@ -884,18 +910,21 @@ is equivalent to `"\n)\?\?=\"\n"`.
884
  After translation phase 6, a string literal that does not begin with an
885
  *encoding-prefix* is an ordinary string literal, and is initialized with
886
  the given characters.
887
 
888
  A string literal that begins with `u8`, such as `u8"asdf"`, is a UTF-8
889
- string literal and is initialized with the given characters as encoded
890
- in UTF-8.
891
 
892
  Ordinary string literals and UTF-8 string literals are also referred to
893
  as narrow string literals. A narrow string literal has type “array of
894
  *n* `const char`”, where *n* is the size of the string as defined below,
895
  and has static storage duration ([[basic.stc]]).
896
 
 
 
 
 
897
  A string literal that begins with `u`, such as `u"asdf"`, is a
898
  `char16_t` string literal. A `char16_t` string literal has type “array
899
  of *n* `const char16_t`”, where *n* is the size of the string as defined
900
  below; it has static storage duration and is initialized with the given
901
  characters. A single *c-char* may produce more than one `char16_t`
@@ -920,18 +949,18 @@ In translation phase 6 ([[lex.phases]]), adjacent string literals are
920
  concatenated. If both string literals have the same *encoding-prefix*,
921
  the resulting concatenated string literal has that *encoding-prefix*. If
922
  one string literal has no *encoding-prefix*, it is treated as a string
923
  literal of the same *encoding-prefix* as the other operand. If a UTF-8
924
  string literal token is adjacent to a wide string literal token, the
925
- program is ill-formed. Any other concatenations are conditionally
926
- supported with *implementation-defined* behavior. This concatenation is
927
- an interpretation, not a conversion. Because the interpretation happens
928
- in translation phase 6 (after each character from a literal has been
929
- translated into a value from the appropriate character set), a string
930
- literal’s initial rawness has no effect on the interpretation or
931
- well-formedness of the concatenation. Table  [[tab:lex.string.concat]]
932
- has some examples of valid concatenations.
933
 
934
  **Table: String literal concatenations** <a id="tab:lex.string.concat">[tab:lex.string.concat]</a>
935
 
936
  | | | | | | |
937
  | -------------------------- | ----- | -------------------------- | ----- | -------------------------- | ----- |
@@ -1011,10 +1040,11 @@ user-defined-literal:
1011
  ``` bnf
1012
  user-defined-integer-literal:
1013
  decimal-literal ud-suffix
1014
  octal-literal ud-suffix
1015
  hexadecimal-literal ud-suffix
 
1016
  ```
1017
 
1018
  ``` bnf
1019
  user-defined-floating-literal:
1020
  fractional-constant exponent-partₒₚₜ ud-suffix
@@ -1059,11 +1089,11 @@ call of the form
1059
  operator "" X(nULL)
1060
  ```
1061
 
1062
  Otherwise, *S* shall contain a raw literal operator or a literal
1063
  operator template ([[over.literal]]) but not both. If *S* contains a
1064
- raw literal operator, the *literal* *L* is treated as a call of the form
1065
 
1066
  ``` cpp
1067
  operator "" X("n")
1068
  ```
1069
 
@@ -1124,11 +1154,11 @@ literal *L* is treated as a call of the form
1124
  operator "" X(ch)
1125
  ```
1126
 
1127
  ``` cpp
1128
  long double operator "" _w(long double);
1129
- std::string operator "" _w(const char16_t*, size_t);
1130
  unsigned operator "" _w(const char*);
1131
  int main() {
1132
  1.2_w; // calls operator "" _w(1.2L)
1133
  u"one"_w; // calls operator "" _w(u"one", 3)
1134
  12_w; // calls operator "" _w("12")
@@ -1160,10 +1190,11 @@ standardization ([[usrlit.suffix]]). A program containing such a
1160
  <!-- Link reference definitions -->
1161
  [basic.fundamental]: basic.md#basic.fundamental
1162
  [basic.link]: basic.md#basic.link
1163
  [basic.lookup.unqual]: basic.md#basic.lookup.unqual
1164
  [basic.stc]: basic.md#basic.stc
 
1165
  [charname.allowed]: charname.md#charname.allowed
1166
  [charname.disallowed]: charname.md#charname.disallowed
1167
  [conv.mem]: conv.md#conv.mem
1168
  [conv.ptr]: conv.md#conv.ptr
1169
  [cpp]: cpp.md#cpp
@@ -1202,11 +1233,11 @@ standardization ([[usrlit.suffix]]). A program containing such a
1202
  [tab:alternative.tokens]: #tab:alternative.tokens
1203
  [tab:escape.sequences]: #tab:escape.sequences
1204
  [tab:identifiers.special]: #tab:identifiers.special
1205
  [tab:keywords]: #tab:keywords
1206
  [tab:lex.string.concat]: #tab:lex.string.concat
1207
- [tab:lex.type.integer.constant]: #tab:lex.type.integer.constant
1208
  [tab:trigraph.sequences]: #tab:trigraph.sequences
1209
  [temp.explicit]: temp.md#temp.explicit
1210
  [temp.names]: temp.md#temp.names
1211
  [usrlit.suffix]: library.md#usrlit.suffix
1212
 
 
38
  notation), are handled equivalently except where this replacement is
39
  reverted in a raw string literal.)
40
  2. Each instance of a backslash character (\\) immediately followed by a
41
  new-line character is deleted, splicing physical source lines to
42
  form logical source lines. Only the last backslash on any physical
43
+ source line shall be eligible for being part of such a splice.
44
+ Except for splices reverted in a raw string literal, if a splice
45
+ results in a character sequence that matches the syntax of a
46
+ universal-character-name, the behavior is undefined. A source file
47
+ that is not empty and that does not end in a new-line character, or
48
+ that ends in a new-line character immediately preceded by a
49
+ backslash character before any such splicing takes place, shall be
50
+ processed as if an additional new-line character were appended to
51
  the file.
52
  3. The source file is decomposed into preprocessing tokens (
53
  [[lex.pptoken]]) and sequences of white-space characters (including
54
  comments). A source file shall not end in a partial preprocessing
55
  token or in a partial comment.[^2] Each comment is replaced by one
 
118
 
119
  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
120
 
121
  0 1 2 3 4 5 6 7 8 9
122
 
123
+ _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '
124
  ```
125
 
126
  The *universal-character-name* construct provides a way to name other
127
  characters.
128
 
 
294
 
295
  ## Comments <a id="lex.comment">[[lex.comment]]</a>
296
 
297
  The characters `/*` start a comment, which terminates with the
298
  characters `*/`. These comments do not nest. The characters `//` start a
299
+ comment, which terminates immediately before the next new-line
300
+ character. If there is a form-feed or a vertical-tab character in such a
301
+ comment, only white-space characters shall appear between it and the
302
+ new-line that terminates the comment; no diagnostic is required. The
303
+ comment characters `//`, `/*`, and `*/` have no special meaning within a
304
+ `//` comment and are treated just like other characters. Similarly, the
305
  comment characters `//` and `/*` have no special meaning within a `/*`
306
  comment.
307
 
308
  ## Header names <a id="lex.header">[[lex.header]]</a>
309
 
 
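Reviewer note (illustration only, not part of the diff): a minimal sketch of the comment rules quoted in the hunk above, assuming any conforming C++ compiler. The function name `comment_demo` is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper used only for this illustration.
int comment_demo() {
    int a = 1; // a "/*" in a line comment does not open a block comment
    int b = 2  /* a "//" in a block comment is ordinary text */ + 3;
    return a + b;  // 1 + (2 + 3)
}
```

If either comment marker were honored inside the other kind of comment, `b` would not be initialized from `2 + 3` and the function would not return 6.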
341
  headers or to external source file names as specified in 
342
  [[cpp.include]].
343
 
344
  The appearance of either of the characters `'` or `\` or of either of
345
  the character sequences `/*` or `//` in a *q-char-sequence* or an
346
+ *h-char-sequence* is conditionally-supported with implementation-defined
347
  semantics, as is the appearance of the character `"` in an
348
  *h-char-sequence*.[^9]
349
 
350
  ## Preprocessing numbers <a id="lex.ppnumber">[[lex.ppnumber]]</a>
351
 
 
353
  pp-number:
354
  digit
355
  '.' digit
356
  pp-number digit
357
  pp-number identifier-nondigit
358
+ pp-number ''' digit
359
+ pp-number ''' nondigit
360
  pp-number 'e' sign
361
  pp-number 'E' sign
362
  pp-number '.'
363
  ```
364
 
365
+ Preprocessing number tokens lexically include all integer literal
366
  tokens ([[lex.icon]]) and all floating literal tokens ([[lex.fcon]]).
367
 
368
  A preprocessing number does not have a type or a value; it acquires both
369
+ after a successful conversion to an integer literal token or a floating
370
  literal token.
371
 
372
  ## Identifiers <a id="lex.name">[[lex.name]]</a>
373
 
374
  ``` bnf
 
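Reviewer note (illustration only, not part of the diff): a small sketch of what "lexically include" means for pp-numbers, assuming a compiler running in C++14 mode or later. The macro `PP_STR` and both function names are hypothetical. Stringizing shows that each argument is scanned as one preprocessing token, even when it could never become a valid literal.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical macro: stringize one preprocessing token as spelled.
#define PP_STR(x) #x

// "1.2.3" matches the pp-number grammar (digit, '.', digit, ...), although it
// cannot later be converted to an integer or floating literal.
inline const char* odd_pp_number() { return PP_STR(1.2.3); }

// With the productions added in this hunk, a single quote followed by a digit
// continues the same pp-number instead of starting a character literal.
inline const char* separated_pp_number() { return PP_STR(1'000); }
```

Before C++14, the second macro argument would be scanned as `1` followed by an unterminated character literal.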
407
  lower-case letters are different. All characters are significant.[^10]
408
 
409
  The identifiers in Table  [[tab:identifiers.special]] have a special
410
  meaning when appearing in a certain context. When referred to in the
411
  grammar, these identifiers are used explicitly rather than using the
412
+ *identifier* grammar production. Unless otherwise specified, any
413
+ ambiguity as to whether a given *identifier* has a special meaning is
414
+ resolved to interpret the token as a regular *identifier*.
415
 
416
  **Table: Identifiers with special meaning** <a id="tab:identifiers.special">[tab:identifiers.special]</a>
417
 
418
  | | |
419
  | ---------- | ------- |
 
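Reviewer note (illustration only, not part of the diff): a sketch of the "special meaning only in certain contexts" rule, using `override` and `final` as the example identifiers. The type and function names are hypothetical.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() = default;
    virtual int id() const { return 1; }
};

// Here `final` and `override` sit in positions where the grammar names them
// explicitly, so they carry their special meaning.
struct Derived final : Base {
    int id() const override { return 2; }
};

// Elsewhere the same spellings are ordinary identifiers.
inline int special_names_demo() {
    int override = 10;
    int final = 20;
    Derived d;
    const Base& b = d;
    return override + final + b.id();  // 10 + 20 + 2
}
```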
492
 
493
  ### Integer literals <a id="lex.icon">[[lex.icon]]</a>
494
 
495
  ``` bnf
496
  integer-literal:
497
+ binary-literal integer-suffixₒₚₜ
498
  octal-literal integer-suffixₒₚₜ
499
+ decimal-literal integer-suffixₒₚₜ
500
  hexadecimal-literal integer-suffixₒₚₜ
501
  ```
502
 
503
+ ``` bnf
504
+ binary-literal:
505
+ '0b' binary-digit
506
+ '0B' binary-digit
507
+ binary-literal '''ₒₚₜ binary-digit
508
+ ```
509
+
510
+ ``` bnf
511
+ octal-literal:
512
+ '0'
513
+ octal-literal '''ₒₚₜ octal-digit
514
+ ```
515
+
516
  ``` bnf
517
  decimal-literal:
518
  nonzero-digit
519
+ decimal-literal '''ₒₚₜ digit
520
  ```
521
 
522
  ``` bnf
523
  hexadecimal-literal:
524
  '0x' hexadecimal-digit
525
  '0X' hexadecimal-digit
526
+ hexadecimal-literal '''ₒₚₜ hexadecimal-digit
527
  ```
528
 
529
  ``` bnf
530
+ binary-digit:
531
+ '0'
532
+ '1'
533
  ```
534
 
535
  ``` bnf
536
  octal-digit: one of
537
  '0 1 2 3 4 5 6 7'
538
  ```
539
 
540
+ ``` bnf
541
+ nonzero-digit: one of
542
+ '1 2 3 4 5 6 7 8 9'
543
+ ```
544
+
545
  ``` bnf
546
  hexadecimal-digit: one of
547
  '0 1 2 3 4 5 6 7 8 9'
548
  'a b c d e f'
549
  'A B C D E F'
 
571
  long-long-suffix: one of
572
  'll LL'
573
  ```
574
 
575
  An *integer literal* is a sequence of digits that has no period or
576
+ exponent part, with optional separating single quotes that are ignored
577
+ when determining its value. An integer literal may have a prefix that
578
+ specifies its base and a suffix that specifies its type. The lexically
579
+ first digit of the sequence of digits is the most significant. A
580
+ *binary* integer literal (base two) begins with `0b` or `0B` and
581
+ consists of a sequence of binary digits. An *octal* integer literal
582
+ (base eight) begins with the digit `0` and consists of a sequence of
583
+ octal digits.[^12] A *decimal* integer literal (base ten) begins with a
584
+ digit other than `0` and consists of a sequence of decimal digits. A
585
+ *hexadecimal* integer literal (base sixteen) begins with `0x` or `0X`
586
+ and consists of a sequence of hexadecimal digits, which include the
587
+ decimal digits and the letters `a` through `f` and `A` through `F` with
588
+ decimal values ten through fifteen. The number twelve can be written
589
+ `12`, `014`, `0XC`, or `0b1100`. The literals `1048576`, `1'048'576`,
590
+ `0X100000`, `0x10'0000`, and `0'004'000'000` all have the same value.
591
 
592
  The type of an integer literal is the first of the corresponding list in
593
+ Table  [[tab:lex.type.integer.literal]] in which its value can be
594
  represented.
595
 
596
+ **Table: Types of integer literals** <a id="tab:lex.type.integer.literal">[tab:lex.type.integer.literal]</a>
597
 
598
  | | | |
599
  | ---------------- | ------------------------ | ------------------------ |
600
  | none | `int` | `int` |
601
  | | `long int` | `unsigned int` |
 
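Reviewer note (illustration only, not part of the diff): the examples from the revised integer-literal paragraph, written as compile-time checks. This sketch assumes a C++14 compiler; the constant name `twelve_from_binary` is hypothetical.

```cpp
#include <type_traits>

// Four spellings of twelve, one per base.
static_assert(12 == 014 && 014 == 0XC && 0XC == 0b1100, "same value");

// Digit separators are ignored when the value is determined.
static_assert(1048576 == 1'048'576, "decimal with separators");
static_assert(0X100000 == 0x10'0000, "hexadecimal with separators");
static_assert(0'004'000'000 == 1048576, "octal with separators");

// A small unsuffixed literal has type int; a suffix selects another list.
static_assert(std::is_same<decltype(0b1100), int>::value, "int");
static_assert(std::is_same<decltype(12u), unsigned int>::value, "unsigned int");
static_assert(std::is_same<decltype(12LL), long long>::value, "long long");

constexpr int twelve_from_binary = 0b1100;
```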
674
  A character literal is one or more characters enclosed in single quotes,
675
  as in `'x'`, optionally preceded by one of the letters `u`, `U`, or `L`,
676
  as in `u'y'`, `U'z'`, or `L'x'`, respectively. A character literal that
677
  does not begin with `u`, `U`, or `L` is an ordinary character literal,
678
  also referred to as a narrow-character literal. An ordinary character
679
+ literal that contains a single *c-char* representable in the execution
680
+ character set has type `char`, with value equal to the numerical value
681
+ of the encoding of the *c-char* in the execution character set. An
682
+ ordinary character literal that contains more than one *c-char* is a
683
+ *multicharacter literal*. A multicharacter literal, or an ordinary
684
+ character literal containing a single *c-char* not representable in the
685
+ execution character set, is conditionally-supported, has type `int`, and
686
+ has an *implementation-defined* value.
687
 
688
  A character literal that begins with the letter `u`, such as `u'y'`, is
689
  a character literal of type `char16_t`. The value of a `char16_t`
690
  literal containing a single *c-char* is equal to its ISO 10646 code
691
  point value, provided that the code point is representable with a single
 
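Reviewer note (illustration only, not part of the diff): a sketch of the character-literal types described in the hunk above. The multicharacter check assumes an implementation that supports such literals, since the revised wording makes them conditionally-supported; common compilers accept them, usually with a warning. The constant name `multichar_is_int` is hypothetical.

```cpp
#include <type_traits>

// An ordinary character literal with a single c-char has type char.
static_assert(std::is_same<decltype('x'), char>::value, "ordinary");

// Prefixed literals select the other character types.
static_assert(std::is_same<decltype(u'y'), char16_t>::value, "u prefix");
static_assert(std::is_same<decltype(U'z'), char32_t>::value, "U prefix");
static_assert(std::is_same<decltype(L'x'), wchar_t>::value, "L prefix");

// Assumption: the implementation supports multicharacter literals.
// The type is int; the value is implementation-defined and not checked here.
constexpr bool multichar_is_int = std::is_same<decltype('ab'), int>::value;
```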
787
  ```
788
 
789
  ``` bnf
790
  digit-sequence:
791
  digit
792
+ digit-sequence '''ₒₚₜ digit
793
  ```
794
 
795
  ``` bnf
796
  floating-suffix: one of
797
  'f l F L'
798
  ```
799
 
800
  A floating literal consists of an integer part, a decimal point, a
801
  fraction part, an `e` or `E`, an optionally signed integer exponent, and
802
  an optional type suffix. The integer and fraction parts both consist of
803
+ a sequence of decimal (base ten) digits. Optional separating single
804
+ quotes in a *digit-sequence* are ignored when determining its value. The
805
+ literals `1.602'176'565e-19` and `1.602176565e-19` have the same value.
806
+ Either the integer part or the fraction part (not both) can be omitted;
807
+ either the decimal point or the letter `e` (or `E`) and the exponent
808
+ (not both) can be omitted. The integer part, the optional decimal point
809
+ and the optional fraction part form the *significant part* of the
810
+ floating literal. The exponent, if present, indicates the power of 10 by
811
+ which the significant part is to be scaled. If the scaled value is in
812
+ the range of representable values for its type, the result is the scaled
813
+ value if representable, else the larger or smaller representable value
814
+ nearest the scaled value, chosen in an *implementation-defined* manner.
815
+ The type of a floating literal is `double` unless explicitly specified
816
+ by a suffix. The suffixes `f` and `F` specify `float`, the suffixes `l`
817
+ and `L` specify `long double`. If the scaled value is not in the range
818
+ of representable values for its type, the program is ill-formed.
819
 
820
  ### String literals <a id="lex.string">[[lex.string]]</a>
821
 
822
  ``` bnf
823
  string-literal:
 
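Reviewer note (illustration only, not part of the diff): the floating-literal rules from the hunk above as compile-time checks, assuming a C++14 compiler. The constant name `elementary_charge` is hypothetical.

```cpp
#include <type_traits>

// Separators inside a digit-sequence do not change the value.
static_assert(1.602'176'565e-19 == 1.602176565e-19, "separators ignored");

// The type is double unless a suffix says otherwise.
static_assert(std::is_same<decltype(1.0), double>::value, "no suffix");
static_assert(std::is_same<decltype(1.0f), float>::value, "f suffix");
static_assert(std::is_same<decltype(1.0L), long double>::value, "L suffix");

// Either the integer part or the fraction part may be omitted ...
static_assert(.5 == 0.5 && 5. == 5.0, "optional parts");
// ... and so may the decimal point when an exponent is present.
static_assert(1e3 == 1000.0, "exponent without decimal point");

constexpr double elementary_charge = 1.602'176'565e-19;
```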
910
  After translation phase 6, a string literal that does not begin with an
911
  *encoding-prefix* is an ordinary string literal, and is initialized with
912
  the given characters.
913
 
914
  A string literal that begins with `u8`, such as `u8"asdf"`, is a UTF-8
915
+ string literal.
 
916
 
917
  Ordinary string literals and UTF-8 string literals are also referred to
918
  as narrow string literals. A narrow string literal has type “array of
919
  *n* `const char`”, where *n* is the size of the string as defined below,
920
  and has static storage duration ([[basic.stc]]).
921
 
922
+ For a UTF-8 string literal, each successive element of the object
923
+ representation ([[basic.types]]) has the value of the corresponding
924
+ code unit of the UTF-8 encoding of the string.
925
+
926
  A string literal that begins with `u`, such as `u"asdf"`, is a
927
  `char16_t` string literal. A `char16_t` string literal has type “array
928
  of *n* `const char16_t`”, where *n* is the size of the string as defined
929
  below; it has static storage duration and is initialized with the given
930
  characters. A single *c-char* may produce more than one `char16_t`
 
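Reviewer note (illustration only, not part of the diff): a sketch of the new UTF-8 wording, checking the code units of one string. The name `euro` is hypothetical. A universal-character-name is used so the result does not depend on the source file encoding, and the casts keep the checks valid whether the element type is `char` or, in later standards, `char8_t`.

```cpp
// U+20AC (EURO SIGN) is encoded in UTF-8 as the code units E2 82 AC.
constexpr auto& euro = u8"\u20ac";

// Three code units plus the terminating null element.
static_assert(sizeof(euro) == 4, "array size");
static_assert(static_cast<unsigned char>(euro[0]) == 0xE2, "first code unit");
static_assert(static_cast<unsigned char>(euro[1]) == 0x82, "second code unit");
static_assert(static_cast<unsigned char>(euro[2]) == 0xAC, "third code unit");
```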
949
  concatenated. If both string literals have the same *encoding-prefix*,
950
  the resulting concatenated string literal has that *encoding-prefix*. If
951
  one string literal has no *encoding-prefix*, it is treated as a string
952
  literal of the same *encoding-prefix* as the other operand. If a UTF-8
953
  string literal token is adjacent to a wide string literal token, the
954
+ program is ill-formed. Any other concatenations are
955
+ conditionally-supported with *implementation-defined* behavior. This
956
+ concatenation is an interpretation, not a conversion. Because the
957
+ interpretation happens in translation phase 6 (after each character from
958
+ a literal has been translated into a value from the appropriate
959
+ character set), a string literal’s initial rawness has no effect on the
960
+ interpretation or well-formedness of the concatenation. Table 
961
+ [[tab:lex.string.concat]] has some examples of valid concatenations.
962
 
963
  **Table: String literal concatenations** <a id="tab:lex.string.concat">[tab:lex.string.concat]</a>
964
 
965
  | | | | | | |
966
  | -------------------------- | ----- | -------------------------- | ----- | -------------------------- | ----- |
 
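Reviewer note (illustration only, not part of the diff): a sketch of the phase 6 concatenation rules quoted above. It sticks to the combinations the text guarantees and leaves the conditionally-supported ones out. The constant name `mixed_len` is hypothetical.

```cpp
#include <cstddef>
#include <type_traits>

// An unprefixed literal takes on the encoding-prefix of its neighbour.
static_assert(std::is_same<decltype(u"a" "b"), const char16_t(&)[3]>::value, "u + none");
static_assert(std::is_same<decltype("a" L"b"), const wchar_t(&)[3]>::value, "none + L");

// Two literals with the same prefix keep that prefix.
static_assert(std::is_same<decltype(U"a" U"b"), const char32_t(&)[3]>::value, "U + U");

// Rawness does not affect concatenation: the raw piece contributes a backslash
// and an 'n', the ordinary piece one new-line, then the terminating null.
static_assert(sizeof(R"(\n)" "\n") == 4, "raw + ordinary");

constexpr std::size_t mixed_len = sizeof(u"a" "b") / sizeof(char16_t);
```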
1040
  ``` bnf
1041
  user-defined-integer-literal:
1042
  decimal-literal ud-suffix
1043
  octal-literal ud-suffix
1044
  hexadecimal-literal ud-suffix
1045
+ binary-literal ud-suffix
1046
  ```
1047
 
1048
  ``` bnf
1049
  user-defined-floating-literal:
1050
  fractional-constant exponent-partₒₚₜ ud-suffix
 
1089
  operator "" X(nULL)
1090
  ```
1091
 
1092
  Otherwise, *S* shall contain a raw literal operator or a literal
1093
  operator template ([[over.literal]]) but not both. If *S* contains a
1094
+ raw literal operator, the literal *L* is treated as a call of the form
1095
 
1096
  ``` cpp
1097
  operator "" X("n")
1098
  ```
1099
 
 
1154
  operator "" X(ch)
1155
  ```
1156
 
1157
  ``` cpp
1158
  long double operator "" _w(long double);
1159
+ std::string operator "" _w(const char16_t*, std::size_t);
1160
  unsigned operator "" _w(const char*);
1161
  int main() {
1162
  1.2_w; // calls operator "" _w(1.2L)
1163
  u"one"_w; // calls operator "" _w(u"one", 3)
1164
  12_w; // calls operator "" _w("12")
 
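Reviewer note (illustration only, not part of the diff): a runnable sketch of the `_w` example above. The three signatures follow the hunk; the function bodies and the name `udl_demo` are hypothetical, chosen so that each call reveals which operator was selected. The last check also exercises the `binary-literal ud-suffix` production added earlier in the diff and assumes a C++14 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Cooked floating form: receives the literal's value.
long double operator""_w(long double v) { return v; }

// String form: receives a pointer to the characters and their count.
std::string operator""_w(const char16_t*, std::size_t n) {
    return std::string(n, 'x');  // hypothetical body: n placeholder characters
}

// Raw form: receives the literal's spelling without the ud-suffix.
unsigned operator""_w(const char* spelling) {
    return static_cast<unsigned>(std::strlen(spelling));
}

inline bool udl_demo() {
    return 1.2_w == 1.2L          // calls operator""_w(1.2L)
        && u"one"_w.size() == 3   // calls operator""_w(u"one", 3)
        && 12_w == 2u             // calls operator""_w("12")
        && 0b1100_w == 6u;        // calls operator""_w("0b1100")
}
```

Because no literal operator taking `unsigned long long` is declared, the integer literals fall through to the raw form.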
1190
  <!-- Link reference definitions -->
1191
  [basic.fundamental]: basic.md#basic.fundamental
1192
  [basic.link]: basic.md#basic.link
1193
  [basic.lookup.unqual]: basic.md#basic.lookup.unqual
1194
  [basic.stc]: basic.md#basic.stc
1195
+ [basic.types]: basic.md#basic.types
1196
  [charname.allowed]: charname.md#charname.allowed
1197
  [charname.disallowed]: charname.md#charname.disallowed
1198
  [conv.mem]: conv.md#conv.mem
1199
  [conv.ptr]: conv.md#conv.ptr
1200
  [cpp]: cpp.md#cpp
 
1233
  [tab:alternative.tokens]: #tab:alternative.tokens
1234
  [tab:escape.sequences]: #tab:escape.sequences
1235
  [tab:identifiers.special]: #tab:identifiers.special
1236
  [tab:keywords]: #tab:keywords
1237
  [tab:lex.string.concat]: #tab:lex.string.concat
1238
+ [tab:lex.type.integer.literal]: #tab:lex.type.integer.literal
1239
  [tab:trigraph.sequences]: #tab:trigraph.sequences
1240
  [temp.explicit]: temp.md#temp.explicit
1241
  [temp.names]: temp.md#temp.names
1242
  [usrlit.suffix]: library.md#usrlit.suffix
1243