From Jason Turner

[lex.charset]


Files changed (1)
  1. tmp/tmphhqbm1s0/{from.md → to.md} +15 -18
tmp/tmphhqbm1s0/{from.md → to.md} RENAMED
@@ -5,15 +5,12 @@ character, the control characters representing horizontal tab, vertical
  tab, form feed, and new-line, plus the following 91 graphical
  characters:[^4]

  ``` cpp
  a b c d e f g h i j k l m n o p q r s t u v w x y z
-
  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
-
  0 1 2 3 4 5 6 7 8 9
-
  _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '
  ```

  The *universal-character-name* construct provides a way to name other
  characters.
@@ -27,33 +24,33 @@ hex-quad:
  universal-character-name:
  '\u' hex-quad
  '\U' hex-quad hex-quad
  ```

- The character designated by the universal-character-name `\UNNNNNNNN` is
- that character whose character short name in ISO/IEC 10646 is
- `NNNNNNNN`; the character designated by the universal-character-name
+ The character designated by the *universal-character-name* `\UNNNNNNNN`
+ is that character whose character short name in ISO/IEC 10646 is
+ `NNNNNNNN`; the character designated by the *universal-character-name*
  `\uNNNN` is that character whose character short name in ISO/IEC 10646
- is `0000NNNN`. If the hexadecimal value for a universal-character-name
+ is `0000NNNN`. If the hexadecimal value for a *universal-character-name*
  corresponds to a surrogate code point (in the range 0xD800–0xDFFF,
  inclusive), the program is ill-formed. Additionally, if the hexadecimal
- value for a universal-character-name outside the *c-char-sequence*,
+ value for a *universal-character-name* outside the *c-char-sequence*,
  *s-char-sequence*, or *r-char-sequence* of a character or string literal
  corresponds to a control character (in either of the ranges 0x00–0x1F or
  0x7F–0x9F, both inclusive) or to a character in the basic source
  character set, the program is ill-formed.[^5]

  The *basic execution character set* and the *basic execution
  wide-character set* shall each contain all the members of the basic
  source character set, plus control characters representing alert,
  backspace, and carriage return, plus a *null character* (respectively,
- *null wide character*), whose representation has all zero bits. For each
- basic execution character set, the values of the members shall be
- non-negative and distinct from one another. In both the source and
- execution basic character sets, the value of each character after `0` in
- the above list of decimal digits shall be one greater than the value of
- the previous. The *execution character set* and the *execution
- wide-character set* are implementation-defined supersets of the basic
- execution character set and the basic execution wide-character set,
- respectively. The values of the members of the execution character sets
- and the sets of additional members are locale-specific.
+ *null wide character*), whose value is 0. For each basic execution
+ character set, the values of the members shall be non-negative and
+ distinct from one another. In both the source and execution basic
+ character sets, the value of each character after `0` in the above list
+ of decimal digits shall be one greater than the value of the previous.
+ The *execution character set* and the *execution wide-character set* are
+ *implementation-defined* supersets of the basic execution character set
+ and the basic execution wide-character set, respectively. The values of
+ the members of the execution character sets and the sets of additional
+ members are locale-specific.

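
To illustrate a few of the rules quoted in the diff, here is a minimal C++ sketch. It is not part of the change itself. The helper names `digit_value` and `is_surrogate` are hypothetical and chosen only for this example. The sketch relies on three points from the text: decimal digits have consecutive values, the null character (and null wide character) has value 0, and the 4-digit form of a universal-character-name designates the same character as the 8-digit form padded with `0000`.

```cpp
#include <cassert>

// Hypothetical helper: numeric value of a decimal digit character.
// Relies on '0'..'9' having consecutive values, as the text requires.
constexpr int digit_value(char c) {
    return c - '0';
}

// Hypothetical helper: true for code points in the surrogate range
// 0xD800-0xDFFF, which a universal-character-name must not designate.
constexpr bool is_surrogate(char32_t cp) {
    return cp >= 0xD800 && cp <= 0xDFFF;
}

// Each digit after '0' is one greater than the previous one.
static_assert(digit_value('9') == 9, "decimal digits are contiguous");

// The null character and the null wide character both have value 0.
static_assert('\0' == 0 && L'\0' == 0, "null characters have value 0");

// The 4-digit form names the same character as the padded 8-digit form.
static_assert(U'\u00E9' == U'\U000000E9', "short and long forms agree");
static_assert(!is_surrogate(U'\u00E9'), "U+00E9 is not a surrogate");
```

Because every check is a `static_assert`, a conforming compiler evaluates all of them at compile time. The helpers can also be called at run time, for example `digit_value('7')`.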