From Jason Turner

[lex.pptoken]

Files changed (1)
  1. tmp/tmpj4_1oqdi/{from.md → to.md} +34 -29
tmp/tmpj4_1oqdi/{from.md → to.md} RENAMED
@@ -11,48 +11,53 @@ preprocessing-token:
11
  character-literal
12
  user-defined-character-literal
13
  string-literal
14
  user-defined-string-literal
15
  preprocessing-op-or-punc
16
- each non-white-space character that cannot be one of the above
17
  ```
18
 
19
  Each preprocessing token that is converted to a token [[lex.token]]
20
  shall have the lexical form of a keyword, an identifier, a literal, or
21
  an operator or punctuator.
22
 
23
  A preprocessing token is the minimal lexical element of the language in
24
- translation phases 3 through 6. The categories of preprocessing token
25
- are: header names, placeholder tokens produced by preprocessing `import`
26
- and `module` directives (*import-keyword*, *module-keyword*, and
27
- *export-keyword*), identifiers, preprocessing numbers, character
28
- literals (including user-defined character literals), string literals
29
- (including user-defined string literals), preprocessing operators and
30
- punctuators, and single non-white-space characters that do not lexically
31
- match the other preprocessing token categories. If a `'` or a `"`
32
- character matches the last category, the behavior is undefined.
33
- Preprocessing tokens can be separated by white space; this consists of
34
- comments [[lex.comment]], or white-space characters (space, horizontal
35
- tab, new-line, vertical tab, and form-feed), or both. As described in
36
- [[cpp]], in certain circumstances during translation phase 4, white
37
- space (or the absence thereof) serves as more than preprocessing token
38
- separation. White space can appear within a preprocessing token only as
39
- part of a header name or between the quotation characters in a character
40
- literal or string literal.
41
 
42
  If the input stream has been parsed into preprocessing tokens up to a
43
  given character:
44
 
45
  - If the next character begins a sequence of characters that could be
46
  the prefix and initial double quote of a raw string literal, such as
47
  `R"`, the next preprocessing token shall be a raw string literal.
48
  Between the initial and final double quote characters of the raw
49
- string, any transformations performed in phases 1 and 2
50
- (*universal-character-name*s and line splicing) are reverted; this
51
- reversion shall apply before any *d-char*, *r-char*, or delimiting
52
- parenthesis is identified. The raw string literal is defined as the
53
- shortest sequence of characters that matches the raw-string pattern
54
  ``` bnf
55
  encoding-prefixₒₚₜ 'R' raw-string
56
  ```
57
  - Otherwise, if the next three characters are `<::` and the subsequent
58
  character is neither `:` nor `>`, the `<` is treated as a
@@ -83,16 +88,16 @@ by preprocessing either of the previous two directives.
83
  [*Note 1*: None has any observable spelling. — *end note*]
84
 
85
  [*Example 2*: The program fragment `0xe+foo` is parsed as a
86
  preprocessing number token (one that is not a valid *integer-literal* or
87
  *floating-point-literal* token), even though a parse as three
88
- preprocessing tokens `0xe`, `+`, and `foo` might produce a valid
89
- expression (for example, if `foo` were a macro defined as `1`).
90
- Similarly, the program fragment `1E1` is parsed as a preprocessing
91
- number (one that is a valid *floating-point-literal* token), whether or
92
- not `E` is a macro name. — *end example*]
93
 
94
  [*Example 3*: The program fragment `x+++++y` is parsed as `x
95
  ++ ++ + y`, which, if `x` and `y` have integral types, violates a
96
  constraint on increment operators, even though the parse `x ++ + ++ y`
97
- might yield a correct expression. — *end example*]
98
 
 
11
  character-literal
12
  user-defined-character-literal
13
  string-literal
14
  user-defined-string-literal
15
  preprocessing-op-or-punc
16
+ each non-whitespace character that cannot be one of the above
17
  ```
18
 
19
  Each preprocessing token that is converted to a token [[lex.token]]
20
  shall have the lexical form of a keyword, an identifier, a literal, or
21
  an operator or punctuator.
22
 
23
  A preprocessing token is the minimal lexical element of the language in
24
+ translation phases 3 through 6. In this document, glyphs are used to
25
+ identify elements of the basic character set [[lex.charset]]. The
26
+ categories of preprocessing token are: header names, placeholder tokens
27
+ produced by preprocessing `import` and `module` directives
28
+ (*import-keyword*, *module-keyword*, and *export-keyword*), identifiers,
29
+ preprocessing numbers, character literals (including user-defined
30
+ character literals), string literals (including user-defined string
31
+ literals), preprocessing operators and punctuators, and single
32
+ non-whitespace characters that do not lexically match the other
33
+ preprocessing token categories. If a U+0027 (apostrophe) or a
34
+ U+0022 (quotation mark) character matches the last category, the
35
+ behavior is undefined. If any character not in the basic character set
36
+ matches the last category, the program is ill-formed. Preprocessing
37
+ tokens can be separated by whitespace; this consists of comments
38
+ [[lex.comment]], or whitespace characters (U+0020 (space),
39
+ U+0009 (character tabulation), new-line, U+000b (line tabulation), and
40
+ U+000c (form feed)), or both. As described in [[cpp]], in certain
41
+ circumstances during translation phase 4, whitespace (or the absence
42
+ thereof) serves as more than preprocessing token separation. Whitespace
43
+ can appear within a preprocessing token only as part of a header name or
44
+ between the quotation characters in a character literal or string
45
+ literal.
46
 
47
  If the input stream has been parsed into preprocessing tokens up to a
48
  given character:
49
 
50
  - If the next character begins a sequence of characters that could be
51
  the prefix and initial double quote of a raw string literal, such as
52
  `R"`, the next preprocessing token shall be a raw string literal.
53
  Between the initial and final double quote characters of the raw
54
+ string, any transformations performed in phase 2 (line splicing) are
55
+ reverted; this reversion shall apply before any *d-char*, *r-char*, or
56
+ delimiting parenthesis is identified. The raw string literal is
57
+ defined as the shortest sequence of characters that matches the
58
+ raw-string pattern
59
  ``` bnf
60
  encoding-prefixₒₚₜ 'R' raw-string
61
  ```
62
  - Otherwise, if the next three characters are `<::` and the subsequent
63
  character is neither `:` nor `>`, the `<` is treated as a
 
88
  [*Note 1*: None has any observable spelling. — *end note*]
89
 
90
  [*Example 2*: The program fragment `0xe+foo` is parsed as a
91
  preprocessing number token (one that is not a valid *integer-literal* or
92
  *floating-point-literal* token), even though a parse as three
93
+ preprocessing tokens `0xe`, `+`, and `foo` can produce a valid
94
+ expression (for example, if `foo` is a macro defined as `1`). Similarly,
95
+ the program fragment `1E1` is parsed as a preprocessing number (one that
96
+ is a valid *floating-point-literal* token), whether or not `E` is a
97
+ macro name. — *end example*]
98
 
99
  [*Example 3*: The program fragment `x+++++y` is parsed as `x
100
  ++ ++ + y`, which, if `x` and `y` have integral types, violates a
101
  constraint on increment operators, even though the parse `x ++ + ++ y`
102
+ can yield a correct expression. — *end example*]
103
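
The behaviour described in Examples 2 and 3 (the longest possible sequence of characters is taken as the next preprocessing token) can be illustrated with a small sketch. This is not the wording's algorithm and not any compiler's implementation: `pp_tokenize` is a hypothetical helper that assumes a heavily simplified token set (identifiers, preprocessing numbers without digit separators, `++`, and single characters) and ignores header names, literals, comments, and raw strings.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not part of the standard or any library): splits `src`
// into a simplified subset of preprocessing tokens, always taking the longest
// match at each position.
std::vector<std::string> pp_tokenize(std::string_view src) {
    std::vector<std::string> out;
    auto is_ident_char = [](unsigned char c) {
        return std::isalnum(c) || c == '_';
    };
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }  // whitespace only separates tokens
        std::size_t start = i;
        bool number_start =
            std::isdigit(c) ||
            (c == '.' && i + 1 < src.size() &&
             std::isdigit(static_cast<unsigned char>(src[i + 1])));
        if (number_start) {
            // Simplified pp-number: continues over identifier characters, '.',
            // and a sign that directly follows e, E, p, or P.
            ++i;
            while (i < src.size()) {
                unsigned char d = static_cast<unsigned char>(src[i]);
                char prev = src[i - 1];
                bool sign_after_exp =
                    (d == '+' || d == '-') &&
                    (prev == 'e' || prev == 'E' || prev == 'p' || prev == 'P');
                if (sign_after_exp || is_ident_char(d) || d == '.') ++i;
                else break;
            }
        } else if (std::isalpha(c) || c == '_') {
            // Identifier.
            while (i < src.size() &&
                   is_ident_char(static_cast<unsigned char>(src[i]))) ++i;
        } else if (c == '+' && i + 1 < src.size() && src[i + 1] == '+') {
            i += 2;  // prefer "++" over "+" (longest match)
        } else {
            ++i;     // any other single non-whitespace character
        }
        out.emplace_back(src.substr(start, i - start));
    }
    return out;
}
```

Under these assumptions, `pp_tokenize("0xe+foo")` yields the single token `0xe+foo`, because the `+` directly follows an `e` inside a preprocessing number, and `pp_tokenize("x+++++y")` yields `x`, `++`, `++`, `+`, `y`, never `x`, `++`, `+`, `++`, `y`.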