tmp/tmpj4_1oqdi/{from.md → to.md}
RENAMED
|
@@ -11,48 +11,53 @@ preprocessing-token:
|
|
| 11 |
character-literal
|
| 12 |
user-defined-character-literal
|
| 13 |
string-literal
|
| 14 |
user-defined-string-literal
|
| 15 |
preprocessing-op-or-punc
|
| 16 |
-
each non-
|
| 17 |
```
|
| 18 |
|
| 19 |
Each preprocessing token that is converted to a token [[lex.token]]
|
| 20 |
shall have the lexical form of a keyword, an identifier, a literal, or
|
| 21 |
an operator or punctuator.
|
| 22 |
|
| 23 |
A preprocessing token is the minimal lexical element of the language in
|
| 24 |
-
translation phases 3 through 6.
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
|
| 32 |
-
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
|
| 41 |
|
| 42 |
If the input stream has been parsed into preprocessing tokens up to a
|
| 43 |
given character:
|
| 44 |
|
| 45 |
- If the next character begins a sequence of characters that could be
|
| 46 |
the prefix and initial double quote of a raw string literal, such as
|
| 47 |
`R"`, the next preprocessing token shall be a raw string literal.
|
| 48 |
Between the initial and final double quote characters of the raw
|
| 49 |
-
string, any transformations performed in
|
| 50 |
-
|
| 51 |
-
|
| 52 |
-
|
| 53 |
-
|
| 54 |
``` bnf
|
| 55 |
encoding-prefixₒₚₜ 'R' raw-string
|
| 56 |
```
|
| 57 |
- Otherwise, if the next three characters are `<::` and the subsequent
|
| 58 |
character is neither `:` nor `>`, the `<` is treated as a
|
|
@@ -83,16 +88,16 @@ by preprocessing either of the previous two directives.
|
|
| 83 |
[*Note 1*: None has any observable spelling. — *end note*]
|
| 84 |
|
| 85 |
[*Example 2*: The program fragment `0xe+foo` is parsed as a
|
| 86 |
preprocessing number token (one that is not a valid *integer-literal* or
|
| 87 |
*floating-point-literal* token), even though a parse as three
|
| 88 |
-
preprocessing tokens `0xe`, `+`, and `foo`
|
| 89 |
-
expression (for example, if `foo`
|
| 90 |
-
|
| 91 |
-
|
| 92 |
-
|
| 93 |
|
| 94 |
[*Example 3*: The program fragment `x+++++y` is parsed as `x
|
| 95 |
++ ++ + y`, which, if `x` and `y` have integral types, violates a
|
| 96 |
constraint on increment operators, even though the parse `x ++ + ++ y`
|
| 97 |
-
|
| 98 |
|
|
|
|
| 11 |
character-literal
|
| 12 |
user-defined-character-literal
|
| 13 |
string-literal
|
| 14 |
user-defined-string-literal
|
| 15 |
preprocessing-op-or-punc
|
| 16 |
+
each non-whitespace character that cannot be one of the above
|
| 17 |
```
|
| 18 |
|
| 19 |
Each preprocessing token that is converted to a token [[lex.token]]
|
| 20 |
shall have the lexical form of a keyword, an identifier, a literal, or
|
| 21 |
an operator or punctuator.
|
| 22 |
|
| 23 |
A preprocessing token is the minimal lexical element of the language in
|
| 24 |
+
translation phases 3 through 6. In this document, glyphs are used to
|
| 25 |
+
identify elements of the basic character set [[lex.charset]]. The
|
| 26 |
+
categories of preprocessing token are: header names, placeholder tokens
|
| 27 |
+
produced by preprocessing `import` and `module` directives
|
| 28 |
+
(*import-keyword*, *module-keyword*, and *export-keyword*), identifiers,
|
| 29 |
+
preprocessing numbers, character literals (including user-defined
|
| 30 |
+
character literals), string literals (including user-defined string
|
| 31 |
+
literals), preprocessing operators and punctuators, and single
|
| 32 |
+
non-whitespace characters that do not lexically match the other
|
| 33 |
+
preprocessing token categories. If a U+0027 (apostrophe) or a
|
| 34 |
+
U+0022 (quotation mark) character matches the last category, the
|
| 35 |
+
behavior is undefined. If any character not in the basic character set
|
| 36 |
+
matches the last category, the program is ill-formed. Preprocessing
|
| 37 |
+
tokens can be separated by whitespace; this consists of comments
|
| 38 |
+
[[lex.comment]], or whitespace characters (U+0020 (space),
|
| 39 |
+
U+0009 (character tabulation), new-line, U+000b (line tabulation), and
|
| 40 |
+
U+000c (form feed)), or both. As described in [[cpp]], in certain
|
| 41 |
+
circumstances during translation phase 4, whitespace (or the absence
|
| 42 |
+
thereof) serves as more than preprocessing token separation. Whitespace
|
| 43 |
+
can appear within a preprocessing token only as part of a header name or
|
| 44 |
+
between the quotation characters in a character literal or string
|
| 45 |
+
literal.
|
| 46 |
|
| 47 |
If the input stream has been parsed into preprocessing tokens up to a
|
| 48 |
given character:
|
| 49 |
|
| 50 |
- If the next character begins a sequence of characters that could be
|
| 51 |
the prefix and initial double quote of a raw string literal, such as
|
| 52 |
`R"`, the next preprocessing token shall be a raw string literal.
|
| 53 |
Between the initial and final double quote characters of the raw
|
| 54 |
+
string, any transformations performed in phase 2 (line splicing) are
|
| 55 |
+
reverted; this reversion shall apply before any *d-char*, *r-char*, or
|
| 56 |
+
delimiting parenthesis is identified. The raw string literal is
|
| 57 |
+
defined as the shortest sequence of characters that matches the
|
| 58 |
+
raw-string pattern
|
| 59 |
``` bnf
|
| 60 |
encoding-prefixₒₚₜ 'R' raw-string
|
| 61 |
```
|
| 62 |
- Otherwise, if the next three characters are `<::` and the subsequent
|
| 63 |
character is neither `:` nor `>`, the `<` is treated as a
|
|
|
|
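Reviewer sketch for the raw string literal bullet in the hunk above: the added wording says that phase 2 line splicing is reverted between the quotes of a raw string. A minimal compile-time check of that behavior follows, assuming a C++17 compiler; the variable names `spliced` and `raw` are illustrative only and do not come from the draft.

```cpp
#include <string_view>

// Ordinary string literal: the backslash/new-line pair is removed by
// phase 2 line splicing, so the two source lines join into "abcd".
constexpr std::string_view spliced = "ab\
cd";

// Raw string literal: the splice is reverted, so both the backslash and
// the new-line remain part of the literal's contents.
constexpr std::string_view raw = R"(ab\
cd)";

static_assert(spliced == "abcd");
static_assert(raw == "ab\\\ncd");
static_assert(raw.size() == spliced.size() + 2);
```

The two literals are spelled identically apart from the `R` prefix and the delimiting parentheses, so the size difference of two characters comes only from the reverted splice.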
| 88 |
[*Note 1*: None has any observable spelling. — *end note*]
|
| 89 |
|
| 90 |
[*Example 2*: The program fragment `0xe+foo` is parsed as a
|
| 91 |
preprocessing number token (one that is not a valid *integer-literal* or
|
| 92 |
*floating-point-literal* token), even though a parse as three
|
| 93 |
+
preprocessing tokens `0xe`, `+`, and `foo` can produce a valid
|
| 94 |
+
expression (for example, if `foo` is a macro defined as `1`). Similarly,
|
| 95 |
+
the program fragment `1E1` is parsed as a preprocessing number (one that
|
| 96 |
+
is a valid *floating-point-literal* token), whether or not `E` is a
|
| 97 |
+
macro name. — *end example*]
|
| 98 |
|
| 99 |
[*Example 3*: The program fragment `x+++++y` is parsed as `x
|
| 100 |
++ ++ + y`, which, if `x` and `y` have integral types, violates a
|
| 101 |
constraint on increment operators, even though the parse `x ++ + ++ y`
|
| 102 |
+
can yield a correct expression. — *end example*]
|
| 103 |
|
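Reviewer sketch for Examples 2 and 3 in the hunk above: both follow from always taking the longest character sequence that can form a preprocessing token. The code below is a simplified illustration under stated assumptions, not the algorithm of any real compiler. It assumes C++17, ASCII input, only the punctuators `++` and `+`, and no universal-character-names; the names `pp_number_length`, `longest_punctuator`, and `tokenize` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }
constexpr bool is_nondigit(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

// Length of the longest pp-number at the start of s, or 0 if there is none.
constexpr std::size_t pp_number_length(std::string_view s) {
    std::size_t i = 0;
    if (!s.empty() && is_digit(s[0])) {
        i = 1;                                   // digit
    } else if (s.size() >= 2 && s[0] == '.' && is_digit(s[1])) {
        i = 2;                                   // . digit
    } else {
        return 0;
    }
    bool prev_via_quote = false;  // previous character followed a digit separator
    while (i < s.size()) {
        const char c = s[i];
        const char prev = s[i - 1];
        const bool after_exponent = !prev_via_quote &&
            (prev == 'e' || prev == 'E' || prev == 'p' || prev == 'P');
        if ((c == '+' || c == '-') && after_exponent) {
            ++i;                                 // pp-number e sign, and so on
            prev_via_quote = false;
        } else if (c == '\'' && i + 1 < s.size() &&
                   (is_digit(s[i + 1]) || is_nondigit(s[i + 1]))) {
            i += 2;                              // pp-number ' digit-or-nondigit
            prev_via_quote = true;
        } else if (is_digit(c) || is_nondigit(c) || c == '.') {
            ++i;                                 // identifier-continue or '.'
            prev_via_quote = false;
        } else {
            break;
        }
    }
    return i;
}

// Longest punctuator at the start of s; candidates are listed longest first.
constexpr std::string_view longest_punctuator(std::string_view s) {
    constexpr std::string_view ops[] = {"++", "+"};
    for (std::string_view op : ops) {
        if (s.substr(0, op.size()) == op) return op;
    }
    return {};
}

// Splits s into preprocessing-token spellings, skipping spaces.
std::vector<std::string> tokenize(std::string_view s) {
    std::vector<std::string> out;
    while (!s.empty()) {
        if (s.front() == ' ') { s.remove_prefix(1); continue; }
        std::size_t n = pp_number_length(s);
        if (n == 0 && is_nondigit(s.front())) {  // identifier
            n = 1;
            while (n < s.size() && (is_nondigit(s[n]) || is_digit(s[n]))) ++n;
        }
        if (n == 0) n = longest_punctuator(s).size();
        if (n == 0) n = 1;                       // any other single character
        assert(n > 0);                           // every step consumes input
        out.emplace_back(s.substr(0, n));
        s.remove_prefix(n);
    }
    return out;
}
```

With these definitions, `tokenize("0xe+foo")` yields the single spelling `0xe+foo`, because the `+` directly follows an `e` inside a pp-number, and `tokenize("x+++++y")` yields `x`, `++`, `++`, `+`, `y`, matching the parses described in the two examples.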