tmp/tmpzpjraf1k/{from.md → to.md}
RENAMED

@@ -1,40 +1,45 @@
 ## Preprocessing tokens <a id="lex.pptoken">[[lex.pptoken]]</a>
 
 ``` bnf
 preprocessing-token:
     header-name
+    import-keyword
+    module-keyword
+    export-keyword
     identifier
     pp-number
     character-literal
     user-defined-character-literal
     string-literal
     user-defined-string-literal
     preprocessing-op-or-punc
     each non-white-space character that cannot be one of the above
 ```
 
-Each preprocessing token that is converted to a token
-shall have the lexical form of a keyword, an identifier, a literal,
-operator
+Each preprocessing token that is converted to a token [[lex.token]]
+shall have the lexical form of a keyword, an identifier, a literal, or
+an operator or punctuator.
 
 A preprocessing token is the minimal lexical element of the language in
 translation phases 3 through 6. The categories of preprocessing token
-are: header names,
+are: header names, placeholder tokens produced by preprocessing `import`
+and `module` directives (*import-keyword*, *module-keyword*, and
+*export-keyword*), identifiers, preprocessing numbers, character
 literals (including user-defined character literals), string literals
 (including user-defined string literals), preprocessing operators and
 punctuators, and single non-white-space characters that do not lexically
 match the other preprocessing token categories. If a `'` or a `"`
 character matches the last category, the behavior is undefined.
 Preprocessing tokens can be separated by white space; this consists of
-comments
-
-
-
-
-
-
+comments [[lex.comment]], or white-space characters (space, horizontal
+tab, new-line, vertical tab, and form-feed), or both. As described in
+[[cpp]], in certain circumstances during translation phase 4, white
+space (or the absence thereof) serves as more than preprocessing token
+separation. White space can appear within a preprocessing token only as
+part of a header name or between the quotation characters in a character
+literal or string literal.
 
 If the input stream has been parsed into preprocessing tokens up to a
 given character:
 
 - If the next character begins a sequence of characters that could be

@@ -54,29 +59,39 @@ given character:
 preprocessing token by itself and not as the first character of the
 alternative token `<:`.
 - Otherwise, the next preprocessing token is the longest sequence of
 characters that could constitute a preprocessing token, even if that
 would cause further lexical analysis to fail, except that a
-*header-name*
-
+*header-name* [[lex.header]] is only formed
+- after the `include` or `import` preprocessing token in an `#include`
+  [[cpp.include]] or `import` [[cpp.import]] directive, or
+- within a *has-include-expression*.
 
 [*Example 1*:
 
 ``` cpp
 #define R "x"
 const char* s = R"y"; // ill-formed raw string, not "x" "y"
 ```
 
 — *end example*]
 
+The *import-keyword* is produced by processing an `import` directive
+[[cpp.import]], the *module-keyword* is produced by preprocessing a
+`module` directive [[cpp.module]], and the *export-keyword* is produced
+by preprocessing either of the previous two directives.
+
+[*Note 1*: None has any observable spelling. — *end note*]
+
 [*Example 2*: The program fragment `0xe+foo` is parsed as a
-preprocessing number token (one that is not a valid
-literal token), even though a parse as three
-`+`, and `foo` might produce a valid
-were a macro defined as `1`).
-
-
+preprocessing number token (one that is not a valid *integer-literal* or
+*floating-point-literal* token), even though a parse as three
+preprocessing tokens `0xe`, `+`, and `foo` might produce a valid
+expression (for example, if `foo` were a macro defined as `1`).
+Similarly, the program fragment `1E1` is parsed as a preprocessing
+number (one that is a valid *floating-point-literal* token), whether or
+not `E` is a macro name. — *end example*]
 
 [*Example 3*: The program fragment `x+++++y` is parsed as `x
 ++ ++ + y`, which, if `x` and `y` have integral types, violates a
 constraint on increment operators, even though the parse `x ++ + ++ y`
 might yield a correct expression. — *end example*]