From Jason Turner

[lex.pptoken]


Files changed (1)
  1. tmp/tmpzpjraf1k/{from.md → to.md} +34 -19
tmp/tmpzpjraf1k/{from.md → to.md} RENAMED
@@ -1,40 +1,45 @@
1
  ## Preprocessing tokens <a id="lex.pptoken">[[lex.pptoken]]</a>
2
 
3
  ``` bnf
4
  preprocessing-token:
5
  header-name
 
 
 
6
  identifier
7
  pp-number
8
  character-literal
9
  user-defined-character-literal
10
  string-literal
11
  user-defined-string-literal
12
  preprocessing-op-or-punc
13
  each non-white-space character that cannot be one of the above
14
  ```
15
 
16
- Each preprocessing token that is converted to a token ([[lex.token]])
17
- shall have the lexical form of a keyword, an identifier, a literal, an
18
- operator, or a punctuator.
19
 
20
  A preprocessing token is the minimal lexical element of the language in
21
  translation phases 3 through 6. The categories of preprocessing token
22
- are: header names, identifiers, preprocessing numbers, character
 
 
23
  literals (including user-defined character literals), string literals
24
  (including user-defined string literals), preprocessing operators and
25
  punctuators, and single non-white-space characters that do not lexically
26
  match the other preprocessing token categories. If a `'` or a `"`
27
  character matches the last category, the behavior is undefined.
28
  Preprocessing tokens can be separated by white space; this consists of
29
- comments ([[lex.comment]]), or white-space characters (space,
30
- horizontal tab, new-line, vertical tab, and form-feed), or both. As
31
- described in Clause  [[cpp]], in certain circumstances during
32
- translation phase 4, white space (or the absence thereof) serves as more
33
- than preprocessing token separation. White space can appear within a
34
- preprocessing token only as part of a header name or between the
35
- quotation characters in a character literal or string literal.
36
 
37
  If the input stream has been parsed into preprocessing tokens up to a
38
  given character:
39
 
40
  - If the next character begins a sequence of characters that could be
@@ -54,29 +59,39 @@ given character:
54
  preprocessing token by itself and not as the first character of the
55
  alternative token `<:`.
56
  - Otherwise, the next preprocessing token is the longest sequence of
57
  characters that could constitute a preprocessing token, even if that
58
  would cause further lexical analysis to fail, except that a
59
- *header-name* ([[lex.header]]) is only formed within a `#include`
60
- directive ([[cpp.include]]).
 
 
61
 
62
  [*Example 1*:
63
 
64
  ``` cpp
65
  #define R "x"
66
  const char* s = R"y"; // ill-formed raw string, not "x" "y"
67
  ```
68
 
69
  — *end example*]
70
 
 
 
 
 
 
 
 
71
  [*Example 2*: The program fragment `0xe+foo` is parsed as a
72
- preprocessing number token (one that is not a valid floating or integer
73
- literal token), even though a parse as three preprocessing tokens `0xe`,
74
- `+`, and `foo` might produce a valid expression (for example, if `foo`
75
- were a macro defined as `1`). Similarly, the program fragment `1E1` is
76
- parsed as a preprocessing number (one that is a valid floating literal
77
- token), whether or not `E` is a macro name. — *end example*]
 
78
 
79
  [*Example 3*: The program fragment `x+++++y` is parsed as `x
80
  ++ ++ + y`, which, if `x` and `y` have integral types, violates a
81
  constraint on increment operators, even though the parse `x ++ + ++ y`
82
  might yield a correct expression. — *end example*]
 
1
  ## Preprocessing tokens <a id="lex.pptoken">[[lex.pptoken]]</a>
2
 
3
  ``` bnf
4
  preprocessing-token:
5
  header-name
6
+ import-keyword
7
+ module-keyword
8
+ export-keyword
9
  identifier
10
  pp-number
11
  character-literal
12
  user-defined-character-literal
13
  string-literal
14
  user-defined-string-literal
15
  preprocessing-op-or-punc
16
  each non-white-space character that cannot be one of the above
17
  ```
18
 
19
+ Each preprocessing token that is converted to a token [[lex.token]]
20
+ shall have the lexical form of a keyword, an identifier, a literal, or
21
+ an operator or punctuator.
22
 
23
  A preprocessing token is the minimal lexical element of the language in
24
  translation phases 3 through 6. The categories of preprocessing token
25
+ are: header names, placeholder tokens produced by preprocessing `import`
26
+ and `module` directives (*import-keyword*, *module-keyword*, and
27
+ *export-keyword*), identifiers, preprocessing numbers, character
28
  literals (including user-defined character literals), string literals
29
  (including user-defined string literals), preprocessing operators and
30
  punctuators, and single non-white-space characters that do not lexically
31
  match the other preprocessing token categories. If a `'` or a `"`
32
  character matches the last category, the behavior is undefined.
33
  Preprocessing tokens can be separated by white space; this consists of
34
+ comments [[lex.comment]], or white-space characters (space, horizontal
35
+ tab, new-line, vertical tab, and form-feed), or both. As described in
36
+ [[cpp]], in certain circumstances during translation phase 4, white
37
+ space (or the absence thereof) serves as more than preprocessing token
38
+ separation. White space can appear within a preprocessing token only as
39
+ part of a header name or between the quotation characters in a character
40
+ literal or string literal.
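
As an illustration of white space acting as more than a separator in phase 4, the following sketch contrasts two macro definitions that differ only in whether white space precedes the opening parenthesis. The names `TWICE`, `SEVEN`, `spaced`, and `whitespace_demo` are invented for this example and do not come from the standard text.

```cpp
#include <cassert>

// No white space between the macro name and '(': a function-like macro.
#define TWICE(x) ((x) * 2)

// White space before '(': an object-like macro whose replacement list is "(7)".
#define SEVEN (7)

// White space between the quotation characters is part of the string literal.
constexpr char spaced[] = "a  b";
static_assert(sizeof(spaced) == 5, "two spaces are kept, plus the terminator");

inline void whitespace_demo() {
    assert(TWICE(3) == 6);  // expands to ((3) * 2)
    assert(SEVEN == 7);     // expands to (7)
}
```

Calling `whitespace_demo()` from `main` exercises both macros; the `static_assert` is checked at compile time.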
41
 
42
  If the input stream has been parsed into preprocessing tokens up to a
43
  given character:
44
 
45
  - If the next character begins a sequence of characters that could be
 
59
  preprocessing token by itself and not as the first character of the
60
  alternative token `<:`.
61
  - Otherwise, the next preprocessing token is the longest sequence of
62
  characters that could constitute a preprocessing token, even if that
63
  would cause further lexical analysis to fail, except that a
64
+ *header-name* [[lex.header]] is only formed
65
+ - after the `include` or `import` preprocessing token in an `#include`
66
+ [[cpp.include]] or `import` [[cpp.import]] directive, or
67
+ - within a *has-include-expression*.
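
The two contexts listed above can be sketched as follows. This is an illustration only: `HAVE_VECTOR_HEADER` and `three_tokens` are hypothetical names, and the example assumes an implementation that provides `__has_include` and the `<vector>` header.

```cpp
#include <cassert>

// Inside a has-include-expression, <vector> is lexed as one header-name.
#if defined(__has_include)
#  if __has_include(<vector>)
#    define HAVE_VECTOR_HEADER 1
#  endif
#endif
#ifndef HAVE_VECTOR_HEADER
#  define HAVE_VECTOR_HEADER 0
#endif

// Outside those contexts the same characters form three preprocessing
// tokens: '<', the identifier 'vector', and '>'.
inline bool three_tokens() {
    int a = 1, vector = 2, b = 3;
    return a <vector> b;  // parsed as (a < vector) > b, i.e. true > 3
}
```

Some compilers warn about the chained comparison in `three_tokens`; that warning is itself a sign that `<vector>` was not treated as a header name there.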
68
 
69
  [*Example 1*:
70
 
71
  ``` cpp
72
  #define R "x"
73
  const char* s = R"y"; // ill-formed raw string, not "x" "y"
74
  ```
75
 
76
  — *end example*]
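
For contrast with Example 1, here is a hedged sketch of a well-formed variant. It assumes the same macro `R`; the helper names `raw` and `expanded` are made up for illustration.

```cpp
#include <cassert>
#include <cstring>

#define R "x"

// R"(y)" is lexed as a single raw string literal in phase 3, so R never
// appears as a separate identifier and is not macro-expanded here.
inline const char* raw() { return R"(y)"; }

// With white space in between, R is an identifier token and is expanded;
// the adjacent string literals "x" and "y" are then concatenated.
inline const char* expanded() { return R "y"; }

#undef R
```

The first function returns `"y"` and the second returns `"xy"`, which shows that the longest-match rule is applied before macro replacement.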
77
 
78
+ The *import-keyword* is produced by processing an `import` directive
79
+ [[cpp.import]], the *module-keyword* is produced by preprocessing a
80
+ `module` directive [[cpp.module]], and the *export-keyword* is produced
81
+ by preprocessing either of the previous two directives.
82
+
83
+ [*Note 1*: None has any observable spelling. — *end note*]
84
+
85
  [*Example 2*: The program fragment `0xe+foo` is parsed as a
86
+ preprocessing number token (one that is not a valid *integer-literal* or
87
+ *floating-point-literal* token), even though a parse as three
88
+ preprocessing tokens `0xe`, `+`, and `foo` might produce a valid
89
+ expression (for example, if `foo` were a macro defined as `1`).
90
+ Similarly, the program fragment `1E1` is parsed as a preprocessing
91
+ number (one that is a valid *floating-point-literal* token), whether or
92
+ not `E` is a macro name. — *end example*]
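
Example 2 can be tried out with a sketch like the one below. The macros `foo` and `E` follow the example's wording; the function names are hypothetical.

```cpp
#include <cassert>

#define foo 1
#define E 5

// With white space, `0xe`, `+`, and `foo` are three preprocessing tokens.
inline int spaced_sum() { return 0xe + foo; }  // 14 + 1

// `1E1` is a single pp-number, so the macro E is not expanded inside it.
inline double one_e_one() { return 1E1; }      // floating-point literal 10.0

// Without the white space, `0xe+foo` is one pp-number that is not a valid
// literal, so the following line would be rejected if uncommented:
// inline int fused_sum() { return 0xe+foo; }

#undef foo
#undef E
```

The commented-out line is left disabled so that the rest of the sketch compiles.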
93
 
94
  [*Example 3*: The program fragment `x+++++y` is parsed as `x
95
  ++ ++ + y`, which, if `x` and `y` have integral types, violates a
96
  constraint on increment operators, even though the parse `x ++ + ++ y`
97
  might yield a correct expression. — *end example*]
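
The longest-match behavior in Example 3 can be imitated with a toy lexer. This is a minimal sketch under strong simplifying assumptions: the hypothetical function `toy_lex` knows only identifiers, `++`, and single-character tokens, and is not how any real implementation forms preprocessing tokens.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Toy lexer: at each position, take the longest token it recognizes.
inline std::vector<std::string> toy_lex(const std::string& src) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }  // white space separates tokens
        std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            // Identifier: consume letters, digits, and underscores.
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
        } else if (src.compare(i, 2, "++") == 0) {
            i += 2;  // prefer the two-character operator over a lone '+'
        } else {
            i += 1;  // any other single character
        }
        out.push_back(src.substr(start, i - start));
    }
    return out;
}
```

Applied to `x+++++y`, the function yields the sequence `x`, `++`, `++`, `+`, `y`, matching the parse described in the example rather than `x ++ + ++ y`.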