[lex.pptoken] - C++17 → C++20


tmp/tmpzpjraf1k/{from.md → to.md} +34 -19


@@ -1,40 +1,45 @@
 ## Preprocessing tokens <a id="lex.pptoken">[[lex.pptoken]]</a>
 ``` bnf
 preprocessing-token:
     header-name
     identifier
     pp-number
     character-literal
     user-defined-character-literal
     string-literal
     user-defined-string-literal
     preprocessing-op-or-punc
     each non-white-space character that cannot be one of the above
 ```
-Each preprocessing token that is converted to a token ([[lex.token]])
-shall have the lexical form of a keyword, an identifier, a literal, an
-operator, or a punctuator.
 A preprocessing token is the minimal lexical element of the language in
 translation phases 3 through 6. The categories of preprocessing token
-are: header names, identifiers, preprocessing numbers, character
 literals (including user-defined character literals), string literals
 (including user-defined string literals), preprocessing operators and
 punctuators, and single non-white-space characters that do not lexically
 match the other preprocessing token categories. If a `'` or a `"`
 character matches the last category, the behavior is undefined.
 Preprocessing tokens can be separated by white space; this consists of
-comments ([[lex.comment]]), or white-space characters (space,
-horizontal tab, new-line, vertical tab, and form-feed), or both. As
-described in Clause [[cpp]], in certain circumstances during
-translation phase 4, white space (or the absence thereof) serves as more
-than preprocessing token separation. White space can appear within a
-preprocessing token only as part of a header name or between the
-quotation characters in a character literal or string literal.
 If the input stream has been parsed into preprocessing tokens up to a
 given character:
 - If the next character begins a sequence of characters that could be
@@ -54,29 +59,39 @@ given character:
   preprocessing token by itself and not as the first character of the
   alternative token `<:`.
 - Otherwise, the next preprocessing token is the longest sequence of
   characters that could constitute a preprocessing token, even if that
   would cause further lexical analysis to fail, except that a
-  *header-name* ([[lex.header]]) is only formed within a `#include`
-  directive ([[cpp.include]]).
 [*Example 1*:
 ``` cpp
 #define R "x"
 const char* s = R"y";           // ill-formed raw string, not "x" "y"
 ```
 — *end example*]
 [*Example 2*: The program fragment `0xe+foo` is parsed as a
-preprocessing number token (one that is not a valid floating or integer
-literal token), even though a parse as three preprocessing tokens `0xe`,
-`+`, and `foo` might produce a valid expression (for example, if `foo`
-were a macro defined as `1`). Similarly, the program fragment `1E1` is
-parsed as a preprocessing number (one that is a valid floating literal
-token), whether or not `E` is a macro name. — *end example*]
 [*Example 3*: The program fragment `x+++++y` is parsed as `x
 ++ ++ + y`, which, if `x` and `y` have integral types, violates a
 constraint on increment operators, even though the parse `x ++ + ++ y`
 might yield a correct expression. — *end example*]

 ## Preprocessing tokens <a id="lex.pptoken">[[lex.pptoken]]</a>
 ``` bnf
 preprocessing-token:
     header-name
+    import-keyword
+    module-keyword
+    export-keyword
     identifier
     pp-number
     character-literal
     user-defined-character-literal
     string-literal
     user-defined-string-literal
     preprocessing-op-or-punc
     each non-white-space character that cannot be one of the above
 ```
+Each preprocessing token that is converted to a token [[lex.token]]
+shall have the lexical form of a keyword, an identifier, a literal, or
+an operator or punctuator.
 A preprocessing token is the minimal lexical element of the language in
 translation phases 3 through 6. The categories of preprocessing token
+are: header names, placeholder tokens produced by preprocessing `import`
+and `module` directives (*import-keyword*, *module-keyword*, and
+*export-keyword*), identifiers, preprocessing numbers, character
 literals (including user-defined character literals), string literals
 (including user-defined string literals), preprocessing operators and
 punctuators, and single non-white-space characters that do not lexically
 match the other preprocessing token categories. If a `'` or a `"`
 character matches the last category, the behavior is undefined.
 Preprocessing tokens can be separated by white space; this consists of
+comments [[lex.comment]], or white-space characters (space, horizontal
+tab, new-line, vertical tab, and form-feed), or both. As described in
+[[cpp]], in certain circumstances during translation phase 4, white
+space (or the absence thereof) serves as more than preprocessing token
+separation. White space can appear within a preprocessing token only as
+part of a header name or between the quotation characters in a character
+literal or string literal.
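One familiar case where white space does more than separate tokens in phase 4 is the difference between a function-like and an object-like macro definition. The sketch below is illustrative only; the macro names `TWICE` and `PAREN` and the helper functions are hypothetical and do not come from the standard text.

```cpp
// Sketch: white space between a macro name and '(' changes the kind of macro.
// TWICE, PAREN, and the helper functions are illustrative names only.

#define TWICE(n) ((n) + (n))  // '(' immediately follows the name: function-like macro
#define PAREN (7)             // white space before '(': object-like macro expanding to (7)

constexpr int twice_of_four() { return TWICE(4); }  // expands to ((4) + (4))
constexpr int paren_value()   { return PAREN; }     // expands to (7)

static_assert(twice_of_four() == 8, "function-like macro applied to 4");
static_assert(paren_value() == 7, "object-like macro whose replacement list is (7)");
```

If the space in `#define PAREN (7)` were removed, the directive would instead try to define a function-like macro with the parameter list `(7)`, which is ill-formed.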
 If the input stream has been parsed into preprocessing tokens up to a
 given character:
 - If the next character begins a sequence of characters that could be
   preprocessing token by itself and not as the first character of the
   alternative token `<:`.
 - Otherwise, the next preprocessing token is the longest sequence of
   characters that could constitute a preprocessing token, even if that
   would cause further lexical analysis to fail, except that a
+  *header-name* [[lex.header]] is only formed
+  - after the `include` or `import` preprocessing token in an `#include`
+    [[cpp.include]] or `import` [[cpp.import]] directive, or
+  - within a *has-include-expression*.
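The contexts listed above can be illustrated with a small sketch. It assumes a C++17 or later implementation that provides `__has_include`; the macro `HAVE_VECTOR` and the function names are hypothetical and used only for illustration.

```cpp
// Sketch: where a '<...>' sequence is and is not lexed as a header-name.
// HAVE_VECTOR, between, and have_vector are illustrative names only.

#include <vector>   // after the include token: <vector> is one header-name

#if defined(__has_include)
#  if __has_include(<vector>)   // inside a has-include-expression: also a header-name
#    define HAVE_VECTOR 1
#  endif
#endif
#ifndef HAVE_VECTOR
#  define HAVE_VECTOR 0
#endif

// Everywhere else, '<' and '>' are ordinary operator tokens, even though the
// characters "< x && hi >" could textually resemble a header name.
constexpr bool between(int lo, int x, int hi) { return lo < x && hi > x; }

constexpr bool have_vector() { return HAVE_VECTOR != 0; }

static_assert(between(1, 2, 3), "2 lies strictly between 1 and 3");
```

Restricting *header-name* to these contexts is what lets an expression such as `lo < x && hi > x` tokenize as operators and identifiers instead of being consumed as a single token by the longest-match rule.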
 [*Example 1*:
 ``` cpp
 #define R "x"
 const char* s = R"y";           // ill-formed raw string, not "x" "y"
 ```
 — *end example*]
+The *import-keyword* is produced by processing an `import` directive
+[[cpp.import]], the *module-keyword* is produced by preprocessing a
+`module` directive [[cpp.module]], and the *export-keyword* is produced
+by preprocessing either of the previous two directives.
+[*Note 1*: None has any observable spelling. — *end note*]
 [*Example 2*: The program fragment `0xe+foo` is parsed as a
+preprocessing number token (one that is not a valid *integer-literal* or
+*floating-point-literal* token), even though a parse as three
+preprocessing tokens `0xe`, `+`, and `foo` might produce a valid
+expression (for example, if `foo` were a macro defined as `1`).
+Similarly, the program fragment `1E1` is parsed as a preprocessing
+number (one that is a valid *floating-point-literal* token), whether or
+not `E` is a macro name. — *end example*]
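A compilable variant of Example 2 might look like the following sketch. The macros `E` and `foo` mirror the names used in the example; the helper functions are hypothetical, and the macros are undefined again at the end so they cannot leak into later code.

```cpp
// Sketch: pp-number tokenization versus macro names.
// one_e_one and spaced are illustrative helper names.

#define E 5     // never expanded inside 1E1: the pp-number absorbs the E
#define foo 1

// 1E1 is a single pp-number and a valid floating-point-literal (ten).
constexpr double one_e_one() { return 1E1; }

// With white space, the text splits into three tokens: 0xe, +, foo.
constexpr int spaced() { return 0xe + foo; }  // 14 + 1

// Without white space, 0xe+foo is one pp-number that is not a valid literal:
// constexpr int bad = 0xe+foo;   // ill-formed if uncommented

#undef E
#undef foo

static_assert(one_e_one() == 10.0, "E was not macro-expanded inside 1E1");
static_assert(spaced() == 15, "0xe + foo is 14 + 1 when separated by white space");
```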
 [*Example 3*: The program fragment `x+++++y` is parsed as `x
 ++ ++ + y`, which, if `x` and `y` have integral types, violates a
 constraint on increment operators, even though the parse `x ++ + ++ y`
 might yield a correct expression. — *end example*]
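The same longest-match rule can be observed in a shorter, well-formed fragment. The sketch below uses `x+++y` instead of `x+++++y` so that it compiles; `MunchResult` and `munch` are hypothetical names introduced only for this illustration.

```cpp
// Sketch: longest-match tokenization of x+++y.
// MunchResult and munch are illustrative names only.

struct MunchResult {
    int sum;      // value of the expression x+++y
    int x_after;  // value of x after the expression has been evaluated
};

inline MunchResult munch() {
    int x = 1, y = 2;
    int sum = x+++y;        // tokens are x, ++, +, y: parsed as (x++) + y
    // int bad = x+++++y;   // tokens are x, ++, ++, +, y: ill-formed if uncommented
    return {sum, x};
}
```

Here `sum` is 3 and `x` ends up as 2, which shows that the increment was applied to `x` and not to `y`, as it would have been under the parse `x + (++y)`.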
