tmp/tmpi8vy37oi/{from.md → to.md}
RENAMED
|
@@ -1,93 +1,102 @@
|
|
| 1 |
## Phases of translation <a id="lex.phases">[[lex.phases]]</a>
|
| 2 |
|
| 3 |
The precedence among the syntax rules of translation is specified by the
|
| 4 |
following phases.[^1]
|
| 5 |
|
| 6 |
-
1.
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
*implementation-defined*
|
| 11 |
-
|
| 12 |
-
*
|
| 13 |
-
|
| 14 |
-
|
| 15 |
-
|
| 16 |
-
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
|
| 20 |
-
|
| 21 |
-
|
| 22 |
-
|
| 23 |
-
|
| 24 |
-
|
| 25 |
*universal-character-name*, the behavior is undefined. A source file
|
| 26 |
that is not empty and that does not end in a new-line character, or
|
| 27 |
-
that ends in a
|
| 28 |
-
|
| 29 |
-
processed as if an additional new-line character were appended to
|
| 30 |
-
the file.
|
| 31 |
3. The source file is decomposed into preprocessing tokens
|
| 32 |
-
[[lex.pptoken]] and sequences of
|
| 33 |
comments). A source file shall not end in a partial preprocessing
|
| 34 |
token or in a partial comment.[^2] Each comment is replaced by one
|
| 35 |
space character. New-line characters are retained. Whether each
|
| 36 |
-
nonempty sequence of
|
| 37 |
-
retained or replaced by one space character is unspecified.
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
|
| 41 |
4. Preprocessing directives are executed, macro invocations are
|
| 42 |
-
expanded, and `_Pragma` unary operator expressions are executed.
|
| 43 |
-
|
| 44 |
-
|
| 45 |
-
[[cpp.concat]], the behavior is undefined. A `#include`
|
| 46 |
-
preprocessing directive causes the named header or source file to be
|
| 47 |
-
processed from phase 1 through phase 4, recursively. All
|
| 48 |
preprocessing directives are then deleted.
|
| 49 |
-
5.
|
| 50 |
-
*
|
| 51 |
-
|
| 52 |
-
|
| 53 |
-
|
| 54 |
-
|
| 55 |
-
*implementation-defined* member other than the null (wide)
|
| 56 |
-
character.[^3]
|
| 57 |
-
6. Adjacent string literal tokens are concatenated.
|
| 58 |
-
7. White-space characters separating tokens are no longer significant.
|
| 59 |
Each preprocessing token is converted into a token [[lex.token]].
|
| 60 |
-
The resulting tokens
|
| 61 |
-
|
| 62 |
-
analyzing and translating the tokens
|
| 63 |
-
token being replaced by a sequence of
|
| 64 |
-
[[temp.names]]. — *end note*] It is
|
| 65 |
-
whether the sources for module units and
|
| 66 |
-
current translation unit has an interface
|
| 67 |
-
[[module.unit]], [[module.import]]
|
| 68 |
-
\[*Note
|
| 69 |
-
translation units need not necessarily be stored as
|
| 70 |
-
there be any one-to-one correspondence between these
|
| 71 |
-
any external representation. The description is
|
| 72 |
-
does not specify any particular
|
| 73 |
8. Translated translation units and instantiation units are combined as
|
| 74 |
-
follows: \[*Note
|
| 75 |
library. — *end note*] Each translated translation unit is examined
|
| 76 |
-
to produce a list of required instantiations. \[*Note
|
| 77 |
include instantiations which have been explicitly requested
|
| 78 |
[[temp.explicit]]. — *end note*] The definitions of the required
|
| 79 |
templates are located. It is *implementation-defined* whether the
|
| 80 |
source of the translation units containing these definitions is
|
| 81 |
-
required to be available. \[*Note
|
| 82 |
-
sufficient information into the translated translation
|
| 83 |
-
ensure the source is not required here. — *end note*]
|
| 84 |
-
required instantiations are performed to produce
|
| 85 |
-
units*. \[*Note
|
| 86 |
-
units, but contain no references to uninstantiated
|
| 87 |
-
template definitions. — *end note*] The program is
|
| 88 |
-
any instantiation fails.
|
| 89 |
9. All external entity references are resolved. Library components are
|
| 90 |
linked to satisfy external references to entities not defined in the
|
| 91 |
current translation. All such translator output is collected into a
|
| 92 |
program image which contains information needed for execution in its
|
| 93 |
execution environment.
|
| 1 |
## Phases of translation <a id="lex.phases">[[lex.phases]]</a>
|
| 2 |
|
| 3 |
The precedence among the syntax rules of translation is specified by the
|
| 4 |
following phases.[^1]
|
| 5 |
|
| 6 |
+
1. An implementation shall support input files that are a sequence of
|
| 7 |
+
UTF-8 code units (UTF-8 files). It may also support an
|
| 8 |
+
*implementation-defined* set of other kinds of input files, and, if
|
| 9 |
+
so, the kind of an input file is determined in an
|
| 10 |
+
*implementation-defined* manner that includes a means of designating
|
| 11 |
+
input files as UTF-8 files, independent of their content.
|
| 12 |
+
\[*Note 1*: In other words, recognizing the U+feff (byte order mark)
|
| 13 |
+
is not sufficient. — *end note*] If an input file is determined to
|
| 14 |
+
be a UTF-8 file, then it shall be a well-formed UTF-8 code unit
|
| 15 |
+
sequence and it is decoded to produce a sequence of Unicode scalar
|
| 16 |
+
values. A sequence of translation character set elements is then
|
| 17 |
+
formed by mapping each Unicode scalar value to the corresponding
|
| 18 |
+
translation character set element. In the resulting sequence, each
|
| 19 |
+
pair of characters in the input sequence consisting of
|
| 20 |
+
U+000d (carriage return) followed by U+000a (line feed), as well as
|
| 21 |
+
each U+000d (carriage return) not immediately followed by a
|
| 22 |
+
U+000a (line feed), is replaced by a single new-line character. For
|
| 23 |
+
any other kind of input file supported by the implementation,
|
| 24 |
+
characters are mapped, in an *implementation-defined* manner, to a
|
| 25 |
+
sequence of translation character set elements [[lex.charset]],
|
| 26 |
+
representing end-of-line indicators as new-line characters.
|
| 27 |
+
2. If the first translation character is U+feff (byte order mark), it
|
| 28 |
+
is deleted. Each sequence of a backslash character (\\) immediately
|
| 29 |
+
followed by zero or more whitespace characters other than new-line
|
| 30 |
+
followed by a new-line character is deleted, splicing physical
|
| 31 |
+
source lines to form logical source lines. Only the last backslash
|
| 32 |
+
on any physical source line shall be eligible for being part of such
|
| 33 |
+
a splice. Except for splices reverted in a raw string literal, if a
|
| 34 |
+
splice results in a character sequence that matches the syntax of a
|
| 35 |
*universal-character-name*, the behavior is undefined. A source file
|
| 36 |
that is not empty and that does not end in a new-line character, or
|
| 37 |
+
that ends in a splice, shall be processed as if an additional
|
| 38 |
+
new-line character were appended to the file.
|
| 39 |
3. The source file is decomposed into preprocessing tokens
|
| 40 |
+
[[lex.pptoken]] and sequences of whitespace characters (including
|
| 41 |
comments). A source file shall not end in a partial preprocessing
|
| 42 |
token or in a partial comment.[^2] Each comment is replaced by one
|
| 43 |
space character. New-line characters are retained. Whether each
|
| 44 |
+
nonempty sequence of whitespace characters other than new-line is
|
| 45 |
+
retained or replaced by one space character is unspecified. As
|
| 46 |
+
characters from the source file are consumed to form the next
|
| 47 |
+
preprocessing token (i.e., not being consumed as part of a comment
|
| 48 |
+
or other forms of whitespace), except when matching a
|
| 49 |
+
*c-char-sequence*, *s-char-sequence*, *r-char-sequence*,
|
| 50 |
+
*h-char-sequence*, or *q-char-sequence*, *universal-character-name*s
|
| 51 |
+
are recognized and replaced by the designated element of the
|
| 52 |
+
translation character set. The process of dividing a source file’s
|
| 53 |
+
characters into preprocessing tokens is context-dependent.
|
| 54 |
+
\[*Example 1*: See the handling of `<` within a `#include`
|
| 55 |
+
preprocessing directive. — *end example*]
|
| 56 |
4. Preprocessing directives are executed, macro invocations are
|
| 57 |
+
expanded, and `_Pragma` unary operator expressions are executed. A
|
| 58 |
+
`#include` preprocessing directive causes the named header or source
|
| 59 |
+
file to be processed from phase 1 through phase 4, recursively. All
|
| 60 |
preprocessing directives are then deleted.
|
| 61 |
+
5. For a sequence of two or more adjacent *string-literal* tokens, a
|
| 62 |
+
common *encoding-prefix* is determined as specified in
|
| 63 |
+
[[lex.string]]. Each such *string-literal* token is then considered
|
| 64 |
+
to have that common *encoding-prefix*.
|
| 65 |
+
6. Adjacent *string-literal* tokens are concatenated [[lex.string]].
|
| 66 |
+
7. Whitespace characters separating tokens are no longer significant.
|
| 67 |
Each preprocessing token is converted into a token [[lex.token]].
|
| 68 |
+
The resulting tokens constitute a *translation unit* and are
|
| 69 |
+
syntactically and semantically analyzed and translated.
|
| 70 |
+
\[*Note 2*: The process of analyzing and translating the tokens can
|
| 71 |
+
occasionally result in one token being replaced by a sequence of
|
| 72 |
+
other tokens [[temp.names]]. — *end note*] It is
|
| 73 |
+
*implementation-defined* whether the sources for module units and
|
| 74 |
+
header units on which the current translation unit has an interface
|
| 75 |
+
dependency [[module.unit]], [[module.import]] are required to be
|
| 76 |
+
available. \[*Note 3*: Source files, translation units and
|
| 77 |
+
translated translation units need not necessarily be stored as
|
| 78 |
+
files, nor need there be any one-to-one correspondence between these
|
| 79 |
+
entities and any external representation. The description is
|
| 80 |
+
conceptual only, and does not specify any particular
|
| 81 |
+
implementation. — *end note*]
|
| 82 |
8. Translated translation units and instantiation units are combined as
|
| 83 |
+
follows: \[*Note 4*: Some or all of these can be supplied from a
|
| 84 |
library. — *end note*] Each translated translation unit is examined
|
| 85 |
+
to produce a list of required instantiations. \[*Note 5*: This can
|
| 86 |
include instantiations which have been explicitly requested
|
| 87 |
[[temp.explicit]]. — *end note*] The definitions of the required
|
| 88 |
templates are located. It is *implementation-defined* whether the
|
| 89 |
source of the translation units containing these definitions is
|
| 90 |
+
required to be available. \[*Note 6*: An implementation can choose
|
| 91 |
+
to encode sufficient information into the translated translation
|
| 92 |
+
unit so as to ensure the source is not required here. — *end note*]
|
| 93 |
+
All the required instantiations are performed to produce
|
| 94 |
+
*instantiation units*. \[*Note 7*: These are similar to translated
|
| 95 |
+
translation units, but contain no references to uninstantiated
|
| 96 |
+
templates and no template definitions. — *end note*] The program is
|
| 97 |
+
ill-formed if any instantiation fails.
|
| 98 |
9. All external entity references are resolved. Library components are
|
| 99 |
linked to satisfy external references to entities not defined in the
|
| 100 |
current translation. All such translator output is collected into a
|
| 101 |
program image which contains information needed for execution in its
|
| 102 |
execution environment.
|
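
As a rough illustration of the revised wording for phases 1 and 2, the following C++17 sketch models the new-line normalization and line splicing rules on a plain `char` sequence. It is not how any particular implementation works. The names `normalize_newlines`, `splice_lines`, `is_blank`, and `self_check` are hypothetical, and `char` stands in for translation character set elements. The sketch assumes that "whitespace characters other than new-line" means space, horizontal tab, vertical tab, and form feed. It leaves out UTF-8 decoding, byte order mark removal, and the reverting of splices inside raw string literals.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Phase 1 (UTF-8 case): each CR LF pair and each lone CR becomes one new-line.
std::string normalize_newlines(std::string_view in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\r') {
            if (i + 1 < in.size() && in[i + 1] == '\n') ++i;  // consume the LF of a CR LF pair
            out += '\n';
        } else {
            out += in[i];
        }
    }
    return out;
}

// Assumption: these four characters are the non-new-line whitespace characters.
bool is_blank(char c) {
    return c == ' ' || c == '\t' || c == '\v' || c == '\f';
}

// Phase 2: delete "backslash, optional whitespace, new-line" sequences, then
// append a new-line if the non-empty input lacks one or ends in a splice.
std::string splice_lines(std::string_view in) {
    std::string out;
    bool ends_in_splice = false;
    std::size_t pos = 0;
    while (pos < in.size()) {
        const std::size_t nl = in.find('\n', pos);
        const bool has_nl = nl != std::string_view::npos;
        const std::string_view line =
            in.substr(pos, has_nl ? nl - pos : std::string_view::npos);
        pos = has_nl ? nl + 1 : in.size();

        // Skip trailing whitespace; only the last backslash on the line is eligible.
        std::size_t end = line.size();
        while (end > 0 && is_blank(line[end - 1])) --end;

        if (has_nl && end > 0 && line[end - 1] == '\\') {
            out.append(line.substr(0, end - 1));  // drop backslash, whitespace, new-line
            ends_in_splice = true;
        } else {
            out.append(line);
            if (has_nl) out += '\n';
            ends_in_splice = false;
        }
    }
    if (!in.empty() && (ends_in_splice || out.empty() || out.back() != '\n')) {
        out += '\n';
    }
    return out;
}

// Small checks of the rules modeled above.
void self_check() {
    assert(normalize_newlines("a\r\nb\rc\n") == "a\nb\nc\n");
    assert(splice_lines("ab\\ \t\ncd") == "abcd\n");  // trailing whitespace after the backslash is allowed
    assert(splice_lines("x\\\n") == "x\n");           // file ending in a splice gains a new-line
}
```

Calling `self_check()` from a `main` function exercises the three cases. The second case reflects the change visible in this diff: a backslash followed by whitespace and then a new-line still forms a splice.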