From Jason Turner

[lex.phases]

Files changed (1)
  1. tmp/tmpyaxjsfmd/{from.md → to.md} +45 -42
tmp/tmpyaxjsfmd/{from.md → to.md} RENAMED
@@ -5,27 +5,26 @@ following phases.[^1]
  1. Physical source file characters are mapped, in an
  *implementation-defined* manner, to the basic source character set
  (introducing new-line characters for end-of-line indicators) if
  necessary. The set of physical source file characters accepted is
- *implementation-defined*. Trigraph sequences ([[lex.trigraph]]) are
- replaced by corresponding single-character internal representations.
- Any source file character not in the basic source character set (
- [[lex.charset]]) is replaced by the universal-character-name that
- designates that character. (An implementation may use any internal
- encoding, so long as an actual extended character encountered in the
- source file, and the same extended character expressed in the source
- file as a universal-character-name (i.e., using the `\uXXXX`
- notation), are handled equivalently except where this replacement is
- reverted in a raw string literal.)
  2. Each instance of a backslash character (\\) immediately followed by a
  new-line character is deleted, splicing physical source lines to
  form logical source lines. Only the last backslash on any physical
  source line shall be eligible for being part of such a splice.
  Except for splices reverted in a raw string literal, if a splice
  results in a character sequence that matches the syntax of a
- universal-character-name, the behavior is undefined. A source file
  that is not empty and that does not end in a new-line character, or
  that ends in a new-line character immediately preceded by a
  backslash character before any such splicing takes place, shall be
  processed as if an additional new-line character were appended to
  the file.
@@ -35,53 +34,57 @@ following phases.[^1]
  token or in a partial comment.[^2] Each comment is replaced by one
  space character. New-line characters are retained. Whether each
  nonempty sequence of white-space characters other than new-line is
  retained or replaced by one space character is unspecified. The
  process of dividing a source file’s characters into preprocessing
- tokens is context-dependent. see the handling of `<` within a
- `#include` preprocessing directive.
  4. Preprocessing directives are executed, macro invocations are
  expanded, and `_Pragma` unary operator expressions are executed. If
  a character sequence that matches the syntax of a
- universal-character-name is produced by token concatenation (
  [[cpp.concat]]), the behavior is undefined. A `#include`
  preprocessing directive causes the named header or source file to be
  processed from phase 1 through phase 4, recursively. All
  preprocessing directives are then deleted.
  5. Each source character set member in a character literal or a string
  literal, as well as each escape sequence and
- universal-character-name in a character literal or a non-raw string
- literal, is converted to the corresponding member of the execution
- character set ([[lex.ccon]], [[lex.string]]); if there is no
- corresponding member, it is converted to an *implementation-defined*
- member other than the null (wide) character.[^3]
  6. Adjacent string literal tokens are concatenated.
  7. White-space characters separating tokens are no longer significant.
- Each preprocessing token is converted into a token. (
- [[lex.token]]). The resulting tokens are syntactically and
- semantically analyzed and translated as a translation unit. The
- process of analyzing and translating the tokens may occasionally
- result in one token being replaced by a sequence of other tokens (
- [[temp.names]]).Source files, translation units and translated
- translation units need not necessarily be stored as files, nor need
- there be any one-to-one correspondence between these entities and
- any external representation. The description is conceptual only, and
- does not specify any particular implementation.
  8. Translated translation units and instantiation units are combined as
- follows: Some or all of these may be supplied from a library. Each
- translated translation unit is examined to produce a list of
- required instantiations. This may include instantiations which have
- been explicitly requested ([[temp.explicit]]). The definitions of
- the required templates are located. It is *implementation-defined*
- whether the source of the translation units containing these
- definitions is required to be available. An implementation could
- encode sufficient information into the translated translation unit
- so as to ensure the source is not required here. All the required
- instantiations are performed to produce *instantiation units*. These
- are similar to translated translation units, but contain no
- references to uninstantiated templates and no template definitions.
- The program is ill-formed if any instantiation fails.
  9. All external entity references are resolved. Library components are
  linked to satisfy external references to entities not defined in the
  current translation. All such translator output is collected into a
  program image which contains information needed for execution in its
  execution environment.
 
  1. Physical source file characters are mapped, in an
  *implementation-defined* manner, to the basic source character set
  (introducing new-line characters for end-of-line indicators) if
  necessary. The set of physical source file characters accepted is
+ *implementation-defined*. Any source file character not in the basic
+ source character set ([[lex.charset]]) is replaced by the
+ *universal-character-name* that designates that character. An
+ implementation may use any internal encoding, so long as an actual
+ extended character encountered in the source file, and the same
+ extended character expressed in the source file as a
+ *universal-character-name* (e.g., using the `\uXXXX` notation), are
+ handled equivalently except where this replacement is reverted (
+ [[lex.pptoken]]) in a raw string literal.
  2. Each instance of a backslash character (\\) immediately followed by a
  new-line character is deleted, splicing physical source lines to
  form logical source lines. Only the last backslash on any physical
  source line shall be eligible for being part of such a splice.
  Except for splices reverted in a raw string literal, if a splice
  results in a character sequence that matches the syntax of a
+ *universal-character-name*, the behavior is undefined. A source file
  that is not empty and that does not end in a new-line character, or
  that ends in a new-line character immediately preceded by a
  backslash character before any such splicing takes place, shall be
  processed as if an additional new-line character were appended to
  the file.
 
  token or in a partial comment.[^2] Each comment is replaced by one
  space character. New-line characters are retained. Whether each
  nonempty sequence of white-space characters other than new-line is
  retained or replaced by one space character is unspecified. The
  process of dividing a source file’s characters into preprocessing
+ tokens is context-dependent. \[*Example 1*: see the handling of `<`
+ within a `#include` preprocessing directive. — *end example*]
  4. Preprocessing directives are executed, macro invocations are
  expanded, and `_Pragma` unary operator expressions are executed. If
  a character sequence that matches the syntax of a
+ *universal-character-name* is produced by token concatenation (
  [[cpp.concat]]), the behavior is undefined. A `#include`
  preprocessing directive causes the named header or source file to be
  processed from phase 1 through phase 4, recursively. All
  preprocessing directives are then deleted.
  5. Each source character set member in a character literal or a string
  literal, as well as each escape sequence and
+ *universal-character-name* in a character literal or a non-raw
+ string literal, is converted to the corresponding member of the
+ execution character set ([[lex.ccon]], [[lex.string]]); if there is
+ no corresponding member, it is converted to an
+ *implementation-defined* member other than the null (wide)
+ character.[^3]
  6. Adjacent string literal tokens are concatenated.
  7. White-space characters separating tokens are no longer significant.
+ Each preprocessing token is converted into a token ([[lex.token]]).
+ The resulting tokens are syntactically and semantically analyzed and
+ translated as a translation unit. \[*Note 1*: The process of
+ analyzing and translating the tokens may occasionally result in one
+ token being replaced by a sequence of other tokens (
+ [[temp.names]]). *end note*] \[*Note 2*: Source files,
+ translation units and translated translation units need not
+ necessarily be stored as files, nor need there be any one-to-one
+ correspondence between these entities and any external
+ representation. The description is conceptual only, and does not
+ specify any particular implementation. — *end note*]
  8. Translated translation units and instantiation units are combined as
+ follows: \[*Note 3*: Some or all of these may be supplied from a
+ library. *end note*] Each translated translation unit is examined
+ to produce a list of required instantiations. \[*Note 4*: This may
+ include instantiations which have been explicitly requested (
+ [[temp.explicit]]). *end note*] The definitions of the required
+ templates are located. It is *implementation-defined* whether the
+ source of the translation units containing these definitions is
+ required to be available. \[*Note 5*: An implementation could encode
+ sufficient information into the translated translation unit so as to
+ ensure the source is not required here. — *end note*] All the
+ required instantiations are performed to produce *instantiation
+ units*. \[*Note 6*: These are similar to translated translation
+ units, but contain no references to uninstantiated templates and no
+ template definitions. — *end note*] The program is ill-formed if
+ any instantiation fails.
  9. All external entity references are resolved. Library components are
  linked to satisfy external references to entities not defined in the
  current translation. All such translator output is collected into a
  program image which contains information needed for execution in its
  execution environment.