tmp/tmpdd98eubv/{from.md → to.md}
RENAMED
@@ -0,0 +1,50 @@
#### General <a id="text.encoding.general">[[text.encoding.general]]</a>

A *registered character encoding* is a character encoding scheme in the
IANA Character Sets registry.

[*Note 1*: The IANA Character Sets registry uses the term “character
sets” to refer to character encodings. — *end note*]

The primary name of a registered character encoding is the name of that
encoding specified in the IANA Character Sets registry.

The set of known registered character encodings contains every
registered character encoding specified in the IANA Character Sets
registry except for the following:

- NATS-DANO (33)
- NATS-DANO-ADD (34)

Each known registered character encoding is identified by an enumerator
in `text_encoding::id`, and has a set of zero or more *aliases*.

The set of aliases of a known registered character encoding is an
*implementation-defined* superset of the aliases specified in the IANA
Character Sets registry. The set of aliases for US-ASCII includes
“ASCII”. No two aliases or primary names of distinct registered
character encodings are equivalent when compared by
`text_encoding::comp-name`.

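As a non-normative illustration of how a primary name and its aliases can resolve to one encoding, the sketch below assumes a loose comparison in the style of Unicode Technical Standard #22: ignore case and non-alphanumeric characters, and skip the leading zeros of a number. The actual `text_encoding::comp-name` is specified elsewhere and may differ in detail. The names `normalize`, `names_match`, `kNames`, and `find_mib` are hypothetical, and the table holds only a few IANA entries (MIBenum 3, 4, and 106).

```cpp
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

// Assumed loose normalization: keep only letters and digits, fold case,
// and skip each '0' that is not preceded by a digit run starting with 1-9.
std::string normalize(std::string_view name) {
    std::string out;
    bool after_nonzero_digit = false;
    for (unsigned char c : name) {
        if (std::isdigit(c)) {
            if (c == '0' && !after_nonzero_digit) continue;  // leading zero
            after_nonzero_digit = true;
            out.push_back(static_cast<char>(c));
        } else if (std::isalpha(c)) {
            after_nonzero_digit = false;
            out.push_back(static_cast<char>(std::tolower(c)));
        }
        // Any other character is ignored and leaves the digit state as is.
    }
    return out;
}

// Hypothetical stand-in for the comparison used on names and aliases.
bool names_match(std::string_view a, std::string_view b) {
    return normalize(a) == normalize(b);
}

// A tiny illustrative table: each row maps a primary name or an alias
// to the MIBenum value of its encoding.
struct NameEntry {
    std::string_view name;
    int mib;
};

constexpr NameEntry kNames[] = {
    {"US-ASCII", 3},        {"ASCII", 3},      {"ANSI_X3.4-1968", 3},
    {"ISO_8859-1:1987", 4}, {"ISO-8859-1", 4}, {"latin1", 4},
    {"UTF-8", 106},         {"csUTF8", 106},
};

// Returns the MIBenum of the first entry whose name matches, if any.
std::optional<int> find_mib(std::string_view name) {
    for (const NameEntry& e : kNames) {
        if (names_match(e.name, name)) return e.mib;
    }
    return std::nullopt;
}
```

With this table, `find_mib("ascii")` and `find_mib("ANSI_X3.4-1968")` both yield 3, while `names_match("ISO-8859-1", "ISO-8859-10")` is `false` because the trailing zero follows a nonzero digit and is therefore significant.
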
How a `text_encoding` object is determined to be representative of a
character encoding scheme implemented in the translation or execution
environment is *implementation-defined*.

An object `e` of type `text_encoding` such that
`e.mib() == text_encoding::id::unknown` is `false` and
`e.mib() == text_encoding::id::other` is `false` maintains the following
invariants:

- `*e.name() == '\0'` is `false`, and
- `e.mib() == text_encoding(e.name()).mib()` is `true`.

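The two invariants can be checked mechanically. The sketch below is illustrative only and assumes a C++26 standard library that provides `<text_encoding>` and defines the `__cpp_lib_text_encoding` feature-test macro; on older toolchains the guarded part is skipped. The helper names `invariants_hold` and `sample_encodings_ok` are hypothetical.

```cpp
#if __has_include(<version>)
#include <version>  // defines __cpp_lib_text_encoding when available
#endif

#if defined(__cpp_lib_text_encoding)
#include <text_encoding>

// True when `e` satisfies both invariants, or when they do not apply
// because e.mib() is `unknown` or `other`.
bool invariants_hold(const std::text_encoding& e) {
    using id = std::text_encoding::id;
    if (e.mib() == id::unknown || e.mib() == id::other) return true;
    // The name is non-empty, and constructing from it gives the same MIB.
    return *e.name() != '\0' &&
           e.mib() == std::text_encoding(e.name()).mib();
}
#endif

// Checks a few sample objects. Returns true trivially when <text_encoding>
// is unavailable, so the sketch still compiles on older toolchains.
bool sample_encodings_ok() {
#if defined(__cpp_lib_text_encoding)
    return invariants_hold(std::text_encoding(std::text_encoding::id::UTF8)) &&
           invariants_hold(std::text_encoding("US-ASCII")) &&
           invariants_hold(std::text_encoding{});
#else
    return true;
#endif
}
```

A default-constructed `text_encoding` has `mib()` equal to `id::unknown`, so the invariants do not apply to it and the helper accepts it without inspecting the name.
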
*Recommended practice:*

- Implementations should not consider registered encodings to be
  interchangeable. \[*Example 1*: Shift_JIS and Windows-31J denote
  different encodings. — *end example*]
- Implementations should not use the name of a registered encoding to
  describe another similar yet different non-registered encoding unless
  there is a precedent on that implementation.
  \[*Example 2*: Big5 — *end example*]