[lex.universal.char] - C++23 → Trunk


tmp/tmp6h9bn631/{from.md → to.md} +64 -0


	@@ -0,0 +1,64 @@

+### Universal character names <a id="lex.universal.char">[[lex.universal.char]]</a>
+``` bnf
+n-char:
+     any member of the translation character set except the U+007d (right curly bracket) or new-line character
+```
+``` bnf
+n-char-sequence:
+    n-char n-char-sequenceₒₚₜ
+```
+``` bnf
+named-universal-character:
+    '\N{' n-char-sequence '}'
+```
+``` bnf
+hex-quad:
+    hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
+```
+``` bnf
+simple-hexadecimal-digit-sequence:
+    hexadecimal-digit simple-hexadecimal-digit-sequenceₒₚₜ
+```
+``` bnf
+universal-character-name:
+    '\u' hex-quad
+    '\U' hex-quad hex-quad
+    '\u{' simple-hexadecimal-digit-sequence '}'
+    named-universal-character
+```
+The *universal-character-name* construct provides a way to name any
+element in the translation character set using just the basic character
+set. If a *universal-character-name* outside the *c-char-sequence*,
+*s-char-sequence*, or *r-char-sequence* of a *character-literal* or
+*string-literal* (in either case, including within a
+*user-defined-literal*) corresponds to a control character or to a
+character in the basic character set, the program is ill-formed.
+[*Note 1*: A sequence of characters resembling a
+*universal-character-name* in an *r-char-sequence* [[lex.string]] does
+not form a *universal-character-name*. — *end note*]
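
As an illustration of the rule and of Note 1, the following sketch compares an ordinary string literal with a raw string literal containing the same spelling. The variable names are hypothetical, and the checks assume an ASCII-compatible ordinary literal encoding. Outside a literal, for example in an identifier, a *universal-character-name* for U+0041 would make the program ill-formed because it names a basic character.

```cpp
// Inside an ordinary string literal, a universal-character-name may name a
// basic character: this literal holds 'A' plus the terminating null.
constexpr char ordinary_text[] = "\u0041";

// Inside a raw string literal the same spelling is not a
// universal-character-name: the six characters are kept as written.
constexpr char raw_text[] = R"(\u0041)";

// A universal-character-name naming a non-basic character, U+00E9.
constexpr char32_t e_acute = U'\u00E9';

static_assert(sizeof(ordinary_text) == 2, "one character plus the null terminator");
static_assert(ordinary_text[0] == 'A', "U+0041 is LATIN CAPITAL LETTER A");
static_assert(sizeof(raw_text) == 7, "six characters plus the null terminator");
static_assert(raw_text[0] == '\\', "the backslash is kept literally");
static_assert(e_acute == 0xE9, "U+00E9 has scalar value 0xE9");
```

The checks are `static_assert`s, so compiling the fragment is the test.
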
+A *universal-character-name* of the form `\u` *hex-quad*, `\U`
+*hex-quad* *hex-quad*, or `\u{simple-hexadecimal-digit-sequence}`
+designates the character in the translation character set whose Unicode
+scalar value is the hexadecimal number represented by the sequence of
+*hexadecimal-digit*s in the *universal-character-name*. The program is
+ill-formed if that number is not a Unicode scalar value.
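
The scalar-value rule can be sketched as a small compile-time check. The helper names `is_unicode_scalar_value` and `scalar_value_from_hex` are hypothetical and not part of any library. The sketch assumes that a Unicode scalar value is any code point up to U+10FFFF other than the surrogates U+D800 through U+DFFF, and that the letters `a` to `f` are contiguous in the execution character set. It models the rule only and is not how a compiler is required to implement it.

```cpp
#include <cstdint>

// Hypothetical helper: true when `value` is a Unicode scalar value, that is,
// at most 0x10FFFF and not a surrogate code point (0xD800 to 0xDFFF).
constexpr bool is_unicode_scalar_value(std::uint32_t value) {
    return value <= 0x10FFFF && !(value >= 0xD800 && value <= 0xDFFF);
}

// Hypothetical helper: reads a sequence of hexadecimal digits as the number
// it represents. Returns that number, or -1 when the text is empty, contains
// a non-hexadecimal character, or does not denote a Unicode scalar value.
constexpr std::int32_t scalar_value_from_hex(const char* digits) {
    if (*digits == '\0') return -1;
    std::uint32_t value = 0;
    for (; *digits != '\0'; ++digits) {
        const char c = *digits;
        std::uint32_t digit = 0;
        if (c >= '0' && c <= '9') digit = static_cast<std::uint32_t>(c - '0');
        else if (c >= 'a' && c <= 'f') digit = static_cast<std::uint32_t>(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') digit = static_cast<std::uint32_t>(c - 'A' + 10);
        else return -1;
        value = value * 16 + digit;
        if (value > 0x10FFFF) return -1;  // out of range; also prevents overflow
    }
    return is_unicode_scalar_value(value) ? static_cast<std::int32_t>(value) : -1;
}

// The hex-quad spellings designate the character with the given scalar value.
static_assert(U'\u00E9' == 0xE9, "four-digit form");
static_assert(U'\U0001F600' == 0x1F600, "eight-digit form");
static_assert(scalar_value_from_hex("1F600") == 0x1F600, "digits as in a braced form");
static_assert(scalar_value_from_hex("D800") == -1, "surrogates are not scalar values");
```

Leading zeros do not change the result, so `scalar_value_from_hex("0000E9")` yields the same value as `scalar_value_from_hex("E9")`.
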
+A *universal-character-name* that is a *named-universal-character*
+designates the corresponding character in the Unicode Standard (chapter
+4.8 Name) if the *n-char-sequence* is equal to its character name or to
+one of its character name aliases of type “control”, “correction”, or
+“alternate”; otherwise, the program is ill-formed.
+[*Note 2*: These aliases are listed in the Unicode Character Database’s
+`NameAliases.txt`. None of these names or aliases have leading or
+trailing spaces. — *end note*]
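
The matching rule for a *named-universal-character* can be sketched with a tiny lookup table. Everything here is hypothetical: the names `NameKind`, `NameEntry`, `kSampleNames`, and `lookup_named_character` are invented for illustration, and the table holds only a handful of entries believed to match the Unicode Character Database. The sketch assumes the comparison is an exact, case-sensitive match, and it treats aliases of type "abbreviation" as not accepted, since the text above lists only "control", "correction", and "alternate".

```cpp
#include <cstdint>

// Kinds of names an entry can carry; only Abbreviation is rejected below.
enum class NameKind { Name, Control, Correction, Alternate, Abbreviation };

struct NameEntry {
    const char* text;
    std::uint32_t code_point;
    NameKind kind;
};

// A small, illustrative excerpt; a real implementation covers every name.
constexpr NameEntry kSampleNames[] = {
    {"LATIN SMALL LETTER E WITH ACUTE", 0x00E9, NameKind::Name},
    {"LINE FEED", 0x000A, NameKind::Control},
    {"LF", 0x000A, NameKind::Abbreviation},
    {"LATIN CAPITAL LETTER OI", 0x01A2, NameKind::Name},
    {"LATIN CAPITAL LETTER GHA", 0x01A2, NameKind::Correction},
    {"ZERO WIDTH NO-BREAK SPACE", 0xFEFF, NameKind::Name},
    {"BYTE ORDER MARK", 0xFEFF, NameKind::Alternate},
    {"BOM", 0xFEFF, NameKind::Abbreviation},
};

// Exact comparison of two null-terminated strings, usable at compile time.
constexpr bool same_text(const char* a, const char* b) {
    while (*a != '\0' && *a == *b) { ++a; ++b; }
    return *a == *b;
}

// Returns the code point designated by the n-char-sequence, or -1 when the
// sequence matches no accepted name or alias (the ill-formed case).
constexpr std::int32_t lookup_named_character(const char* n_char_sequence) {
    for (const NameEntry& entry : kSampleNames) {
        if (entry.kind != NameKind::Abbreviation &&
            same_text(entry.text, n_char_sequence)) {
            return static_cast<std::int32_t>(entry.code_point);
        }
    }
    return -1;
}

static_assert(lookup_named_character("BYTE ORDER MARK") == 0xFEFF, "alternate alias");
static_assert(lookup_named_character("BOM") == -1, "abbreviations are not accepted");
static_assert(lookup_named_character(" LINE FEED") == -1, "no leading spaces");
```

Under this model, `\N{LATIN CAPITAL LETTER GHA}` and `\N{LATIN CAPITAL LETTER OI}` designate the same character, U+01A2, because the former is a correction alias of the latter.
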
