From Jason Turner

[re]

Files changed (1)
  1. tmp/tmp5vt3uzwg/{from.md → to.md} +85 -159
tmp/tmp5vt3uzwg/{from.md → to.md} RENAMED
@@ -1,11 +1,11 @@
1
- # Regular expressions library <a id="re">[[re]]</a>
2
 
3
- ## General <a id="re.general">[[re.general]]</a>
4
 
5
- This Clause describes components that C++ programs may use to perform
6
- operations involving regular expression matching and searching.
7
 
8
  The following subclauses describe a basic regular expression class
9
  template and its traits that can handle char-like [[strings.general]]
10
  template arguments, two specializations of this class template that
11
  handle sequences of `char` and `wchar_t`, a class template that holds
@@ -28,11 +28,14 @@ summarized in [[re.summary]].
28
  | [[re.alg]] | Algorithms | |
29
  | [[re.iter]] | Iterators | |
30
  | [[re.grammar]] | Grammar | |
31
 
32
 
33
- ## Requirements <a id="re.req">[[re.req]]</a>
 
 
 
34
 
35
  This subclause defines requirements on classes representing regular
36
  expression traits.
37
 
38
  [*Note 1*: The class template `regex_traits`, defined in [[re.traits]],
@@ -216,11 +219,11 @@ v.getloc()
216
  [*Note 2*: Class template `regex_traits` meets the requirements for a
217
  regular expression traits class when it is specialized for `char` or
218
  `wchar_t`. This class template is described in the header `<regex>`, and
219
  is described in [[re.traits]]. — *end note*]
220
 
221
- ## Header `<regex>` synopsis <a id="re.syn">[[re.syn]]</a>
222
 
223
  ``` cpp
224
  #include <compare> // see [compare.syn]
225
  #include <initializer_list> // see [initializer.list.syn]
226
 
@@ -457,20 +460,20 @@ namespace std {
457
  using wsmatch = match_results<wstring::const_iterator>;
458
  }
459
  }
460
  ```
461
 
462
- ## Namespace `std::regex_constants` <a id="re.const">[[re.const]]</a>
463
 
464
- ### General <a id="re.const.general">[[re.const.general]]</a>
465
 
466
  The namespace `std::regex_constants` holds symbolic constants used by
467
  the regular expression library. This namespace provides three types,
468
  `syntax_option_type`, `match_flag_type`, and `error_type`, along with
469
  several constants of these types.
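
As an illustration only (not part of the specification text), the sketch below uses one constant from each of the three types. The helper names `matches_ignoring_case`, `anchored_a_found`, and `compile_error`, and the patterns passed to them, are hypothetical.

```cpp
#include <optional>
#include <regex>
#include <string>

namespace rc = std::regex_constants;

// syntax_option_type: icase makes matching ignore case.
bool matches_ignoring_case(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern, rc::ECMAScript | rc::icase);
    return std::regex_match(text, re);
}

// match_flag_type: match_not_bol stops '^' from matching at the start
// of the searched sequence.
bool anchored_a_found(const std::string& text, rc::match_flag_type flags) {
    static const std::regex re("^a");
    return std::regex_search(text, re, flags);
}

// error_type: regex_error::code() reports why a pattern was rejected.
// Returns an empty optional when the pattern compiles.
std::optional<rc::error_type> compile_error(const std::string& pattern) {
    try {
        std::regex re(pattern);
    } catch (const std::regex_error& e) {
        return e.code();
    }
    return std::nullopt;
}
```

The exact `error_type` value reported for a given malformed pattern can differ between implementations, so the sketch only distinguishes "compiled" from "rejected".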
470
 
471
- ### Bitmask type `syntax_option_type` <a id="re.synopt">[[re.synopt]]</a>
472
 
473
  ``` cpp
474
  namespace std::regex_constants {
475
  using syntax_option_type = T1;
476
  inline constexpr syntax_option_type icase = unspecified;
@@ -500,20 +503,20 @@ grammar is `ECMAScript`.
500
  | -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
501
  | % `icase` | Specifies that matching of regular expressions against a character container sequence shall be performed without regard to case. \indexlibrarymember{syntax_option_type}{icase}% |
502
  | % `nosubs` | Specifies that no sub-expressions shall be considered to be marked, so that when a regular expression is matched against a character container sequence, no sub-expression matches shall be stored in the supplied `match_results` object. \indexlibrarymember{syntax_option_type}{nosubs}% |
503
  | % `optimize` | Specifies that the regular expression engine should pay more attention to the speed with which regular expressions are matched, and less to the speed with which regular expression objects are constructed. Otherwise it has no detectable effect on the program output. \indexlibrarymember{syntax_option_type}{optimize}% |
504
  | % `collate` | Specifies that character ranges of the form `"[a-b]"` shall be locale sensitive.% \indexlibrarymember{syntax_option_type}{collate}% \indextext{locale}% |
505
- | % `ECMAScript` | Specifies that the grammar recognized by the regular expression engine shall be that used by ECMAScript in ECMA-262, as modified in~ [[re.grammar]]. \xref ECMA-262 15.10 \indextext{ECMAScript}% \indexlibrarymember{syntax_option_type}{ECMAScript}% |
506
- | % `basic` | Specifies that the grammar recognized by the regular expression engine shall be that used by basic regular expressions in POSIX. \xref POSIX, Base Definitions and Headers, Section 9.3 \indextext{POSIX!regular expressions}% \indexlibrarymember{syntax_option_type}{basic}% |
507
- | % `extended` | Specifies that the grammar recognized by the regular expression engine shall be that used by extended regular expressions in POSIX. \xref POSIX, Base Definitions and Headers, Section 9.4 \indextext{POSIX!extended regular expressions}% \indexlibrarymember{syntax_option_type}{extended}% |
508
  | % `awk` | Specifies that the grammar recognized by the regular expression engine shall be that used by the utility awk in POSIX. \indexlibrarymember{syntax_option_type}{awk}% |
509
  | % `grep` | Specifies that the grammar recognized by the regular expression engine shall be that used by the utility grep in POSIX. \indexlibrarymember{syntax_option_type}{grep}% |
510
  | % `egrep` | Specifies that the grammar recognized by the regular expression engine shall be that used by the utility grep when given the -E option in POSIX. \indexlibrarymember{syntax_option_type}{egrep}% |
511
  | % `multiline` | Specifies that `^` shall match the beginning of a line and `$` shall match the end of a line, if the `ECMAScript` engine is selected. \indexlibrarymember{syntax_option_type}{multiline}% |
512
 
513
 
514
- ### Bitmask type `match_flag_type` <a id="re.matchflag">[[re.matchflag]]</a>
515
 
516
  ``` cpp
517
  namespace std::regex_constants {
518
  using match_flag_type = T2;
519
  inline constexpr match_flag_type match_default = {};
@@ -539,12 +542,11 @@ The type `match_flag_type` is an *implementation-defined* bitmask type
539
  Matching a regular expression against a sequence of characters
540
  \[`first`, `last`) proceeds according to the rules of the grammar
541
  specified for the regular expression object, modified according to the
542
  effects listed in [[re.matchflag]] for any bitmask elements set.
543
 
544
- **Table: `regex_constants::match_flag_type` effects when obtaining a match against a
545
- character container sequence {[}`first`, `last`{)}.** <a id="re.matchflag">[re.matchflag]</a>
546
 
547
  | Element | Effect(s) if set |
548
  | ------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
549
  | % \indexlibraryglobal{match_not_bol}% `match_not_bol` | The first character in the sequence {[}`first`, `last`{)} shall be treated as though it is not at the beginning of a line, so the character \verb|^| in the regular expression shall not match {[}`first`, `first`{)}. |
550
  | % \indexlibraryglobal{match_not_eol}% `match_not_eol` | The last character in the sequence {[}`first`, `last`{)} shall be treated as though it is not at the end of a line, so the character \verb|"$"| in the regular expression shall not match {[}`last`, `last`{)}. |
@@ -558,11 +560,11 @@ effects listed in [[re.matchflag]] for any bitmask elements set.
558
  | % \indexlibraryglobal{format_sed}% `format_sed` | When a regular expression match is to be replaced by a new string, the new string shall be constructed using the rules used by the sed utility in POSIX. |
559
  | % \indexlibraryglobal{format_no_copy}% `format_no_copy` | During a search and replace operation, sections of the character container sequence being searched that do not match the regular expression shall not be copied to the output string. |
560
  | % \indexlibraryglobal{format_first_only}% `format_first_only` | When specified during a search and replace operation, only the first occurrence of the regular expression shall be replaced. |
561
 
562
 
563
- ### Implementation-defined `error_type` <a id="re.err">[[re.err]]</a>
564
 
565
  ``` cpp
566
  namespace std::regex_constants {
567
  using error_type = T3;
568
  inline constexpr error_type error_collate = unspecified;
@@ -593,20 +595,20 @@ conditions described in [[re.err]]:
593
  | % `error_ctype` | The expression contains an invalid character class name. |
594
  | % `error_escape` | The expression contains an invalid escaped character, or a trailing escape. |
595
  | % `error_backref` | The expression contains an invalid back reference. |
596
  | % `error_brack` | The expression contains mismatched \verb|[| and \verb|]|. |
597
  | % `error_paren` | The expression contains mismatched \verb|(| and \verb|)|. |
598
- | % `error_brace` | The expression contains mismatched \verb|{| and \verb|}| |
599
  | % `error_badbrace` | The expression contains an invalid range in a \verb|{}| expression. |
600
  | % `error_range` | The expression contains an invalid character range, such as \verb|[b-a]| in most encodings. |
601
  | % `error_space` | There is insufficient memory to convert the expression into a finite state machine. |
602
  | % `error_badrepeat` | One of \verb|*?+{| is not preceded by a valid regular expression. |
603
  | % `error_complexity` | The complexity of an attempted match against a regular expression exceeds a pre-set level. |
604
  | % `error_stack` | There is insufficient memory to determine whether the regular expression matches the specified character sequence. |
605
 
606
 
607
- ## Class `regex_error` <a id="re.badexp">[[re.badexp]]</a>
608
 
609
  ``` cpp
610
  namespace std {
611
  class regex_error : public runtime_error {
612
  public:
@@ -629,11 +631,11 @@ regex_error(regex_constants::error_type ecode);
629
  regex_constants::error_type code() const;
630
  ```
631
 
632
  *Returns:* The error code that was passed to the constructor.
633
 
634
- ## Class template `regex_traits` <a id="re.traits">[[re.traits]]</a>
635
 
636
  ``` cpp
637
  namespace std {
638
  template<class charT>
639
  struct regex_traits {
@@ -713,11 +715,11 @@ template<class ForwardIterator>
713
  ```
714
 
715
  *Effects:* If
716
 
717
  ``` cpp
718
- typeid(use_facet<collate<charT>>) == typeid(collate_byname<charT>)
719
  ```
720
 
721
  and the form of the sort key returned by
722
  `collate_byname<charT>::transform(first, last)` is known and can be
723
  converted into a primary sort key then returns that key, otherwise
@@ -742,11 +744,11 @@ template<class ForwardIterator>
742
  *Returns:* An unspecified value that represents the character
743
  classification named by the character sequence designated by the
744
  iterator range \[`first`, `last`). If the parameter `icase` is `true`
745
  then the returned mask identifies the character classification without
746
  regard to the case of the characters being matched, otherwise it does
747
- honor the case of the characters being matched.[^1]
748
 
749
  The value returned shall be independent of the case of the characters in
750
  the character sequence. If the name is not recognized then returns
751
  `char_class_type()`.
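
A minimal sketch of how `lookup_classname` pairs with `isctype`, assuming the `char` specialization of `regex_traits` and the default `icase = false`. The helper name `is_in_class` is hypothetical.

```cpp
#include <regex>
#include <string>

// Looks up a character class by name, then tests one character against it.
// An unrecognized name yields char_class_type(), which no character satisfies.
bool is_in_class(char c, const std::string& class_name) {
    const std::regex_traits<char> traits;
    const auto mask = traits.lookup_classname(class_name.begin(), class_name.end());
    return traits.isctype(c, mask);
}
```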
752
 
@@ -828,11 +830,11 @@ the character `ch` is a valid digit in base `radix`; otherwise returns
828
 
829
  ``` cpp
830
  locale_type imbue(locale_type loc);
831
  ```
832
 
833
- *Effects:* Imbues `this` with a copy of the locale `loc`.
834
 
835
  [*Note 1*: Calling `imbue` with a different locale than the one
836
  currently in use invalidates all cached data held by
837
  `*this`. — *end note*]
838
 
@@ -869,13 +871,13 @@ the last argument passed to `imbue`.
869
  | `"upper"` | `L"upper"` | `ctype_base::upper` |
870
  | `"w"` | `L"w"` | `ctype_base::alnum` |
871
  | `"xdigit"` | `L"xdigit"` | `ctype_base::xdigit` |
872
 
873
 
874
- ## Class template `basic_regex` <a id="re.regex">[[re.regex]]</a>
875
 
876
- ### General <a id="re.regex.general">[[re.regex.general]]</a>
877
 
878
  For a char-like type `charT`, specializations of class template
879
  `basic_regex` represent regular expressions constructed from character
880
  sequences of `charT` characters. In the rest of  [[re.regex]], `charT`
881
  denotes a given char-like type. Storage for a regular expression is
@@ -900,13 +902,13 @@ namespace std {
900
  class basic_regex {
901
  public:
902
  // types
903
  using value_type = charT;
904
  using traits_type = traits;
905
- using string_type = typename traits::string_type;
906
  using flag_type = regex_constants::syntax_option_type;
907
- using locale_type = typename traits::locale_type;
908
 
909
  // [re.synopt], constants
910
  static constexpr flag_type icase = regex_constants::icase;
911
  static constexpr flag_type nosubs = regex_constants::nosubs;
912
  static constexpr flag_type optimize = regex_constants::optimize;
@@ -973,11 +975,11 @@ namespace std {
973
  regex_constants::syntax_option_type = regex_constants::ECMAScript)
974
  -> basic_regex<typename iterator_traits<ForwardIterator>::value_type>;
975
  }
976
  ```
977
 
978
- ### Constructors <a id="re.regex.construct">[[re.regex.construct]]</a>
979
 
980
  ``` cpp
981
  basic_regex();
982
  ```
983
 
@@ -1067,11 +1069,11 @@ valid regular expression.
1067
  basic_regex(initializer_list<charT> il, flag_type f = regex_constants::ECMAScript);
1068
  ```
1069
 
1070
  *Effects:* Same as `basic_regex(il.begin(), il.end(), f)`.
1071
 
1072
- ### Assignment <a id="re.regex.assign">[[re.regex.assign]]</a>
1073
 
1074
  ``` cpp
1075
  basic_regex& operator=(const basic_regex& e);
1076
  ```
1077
 
@@ -1160,11 +1162,11 @@ basic_regex& assign(initializer_list<charT> il,
1160
  flag_type f = regex_constants::ECMAScript);
1161
  ```
1162
 
1163
  *Effects:* Equivalent to: `return assign(il.begin(), il.end(), f);`
1164
 
1165
- ### Constant operations <a id="re.regex.operations">[[re.regex.operations]]</a>
1166
 
1167
  ``` cpp
1168
  unsigned mark_count() const;
1169
  ```
1170
 
@@ -1176,11 +1178,11 @@ flag_type flags() const;
1176
  ```
1177
 
1178
  *Effects:* Returns a copy of the regular expression syntax flags that
1179
  were passed to the object’s constructor or to the last call to `assign`.
1180
 
1181
- ### Locale <a id="re.regex.locale">[[re.regex.locale]]</a>
1182
 
1183
  ``` cpp
1184
  locale_type imbue(locale_type loc);
1185
  ```
1186
 
@@ -1195,11 +1197,11 @@ locale_type getloc() const;
1195
 
1196
  *Effects:* Returns the result of `traits_inst.getloc()` where
1197
  `traits_inst` is a (default-initialized) instance of the template
1198
  parameter `traits` stored within the object.
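
The following sketch shows one way `imbue` and `getloc` might be used together, assuming the usual pattern of imbuing a default-constructed `basic_regex` before assigning a pattern to it. The helper name `compile_in_locale` is hypothetical.

```cpp
#include <locale>
#include <regex>
#include <string>

// Builds a regex whose traits object uses the given locale.
std::regex compile_in_locale(const std::string& pattern, const std::locale& loc) {
    std::regex re;
    re.imbue(loc);       // after imbue, re matches no character sequence
    re.assign(pattern);  // compile the pattern under the imbued locale
    return re;
}
```

Calling `getloc()` on the result should then return a locale equal to the one that was imbued.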
1199
 
1200
- ### Swap <a id="re.regex.swap">[[re.regex.swap]]</a>
1201
 
1202
  ``` cpp
1203
  void swap(basic_regex& e);
1204
  ```
1205
 
@@ -1208,35 +1210,33 @@ void swap(basic_regex& e);
1208
  *Ensures:* `*this` contains the regular expression that was in `e`, `e`
1209
  contains the regular expression that was in `*this`.
1210
 
1211
  *Complexity:* Constant time.
1212
 
1213
- ### Non-member functions <a id="re.regex.nonmemb">[[re.regex.nonmemb]]</a>
1214
 
1215
  ``` cpp
1216
  template<class charT, class traits>
1217
  void swap(basic_regex<charT, traits>& lhs, basic_regex<charT, traits>& rhs);
1218
  ```
1219
 
1220
  *Effects:* Calls `lhs.swap(rhs)`.
1221
 
1222
- ## Class template `sub_match` <a id="re.submatch">[[re.submatch]]</a>
1223
 
1224
- ### General <a id="re.submatch.general">[[re.submatch.general]]</a>
1225
 
1226
  Class template `sub_match` denotes the sequence of characters matched by
1227
  a particular marked sub-expression.
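
As a hedged illustration, each marked sub-expression of a successful match can be read back through a `sub_match` element of the `match_results` object. The helper `parse_key_value` and its pattern are hypothetical.

```cpp
#include <optional>
#include <regex>
#include <string>
#include <utility>

// Splits "key=value" into its two parts using two marked sub-expressions.
std::optional<std::pair<std::string, std::string>>
parse_key_value(const std::string& line) {
    static const std::regex kv("([A-Za-z_]+)=([^;]*)");
    std::smatch m;
    if (!std::regex_match(line, m, kv)) {
        return std::nullopt;
    }
    // m[1] and m[2] are sub_match objects; str() copies the matched characters.
    return std::make_pair(m[1].str(), m[2].str());
}
```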
1228
 
1229
  ``` cpp
1230
  namespace std {
1231
  template<class BidirectionalIterator>
1232
  class sub_match : public pair<BidirectionalIterator, BidirectionalIterator> {
1233
  public:
1234
- using value_type =
1235
- typename iterator_traits<BidirectionalIterator>::value_type;
1236
- using difference_type =
1237
- typename iterator_traits<BidirectionalIterator>::difference_type;
1238
  using iterator = BidirectionalIterator;
1239
  using string_type = basic_string<value_type>;
1240
 
1241
  bool matched;
1242
 
@@ -1253,11 +1253,11 @@ namespace std {
1253
  void swap(sub_match& s) noexcept(see below);
1254
  };
1255
  }
1256
  ```
1257
 
1258
- ### Members <a id="re.submatch.members">[[re.submatch.members]]</a>
1259
 
1260
  ``` cpp
1261
  constexpr sub_match();
1262
  ```
1263
 
@@ -1315,11 +1315,11 @@ std::swap(matched, s.matched);
1315
  ```
1316
 
1317
  *Remarks:* The exception specification is equivalent to
1318
  `is_nothrow_swappable_v<BidirectionalIterator>`.
1319
 
1320
- ### Non-member operators <a id="re.submatch.op">[[re.submatch.op]]</a>
1321
 
1322
  Let `SM-CAT(I)` be
1323
 
1324
  ``` cpp
1325
  compare_three_way_result_t<basic_string<typename iterator_traits<I>::value_type>>
@@ -1414,23 +1414,23 @@ template<class charT, class ST, class BiIter>
1414
  operator<<(basic_ostream<charT, ST>& os, const sub_match<BiIter>& m);
1415
  ```
1416
 
1417
  *Returns:* `os << m.str()`.
1418
 
1419
- ## Class template `match_results` <a id="re.results">[[re.results]]</a>
1420
 
1421
- ### General <a id="re.results.general">[[re.results.general]]</a>
1422
 
1423
  Class template `match_results` denotes a collection of character
1424
  sequences representing the result of a regular expression match. Storage
1425
  for the collection is allocated and freed as necessary by the member
1426
  functions of class template `match_results`.
1427
 
1428
  The class template `match_results` meets the requirements of an
1429
- allocator-aware container and of a sequence container
1430
- [[container.requirements.general]], [[sequence.reqmts]] except that only
1431
- copy assignment, move assignment, and operations defined for
1432
  const-qualified sequence containers are supported and that the semantics
1433
  of the comparison operator functions are different from those required
1434
  for a container.
1435
 
1436
  A default-constructed `match_results` object has no fully established
@@ -1462,16 +1462,14 @@ namespace std {
1462
  using value_type = sub_match<BidirectionalIterator>;
1463
  using const_reference = const value_type&;
1464
  using reference = value_type&;
1465
  using const_iterator = implementation-defined // type of match_results::const_iterator;
1466
  using iterator = const_iterator;
1467
- using difference_type =
1468
- typename iterator_traits<BidirectionalIterator>::difference_type;
1469
- using size_type = typename allocator_traits<Allocator>::size_type;
1470
  using allocator_type = Allocator;
1471
- using char_type =
1472
- typename iterator_traits<BidirectionalIterator>::value_type;
1473
  using string_type = basic_string<char_type>;
1474
 
1475
  // [re.results.const], construct/copy/destroy
1476
  match_results() : match_results(Allocator()) {}
1477
  explicit match_results(const Allocator& a);
@@ -1487,11 +1485,11 @@ namespace std {
1487
  bool ready() const;
1488
 
1489
  // [re.results.size], size
1490
  size_type size() const;
1491
  size_type max_size() const;
1492
- [[nodiscard]] bool empty() const;
1493
 
1494
  // [re.results.acc], element access
1495
  difference_type length(size_type sub = 0) const;
1496
  difference_type position(size_type sub = 0) const;
1497
  string_type str(size_type sub = 0) const;
@@ -1530,11 +1528,11 @@ namespace std {
1530
  void swap(match_results& that);
1531
  };
1532
  }
1533
  ```
1534
 
1535
- ### Constructors <a id="re.results.const">[[re.results.const]]</a>
1536
 
1537
  [[re.results.const]] lists the postconditions of `match_results`
1538
  copy/move constructors and copy/move assignment operators. For move
1539
  operations, the results of the expressions depending on the parameter
1540
  `m` denote the values they had before the respective function calls.
@@ -1596,20 +1594,20 @@ match_results& operator=(match_results&& m);
1596
  | `(*this)[n]` | `m[n]` for all non-negative integers `n < m.size()` |
1597
  | `length(n)` | `m.length(n)` for all non-negative integers `n < m.size()` |
1598
  | `position(n)` | `m.position(n)` for all non-negative integers `n < m.size()` |
1599
 
1600
 
1601
- ### State <a id="re.results.state">[[re.results.state]]</a>
1602
 
1603
  ``` cpp
1604
  bool ready() const;
1605
  ```
1606
 
1607
  *Returns:* `true` if `*this` has a fully established result state,
1608
  otherwise `false`.
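
A small sketch of how the result state changes across a search: a default-constructed `match_results` is not ready, and it becomes ready once an algorithm has filled it, whether or not a match was found. The `MatchState` struct and the `probe` function are hypothetical names.

```cpp
#include <cstddef>
#include <regex>
#include <string>

struct MatchState {
    bool ready_before;
    bool ready_after;
    bool empty_after;
    std::size_t size_after;
};

// Records ready()/empty()/size() around a single regex_search call.
MatchState probe(const std::string& text, const std::regex& re) {
    std::smatch m;
    MatchState s{};
    s.ready_before = m.ready();
    std::regex_search(text, m, re);
    s.ready_after = m.ready();
    s.empty_after = m.empty();
    s.size_after = m.size();
    return s;
}
```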
1609
 
1610
- ### Size <a id="re.results.size">[[re.results.size]]</a>
1611
 
1612
  ``` cpp
1613
  size_type size() const;
1614
  ```
1615
 
@@ -1628,16 +1626,16 @@ size_type max_size() const;
1628
 
1629
  *Returns:* The maximum number of `sub_match` elements that can be stored
1630
  in `*this`.
1631
 
1632
  ``` cpp
1633
- [[nodiscard]] bool empty() const;
1634
  ```
1635
 
1636
  *Returns:* `size() == 0`.
1637
 
1638
- ### Element access <a id="re.results.acc">[[re.results.acc]]</a>
1639
 
1640
  ``` cpp
1641
  difference_type length(size_type sub = 0) const;
1642
  ```
1643
 
@@ -1709,11 +1707,11 @@ const_iterator cend() const;
1709
  ```
1710
 
1711
  *Returns:* A terminating iterator that enumerates over all the
1712
  sub-expressions stored in `*this`.
1713
 
1714
- ### Formatting <a id="re.results.form">[[re.results.form]]</a>
1715
 
1716
  ``` cpp
1717
  template<class OutputIter>
1718
  OutputIter format(
1719
  OutputIter out,
@@ -1780,21 +1778,21 @@ calls:
1780
  format(back_inserter(result), fmt, fmt + char_traits<char_type>::length(fmt), flags);
1781
  ```
1782
 
1783
  *Returns:* `result`.
1784
 
1785
- ### Allocator <a id="re.results.all">[[re.results.all]]</a>
1786
 
1787
  ``` cpp
1788
  allocator_type get_allocator() const;
1789
  ```
1790
 
1791
  *Returns:* A copy of the Allocator that was passed to the object’s
1792
  constructor or, if that allocator has been replaced, a copy of the most
1793
  recent replacement.
1794
 
1795
- ### Swap <a id="re.results.swap">[[re.results.swap]]</a>
1796
 
1797
  ``` cpp
1798
  void swap(match_results& that);
1799
  ```
1800
 
@@ -1812,21 +1810,21 @@ template<class BidirectionalIterator, class Allocator>
1812
  match_results<BidirectionalIterator, Allocator>& m2);
1813
  ```
1814
 
1815
  *Effects:* As if by `m1.swap(m2)`.
1816
 
1817
- ### Non-member functions <a id="re.results.nonmember">[[re.results.nonmember]]</a>
1818
 
1819
  ``` cpp
1820
  template<class BidirectionalIterator, class Allocator>
1821
  bool operator==(const match_results<BidirectionalIterator, Allocator>& m1,
1822
  const match_results<BidirectionalIterator, Allocator>& m2);
1823
  ```
1824
 
1825
  *Returns:* `true` if neither match result is ready, `false` if one match
1826
  result is ready and the other is not. If both match results are ready,
1827
- returns `true` only if:
1828
 
1829
  - `m1.empty() && m2.empty()`, or
1830
  - `!m1.empty() && !m2.empty()`, and the following conditions are
1831
  satisfied:
1832
  - `m1.prefix() == m2.prefix()`,
@@ -1835,20 +1833,20 @@ returns `true` only if:
1835
  - `m1.suffix() == m2.suffix()`.
1836
 
1837
  [*Note 1*: The algorithm `equal` is defined in
1838
  [[algorithms]]. — *end note*]
1839
 
1840
- ## Regular expression algorithms <a id="re.alg">[[re.alg]]</a>
1841
 
1842
- ### Exceptions <a id="re.except">[[re.except]]</a>
1843
 
1844
  The algorithms described in subclause  [[re.alg]] may throw an exception
1845
  of type `regex_error`. If such an exception `e` is thrown, `e.code()`
1846
  shall return either `regex_constants::error_complexity` or
1847
  `regex_constants::error_stack`.
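
One possible way for a caller to account for this, sketched under the assumption that an abandoned match should be reported separately from "no match". The names `SearchOutcome` and `guarded_search` are hypothetical.

```cpp
#include <regex>
#include <string>

enum class SearchOutcome { found, not_found, gave_up };

// Wraps regex_search so that a regex_error thrown by the algorithm
// is reported as a distinct outcome instead of propagating.
SearchOutcome guarded_search(const std::string& text, const std::regex& re) {
    try {
        return std::regex_search(text, re) ? SearchOutcome::found
                                           : SearchOutcome::not_found;
    } catch (const std::regex_error&) {
        // Here code() is error_complexity or error_stack, per the text above.
        return SearchOutcome::gave_up;
    }
}
```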
1848
 
1849
- ### `regex_match` <a id="re.alg.match">[[re.alg.match]]</a>
1850
 
1851
  ``` cpp
1852
  template<class BidirectionalIterator, class Allocator, class charT, class traits>
1853
  bool regex_match(BidirectionalIterator first, BidirectionalIterator last,
1854
  match_results<BidirectionalIterator, Allocator>& m,
@@ -1942,22 +1940,22 @@ template<class charT, class traits>
1942
  const basic_regex<charT, traits>& e,
1943
  regex_constants::match_flag_type flags = regex_constants::match_default);
1944
  ```
1945
 
1946
  *Returns:*
1947
- `regex_match(str, str + char_traits<charT>::length(str), e, flags)`
1948
 
1949
  ``` cpp
1950
  template<class ST, class SA, class charT, class traits>
1951
  bool regex_match(const basic_string<charT, ST, SA>& s,
1952
  const basic_regex<charT, traits>& e,
1953
  regex_constants::match_flag_type flags = regex_constants::match_default);
1954
  ```
1955
 
1956
  *Returns:* `regex_match(s.begin(), s.end(), e, flags)`.
1957
 
1958
- ### `regex_search` <a id="re.alg.search">[[re.alg.search]]</a>
1959
 
1960
  ``` cpp
1961
  template<class BidirectionalIterator, class Allocator, class charT, class traits>
1962
  bool regex_search(BidirectionalIterator first, BidirectionalIterator last,
1963
  match_results<BidirectionalIterator, Allocator>& m,
@@ -2047,11 +2045,11 @@ template<class ST, class SA, class charT, class traits>
2047
  regex_constants::match_flag_type flags = regex_constants::match_default);
2048
  ```
2049
 
2050
  *Returns:* `regex_search(s.begin(), s.end(), e, flags)`.
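
For contrast, a brief sketch of the two `basic_string` overloads side by side: `regex_match` succeeds only if the whole string matches, while `regex_search` succeeds if some subsequence does. The helper names and the pattern are hypothetical.

```cpp
#include <regex>
#include <string>

// True only if every character of s is a digit (and s is non-empty).
bool is_all_digits(const std::string& s) {
    static const std::regex digits("[0-9]+");
    return std::regex_match(s, digits);
}

// True if s contains at least one digit anywhere.
bool contains_digits(const std::string& s) {
    static const std::regex digits("[0-9]+");
    return std::regex_search(s, digits);
}
```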
2051
 
2052
- ### `regex_replace` <a id="re.alg.replace">[[re.alg.replace]]</a>
2053
 
2054
  ``` cpp
2055
  template<class OutputIterator, class BidirectionalIterator,
2056
  class traits, class charT, class ST, class SA>
2057
  OutputIterator
@@ -2161,15 +2159,15 @@ template<class traits, class charT>
2161
  regex_replace(back_inserter(result), s, s + char_traits<charT>::length(s), e, fmt, flags);
2162
  ```
2163
 
2164
  *Returns:* `result`.
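
A minimal sketch of the string-returning form of `regex_replace`, showing the effect of `format_first_only`. The helper `mask_digits` and the replacement text `"#"` are hypothetical.

```cpp
#include <regex>
#include <string>

// Replaces runs of digits with '#'; optionally only the first run.
std::string mask_digits(const std::string& s, bool first_only) {
    static const std::regex digit_run("[0-9]+");
    const std::regex_constants::match_flag_type flags =
        first_only ? std::regex_constants::format_first_only
                   : std::regex_constants::format_default;
    return std::regex_replace(s, digit_run, "#", flags);
}
```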
2165
 
2166
- ## Regular expression iterators <a id="re.iter">[[re.iter]]</a>
2167
 
2168
- ### Class template `regex_iterator` <a id="re.regiter">[[re.regiter]]</a>
2169
 
2170
- #### General <a id="re.regiter.general">[[re.regiter.general]]</a>
2171
 
2172
  The class template `regex_iterator` is an iterator adaptor. It
2173
  represents a new view of an existing iterator sequence, by enumerating
2174
  all the occurrences of a regular expression within that sequence. A
2175
  `regex_iterator` uses `regex_search` to find successive regular
@@ -2180,11 +2178,11 @@ the iterator finds and stores a value of
2180
  reached (`regex_search` returns `false`), the iterator becomes equal to
2181
  the end-of-sequence iterator value. The default constructor constructs
2182
  an end-of-sequence iterator object, which is the only legitimate
2183
  iterator to be used for the end condition. The result of `operator*` on
2184
  an end-of-sequence iterator is not defined. For any other iterator value
2185
- a const `match_results<BidirectionalIterator>&` is returned. The result
2186
  of `operator->` on an end-of-sequence iterator is not defined. For any
2187
  other iterator value a `const match_results<BidirectionalIterator>*` is
2188
  returned. It is impossible to store things into `regex_iterator`s. Two
2189
  end-of-sequence iterators are always equal. An end-of-sequence iterator
2190
  is not equal to a non-end-of-sequence iterator. Two non-end-of-sequence
@@ -2237,11 +2235,11 @@ iterator holds a *zero-length match* if `match[0].matched == true` and
2237
 
2238
  [*Note 1*: For example, this can occur when the part of the regular
2239
  expression that matched consists only of an assertion (such as `'^'`,
2240
  `'$'`, `'\b'`, `'\B'`). — *end note*]
2241
 
2242
- #### Constructors <a id="re.regiter.cnstr">[[re.regiter.cnstr]]</a>
2243
 
2244
  ``` cpp
2245
  regex_iterator();
2246
  ```
2247
 
@@ -2256,11 +2254,11 @@ regex_iterator(BidirectionalIterator a, BidirectionalIterator b,
2256
  *Effects:* Initializes `begin` and `end` to `a` and `b`, respectively,
2257
  sets `pregex` to `addressof(re)`, sets `flags` to `m`, then calls
2258
  `regex_search(begin, end, match, *pregex, flags)`. If this call returns
2259
  `false` the constructor sets `*this` to the end-of-sequence iterator.
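
The usual iteration idiom that follows from this, as a sketch: a default-constructed iterator serves as the end-of-sequence value, and the loop stops once `regex_search` finds no further match. The helper `find_words` and its pattern are hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects every maximal run of ASCII letters in text.
std::vector<std::string> find_words(const std::string& text) {
    static const std::regex word("[A-Za-z]+");
    std::vector<std::string> out;
    const std::sregex_iterator end;  // default-constructed: end-of-sequence
    for (std::sregex_iterator it(text.begin(), text.end(), word); it != end; ++it) {
        out.push_back(it->str());  // *it is a match_results; str() is the whole match
    }
    return out;
}
```

Both the target string and the regex must outlive the iterator, since it stores iterators into the former and a pointer to the latter.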
2260
 
2261
- #### Comparisons <a id="re.regiter.comp">[[re.regiter.comp]]</a>
2262
 
2263
  ``` cpp
2264
  bool operator==(const regex_iterator& right) const;
2265
  ```
2266
 
@@ -2273,11 +2271,11 @@ iterators or if the following conditions all hold:
2273
  - `flags == right.flags`, and
2274
  - `match[0] == right.match[0]`;
2275
 
2276
  otherwise `false`.
2277
 
2278
- #### Indirection <a id="re.regiter.deref">[[re.regiter.deref]]</a>
2279
 
2280
  ``` cpp
2281
  const value_type& operator*() const;
2282
  ```
2283
 
@@ -2287,11 +2285,11 @@ const value_type& operator*() const;
2287
  const value_type* operator->() const;
2288
  ```
2289
 
2290
  *Returns:* `addressof(match)`.
2291
 
2292
- #### Increment <a id="re.regiter.incr">[[re.regiter.incr]]</a>
2293
 
2294
  ``` cpp
2295
  regex_iterator& operator++();
2296
  ```
2297
 
@@ -2321,12 +2319,12 @@ If the most recent match was not a zero-length match, the operator sets
2321
  `false` the iterator sets `*this` to the end-of-sequence iterator. The
2322
  iterator then returns `*this`.
2323
 
2324
  In all cases in which the call to `regex_search` returns `true`,
2325
  `match.prefix().first` shall be equal to the previous value of
2326
- `match[0].second`, and for each index `i` in the half-open range
2327
- `[0, match.size())` for which `match[i].matched` is `true`,
2328
  `match.position(i)` shall return `distance(begin, match[i].first)`.
2329
 
2330
  [*Note 1*: This means that `match.position(i)` gives the offset from
2331
  the beginning of the target sequence, which is often not the same as the
2332
  offset from the sequence passed in the call to
@@ -2348,13 +2346,13 @@ regex_iterator operator++(int);
2348
  regex_iterator tmp = *this;
2349
  ++(*this);
2350
  return tmp;
2351
  ```
2352
 
2353
- ### Class template `regex_token_iterator` <a id="re.tokiter">[[re.tokiter]]</a>
2354
 
2355
- #### General <a id="re.tokiter.general">[[re.tokiter.general]]</a>
2356
 
2357
  The class template `regex_token_iterator` is an iterator adaptor; that
2358
  is to say it represents a new view of an existing iterator sequence, by
2359
  enumerating all the occurrences of a regular expression within that
2360
  sequence, and presenting one or more sub-expressions for each match
@@ -2488,11 +2486,11 @@ same as the end of the last match found, and `suffix.second` is the same
2488
  as the end of the target sequence. — *end note*]
2489
 
2490
  The *current match* is `(*position).prefix()` if `subs[N] == -1`, or
2491
  `(*position)[subs[N]]` for any other value of `subs[N]`.
2492
 
2493
- #### Constructors <a id="re.tokiter.cnstr">[[re.tokiter.cnstr]]</a>
2494
 
2495
  ``` cpp
2496
  regex_token_iterator();
2497
  ```
2498
 
@@ -2536,11 +2534,11 @@ end-of-sequence iterator the constructor sets `result` to the address of
2536
  the current match. Otherwise if any of the values stored in `subs` is
2537
  equal to -1 the constructor sets `*this` to a suffix iterator that
2538
  points to the range \[`a`, `b`), otherwise the constructor sets `*this`
2539
  to an end-of-sequence iterator.
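
As a hedged example of the `-1` case, passing `-1` as the sub-match index makes the iterator present the text between matches, which gives a simple splitter. When the separator never occurs, the suffix iterator described above yields the whole input once. The helper `split_on_commas` is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Splits text at each comma by enumerating the non-matching pieces.
std::vector<std::string> split_on_commas(const std::string& text) {
    static const std::regex sep(",");
    std::vector<std::string> out;
    const std::sregex_token_iterator end;  // end-of-sequence iterator
    for (std::sregex_token_iterator it(text.begin(), text.end(), sep, -1);
         it != end; ++it) {
        out.push_back(it->str());  // *it is a sub_match
    }
    return out;
}
```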
2540
 
2541
- #### Comparisons <a id="re.tokiter.comp">[[re.tokiter.comp]]</a>
2542
 
2543
  ``` cpp
2544
  bool operator==(const regex_token_iterator& right) const;
2545
  ```
2546
 
@@ -2549,11 +2547,11 @@ iterators, or if `*this` and `right` are both suffix iterators and
2549
  `suffix == right.suffix`; otherwise returns `false` if `*this` or
2550
  `right` is an end-of-sequence iterator or a suffix iterator. Otherwise
2551
  returns `true` if `position == right.position`, `N == right.N`, and
2552
  `subs == right.subs`. Otherwise returns `false`.
2553
 
2554
- #### Indirection <a id="re.tokiter.deref">[[re.tokiter.deref]]</a>
2555
 
2556
  ``` cpp
2557
  const value_type& operator*() const;
2558
  ```
2559
 
@@ -2563,11 +2561,11 @@ const value_type& operator*() const;
2563
  const value_type* operator->() const;
2564
  ```
2565
 
2566
  *Returns:* `result`.
2567
 
2568
- #### Increment <a id="re.tokiter.incr">[[re.tokiter.incr]]</a>
2569
 
2570
  ``` cpp
2571
  regex_token_iterator& operator++();
2572
  ```
2573
 
@@ -2589,21 +2587,21 @@ Otherwise, if any of the values stored in `subs` is equal to -1 and
2589
  iterator that points to the range \[`prev->suffix().first`,
2590
  `prev->suffix().second`).
2591
 
2592
  Otherwise, sets `*this` to an end-of-sequence iterator.
2593
 
2594
- *Returns:* `*this`
2595
 
2596
  ``` cpp
2597
  regex_token_iterator& operator++(int);
2598
  ```
2599
 
2600
  *Effects:* Constructs a copy `tmp` of `*this`, then calls `++(*this)`.
2601
 
2602
  *Returns:* `tmp`.
2603
 
2604
- ## Modified ECMAScript regular expression grammar <a id="re.grammar">[[re.grammar]]</a>
2605
 
2606
  The regular expression grammar recognized by `basic_regex` objects
2607
  constructed with the ECMAScript flag is that specified by ECMA-262,
2608
  except as specified below.
2609
 
@@ -2709,12 +2707,12 @@ exception object of type `regex_error`.
2709
 
2710
  If the *CV* of a *UnicodeEscapeSequence* is greater than the largest
2711
  value that can be held in an object of type `charT` the translator shall
2712
  throw an exception object of type `regex_error`.
2713
 
2714
- [*Note 1*: This means that values of the form `"uxxxx"` that do not fit
2715
- in a character are invalid. — *end note*]
2716
 
2717
  Where the regular expression grammar requires the conversion of a
2718
  sequence of characters to an integral value, this is accomplished by
2719
  calling `traits_inst.value`.
2720
 
@@ -2768,77 +2766,5 @@ as follows:
2768
  sequence of characters, a character `c` is a member of a character
2769
  class designated by an iterator range \[`first`, `last`) if
2770
  `traits_inst.isctype(c, traits_inst.lookup_classname(first, last, flags() & icase))`
2771
  is `true`.
2772
 
2773
- ECMA-262 15.10
2774
-
2775
- <!-- Link reference definitions -->
2776
- [algorithms]: algorithms.md#algorithms
2777
- [bitmask.types]: library.md#bitmask.types
2778
- [container.reqmts]: containers.md#container.reqmts
2779
- [container.requirements.general]: containers.md#container.requirements.general
2780
- [enumerated.types]: library.md#enumerated.types
2781
- [forward.iterators]: iterators.md#forward.iterators
2782
- [input.iterators]: iterators.md#input.iterators
2783
- [iterator.concept.bidir]: iterators.md#iterator.concept.bidir
2784
- [output.iterators]: iterators.md#output.iterators
2785
- [re]: #re
2786
- [re.alg]: #re.alg
2787
- [re.alg.match]: #re.alg.match
2788
- [re.alg.replace]: #re.alg.replace
2789
- [re.alg.search]: #re.alg.search
2790
- [re.badexp]: #re.badexp
2791
- [re.const]: #re.const
2792
- [re.const.general]: #re.const.general
2793
- [re.err]: #re.err
2794
- [re.except]: #re.except
2795
- [re.general]: #re.general
2796
- [re.grammar]: #re.grammar
2797
- [re.iter]: #re.iter
2798
- [re.matchflag]: #re.matchflag
2799
- [re.regex]: #re.regex
2800
- [re.regex.assign]: #re.regex.assign
2801
- [re.regex.construct]: #re.regex.construct
2802
- [re.regex.general]: #re.regex.general
2803
- [re.regex.locale]: #re.regex.locale
2804
- [re.regex.nonmemb]: #re.regex.nonmemb
2805
- [re.regex.operations]: #re.regex.operations
2806
- [re.regex.swap]: #re.regex.swap
2807
- [re.regiter]: #re.regiter
2808
- [re.regiter.cnstr]: #re.regiter.cnstr
2809
- [re.regiter.comp]: #re.regiter.comp
2810
- [re.regiter.deref]: #re.regiter.deref
2811
- [re.regiter.general]: #re.regiter.general
2812
- [re.regiter.incr]: #re.regiter.incr
2813
- [re.req]: #re.req
2814
- [re.results]: #re.results
2815
- [re.results.acc]: #re.results.acc
2816
- [re.results.all]: #re.results.all
2817
- [re.results.const]: #re.results.const
2818
- [re.results.form]: #re.results.form
2819
- [re.results.general]: #re.results.general
2820
- [re.results.nonmember]: #re.results.nonmember
2821
- [re.results.size]: #re.results.size
2822
- [re.results.state]: #re.results.state
2823
- [re.results.swap]: #re.results.swap
2824
- [re.submatch]: #re.submatch
2825
- [re.submatch.general]: #re.submatch.general
2826
- [re.submatch.members]: #re.submatch.members
2827
- [re.submatch.op]: #re.submatch.op
2828
- [re.summary]: #re.summary
2829
- [re.syn]: #re.syn
2830
- [re.synopt]: #re.synopt
2831
- [re.tokiter]: #re.tokiter
2832
- [re.tokiter.cnstr]: #re.tokiter.cnstr
2833
- [re.tokiter.comp]: #re.tokiter.comp
2834
- [re.tokiter.deref]: #re.tokiter.deref
2835
- [re.tokiter.general]: #re.tokiter.general
2836
- [re.tokiter.incr]: #re.tokiter.incr
2837
- [re.traits]: #re.traits
2838
- [re.traits.classnames]: #re.traits.classnames
2839
- [sequence.reqmts]: containers.md#sequence.reqmts
2840
- [strings.general]: strings.md#strings.general
2841
- [swappable.requirements]: library.md#swappable.requirements
2842
-
2843
- [^1]: For example, if the parameter `icase` is `true` then `[[:lower:]]`
2844
- is the same as `[[:alpha:]]`.
 
1
+ ## Regular expressions library <a id="re">[[re]]</a>
2
 
3
+ ### General <a id="re.general">[[re.general]]</a>
4
 
5
+ Subclause [[re]] describes components that C++ programs may use to
6
+ perform operations involving regular expression matching and searching.
7
 
8
  The following subclauses describe a basic regular expression class
9
  template and its traits that can handle char-like [[strings.general]]
10
  template arguments, two specializations of this class template that
11
  handle sequences of `char` and `wchar_t`, a class template that holds
 
28
  | [[re.alg]] | Algorithms | |
29
  | [[re.iter]] | Iterators | |
30
  | [[re.grammar]] | Grammar | |
31
 
32
 
33
+ The ECMAScript Language Specification described in Standard Ecma-262 is
34
+ called *ECMA-262* in this Clause.
35
+
36
+ ### Requirements <a id="re.req">[[re.req]]</a>
37
 
38
  This subclause defines requirements on classes representing regular
39
  expression traits.
40
 
41
  [*Note 1*: The class template `regex_traits`, defined in [[re.traits]],
 
219
  [*Note 2*: Class template `regex_traits` meets the requirements for a
220
  regular expression traits class when it is specialized for `char` or
221
  `wchar_t`. This class template is described in the header `<regex>`, and
222
  is described in [[re.traits]]. — *end note*]
223
 
224
+ ### Header `<regex>` synopsis <a id="re.syn">[[re.syn]]</a>
225
 
226
  ``` cpp
227
  #include <compare> // see [compare.syn]
228
  #include <initializer_list> // see [initializer.list.syn]
229
 
 
460
  using wsmatch = match_results<wstring::const_iterator>;
461
  }
462
  }
463
  ```
464
 
465
+ ### Namespace `std::regex_constants` <a id="re.const">[[re.const]]</a>
466
 
467
+ #### General <a id="re.const.general">[[re.const.general]]</a>
468
 
469
  The namespace `std::regex_constants` holds symbolic constants used by
470
  the regular expression library. This namespace provides three types,
471
  `syntax_option_type`, `match_flag_type`, and `error_type`, along with
472
  several constants of these types.
473
 
474
+ #### Bitmask type `syntax_option_type` <a id="re.synopt">[[re.synopt]]</a>
475
 
476
  ``` cpp
477
  namespace std::regex_constants {
478
  using syntax_option_type = T1;
479
  inline constexpr syntax_option_type icase = unspecified;
 
503
  | -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
504
  | `icase` | Specifies that matching of regular expressions against a character container sequence shall be performed without regard to case. |
505
  | `nosubs` | Specifies that no sub-expressions shall be considered to be marked, so that when a regular expression is matched against a character container sequence, no sub-expression matches shall be stored in the supplied `match_results` object. |
506
  | `optimize` | Specifies that the regular expression engine should pay more attention to the speed with which regular expressions are matched, and less to the speed with which regular expression objects are constructed. Otherwise it has no detectable effect on the program output. |
507
  | `collate` | Specifies that character ranges of the form `"[a-b]"` shall be locale sensitive. |
508
+ | `ECMAScript` | Specifies that the grammar recognized by the regular expression engine shall be that used by ECMAScript in ECMA-262, as modified in [[re.grammar]]. See also: ECMA-262 15.10 |
509
+ | `basic` | Specifies that the grammar recognized by the regular expression engine shall be that used by basic regular expressions in POSIX. See also: POSIX, Base Definitions and Headers, Section 9.3 |
510
+ | `extended` | Specifies that the grammar recognized by the regular expression engine shall be that used by extended regular expressions in POSIX. See also: POSIX, Base Definitions and Headers, Section 9.4 |
511
  | `awk` | Specifies that the grammar recognized by the regular expression engine shall be that used by the utility awk in POSIX. |
512
  | `grep` | Specifies that the grammar recognized by the regular expression engine shall be that used by the utility grep in POSIX. |
513
  | `egrep` | Specifies that the grammar recognized by the regular expression engine shall be that used by the utility grep when given the -E option in POSIX. |
514
  | `multiline` | Specifies that `^` shall match the beginning of a line and `$` shall match the end of a line, if the `ECMAScript` engine is selected. |
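As an illustration of how the flags in the table above combine, the following minimal sketch ORs `icase` with the default `ECMAScript` grammar. The helper name `contains_ignoring_case` is hypothetical and not part of the library.

``` cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `text` contains a match for `pattern`,
// compared without regard to case.
bool contains_ignoring_case(const std::string& text, const std::string& pattern) {
    // syntax_option_type is a bitmask type, so elements are combined with operator|.
    const std::regex re(pattern,
                        std::regex_constants::ECMAScript | std::regex_constants::icase);
    return std::regex_search(text, re);
}
```

A call such as `contains_ignoring_case("Say HELLO there", "hello")` is expected to succeed, whereas the same search without `icase` would not.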
515
 
516
 
517
+ #### Bitmask type `match_flag_type` <a id="re.matchflag">[[re.matchflag]]</a>
518
 
519
  ``` cpp
520
  namespace std::regex_constants {
521
  using match_flag_type = T2;
522
  inline constexpr match_flag_type match_default = {};
 
542
  Matching a regular expression against a sequence of characters
543
  \[`first`, `last`) proceeds according to the rules of the grammar
544
  specified for the regular expression object, modified according to the
545
  effects listed in [[re.matchflag]] for any bitmask elements set.
546
 
547
+ **Table: `regex_constants::match_flag_type` effects** <a id="re.matchflag">[re.matchflag]</a>
 
548
 
549
  | Element | Effect(s) if set |
550
  | ------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
551
  | `match_not_bol` | The first character in the sequence \[`first`, `last`) shall be treated as though it is not at the beginning of a line, so the character `^` in the regular expression shall not match \[`first`, `first`). |
552
  | `match_not_eol` | The last character in the sequence \[`first`, `last`) shall be treated as though it is not at the end of a line, so the character `"$"` in the regular expression shall not match \[`last`, `last`). |
 
560
  | `format_sed` | When a regular expression match is to be replaced by a new string, the new string shall be constructed using the rules used by the sed utility in POSIX. |
561
  | `format_no_copy` | During a search and replace operation, sections of the character container sequence being searched that do not match the regular expression shall not be copied to the output string. |
562
  | `format_first_only` | When specified during a search and replace operation, only the first occurrence of the regular expression shall be replaced. |
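The two replace-related flags at the end of the table can be sketched as follows. The helper names `mask_first_number` and `extract_numbers` and the digit pattern are illustrative assumptions; only `regex_replace` and the flag constants come from the library.

``` cpp
#include <regex>
#include <string>

// Replaces only the first run of digits with "#" (format_first_only).
std::string mask_first_number(const std::string& s) {
    static const std::regex digits("[0-9]+");
    return std::regex_replace(s, digits, "#", std::regex_constants::format_first_only);
}

// Drops everything that did not match and wraps each match in angle
// brackets (format_no_copy); "$&" refers to the whole match.
std::string extract_numbers(const std::string& s) {
    static const std::regex digits("[0-9]+");
    return std::regex_replace(s, digits, "<$&>", std::regex_constants::format_no_copy);
}
```

For the input `"a1b22c"`, the first helper leaves the second number untouched, while the second helper keeps only the matched digit runs.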
563
 
564
 
565
+ #### Implementation-defined `error_type` <a id="re.err">[[re.err]]</a>
566
 
567
  ``` cpp
568
  namespace std::regex_constants {
569
  using error_type = T3;
570
  inline constexpr error_type error_collate = unspecified;
 
595
  | `error_ctype` | The expression contains an invalid character class name. |
596
  | `error_escape` | The expression contains an invalid escaped character, or a trailing escape. |
597
  | `error_backref` | The expression contains an invalid back reference. |
598
  | `error_brack` | The expression contains mismatched `[` and `]`. |
599
  | `error_paren` | The expression contains mismatched `(` and `)`. |
600
+ | `error_brace` | The expression contains mismatched `{` and `}`. |
601
  | `error_badbrace` | The expression contains an invalid range in a `{}` expression. |
602
  | `error_range` | The expression contains an invalid character range, such as `[b-a]` in most encodings. |
603
  | `error_space` | There is insufficient memory to convert the expression into a finite state machine. |
604
  | `error_badrepeat` | One of `*?+{` is not preceded by a valid regular expression. |
605
  | `error_complexity` | The complexity of an attempted match against a regular expression exceeds a pre-set level. |
606
  | `error_stack` | There is insufficient memory to determine whether the regular expression matches the specified character sequence. |
607
 
608
 
609
+ ### Class `regex_error` <a id="re.badexp">[[re.badexp]]</a>
610
 
611
  ``` cpp
612
  namespace std {
613
  class regex_error : public runtime_error {
614
  public:
 
631
  regex_constants::error_type code() const;
632
  ```
633
 
634
  *Returns:* The error code that was passed to the constructor.
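A minimal sketch of how `code()` is typically consulted: the hypothetical helper `compile_error` tries to construct a `std::regex` and reports the error code carried by the exception, if any. Which specific `error_type` value an implementation reports for a given malformed pattern is not asserted here.

``` cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: returns the error code if `pattern` fails to
// compile, or std::nullopt if construction succeeds.
std::optional<std::regex_constants::error_type> compile_error(const std::string& pattern) {
    try {
        std::regex re(pattern);
        (void)re;  // only construction matters here
        return std::nullopt;
    } catch (const std::regex_error& e) {
        return e.code();  // the code that was passed to regex_error's constructor
    }
}
```

A well-formed pattern such as `"a+b"` yields no error, while an unbalanced one such as `"a(b"` yields some error code.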
635
 
636
+ ### Class template `regex_traits` <a id="re.traits">[[re.traits]]</a>
637
 
638
  ``` cpp
639
  namespace std {
640
  template<class charT>
641
  struct regex_traits {
 
715
  ```
716
 
717
  *Effects:* If
718
 
719
  ``` cpp
720
+ typeid(use_facet<collate<charT>>(getloc())) == typeid(collate_byname<charT>)
721
  ```
722
 
723
  and the form of the sort key returned by
724
  `collate_byname<charT>::transform(first, last)` is known and can be
725
  converted into a primary sort key then returns that key, otherwise
 
744
  *Returns:* An unspecified value that represents the character
745
  classification named by the character sequence designated by the
746
  iterator range \[`first`, `last`). If the parameter `icase` is `true`
747
  then the returned mask identifies the character classification without
748
  regard to the case of the characters being matched, otherwise it does
749
+ honor the case of the characters being matched.[^26]
750
 
751
  The value returned shall be independent of the case of the characters in
752
  the character sequence. If the name is not recognized then returns
753
  `char_class_type()`.
754
 
 
830
 
831
  ``` cpp
832
  locale_type imbue(locale_type loc);
833
  ```
834
 
835
+ *Effects:* Imbues `*this` with a copy of the locale `loc`.
836
 
837
  [*Note 1*: Calling `imbue` with a different locale than the one
838
  currently in use invalidates all cached data held by
839
  `*this`. — *end note*]
840
 
 
871
  | `"upper"` | `L"upper"` | `ctype_base::upper` |
872
  | `"w"` | `L"w"` | `ctype_base::alnum` |
873
  | `"xdigit"` | `L"xdigit"` | `ctype_base::xdigit` |
874
 
875
 
876
+ ### Class template `basic_regex` <a id="re.regex">[[re.regex]]</a>
877
 
878
+ #### General <a id="re.regex.general">[[re.regex.general]]</a>
879
 
880
  For a char-like type `charT`, specializations of class template
881
  `basic_regex` represent regular expressions constructed from character
882
  sequences of `charT` characters. In the rest of  [[re.regex]], `charT`
883
  denotes a given char-like type. Storage for a regular expression is
 
902
  class basic_regex {
903
  public:
904
  // types
905
  using value_type = charT;
906
  using traits_type = traits;
907
+ using string_type = traits::string_type;
908
  using flag_type = regex_constants::syntax_option_type;
909
+ using locale_type = traits::locale_type;
910
 
911
  // [re.synopt], constants
912
  static constexpr flag_type icase = regex_constants::icase;
913
  static constexpr flag_type nosubs = regex_constants::nosubs;
914
  static constexpr flag_type optimize = regex_constants::optimize;
 
975
  regex_constants::syntax_option_type = regex_constants::ECMAScript)
976
  -> basic_regex<typename iterator_traits<ForwardIterator>::value_type>;
977
  }
978
  ```
979
 
980
+ #### Constructors <a id="re.regex.construct">[[re.regex.construct]]</a>
981
 
982
  ``` cpp
983
  basic_regex();
984
  ```
985
 
 
1069
  basic_regex(initializer_list<charT> il, flag_type f = regex_constants::ECMAScript);
1070
  ```
1071
 
1072
  *Effects:* Same as `basic_regex(il.begin(), il.end(), f)`.
1073
 
1074
+ #### Assignment <a id="re.regex.assign">[[re.regex.assign]]</a>
1075
 
1076
  ``` cpp
1077
  basic_regex& operator=(const basic_regex& e);
1078
  ```
1079
 
 
1162
  flag_type f = regex_constants::ECMAScript);
1163
  ```
1164
 
1165
  *Effects:* Equivalent to: `return assign(il.begin(), il.end(), f);`
1166
 
1167
+ #### Constant operations <a id="re.regex.operations">[[re.regex.operations]]</a>
1168
 
1169
  ``` cpp
1170
  unsigned mark_count() const;
1171
  ```
1172
 
 
1178
  ```
1179
 
1180
  *Effects:* Returns a copy of the regular expression syntax flags that
1181
  were passed to the object’s constructor or to the last call to `assign`.
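The constant operations can be exercised with a short sketch like the one below. The function names and patterns are illustrative assumptions. The flag test compares against `icase` itself instead of an integer, so it does not depend on how the bitmask type is implemented.

``` cpp
#include <regex>

// Counts the marked sub-expressions in an illustrative date-like pattern.
unsigned date_group_count() {
    const std::regex re("([0-9]{4})-([0-9]{2})-([0-9]{2})");
    return re.mark_count();
}

// Checks that a flag passed to the constructor is visible through flags().
bool icase_flag_is_reported() {
    const std::regex re("abc", std::regex::ECMAScript | std::regex::icase);
    return (re.flags() & std::regex::icase) == std::regex::icase;
}
```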
1182
 
1183
+ #### Locale <a id="re.regex.locale">[[re.regex.locale]]</a>
1184
 
1185
  ``` cpp
1186
  locale_type imbue(locale_type loc);
1187
  ```
1188
 
 
1197
 
1198
  *Effects:* Returns the result of `traits_inst.getloc()` where
1199
  `traits_inst` is a (default-initialized) instance of the template
1200
  parameter `traits` stored within the object.
1201
 
1202
+ #### Swap <a id="re.regex.swap">[[re.regex.swap]]</a>
1203
 
1204
  ``` cpp
1205
  void swap(basic_regex& e);
1206
  ```
1207
 
 
1210
  *Ensures:* `*this` contains the regular expression that was in `e`, `e`
1211
  contains the regular expression that was in `*this`.
1212
 
1213
  *Complexity:* Constant time.
1214
 
1215
+ #### Non-member functions <a id="re.regex.nonmemb">[[re.regex.nonmemb]]</a>
1216
 
1217
  ``` cpp
1218
  template<class charT, class traits>
1219
  void swap(basic_regex<charT, traits>& lhs, basic_regex<charT, traits>& rhs);
1220
  ```
1221
 
1222
  *Effects:* Calls `lhs.swap(rhs)`.
1223
 
1224
+ ### Class template `sub_match` <a id="re.submatch">[[re.submatch]]</a>
1225
 
1226
+ #### General <a id="re.submatch.general">[[re.submatch.general]]</a>
1227
 
1228
  Class template `sub_match` denotes the sequence of characters matched by
1229
  a particular marked sub-expression.
1230
 
1231
  ``` cpp
1232
  namespace std {
1233
  template<class BidirectionalIterator>
1234
  class sub_match : public pair<BidirectionalIterator, BidirectionalIterator> {
1235
  public:
1236
+ using value_type = iterator_traits<BidirectionalIterator>::value_type;
1237
+ using difference_type = iterator_traits<BidirectionalIterator>::difference_type;
 
 
1238
  using iterator = BidirectionalIterator;
1239
  using string_type = basic_string<value_type>;
1240
 
1241
  bool matched;
1242
 
 
1253
  void swap(sub_match& s) noexcept(see below);
1254
  };
1255
  }
1256
  ```
1257
 
1258
+ #### Members <a id="re.submatch.members">[[re.submatch.members]]</a>
1259
 
1260
  ``` cpp
1261
  constexpr sub_match();
1262
  ```
1263
 
 
1315
  ```
1316
 
1317
  *Remarks:* The exception specification is equivalent to
1318
  `is_nothrow_swappable_v<BidirectionalIterator>`.
1319
 
1320
+ #### Non-member operators <a id="re.submatch.op">[[re.submatch.op]]</a>
1321
 
1322
  Let `SM-CAT(I)` be
1323
 
1324
  ``` cpp
1325
  compare_three_way_result_t<basic_string<typename iterator_traits<I>::value_type>>
 
1414
  operator<<(basic_ostream<charT, ST>& os, const sub_match<BiIter>& m);
1415
  ```
1416
 
1417
  *Returns:* `os << m.str()`.
1418
 
1419
+ ### Class template `match_results` <a id="re.results">[[re.results]]</a>
1420
 
1421
+ #### General <a id="re.results.general">[[re.results.general]]</a>
1422
 
1423
  Class template `match_results` denotes a collection of character
1424
  sequences representing the result of a regular expression match. Storage
1425
  for the collection is allocated and freed as necessary by the member
1426
  functions of class template `match_results`.
1427
 
1428
  The class template `match_results` meets the requirements of an
1429
+ allocator-aware container [[container.alloc.reqmts]] and of a sequence
1430
+ container [[container.requirements.general]], [[sequence.reqmts]] except
1431
+ that only copy assignment, move assignment, and operations defined for
1432
  const-qualified sequence containers are supported and that the semantics
1433
  of the comparison operator functions are different from those required
1434
  for a container.
1435
 
1436
  A default-constructed `match_results` object has no fully established
 
1462
  using value_type = sub_match<BidirectionalIterator>;
1463
  using const_reference = const value_type&;
1464
  using reference = value_type&;
1465
  using const_iterator = implementation-defined; // type of match_results::const_iterator
1466
  using iterator = const_iterator;
1467
+ using difference_type = iterator_traits<BidirectionalIterator>::difference_type;
1468
+ using size_type = allocator_traits<Allocator>::size_type;
 
1469
  using allocator_type = Allocator;
1470
+ using char_type = iterator_traits<BidirectionalIterator>::value_type;
 
1471
  using string_type = basic_string<char_type>;
1472
 
1473
  // [re.results.const], construct/copy/destroy
1474
  match_results() : match_results(Allocator()) {}
1475
  explicit match_results(const Allocator& a);
 
1485
  bool ready() const;
1486
 
1487
  // [re.results.size], size
1488
  size_type size() const;
1489
  size_type max_size() const;
1490
+ bool empty() const;
1491
 
1492
  // [re.results.acc], element access
1493
  difference_type length(size_type sub = 0) const;
1494
  difference_type position(size_type sub = 0) const;
1495
  string_type str(size_type sub = 0) const;
 
1528
  void swap(match_results& that);
1529
  };
1530
  }
1531
  ```
1532
 
1533
+ #### Constructors <a id="re.results.const">[[re.results.const]]</a>
1534
 
1535
  [[re.results.const]] lists the postconditions of `match_results`
1536
  copy/move constructors and copy/move assignment operators. For move
1537
  operations, the results of the expressions depending on the parameter
1538
  `m` denote the values they had before the respective function calls.
 
1594
  | `(*this)[n]` | `m[n]` for all non-negative integers `n < m.size()` |
1595
  | `length(n)` | `m.length(n)` for all non-negative integers `n < m.size()` |
1596
  | `position(n)` | `m.position(n)` for all non-negative integers `n < m.size()` |
1597
 
1598
 
1599
+ #### State <a id="re.results.state">[[re.results.state]]</a>
1600
 
1601
  ``` cpp
1602
  bool ready() const;
1603
  ```
1604
 
1605
  *Returns:* `true` if `*this` has a fully established result state,
1606
  otherwise `false`.
1607
 
1608
+ #### Size <a id="re.results.size">[[re.results.size]]</a>
1609
 
1610
  ``` cpp
1611
  size_type size() const;
1612
  ```
1613
 
 
1626
 
1627
  *Returns:* The maximum number of `sub_match` elements that can be stored
1628
  in `*this`.
1629
 
1630
  ``` cpp
1631
+ bool empty() const;
1632
  ```
1633
 
1634
  *Returns:* `size() == 0`.
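As a small sketch of where `empty()` is useful: after an unsuccessful `regex_search`, the `match_results` object is ready but holds no sub-matches. The helper name `search_is_empty` is hypothetical.

``` cpp
#include <regex>
#include <string>

// Hypothetical helper: searches `text` for `pattern` and reports whether
// the resulting match_results object is empty.
bool search_is_empty(const std::string& text, const std::string& pattern) {
    std::smatch m;
    const std::regex re(pattern);
    std::regex_search(text, m, re);
    return m.empty();  // equivalent to m.size() == 0
}
```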
1635
 
1636
+ #### Element access <a id="re.results.acc">[[re.results.acc]]</a>
1637
 
1638
  ``` cpp
1639
  difference_type length(size_type sub = 0) const;
1640
  ```
1641
 
 
1707
  ```
1708
 
1709
  *Returns:* A terminating iterator that enumerates over all the
1710
  sub-expressions stored in `*this`.
1711
 
1712
+ #### Formatting <a id="re.results.form">[[re.results.form]]</a>
1713
 
1714
  ``` cpp
1715
  template<class OutputIter>
1716
  OutputIter format(
1717
  OutputIter out,
 
1778
  format(back_inserter(result), fmt, fmt + char_traits<char_type>::length(fmt), flags);
1779
  ```
1780
 
1781
  *Returns:* `result`.
1782
 
1783
+ #### Allocator <a id="re.results.all">[[re.results.all]]</a>
1784
 
1785
  ``` cpp
1786
  allocator_type get_allocator() const;
1787
  ```
1788
 
1789
  *Returns:* A copy of the Allocator that was passed to the object’s
1790
  constructor or, if that allocator has been replaced, a copy of the most
1791
  recent replacement.
1792
 
1793
+ #### Swap <a id="re.results.swap">[[re.results.swap]]</a>
1794
 
1795
  ``` cpp
1796
  void swap(match_results& that);
1797
  ```
1798
 
 
1810
  match_results<BidirectionalIterator, Allocator>& m2);
1811
  ```
1812
 
1813
  *Effects:* As if by `m1.swap(m2)`.
1814
 
1815
+ #### Non-member functions <a id="re.results.nonmember">[[re.results.nonmember]]</a>
1816
 
1817
  ``` cpp
1818
  template<class BidirectionalIterator, class Allocator>
1819
  bool operator==(const match_results<BidirectionalIterator, Allocator>& m1,
1820
  const match_results<BidirectionalIterator, Allocator>& m2);
1821
  ```
1822
 
1823
  *Returns:* `true` if neither match result is ready, `false` if one match
1824
  result is ready and the other is not. If both match results are ready,
1825
+ returns `true` only if
1826
 
1827
  - `m1.empty() && m2.empty()`, or
1828
  - `!m1.empty() && !m2.empty()`, and the following conditions are
1829
  satisfied:
1830
  - `m1.prefix() == m2.prefix()`,
 
1833
  - `m1.suffix() == m2.suffix()`.
1834
 
1835
  [*Note 1*: The algorithm `equal` is defined in
1836
  [[algorithms]]. — *end note*]
1837
 
1838
+ ### Regular expression algorithms <a id="re.alg">[[re.alg]]</a>
1839
 
1840
+ #### Exceptions <a id="re.except">[[re.except]]</a>
1841
 
1842
  The algorithms described in subclause  [[re.alg]] may throw an exception
1843
  of type `regex_error`. If such an exception `e` is thrown, `e.code()`
1844
  shall return either `regex_constants::error_complexity` or
1845
  `regex_constants::error_stack`.
1846
 
1847
+ #### `regex_match` <a id="re.alg.match">[[re.alg.match]]</a>
1848
 
1849
  ``` cpp
1850
  template<class BidirectionalIterator, class Allocator, class charT, class traits>
1851
  bool regex_match(BidirectionalIterator first, BidirectionalIterator last,
1852
  match_results<BidirectionalIterator, Allocator>& m,
 
1940
  const basic_regex<charT, traits>& e,
1941
  regex_constants::match_flag_type flags = regex_constants::match_default);
1942
  ```
1943
 
1944
  *Returns:*
1945
+ `regex_match(str, str + char_traits<charT>::length(str), e, flags)`.
1946
 
1947
  ``` cpp
1948
  template<class ST, class SA, class charT, class traits>
1949
  bool regex_match(const basic_string<charT, ST, SA>& s,
1950
  const basic_regex<charT, traits>& e,
1951
  regex_constants::match_flag_type flags = regex_constants::match_default);
1952
  ```
1953
 
1954
  *Returns:* `regex_match(s.begin(), s.end(), e, flags)`.
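The equivalence stated above can be illustrated with a brief sketch. The helper names and the digit pattern are assumptions made for the example. Both functions should agree for any input, since the string overload forwards to the iterator overload.

``` cpp
#include <regex>
#include <string>

// Uses the basic_string overload of regex_match.
bool is_whole_number(const std::string& s) {
    static const std::regex digits("[0-9]+");
    return std::regex_match(s, digits);
}

// Uses the iterator overload that the string overload is specified to call.
bool is_whole_number_iter(const std::string& s) {
    static const std::regex digits("[0-9]+");
    return std::regex_match(s.begin(), s.end(), digits);
}
```

Note that `regex_match` succeeds only when the entire sequence matches, so `"123abc"` is rejected by both functions.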
1955
 
1956
+ #### `regex_search` <a id="re.alg.search">[[re.alg.search]]</a>
1957
 
1958
  ``` cpp
1959
  template<class BidirectionalIterator, class Allocator, class charT, class traits>
1960
  bool regex_search(BidirectionalIterator first, BidirectionalIterator last,
1961
  match_results<BidirectionalIterator, Allocator>& m,
 
2045
  regex_constants::match_flag_type flags = regex_constants::match_default);
2046
  ```
2047
 
2048
  *Returns:* `regex_search(s.begin(), s.end(), e, flags)`.
2049
 
2050
+ #### `regex_replace` <a id="re.alg.replace">[[re.alg.replace]]</a>
2051
 
2052
  ``` cpp
2053
  template<class OutputIterator, class BidirectionalIterator,
2054
  class traits, class charT, class ST, class SA>
2055
  OutputIterator
 
2159
  regex_replace(back_inserter(result), s, s + char_traits<charT>::length(s), e, fmt, flags);
2160
  ```
2161
 
2162
  *Returns:* `result`.
2163
 
2164
+ ### Regular expression iterators <a id="re.iter">[[re.iter]]</a>
2165
 
2166
+ #### Class template `regex_iterator` <a id="re.regiter">[[re.regiter]]</a>
2167
 
2168
+ ##### General <a id="re.regiter.general">[[re.regiter.general]]</a>
2169
 
2170
  The class template `regex_iterator` is an iterator adaptor. It
2171
  represents a new view of an existing iterator sequence, by enumerating
2172
  all the occurrences of a regular expression within that sequence. A
2173
  `regex_iterator` uses `regex_search` to find successive regular
 
2178
  reached (`regex_search` returns `false`), the iterator becomes equal to
2179
  the end-of-sequence iterator value. The default constructor constructs
2180
  an end-of-sequence iterator object, which is the only legitimate
2181
  iterator to be used for the end condition. The result of `operator*` on
2182
  an end-of-sequence iterator is not defined. For any other iterator value
2183
+ a `const match_results<BidirectionalIterator>&` is returned. The result
2184
  of `operator->` on an end-of-sequence iterator is not defined. For any
2185
  other iterator value a `const match_results<BidirectionalIterator>*` is
2186
  returned. It is impossible to store things into `regex_iterator`s. Two
2187
  end-of-sequence iterators are always equal. An end-of-sequence iterator
2188
  is not equal to a non-end-of-sequence iterator. Two non-end-of-sequence
 
2235
 
2236
  [*Note 1*: For example, this can occur when the part of the regular
2237
  expression that matched consists only of an assertion (such as `'^'`,
2238
  `'$'`, `'\b'`, `'\B'`). — *end note*]
2239
 
2240
+ ##### Constructors <a id="re.regiter.cnstr">[[re.regiter.cnstr]]</a>
2241
 
2242
  ``` cpp
2243
  regex_iterator();
2244
  ```
2245
 
 
2254
  *Effects:* Initializes `begin` and `end` to `a` and `b`, respectively,
2255
  sets `pregex` to `addressof(re)`, sets `flags` to `m`, then calls
2256
  `regex_search(begin, end, match, *pregex, flags)`. If this call returns
2257
  `false` the constructor sets `*this` to the end-of-sequence iterator.
2258
 
2259
+ ##### Comparisons <a id="re.regiter.comp">[[re.regiter.comp]]</a>
2260
 
2261
  ``` cpp
2262
  bool operator==(const regex_iterator& right) const;
2263
  ```
2264
 
 
2271
  - `flags == right.flags`, and
2272
  - `match[0] == right.match[0]`;
2273
 
2274
  otherwise `false`.
2275
 
2276
+ ##### Indirection <a id="re.regiter.deref">[[re.regiter.deref]]</a>
2277
 
2278
  ``` cpp
2279
  const value_type& operator*() const;
2280
  ```
2281
 
 
2285
  const value_type* operator->() const;
2286
  ```
2287
 
2288
  *Returns:* `addressof(match)`.
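A minimal sketch of the indirection operators in use: the loop below walks all matches with `sregex_iterator` and reads each one through `operator->`. The helper name `all_words` and the lowercase-letter pattern are illustrative assumptions.

``` cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every run of lowercase letters in `text`.
std::vector<std::string> all_words(const std::string& text) {
    static const std::regex word("[a-z]+");
    std::vector<std::string> out;
    const std::sregex_iterator end;  // default-constructed: end-of-sequence
    for (std::sregex_iterator it(text.begin(), text.end(), word); it != end; ++it) {
        out.push_back(it->str());    // operator-> gives access to the current match_results
    }
    return out;
}
```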
2289
 
2290
+ ##### Increment <a id="re.regiter.incr">[[re.regiter.incr]]</a>
2291
 
2292
  ``` cpp
2293
  regex_iterator& operator++();
2294
  ```
2295
 
 
2319
  `false` the iterator sets `*this` to the end-of-sequence iterator. The
2320
  iterator then returns `*this`.
2321
 
2322
  In all cases in which the call to `regex_search` returns `true`,
2323
  `match.prefix().first` shall be equal to the previous value of
2324
+ `match[0].second`, and for each index `i` in the half-open range \[`0`,
2325
+ `match.size()`) for which `match[i].matched` is `true`,
2326
  `match.position(i)` shall return `distance(begin, match[i].first)`.
2327
 
2328
  [*Note 1*: This means that `match.position(i)` gives the offset from
2329
  the beginning of the target sequence, which is often not the same as the
2330
  offset from the sequence passed in the call to
 
2346
  regex_iterator tmp = *this;
  ++(*this);
  return tmp;
  ```
 
+ #### Class template `regex_token_iterator` <a id="re.tokiter">[[re.tokiter]]</a>
 
+ ##### General <a id="re.tokiter.general">[[re.tokiter.general]]</a>
 
  The class template `regex_token_iterator` is an iterator adaptor; that
  is to say it represents a new view of an existing iterator sequence, by
  enumerating all the occurrences of a regular expression within that
  sequence, and presenting one or more sub-expressions for each match
 
  as the end of the target sequence. — *end note*]
 
  The *current match* is `(*position).prefix()` if `subs[N] == -1`, or
  `(*position)[subs[N]]` for any other value of `subs[N]`.
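
The definition of the current match can be seen in a small sketch. The helper name `tokens` is hypothetical; it passes a single submatch index to the `std::sregex_token_iterator` constructor. An index of `-1` selects the text between matches (the prefix of each match), while a non-negative index selects that sub-expression of each match.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: enumerates the "current match" for every position,
// using one submatch index (-1 selects the text between matches).
std::vector<std::string> tokens(const std::string& text, const std::regex& re, int submatch) {
    std::vector<std::string> out;
    const std::sregex_token_iterator end;  // end-of-sequence iterator
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, submatch);
         it != end; ++it) {
        out.push_back(it->str());  // *it is a sub_match
    }
    return out;
}
```

With a separator pattern and `-1` this acts as a splitter; with a capturing pattern and index `2` it extracts the second group of each match.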
 
+ ##### Constructors <a id="re.tokiter.cnstr">[[re.tokiter.cnstr]]</a>
 
  ``` cpp
  regex_token_iterator();
  ```
 
 
  the current match. Otherwise if any of the values stored in `subs` is
  equal to -1 the constructor sets `*this` to a suffix iterator that
  points to the range \[`a`, `b`), otherwise the constructor sets `*this`
  to an end-of-sequence iterator.
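
The no-match case of the constructor can be sketched as follows. The helper name `split_or_whole` is hypothetical. When the pattern never matches and `-1` was requested, the iterator becomes a suffix iterator over the whole input, so exactly one token is produced; when only submatch `0` was requested, the iterator is immediately end-of-sequence.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects tokens for a single submatch index so the
// suffix-iterator and end-of-sequence outcomes can be compared.
std::vector<std::string> split_or_whole(const std::string& text,
                                        const std::regex& re,
                                        int submatch) {
    std::vector<std::string> out;
    const std::sregex_token_iterator end;
    // If re never matches: submatch == -1 yields one token covering [a, b);
    // any other index yields an end-of-sequence iterator straight away.
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, submatch);
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

This is why splitting a non-empty string that contains no separator returns the string itself rather than an empty result.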
 
+ ##### Comparisons <a id="re.tokiter.comp">[[re.tokiter.comp]]</a>
 
  ``` cpp
  bool operator==(const regex_token_iterator& right) const;
  ```
 
 
  `suffix == right.suffix`; otherwise returns `false` if `*this` or
  `right` is an end-of-sequence iterator or a suffix iterator. Otherwise
  returns `true` if `position == right.position`, `N == right.N`, and
  `subs == right.subs`. Otherwise returns `false`.
 
+ ##### Indirection <a id="re.tokiter.deref">[[re.tokiter.deref]]</a>
 
  ``` cpp
  const value_type& operator*() const;
  ```
 
 
  const value_type* operator->() const;
  ```
 
  *Returns:* `result`.
 
+ ##### Increment <a id="re.tokiter.incr">[[re.tokiter.incr]]</a>
 
  ``` cpp
  regex_token_iterator& operator++();
  ```
 
 
  iterator that points to the range \[`prev->suffix().first`,
  `prev->suffix().second`).
 
  Otherwise, sets `*this` to an end-of-sequence iterator.
 
+ *Returns:* `*this`.
 
  ``` cpp
  regex_token_iterator operator++(int);
  ```
 
  *Effects:* Constructs a copy `tmp` of `*this`, then calls `++(*this)`.
 
  *Returns:* `tmp`.
 
+ ### Modified ECMAScript regular expression grammar <a id="re.grammar">[[re.grammar]]</a>
 
  The regular expression grammar recognized by `basic_regex` objects
  constructed with the ECMAScript flag is that specified by ECMA-262,
  except as specified below.
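
For orientation, a minimal sketch of selecting this grammar explicitly is given here. The helper name `matches_ecmascript` is hypothetical. Passing `std::regex_constants::ECMAScript` is optional in practice, because it is the default flag of the `basic_regex` constructors.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns true when the whole of `text` matches
// `pattern`, with the ECMAScript grammar requested explicitly.
bool matches_ecmascript(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern, std::regex_constants::ECMAScript);
    return std::regex_match(text, re);
}
```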
 
 
 
  If the *CV* of a *UnicodeEscapeSequence* is greater than the largest
  value that can be held in an object of type `charT` the translator shall
  throw an exception object of type `regex_error`.
 
+ [*Note 1*: This means that values of the form `"\uxxxx"` that do not
+ fit in a character are invalid. — *end note*]
 
  Where the regular expression grammar requires the conversion of a
  sequence of characters to an integral value, this is accomplished by
  calling `traits_inst.value`.
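
One way to picture this conversion is the sketch below, which accumulates a digit sequence using `regex_traits<char>::value`. The helper name `parse_digits` and the accumulation loop are assumptions made for illustration; the text only states that `traits_inst.value` is called, not how an implementation combines the per-character results.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: converts a digit sequence to an integer by calling
// regex_traits<char>::value on each character. Returns -1 if any character
// is not a digit in the given radix (8, 10, or 16).
long parse_digits(const std::string& digits, int radix) {
    const std::regex_traits<char> traits;
    long result = 0;
    for (char ch : digits) {
        const int v = traits.value(ch, radix);  // -1 when ch is not a valid digit
        if (v < 0) {
            return -1;
        }
        result = result * radix + v;
    }
    return result;
}
```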
 
 
  sequence of characters, a character `c` is a member of a character
  class designated by an iterator range \[`first`, `last`) if
  `traits_inst.isctype(c, traits_inst.lookup_classname(first, last, flags() & icase))`
  is `true`.
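
The membership test above can be tried directly against `std::regex_traits<char>`. The helper name `in_class` is hypothetical, and its `icase` parameter stands in for the `flags() & icase` value that a `basic_regex` would pass.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: reports whether `c` belongs to the character class
// named by `class_name` (for example "digit", "space", or "w").
bool in_class(char c, const std::string& class_name, bool icase = false) {
    const std::regex_traits<char> traits;
    // lookup_classname yields an empty mask for an unrecognised name,
    // in which case isctype reports false for every character.
    const auto mask = traits.lookup_classname(class_name.begin(), class_name.end(), icase);
    return traits.isctype(c, mask);
}
```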