[linalg.algs.blas1] - C++23 → Trunk

tmp/tmp9eiur0hi/{from.md → to.md} +617 -0

	@@ -0,0 +1,617 @@

+### BLAS 1 algorithms <a id="linalg.algs.blas1">[[linalg.algs.blas1]]</a>
+#### Complexity <a id="linalg.algs.blas1.complexity">[[linalg.algs.blas1.complexity]]</a>
+*Complexity:* All algorithms in [[linalg.algs.blas1]] with `mdspan`
+parameters perform a count of `mdspan` array accesses and arithmetic
+operations that is linear in the maximum product of extents of any
+`mdspan` parameter.
+#### Givens rotations <a id="linalg.algs.blas1.givens">[[linalg.algs.blas1.givens]]</a>
+##### Compute Givens rotation <a id="linalg.algs.blas1.givens.lartg">[[linalg.algs.blas1.givens.lartg]]</a>
+``` cpp
+template<class Real>
+  setup_givens_rotation_result<Real> setup_givens_rotation(Real a, Real b) noexcept;
+template<class Real>
+  setup_givens_rotation_result<complex<Real>>
+    setup_givens_rotation(complex<Real> a, complex<Real> b) noexcept;
+```
+These functions compute the Givens plane rotation represented by the two
+values c and s such that the 2 x 2 system of equations
+$$\left[ \begin{matrix}
+c             & s \\
+-\overline{s} & c \\
+\end{matrix} \right]
+\cdot
+\left[ \begin{matrix}
+a \\
+b \\
+\end{matrix} \right]
+=
+\left[ \begin{matrix}
+r \\
+0 \\
+\end{matrix} \right]$$
+holds, where c is always a real scalar, and c² + |s|² = 1. That is, c
+and s represent a 2 x 2 matrix which, when multiplied on the right by the
+input vector whose components are a and b, produces a result vector
+whose first component r is the Euclidean norm of the input vector, and
+whose second component is zero.
+[*Note 1*: These functions correspond to the LAPACK function
+`xLARTG`. — *end note*]
+*Returns:* `c, s, r`, where `c` and `s` form the Givens plane rotation
+corresponding to the input `a` and `b`, and `r` is the Euclidean norm of
+the two-component vector formed by `a` and `b`.
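The real-valued case can be illustrated with a minimal sketch. It does not use the library itself: `givens_sketch_result` and `setup_givens_sketch` are hypothetical names introduced here, and the sketch assumes that `r` is taken as the nonnegative Euclidean norm and that the identity rotation is acceptable when both inputs are zero.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for setup_givens_rotation_result<Real> (real case only).
struct givens_sketch_result {
  double c;
  double s;
  double r;
};

// Computes c, s, r such that [c s; -s c] * [a; b] = [r; 0] for real a and b.
// Assumption: r is the nonnegative Euclidean norm of (a, b).
inline givens_sketch_result setup_givens_sketch(double a, double b) {
  const double r = std::hypot(a, b);  // avoids overflow in a*a + b*b
  if (r == 0.0) {
    return {1.0, 0.0, 0.0};  // both inputs are zero: use the identity rotation
  }
  return {a / r, b / r, r};
}
```

With `a = 3` and `b = 4` this yields `r = 5`, `c = 0.6`, and `s = 0.8`, and `-s * a + c * b` is zero up to rounding.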
+##### Apply a computed Givens rotation to vectors <a id="linalg.algs.blas1.givens.rot">[[linalg.algs.blas1.givens.rot]]</a>
+``` cpp
+template<inout-vector InOutVec1, inout-vector InOutVec2, class Real>
+  void apply_givens_rotation(InOutVec1 x, InOutVec2 y, Real c, Real s);
+template<class ExecutionPolicy, inout-vector InOutVec1, inout-vector InOutVec2, class Real>
+  void apply_givens_rotation(ExecutionPolicy&& exec,
+                             InOutVec1 x, InOutVec2 y, Real c, Real s);
+template<inout-vector InOutVec1, inout-vector InOutVec2, class Real>
+  void apply_givens_rotation(InOutVec1 x, InOutVec2 y, Real c, complex<Real> s);
+template<class ExecutionPolicy, inout-vector InOutVec1, inout-vector InOutVec2, class Real>
+  void apply_givens_rotation(ExecutionPolicy&& exec,
+                             InOutVec1 x, InOutVec2 y, Real c, complex<Real> s);
+```
+[*Note 2*: These functions correspond to the BLAS function
+`xROT`. — *end note*]
+*Mandates:* *`compatible-static-extents`*`<InOutVec1, InOutVec2>(0, 0)`
+is `true`.
+*Preconditions:* `x.extent(0)` equals `y.extent(0)`.
+*Effects:* Applies the plane rotation specified by `c` and `s` to the
+input vectors `x` and `y`, as if the rotation were a 2 x 2 matrix and
+the input vectors were successive rows of a matrix with two rows.
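The *Effects* above can be sketched for real `c` and `s` as follows. This is an illustration only: `apply_givens_sketch` is a hypothetical name, and `std::vector<double>` stands in for the `mdspan` parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Applies the rotation [c s; -s c] to each column (x[i], y[i]) of the
// two-row matrix whose rows are x and y. Real case only.
inline void apply_givens_sketch(std::vector<double>& x, std::vector<double>& y,
                                double c, double s) {
  assert(x.size() == y.size());  // mirrors the precondition on the extents
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double xi = x[i];
    const double yi = y[i];
    x[i] = c * xi + s * yi;
    y[i] = -s * xi + c * yi;
  }
}
```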
+#### Swap matrix or vector elements <a id="linalg.algs.blas1.swap">[[linalg.algs.blas1.swap]]</a>
+``` cpp
+template<inout-object InOutObj1, inout-object InOutObj2>
+  void swap_elements(InOutObj1 x, InOutObj2 y);
+template<class ExecutionPolicy, inout-object InOutObj1, inout-object InOutObj2>
+  void swap_elements(ExecutionPolicy&& exec, InOutObj1 x, InOutObj2 y);
+```
+[*Note 1*: These functions correspond to the BLAS function
+`xSWAP`. — *end note*]
+*Constraints:* `x.rank()` equals `y.rank()`.
+*Mandates:* For all `r` in the range [0, `x.rank()`),
+``` cpp
+compatible-static-extents<InOutObj1, InOutObj2>(r, r)
+```
+is `true`.
+*Preconditions:* `x.extents()` equals `y.extents()`.
+*Effects:* Swaps all corresponding elements of `x` and `y`.
+#### Multiply the elements of an object in place by a scalar <a id="linalg.algs.blas1.scal">[[linalg.algs.blas1.scal]]</a>
+``` cpp
+template<class Scalar, inout-object InOutObj>
+  void scale(Scalar alpha, InOutObj x);
+template<class ExecutionPolicy, class Scalar, inout-object InOutObj>
+  void scale(ExecutionPolicy&& exec, Scalar alpha, InOutObj x);
+```
+[*Note 1*: These functions correspond to the BLAS function
+`xSCAL`. — *end note*]
+*Effects:* Overwrites x with the result of computing the elementwise
+multiplication α x, where the scalar α is `alpha`.
+#### Copy elements of one matrix or vector into another <a id="linalg.algs.blas1.copy">[[linalg.algs.blas1.copy]]</a>
+``` cpp
+template<in-object InObj, out-object OutObj>
+  void copy(InObj x, OutObj y);
+template<class ExecutionPolicy, in-object InObj, out-object OutObj>
+  void copy(ExecutionPolicy&& exec, InObj x, OutObj y);
+```
+[*Note 1*: These functions correspond to the BLAS function
+`xCOPY`. — *end note*]
+*Constraints:* `x.rank()` equals `y.rank()`.
+*Mandates:* For all `r` in the range [0, `x.rank()`),
+``` cpp
+compatible-static-extents<InObj, OutObj>(r, r)
+```
+is `true`.
+*Preconditions:* `x.extents()` equals `y.extents()`.
+*Effects:* Assigns each element of x to the corresponding element of y.
+#### Add vectors or matrices elementwise <a id="linalg.algs.blas1.add">[[linalg.algs.blas1.add]]</a>
+``` cpp
+template<in-object InObj1, in-object InObj2, out-object OutObj>
+  void add(InObj1 x, InObj2 y, OutObj z);
+template<class ExecutionPolicy, in-object InObj1, in-object InObj2, out-object OutObj>
+  void add(ExecutionPolicy&& exec,
+           InObj1 x, InObj2 y, OutObj z);
+```
+[*Note 1*: These functions correspond to the BLAS function
+`xAXPY`. — *end note*]
+*Constraints:* `x.rank()`, `y.rank()`, and `z.rank()` are all equal.
+*Mandates:* *`possibly-addable`*`<InObj1, InObj2, OutObj>()` is `true`.
+*Preconditions:* *`addable`*`(x,y,z)` is `true`.
+*Effects:* Computes z = x + y.
+*Remarks:* `z` may alias `x` or `y`.
+#### Dot product of two vectors <a id="linalg.algs.blas1.dot">[[linalg.algs.blas1.dot]]</a>
+[*Note 1*: The functions in this section correspond to the BLAS
+functions `xDOT`, `xDOTU`, and `xDOTC`. — *end note*]
+The following elements apply to all functions in
+[[linalg.algs.blas1.dot]].
+*Mandates:* `compatible-static-extents<InVec1, InVec2>(0, 0)` is `true`.
+*Preconditions:* `v1.extent(0)` equals `v2.extent(0)`.
+``` cpp
+template<in-vector InVec1, in-vector InVec2, class Scalar>
+  Scalar dot(InVec1 v1, InVec2 v2, Scalar init);
+template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2, class Scalar>
+  Scalar dot(ExecutionPolicy&& exec,
+             InVec1 v1, InVec2 v2, Scalar init);
+```
+These functions compute a non-conjugated dot product with an explicitly
+specified result type.
+*Returns:* Let `N` be `v1.extent(0)`.
+- `init` if `N` is zero;
+- otherwise, *GENERALIZED_SUM*(plus\<\>(), init, v1\[0\]\*v2\[0\], …,
+  v1\[N-1\]\*v2\[N-1\]).
+*Remarks:* If `InVec1::value_type`, `InVec2::value_type`, and `Scalar`
+are all floating-point types or specializations of `complex`, and if
+`Scalar` has higher precision than `InVec1::value_type` or
+`InVec2::value_type`, then intermediate terms in the sum use `Scalar`’s
+precision or greater.
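A minimal sketch of the *Returns* and *Remarks* elements above, assuming a sequential left-to-right sum (one of the orders *GENERALIZED_SUM* permits). `dot_sketch` is a hypothetical name and `std::vector` stands in for the `mdspan` parameters. Each term is converted to `Scalar` before the multiplication, so a `double` accumulator over `float` inputs keeps `double` precision throughout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-conjugated dot product accumulated in Scalar, starting from init.
template <class T1, class T2, class Scalar>
Scalar dot_sketch(const std::vector<T1>& v1, const std::vector<T2>& v2, Scalar init) {
  assert(v1.size() == v2.size());  // mirrors the precondition on the extents
  Scalar sum = init;               // returned unchanged when the vectors are empty
  for (std::size_t i = 0; i < v1.size(); ++i) {
    sum += static_cast<Scalar>(v1[i]) * static_cast<Scalar>(v2[i]);
  }
  return sum;
}
```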
+``` cpp
+template<in-vector InVec1, in-vector InVec2>
+  auto dot(InVec1 v1, InVec2 v2);
+template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2>
+  auto dot(ExecutionPolicy&& exec,
+           InVec1 v1, InVec2 v2);
+```
+These functions compute a non-conjugated dot product with a default
+result type.
+*Effects:* Let `T` be
+`decltype(declval<typename InVec1::value_type>() * declval<typename InVec2::value_type>())`.
+Then,
+- the two-parameter overload is equivalent to:
+  ``` cpp
+  return dot(v1, v2, T{});
+  ```
+  and
+- the three-parameter overload is equivalent to:
+  ``` cpp
+  return dot(std::forward<ExecutionPolicy>(exec), v1, v2, T{});
+  ```
+``` cpp
+template<in-vector InVec1, in-vector InVec2, class Scalar>
+  Scalar dotc(InVec1 v1, InVec2 v2, Scalar init);
+template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2, class Scalar>
+  Scalar dotc(ExecutionPolicy&& exec,
+              InVec1 v1, InVec2 v2, Scalar init);
+```
+These functions compute a conjugated dot product with an explicitly
+specified result type.
+*Effects:*
+- The three-parameter overload is equivalent to:
+  ``` cpp
+  return dot(conjugated(v1), v2, init);
+  ```
+  and
+- the four-parameter overload is equivalent to:
+  ``` cpp
+  return dot(std::forward<ExecutionPolicy>(exec), conjugated(v1), v2, init);
+  ```
+``` cpp
+template<in-vector InVec1, in-vector InVec2>
+  auto dotc(InVec1 v1, InVec2 v2);
+template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2>
+  auto dotc(ExecutionPolicy&& exec,
+            InVec1 v1, InVec2 v2);
+```
+These functions compute a conjugated dot product with a default result
+type.
+*Effects:* Let `T` be
+`decltype(`*`conj-if-needed`*`(declval<typename InVec1::value_type>()) * declval<typename InVec2::value_type>())`.
+Then,
+- the two-parameter overload is equivalent to:
+  ``` cpp
+  return dotc(v1, v2, T{});
+  ```
+  and
+- the three-parameter overload is equivalent to:
+  ``` cpp
+  return dotc(std::forward<ExecutionPolicy>(exec), v1, v2, T{});
+  ```
+#### Scaled sum of squares of a vector’s elements <a id="linalg.algs.blas1.ssq">[[linalg.algs.blas1.ssq]]</a>
+``` cpp
+template<in-vector InVec, class Scalar>
+  sum_of_squares_result<Scalar> vector_sum_of_squares(InVec v, sum_of_squares_result<Scalar> init);
+template<class ExecutionPolicy, in-vector InVec, class Scalar>
+  sum_of_squares_result<Scalar> vector_sum_of_squares(ExecutionPolicy&& exec,
+                                                      InVec v, sum_of_squares_result<Scalar> init);
+```
+[*Note 1*: These functions correspond to the LAPACK function
+`xLASSQ`. — *end note*]
+*Mandates:*
+`decltype(`*`abs-if-needed`*`(declval<typename InVec::value_type>()))`
+is convertible to `Scalar`.
+*Effects:* Returns a value `result` such that
+- `result.scaling_factor` is the maximum of `init.scaling_factor` and
+  *`abs-if-needed`*`(v[i])` for all `i` in the domain of `v`; and
+- let `s2init` be
+  ``` cpp
+  init.scaling_factor * init.scaling_factor * init.scaled_sum_of_squares
+  ```
+  then
+  `result.scaling_factor * result.scaling_factor * result.scaled_sum_of_squares`
+  equals the sum of `s2init` and the squares of
+  *`abs-if-needed`*`(v[i])` for all `i` in the domain of `v`.
+*Remarks:* If `InVec::value_type` and `Scalar` are both floating-point
+types or specializations of `complex`, and if `Scalar` has higher
+precision than `InVec::value_type`, then intermediate terms in the sum
+use `Scalar`’s precision or greater.
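The two conditions on the result can be satisfied with a two-pass computation, sketched here for `double` elements. This is one possible approach under stated assumptions, not the LAPACK algorithm: `ssq_sketch_result` and `sum_of_squares_sketch` are hypothetical names, `std::vector<double>` stands in for the `mdspan` parameter, and `init.scaling_factor` is assumed nonnegative.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for sum_of_squares_result<double>.
struct ssq_sketch_result {
  double scaling_factor;
  double scaled_sum_of_squares;
};

inline ssq_sketch_result sum_of_squares_sketch(const std::vector<double>& v,
                                               ssq_sketch_result init) {
  // Pass 1: the new scaling factor is the largest magnitude seen so far.
  double scale = init.scaling_factor;
  for (double e : v) scale = std::max(scale, std::abs(e));
  if (scale == 0.0) return {0.0, init.scaled_sum_of_squares};  // everything is zero

  // Pass 2: rescale the incoming sum, then add (|v[i]| / scale)^2 terms.
  const double ratio = init.scaling_factor / scale;
  double ssq = init.scaled_sum_of_squares * ratio * ratio;
  for (double e : v) {
    const double t = std::abs(e) / scale;
    ssq += t * t;
  }
  return {scale, ssq};
}
```

Dividing by the scaling factor before squaring keeps every term at most 1, which is what avoids overflow for large elements.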
+#### Euclidean norm of a vector <a id="linalg.algs.blas1.nrm2">[[linalg.algs.blas1.nrm2]]</a>
+``` cpp
+template<in-vector InVec, class Scalar>
+  Scalar vector_two_norm(InVec v, Scalar init);
+template<class ExecutionPolicy, in-vector InVec, class Scalar>
+  Scalar vector_two_norm(ExecutionPolicy&& exec, InVec v, Scalar init);
+```
+[*Note 1*: These functions correspond to the BLAS function
+`xNRM2`. — *end note*]
+*Mandates:* Let `a` be
+*`abs-if-needed`*`(declval<typename InVec::value_type>())`. Then,
+`decltype(init + a * a)` is convertible to `Scalar`.
+*Returns:* The square root of the sum of the square of `init` and the
+squares of the absolute values of the elements of `v`.
+[*Note 2*: For `init` equal to zero, this is the Euclidean norm (also
+called 2-norm) of the vector `v`. — *end note*]
+*Remarks:* If `InVec::value_type` and `Scalar` are both floating-point
+types or specializations of `complex`, and if `Scalar` has higher
+precision than `InVec::value_type`, then intermediate terms in the sum
+use `Scalar`’s precision or greater.
+[*Note 3*: An implementation of this function for floating-point types
+`T` can use the `scaled_sum_of_squares` result from
+`vector_sum_of_squares(v, {.scaling_factor=1.0, .scaled_sum_of_squares=init})`. — *end note*]
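The *Returns* element can be sketched for `double` elements with the same scaling idea, so that squaring large elements does not overflow. `two_norm_sketch` is a hypothetical name, `std::vector<double>` stands in for the `mdspan` parameter, and the scaling strategy is an assumption of this sketch rather than a requirement.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Returns sqrt(init^2 + sum of v[i]^2), scaling by the largest magnitude.
inline double two_norm_sketch(const std::vector<double>& v, double init) {
  double scale = std::abs(init);
  for (double e : v) scale = std::max(scale, std::abs(e));
  if (scale == 0.0) return 0.0;  // init and all elements are zero

  const double t0 = init / scale;
  double sum = t0 * t0;
  for (double e : v) {
    const double t = e / scale;  // |t| <= 1, so t * t cannot overflow
    sum += t * t;
  }
  return scale * std::sqrt(sum);
}
```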
+``` cpp
+template<in-vector InVec>
+  auto vector_two_norm(InVec v);
+template<class ExecutionPolicy, in-vector InVec>
+  auto vector_two_norm(ExecutionPolicy&& exec, InVec v);
+```
+*Effects:* Let `a` be
+*`abs-if-needed`*`(declval<typename InVec::value_type>())`. Let `T` be
+`decltype(a * a)`. Then,
+- the one-parameter overload is equivalent to:
+  ``` cpp
+  return vector_two_norm(v, T{});
+  ```
+  and
+- the two-parameter overload is equivalent to:
+  ``` cpp
+  return vector_two_norm(std::forward<ExecutionPolicy>(exec), v, T{});
+  ```
+#### Sum of absolute values of vector elements <a id="linalg.algs.blas1.asum">[[linalg.algs.blas1.asum]]</a>
+``` cpp
+template<in-vector InVec, class Scalar>
+  Scalar vector_abs_sum(InVec v, Scalar init);
+template<class ExecutionPolicy, in-vector InVec, class Scalar>
+  Scalar vector_abs_sum(ExecutionPolicy&& exec, InVec v, Scalar init);
+```
+[*Note 1*: These functions correspond to the BLAS functions `SASUM`,
+`DASUM`, `SCASUM`, and `DZASUM`. — *end note*]
+*Mandates:*
+``` cpp
+decltype(init + abs-if-needed(real-if-needed(declval<typename InVec::value_type>())) +
+                abs-if-needed(imag-if-needed(declval<typename InVec::value_type>())))
+```
+is convertible to `Scalar`.
+*Returns:* Let `N` be `v.extent(0)`.
+- `init` if `N` is zero;
+- otherwise, if `InVec::value_type` is an arithmetic type,
+  ``` cpp
+  GENERALIZED_SUM(plus<>(), init, abs-if-needed(v[0]), …, abs-if-needed(v[N-1]))
+  ```
+- otherwise,
+  ``` cpp
+  GENERALIZED_SUM(plus<>(), init,
+         abs-if-needed(real-if-needed(v[0])) + abs-if-needed(imag-if-needed(v[0])),
+         …,
+         abs-if-needed(real-if-needed(v[N-1])) + abs-if-needed(imag-if-needed(v[N-1])))
+  ```
+*Remarks:* If `InVec::value_type` and `Scalar` are both floating-point
+types or specializations of `complex`, and if `Scalar` has higher
+precision than `InVec::value_type`, then intermediate terms in the sum
+use `Scalar`’s precision or greater.
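For complex elements the summed quantity is |Re| + |Im| of each element, not its modulus. A minimal sketch of that branch, assuming `std::complex<double>` elements and a sequential sum; `abs_sum_sketch` is a hypothetical name and `std::vector` stands in for the `mdspan` parameter.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

// Returns init + sum over i of (|real(v[i])| + |imag(v[i])|).
inline double abs_sum_sketch(const std::vector<std::complex<double>>& v, double init) {
  double sum = init;  // returned unchanged when v is empty
  for (const auto& e : v) {
    sum += std::abs(e.real()) + std::abs(e.imag());
  }
  return sum;
}
```

For the elements 3 - 4i and -1 + 2i the result is 3 + 4 + 1 + 2 = 10, whereas the sum of the moduli would be 5 + √5.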
+``` cpp
+template<in-vector InVec>
+  auto vector_abs_sum(InVec v);
+template<class ExecutionPolicy, in-vector InVec>
+  auto vector_abs_sum(ExecutionPolicy&& exec, InVec v);
+```
+*Effects:* Let `T` be `typename InVec::value_type`. Then,
+- the one-parameter overload is equivalent to:
+  ``` cpp
+  return vector_abs_sum(v, T{});
+  ```
+  and
+- the two-parameter overload is equivalent to:
+  ``` cpp
+  return vector_abs_sum(std::forward<ExecutionPolicy>(exec), v, T{});
+  ```
+#### Index of maximum absolute value of vector elements <a id="linalg.algs.blas1.iamax">[[linalg.algs.blas1.iamax]]</a>
+``` cpp
+template<in-vector InVec>
+  typename InVec::extents_type vector_idx_abs_max(InVec v);
+template<class ExecutionPolicy, in-vector InVec>
+  typename InVec::extents_type vector_idx_abs_max(ExecutionPolicy&& exec, InVec v);
+```
+[*Note 1*: These functions correspond to the BLAS function
+`IxAMAX`. — *end note*]
+Let `T` be
+``` cpp
+decltype(abs-if-needed(real-if-needed(declval<typename InVec::value_type>())) +
+         abs-if-needed(imag-if-needed(declval<typename InVec::value_type>())))
+```
+*Mandates:* `declval<T>() < declval<T>()` is a valid expression.
+*Returns:*
+- `numeric_limits<typename InVec::size_type>::max()` if `v` has zero
+  elements;
+- otherwise, the index of the first element of `v` having largest
+  absolute value, if `InVec::value_type` is an arithmetic type;
+- otherwise, the index of the first element `vₑ` of `v` for which
+  ``` cpp
+  abs-if-needed(real-if-needed(vₑ)) + abs-if-needed(imag-if-needed(vₑ))
+  ```
+  has the largest value.
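The arithmetic-type branch of the *Returns* element can be sketched as follows. `idx_abs_max_sketch` is a hypothetical name, `std::vector<double>` stands in for the `mdspan` parameter, and `std::size_t` stands in for `InVec::size_type`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Index of the first element with the largest absolute value,
// or the maximum size_t value when v is empty.
inline std::size_t idx_abs_max_sketch(const std::vector<double>& v) {
  if (v.empty()) return std::numeric_limits<std::size_t>::max();
  std::size_t best = 0;
  for (std::size_t i = 1; i < v.size(); ++i) {
    // Strict comparison: on a tie the earlier index is kept.
    if (std::abs(v[best]) < std::abs(v[i])) best = i;
  }
  return best;
}
```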
+#### Frobenius norm of a matrix <a id="linalg.algs.blas1.matfrobnorm">[[linalg.algs.blas1.matfrobnorm]]</a>
+[*Note 1*: These functions exist in the BLAS standard but are not part
+of the reference implementation. — *end note*]
+``` cpp
+template<in-matrix InMat, class Scalar>
+  Scalar matrix_frob_norm(InMat A, Scalar init);
+template<class ExecutionPolicy, in-matrix InMat, class Scalar>
+  Scalar matrix_frob_norm(ExecutionPolicy&& exec, InMat A, Scalar init);
+```
+*Mandates:* Let `a` be
+*`abs-if-needed`*`(declval<typename InMat::value_type>())`. Then,
+`decltype(init + a * a)` is convertible to `Scalar`.
+*Returns:* The square root of the sum of the square of `init` and the
+squares of the absolute values of the elements of `A`.
+[*Note 2*: For `init` equal to zero, this is the Frobenius norm of the
+matrix `A`. — *end note*]
+*Remarks:* If `InMat::value_type` and `Scalar` are both floating-point
+types or specializations of `complex`, and if `Scalar` has higher
+precision than `InMat::value_type`, then intermediate terms in the sum
+use `Scalar`’s precision or greater.
+``` cpp
+template<in-matrix InMat>
+  auto matrix_frob_norm(InMat A);
+template<class ExecutionPolicy, in-matrix InMat>
+  auto matrix_frob_norm(ExecutionPolicy&& exec, InMat A);
+```
+*Effects:* Let `a` be
+*`abs-if-needed`*`(declval<typename InMat::value_type>())`. Let `T` be
+`decltype(a * a)`. Then,
+- the one-parameter overload is equivalent to:
+  ``` cpp
+  return matrix_frob_norm(A, T{});
+  ```
+  and
+- the two-parameter overload is equivalent to:
+  ``` cpp
+  return matrix_frob_norm(std::forward<ExecutionPolicy>(exec), A, T{});
+  ```
+#### One norm of a matrix <a id="linalg.algs.blas1.matonenorm">[[linalg.algs.blas1.matonenorm]]</a>
+[*Note 1*: These functions exist in the BLAS standard but are not part
+of the reference implementation. — *end note*]
+``` cpp
+template<in-matrix InMat, class Scalar>
+  Scalar matrix_one_norm(InMat A, Scalar init);
+template<class ExecutionPolicy, in-matrix InMat, class Scalar>
+  Scalar matrix_one_norm(ExecutionPolicy&& exec, InMat A, Scalar init);
+```
+*Mandates:*
+`decltype(`*`abs-if-needed`*`(declval<typename InMat::value_type>()))`
+is convertible to `Scalar`.
+*Returns:*
+- `init` if `A.extent(1)` is zero;
+- otherwise, the sum of `init` and the one norm of the matrix `A`.
+[*Note 2*: The one norm of the matrix `A` is the maximum over all
+columns of `A`, of the sum of the absolute values of the elements of the
+column. — *end note*]
+*Remarks:* If `InMat::value_type` and `Scalar` are both floating-point
+types or specializations of `complex`, and if `Scalar` has higher
+precision than `InMat::value_type`, then intermediate terms in the sum
+use `Scalar`’s precision or greater.
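A minimal sketch of the *Returns* element, assuming a dense row-major matrix of `double` stored in a `std::vector`. `one_norm_sketch` and its `rows` and `cols` parameters are hypothetical and stand in for the `mdspan` parameter and its extents.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns init plus the largest column sum of absolute values.
// a holds a rows-by-cols matrix in row-major order.
inline double one_norm_sketch(const std::vector<double>& a, std::size_t rows,
                              std::size_t cols, double init) {
  assert(a.size() == rows * cols);
  if (cols == 0) return init;  // no columns: only init remains
  double best = 0.0;
  for (std::size_t j = 0; j < cols; ++j) {
    double column_sum = 0.0;
    for (std::size_t i = 0; i < rows; ++i) {
      column_sum += std::abs(a[i * cols + j]);
    }
    best = std::max(best, column_sum);
  }
  return init + best;
}
```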
+``` cpp
+template<in-matrix InMat>
+  auto matrix_one_norm(InMat A);
+template<class ExecutionPolicy, in-matrix InMat>
+  auto matrix_one_norm(ExecutionPolicy&& exec, InMat A);
+```
+*Effects:* Let `T` be
+`decltype(`*`abs-if-needed`*`(declval<typename InMat::value_type>()))`.
+Then,
+- the one-parameter overload is equivalent to:
+  ``` cpp
+  return matrix_one_norm(A, T{});
+  ```
+  and
+- the two-parameter overload is equivalent to:
+  ``` cpp
+  return matrix_one_norm(std::forward<ExecutionPolicy>(exec), A, T{});
+  ```
+#### Infinity norm of a matrix <a id="linalg.algs.blas1.matinfnorm">[[linalg.algs.blas1.matinfnorm]]</a>
+[*Note 1*: These functions exist in the BLAS standard but are not part
+of the reference implementation. — *end note*]
+``` cpp
+template<in-matrix InMat, class Scalar>
+  Scalar matrix_inf_norm(InMat A, Scalar init);
+template<class ExecutionPolicy, in-matrix InMat, class Scalar>
+  Scalar matrix_inf_norm(ExecutionPolicy&& exec, InMat A, Scalar init);
+```
+*Mandates:*
+`decltype(`*`abs-if-needed`*`(declval<typename InMat::value_type>()))`
+is convertible to `Scalar`.
+*Returns:*
+- `init` if `A.extent(0)` is zero;
+- otherwise, the sum of `init` and the infinity norm of the matrix `A`.
+[*Note 2*: The infinity norm of the matrix `A` is the maximum over all
+rows of `A`, of the sum of the absolute values of the elements of the
+row. — *end note*]
+*Remarks:* If `InMat::value_type` and `Scalar` are both floating-point
+types or specializations of `complex`, and if `Scalar` has higher
+precision than `InMat::value_type`, then intermediate terms in the sum
+use `Scalar`’s precision or greater.
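A minimal sketch of the *Returns* element, assuming a dense row-major matrix of `double` stored in a `std::vector`. `inf_norm_sketch` and its `rows` and `cols` parameters are hypothetical and stand in for the `mdspan` parameter and its extents.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns init plus the largest row sum of absolute values.
// a holds a rows-by-cols matrix in row-major order.
inline double inf_norm_sketch(const std::vector<double>& a, std::size_t rows,
                              std::size_t cols, double init) {
  assert(a.size() == rows * cols);
  if (rows == 0) return init;  // no rows: only init remains
  double best = 0.0;
  for (std::size_t i = 0; i < rows; ++i) {
    double row_sum = 0.0;
    for (std::size_t j = 0; j < cols; ++j) {
      row_sum += std::abs(a[i * cols + j]);
    }
    best = std::max(best, row_sum);
  }
  return init + best;
}
```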
+``` cpp
+template<in-matrix InMat>
+  auto matrix_inf_norm(InMat A);
+template<class ExecutionPolicy, in-matrix InMat>
+  auto matrix_inf_norm(ExecutionPolicy&& exec, InMat A);
+```
+*Effects:* Let `T` be
+`decltype(`*`abs-if-needed`*`(declval<typename InMat::value_type>()))`.
+Then,
+- the one-parameter overload is equivalent to:
+  ``` cpp
+  return matrix_inf_norm(A, T{});
+  ```
+  and
+- the two-parameter overload is equivalent to:
+  ``` cpp
+  return matrix_inf_norm(std::forward<ExecutionPolicy>(exec), A, T{});
+  ```
