From Jason Turner

[linalg]

Large diff (160.6 KB) - rendering may be slow on some devices

Diff to HTML by rtfpessoa

Files changed (1) hide show
  1. tmp/tmp2jy82_1w/{from.md → to.md} +3998 -0
tmp/tmp2jy82_1w/{from.md → to.md} RENAMED
@@ -0,0 +1,3998 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ## Basic linear algebra algorithms <a id="linalg">[[linalg]]</a>
2
+
3
+ ### Overview <a id="linalg.overview">[[linalg.overview]]</a>
4
+
5
+ Subclause [[linalg]] defines basic linear algebra algorithms. The
6
+ algorithms that access the elements of arrays view those elements
7
+ through `mdspan` [[views.multidim]].
8
+
9
+ ### Header `<linalg>` synopsis <a id="linalg.syn">[[linalg.syn]]</a>
10
+
11
+ ``` cpp
12
+ namespace std::linalg {
13
+ // [linalg.tags.order], storage order tags
14
+ struct column_major_t;
15
+ inline constexpr column_major_t column_major;
16
+ struct row_major_t;
17
+ inline constexpr row_major_t row_major;
18
+
19
+ // [linalg.tags.triangle], triangle tags
20
+ struct upper_triangle_t;
21
+ inline constexpr upper_triangle_t upper_triangle;
22
+ struct lower_triangle_t;
23
+ inline constexpr lower_triangle_t lower_triangle;
24
+
25
+ // [linalg.tags.diagonal], diagonal tags
26
+ struct implicit_unit_diagonal_t;
27
+ inline constexpr implicit_unit_diagonal_t implicit_unit_diagonal;
28
+ struct explicit_diagonal_t;
29
+ inline constexpr explicit_diagonal_t explicit_diagonal;
30
+
31
+ // [linalg.layout.packed], class template layout_blas_packed
32
+ template<class Triangle, class StorageOrder>
33
+ class layout_blas_packed;
34
+
35
+ // [linalg.helpers], exposition-only helpers
36
+
37
+ // [linalg.helpers.concepts], linear algebra argument concepts
38
+ template<class T>
39
+ constexpr bool is-mdspan = see below; // exposition only
40
+
41
+ template<class T>
42
+ concept in-vector = see below; // exposition only
43
+
44
+ template<class T>
45
+ concept out-vector = see below; // exposition only
46
+
47
+ template<class T>
48
+ concept inout-vector = see below; // exposition only
49
+
50
+ template<class T>
51
+ concept in-matrix = see below; // exposition only
52
+
53
+ template<class T>
54
+ concept out-matrix = see below; // exposition only
55
+
56
+ template<class T>
57
+ concept inout-matrix = see below; // exposition only
58
+
59
+ template<class T>
60
+ concept possibly-packed-inout-matrix = see below; // exposition only
61
+
62
+ template<class T>
63
+ concept in-object = see below; // exposition only
64
+
65
+ template<class T>
66
+ concept out-object = see below; // exposition only
67
+
68
+ template<class T>
69
+ concept inout-object = see below; // exposition only
70
+
71
+ // [linalg.scaled], scaled in-place transformation
72
+
73
+ // [linalg.scaled.scaledaccessor], class template scaled_accessor
74
+ template<class ScalingFactor, class NestedAccessor>
75
+ class scaled_accessor;
76
+
77
+ // [linalg.scaled.scaled], function template scaled
78
+ template<class ScalingFactor,
79
+ class ElementType, class Extents, class Layout, class Accessor>
80
+ constexpr auto scaled(ScalingFactor alpha, mdspan<ElementType, Extents, Layout, Accessor> x);
81
+
82
+ // [linalg.conj], conjugated in-place transformation
83
+
84
+ // [linalg.conj.conjugatedaccessor], class template conjugated_accessor
85
+ template<class NestedAccessor>
86
+ class conjugated_accessor;
87
+
88
+ // [linalg.conj.conjugated], function template conjugated
89
+ template<class ElementType, class Extents, class Layout, class Accessor>
90
+ constexpr auto conjugated(mdspan<ElementType, Extents, Layout, Accessor> a);
91
+
92
+ // [linalg.transp], transpose in-place transformation
93
+
94
+ // [linalg.transp.layout.transpose], class template layout_transpose
95
+ template<class Layout>
96
+ class layout_transpose;
97
+
98
+ // [linalg.transp.transposed], function template transposed
99
+ template<class ElementType, class Extents, class Layout, class Accessor>
100
+ constexpr auto transposed(mdspan<ElementType, Extents, Layout, Accessor> a);
101
+
102
+ // [linalg.conjtransposed], conjugated transpose in-place transformation
103
+ template<class ElementType, class Extents, class Layout, class Accessor>
104
+ constexpr auto conjugate_transposed(mdspan<ElementType, Extents, Layout, Accessor> a);
105
+
106
+ // [linalg.algs.blas1], BLAS 1 algorithms
107
+
108
+ // [linalg.algs.blas1.givens], Givens rotations
109
+
110
+ // [linalg.algs.blas1.givens.lartg], compute Givens rotation
111
+
112
+ template<class Real>
113
+ struct setup_givens_rotation_result {
114
+ Real c;
115
+ Real s;
116
+ Real r;
117
+ };
118
+ template<class Real>
119
+ struct setup_givens_rotation_result<complex<Real>> {
120
+ Real c;
121
+ complex<Real> s;
122
+ complex<Real> r;
123
+ };
124
+
125
+ template<class Real>
126
+ setup_givens_rotation_result<Real> setup_givens_rotation(Real a, Real b) noexcept;
127
+
128
+ template<class Real>
129
+ setup_givens_rotation_result<complex<Real>>
130
+ setup_givens_rotation(complex<Real> a, complex<Real> b) noexcept;
131
+
132
+ // [linalg.algs.blas1.givens.rot], apply computed Givens rotation
133
+ template<inout-vector InOutVec1, inout-vector InOutVec2, class Real>
134
+ void apply_givens_rotation(InOutVec1 x, InOutVec2 y, Real c, Real s);
135
+ template<class ExecutionPolicy, inout-vector InOutVec1, inout-vector InOutVec2, class Real>
136
+ void apply_givens_rotation(ExecutionPolicy&& exec,
137
+ InOutVec1 x, InOutVec2 y, Real c, Real s);
138
+ template<inout-vector InOutVec1, inout-vector InOutVec2, class Real>
139
+ void apply_givens_rotation(InOutVec1 x, InOutVec2 y, Real c, complex<Real> s);
140
+ template<class ExecutionPolicy, inout-vector InOutVec1, inout-vector InOutVec2, class Real>
141
+ void apply_givens_rotation(ExecutionPolicy&& exec,
142
+ InOutVec1 x, InOutVec2 y, Real c, complex<Real> s);
143
+
144
+ // [linalg.algs.blas1.swap], swap elements
145
+ template<inout-object InOutObj1, inout-object InOutObj2>
146
+ void swap_elements(InOutObj1 x, InOutObj2 y);
147
+ template<class ExecutionPolicy, inout-object InOutObj1, inout-object InOutObj2>
148
+ void swap_elements(ExecutionPolicy&& exec, InOutObj1 x, InOutObj2 y);
149
+
150
+ // [linalg.algs.blas1.scal], multiply elements by scalar
151
+ template<class Scalar, inout-object InOutObj>
152
+ void scale(Scalar alpha, InOutObj x);
153
+ template<class ExecutionPolicy, class Scalar, inout-object InOutObj>
154
+ void scale(ExecutionPolicy&& exec, Scalar alpha, InOutObj x);
155
+
156
+ // [linalg.algs.blas1.copy], copy elements
157
+ template<in-object InObj, out-object OutObj>
158
+ void copy(InObj x, OutObj y);
159
+ template<class ExecutionPolicy, in-object InObj, out-object OutObj>
160
+ void copy(ExecutionPolicy&& exec, InObj x, OutObj y);
161
+
162
+ // [linalg.algs.blas1.add], add elementwise
163
+ template<in-object InObj1, in-object InObj2, out-object OutObj>
164
+ void add(InObj1 x, InObj2 y, OutObj z);
165
+ template<class ExecutionPolicy, in-object InObj1, in-object InObj2, out-object OutObj>
166
+ void add(ExecutionPolicy&& exec, InObj1 x, InObj2 y, OutObj z);
167
+
168
+ // [linalg.algs.blas1.dot], dot product of two vectors
169
+ template<in-vector InVec1, in-vector InVec2, class Scalar>
170
+ Scalar dot(InVec1 v1, InVec2 v2, Scalar init);
171
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2, class Scalar>
172
+ Scalar dot(ExecutionPolicy&& exec, InVec1 v1, InVec2 v2, Scalar init);
173
+ template<in-vector InVec1, in-vector InVec2>
174
+ auto dot(InVec1 v1, InVec2 v2);
175
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2>
176
+ auto dot(ExecutionPolicy&& exec, InVec1 v1, InVec2 v2);
177
+
178
+ template<in-vector InVec1, in-vector InVec2, class Scalar>
179
+ Scalar dotc(InVec1 v1, InVec2 v2, Scalar init);
180
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2, class Scalar>
181
+ Scalar dotc(ExecutionPolicy&& exec, InVec1 v1, InVec2 v2, Scalar init);
182
+ template<in-vector InVec1, in-vector InVec2>
183
+ auto dotc(InVec1 v1, InVec2 v2);
184
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2>
185
+ auto dotc(ExecutionPolicy&& exec, InVec1 v1, InVec2 v2);
186
+
187
+ // [linalg.algs.blas1.ssq], scaled sum of squares of a vector's elements
188
+ template<class Scalar>
189
+ struct sum_of_squares_result {
190
+ Scalar scaling_factor;
191
+ Scalar scaled_sum_of_squares;
192
+ };
193
+ template<in-vector InVec, class Scalar>
194
+ sum_of_squares_result<Scalar>
195
+ vector_sum_of_squares(InVec v, sum_of_squares_result<Scalar> init);
196
+ template<class ExecutionPolicy, in-vector InVec, class Scalar>
197
+ sum_of_squares_result<Scalar>
198
+ vector_sum_of_squares(ExecutionPolicy&& exec,
199
+ InVec v, sum_of_squares_result<Scalar> init);
200
+
201
+ // [linalg.algs.blas1.nrm2], Euclidean norm of a vector
202
+ template<in-vector InVec, class Scalar>
203
+ Scalar vector_two_norm(InVec v, Scalar init);
204
+ template<class ExecutionPolicy, in-vector InVec, class Scalar>
205
+ Scalar vector_two_norm(ExecutionPolicy&& exec, InVec v, Scalar init);
206
+ template<in-vector InVec>
207
+ auto vector_two_norm(InVec v);
208
+ template<class ExecutionPolicy, in-vector InVec>
209
+ auto vector_two_norm(ExecutionPolicy&& exec, InVec v);
210
+
211
+ // [linalg.algs.blas1.asum], sum of absolute values of vector elements
212
+ template<in-vector InVec, class Scalar>
213
+ Scalar vector_abs_sum(InVec v, Scalar init);
214
+ template<class ExecutionPolicy, in-vector InVec, class Scalar>
215
+ Scalar vector_abs_sum(ExecutionPolicy&& exec, InVec v, Scalar init);
216
+ template<in-vector InVec>
217
+ auto vector_abs_sum(InVec v);
218
+ template<class ExecutionPolicy, in-vector InVec>
219
+ auto vector_abs_sum(ExecutionPolicy&& exec, InVec v);
220
+
221
+ // [linalg.algs.blas1.iamax], index of maximum absolute value of vector elements
222
+ template<in-vector InVec>
223
+ typename InVec::extents_type vector_idx_abs_max(InVec v);
224
+ template<class ExecutionPolicy, in-vector InVec>
225
+ typename InVec::extents_type vector_idx_abs_max(ExecutionPolicy&& exec, InVec v);
226
+
227
+ // [linalg.algs.blas1.matfrobnorm], Frobenius norm of a matrix
228
+ template<in-matrix InMat, class Scalar>
229
+ Scalar matrix_frob_norm(InMat A, Scalar init);
230
+ template<class ExecutionPolicy, in-matrix InMat, class Scalar>
231
+ Scalar matrix_frob_norm(ExecutionPolicy&& exec, InMat A, Scalar init);
232
+ template<in-matrix InMat>
233
+ auto matrix_frob_norm(InMat A);
234
+ template<class ExecutionPolicy, in-matrix InMat>
235
+ auto matrix_frob_norm(ExecutionPolicy&& exec, InMat A);
236
+
237
+ // [linalg.algs.blas1.matonenorm], one norm of a matrix
238
+ template<in-matrix InMat, class Scalar>
239
+ Scalar matrix_one_norm(InMat A, Scalar init);
240
+ template<class ExecutionPolicy, in-matrix InMat, class Scalar>
241
+ Scalar matrix_one_norm(ExecutionPolicy&& exec, InMat A, Scalar init);
242
+ template<in-matrix InMat>
243
+ auto matrix_one_norm(InMat A);
244
+ template<class ExecutionPolicy, in-matrix InMat>
245
+ auto matrix_one_norm(ExecutionPolicy&& exec, InMat A);
246
+
247
+ // [linalg.algs.blas1.matinfnorm], infinity norm of a matrix
248
+ template<in-matrix InMat, class Scalar>
249
+ Scalar matrix_inf_norm(InMat A, Scalar init);
250
+ template<class ExecutionPolicy, in-matrix InMat, class Scalar>
251
+ Scalar matrix_inf_norm(ExecutionPolicy&& exec, InMat A, Scalar init);
252
+ template<in-matrix InMat>
253
+ auto matrix_inf_norm(InMat A);
254
+ template<class ExecutionPolicy, in-matrix InMat>
255
+ auto matrix_inf_norm(ExecutionPolicy&& exec, InMat A);
256
+
257
+ // [linalg.algs.blas2], BLAS 2 algorithms
258
+
259
+ // [linalg.algs.blas2.gemv], general matrix-vector product
260
+ template<in-matrix InMat, in-vector InVec, out-vector OutVec>
261
+ void matrix_vector_product(InMat A, InVec x, OutVec y);
262
+ template<class ExecutionPolicy, in-matrix InMat, in-vector InVec, out-vector OutVec>
263
+ void matrix_vector_product(ExecutionPolicy&& exec, InMat A, InVec x, OutVec y);
264
+ template<in-matrix InMat, in-vector InVec1, in-vector InVec2, out-vector OutVec>
265
+ void matrix_vector_product(InMat A, InVec1 x, InVec2 y, OutVec z);
266
+ template<class ExecutionPolicy,
267
+ in-matrix InMat, in-vector InVec1, in-vector InVec2, out-vector OutVec>
268
+ void matrix_vector_product(ExecutionPolicy&& exec, InMat A, InVec1 x, InVec2 y, OutVec z);
269
+
270
+ // [linalg.algs.blas2.symv], symmetric matrix-vector product
271
+ template<in-matrix InMat, class Triangle, in-vector InVec, out-vector OutVec>
272
+ void symmetric_matrix_vector_product(InMat A, Triangle t, InVec x, OutVec y);
273
+ template<class ExecutionPolicy,
274
+ in-matrix InMat, class Triangle, in-vector InVec, out-vector OutVec>
275
+ void symmetric_matrix_vector_product(ExecutionPolicy&& exec,
276
+ InMat A, Triangle t, InVec x, OutVec y);
277
+ template<in-matrix InMat, class Triangle, in-vector InVec1, in-vector InVec2,
278
+ out-vector OutVec>
279
+ void symmetric_matrix_vector_product(InMat A, Triangle t, InVec1 x, InVec2 y, OutVec z);
280
+ template<class ExecutionPolicy,
281
+ in-matrix InMat, class Triangle, in-vector InVec1, in-vector InVec2,
282
+ out-vector OutVec>
283
+ void symmetric_matrix_vector_product(ExecutionPolicy&& exec,
284
+ InMat A, Triangle t, InVec1 x, InVec2 y, OutVec z);
285
+
286
+ // [linalg.algs.blas2.hemv], Hermitian matrix-vector product
287
+ template<in-matrix InMat, class Triangle, in-vector InVec, out-vector OutVec>
288
+ void hermitian_matrix_vector_product(InMat A, Triangle t, InVec x, OutVec y);
289
+ template<class ExecutionPolicy,
290
+ in-matrix InMat, class Triangle, in-vector InVec, out-vector OutVec>
291
+ void hermitian_matrix_vector_product(ExecutionPolicy&& exec,
292
+ InMat A, Triangle t, InVec x, OutVec y);
293
+ template<in-matrix InMat, class Triangle, in-vector InVec1, in-vector InVec2,
294
+ out-vector OutVec>
295
+ void hermitian_matrix_vector_product(InMat A, Triangle t, InVec1 x, InVec2 y, OutVec z);
296
+ template<class ExecutionPolicy,
297
+ in-matrix InMat, class Triangle, in-vector InVec1, in-vector InVec2,
298
+ out-vector OutVec>
299
+ void hermitian_matrix_vector_product(ExecutionPolicy&& exec,
300
+ InMat A, Triangle t, InVec1 x, InVec2 y, OutVec z);
301
+
302
+ // [linalg.algs.blas2.trmv], triangular matrix-vector product
303
+
304
+ // Overwriting triangular matrix-vector product
305
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, in-vector InVec,
306
+ out-vector OutVec>
307
+ void triangular_matrix_vector_product(InMat A, Triangle t, DiagonalStorage d,
308
+ InVec x, OutVec y);
309
+ template<class ExecutionPolicy,
310
+ in-matrix InMat, class Triangle, class DiagonalStorage, in-vector InVec,
311
+ out-vector OutVec>
312
+ void triangular_matrix_vector_product(ExecutionPolicy&& exec,
313
+ InMat A, Triangle t, DiagonalStorage d,
314
+ InVec x, OutVec y);
315
+
316
+ // In-place triangular matrix-vector product
317
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, inout-vector InOutVec>
318
+ void triangular_matrix_vector_product(InMat A, Triangle t, DiagonalStorage d, InOutVec y);
319
+ template<class ExecutionPolicy,
320
+ in-matrix InMat, class Triangle, class DiagonalStorage, inout-vector InOutVec>
321
+ void triangular_matrix_vector_product(ExecutionPolicy&& exec,
322
+ InMat A, Triangle t, DiagonalStorage d, InOutVec y);
323
+
324
+ // Updating triangular matrix-vector product
325
+ template<in-matrix InMat, class Triangle, class DiagonalStorage,
326
+ in-vector InVec1, in-vector InVec2, out-vector OutVec>
327
+ void triangular_matrix_vector_product(InMat A, Triangle t, DiagonalStorage d,
328
+ InVec1 x, InVec2 y, OutVec z);
329
+ template<class ExecutionPolicy, in-matrix InMat, class Triangle, class DiagonalStorage,
330
+ in-vector InVec1, in-vector InVec2, out-vector OutVec>
331
+ void triangular_matrix_vector_product(ExecutionPolicy&& exec,
332
+ InMat A, Triangle t, DiagonalStorage d,
333
+ InVec1 x, InVec2 y, OutVec z);
334
+
335
+ // [linalg.algs.blas2.trsv], solve a triangular linear system
336
+
337
+ // Solve a triangular linear system, not in place
338
+ template<in-matrix InMat, class Triangle, class DiagonalStorage,
339
+ in-vector InVec, out-vector OutVec, class BinaryDivideOp>
340
+ void triangular_matrix_vector_solve(InMat A, Triangle t, DiagonalStorage d,
341
+ InVec b, OutVec x, BinaryDivideOp divide);
342
+ template<class ExecutionPolicy, in-matrix InMat, class Triangle, class DiagonalStorage,
343
+ in-vector InVec, out-vector OutVec, class BinaryDivideOp>
344
+ void triangular_matrix_vector_solve(ExecutionPolicy&& exec,
345
+ InMat A, Triangle t, DiagonalStorage d,
346
+ InVec b, OutVec x, BinaryDivideOp divide);
347
+ template<in-matrix InMat, class Triangle, class DiagonalStorage,
348
+ in-vector InVec, out-vector OutVec>
349
+ void triangular_matrix_vector_solve(InMat A, Triangle t, DiagonalStorage d,
350
+ InVec b, OutVec x);
351
+ template<class ExecutionPolicy, in-matrix InMat, class Triangle, class DiagonalStorage,
352
+ in-vector InVec, out-vector OutVec>
353
+ void triangular_matrix_vector_solve(ExecutionPolicy&& exec,
354
+ InMat A, Triangle t, DiagonalStorage d,
355
+ InVec b, OutVec x);
356
+
357
+ // Solve a triangular linear system, in place
358
+ template<in-matrix InMat, class Triangle, class DiagonalStorage,
359
+ inout-vector InOutVec, class BinaryDivideOp>
360
+ void triangular_matrix_vector_solve(InMat A, Triangle t, DiagonalStorage d,
361
+ InOutVec b, BinaryDivideOp divide);
362
+ template<class ExecutionPolicy, in-matrix InMat, class Triangle, class DiagonalStorage,
363
+ inout-vector InOutVec, class BinaryDivideOp>
364
+ void triangular_matrix_vector_solve(ExecutionPolicy&& exec,
365
+ InMat A, Triangle t, DiagonalStorage d,
366
+ InOutVec b, BinaryDivideOp divide);
367
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, inout-vector InOutVec>
368
+ void triangular_matrix_vector_solve(InMat A, Triangle t, DiagonalStorage d, InOutVec b);
369
+ template<class ExecutionPolicy,
370
+ in-matrix InMat, class Triangle, class DiagonalStorage, inout-vector InOutVec>
371
+ void triangular_matrix_vector_solve(ExecutionPolicy&& exec,
372
+ InMat A, Triangle t, DiagonalStorage d, InOutVec b);
373
+
374
+ // [linalg.algs.blas2.rank1], nonsymmetric rank-1 matrix update
375
+ template<in-vector InVec1, in-vector InVec2, inout-matrix InOutMat>
376
+ void matrix_rank_1_update(InVec1 x, InVec2 y, InOutMat A);
377
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2, inout-matrix InOutMat>
378
+ void matrix_rank_1_update(ExecutionPolicy&& exec, InVec1 x, InVec2 y, InOutMat A);
379
+
380
+ template<in-vector InVec1, in-vector InVec2, inout-matrix InOutMat>
381
+ void matrix_rank_1_update_c(InVec1 x, InVec2 y, InOutMat A);
382
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2, inout-matrix InOutMat>
383
+ void matrix_rank_1_update_c(ExecutionPolicy&& exec, InVec1 x, InVec2 y, InOutMat A);
384
+
385
+ // [linalg.algs.blas2.symherrank1], symmetric or Hermitian rank-1 matrix update
386
+ template<class Scalar, in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
387
+ void symmetric_matrix_rank_1_update(Scalar alpha, InVec x, InOutMat A, Triangle t);
388
+ template<class ExecutionPolicy,
389
+ class Scalar, in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
390
+ void symmetric_matrix_rank_1_update(ExecutionPolicy&& exec,
391
+ Scalar alpha, InVec x, InOutMat A, Triangle t);
392
+ template<in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
393
+ void symmetric_matrix_rank_1_update(InVec x, InOutMat A, Triangle t);
394
+ template<class ExecutionPolicy,
395
+ in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
396
+ void symmetric_matrix_rank_1_update(ExecutionPolicy&& exec, InVec x, InOutMat A, Triangle t);
397
+
398
+ template<class Scalar, in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
399
+ void hermitian_matrix_rank_1_update(Scalar alpha, InVec x, InOutMat A, Triangle t);
400
+ template<class ExecutionPolicy,
401
+ class Scalar, in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
402
+ void hermitian_matrix_rank_1_update(ExecutionPolicy&& exec,
403
+ Scalar alpha, InVec x, InOutMat A, Triangle t);
404
+ template<in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
405
+ void hermitian_matrix_rank_1_update(InVec x, InOutMat A, Triangle t);
406
+ template<class ExecutionPolicy,
407
+ in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
408
+ void hermitian_matrix_rank_1_update(ExecutionPolicy&& exec, InVec x, InOutMat A, Triangle t);
409
+
410
+ // [linalg.algs.blas2.rank2], symmetric and Hermitian rank-2 matrix updates
411
+
412
+ // symmetric rank-2 matrix update
413
+ template<in-vector InVec1, in-vector InVec2,
414
+ possibly-packed-inout-matrix InOutMat, class Triangle>
415
+ void symmetric_matrix_rank_2_update(InVec1 x, InVec2 y, InOutMat A, Triangle t);
416
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2,
417
+ possibly-packed-inout-matrix InOutMat, class Triangle>
418
+ void symmetric_matrix_rank_2_update(ExecutionPolicy&& exec,
419
+ InVec1 x, InVec2 y, InOutMat A, Triangle t);
420
+
421
+ // Hermitian rank-2 matrix update
422
+ template<in-vector InVec1, in-vector InVec2,
423
+ possibly-packed-inout-matrix InOutMat, class Triangle>
424
+ void hermitian_matrix_rank_2_update(InVec1 x, InVec2 y, InOutMat A, Triangle t);
425
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2,
426
+ possibly-packed-inout-matrix InOutMat, class Triangle>
427
+ void hermitian_matrix_rank_2_update(ExecutionPolicy&& exec,
428
+ InVec1 x, InVec2 y, InOutMat A, Triangle t);
429
+
430
+ // [linalg.algs.blas3], BLAS 3 algorithms
431
+
432
+ // [linalg.algs.blas3.gemm], general matrix-matrix product
433
+ template<in-matrix InMat1, in-matrix InMat2, out-matrix OutMat>
434
+ void matrix_product(InMat1 A, InMat2 B, OutMat C);
435
+ template<class ExecutionPolicy, in-matrix InMat1, in-matrix InMat2, out-matrix OutMat>
436
+ void matrix_product(ExecutionPolicy&& exec,
437
+ InMat1 A, InMat2 B, OutMat C);
438
+ template<in-matrix InMat1, in-matrix InMat2, in-matrix InMat3, out-matrix OutMat>
439
+ void matrix_product(InMat1 A, InMat2 B, InMat3 E, OutMat C);
440
+ template<class ExecutionPolicy,
441
+ in-matrix InMat1, in-matrix InMat2, in-matrix InMat3, out-matrix OutMat>
442
+ void matrix_product(ExecutionPolicy&& exec,
443
+ InMat1 A, InMat2 B, InMat3 E, OutMat C);
444
+
445
+ // [linalg.algs.blas3.xxmm], symmetric, Hermitian, and triangular matrix-matrix product
446
+
447
+ template<in-matrix InMat1, class Triangle, in-matrix InMat2, out-matrix OutMat>
448
+ void symmetric_matrix_product(InMat1 A, Triangle t, InMat2 B, OutMat C);
449
+ template<class ExecutionPolicy,
450
+ in-matrix InMat1, class Triangle, in-matrix InMat2, out-matrix OutMat>
451
+ void symmetric_matrix_product(ExecutionPolicy&& exec,
452
+ InMat1 A, Triangle t, InMat2 B, OutMat C);
453
+
454
+ template<in-matrix InMat1, class Triangle, in-matrix InMat2, out-matrix OutMat>
455
+ void hermitian_matrix_product(InMat1 A, Triangle t, InMat2 B, OutMat C);
456
+ template<class ExecutionPolicy,
457
+ in-matrix InMat1, class Triangle, in-matrix InMat2, out-matrix OutMat>
458
+ void hermitian_matrix_product(ExecutionPolicy&& exec,
459
+ InMat1 A, Triangle t, InMat2 B, OutMat C);
460
+
461
+ template<in-matrix InMat1, class Triangle, class DiagonalStorage,
462
+ in-matrix InMat2, out-matrix OutMat>
463
+ void triangular_matrix_product(InMat1 A, Triangle t, DiagonalStorage d, InMat2 B, OutMat C);
464
+ template<class ExecutionPolicy,
465
+ in-matrix InMat1, class Triangle, class DiagonalStorage,
466
+ in-matrix InMat2, out-matrix OutMat>
467
+ void triangular_matrix_product(ExecutionPolicy&& exec,
468
+ InMat1 A, Triangle t, DiagonalStorage d, InMat2 B, OutMat C);
469
+
470
+ template<in-matrix InMat1, in-matrix InMat2, class Triangle, out-matrix OutMat>
471
+ void symmetric_matrix_product(InMat1 A, InMat2 B, Triangle t, OutMat C);
472
+ template<class ExecutionPolicy,
473
+ in-matrix InMat1, in-matrix InMat2, class Triangle, out-matrix OutMat>
474
+ void symmetric_matrix_product(ExecutionPolicy&& exec,
475
+ InMat1 A, InMat2 B, Triangle t, OutMat C);
476
+
477
+ template<in-matrix InMat1, in-matrix InMat2, class Triangle, out-matrix OutMat>
478
+ void hermitian_matrix_product(InMat1 A, InMat2 B, Triangle t, OutMat C);
479
+ template<class ExecutionPolicy,
480
+ in-matrix InMat1, in-matrix InMat2, class Triangle, out-matrix OutMat>
481
+ void hermitian_matrix_product(ExecutionPolicy&& exec,
482
+ InMat1 A, InMat2 B, Triangle t, OutMat C);
483
+
484
+ template<in-matrix InMat1, in-matrix InMat2, class Triangle, class DiagonalStorage,
485
+ out-matrix OutMat>
486
+ void triangular_matrix_product(InMat1 A, InMat2 B, Triangle t, DiagonalStorage d, OutMat C);
487
+ template<class ExecutionPolicy,
488
+ in-matrix InMat1, in-matrix InMat2, class Triangle, class DiagonalStorage,
489
+ out-matrix OutMat>
490
+ void triangular_matrix_product(ExecutionPolicy&& exec,
491
+ InMat1 A, InMat2 B, Triangle t, DiagonalStorage d, OutMat C);
492
+
493
+ template<in-matrix InMat1, class Triangle, in-matrix InMat2, in-matrix InMat3,
494
+ out-matrix OutMat>
495
+ void symmetric_matrix_product(InMat1 A, Triangle t, InMat2 B, InMat3 E, OutMat C);
496
+ template<class ExecutionPolicy,
497
+ in-matrix InMat1, class Triangle, in-matrix InMat2, in-matrix InMat3,
498
+ out-matrix OutMat>
499
+ void symmetric_matrix_product(ExecutionPolicy&& exec,
500
+ InMat1 A, Triangle t, InMat2 B, InMat3 E, OutMat C);
501
+
502
+ template<in-matrix InMat1, class Triangle, in-matrix InMat2, in-matrix InMat3,
503
+ out-matrix OutMat>
504
+ void hermitian_matrix_product(InMat1 A, Triangle t, InMat2 B, InMat3 E, OutMat C);
505
+ template<class ExecutionPolicy,
506
+ in-matrix InMat1, class Triangle, in-matrix InMat2, in-matrix InMat3,
507
+ out-matrix OutMat>
508
+ void hermitian_matrix_product(ExecutionPolicy&& exec,
509
+ InMat1 A, Triangle t, InMat2 B, InMat3 E, OutMat C);
510
+
511
+ template<in-matrix InMat1, class Triangle, class DiagonalStorage,
512
+ in-matrix InMat2, in-matrix InMat3, out-matrix OutMat>
513
+ void triangular_matrix_product(InMat1 A, Triangle t, DiagonalStorage d, InMat2 B, InMat3 E,
514
+ OutMat C);
515
+ template<class ExecutionPolicy,
516
+ in-matrix InMat1, class Triangle, class DiagonalStorage,
517
+ in-matrix InMat2, in-matrix InMat3, out-matrix OutMat>
518
+ void triangular_matrix_product(ExecutionPolicy&& exec,
519
+ InMat1 A, Triangle t, DiagonalStorage d, InMat2 B, InMat3 E,
520
+ OutMat C);
521
+
522
+ template<in-matrix InMat1, in-matrix InMat2, class Triangle, in-matrix InMat3,
523
+ out-matrix OutMat>
524
+ void symmetric_matrix_product(InMat1 A, InMat2 B, Triangle t, InMat3 E, OutMat C);
525
+ template<class ExecutionPolicy,
526
+ in-matrix InMat1, in-matrix InMat2, class Triangle, in-matrix InMat3,
527
+ out-matrix OutMat>
528
+ void symmetric_matrix_product(ExecutionPolicy&& exec,
529
+ InMat1 A, InMat2 B, Triangle t, InMat3 E, OutMat C);
530
+
531
+ template<in-matrix InMat1, in-matrix InMat2, class Triangle, in-matrix InMat3,
532
+ out-matrix OutMat>
533
+ void hermitian_matrix_product(InMat1 A, InMat2 B, Triangle t, InMat3 E, OutMat C);
534
+ template<class ExecutionPolicy,
535
+ in-matrix InMat1, in-matrix InMat2, class Triangle, in-matrix InMat3,
536
+ out-matrix OutMat>
537
+ void hermitian_matrix_product(ExecutionPolicy&& exec,
538
+ InMat1 A, InMat2 B, Triangle t, InMat3 E, OutMat C);
539
+
540
+ template<in-matrix InMat1, in-matrix InMat2, class Triangle, class DiagonalStorage,
541
+ in-matrix InMat3, out-matrix OutMat>
542
+ void triangular_matrix_product(InMat1 A, InMat2 B, Triangle t, DiagonalStorage d, InMat3 E,
543
+ OutMat C);
544
+ template<class ExecutionPolicy,
545
+ in-matrix InMat1, in-matrix InMat2, class Triangle, class DiagonalStorage,
546
+ in-matrix InMat3, out-matrix OutMat>
547
+ void triangular_matrix_product(ExecutionPolicy&& exec,
548
+ InMat1 A, InMat2 B, Triangle t, DiagonalStorage d, InMat3 E,
549
+ OutMat C);
550
+
551
+ // [linalg.algs.blas3.trmm], in-place triangular matrix-matrix product
552
+
553
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
554
+ void triangular_matrix_left_product(InMat A, Triangle t, DiagonalStorage d, InOutMat C);
555
+ template<class ExecutionPolicy,
556
+ in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
557
+ void triangular_matrix_left_product(ExecutionPolicy&& exec,
558
+ InMat A, Triangle t, DiagonalStorage d, InOutMat C);
559
+
560
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
561
+ void triangular_matrix_right_product(InMat A, Triangle t, DiagonalStorage d, InOutMat C);
562
+ template<class ExecutionPolicy,
563
+ in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
564
+ void triangular_matrix_right_product(ExecutionPolicy&& exec,
565
+ InMat A, Triangle t, DiagonalStorage d, InOutMat C);
566
+
567
+ // [linalg.algs.blas3.rankk], rank-k update of a symmetric or Hermitian matrix
568
+
569
+ // rank-k symmetric matrix update
570
+ template<class Scalar, in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
571
+ void symmetric_matrix_rank_k_update(Scalar alpha, InMat A, InOutMat C, Triangle t);
572
+ template<class ExecutionPolicy, class Scalar,
573
+ in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
574
+ void symmetric_matrix_rank_k_update(ExecutionPolicy&& exec,
575
+ Scalar alpha, InMat A, InOutMat C, Triangle t);
576
+
577
+ template<in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
578
+ void symmetric_matrix_rank_k_update(InMat A, InOutMat C, Triangle t);
579
+ template<class ExecutionPolicy,
580
+ in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
581
+ void symmetric_matrix_rank_k_update(ExecutionPolicy&& exec,
582
+ InMat A, InOutMat C, Triangle t);
583
+
584
+ // rank-k Hermitian matrix update
585
+ template<class Scalar, in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
586
+ void hermitian_matrix_rank_k_update(Scalar alpha, InMat A, InOutMat C, Triangle t);
587
+ template<class ExecutionPolicy,
588
+ class Scalar, in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
589
+ void hermitian_matrix_rank_k_update(ExecutionPolicy&& exec,
590
+ Scalar alpha, InMat A, InOutMat C, Triangle t);
591
+
592
+ template<in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
593
+ void hermitian_matrix_rank_k_update(InMat A, InOutMat C, Triangle t);
594
+ template<class ExecutionPolicy,
595
+ in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
596
+ void hermitian_matrix_rank_k_update(ExecutionPolicy&& exec,
597
+ InMat A, InOutMat C, Triangle t);
598
+
599
+ // [linalg.algs.blas3.rank2k], rank-2k update of a symmetric or Hermitian matrix
600
+
601
+ // rank-2k symmetric matrix update
602
+ template<in-matrix InMat1, in-matrix InMat2,
603
+ possibly-packed-inout-matrix InOutMat, class Triangle>
604
+ void symmetric_matrix_rank_2k_update(InMat1 A, InMat2 B, InOutMat C, Triangle t);
605
+ template<class ExecutionPolicy,
606
+ in-matrix InMat1, in-matrix InMat2,
607
+ possibly-packed-inout-matrix InOutMat, class Triangle>
608
+ void symmetric_matrix_rank_2k_update(ExecutionPolicy&& exec,
609
+ InMat1 A, InMat2 B, InOutMat C, Triangle t);
610
+
611
+ // rank-2k Hermitian matrix update
612
+ template<in-matrix InMat1, in-matrix InMat2,
613
+ possibly-packed-inout-matrix InOutMat, class Triangle>
614
+ void hermitian_matrix_rank_2k_update(InMat1 A, InMat2 B, InOutMat C, Triangle t);
615
+ template<class ExecutionPolicy,
616
+ in-matrix InMat1, in-matrix InMat2,
617
+ possibly-packed-inout-matrix InOutMat, class Triangle>
618
+ void hermitian_matrix_rank_2k_update(ExecutionPolicy&& exec,
619
+ InMat1 A, InMat2 B, InOutMat C, Triangle t);
620
+
621
+ // [linalg.algs.blas3.trsm], solve multiple triangular linear systems
622
+
623
+ // solve multiple triangular systems on the left, not-in-place
624
+ template<in-matrix InMat1, class Triangle, class DiagonalStorage,
625
+ in-matrix InMat2, out-matrix OutMat, class BinaryDivideOp>
626
+ void triangular_matrix_matrix_left_solve(InMat1 A, Triangle t, DiagonalStorage d,
627
+ InMat2 B, OutMat X, BinaryDivideOp divide);
628
+ template<class ExecutionPolicy,
629
+ in-matrix InMat1, class Triangle, class DiagonalStorage,
630
+ in-matrix InMat2, out-matrix OutMat, class BinaryDivideOp>
631
+ void triangular_matrix_matrix_left_solve(ExecutionPolicy&& exec,
632
+ InMat1 A, Triangle t, DiagonalStorage d,
633
+ InMat2 B, OutMat X, BinaryDivideOp divide);
634
+ template<in-matrix InMat1, class Triangle, class DiagonalStorage,
635
+ in-matrix InMat2, out-matrix OutMat>
636
+ void triangular_matrix_matrix_left_solve(InMat1 A, Triangle t, DiagonalStorage d,
637
+ InMat2 B, OutMat X);
638
+ template<class ExecutionPolicy,
639
+ in-matrix InMat1, class Triangle, class DiagonalStorage,
640
+ in-matrix InMat2, out-matrix OutMat>
641
+ void triangular_matrix_matrix_left_solve(ExecutionPolicy&& exec,
642
+ InMat1 A, Triangle t, DiagonalStorage d,
643
+ InMat2 B, OutMat X);
644
+
645
+ // solve multiple triangular systems on the right, not-in-place
646
+ template<in-matrix InMat1, class Triangle, class DiagonalStorage,
647
+ in-matrix InMat2, out-matrix OutMat, class BinaryDivideOp>
648
+ void triangular_matrix_matrix_right_solve(InMat1 A, Triangle t, DiagonalStorage d,
649
+ InMat2 B, OutMat X, BinaryDivideOp divide);
650
+ template<class ExecutionPolicy,
651
+ in-matrix InMat1, class Triangle, class DiagonalStorage,
652
+ in-matrix InMat2, out-matrix OutMat, class BinaryDivideOp>
653
+ void triangular_matrix_matrix_right_solve(ExecutionPolicy&& exec,
654
+ InMat1 A, Triangle t, DiagonalStorage d,
655
+ InMat2 B, OutMat X, BinaryDivideOp divide);
656
+ template<in-matrix InMat1, class Triangle, class DiagonalStorage,
657
+ in-matrix InMat2, out-matrix OutMat>
658
+ void triangular_matrix_matrix_right_solve(InMat1 A, Triangle t, DiagonalStorage d,
659
+ InMat2 B, OutMat X);
660
+ template<class ExecutionPolicy,
661
+ in-matrix InMat1, class Triangle, class DiagonalStorage,
662
+ in-matrix InMat2, out-matrix OutMat>
663
+ void triangular_matrix_matrix_right_solve(ExecutionPolicy&& exec,
664
+ InMat1 A, Triangle t, DiagonalStorage d,
665
+ InMat2 B, OutMat X);
666
+
667
+ // solve multiple triangular systems on the left, in-place
668
+ template<in-matrix InMat, class Triangle, class DiagonalStorage,
669
+ inout-matrix InOutMat, class BinaryDivideOp>
670
+ void triangular_matrix_matrix_left_solve(InMat A, Triangle t, DiagonalStorage d,
671
+ InOutMat B, BinaryDivideOp divide);
672
+ template<class ExecutionPolicy,
673
+ in-matrix InMat, class Triangle, class DiagonalStorage,
674
+ inout-matrix InOutMat, class BinaryDivideOp>
675
+ void triangular_matrix_matrix_left_solve(ExecutionPolicy&& exec,
676
+ InMat A, Triangle t, DiagonalStorage d,
677
+ InOutMat B, BinaryDivideOp divide);
678
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
679
+ void triangular_matrix_matrix_left_solve(InMat A, Triangle t, DiagonalStorage d,
680
+ InOutMat B);
681
+ template<class ExecutionPolicy,
682
+ in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
683
+ void triangular_matrix_matrix_left_solve(ExecutionPolicy&& exec,
684
+ InMat A, Triangle t, DiagonalStorage d,
685
+ InOutMat B);
686
+
687
+ // solve multiple triangular systems on the right, in-place
688
+ template<in-matrix InMat, class Triangle, class DiagonalStorage,
689
+ inout-matrix InOutMat, class BinaryDivideOp>
690
+ void triangular_matrix_matrix_right_solve(InMat A, Triangle t, DiagonalStorage d,
691
+ InOutMat B, BinaryDivideOp divide);
692
+ template<class ExecutionPolicy,
693
+ in-matrix InMat, class Triangle, class DiagonalStorage,
694
+ inout-matrix InOutMat, class BinaryDivideOp>
695
+ void triangular_matrix_matrix_right_solve(ExecutionPolicy&& exec,
696
+ InMat A, Triangle t, DiagonalStorage d,
697
+ InOutMat B, BinaryDivideOp divide);
698
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
699
+ void triangular_matrix_matrix_right_solve(InMat A, Triangle t, DiagonalStorage d,
700
+ InOutMat B);
701
+ template<class ExecutionPolicy,
702
+ in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
703
+ void triangular_matrix_matrix_right_solve(ExecutionPolicy&& exec,
704
+ InMat A, Triangle t, DiagonalStorage d,
705
+ InOutMat B);
706
+ }
707
+ ```
708
+
709
+ ### General <a id="linalg.general">[[linalg.general]]</a>
710
+
711
+ For the effects of all functions in [[linalg]], when the effects are
712
+ described as “computes R = E” or “compute R = E” (for some R and
713
+ mathematical expression E), the following apply:
714
+
715
+ - E has the conventional mathematical meaning as written.
716
+ - The pattern $x^T$ should be read as “the transpose of x.”
717
+ - The pattern $x^H$ should be read as “the conjugate transpose of x.”
718
+ - When R is the same name as a function parameter whose type is a
719
+ template parameter with `Out` in its name, the intent is that the
720
+ result of the computation is written to the elements of the function
721
+ parameter `R`.
722
+
723
+ Some of the functions and types in [[linalg]] distinguish between the
724
+ “rows” and the “columns” of a matrix. For a matrix `A` and a
725
+ multidimensional index `i, j` in `A.extents()`,
726
+
727
+ - *row* `i` of `A` is the set of elements `A[i, k1]` for all `k1` such
728
+ that `i, k1` is in `A.extents()`; and
729
+ - *column* `j` of `A` is the set of elements `A[k0, j]` for all `k0`
730
+ such that `k0, j` is in `A.extents()`.
731
+
732
+ Some of the functions in [[linalg]] distinguish between the “upper
733
+ triangle,” “lower triangle,” and “diagonal” of a matrix.
734
+
735
+ - The *diagonal* is the set of all elements of `A` accessed by `A[i,i]`
736
+ for 0 ≤ `i` \< min(`A.extent(0)`, `A.extent(1)`).
737
+ - The *upper triangle* of a matrix `A` is the set of all elements of `A`
738
+ accessed by `A[i,j]` with `i` ≤ `j`. It includes the diagonal.
739
+ - The *lower triangle* of `A` is the set of all elements of `A` accessed
740
+ by `A[i,j]` with `i` ≥ `j`. It includes the diagonal.
741
+
742
+ For any function `F` that takes a parameter named `t`, `t` applies to
743
+ accesses done through the parameter preceding `t` in the parameter list
744
+ of `F`. Let `m` be such an access-modified function parameter. `F` will
745
+ only access the triangle of `m` specified by `t`. For accesses `m[i, j]`
746
+ outside the triangle specified by `t`, `F` will use the value
747
+
748
+ - `conj-if-needed(m[j, i])` if the name of `F` starts with `hermitian`,
749
+ - `m[j, i]` if the name of `F` starts with `symmetric`, or
750
+ - the additive identity if the name of `F` starts with `triangular`.
751
+
752
+ [*Example 1*:
753
+
754
+ Small vector product accessing only specified triangle. It would not be
755
+ a precondition violation for the non-accessed matrix element to be
756
+ non-zero.
757
+
758
+ ``` cpp
759
+ template<class Triangle>
760
+ void triangular_matrix_vector_2x2_product(
761
+ mdspan<const float, extents<int, 2, 2>> m,
762
+ Triangle t,
763
+ mdspan<const float, extents<int, 2>> x,
764
+ mdspan<float, extents<int, 2>> y) {
765
+
766
+ static_assert(is_same_v<Triangle, lower_triangle_t> ||
767
+ is_same_v<Triangle, upper_triangle_t>);
768
+
769
+ if constexpr (is_same_v<Triangle, lower_triangle_t>) {
770
+ y[0] = m[0,0] * x[0]; // + 0 * x[1]
771
+ y[1] = m[1,0] * x[0] + m[1,1] * x[1];
772
+ } else { // upper_triangle_t
773
+ y[0] = m[0,0] * x[0] + m[0,1] * x[1];
774
+ y[1] = /* 0 * x[0] + */ m[1,1] * x[1];
775
+ }
776
+ }
777
+ ```
778
+
779
+ — *end example*]
780
+
781
+ For any function `F` that takes a parameter named `d`, `d` applies to
782
+ accesses done through the previous-of-the-previous parameter of `d` in
783
+ the parameter list of `F`. Let `m` be such an access-modified function
784
+ parameter. If `d` specifies that an implicit unit diagonal is to be
785
+ assumed, then
786
+
787
+ - `F` will not access the diagonal of `m`; and
788
+ - the algorithm will interpret `m` as if it has a unit diagonal, that
789
+ is, a diagonal each of whose elements behaves as a two-sided
790
+ multiplicative identity (even if `m`’s value type does not have a
791
+ two-sided multiplicative identity).
792
+
793
+ Otherwise, if `d` specifies that an explicit diagonal is to be assumed,
794
+ then `F` will access the diagonal of `m`.
795
+
796
+ Within all the functions in [[linalg]], any calls to `abs`, `conj`,
797
+ `imag`, and `real` are unqualified.
798
+
799
+ Two `mdspan` objects `x` and `y` *alias* each other, if they have the
800
+ same extents `e`, and for every pack of integers `i` which is a
801
+ multidimensional index in `e`, `x[i...]` and `y[i...]` refer to the same
802
+ element.
803
+
804
+ [*Note 1*: This means that `x` and `y` view the same elements in the
805
+ same order. — *end note*]
806
+
807
+ Two `mdspan` objects `x` and `y` *overlap* each other, if for some pack
808
+ of integers `i` that is a multidimensional index in `x.extents()`, there
809
+ exists a pack of integers `j` that is a multidimensional index in
810
+ `y.extents()`, such that `x[i...]` and `y[j...]` refer to the same
811
+ element.
812
+
813
+ [*Note 2*: Aliasing is a special case of overlapping. If `x` and `y` do
814
+ not overlap, then they also do not alias each other. — *end note*]
815
+
816
+ ### Requirements <a id="linalg.reqs">[[linalg.reqs]]</a>
817
+
818
+ #### Linear algebra value types <a id="linalg.reqs.val">[[linalg.reqs.val]]</a>
819
+
820
+ Throughout [[linalg]], the following types are *linear algebra value
821
+ types*:
822
+
823
+ - the `value_type` type alias of any input or output `mdspan`
824
+ parameter(s) of any function in [[linalg]]; and
825
+ - the `Scalar` template parameter (if any) of any function or class in
826
+ [[linalg]].
827
+
828
+ Linear algebra value types shall model `semiregular`.
829
+
830
+ A value-initialized object of linear algebra value type shall act as the
831
+ additive identity.
832
+
833
+ #### Algorithm and class requirements <a id="linalg.reqs.alg">[[linalg.reqs.alg]]</a>
834
+
835
+ [[linalg.reqs.alg]] lists common requirements for all algorithms and
836
+ classes in [[linalg]].
837
+
838
+ All of the following statements presume that the algorithm’s asymptotic
839
+ complexity requirements, if any, are satisfied.
840
+
841
+ - The function may make arbitrarily many objects of any linear algebra
842
+ value type, value-initializing or direct-initializing them with any
843
+ existing object of that type.
844
+ - The *triangular solve algorithms* in [[linalg.algs.blas2.trsv]],
845
+ [[linalg.algs.blas3.trmm]], [[linalg.algs.blas3.trsm]], and
846
+ [[linalg.algs.blas3.inplacetrsm]] either have a `BinaryDivideOp`
847
+ template parameter (see [[linalg.algs.reqs]]) and a binary function
848
+ object parameter `divide` of that type, or they have effects
849
+ equivalent to invoking such an algorithm. Triangular solve algorithms
850
+ interpret `divide(a, b)` as `a` times the multiplicative inverse of
851
+ `b`. Each triangular solve algorithm uses a sequence of evaluations of
852
+ `*`, `*=`, `divide`, unary `+`, binary `+`, `+=`, unary `-`, binary
853
+ `-`, `-=`, and `=` operators that would produce the result specified
854
+ by the algorithm’s *Effects* and *Remarks* when operating on elements
855
+ of a field with noncommutative multiplication. It is a precondition of
856
+ the algorithm that any addend, any subtrahend, any partial sum of
857
+ addends in any order (treating any difference as a sum with the second
858
+ term negated), any factor, any partial product of factors respecting
859
+ their order, any numerator (first argument of `divide`), any
860
+ denominator (second argument of `divide`), and any assignment is a
861
+ well-formed expression.
862
+ - Each function in [[linalg.algs.blas1]], [[linalg.algs.blas2]], and
863
+ [[linalg.algs.blas3]] that is not a triangular solve algorithm will
864
+ use a sequence of evaluations of `*`, `*=`, `+`, `+=`, and `=`
865
+ operators that would produce the result specified by the algorithm’s
866
+ *Effects* and *Remarks* when operating on elements of a semiring with
867
+ noncommutative multiplication. It is a precondition of the algorithm
868
+ that any addend, any partial sum of addends in any order, any factor,
869
+ any partial product of factors respecting their order, and any
870
+ assignment is a well-formed expression.
871
+ - If the function has an output `mdspan`, then all addends, subtrahends
872
+ (for the triangular solve algorithms), or results of the `divide`
873
+ parameter on intermediate terms (if the function takes a `divide`
874
+ parameter) are assignable and convertible to the output `mdspan`’s
875
+ `value_type`.
876
+ - The function may reorder addends and partial sums arbitrarily.
877
+ \[*Note 1*: Factors in each product are not reordered; multiplication
878
+ is not necessarily commutative. — *end note*]
879
+
880
+ [*Note 2*: The above requirements do not prohibit implementation
881
+ approaches and optimization techniques which are not user-observable. In
882
+ particular, if for all input and output arguments the `value_type` is a
883
+ floating-point type, implementers are free to leverage approximations,
884
+ use arithmetic operations not explicitly listed above, and compute
885
+ floating-point sums in any way that improves their
886
+ accuracy. — *end note*]
887
+
888
+ [*Note 3*:
889
+
890
+ For all functions in [[linalg]], suppose that all input and output
891
+ `mdspan` have as `value_type` a floating-point type, and any `Scalar`
892
+ template argument has a floating-point type. Then, functions can do all
893
+ of the following:
894
+
895
+ - compute floating-point sums in any way that improves their accuracy
896
+ for arbitrary input;
897
+ - perform additional arithmetic operations (other than those specified
898
+ by the function’s wording and [[linalg.reqs.alg]]) in order to improve
899
+ performance or accuracy; and
900
+ - use approximations (that might not be exact even if computing with
901
+ real numbers), instead of computations that would be exact if it were
902
+ possible to compute without rounding error;
903
+
904
+ as long as
905
+
906
+ - the function satisfies the complexity requirements; and
907
+ - the function is logarithmically stable, as defined in Demmel 2007.
908
+ Strassen’s algorithm for matrix-matrix multiply is an example of a
909
+ logarithmically stable algorithm.
910
+
911
+ — *end note*]
912
+
913
+ ### Tag classes <a id="linalg.tags">[[linalg.tags]]</a>
914
+
915
+ #### Storage order tags <a id="linalg.tags.order">[[linalg.tags.order]]</a>
916
+
917
+ The storage order tags describe the order of elements in an `mdspan`
918
+ with `layout_blas_packed` [[linalg.layout.packed]] layout.
919
+
920
+ ``` cpp
921
+ struct column_major_t {
922
+ explicit column_major_t() = default;
923
+ };
924
+ inline constexpr column_major_t column_major{};
925
+
926
+ struct row_major_t {
927
+ explicit row_major_t() = default;
928
+ };
929
+ inline constexpr row_major_t row_major{};
930
+ ```
931
+
932
+ `column_major_t` indicates a column-major order, and `row_major_t`
933
+ indicates a row-major order.
934
+
935
+ #### Triangle tags <a id="linalg.tags.triangle">[[linalg.tags.triangle]]</a>
936
+
937
+ ``` cpp
938
+ struct upper_triangle_t {
939
+ explicit upper_triangle_t() = default;
940
+ };
941
+ inline constexpr upper_triangle_t upper_triangle{};
942
+
943
+ struct lower_triangle_t {
944
+ explicit lower_triangle_t() = default;
945
+ };
946
+ inline constexpr lower_triangle_t lower_triangle{};
947
+ ```
948
+
949
+ These tag classes specify whether algorithms and other users of a matrix
950
+ (represented as an `mdspan`) access the upper triangle
951
+ (`upper_triangle_t`) or lower triangle (`lower_triangle_t`) of the
952
+ matrix (see also [[linalg.general]]). This is also subject to the
953
+ restrictions of `implicit_unit_diagonal_t` if that tag is also used as a
954
+ function argument; see below.
955
+
956
+ #### Diagonal tags <a id="linalg.tags.diagonal">[[linalg.tags.diagonal]]</a>
957
+
958
+ ``` cpp
959
+ struct implicit_unit_diagonal_t {
960
+ explicit implicit_unit_diagonal_t() = default;
961
+ };
962
+ inline constexpr implicit_unit_diagonal_t implicit_unit_diagonal{};
963
+
964
+ struct explicit_diagonal_t {
965
+ explicit explicit_diagonal_t() = default;
966
+ };
967
+ inline constexpr explicit_diagonal_t explicit_diagonal{};
968
+ ```
969
+
970
+ These tag classes specify whether algorithms access the matrix’s
971
+ diagonal entries, and if not, then how algorithms interpret the matrix’s
972
+ implicitly represented diagonal values.
973
+
974
+ The `implicit_unit_diagonal_t` tag indicates that an implicit unit
975
+ diagonal is to be assumed [[linalg.general]].
976
+
977
+ The `explicit_diagonal_t` tag indicates that an explicit diagonal is
978
+ used [[linalg.general]].
979
+
980
+ ### Layouts for packed matrix types <a id="linalg.layout.packed">[[linalg.layout.packed]]</a>
981
+
982
+ #### Overview <a id="linalg.layout.packed.overview">[[linalg.layout.packed.overview]]</a>
983
+
984
+ `layout_blas_packed` is an `mdspan` layout mapping policy that
985
+ represents a square matrix that stores only the entries in one triangle,
986
+ in a packed contiguous format. Its `Triangle` template parameter
987
+ determines whether an `mdspan` with this layout stores the upper or
988
+ lower triangle of the matrix. Its `StorageOrder` template parameter
989
+ determines whether the layout packs the matrix’s elements in
990
+ column-major or row-major order.
991
+
992
+ A `StorageOrder` of `column_major_t` indicates column-major ordering.
993
+ This packs matrix elements starting with the leftmost (least column
994
+ index) column, and proceeding column by column, from the top entry
995
+ (least row index).
996
+
997
+ A `StorageOrder` of `row_major_t` indicates row-major ordering. This
998
+ packs matrix elements starting with the topmost (least row index) row,
999
+ and proceeding row by row, from the leftmost (least column index) entry.
1000
+
1001
+ [*Note 1*: `layout_blas_packed` describes the data layout used by the
1002
+ BLAS’ Symmetric Packed (SP), Hermitian Packed (HP), and Triangular
1003
+ Packed (TP) matrix types. — *end note*]
1004
+
1005
+ ``` cpp
1006
+ namespace std::linalg {
1007
+ template<class Triangle, class StorageOrder>
1008
+ class layout_blas_packed {
1009
+ public:
1010
+ using triangle_type = Triangle;
1011
+ using storage_order_type = StorageOrder;
1012
+
1013
+ template<class Extents>
1014
+ struct mapping {
1015
+ public:
1016
+ using extents_type = Extents;
1017
+ using index_type = extents_type::index_type;
1018
+ using size_type = extents_type::size_type;
1019
+ using rank_type = extents_type::rank_type;
1020
+ using layout_type = layout_blas_packed;
1021
+
1022
+ // [linalg.layout.packed.cons], constructors
1023
+ constexpr mapping() noexcept = default;
1024
+ constexpr mapping(const mapping&) noexcept = default;
1025
+ constexpr mapping(const extents_type&) noexcept;
1026
+ template<class OtherExtents>
1027
+ constexpr explicit(!is_convertible_v<OtherExtents, extents_type>)
1028
+ mapping(const mapping<OtherExtents>& other) noexcept;
1029
+
1030
+ constexpr mapping& operator=(const mapping&) noexcept = default;
1031
+
1032
+ // [linalg.layout.packed.obs], observers
1033
+ constexpr const extents_type& extents() const noexcept { return extents_; }
1034
+
1035
+ constexpr index_type required_span_size() const noexcept;
1036
+
1037
+ template<class Index0, class Index1>
1038
+ constexpr index_type operator() (Index0 ind0, Index1 ind1) const noexcept;
1039
+
1040
+ static constexpr bool is_always_unique() noexcept {
1041
+ return (extents_type::static_extent(0) != dynamic_extent &&
1042
+ extents_type::static_extent(0) < 2) ||
1043
+ (extents_type::static_extent(1) != dynamic_extent &&
1044
+ extents_type::static_extent(1) < 2);
1045
+ }
1046
+ static constexpr bool is_always_exhaustive() noexcept { return true; }
1047
+ static constexpr bool is_always_strided() noexcept
1048
+ { return is_always_unique(); }
1049
+
1050
+ constexpr bool is_unique() const noexcept {
1051
+ return extents_.extent(0) < 2;
1052
+ }
1053
+ constexpr bool is_exhaustive() const noexcept { return true; }
1054
+ constexpr bool is_strided() const noexcept {
1055
+ return extents_.extent(0) < 2;
1056
+ }
1057
+
1058
+ constexpr index_type stride(rank_type) const noexcept;
1059
+
1060
+ template<class OtherExtents>
1061
+ friend constexpr bool operator==(const mapping&, const mapping<OtherExtents>&) noexcept;
1062
+
1063
+ private:
1064
+ extents_type extents_{}; // exposition only
1065
+ };
1066
+ };
1067
+ }
1068
+ ```
1069
+
1070
+ *Mandates:*
1071
+
1072
+ - `Triangle` is either `upper_triangle_t` or `lower_triangle_t`,
1073
+ - `StorageOrder` is either `column_major_t` or `row_major_t`,
1074
+ - `Extents` is a specialization of `std::extents`,
1075
+ - `Extents::rank()` equals 2,
1076
+ - one of
1077
+ ``` cpp
1078
+ extents_type::static_extent(0) == dynamic_extent,
1079
+ extents_type::static_extent(1) == dynamic_extent, or
1080
+ extents_type::static_extent(0) == extents_type::static_extent(1)
1081
+ ```
1082
+
1083
+ is `true`, and
1084
+ - if `Extents::rank_dynamic() == 0` is `true`, let Nₛ be equal to
1085
+ `Extents::static_extent(0)`; then, Nₛ × (Nₛ + 1) is representable as a
1086
+ value of type `index_type`.
1087
+
1088
+ `layout_blas_packed<T, SO>::mapping<E>` is a trivially copyable type
1089
+ that models `regular` for each `T`, `SO`, and `E`.
1090
+
1091
+ #### Constructors <a id="linalg.layout.packed.cons">[[linalg.layout.packed.cons]]</a>
1092
+
1093
+ ``` cpp
1094
+ constexpr mapping(const extents_type& e) noexcept;
1095
+ ```
1096
+
1097
+ *Preconditions:*
1098
+
1099
+ - Let N be equal to `e.extent(0)`. Then, N × (N+1) is representable as a
1100
+ value of type `index_type` [[basic.fundamental]].
1101
+ - `e.extent(0)` equals `e.extent(1)`.
1102
+
1103
+ *Effects:* Direct-non-list-initializes *extents\_* with `e`.
1104
+
1105
+ ``` cpp
1106
+ template<class OtherExtents>
1107
+ explicit(!is_convertible_v<OtherExtents, extents_type>)
1108
+ constexpr mapping(const mapping<OtherExtents>& other) noexcept;
1109
+ ```
1110
+
1111
+ *Constraints:* `is_constructible_v<extents_type, OtherExtents>` is
1112
+ `true`.
1113
+
1114
+ *Preconditions:* Let N be `other.extents().extent(0)`. Then, N × (N+1)
1115
+ is representable as a value of type `index_type` [[basic.fundamental]].
1116
+
1117
+ *Effects:* Direct-non-list-initializes *extents\_* with
1118
+ `other.extents()`.
1119
+
1120
+ #### Observers <a id="linalg.layout.packed.obs">[[linalg.layout.packed.obs]]</a>
1121
+
1122
+ ``` cpp
1123
+ constexpr index_type required_span_size() const noexcept;
1124
+ ```
1125
+
1126
+ *Returns:* *`extents_`*`.extent(0) * (`*`extents_`*`.extent(0) + 1)/2`.
1127
+
1128
+ [*Note 1*: For example, a 5 x 5 packed matrix only stores 15 matrix
1129
+ elements. — *end note*]
1130
+
1131
+ ``` cpp
1132
+ template<class Index0, class Index1>
1133
+ constexpr index_type operator() (Index0 ind0, Index1 ind1) const noexcept;
1134
+ ```
1135
+
1136
+ *Constraints:*
1137
+
1138
+ - `is_convertible_v<Index0, index_type>` is `true`,
1139
+ - `is_convertible_v<Index1, index_type>` is `true`,
1140
+ - `is_nothrow_constructible_v<index_type, Index0>` is `true`, and
1141
+ - `is_nothrow_constructible_v<index_type, Index1>` is `true`.
1142
+
1143
+ Let `i` be `extents_type::`*`index-cast`*`(ind0)`, and let `j` be
1144
+ `extents_type::`*`index-cast`*`(ind1)`.
1145
+
1146
+ *Preconditions:* `i, j` is a multidimensional index in
1147
+ *extents\_*[[mdspan.overview]].
1148
+
1149
+ *Returns:* Let `N` be *`extents_`*`.extent(0)`. Then
1150
+
1151
+ - `(*this)(j, i)` if `i > j` is `true`; otherwise
1152
+ - `i + j * (j + 1)/2` if
1153
+ ``` cpp
1154
+ is_same_v<StorageOrder, column_major_t> && is_same_v<Triangle, upper_triangle_t>
1155
+ ```
1156
+
1157
+ is `true` or
1158
+ ``` cpp
1159
+ is_same_v<StorageOrder, row_major_t> && is_same_v<Triangle, lower_triangle_t>
1160
+ ```
1161
+
1162
+ is `true`; otherwise
1163
+ - `j + N * i - i * (i + 1)/2`.
1164
+
1165
+ ``` cpp
1166
+ constexpr index_type stride(rank_type r) const noexcept;
1167
+ ```
1168
+
1169
+ *Preconditions:*
1170
+
1171
+ - `is_strided()` is `true`, and
1172
+ - `r < extents_type::rank()` is `true`.
1173
+
1174
+ *Returns:* `1`.
1175
+
1176
+ ``` cpp
1177
+ template<class OtherExtents>
1178
+ friend constexpr bool operator==(const mapping& x, const mapping<OtherExtents>& y) noexcept;
1179
+ ```
1180
+
1181
+ *Effects:* Equivalent to: `return x.extents() == y.extents();`
1182
+
1183
+ ### Exposition-only helpers <a id="linalg.helpers">[[linalg.helpers]]</a>
1184
+
1185
+ #### *`abs-if-needed`* <a id="linalg.helpers.abs">[[linalg.helpers.abs]]</a>
1186
+
1187
+ The name *`abs-if-needed`* denotes an exposition-only function object.
1188
+ The expression `abs-if-needed(E)` for a subexpression `E` whose type is
1189
+ `T` is expression-equivalent to:
1190
+
1191
+ - `E` if `T` is an unsigned integer;
1192
+ - otherwise, `std::abs(E)` if `T` is an arithmetic type,
1193
+ - otherwise, `abs(E)`, if that expression is valid, with overload
1194
+ resolution performed in a context that includes the declaration
1195
+ ``` cpp
1196
+ template<class U> U abs(U) = delete;
1197
+ ```
1198
+
1199
+ If the function selected by overload resolution does not return the
1200
+ absolute value of its input, the program is ill-formed, no diagnostic
1201
+ required.
1202
+
1203
+ #### *`conj-if-needed`* <a id="linalg.helpers.conj">[[linalg.helpers.conj]]</a>
1204
+
1205
+ The name *`conj-if-needed`* denotes an exposition-only function object.
1206
+ The expression `conj-if-needed(E)` for a subexpression `E` whose type is
1207
+ `T` is expression-equivalent to:
1208
+
1209
+ - `conj(E)`, if `T` is not an arithmetic type and the expression
1210
+ `conj(E)` is valid, with overload resolution performed in a context
1211
+ that includes the declaration
1212
+ ``` cpp
1213
+ template<class U> U conj(const U&) = delete;
1214
+ ```
1215
+
1216
+ If the function selected by overload resolution does not return the
1217
+ complex conjugate of its input, the program is ill-formed, no
1218
+ diagnostic required;
1219
+ - otherwise, `E`.
1220
+
1221
+ #### *`real-if-needed`* <a id="linalg.helpers.real">[[linalg.helpers.real]]</a>
1222
+
1223
+ The name *`real-if-needed`* denotes an exposition-only function object.
1224
+ The expression `real-if-needed(E)` for a subexpression `E` whose type is
1225
+ `T` is expression-equivalent to:
1226
+
1227
+ - `real(E)`, if `T` is not an arithmetic type and the expression
1228
+ `real(E)` is valid, with overload resolution performed in a context
1229
+ that includes the declaration
1230
+ ``` cpp
1231
+ template<class U> U real(const U&) = delete;
1232
+ ```
1233
+
1234
+ If the function selected by overload resolution does not return the
1235
+ real part of its input, the program is ill-formed, no diagnostic
1236
+ required;
1237
+ - otherwise, `E`.
1238
+
1239
+ #### *`imag-if-needed`* <a id="linalg.helpers.imag">[[linalg.helpers.imag]]</a>
1240
+
1241
+ The name *`imag-if-needed`* denotes an exposition-only function object.
1242
+ The expression `imag-if-needed(E)` for a subexpression `E` whose type is
1243
+ `T` is expression-equivalent to:
1244
+
1245
+ - `imag(E)`, if `T` is not an arithmetic type and the expression
1246
+ `imag(E)` is valid, with overload resolution performed in a context
1247
+ that includes the declaration
1248
+ ``` cpp
1249
+ template<class U> U imag(const U&) = delete;
1250
+ ```
1251
+
1252
+ If the function selected by overload resolution does not return the
1253
+ imaginary part of its input, the program is ill-formed, no diagnostic
1254
+ required;
1255
+ - otherwise, `((void)E, T{})`.
1256
+
1257
+ #### Argument concepts <a id="linalg.helpers.concepts">[[linalg.helpers.concepts]]</a>
1258
+
1259
+ The exposition-only concepts defined in this section constrain the
1260
+ algorithms in [[linalg]].
1261
+
1262
+ ``` cpp
1263
+ template<class T>
1264
+ constexpr bool is-mdspan = false;
1265
+
1266
+ template<class ElementType, class Extents, class Layout, class Accessor>
1267
+ constexpr bool is-mdspan<mdspan<ElementType, Extents, Layout, Accessor>> = true;
1268
+
1269
+ template<class T>
1270
+ concept in-vector =
1271
+ is-mdspan<T> && T::rank() == 1;
1272
+
1273
+ template<class T>
1274
+ concept out-vector =
1275
+ is-mdspan<T> && T::rank() == 1 &&
1276
+ is_assignable_v<typename T::reference, typename T::element_type> && T::is_always_unique();
1277
+
1278
+ template<class T>
1279
+ concept inout-vector =
1280
+ is-mdspan<T> && T::rank() == 1 &&
1281
+ is_assignable_v<typename T::reference, typename T::element_type> && T::is_always_unique();
1282
+
1283
+ template<class T>
1284
+ concept in-matrix =
1285
+ is-mdspan<T> && T::rank() == 2;
1286
+
1287
+ template<class T>
1288
+ concept out-matrix =
1289
+ is-mdspan<T> && T::rank() == 2 &&
1290
+ is_assignable_v<typename T::reference, typename T::element_type> && T::is_always_unique();
1291
+
1292
+ template<class T>
1293
+ concept inout-matrix =
1294
+ is-mdspan<T> && T::rank() == 2 &&
1295
+ is_assignable_v<typename T::reference, typename T::element_type> && T::is_always_unique();
1296
+
1297
+ template<class T>
1298
+ constexpr bool is-layout-blas-packed = false; // exposition only
1299
+
1300
+ template<class Triangle, class StorageOrder>
1301
+ constexpr bool is-layout-blas-packed<layout_blas_packed<Triangle, StorageOrder>> = true;
1302
+
1303
+ template<class T>
1304
+ concept possibly-packed-inout-matrix =
1305
+ is-mdspan<T> && T::rank() == 2 &&
1306
+ is_assignable_v<typename T::reference, typename T::element_type> &&
1307
+ (T::is_always_unique() || is-layout-blas-packed<typename T::layout_type>);
1308
+
1309
+ template<class T>
1310
+ concept in-object =
1311
+ is-mdspan<T> && (T::rank() == 1 || T::rank() == 2);
1312
+
1313
+ template<class T>
1314
+ concept out-object =
1315
+ is-mdspan<T> && (T::rank() == 1 || T::rank() == 2) &&
1316
+ is_assignable_v<typename T::reference, typename T::element_type> && T::is_always_unique();
1317
+
1318
+ template<class T>
1319
+ concept inout-object =
1320
+ is-mdspan<T> && (T::rank() == 1 || T::rank() == 2) &&
1321
+ is_assignable_v<typename T::reference, typename T::element_type> && T::is_always_unique();
1322
+ ```
1323
+
1324
+ If a function in [[linalg]] accesses the elements of a parameter
1325
+ constrained by `in-vector`, `in-matrix`, or `in-object`, those accesses
1326
+ will not modify the elements.
1327
+
1328
+ Unless explicitly permitted, any `inout-vector`, `inout-matrix`,
1329
+ `inout-object`, `out-vector`, `out-matrix`, `out-object`, or
1330
+ `possibly-packed-inout-matrix` parameter of a function in [[linalg]]
1331
+ shall not overlap any other `mdspan` parameter of the function.
1332
+
1333
+ #### Mandates <a id="linalg.helpers.mandates">[[linalg.helpers.mandates]]</a>
1334
+
1335
+ [*Note 1*: These exposition-only helper functions use the less
1336
+ constraining input concepts even for the output arguments, because the
1337
+ additional constraint for assignability of elements is not necessary,
1338
+ and they are sometimes used in a context where the third argument is an
1339
+ input type too. — *end note*]
1340
+
1341
+ ``` cpp
1342
+ template<class MDS1, class MDS2>
1343
+ requires(is-mdspan<MDS1> && is-mdspan<MDS2>)
1344
+ constexpr
1345
+ bool compatible-static-extents(size_t r1, size_t r2) { // exposition only
1346
+ return MDS1::static_extent(r1) == dynamic_extent ||
1347
+ MDS2::static_extent(r2) == dynamic_extent ||
1348
+ MDS1::static_extent(r1) == MDS2::static_extent(r2);
1349
+ }
1350
+
1351
+ template<in-vector In1, in-vector In2, in-vector Out>
1352
+ constexpr bool possibly-addable() { // exposition only
1353
+ return compatible-static-extents<Out, In1>(0, 0) &&
1354
+ compatible-static-extents<Out, In2>(0, 0) &&
1355
+ compatible-static-extents<In1, In2>(0, 0);
1356
+ }
1357
+
1358
+ template<in-matrix In1, in-matrix In2, in-matrix Out>
1359
+ constexpr bool possibly-addable() { // exposition only
1360
+ return compatible-static-extents<Out, In1>(0, 0) &&
1361
+ compatible-static-extents<Out, In1>(1, 1) &&
1362
+ compatible-static-extents<Out, In2>(0, 0) &&
1363
+ compatible-static-extents<Out, In2>(1, 1) &&
1364
+ compatible-static-extents<In1, In2>(0, 0) &&
1365
+ compatible-static-extents<In1, In2>(1, 1);
1366
+ }
1367
+
1368
+ template<in-matrix InMat, in-vector InVec, in-vector OutVec>
1369
+ constexpr bool possibly-multipliable() { // exposition only
1370
+ return compatible-static-extents<OutVec, InMat>(0, 0) &&
1371
+ compatible-static-extents<InMat, InVec>(1, 0);
1372
+ }
1373
+
1374
+ template<in-vector InVec, in-matrix InMat, in-vector OutVec>
1375
+ constexpr bool possibly-multipliable() { // exposition only
1376
+ return compatible-static-extents<OutVec, InMat>(0, 1) &&
1377
+ compatible-static-extents<InMat, InVec>(0, 0);
1378
+ }
1379
+
1380
+ template<in-matrix InMat1, in-matrix InMat2, in-matrix OutMat>
1381
+ constexpr bool possibly-multipliable() { // exposition only
1382
+ return compatible-static-extents<OutMat, InMat1>(0, 0) &&
1383
+ compatible-static-extents<OutMat, InMat2>(1, 1) &&
1384
+ compatible-static-extents<InMat1, InMat2>(1, 0);
1385
+ }
1386
+ ```
1387
+
1388
+ #### Preconditions <a id="linalg.helpers.precond">[[linalg.helpers.precond]]</a>
1389
+
1390
+ [*Note 1*: These exposition-only helper functions use the less
1391
+ constraining input concepts even for the output arguments, because the
1392
+ additional constraint for assignability of elements is not necessary,
1393
+ and they are sometimes used in a context where the third argument is an
1394
+ input type too. — *end note*]
1395
+
1396
+ ``` cpp
1397
+ constexpr bool addable( // exposition only
1398
+ const in-vector auto& in1, const in-vector auto& in2, const in-vector auto& out) {
1399
+ return out.extent(0) == in1.extent(0) && out.extent(0) == in2.extent(0);
1400
+ }
1401
+
1402
+ constexpr bool addable( // exposition only
1403
+ const in-matrix auto& in1, const in-matrix auto& in2, const in-matrix auto& out) {
1404
+ return out.extent(0) == in1.extent(0) && out.extent(1) == in1.extent(1) &&
1405
+ out.extent(0) == in2.extent(0) && out.extent(1) == in2.extent(1);
1406
+ }
1407
+
1408
+ constexpr bool multipliable( // exposition only
1409
+ const in-matrix auto& in_mat, const in-vector auto& in_vec, const in-vector auto& out_vec) {
1410
+ return out_vec.extent(0) == in_mat.extent(0) && in_mat.extent(1) == in_vec.extent(0);
1411
+ }
1412
+
1413
+ constexpr bool multipliable( // exposition only
1414
+ const in-vector auto& in_vec, const in-matrix auto& in_mat, const in-vector auto& out_vec) {
1415
+ return out_vec.extent(0) == in_mat.extent(1) && in_mat.extent(0) == in_vec.extent(0);
1416
+ }
1417
+
1418
+ constexpr bool multipliable( // exposition only
1419
+ const in-matrix auto& in_mat1, const in-matrix auto& in_mat2, const in-matrix auto& out_mat) {
1420
+ return out_mat.extent(0) == in_mat1.extent(0) && out_mat.extent(1) == in_mat2.extent(1) &&
1421
+ in_mat1.extent(1) == in_mat2.extent(0);
1422
+ }
1423
+ ```
1424
+
1425
+ ### Scaled in-place transformation <a id="linalg.scaled">[[linalg.scaled]]</a>
1426
+
1427
+ #### Introduction <a id="linalg.scaled.intro">[[linalg.scaled.intro]]</a>
1428
+
1429
+ The `scaled` function takes a value `alpha` and an `mdspan` `x`, and
1430
+ returns a new read-only `mdspan` that represents the elementwise product
1431
+ of `alpha` with each element of `x`.
1432
+
1433
+ [*Example 1*:
1434
+
1435
+ ``` cpp
1436
+ using Vec = mdspan<double, dextents<size_t, 1>>;
1437
+
1438
+ // z = alpha * x + y
1439
+ void z_equals_alpha_times_x_plus_y(double alpha, Vec x, Vec y, Vec z) {
1440
+ add(scaled(alpha, x), y, z);
1441
+ }
1442
+
1443
+ // z = alpha * x + beta * y
1444
+ void z_equals_alpha_times_x_plus_beta_times_y(double alpha, Vec x, double beta, Vec y, Vec z) {
1445
+ add(scaled(alpha, x), scaled(beta, y), z);
1446
+ }
1447
+ ```
1448
+
1449
+ — *end example*]
1450
+
1451
+ #### Class template `scaled_accessor` <a id="linalg.scaled.scaledaccessor">[[linalg.scaled.scaledaccessor]]</a>
1452
+
1453
+ The class template `scaled_accessor` is an `mdspan` accessor policy
1454
+ which upon access produces scaled elements. It is part of the
1455
+ implementation of `scaled` [[linalg.scaled.scaled]].
1456
+
1457
+ ``` cpp
1458
+ namespace std::linalg {
1459
+ template<class ScalingFactor, class NestedAccessor>
1460
+ class scaled_accessor {
1461
+ public:
1462
+ using element_type =
1463
+ const decltype(declval<ScalingFactor>() * declval<NestedAccessor::element_type>());
1464
+ using reference = remove_const_t<element_type>;
1465
+ using data_handle_type = NestedAccessor::data_handle_type;
1466
+ using offset_policy = scaled_accessor<ScalingFactor, NestedAccessor::offset_policy>;
1467
+
1468
+ constexpr scaled_accessor() = default;
1469
+ template<class OtherNestedAccessor>
1470
+ explicit(!is_convertible_v<OtherNestedAccessor, NestedAccessor>)
1471
+ constexpr scaled_accessor(const scaled_accessor<ScalingFactor,
1472
+ OtherNestedAccessor>& other);
1473
+ constexpr scaled_accessor(const ScalingFactor& s, const NestedAccessor& a);
1474
+
1475
+ constexpr reference access(data_handle_type p, size_t i) const;
1476
+ constexpr offset_policy::data_handle_type offset(data_handle_type p, size_t i) const;
1477
+
1478
+ constexpr const ScalingFactor& scaling_factor() const noexcept { return scaling-factor; }
1479
+ constexpr const NestedAccessor& nested_accessor() const noexcept { return nested-accessor; }
1480
+
1481
+ private:
1482
+ ScalingFactor scaling-factor{}; // exposition only
1483
+ NestedAccessor nested-accessor{}; // exposition only
1484
+ };
1485
+ }
1486
+ ```
1487
+
1488
+ *Mandates:*
1489
+
1490
+ - `element_type` is valid and denotes a type,
1491
+ - `is_copy_constructible_v<reference>` is `true`,
1492
+ - `is_reference_v<element_type>` is `false`,
1493
+ - `ScalingFactor` models `semiregular`, and
1494
+ - `NestedAccessor` meets the accessor policy requirements
1495
+ [[mdspan.accessor.reqmts]].
1496
+
1497
+ ``` cpp
1498
+ template<class OtherNestedAccessor>
1499
+ explicit(!is_convertible_v<OtherNestedAccessor, NestedAccessor>)
1500
+ constexpr scaled_accessor(const scaled_accessor<ScalingFactor, OtherNestedAccessor>& other);
1501
+ ```
1502
+
1503
+ *Constraints:*
1504
+ `is_constructible_v<NestedAccessor, const OtherNestedAccessor&>` is
1505
+ `true`.
1506
+
1507
+ *Effects:*
1508
+
1509
+ - Direct-non-list-initializes *scaling-factor* with
1510
+ `other.scaling_factor()`, and
1511
+ - direct-non-list-initializes *nested-accessor* with
1512
+ `other.nested_accessor()`.
1513
+
1514
+ ``` cpp
1515
+ constexpr scaled_accessor(const ScalingFactor& s, const NestedAccessor& a);
1516
+ ```
1517
+
1518
+ *Effects:*
1519
+
1520
+ - Direct-non-list-initializes *scaling-factor* with `s`, and
1521
+ - direct-non-list-initializes *nested-accessor* with `a`.
1522
+
1523
+ ``` cpp
1524
+ constexpr reference access(data_handle_type p, size_t i) const;
1525
+ ```
1526
+
1527
+ *Returns:*
1528
+
1529
+ ``` cpp
1530
+ scaling_factor() * NestedAccessor::element_type(nested-accessor.access(p, i))
1531
+ ```
1532
+
1533
+ ``` cpp
1534
+ constexpr offset_policy::data_handle_type offset(data_handle_type p, size_t i) const;
1535
+ ```
1536
+
1537
+ *Returns:* *`nested-accessor`*`.offset(p, i)`
1538
+
1539
+ #### Function template `scaled` <a id="linalg.scaled.scaled">[[linalg.scaled.scaled]]</a>
1540
+
1541
+ The `scaled` function template takes a scaling factor `alpha` and an
1542
+ `mdspan` `x`, and returns a new read-only `mdspan` with the same domain
1543
+ as `x`, that represents the elementwise product of `alpha` with each
1544
+ element of `x`.
1545
+
1546
+ ``` cpp
1547
+ template<class ScalingFactor,
1548
+ class ElementType, class Extents, class Layout, class Accessor>
1549
+ constexpr auto scaled(ScalingFactor alpha, mdspan<ElementType, Extents, Layout, Accessor> x);
1550
+ ```
1551
+
1552
+ Let `SA` be `scaled_accessor<ScalingFactor, Accessor>`.
1553
+
1554
+ *Returns:*
1555
+
1556
+ ``` cpp
1557
+ mdspan<typename SA::element_type, Extents, Layout, SA>(x.data_handle(), x.mapping(),
1558
+ SA(alpha, x.accessor()))
1559
+ ```
1560
+
1561
+ [*Example 1*:
1562
+
1563
+ ``` cpp
1564
+ void test_scaled(mdspan<double, extents<int, 10>> x)
1565
+ {
1566
+ auto x_scaled = scaled(5.0, x);
1567
+ for (int i = 0; i < x.extent(0); ++i) {
1568
+ assert(x_scaled[i] == 5.0 * x[i]);
1569
+ }
1570
+ }
1571
+ ```
1572
+
1573
+ — *end example*]
1574
+
1575
+ ### Conjugated in-place transformation <a id="linalg.conj">[[linalg.conj]]</a>
1576
+
1577
+ #### Introduction <a id="linalg.conj.intro">[[linalg.conj.intro]]</a>
1578
+
1579
+ The `conjugated` function takes an `mdspan` `x`, and returns a new
1580
+ read-only `mdspan` `y` with the same domain as `x`, whose elements are
1581
+ the complex conjugates of the corresponding elements of `x`.
1582
+
1583
+ #### Class template `conjugated_accessor` <a id="linalg.conj.conjugatedaccessor">[[linalg.conj.conjugatedaccessor]]</a>
1584
+
1585
+ The class template `conjugated_accessor` is an `mdspan` accessor policy
1586
+ which upon access produces conjugate elements. It is part of the
1587
+ implementation of `conjugated` [[linalg.conj.conjugated]].
1588
+
1589
+ ``` cpp
1590
+ namespace std::linalg {
1591
+ template<class NestedAccessor>
1592
+ class conjugated_accessor {
1593
+ public:
1594
+ using element_type =
1595
+ const decltype(conj-if-needed(declval<NestedAccessor::element_type>()));
1596
+ using reference = remove_const_t<element_type>;
1597
+ using data_handle_type = NestedAccessor::data_handle_type;
1598
+ using offset_policy = conjugated_accessor<NestedAccessor::offset_policy>;
1599
+
1600
+ constexpr conjugated_accessor() = default;
1601
+ constexpr conjugated_accessor(const NestedAccessor& acc);
1602
+ template<class OtherNestedAccessor>
1603
+ explicit(!is_convertible_v<OtherNestedAccessor, NestedAccessor>)
1604
+ constexpr conjugated_accessor(const conjugated_accessor<OtherNestedAccessor>& other);
1605
+
1606
+ constexpr reference access(data_handle_type p, size_t i) const;
1607
+
1608
+ constexpr typename offset_policy::data_handle_type
1609
+ offset(data_handle_type p, size_t i) const;
1610
+
1611
+ constexpr const NestedAccessor& nested_accessor() const noexcept { return nested-accessor_; }
1612
+
1613
+ private:
1614
+ NestedAccessor nested-accessor_{}; // exposition only
1615
+ };
1616
+ }
1617
+ ```
1618
+
1619
+ *Mandates:*
1620
+
1621
+ - `element_type` is valid and denotes a type,
1622
+ - `is_copy_constructible_v<reference>` is `true`,
1623
+ - `is_reference_v<element_type>` is `false`, and
1624
+ - `NestedAccessor` meets the accessor policy requirements
1625
+ [[mdspan.accessor.reqmts]].
1626
+
1627
+ ``` cpp
1628
+ constexpr conjugated_accessor(const NestedAccessor& acc);
1629
+ ```
1630
+
1631
+ *Effects:* Direct-non-list-initializes *nested-accessor\_* with `acc`.
1632
+
1633
+ ``` cpp
1634
+ template<class OtherNestedAccessor>
1635
+ explicit(!is_convertible_v<OtherNestedAccessor, NestedAccessor>)
1636
+ constexpr conjugated_accessor(const conjugated_accessor<OtherNestedAccessor>& other);
1637
+ ```
1638
+
1639
+ *Constraints:*
1640
+ `is_constructible_v<NestedAccessor, const OtherNestedAccessor&>` is
1641
+ `true`.
1642
+
1643
+ *Effects:* Direct-non-list-initializes *nested-accessor\_* with
1644
+ `other.nested_accessor()`.
1645
+
1646
+ ``` cpp
1647
+ constexpr reference access(data_handle_type p, size_t i) const;
1648
+ ```
1649
+
1650
+ *Returns:*
1651
+ *`conj-if-needed`*`(NestedAccessor::element_type(`*`nested-accessor_`*`.access(p, i)))`
1652
+
1653
+ ``` cpp
1654
+ constexpr typename offset_policy::data_handle_type offset(data_handle_type p, size_t i) const;
1655
+ ```
1656
+
1657
+ *Returns:* *`nested-accessor_`*`.offset(p, i)`
1658
+
1659
+ #### Function template `conjugated` <a id="linalg.conj.conjugated">[[linalg.conj.conjugated]]</a>
1660
+
1661
+ ``` cpp
1662
+ template<class ElementType, class Extents, class Layout, class Accessor>
1663
+ constexpr auto conjugated(mdspan<ElementType, Extents, Layout, Accessor> a);
1664
+ ```
1665
+
1666
+ Let `A` be
1667
+
1668
+ - `remove_cvref_t<decltype(a.accessor().nested_accessor())>` if
1669
+ `Accessor` is a specialization of `conjugated_accessor`;
1670
+ - otherwise, `Accessor` if `remove_cvref_t<ElementType>` is an
1671
+ arithmetic type;
1672
+ - otherwise, `conjugated_accessor<Accessor>` if the expression `conj(E)`
1673
+ is valid for any subexpression `E` whose type is
1674
+ `remove_cvref_t<ElementType>` with overload resolution performed in a
1675
+ context that includes the declaration
1676
+ `template<class U> U conj(const U&) = delete;`;
1677
+ - otherwise, `Accessor`.
1678
+
1679
+ *Returns:* Let `MD` be
1680
+ `mdspan<typename A::element_type, Extents, Layout, A>`.
1681
+
1682
+ - `MD(a.data_handle(), a.mapping(), a.accessor().nested_accessor())` if
1683
+ `Accessor` is a specialization of `conjugated_accessor`;
1684
+ - otherwise, `a`, if `is_same_v<A, Accessor>` is `true`;
1685
+ - otherwise,
1686
+ `MD(a.data_handle(), a.mapping(), conjugated_accessor(a.accessor()))`.
1687
+
1688
+ [*Example 1*:
1689
+
1690
+ ``` cpp
1691
+ void test_conjugated_complex(mdspan<complex<double>, extents<int, 10>> a) {
1692
+ auto a_conj = conjugated(a);
1693
+ for (int i = 0; i < a.extent(0); ++i) {
1694
+ assert(a_conj[i] == conj(a[i]);
1695
+ }
1696
+ auto a_conj_conj = conjugated(a_conj);
1697
+ for (int i = 0; i < a.extent(0); ++i) {
1698
+ assert(a_conj_conj[i] == a[i]);
1699
+ }
1700
+ }
1701
+
1702
+ void test_conjugated_real(mdspan<double, extents<int, 10>> a) {
1703
+ auto a_conj = conjugated(a);
1704
+ for (int i = 0; i < a.extent(0); ++i) {
1705
+ assert(a_conj[i] == a[i]);
1706
+ }
1707
+ auto a_conj_conj = conjugated(a_conj);
1708
+ for (int i = 0; i < a.extent(0); ++i) {
1709
+ assert(a_conj_conj[i] == a[i]);
1710
+ }
1711
+ }
1712
+ ```
1713
+
1714
+ — *end example*]
1715
+
1716
+ ### Transpose in-place transformation <a id="linalg.transp">[[linalg.transp]]</a>
1717
+
1718
+ #### Introduction <a id="linalg.transp.intro">[[linalg.transp.intro]]</a>
1719
+
1720
+ `layout_transpose` is an `mdspan` layout mapping policy that swaps the
1721
+ two indices, extents, and strides of any unique `mdspan` layout mapping
1722
+ policy.
1723
+
1724
+ The `transposed` function takes an `mdspan` representing a matrix, and
1725
+ returns a new `mdspan` representing the transpose of the input matrix.
1726
+
1727
+ #### Exposition-only helpers for `layout_transpose` and `transposed` <a id="linalg.transp.helpers">[[linalg.transp.helpers]]</a>
1728
+
1729
+ The exposition-only *`transpose-extents`* function takes an `extents`
1730
+ object representing the extents of a matrix, and returns a new `extents`
1731
+ object representing the extents of the transpose of the matrix.
1732
+
1733
+ The exposition-only alias template `transpose-extents-t<InputExtents>`
1734
+ gives the type of `transpose-extents(e)` for a given `extents` object
1735
+ `e` of type `InputExtents`.
1736
+
1737
+ ``` cpp
1738
+ template<class IndexType, size_t InputExtent0, size_t InputExtent1>
1739
+ constexpr extents<IndexType, InputExtent1, InputExtent0>
1740
+ transpose-extents(const extents<IndexType, InputExtent0, InputExtent1>& in); // exposition only
1741
+ ```
1742
+
1743
+ *Returns:*
1744
+ `extents<IndexType, InputExtent1, InputExtent0>(in.extent(1), in.extent(0))`
1745
+
1746
+ ``` cpp
1747
+ template<class InputExtents>
1748
+ using transpose-extents-t =
1749
+ decltype(transpose-extents(declval<InputExtents>())); // exposition only
1750
+ ```
1751
+
1752
+ #### Class template `layout_transpose` <a id="linalg.transp.layout.transpose">[[linalg.transp.layout.transpose]]</a>
1753
+
1754
+ `layout_transpose` is an `mdspan` layout mapping policy that swaps the
1755
+ two indices, extents, and strides of any `mdspan` layout mapping policy.
1756
+
1757
+ ``` cpp
1758
+ namespace std::linalg {
1759
+ template<class Layout>
1760
+ class layout_transpose {
1761
+ public:
1762
+ using nested_layout_type = Layout;
1763
+
1764
+ template<class Extents>
1765
+ struct mapping {
1766
+ private:
1767
+ using nested-mapping-type =
1768
+ Layout::template mapping<transpose-extents-t<Extents>>; // exposition only
1769
+
1770
+ public:
1771
+ using extents_type = Extents;
1772
+ using index_type = extents_type::index_type;
1773
+ using size_type = extents_type::size_type;
1774
+ using rank_type = extents_type::rank_type;
1775
+ using layout_type = layout_transpose;
1776
+
1777
+ constexpr explicit mapping(const nested-mapping-type&);
1778
+
1779
+ constexpr const extents_type& extents() const noexcept { return extents_; }
1780
+
1781
+ constexpr index_type required_span_size() const
1782
+ { return nested-mapping_.required_span_size(); }
1783
+
1784
+ template<class Index0, class Index1>
1785
+ constexpr index_type operator()(Index0 ind0, Index1 ind1) const
1786
+ { return nested-mapping_(ind1, ind0); }
1787
+
1788
+ constexpr const nested-mapping-type& nested_mapping() const noexcept
1789
+ { return nested-mapping_; }
1790
+
1791
+ static constexpr bool is_always_unique() noexcept
1792
+ { return nested-mapping-type::is_always_unique(); }
1793
+ static constexpr bool is_always_exhaustive() noexcept
1794
+ { return nested-mapping-type::is_always_exhaustive(); }
1795
+ static constexpr bool is_always_strided() noexcept
1796
+ { return nested-mapping-type::is_always_strided(); }
1797
+
1798
+ constexpr bool is_unique() const { return nested-mapping_.is_unique(); }
1799
+ constexpr bool is_exhaustive() const { return nested-mapping_.is_exhaustive(); }
1800
+ constexpr bool is_strided() const { return nested-mapping_.is_strided(); }
1801
+
1802
+ constexpr index_type stride(size_t r) const;
1803
+
1804
+ template<class OtherExtents>
1805
+ friend constexpr bool operator==(const mapping& x, const mapping<OtherExtents>& y);
1806
+
1807
+ private:
1808
+ nested-mapping-type nested-mapping_; // exposition only
1809
+ extents_type extents_; // exposition only
1810
+ };
1811
+ };
1812
+ }
1813
+ ```
1814
+
1815
+ `Layout` shall meet the layout mapping policy requirements
1816
+ [[mdspan.layout.policy.reqmts]].
1817
+
1818
+ *Mandates:*
1819
+
1820
+ - `Extents` is a specialization of `std::extents`, and
1821
+ - `Extents::rank()` equals 2.
1822
+
1823
+ ``` cpp
1824
+ constexpr explicit mapping(const nested-mapping-type& map);
1825
+ ```
1826
+
1827
+ *Effects:*
1828
+
1829
+ - Initializes *nested-mapping\_* with `map`, and
1830
+ - initializes *extents\_* with *`transpose-extents`*`(map.extents())`.
1831
+
1832
+ ``` cpp
1833
+ constexpr index_type stride(size_t r) const;
1834
+ ```
1835
+
1836
+ *Preconditions:*
1837
+
1838
+ - `is_strided()` is `true`, and
1839
+ - `r < 2` is `true`.
1840
+
1841
+ *Returns:* *`nested-mapping_`*`.stride(r == 0 ? 1 : 0)`
1842
+
1843
+ ``` cpp
1844
+ template<class OtherExtents>
1845
+ friend constexpr bool operator==(const mapping& x, const mapping<OtherExtents>& y);
1846
+ ```
1847
+
1848
+ *Constraints:* The expression
1849
+ `x.`*`nested-mapping_`*` == y.`*`nested-mapping_`* is well-formed and
1850
+ its result is convertible to `bool`.
1851
+
1852
+ *Returns:* `x.`*`nested-mapping_`*` == y.`*`nested-mapping_`*.
1853
+
1854
+ #### Function template `transposed` <a id="linalg.transp.transposed">[[linalg.transp.transposed]]</a>
1855
+
1856
+ The `transposed` function takes a rank-2 `mdspan` representing a matrix,
1857
+ and returns a new `mdspan` representing the input matrix’s transpose.
1858
+ The input matrix’s data are not modified, and the returned `mdspan`
1859
+ accesses the input matrix’s data in place.
1860
+
1861
+ ``` cpp
1862
+ template<class ElementType, class Extents, class Layout, class Accessor>
1863
+ constexpr auto transposed(mdspan<ElementType, Extents, Layout, Accessor> a);
1864
+ ```
1865
+
1866
+ *Mandates:* `Extents::rank() == 2` is `true`.
1867
+
1868
+ Let `ReturnExtents` be *`transpose-extents-t`*`<Extents>`. Let `R` be
1869
+ `mdspan<ElementType, ReturnExtents, ReturnLayout, Accessor>`, where
1870
+ `ReturnLayout` is:
1871
+
1872
+ - `layout_right` if `Layout` is `layout_left`;
1873
+ - otherwise, `layout_left` if `Layout` is `layout_right`;
1874
+ - otherwise, `layout_right_padded<PaddingValue>` if `Layout` is
1875
+ `layout_left_padded<PaddingValue>` for some `size_t` value
1876
+ `PaddingValue`;
1877
+ - otherwise, `layout_left_padded<PaddingValue>` if `Layout` is
1878
+ `layout_right_padded<PaddingValue>` for some `size_t` value
1879
+ `PaddingValue`;
1880
+ - otherwise, `layout_stride` if `Layout` is `layout_stride`;
1881
+ - otherwise,
1882
+ `layout_blas_packed<OppositeTriangle, OppositeStorageOrder>`, if
1883
+ `Layout` is `layout_blas_packed<Triangle, StorageOrder>` for some
1884
+ `Triangle` and `StorageOrder`, where
1885
+ - `OppositeTriangle` is
1886
+ ``` cpp
1887
+ conditional_t<is_same_v<Triangle, upper_triangle_t>,
1888
+ lower_triangle_t, upper_triangle_t>
1889
+ ```
1890
+
1891
+ and
1892
+ - `OppositeStorageOrder` is
1893
+ ``` cpp
1894
+ conditional_t<is_same_v<StorageOrder, column_major_t>, row_major_t, column_major_t>
1895
+ ```
1896
+ - otherwise, `NestedLayout` if `Layout` is
1897
+ `layout_transpose<NestedLayout>` for some `NestedLayout`;
1898
+ - otherwise, `layout_transpose<Layout>`.
1899
+
1900
+ *Returns:* With `ReturnMapping` being the type
1901
+ `typename ReturnLayout::template mapping<ReturnExtents>`:
1902
+
1903
+ - if `Layout` is `layout_left`, `layout_right`, or a specialization of
1904
+ `layout_blas_packed`,
1905
+ ``` cpp
1906
+ R(a.data_handle(), ReturnMapping(transpose-extents(a.mapping().extents())),
1907
+ a.accessor())
1908
+ ```
1909
+ - otherwise,
1910
+ ``` cpp
1911
+ R(a.data_handle(), ReturnMapping(transpose-extents(a.mapping().extents()),
1912
+ a.mapping().stride(1)), a.accessor())
1913
+ ```
1914
+
1915
+ if `Layout` is `layout_left_padded<PaddingValue>` for some `size_t`
1916
+ value `PaddingValue`;
1917
+ - otherwise,
1918
+ ``` cpp
1919
+ R(a.data_handle(), ReturnMapping(transpose-extents(a.mapping().extents()),
1920
+ a.mapping().stride(0)), a.accessor())
1921
+ ```
1922
+
1923
+ if `Layout` is `layout_right_padded<PaddingValue>` for some `size_t`
1924
+ value `PaddingValue`;
1925
+ - otherwise, if `Layout` is `layout_stride`,
1926
+ ``` cpp
1927
+ R(a.data_handle(), ReturnMapping(transpose-extents(a.mapping().extents()),
1928
+ array{a.mapping().stride(1), a.mapping().stride(0)}), a.accessor())
1929
+ ```
1930
+ - otherwise, if `Layout` is a specialization of `layout_transpose`,
1931
+ ``` cpp
1932
+ R(a.data_handle(), a.mapping().nested_mapping(), a.accessor())
1933
+ ```
1934
+ - otherwise,
1935
+ ``` cpp
1936
+ R(a.data_handle(), ReturnMapping(a.mapping()), a.accessor())
1937
+ ```
1938
+
1939
+ [*Example 1*:
1940
+
1941
+ ``` cpp
1942
+ void test_transposed(mdspan<double, extents<size_t, 3, 4>> a) {
1943
+ const auto num_rows = a.extent(0);
1944
+ const auto num_cols = a.extent(1);
1945
+
1946
+ auto a_t = transposed(a);
1947
+ assert(num_rows == a_t.extent(1));
1948
+ assert(num_cols == a_t.extent(0));
1949
+ assert(a.stride(0) == a_t.stride(1));
1950
+ assert(a.stride(1) == a_t.stride(0));
1951
+
1952
+ for (size_t row = 0; row < num_rows; ++row) {
1953
+ for (size_t col = 0; col < num_rows; ++col) {
1954
+ assert(a[row, col] == a_t[col, row]);
1955
+ }
1956
+ }
1957
+
1958
+ auto a_t_t = transposed(a_t);
1959
+ assert(num_rows == a_t_t.extent(0));
1960
+ assert(num_cols == a_t_t.extent(1));
1961
+ assert(a.stride(0) == a_t_t.stride(0));
1962
+ assert(a.stride(1) == a_t_t.stride(1));
1963
+
1964
+ for (size_t row = 0; row < num_rows; ++row) {
1965
+ for (size_t col = 0; col < num_rows; ++col) {
1966
+ assert(a[row, col] == a_t_t[row, col]);
1967
+ }
1968
+ }
1969
+ }
1970
+ ```
1971
+
1972
+ — *end example*]
1973
+
1974
+ ### Conjugate transpose in-place transform <a id="linalg.conjtransposed">[[linalg.conjtransposed]]</a>
1975
+
1976
+ The `conjugate_transposed` function returns a conjugate transpose view
1977
+ of an object. This combines the effects of `transposed` and
1978
+ `conjugated`.
1979
+
1980
+ ``` cpp
1981
+ template<class ElementType, class Extents, class Layout, class Accessor>
1982
+ constexpr auto conjugate_transposed(mdspan<ElementType, Extents, Layout, Accessor> a);
1983
+ ```
1984
+
1985
+ *Effects:* Equivalent to: `return conjugated(transposed(a));`
1986
+
1987
+ [*Example 1*:
1988
+
1989
+ ``` cpp
1990
+ void test_conjugate_transposed(mdspan<complex<double>, extents<size_t, 3, 4>> a) {
1991
+ const auto num_rows = a.extent(0);
1992
+ const auto num_cols = a.extent(1);
1993
+
1994
+ auto a_ct = conjugate_transposed(a);
1995
+ assert(num_rows == a_ct.extent(1));
1996
+ assert(num_cols == a_ct.extent(0));
1997
+ assert(a.stride(0) == a_ct.stride(1));
1998
+ assert(a.stride(1) == a_ct.stride(0));
1999
+
2000
+ for (size_t row = 0; row < num_rows; ++row) {
2001
+ for (size_t col = 0; col < num_rows; ++col) {
2002
+ assert(a[row, col] == conj(a_ct[col, row]));
2003
+ }
2004
+ }
2005
+
2006
+ auto a_ct_ct = conjugate_transposed(a_ct);
2007
+ assert(num_rows == a_ct_ct.extent(0));
2008
+ assert(num_cols == a_ct_ct.extent(1));
2009
+ assert(a.stride(0) == a_ct_ct.stride(0));
2010
+ assert(a.stride(1) == a_ct_ct.stride(1));
2011
+
2012
+ for (size_t row = 0; row < num_rows; ++row) {
2013
+ for (size_t col = 0; col < num_rows; ++col) {
2014
+ assert(a[row, col] == a_ct_ct[row, col]);
2015
+ assert(conj(a_ct[col, row]) == a_ct_ct[row, col]);
2016
+ }
2017
+ }
2018
+ }
2019
+ ```
2020
+
2021
+ — *end example*]
2022
+
2023
+ ### Algorithm requirements based on template parameter name <a id="linalg.algs.reqs">[[linalg.algs.reqs]]</a>
2024
+
2025
+ Throughout [[linalg.algs.blas1]], [[linalg.algs.blas2]], and
2026
+ [[linalg.algs.blas3]], where the template parameters are not
2027
+ constrained, the names of template parameters are used to express the
2028
+ following constraints.
2029
+
2030
+ - `is_execution_policy<ExecutionPolicy>::value` is `true`
2031
+ [[execpol.type]].
2032
+ - `Real` is any type such that `complex<Real>` is specified
2033
+ [[complex.numbers.general]].
2034
+ - `Triangle` is either `upper_triangle_t` or `lower_triangle_t`.
2035
+ - `DiagonalStorage` is either `implicit_unit_diagonal_t` or
2036
+ `explicit_diagonal_t`.
2037
+
2038
+ [*Note 1*: Function templates that have a template parameter named
2039
+ `ExecutionPolicy` are parallel algorithms
2040
+ [[algorithms.parallel.defns]]. — *end note*]
2041
+
2042
+ ### BLAS 1 algorithms <a id="linalg.algs.blas1">[[linalg.algs.blas1]]</a>
2043
+
2044
+ #### Complexity <a id="linalg.algs.blas1.complexity">[[linalg.algs.blas1.complexity]]</a>
2045
+
2046
+ *Complexity:* All algorithms in [[linalg.algs.blas1]] with `mdspan`
2047
+ parameters perform a count of `mdspan` array accesses and arithmetic
2048
+ operations that is linear in the maximum product of extents of any
2049
+ `mdspan` parameter.
2050
+
2051
+ #### Givens rotations <a id="linalg.algs.blas1.givens">[[linalg.algs.blas1.givens]]</a>
2052
+
2053
+ ##### Compute Givens rotation <a id="linalg.algs.blas1.givens.lartg">[[linalg.algs.blas1.givens.lartg]]</a>
2054
+
2055
+ ``` cpp
2056
+ template<class Real>
2057
+ setup_givens_rotation_result<Real> setup_givens_rotation(Real a, Real b) noexcept;
2058
+
2059
+ template<class Real>
2060
+ setup_givens_rotation_result<complex<Real>>
2061
+ setup_givens_rotation(complex<Real> a, complex<Real> b) noexcept;
2062
+ ```
2063
+
2064
+ These functions compute the Givens plane rotation represented by the two
2065
+ values c and s such that the 2 x 2 system of equations
2066
+ $$\left[ \begin{matrix}
2067
+ c & s \\
2068
+ -\overline{s} & c \\
2069
+ \end{matrix} \right]
2070
+ \cdot
2071
+ \left[ \begin{matrix}
2072
+ a \\
2073
+ b \\
2074
+ \end{matrix} \right]
2075
+ =
2076
+ \left[ \begin{matrix}
2077
+ r \\
2078
+ 0 \\
2079
+ \end{matrix} \right]$$
2080
+
2081
+ holds, where c is always a real scalar, and c² + |s|^2 = 1. That is, c
2082
+ and s represent a 2 x 2 matrix, that when multiplied by the right by the
2083
+ input vector whose components are a and b, produces a result vector
2084
+ whose first component r is the Euclidean norm of the input vector, and
2085
+ whose second component is zero.
2086
+
2087
+ [*Note 1*: These functions correspond to the LAPACK function
2088
+ `xLARTG`. — *end note*]
2089
+
2090
+ *Returns:* `c, s, r`, where `c` and `s` form the Givens plane rotation
2091
+ corresponding to the input `a` and `b`, and `r` is the Euclidean norm of
2092
+ the two-component vector formed by `a` and `b`.
2093
+
2094
+ ##### Apply a computed Givens rotation to vectors <a id="linalg.algs.blas1.givens.rot">[[linalg.algs.blas1.givens.rot]]</a>
2095
+
2096
+ ``` cpp
2097
+ template<inout-vector InOutVec1, inout-vector InOutVec2, class Real>
2098
+ void apply_givens_rotation(InOutVec1 x, InOutVec2 y, Real c, Real s);
2099
+ template<class ExecutionPolicy, inout-vector InOutVec1, inout-vector InOutVec2, class Real>
2100
+ void apply_givens_rotation(ExecutionPolicy&& exec,
2101
+ InOutVec1 x, InOutVec2 y, Real c, Real s);
2102
+ template<inout-vector InOutVec1, inout-vector InOutVec2, class Real>
2103
+ void apply_givens_rotation(InOutVec1 x, InOutVec2 y, Real c, complex<Real> s);
2104
+ template<class ExecutionPolicy, inout-vector InOutVec1, inout-vector InOutVec2, class Real>
2105
+ void apply_givens_rotation(ExecutionPolicy&& exec,
2106
+ InOutVec1 x, InOutVec2 y, Real c, complex<Real> s);
2107
+ ```
2108
+
2109
+ [*Note 2*: These functions correspond to the BLAS function
2110
+ `xROT`. — *end note*]
2111
+
2112
+ *Mandates:* *`compatible-static-extents`*`<InOutVec1, InOutVec2>(0, 0)`
2113
+ is `true`.
2114
+
2115
+ *Preconditions:* `x.extent(0)` equals `y.extent(0)`.
2116
+
2117
+ *Effects:* Applies the plane rotation specified by `c` and `s` to the
2118
+ input vectors `x` and `y`, as if the rotation were a 2 x 2 matrix and
2119
+ the input vectors were successive rows of a matrix with two rows.
2120
+
2121
+ #### Swap matrix or vector elements <a id="linalg.algs.blas1.swap">[[linalg.algs.blas1.swap]]</a>
2122
+
2123
+ ``` cpp
2124
+ template<inout-object InOutObj1, inout-object InOutObj2>
2125
+ void swap_elements(InOutObj1 x, InOutObj2 y);
2126
+ template<class ExecutionPolicy, inout-object InOutObj1, inout-object InOutObj2>
2127
+ void swap_elements(ExecutionPolicy&& exec, InOutObj1 x, InOutObj2 y);
2128
+ ```
2129
+
2130
+ [*Note 1*: These functions correspond to the BLAS function
2131
+ `xSWAP`. — *end note*]
2132
+
2133
+ *Constraints:* `x.rank()` equals `y.rank()`.
2134
+
2135
+ *Mandates:* For all `r` in the range [0, `x.rank()`),
2136
+
2137
+ ``` cpp
2138
+ compatible-static-extents<InOutObj1, InOutObj2>(r, r)
2139
+ ```
2140
+
2141
+ is `true`.
2142
+
2143
+ *Preconditions:* `x.extents()` equals `y.extents()`.
2144
+
2145
+ *Effects:* Swaps all corresponding elements of `x` and `y`.
2146
+
2147
+ #### Multiply the elements of an object in place by a scalar <a id="linalg.algs.blas1.scal">[[linalg.algs.blas1.scal]]</a>
2148
+
2149
+ ``` cpp
2150
+ template<class Scalar, inout-object InOutObj>
2151
+ void scale(Scalar alpha, InOutObj x);
2152
+ template<class ExecutionPolicy, class Scalar, inout-object InOutObj>
2153
+ void scale(ExecutionPolicy&& exec, Scalar alpha, InOutObj x);
2154
+ ```
2155
+
2156
+ [*Note 1*: These functions correspond to the BLAS function
2157
+ `xSCAL`. — *end note*]
2158
+
2159
+ *Effects:* Overwrites x with the result of computing the elementwise
2160
+ multiplication α x, where the scalar α is `alpha`.
2161
+
2162
+ #### Copy elements of one matrix or vector into another <a id="linalg.algs.blas1.copy">[[linalg.algs.blas1.copy]]</a>
2163
+
2164
+ ``` cpp
2165
+ template<in-object InObj, out-object OutObj>
2166
+ void copy(InObj x, OutObj y);
2167
+ template<class ExecutionPolicy, in-object InObj, out-object OutObj>
2168
+ void copy(ExecutionPolicy&& exec, InObj x, OutObj y);
2169
+ ```
2170
+
2171
+ [*Note 1*: These functions correspond to the BLAS function
2172
+ `xCOPY`. — *end note*]
2173
+
2174
+ *Constraints:* `x.rank()` equals `y.rank()`.
2175
+
2176
+ *Mandates:* For all `r` in the range [ 0, `x.rank()`),
2177
+
2178
+ ``` cpp
2179
+ compatible-static-extents<InObj, OutObj>(r, r)
2180
+ ```
2181
+
2182
+ is `true`.
2183
+
2184
+ *Preconditions:* `x.extents()` equals `y.extents()`.
2185
+
2186
+ *Effects:* Assigns each element of x to the corresponding element of y.
2187
+
2188
+ #### Add vectors or matrices elementwise <a id="linalg.algs.blas1.add">[[linalg.algs.blas1.add]]</a>
2189
+
2190
+ ``` cpp
2191
+ template<in-object InObj1, in-object InObj2, out-object OutObj>
2192
+ void add(InObj1 x, InObj2 y, OutObj z);
2193
+ template<class ExecutionPolicy, in-object InObj1, in-object InObj2, out-object OutObj>
2194
+ void add(ExecutionPolicy&& exec,
2195
+ InObj1 x, InObj2 y, OutObj z);
2196
+ ```
2197
+
2198
+ [*Note 1*: These functions correspond to the BLAS function
2199
+ `xAXPY`. — *end note*]
2200
+
2201
+ *Constraints:* `x.rank()`, `y.rank()`, and `z.rank()` are all equal.
2202
+
2203
+ *Mandates:* *`possibly-addable`*`<InObj1, InObj2, OutObj>()` is `true`.
2204
+
2205
+ *Preconditions:* *`addable`*`(x,y,z)` is `true`.
2206
+
2207
+ *Effects:* Computes z = x + y.
2208
+
2209
+ *Remarks:* `z` may alias `x` or `y`.
2210
+
2211
+ #### Dot product of two vectors <a id="linalg.algs.blas1.dot">[[linalg.algs.blas1.dot]]</a>
2212
+
2213
+ [*Note 1*: The functions in this section correspond to the BLAS
2214
+ functions `xDOT`, `xDOTU`, and `xDOTC`. — *end note*]
2215
+
2216
+ The following elements apply to all functions in
2217
+ [[linalg.algs.blas1.dot]].
2218
+
2219
+ *Mandates:* `compatible-static-extents<InVec1, InVec2>(0, 0)` is `true`.
2220
+
2221
+ *Preconditions:* `v1.extent(0)` equals `v2.extent(0)`.
2222
+
2223
+ ``` cpp
2224
+ template<in-vector InVec1, in-vector InVec2, class Scalar>
2225
+ Scalar dot(InVec1 v1, InVec2 v2, Scalar init);
2226
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2, class Scalar>
2227
+ Scalar dot(ExecutionPolicy&& exec,
2228
+ InVec1 v1, InVec2 v2, Scalar init);
2229
+ ```
2230
+
2231
+ These functions compute a non-conjugated dot product with an explicitly
2232
+ specified result type.
2233
+
2234
+ *Returns:* Let `N` be `v1.extent(0)`.
2235
+
2236
+ - `init` if `N` is zero;
2237
+ - otherwise, *GENERALIZED_SUM*(plus\<\>(), init, v1\[0\]\*v2\[0\], …,
2238
+ v1\[N-1\]\*v2\[N-1\]).
2239
+
2240
+ *Remarks:* If `InVec1::value_type`, `InVec2::value_type`, and `Scalar`
2241
+ are all floating-point types or specializations of `complex`, and if
2242
+ `Scalar` has higher precision than `InVec1::value_type` or
2243
+ `InVec2::value_type`, then intermediate terms in the sum use `Scalar`’s
2244
+ precision or greater.
2245
+
2246
+ ``` cpp
2247
+ template<in-vector InVec1, in-vector InVec2>
2248
+ auto dot(InVec1 v1, InVec2 v2);
2249
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2>
2250
+ auto dot(ExecutionPolicy&& exec,
2251
+ InVec1 v1, InVec2 v2);
2252
+ ```
2253
+
2254
+ These functions compute a non-conjugated dot product with a default
2255
+ result type.
2256
+
2257
+ *Effects:* Let `T` be
2258
+ `decltype(declval<typename InVec1::value_type>() * declval<typename InVec2::value_type>())`.
2259
+ Then,
2260
+
2261
+ - the two-parameter overload is equivalent to:
2262
+ ``` cpp
2263
+ return dot(v1, v2, T{});
2264
+ ```
2265
+
2266
+ and
2267
+ - the three-parameter overload is equivalent to:
2268
+ ``` cpp
2269
+ return dot(std::forward<ExecutionPolicy>(exec), v1, v2, T{});
2270
+ ```
2271
+
2272
+ ``` cpp
2273
+ template<in-vector InVec1, in-vector InVec2, class Scalar>
2274
+ Scalar dotc(InVec1 v1, InVec2 v2, Scalar init);
2275
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2, class Scalar>
2276
+ Scalar dotc(ExecutionPolicy&& exec,
2277
+ InVec1 v1, InVec2 v2, Scalar init);
2278
+ ```
2279
+
2280
+ These functions compute a conjugated dot product with an explicitly
2281
+ specified result type.
2282
+
2283
+ *Effects:*
2284
+
2285
+ - The three-parameter overload is equivalent to:
2286
+ ``` cpp
2287
+ return dot(conjugated(v1), v2, init);
2288
+ ```
2289
+
2290
+ and
2291
+ - the four-parameter overload is equivalent to:
2292
+ ``` cpp
2293
+ return dot(std::forward<ExecutionPolicy>(exec), conjugated(v1), v2, init);
2294
+ ```
2295
+
2296
+ ``` cpp
2297
+ template<in-vector InVec1, in-vector InVec2>
2298
+ auto dotc(InVec1 v1, InVec2 v2);
2299
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2>
2300
+ auto dotc(ExecutionPolicy&& exec,
2301
+ InVec1 v1, InVec2 v2);
2302
+ ```
2303
+
2304
+ These functions compute a conjugated dot product with a default result
2305
+ type.
2306
+
2307
+ *Effects:* Let `T` be
2308
+ `decltype(`*`conj-if-needed`*`(declval<typename InVec1::value_type>()) * declval<typename InVec2::value_type>())`.
2309
+ Then,
2310
+
2311
+ - the two-parameter overload is equivalent to:
2312
+ ``` cpp
2313
+ return dotc(v1, v2, T{});
2314
+ ```
2315
+
2316
+ and
2317
+ - the three-parameter overload is equivalent to
2318
+ ``` cpp
2319
+ return dotc(std::forward<ExecutionPolicy>(exec), v1, v2, T{});
2320
+ ```
2321
+
2322
+ #### Scaled sum of squares of a vector’s elements <a id="linalg.algs.blas1.ssq">[[linalg.algs.blas1.ssq]]</a>
2323
+
2324
+ ``` cpp
2325
+ template<in-vector InVec, class Scalar>
2326
+ sum_of_squares_result<Scalar> vector_sum_of_squares(InVec v, sum_of_squares_result<Scalar> init);
2327
+ template<class ExecutionPolicy, in-vector InVec, class Scalar>
2328
+ sum_of_squares_result<Scalar> vector_sum_of_squares(ExecutionPolicy&& exec,
2329
+ InVec v, sum_of_squares_result<Scalar> init);
2330
+ ```
2331
+
2332
+ [*Note 1*: These functions correspond to the LAPACK function
2333
+ `xLASSQ`. — *end note*]
2334
+
2335
+ *Mandates:*
2336
+ `decltype(`*`abs-if-needed`*`(declval<typename InVec::value_type>()))`
2337
+ is convertible to `Scalar`.
2338
+
2339
+ *Effects:* Returns a value `result` such that
2340
+
2341
+ - `result.scaling_factor` is the maximum of `init.scaling_factor` and
2342
+ *`abs-if-needed`*`(x[i])` for all `i` in the domain of `v`; and
2343
+ - let `s2init` be
2344
+ ``` cpp
2345
+ init.scaling_factor * init.scaling_factor * init.scaled_sum_of_squares
2346
+ ```
2347
+
2348
+ then
2349
+ `result.scaling_factor * result.scaling_factor * result.scaled_sum_of_squares`
2350
+ equals the sum of `s2init` and the squares of
2351
+ *`abs-if-needed`*`(x[i])` for all `i` in the domain of `v`.
2352
+
2353
+ *Remarks:* If `InVec::value_type`, and `Scalar` are all floating-point
2354
+ types or specializations of `complex`, and if `Scalar` has higher
2355
+ precision than `InVec::value_type`, then intermediate terms in the sum
2356
+ use `Scalar`’s precision or greater.
2357
+
2358
+ #### Euclidean norm of a vector <a id="linalg.algs.blas1.nrm2">[[linalg.algs.blas1.nrm2]]</a>
2359
+
2360
+ ``` cpp
2361
+ template<in-vector InVec, class Scalar>
2362
+ Scalar vector_two_norm(InVec v, Scalar init);
2363
+ template<class ExecutionPolicy, in-vector InVec, class Scalar>
2364
+ Scalar vector_two_norm(ExecutionPolicy&& exec, InVec v, Scalar init);
2365
+ ```
2366
+
2367
+ [*Note 1*: These functions correspond to the BLAS function
2368
+ `xNRM2`. — *end note*]
2369
+
2370
+ *Mandates:* Let `a` be
2371
+ *`abs-if-needed`*`(declval<typename InVec::value_type>())`. Then,
2372
+ `decltype(init + a * a` is convertible to `Scalar`.
2373
+
2374
+ *Returns:* The square root of the sum of the square of `init` and the
2375
+ squares of the absolute values of the elements of `v`.
2376
+
2377
+ [*Note 2*: For `init` equal to zero, this is the Euclidean norm (also
2378
+ called 2-norm) of the vector `v`. — *end note*]
2379
+
2380
+ *Remarks:* If `InVec::value_type`, and `Scalar` are all floating-point
2381
+ types or specializations of `complex`, and if `Scalar` has higher
2382
+ precision than `InVec::value_type`, then intermediate terms in the sum
2383
+ use `Scalar`’s precision or greater.
2384
+
2385
+ [*Note 3*: An implementation of this function for floating-point types
2386
+ `T` can use the `scaled_sum_of_squares` result from
2387
+ `vector_sum_of_squares(x, {.scaling_factor=1.0, .scaled_sum_of_squares=init})`. — *end note*]
2388
+
2389
+ ``` cpp
2390
+ template<in-vector InVec>
2391
+ auto vector_two_norm(InVec v);
2392
+ template<class ExecutionPolicy, in-vector InVec>
2393
+ auto vector_two_norm(ExecutionPolicy&& exec, InVec v);
2394
+ ```
2395
+
2396
+ *Effects:* Let `a` be
2397
+ *`abs-if-needed`*`(declval<typename InVec::value_type>())`. Let `T` be
2398
+ `decltype(a * a)`. Then,
2399
+
2400
+ - the one-parameter overload is equivalent to:
2401
+ ``` cpp
2402
+ return vector_two_norm(v, T{});
2403
+ ```
2404
+
2405
+ and
2406
+ - the two-parameter overload is equivalent to:
2407
+ ``` cpp
2408
+ return vector_two_norm(std::forward<ExecutionPolicy>(exec), v, T{});
2409
+ ```
2410
+
2411
+ #### Sum of absolute values of vector elements <a id="linalg.algs.blas1.asum">[[linalg.algs.blas1.asum]]</a>
2412
+
2413
+ ``` cpp
2414
+ template<in-vector InVec, class Scalar>
2415
+ Scalar vector_abs_sum(InVec v, Scalar init);
2416
+ template<class ExecutionPolicy, in-vector InVec, class Scalar>
2417
+ Scalar vector_abs_sum(ExecutionPolicy&& exec, InVec v, Scalar init);
2418
+ ```
2419
+
2420
+ [*Note 1*: These functions correspond to the BLAS functions `SASUM`,
2421
+ `DASUM`, `SCASUM`, and `DZASUM`. — *end note*]
2422
+
2423
+ *Mandates:*
2424
+
2425
+ ``` cpp
2426
+ decltype(init + abs-if-needed(real-if-needed(declval<typename InVec::value_type>())) +
2427
+ abs-if-needed(imag-if-needed(declval<typename InVec::value_type>())))
2428
+ ```
2429
+
2430
+ is convertible to `Scalar`.
2431
+
2432
+ *Returns:* Let `N` be `v.extent(0)`.
2433
+
2434
+ - `init` if `N` is zero;
2435
+ - otherwise, if `InVec::value_type` is an arithmetic type,
2436
+ ``` cpp
2437
+ GENERALIZED_SUM(plus<>(), init, abs-if-needed(v[0]), …, abs-if-needed(v[N-1]))
2438
+ ```
2439
+ - otherwise,
2440
+ ``` cpp
2441
+ GENERALIZED_SUM(plus<>(), init,
2442
+ abs-if-needed(real-if-needed(v[0])) + abs-if-needed(imag-if-needed(v[0])),
2443
+ …,
2444
+ abs-if-needed(real-if-needed(v[N-1])) + abs-if-needed(imag-if-needed(v[N-1])))
2445
+ ```
2446
+
2447
+ *Remarks:* If `InVec::value_type` and `Scalar` are all floating-point
2448
+ types or specializations of `complex`, and if `Scalar` has higher
2449
+ precision than `InVec::value_type`, then intermediate terms in the sum
2450
+ use `Scalar`’s precision or greater.
2451
+
2452
+ ``` cpp
2453
+ template<in-vector InVec>
2454
+ auto vector_abs_sum(InVec v);
2455
+ template<class ExecutionPolicy, in-vector InVec>
2456
+ auto vector_abs_sum(ExecutionPolicy&& exec, InVec v);
2457
+ ```
2458
+
2459
+ *Effects:* Let `T` be `typename InVec::value_type`. Then,
2460
+
2461
+ - the one-parameter overload is equivalent to:
2462
+ ``` cpp
2463
+ return vector_abs_sum(v, T{});
2464
+ ```
2465
+
2466
+ and
2467
+ - the two-parameter overload is equivalent to:
2468
+ ``` cpp
2469
+ return vector_abs_sum(std::forward<ExecutionPolicy>(exec), v, T{});
2470
+ ```
2471
+
2472
+ #### Index of maximum absolute value of vector elements <a id="linalg.algs.blas1.iamax">[[linalg.algs.blas1.iamax]]</a>
2473
+
2474
+ ``` cpp
2475
+ template<in-vector InVec>
2476
+ typename InVec::extents_type vector_idx_abs_max(InVec v);
2477
+ template<class ExecutionPolicy, in-vector InVec>
2478
+ typename InVec::extents_type vector_idx_abs_max(ExecutionPolicy&& exec, InVec v);
2479
+ ```
2480
+
2481
+ [*Note 1*: These functions correspond to the BLAS function
2482
+ `IxAMAX`. — *end note*]
2483
+
2484
+ Let `T` be
2485
+
2486
+ ``` cpp
2487
+ decltype(abs-if-needed(real-if-needed(declval<typename InVec::value_type>())) +
2488
+ abs-if-needed(imag-if-needed(declval<typename InVec::value_type>())))
2489
+ ```
2490
+
2491
+ *Mandates:* `declval<T>() < declval<T>()` is a valid expression.
2492
+
2493
+ *Returns:*
2494
+
2495
+ - `numeric_limits<typename InVec::size_type>::max()` if `v` has zero
2496
+ elements;
2497
+ - otherwise, the index of the first element of `v` having largest
2498
+ absolute value, if `InVec::value_type` is an arithmetic type;
2499
+ - otherwise, the index of the first element `vₑ` of `v` for which
2500
+ ``` cpp
2501
+ abs-if-needed(real-if-needed($v_e$)) + abs-if-needed(imag-if-needed($v_e$))
2502
+ ```
2503
+
2504
+ has the largest value.
2505
+
2506
+ #### Frobenius norm of a matrix <a id="linalg.algs.blas1.matfrobnorm">[[linalg.algs.blas1.matfrobnorm]]</a>
2507
+
2508
+ [*Note 1*: These functions exist in the BLAS standard but are not part
2509
+ of the reference implementation. — *end note*]
2510
+
2511
+ ``` cpp
2512
+ template<in-matrix InMat, class Scalar>
2513
+ Scalar matrix_frob_norm(InMat A, Scalar init);
2514
+ template<class ExecutionPolicy, in-matrix InMat, class Scalar>
2515
+ Scalar matrix_frob_norm(ExecutionPolicy&& exec, InMat A, Scalar init);
2516
+ ```
2517
+
2518
+ *Mandates:* Let `a` be
2519
+ *`abs-if-needed`*`(declval<typename InMat::value_type>())`. Then,
2520
+ `decltype(init + a * a)` is convertible to `Scalar`.
2521
+
2522
+ *Returns:* The square root of the sum of squares of `init` and the
2523
+ absolute values of the elements of `A`.
2524
+
2525
+ [*Note 1*: For `init` equal to zero, this is the Frobenius norm of the
2526
+ matrix `A`. — *end note*]
2527
+
2528
+ *Remarks:* If `InMat::value_type` and `Scalar` are all floating-point
2529
+ types or specializations of `complex`, and if `Scalar` has higher
2530
+ precision than `InMat::value_type`, then intermediate terms in the sum
2531
+ use `Scalar`’s precision or greater.
2532
+
2533
+ ``` cpp
2534
+ template<in-matrix InMat>
2535
+ auto matrix_frob_norm(InMat A);
2536
+ template<class ExecutionPolicy, in-matrix InMat>
2537
+ auto matrix_frob_norm(ExecutionPolicy&& exec, InMat A);
2538
+ ```
2539
+
2540
+ *Effects:* Let `a` be
2541
+ *`abs-if-needed`*`(declval<typename InMat::value_type>())`. Let `T` be
2542
+ `decltype(a * a)`. Then,
2543
+
2544
+ - the one-parameter overload is equivalent to:
2545
+ ``` cpp
2546
+ return matrix_frob_norm(A, T{});
2547
+ ```
2548
+
2549
+ and
2550
+ - the two-parameter overload is equivalent to:
2551
+ ``` cpp
2552
+ return matrix_frob_norm(std::forward<ExecutionPolicy>(exec), A, T{});
2553
+ ```
2554
+
2555
+ #### One norm of a matrix <a id="linalg.algs.blas1.matonenorm">[[linalg.algs.blas1.matonenorm]]</a>
2556
+
2557
+ [*Note 1*: These functions exist in the BLAS standard but are not part
2558
+ of the reference implementation. — *end note*]
2559
+
2560
+ ``` cpp
2561
+ template<in-matrix InMat, class Scalar>
2562
+ Scalar matrix_one_norm(InMat A, Scalar init);
2563
+ template<class ExecutionPolicy, in-matrix InMat, class Scalar>
2564
+ Scalar matrix_one_norm(ExecutionPolicy&& exec, InMat A, Scalar init);
2565
+ ```
2566
+
2567
+ *Mandates:*
2568
+ `decltype(`*`abs-if-needed`*`(declval<typename InMat::value_type>()))`
2569
+ is convertible to `Scalar`.
2570
+
2571
+ *Returns:*
2572
+
2573
+ - `init` if `A.extent(1)` is zero;
2574
+ - otherwise, the sum of `init` and the one norm of the matrix A.
2575
+
2576
+ [*Note 1*: The one norm of the matrix `A` is the maximum over all
2577
+ columns of `A`, of the sum of the absolute values of the elements of the
2578
+ column. — *end note*]
2579
+
2580
+ *Remarks:* If `InMat::value_type` and `Scalar` are all floating-point
2581
+ types or specializations of `complex`, and if `Scalar` has higher
2582
+ precision than `InMat::value_type`, then intermediate terms in the sum
2583
+ use `Scalar`’s precision or greater.
2584
+
2585
+ ``` cpp
2586
+ template<in-matrix InMat>
2587
+ auto matrix_one_norm(InMat A);
2588
+ template<class ExecutionPolicy, in-matrix InMat>
2589
+ auto matrix_one_norm(ExecutionPolicy&& exec, InMat A);
2590
+ ```
2591
+
2592
+ *Effects:* Let `T` be
2593
+ `decltype(`*`abs-if-needed`*`(declval<typename InMat::value_type>())`.
2594
+ Then,
2595
+
2596
+ - the one-parameter overload is equivalent to:
2597
+ ``` cpp
2598
+ return matrix_one_norm(A, T{});
2599
+ ```
2600
+
2601
+ and
2602
+ - the two-parameter overload is equivalent to:
2603
+ ``` cpp
2604
+ return matrix_one_norm(std::forward<ExecutionPolicy>(exec), A, T{});
2605
+ ```
2606
+
2607
+ #### Infinity norm of a matrix <a id="linalg.algs.blas1.matinfnorm">[[linalg.algs.blas1.matinfnorm]]</a>
2608
+
2609
+ [*Note 1*: These functions exist in the BLAS standard but are not part
2610
+ of the reference implementation. — *end note*]
2611
+
2612
+ ``` cpp
2613
+ template<in-matrix InMat, class Scalar>
2614
+ Scalar matrix_inf_norm(InMat A, Scalar init);
2615
+ template<class ExecutionPolicy, in-matrix InMat, class Scalar>
2616
+ Scalar matrix_inf_norm(ExecutionPolicy&& exec, InMat A, Scalar init);
2617
+ ```
2618
+
2619
+ *Mandates:*
2620
+ `decltype(`*`abs-if-needed`*`(declval<typename InMat::value_type>()))`
2621
+ is convertible to `Scalar`.
2622
+
2623
+ *Returns:*
2624
+
2625
+ - `init` if `A.extent(0)` is zero;
2626
+ - otherwise, the sum of `init` and the infinity norm of the matrix `A`.
2627
+
2628
+ [*Note 1*: The infinity norm of the matrix `A` is the maximum over all
2629
+ rows of `A`, of the sum of the absolute values of the elements of the
2630
+ row. — *end note*]
2631
+
2632
+ *Remarks:* If `InMat::value_type` and `Scalar` are all floating-point
2633
+ types or specializations of `complex`, and if `Scalar` has higher
2634
+ precision than `InMat::value_type`, then intermediate terms in the sum
2635
+ use `Scalar`’s precision or greater.
2636
+
2637
+ ``` cpp
2638
+ template<in-matrix InMat>
2639
+ auto matrix_inf_norm(InMat A);
2640
+ template<class ExecutionPolicy, in-matrix InMat>
2641
+ auto matrix_inf_norm(ExecutionPolicy&& exec, InMat A);
2642
+ ```
2643
+
2644
+ *Effects:* Let `T` be
2645
+ `decltype(`*`abs-if-needed`*`(declval<typename InMat::value_type>())`.
2646
+ Then,
2647
+
2648
+ - the one-parameter overload is equivalent to:
2649
+ ``` cpp
2650
+ return matrix_inf_norm(A, T{});
2651
+ ```
2652
+
2653
+ and
2654
+ - the two-parameter overload is equivalent to:
2655
+ ``` cpp
2656
+ return matrix_inf_norm(std::forward<ExecutionPolicy>(exec), A, T{});
2657
+ ```
2658
+
2659
+ ### BLAS 2 algorithms <a id="linalg.algs.blas2">[[linalg.algs.blas2]]</a>
2660
+
2661
+ #### General matrix-vector product <a id="linalg.algs.blas2.gemv">[[linalg.algs.blas2.gemv]]</a>
2662
+
2663
+ [*Note 1*: These functions correspond to the BLAS function
2664
+ `xGEMV`. — *end note*]
2665
+
2666
+ The following elements apply to all functions in
2667
+ [[linalg.algs.blas2.gemv]].
2668
+
2669
+ *Mandates:*
2670
+
2671
+ - `possibly-multipliable<decltype(A), decltype(x), decltype(y)>()` is
2672
+ `true`, and
2673
+ - `possibly-addable<decltype(x), decltype(y), decltype(z)>()` is `true`
2674
+ for those overloads that take a `z` parameter.
2675
+
2676
+ *Preconditions:*
2677
+
2678
+ - `multipliable(A,x,y)` is `true`, and
2679
+ - `addable(x,y,z)` is `true` for those overloads that take a `z`
2680
+ parameter.
2681
+
2682
+ *Complexity:* 𝑂(`x.extent(0)` × `A.extent(1)`).
2683
+
2684
+ ``` cpp
2685
+ template<in-matrix InMat, in-vector InVec, out-vector OutVec>
2686
+ void matrix_vector_product(InMat A, InVec x, OutVec y);
2687
+ template<class ExecutionPolicy, in-matrix InMat, in-vector InVec, out-vector OutVec>
2688
+ void matrix_vector_product(ExecutionPolicy&& exec, InMat A, InVec x, OutVec y);
2689
+ ```
2690
+
2691
+ These functions perform an overwriting matrix-vector product.
2692
+
2693
+ *Effects:* Computes y = A x.
2694
+
2695
+ [*Example 1*:
2696
+
2697
+ ``` cpp
2698
+ constexpr size_t num_rows = 5;
2699
+ constexpr size_t num_cols = 6;
2700
+
2701
+ // y = 3.0 * A * x
2702
+ void scaled_matvec_1(mdspan<double, extents<size_t, num_rows, num_cols>> A,
2703
+ mdspan<double, extents<size_t, num_cols>> x, mdspan<double, extents<size_t, num_rows>> y) {
2704
+ matrix_vector_product(scaled(3.0, A), x, y);
2705
+ }
2706
+
2707
+ // z = 7.0 times the transpose of A, times y
2708
+ void scaled_transposed_matvec(mdspan<double, extents<size_t, num_rows, num_cols>> A,
2709
+ mdspan<double, extents<size_t, num_rows>> y, mdspan<double, extents<size_t, num_cols>> z) {
2710
+ matrix_vector_product(scaled(7.0, transposed(A)), y, z);
2711
+ }
2712
+ ```
2713
+
2714
+ — *end example*]
2715
+
2716
+ ``` cpp
2717
+ template<in-matrix InMat, in-vector InVec1, in-vector InVec2, out-vector OutVec>
2718
+ void matrix_vector_product(InMat A, InVec1 x, InVec2 y, OutVec z);
2719
+ template<class ExecutionPolicy,
2720
+ in-matrix InMat, in-vector InVec1, in-vector InVec2, out-vector OutVec>
2721
+ void matrix_vector_product(ExecutionPolicy&& exec,
2722
+ InMat A, InVec1 x, InVec2 y, OutVec z);
2723
+ ```
2724
+
2725
+ These functions perform an updating matrix-vector product.
2726
+
2727
+ *Effects:* Computes z = y + A x.
2728
+
2729
+ *Remarks:* `z` may alias `y`.
2730
+
2731
+ [*Example 2*:
2732
+
2733
+ ``` cpp
2734
+ // y = 3.0 * A * x + 2.0 * y
2735
+ void scaled_matvec_2(mdspan<double, extents<size_t, num_rows, num_cols>> A,
2736
+ mdspan<double, extents<size_t, num_cols>> x, mdspan<double, extents<size_t, num_rows>> y) {
2737
+ matrix_vector_product(scaled(3.0, A), x, scaled(2.0, y), y);
2738
+ }
2739
+ ```
2740
+
2741
+ — *end example*]
2742
+
2743
+ #### Symmetric matrix-vector product <a id="linalg.algs.blas2.symv">[[linalg.algs.blas2.symv]]</a>
2744
+
2745
+ [*Note 1*: These functions correspond to the BLAS functions `xSYMV` and
2746
+ `xSPMV`. — *end note*]
2747
+
2748
+ The following elements apply to all functions in
2749
+ [[linalg.algs.blas2.symv]].
2750
+
2751
+ *Mandates:*
2752
+
2753
+ - If `InMat` has `layout_blas_packed` layout, then the layout’s
2754
+ `Triangle` template argument has the same type as the function’s
2755
+ `Triangle` template argument;
2756
+ - `compatible-static-extents<decltype(A), decltype(A)>(0, 1)` is `true`;
2757
+ - `possibly-multipliable<decltype(A), decltype(x), decltype(y)>()` is
2758
+ `true`; and
2759
+ - `possibly-addable<decltype(x), decltype(y), decltype(z)>()` is `true`
2760
+ for those overloads that take a `z` parameter.
2761
+
2762
+ *Preconditions:*
2763
+
2764
+ - `A.extent(0)` equals `A.extent(1)`,
2765
+ - `multipliable(A,x,y)` is `true`, and
2766
+ - `addable(x,y,z)` is `true` for those overloads that take a `z`
2767
+ parameter.
2768
+
2769
+ *Complexity:* 𝑂(`x.extent(0)` × `A.extent(1)`).
2770
+
2771
+ ``` cpp
2772
+ template<in-matrix InMat, class Triangle, in-vector InVec, out-vector OutVec>
2773
+ void symmetric_matrix_vector_product(InMat A, Triangle t, InVec x, OutVec y);
2774
+ template<class ExecutionPolicy,
2775
+ in-matrix InMat, class Triangle, in-vector InVec, out-vector OutVec>
2776
+ void symmetric_matrix_vector_product(ExecutionPolicy&& exec,
2777
+ InMat A, Triangle t, InVec x, OutVec y);
2778
+ ```
2779
+
2780
+ These functions perform an overwriting symmetric matrix-vector product,
2781
+ taking into account the `Triangle` parameter that applies to the
2782
+ symmetric matrix `A` [[linalg.general]].
2783
+
2784
+ *Effects:* Computes y = A x.
2785
+
2786
+ ``` cpp
2787
+ template<in-matrix InMat, class Triangle, in-vector InVec1, in-vector InVec2, out-vector OutVec>
2788
+ void symmetric_matrix_vector_product(InMat A, Triangle t, InVec1 x, InVec2 y, OutVec z);
2789
+ template<class ExecutionPolicy,
2790
+ in-matrix InMat, class Triangle, in-vector InVec1, in-vector InVec2, out-vector OutVec>
2791
+ void symmetric_matrix_vector_product(ExecutionPolicy&& exec,
2792
+ InMat A, Triangle t, InVec1 x, InVec2 y, OutVec z);
2793
+ ```
2794
+
2795
+ These functions perform an updating symmetric matrix-vector product,
2796
+ taking into account the `Triangle` parameter that applies to the
2797
+ symmetric matrix `A` [[linalg.general]].
2798
+
2799
+ *Effects:* Computes z = y + A x.
2800
+
2801
+ *Remarks:* `z` may alias `y`.
2802
+
2803
+ #### Hermitian matrix-vector product <a id="linalg.algs.blas2.hemv">[[linalg.algs.blas2.hemv]]</a>
2804
+
2805
+ [*Note 1*: These functions correspond to the BLAS functions `xHEMV` and
2806
+ `xHPMV`. — *end note*]
2807
+
2808
+ The following elements apply to all functions in
2809
+ [[linalg.algs.blas2.hemv]].
2810
+
2811
+ *Mandates:*
2812
+
2813
+ - If `InMat` has `layout_blas_packed` layout, then the layout’s
2814
+ `Triangle` template argument has the same type as the function’s
2815
+ `Triangle` template argument;
2816
+ - `compatible-static-extents<decltype(A), decltype(A)>(0, 1)` is `true`;
2817
+ - `possibly-multipliable<decltype(A), decltype(x), decltype(y)>()` is
2818
+ `true`; and
2819
+ - `possibly-addable<decltype(x), decltype(y), decltype(z)>()` is `true`
2820
+ for those overloads that take a `z` parameter.
2821
+
2822
+ *Preconditions:*
2823
+
2824
+ - `A.extent(0)` equals `A.extent(1)`,
2825
+ - `multipliable(A, x, y)` is `true`, and
2826
+ - `addable(x, y, z)` is `true` for those overloads that take a `z`
2827
+ parameter.
2828
+
2829
+ *Complexity:* 𝑂(`x.extent(0)` × `A.extent(1)`).
2830
+
2831
+ ``` cpp
2832
+ template<in-matrix InMat, class Triangle, in-vector InVec, out-vector OutVec>
2833
+ void hermitian_matrix_vector_product(InMat A, Triangle t, InVec x, OutVec y);
2834
+ template<class ExecutionPolicy,
2835
+ in-matrix InMat, class Triangle, in-vector InVec, out-vector OutVec>
2836
+ void hermitian_matrix_vector_product(ExecutionPolicy&& exec,
2837
+ InMat A, Triangle t, InVec x, OutVec y);
2838
+ ```
2839
+
2840
+ These functions perform an overwriting Hermitian matrix-vector product,
2841
+ taking into account the `Triangle` parameter that applies to the
2842
+ Hermitian matrix `A` [[linalg.general]].
2843
+
2844
+ *Effects:* Computes y = A x.
2845
+
2846
+ ``` cpp
2847
+ template<in-matrix InMat, class Triangle, in-vector InVec1, in-vector InVec2, out-vector OutVec>
2848
+ void hermitian_matrix_vector_product(InMat A, Triangle t, InVec1 x, InVec2 y, OutVec z);
2849
+ template<class ExecutionPolicy,
2850
+ in-matrix InMat, class Triangle, in-vector InVec1, in-vector InVec2, out-vector OutVec>
2851
+ void hermitian_matrix_vector_product(ExecutionPolicy&& exec,
2852
+ InMat A, Triangle t, InVec1 x, InVec2 y, OutVec z);
2853
+ ```
2854
+
2855
+ These functions perform an updating Hermitian matrix-vector product,
2856
+ taking into account the `Triangle` parameter that applies to the
2857
+ Hermitian matrix `A` [[linalg.general]].
2858
+
2859
+ *Effects:* Computes z = y + A x.
2860
+
2861
+ *Remarks:* `z` may alias `y`.
2862
+
2863
+ #### Triangular matrix-vector product <a id="linalg.algs.blas2.trmv">[[linalg.algs.blas2.trmv]]</a>
2864
+
2865
+ [*Note 1*: These functions correspond to the BLAS functions `xTRMV` and
2866
+ `xTPMV`. — *end note*]
2867
+
2868
+ The following elements apply to all functions in
2869
+ [[linalg.algs.blas2.trmv]].
2870
+
2871
+ *Mandates:*
2872
+
2873
+ - If `InMat` has `layout_blas_packed` layout, then the layout’s
2874
+ `Triangle` template argument has the same type as the function’s
2875
+ `Triangle` template argument;
2876
+ - `compatible-static-extents<decltype(A), decltype(A)>(0, 1)` is `true`;
2877
+ - `compatible-static-extents<decltype(A), decltype(y)>(0, 0)` is `true`;
2878
+ - `compatible-static-extents<decltype(A), decltype(x)>(0, 0)` is `true`
2879
+ for those overloads that take an `x` parameter; and
2880
+ - `compatible-static-extents<decltype(A), decltype(z)>(0, 0)` is `true`
2881
+ for those overloads that take a `z` parameter.
2882
+
2883
+ *Preconditions:*
2884
+
2885
+ - `A.extent(0)` equals `A.extent(1)`,
2886
+ - `A.extent(0)` equals `y.extent(0)`,
2887
+ - `A.extent(0)` equals `x.extent(0)` for those overloads that take an
2888
+ `x` parameter, and
2889
+ - `A.extent(0)` equals `z.extent(0)` for those overloads that take a `z`
2890
+ parameter.
2891
+
2892
+ ``` cpp
2893
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, in-vector InVec,
2894
+ out-vector OutVec>
2895
+ void triangular_matrix_vector_product(InMat A, Triangle t, DiagonalStorage d, InVec x, OutVec y);
2896
+ template<class ExecutionPolicy,
2897
+ in-matrix InMat, class Triangle, class DiagonalStorage, in-vector InVec,
2898
+ out-vector OutVec>
2899
+ void triangular_matrix_vector_product(ExecutionPolicy&& exec,
2900
+ InMat A, Triangle t, DiagonalStorage d, InVec x, OutVec y);
2901
+ ```
2902
+
2903
+ These functions perform an overwriting triangular matrix-vector product,
2904
+ taking into account the `Triangle` and `DiagonalStorage` parameters that
2905
+ apply to the triangular matrix `A` [[linalg.general]].
2906
+
2907
+ *Effects:* Computes y = A x.
2908
+
2909
+ *Complexity:* 𝑂(`x.extent(0)` × `A.extent(1)`).
2910
+
2911
+ ``` cpp
2912
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, inout-vector InOutVec>
2913
+ void triangular_matrix_vector_product(InMat A, Triangle t, DiagonalStorage d, InOutVec y);
2914
+ template<class ExecutionPolicy,
2915
+ in-matrix InMat, class Triangle, class DiagonalStorage, inout-vector InOutVec>
2916
+ void triangular_matrix_vector_product(ExecutionPolicy&& exec,
2917
+ InMat A, Triangle t, DiagonalStorage d, InOutVec y);
2918
+ ```
2919
+
2920
+ These functions perform an in-place triangular matrix-vector product,
2921
+ taking into account the `Triangle` and `DiagonalStorage` parameters that
2922
+ apply to the triangular matrix `A` [[linalg.general]].
2923
+
2924
+ [*Note 1*: Performing this operation in place hinders parallelization.
2925
+ However, other `ExecutionPolicy` specific optimizations, such as
2926
+ vectorization, are still possible. — *end note*]
2927
+
2928
+ *Effects:* Computes a vector y' such that y' = A y, and assigns each
2929
+ element of y' to the corresponding element of y.
2930
+
2931
+ *Complexity:* 𝑂(`y.extent(0)` × `A.extent(1)`).
2932
+
2933
+ ``` cpp
2934
+ template<in-matrix InMat, class Triangle, class DiagonalStorage,
2935
+ in-vector InVec1, in-vector InVec2, out-vector OutVec>
2936
+ void triangular_matrix_vector_product(InMat A, Triangle t, DiagonalStorage d,
2937
+ InVec1 x, InVec2 y, OutVec z);
2938
+ template<class ExecutionPolicy, in-matrix InMat, class Triangle, class DiagonalStorage,
2939
+ in-vector InVec1, in-vector InVec2, out-vector OutVec>
2940
+ void triangular_matrix_vector_product(ExecutionPolicy&& exec,
2941
+ InMat A, Triangle t, DiagonalStorage d,
2942
+ InVec1 x, InVec2 y, OutVec z);
2943
+ ```
2944
+
2945
+ These functions perform an updating triangular matrix-vector product,
2946
+ taking into account the `Triangle` and `DiagonalStorage` parameters that
2947
+ apply to the triangular matrix `A` [[linalg.general]].
2948
+
2949
+ *Effects:* Computes z = y + A x.
2950
+
2951
+ *Complexity:* 𝑂(`x.extent(0)` × `A.extent(1)`).
2952
+
2953
+ *Remarks:* `z` may alias `y`.
2954
+
2955
+ #### Solve a triangular linear system <a id="linalg.algs.blas2.trsv">[[linalg.algs.blas2.trsv]]</a>
2956
+
2957
+ [*Note 1*: These functions correspond to the BLAS functions `xTRSV` and
2958
+ `xTPSV`. — *end note*]
2959
+
2960
+ The following elements apply to all functions in
2961
+ [[linalg.algs.blas2.trsv]].
2962
+
2963
+ *Mandates:*
2964
+
2965
+ - If `InMat` has `layout_blas_packed` layout, then the layout’s
2966
+ `Triangle` template argument has the same type as the function’s
2967
+ `Triangle` template argument;
2968
+ - `compatible-static-extents<decltype(A), decltype(A)>(0, 1)` is `true`;
2969
+ - `compatible-static-extents<decltype(A), decltype(b)>(0, 0)` is `true`;
2970
+ and
2971
+ - `compatible-static-extents<decltype(A), decltype(x)>(0, 0)` is `true`
2972
+ for those overloads that take an `x` parameter.
2973
+
2974
+ *Preconditions:*
2975
+
2976
+ - `A.extent(0)` equals `A.extent(1)`,
2977
+ - `A.extent(0)` equals `b.extent(0)`, and
2978
+ - `A.extent(0)` equals `x.extent(0)` for those overloads that take an
2979
+ `x` parameter.
2980
+
2981
+ ``` cpp
2982
+ template<in-matrix InMat, class Triangle, class DiagonalStorage,
2983
+ in-vector InVec, out-vector OutVec, class BinaryDivideOp>
2984
+ void triangular_matrix_vector_solve(InMat A, Triangle t, DiagonalStorage d,
2985
+ InVec b, OutVec x, BinaryDivideOp divide);
2986
+ template<class ExecutionPolicy, in-matrix InMat, class Triangle, class DiagonalStorage,
2987
+ in-vector InVec, out-vector OutVec, class BinaryDivideOp>
2988
+ void triangular_matrix_vector_solve(ExecutionPolicy&& exec,
2989
+ InMat A, Triangle t, DiagonalStorage d,
2990
+ InVec b, OutVec x, BinaryDivideOp divide);
2991
+ ```
2992
+
2993
+ These functions perform a triangular solve, taking into account the
2994
+ `Triangle` and `DiagonalStorage` parameters that apply to the triangular
2995
+ matrix `A` [[linalg.general]].
2996
+
2997
+ *Effects:* Computes a vector x' such that b = A x', and assigns each
2998
+ element of x' to the corresponding element of x. If no such x' exists,
2999
+ then the elements of `x` are valid but unspecified.
3000
+
3001
+ *Complexity:* 𝑂(`A.extent(1)` × `b.extent(0)`).
3002
+
3003
+ ``` cpp
3004
+ template<in-matrix InMat, class Triangle, class DiagonalStorage,
3005
+ in-vector InVec, out-vector OutVec>
3006
+ void triangular_matrix_vector_solve(InMat A, Triangle t, DiagonalStorage d, InVec b, OutVec x);
3007
+ ```
3008
+
3009
+ *Effects:* Equivalent to:
3010
+
3011
+ ``` cpp
3012
+ triangular_matrix_vector_solve(A, t, d, b, x, divides<void>{});
3013
+ ```
3014
+
3015
+ ``` cpp
3016
+ template<class ExecutionPolicy, in-matrix InMat, class Triangle, class DiagonalStorage,
3017
+ in-vector InVec, out-vector OutVec>
3018
+ void triangular_matrix_vector_solve(ExecutionPolicy&& exec,
3019
+ InMat A, Triangle t, DiagonalStorage d, InVec b, OutVec x);
3020
+ ```
3021
+
3022
+ *Effects:* Equivalent to:
3023
+
3024
+ ``` cpp
3025
+ triangular_matrix_vector_solve(std::forward<ExecutionPolicy>(exec),
3026
+ A, t, d, b, x, divides<void>{});
3027
+ ```
3028
+
3029
+ ``` cpp
3030
+ template<in-matrix InMat, class Triangle, class DiagonalStorage,
3031
+ inout-vector InOutVec, class BinaryDivideOp>
3032
+ void triangular_matrix_vector_solve(InMat A, Triangle t, DiagonalStorage d,
3033
+ InOutVec b, BinaryDivideOp divide);
3034
+ template<class ExecutionPolicy, in-matrix InMat, class Triangle, class DiagonalStorage,
3035
+ inout-vector InOutVec, class BinaryDivideOp>
3036
+ void triangular_matrix_vector_solve(ExecutionPolicy&& exec,
3037
+ InMat A, Triangle t, DiagonalStorage d,
3038
+ InOutVec b, BinaryDivideOp divide);
3039
+ ```
3040
+
3041
+ These functions perform an in-place triangular solve, taking into
3042
+ account the `Triangle` and `DiagonalStorage` parameters that apply to
3043
+ the triangular matrix `A` [[linalg.general]].
3044
+
3045
+ [*Note 1*: Performing triangular solve in place hinders
3046
+ parallelization. However, other `ExecutionPolicy` specific
3047
+ optimizations, such as vectorization, are still possible. — *end note*]
3048
+
3049
+ *Effects:* Computes a vector x' such that b = A x', and assigns each
3050
+ element of x' to the corresponding element of b. If no such x' exists,
3051
+ then the elements of `b` are valid but unspecified.
3052
+
3053
+ *Complexity:* 𝑂(`A.extent(1)` × `b.extent(0)`).
3054
+
3055
+ ``` cpp
3056
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, inout-vector InOutVec>
3057
+ void triangular_matrix_vector_solve(InMat A, Triangle t, DiagonalStorage d, InOutVec b);
3058
+ ```
3059
+
3060
+ *Effects:* Equivalent to:
3061
+
3062
+ ``` cpp
3063
+ triangular_matrix_vector_solve(A, t, d, b, divides<void>{});
3064
+ ```
3065
+
3066
+ ``` cpp
3067
+ template<class ExecutionPolicy,
3068
+ in-matrix InMat, class Triangle, class DiagonalStorage, inout-vector InOutVec>
3069
+ void triangular_matrix_vector_solve(ExecutionPolicy&& exec,
3070
+ InMat A, Triangle t, DiagonalStorage d, InOutVec b);
3071
+ ```
3072
+
3073
+ *Effects:* Equivalent to:
3074
+
3075
+ ``` cpp
3076
+ triangular_matrix_vector_solve(std::forward<ExecutionPolicy>(exec),
3077
+ A, t, d, b, divides<void>{});
3078
+ ```
3079
+
3080
+ #### Rank-1 (outer product) update of a matrix <a id="linalg.algs.blas2.rank1">[[linalg.algs.blas2.rank1]]</a>
3081
+
3082
+ ``` cpp
3083
+ template<in-vector InVec1, in-vector InVec2, inout-matrix InOutMat>
3084
+ void matrix_rank_1_update(InVec1 x, InVec2 y, InOutMat A);
3085
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2, inout-matrix InOutMat>
3086
+ void matrix_rank_1_update(ExecutionPolicy&& exec, InVec1 x, InVec2 y, InOutMat A);
3087
+ ```
3088
+
3089
+ These functions perform a nonsymmetric nonconjugated rank-1 update.
3090
+
3091
+ [*Note 1*: These functions correspond to the BLAS functions `xGER` (for
3092
+ real element types) and `xGERU` (for complex element
3093
+ types). — *end note*]
3094
+
3095
+ *Mandates:* *`possibly-multipliable`*`<InOutMat, InVec2, InVec1>()` is
3096
+ `true`.
3097
+
3098
+ *Preconditions:* *`multipliable`*`(A, y, x)` is `true`.
3099
+
3100
+ *Effects:* Computes a matrix A' such that $A' = A + x y^T$, and assigns
3101
+ each element of A' to the corresponding element of A.
3102
+
3103
+ *Complexity:* 𝑂(`x.extent(0)` × `y.extent(0)`).
3104
+
3105
+ ``` cpp
3106
+ template<in-vector InVec1, in-vector InVec2, inout-matrix InOutMat>
3107
+ void matrix_rank_1_update_c(InVec1 x, InVec2 y, InOutMat A);
3108
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2, inout-matrix InOutMat>
3109
+ void matrix_rank_1_update_c(ExecutionPolicy&& exec, InVec1 x, InVec2 y, InOutMat A);
3110
+ ```
3111
+
3112
+ These functions perform a nonsymmetric conjugated rank-1 update.
3113
+
3114
+ [*Note 2*: These functions correspond to the BLAS functions `xGER` (for
3115
+ real element types) and `xGERC` (for complex element
3116
+ types). — *end note*]
3117
+
3118
+ *Effects:*
3119
+
3120
+ - For the overloads without an `ExecutionPolicy` argument, equivalent
3121
+ to:
3122
+ ``` cpp
3123
+ matrix_rank_1_update(x, conjugated(y), A);
3124
+ ```
3125
+ - otherwise, equivalent to:
3126
+ ``` cpp
3127
+ matrix_rank_1_update(std::forward<ExecutionPolicy>(exec), x, conjugated(y), A);
3128
+ ```
3129
+
3130
+ #### Symmetric or Hermitian Rank-1 (outer product) update of a matrix <a id="linalg.algs.blas2.symherrank1">[[linalg.algs.blas2.symherrank1]]</a>
3131
+
3132
+ [*Note 1*: These functions correspond to the BLAS functions `xSYR`,
3133
+ `xSPR`, `xHER`, and `xHPR`. They have overloads taking a scaling factor
3134
+ `alpha`, because it would be impossible to express the update
3135
+ $A = A - x x^T$ otherwise. — *end note*]
3136
+
3137
+ The following elements apply to all functions in
3138
+ [[linalg.algs.blas2.symherrank1]].
3139
+
3140
+ *Mandates:*
3141
+
3142
+ - If `InOutMat` has `layout_blas_packed` layout, then the layout’s
3143
+ `Triangle` template argument has the same type as the function’s
3144
+ `Triangle` template argument;
3145
+ - `compatible-static-extents<decltype(A), decltype(A)>(0, 1)` is `true`;
3146
+ and
3147
+ - `compatible-static-extents<decltype(A), decltype(x)>(0, 0)` is `true`.
3148
+
3149
+ *Preconditions:*
3150
+
3151
+ - `A.extent(0)` equals `A.extent(1)`, and
3152
+ - `A.extent(0)` equals `x.extent(0)`.
3153
+
3154
+ *Complexity:* 𝑂(`x.extent(0)` × `x.extent(0)`).
3155
+
3156
+ ``` cpp
3157
+ template<class Scalar, in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
3158
+ void symmetric_matrix_rank_1_update(Scalar alpha, InVec x, InOutMat A, Triangle t);
3159
+ template<class ExecutionPolicy,
3160
+ class Scalar, in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
3161
+ void symmetric_matrix_rank_1_update(ExecutionPolicy&& exec,
3162
+ Scalar alpha, InVec x, InOutMat A, Triangle t);
3163
+ ```
3164
+
3165
+ These functions perform a symmetric rank-1 update of the symmetric
3166
+ matrix `A`, taking into account the `Triangle` parameter that applies to
3167
+ `A` [[linalg.general]].
3168
+
3169
+ *Effects:* Computes a matrix A' such that $A' = A + \alpha x x^T$, where
3170
+ the scalar α is `alpha`, and assigns each element of A' to the
3171
+ corresponding element of A.
3172
+
3173
+ ``` cpp
3174
+ template<in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
3175
+ void symmetric_matrix_rank_1_update(InVec x, InOutMat A, Triangle t);
3176
+ template<class ExecutionPolicy,
3177
+ in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
3178
+ void symmetric_matrix_rank_1_update(ExecutionPolicy&& exec, InVec x, InOutMat A, Triangle t);
3179
+ ```
3180
+
3181
+ These functions perform a symmetric rank-1 update of the symmetric
3182
+ matrix `A`, taking into account the `Triangle` parameter that applies to
3183
+ `A` [[linalg.general]].
3184
+
3185
+ *Effects:* Computes a matrix A' such that $A' = A + x x^T$ and assigns
3186
+ each element of A' to the corresponding element of A.
3187
+
3188
+ ``` cpp
3189
+ template<class Scalar, in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
3190
+ void hermitian_matrix_rank_1_update(Scalar alpha, InVec x, InOutMat A, Triangle t);
3191
+ template<class ExecutionPolicy,
3192
+ class Scalar, in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
3193
+ void hermitian_matrix_rank_1_update(ExecutionPolicy&& exec,
3194
+ Scalar alpha, InVec x, InOutMat A, Triangle t);
3195
+ ```
3196
+
3197
+ These functions perform a Hermitian rank-1 update of the Hermitian
3198
+ matrix `A`, taking into account the `Triangle` parameter that applies to
3199
+ `A` [[linalg.general]].
3200
+
3201
+ *Effects:* Computes A' such that $A' = A + \alpha x x^H$, where the
3202
+ scalar α is `alpha`, and assigns each element of A' to the corresponding
3203
+ element of A.
3204
+
3205
+ ``` cpp
3206
+ template<in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
3207
+ void hermitian_matrix_rank_1_update(InVec x, InOutMat A, Triangle t);
3208
+ template<class ExecutionPolicy,
3209
+ in-vector InVec, possibly-packed-inout-matrix InOutMat, class Triangle>
3210
+ void hermitian_matrix_rank_1_update(ExecutionPolicy&& exec, InVec x, InOutMat A, Triangle t);
3211
+ ```
3212
+
3213
+ These functions perform a Hermitian rank-1 update of the Hermitian
3214
+ matrix `A`, taking into account the `Triangle` parameter that applies to
3215
+ `A` [[linalg.general]].
3216
+
3217
+ *Effects:* Computes a matrix A' such that $A' = A + x x^H$ and assigns
3218
+ each element of A' to the corresponding element of A.
3219
+
3220
+ #### Symmetric and Hermitian rank-2 matrix updates <a id="linalg.algs.blas2.rank2">[[linalg.algs.blas2.rank2]]</a>
3221
+
3222
+ [*Note 1*: These functions correspond to the BLAS functions
3223
+ `xSYR2`,`xSPR2`, `xHER2` and `xHPR2`. — *end note*]
3224
+
3225
+ The following elements apply to all functions in
3226
+ [[linalg.algs.blas2.rank2]].
3227
+
3228
+ *Mandates:*
3229
+
3230
+ - If `InOutMat` has `layout_blas_packed` layout, then the layout’s
3231
+ `Triangle` template argument has the same type as the function’s
3232
+ `Triangle` template argument;
3233
+ - `compatible-static-extents<decltype(A), decltype(A)>(0, 1)` is `true`;
3234
+ and
3235
+ - `possibly-multipliable<decltype(A), decltype(x), decltype(y)>()` is
3236
+ `true`.
3237
+
3238
+ *Preconditions:*
3239
+
3240
+ - `A.extent(0)` equals `A.extent(1)`, and
3241
+ - `multipliable(A, x, y)` is `true`.
3242
+
3243
+ *Complexity:* 𝑂(`x.extent(0)` × `y.extent(0)`).
3244
+
3245
+ ``` cpp
3246
+ template<in-vector InVec1, in-vector InVec2,
3247
+ possibly-packed-inout-matrix InOutMat, class Triangle>
3248
+ void symmetric_matrix_rank_2_update(InVec1 x, InVec2 y, InOutMat A, Triangle t);
3249
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2,
3250
+ possibly-packed-inout-matrix InOutMat, class Triangle>
3251
+ void symmetric_matrix_rank_2_update(ExecutionPolicy&& exec,
3252
+ InVec1 x, InVec2 y, InOutMat A, Triangle t);
3253
+ ```
3254
+
3255
+ These functions perform a symmetric rank-2 update of the symmetric
3256
+ matrix `A`, taking into account the `Triangle` parameter that applies to
3257
+ `A` [[linalg.general]].
3258
+
3259
+ *Effects:* Computes A' such that $A' = A + x y^T + y x^T$ and assigns
3260
+ each element of A' to the corresponding element of A.
3261
+
3262
+ ``` cpp
3263
+ template<in-vector InVec1, in-vector InVec2,
3264
+ possibly-packed-inout-matrix InOutMat, class Triangle>
3265
+ void hermitian_matrix_rank_2_update(InVec1 x, InVec2 y, InOutMat A, Triangle t);
3266
+ template<class ExecutionPolicy, in-vector InVec1, in-vector InVec2,
3267
+ possibly-packed-inout-matrix InOutMat, class Triangle>
3268
+ void hermitian_matrix_rank_2_update(ExecutionPolicy&& exec,
3269
+ InVec1 x, InVec2 y, InOutMat A, Triangle t);
3270
+ ```
3271
+
3272
+ These functions perform a Hermitian rank-2 update of the Hermitian
3273
+ matrix `A`, taking into account the `Triangle` parameter that applies to
3274
+ `A` [[linalg.general]].
3275
+
3276
+ *Effects:* Computes A' such that $A' = A + x y^H + y x^H$ and assigns
3277
+ each element of A' to the corresponding element of A.
3278
+
3279
+ ### BLAS 3 algorithms <a id="linalg.algs.blas3">[[linalg.algs.blas3]]</a>
3280
+
3281
+ #### General matrix-matrix product <a id="linalg.algs.blas3.gemm">[[linalg.algs.blas3.gemm]]</a>
3282
+
3283
+ [*Note 1*: These functions correspond to the BLAS function
3284
+ `xGEMM`. — *end note*]
3285
+
3286
+ The following elements apply to all functions in
3287
+ [[linalg.algs.blas3.gemm]] in addition to function-specific elements.
3288
+
3289
+ *Mandates:*
3290
+ `possibly-multipliable<decltype(A), decltype(B), decltype(C)>()` is
3291
+ `true`.
3292
+
3293
+ *Preconditions:* `multipliable(A, B, C)` is `true`.
3294
+
3295
+ *Complexity:* 𝑂(`A.extent(0)` × `A.extent(1)` × `B.extent(1)`).
3296
+
3297
+ ``` cpp
3298
+ template<in-matrix InMat1, in-matrix InMat2, out-matrix OutMat>
3299
+ void matrix_product(InMat1 A, InMat2 B, OutMat C);
3300
+ template<class ExecutionPolicy, in-matrix InMat1, in-matrix InMat2, out-matrix OutMat>
3301
+ void matrix_product(ExecutionPolicy&& exec, InMat1 A, InMat2 B, OutMat C);
3302
+ ```
3303
+
3304
+ *Effects:* Computes C = A B.
3305
+
3306
+ ``` cpp
3307
+ template<in-matrix InMat1, in-matrix InMat2, in-matrix InMat3, out-matrix OutMat>
3308
+ void matrix_product(InMat1 A, InMat2 B, InMat3 E, OutMat C);
3309
+ template<class ExecutionPolicy,
3310
+ in-matrix InMat1, in-matrix InMat2, in-matrix InMat3, out-matrix OutMat>
3311
+ void matrix_product(ExecutionPolicy&& exec, InMat1 A, InMat2 B, InMat3 E, OutMat C);
3312
+ ```
3313
+
3314
+ *Mandates:* *`possibly-addable`*`<InMat3, InMat3, OutMat>()` is `true`.
3315
+
3316
+ *Preconditions:* *`addable`*`(E, E, C)` is `true`.
3317
+
3318
+ *Effects:* Computes C = E + A B.
3319
+
3320
+ *Remarks:* `C` may alias `E`.
3321
+
3322
+ #### Symmetric, Hermitian, and triangular matrix-matrix product <a id="linalg.algs.blas3.xxmm">[[linalg.algs.blas3.xxmm]]</a>
3323
+
3324
+ [*Note 1*: These functions correspond to the BLAS functions `xSYMM`,
3325
+ `xHEMM`, and `xTRMM`. — *end note*]
3326
+
3327
+ The following elements apply to all functions in
3328
+ [[linalg.algs.blas3.xxmm]] in addition to function-specific elements.
3329
+
3330
+ *Mandates:*
3331
+
3332
+ - `possibly-multipliable<decltype(A), decltype(B), decltype(C)>()` is
3333
+ `true`, and
3334
+ - `possibly-addable<decltype(E), decltype(E), decltype(C)>()` is `true`
3335
+ for those overloads that take an `E` parameter.
3336
+
3337
+ *Preconditions:*
3338
+
3339
+ - `multipliable(A, B, C)` is `true`, and
3340
+ - `addable(E, E, C)` is `true` for those overloads that take an `E`
3341
+ parameter.
3342
+
3343
+ *Complexity:* 𝑂(`A.extent(0)` × `A.extent(1)` × `B.extent(1)`).
3344
+
3345
+ ``` cpp
3346
+ template<in-matrix InMat1, class Triangle, in-matrix InMat2, out-matrix OutMat>
3347
+ void symmetric_matrix_product(InMat1 A, Triangle t, InMat2 B, OutMat C);
3348
+ template<class ExecutionPolicy,
3349
+ in-matrix InMat1, class Triangle, in-matrix InMat2, out-matrix OutMat>
3350
+ void symmetric_matrix_product(ExecutionPolicy&& exec, InMat1 A, Triangle t, InMat2 B, OutMat C);
3351
+
3352
+ template<in-matrix InMat1, class Triangle, in-matrix InMat2, out-matrix OutMat>
3353
+ void hermitian_matrix_product(InMat1 A, Triangle t, InMat2 B, OutMat C);
3354
+ template<class ExecutionPolicy,
3355
+ in-matrix InMat1, class Triangle, in-matrix InMat2, out-matrix OutMat>
3356
+ void hermitian_matrix_product(ExecutionPolicy&& exec, InMat1 A, Triangle t, InMat2 B, OutMat C);
3357
+
3358
+ template<in-matrix InMat1, class Triangle, class DiagonalStorage,
3359
+ in-matrix InMat2, out-matrix OutMat>
3360
+ void triangular_matrix_product(InMat1 A, Triangle t, DiagonalStorage d, InMat2 B, OutMat C);
3361
+ template<class ExecutionPolicy, in-matrix InMat1, class Triangle, class DiagonalStorage,
3362
+ in-matrix InMat2, out-matrix OutMat>
3363
+ void triangular_matrix_product(ExecutionPolicy&& exec,
3364
+ InMat1 A, Triangle t, DiagonalStorage d, InMat2 B, OutMat C);
3365
+ ```
3366
+
3367
+ These functions perform a matrix-matrix multiply, taking into account
3368
+ the `Triangle` and `DiagonalStorage` (if applicable) parameters that
3369
+ apply to the symmetric, Hermitian, or triangular (respectively) matrix
3370
+ `A` [[linalg.general]].
3371
+
3372
+ *Mandates:*
3373
+
3374
+ - If `InMat1` has `layout_blas_packed` layout, then the layout’s
3375
+ `Triangle` template argument has the same type as the function’s
3376
+ `Triangle` template argument; and
3377
+ - *`compatible-static-extents`*`<InMat1, InMat1>(0, 1)` is `true`.
3378
+
3379
+ *Preconditions:* `A.extent(0) == A.extent(1)` is `true`.
3380
+
3381
+ *Effects:* Computes C = A B.
3382
+
3383
+ ``` cpp
3384
+ template<in-matrix InMat1, in-matrix InMat2, class Triangle, out-matrix OutMat>
3385
+ void symmetric_matrix_product(InMat1 A, InMat2 B, Triangle t, OutMat C);
3386
+ template<class ExecutionPolicy,
3387
+ in-matrix InMat1, in-matrix InMat2, class Triangle, out-matrix OutMat>
3388
+ void symmetric_matrix_product(ExecutionPolicy&& exec,
3389
+ InMat1 A, InMat2 B, Triangle t, OutMat C);
3390
+
3391
+ template<in-matrix InMat1, in-matrix InMat2, class Triangle, out-matrix OutMat>
3392
+ void hermitian_matrix_product(InMat1 A, InMat2 B, Triangle t, OutMat C);
3393
+ template<class ExecutionPolicy,
3394
+ in-matrix InMat1, in-matrix InMat2, class Triangle, out-matrix OutMat>
3395
+ void hermitian_matrix_product(ExecutionPolicy&& exec,
3396
+ InMat1 A, InMat2 B, Triangle t, OutMat C);
3397
+
3398
+ template<in-matrix InMat1, in-matrix InMat2, class Triangle, class DiagonalStorage,
3399
+ out-matrix OutMat>
3400
+ void triangular_matrix_product(InMat1 A, InMat2 B, Triangle t, DiagonalStorage d, OutMat C);
3401
+ template<class ExecutionPolicy,
3402
+ in-matrix InMat1, in-matrix InMat2, class Triangle, class DiagonalStorage,
3403
+ out-matrix OutMat>
3404
+ void triangular_matrix_product(ExecutionPolicy&& exec,
3405
+ InMat1 A, InMat2 B, Triangle t, DiagonalStorage d, OutMat C);
3406
+ ```
3407
+
3408
+ These functions perform a matrix-matrix multiply, taking into account
3409
+ the `Triangle` and `DiagonalStorage` (if applicable) parameters that
3410
+ apply to the symmetric, Hermitian, or triangular (respectively) matrix
3411
+ `B` [[linalg.general]].
3412
+
3413
+ *Mandates:*
3414
+
3415
+ - If `InMat2` has `layout_blas_packed` layout, then the layout’s
3416
+ `Triangle` template argument has the same type as the function’s
3417
+ `Triangle` template argument; and
3418
+ - *`compatible-static-extents`*`<InMat2, InMat2>(0, 1)` is `true`.
3419
+
3420
+ *Preconditions:* `B.extent(0) == B.extent(1)` is `true`.
3421
+
3422
+ *Effects:* Computes C = A B.
3423
+
3424
+ ``` cpp
3425
+ template<in-matrix InMat1, class Triangle, in-matrix InMat2, in-matrix InMat3,
3426
+ out-matrix OutMat>
3427
+ void symmetric_matrix_product(InMat1 A, Triangle t, InMat2 B, InMat3 E, OutMat C);
3428
+ template<class ExecutionPolicy,
3429
+ in-matrix InMat1, class Triangle, in-matrix InMat2, in-matrix InMat3,
3430
+ out-matrix OutMat>
3431
+ void symmetric_matrix_product(ExecutionPolicy&& exec,
3432
+ InMat1 A, Triangle t, InMat2 B, InMat3 E, OutMat C);
3433
+
3434
+ template<in-matrix InMat1, class Triangle, in-matrix InMat2, in-matrix InMat3,
3435
+ out-matrix OutMat>
3436
+ void hermitian_matrix_product(InMat1 A, Triangle t, InMat2 B, InMat3 E, OutMat C);
3437
+ template<class ExecutionPolicy,
3438
+ in-matrix InMat1, class Triangle, in-matrix InMat2, in-matrix InMat3,
3439
+ out-matrix OutMat>
3440
+ void hermitian_matrix_product(ExecutionPolicy&& exec,
3441
+ InMat1 A, Triangle t, InMat2 B, InMat3 E, OutMat C);
3442
+
3443
+ template<in-matrix InMat1, class Triangle, class DiagonalStorage,
3444
+ in-matrix InMat2, in-matrix InMat3, out-matrix OutMat>
3445
+ void triangular_matrix_product(InMat1 A, Triangle t, DiagonalStorage d, InMat2 B, InMat3 E,
3446
+ OutMat C);
3447
+ template<class ExecutionPolicy,
3448
+ in-matrix InMat1, class Triangle, class DiagonalStorage,
3449
+ in-matrix InMat2, in-matrix InMat3, out-matrix OutMat>
3450
+ void triangular_matrix_product(ExecutionPolicy&& exec,
3451
+ InMat1 A, Triangle t, DiagonalStorage d, InMat2 B, InMat3 E,
3452
+ OutMat C);
3453
+ ```
3454
+
3455
+ These functions perform a potentially overwriting matrix-matrix
3456
+ multiply-add, taking into account the `Triangle` and `DiagonalStorage`
3457
+ (if applicable) parameters that apply to the symmetric, Hermitian, or
3458
+ triangular (respectively) matrix `A` [[linalg.general]].
3459
+
3460
+ *Mandates:*
3461
+
3462
+ - If `InMat1` has `layout_blas_packed` layout, then the layout’s
3463
+ `Triangle` template argument has the same type as the function’s
3464
+ `Triangle` template argument; and
3465
+ - *`compatible-static-extents`*`<InMat1, InMat1>(0, 1)` is `true`.
3466
+
3467
+ *Preconditions:* `A.extent(0) == A.extent(1)` is `true`.
3468
+
3469
+ *Effects:* Computes C = E + A B.
3470
+
3471
+ *Remarks:* `C` may alias `E`.
3472
+
3473
+ ``` cpp
3474
+ template<in-matrix InMat1, in-matrix InMat2, class Triangle, in-matrix InMat3,
3475
+ out-matrix OutMat>
3476
+ void symmetric_matrix_product(InMat1 A, InMat2 B, Triangle t, InMat3 E, OutMat C);
3477
+ template<class ExecutionPolicy,
3478
+ in-matrix InMat1, in-matrix InMat2, class Triangle, in-matrix InMat3,
3479
+ out-matrix OutMat>
3480
+ void symmetric_matrix_product(ExecutionPolicy&& exec,
3481
+ InMat1 A, InMat2 B, Triangle t, InMat3 E, OutMat C);
3482
+
3483
+ template<in-matrix InMat1, in-matrix InMat2, class Triangle, in-matrix InMat3,
3484
+ out-matrix OutMat>
3485
+ void hermitian_matrix_product(InMat1 A, InMat2 B, Triangle t, InMat3 E, OutMat C);
3486
+ template<class ExecutionPolicy,
3487
+ in-matrix InMat1, in-matrix InMat2, class Triangle, in-matrix InMat3,
3488
+ out-matrix OutMat>
3489
+ void hermitian_matrix_product(ExecutionPolicy&& exec,
3490
+ InMat1 A, InMat2 B, Triangle t, InMat3 E, OutMat C);
3491
+
3492
+ template<in-matrix InMat1, in-matrix InMat2, class Triangle, class DiagonalStorage,
3493
+ in-matrix InMat3, out-matrix OutMat>
3494
+ void triangular_matrix_product(InMat1 A, InMat2 B, Triangle t, DiagonalStorage d, InMat3 E,
3495
+ OutMat C);
3496
+ template<class ExecutionPolicy,
3497
+ in-matrix InMat1, in-matrix InMat2, class Triangle, class DiagonalStorage,
3498
+ in-matrix InMat3, out-matrix OutMat>
3499
+ void triangular_matrix_product(ExecutionPolicy&& exec,
3500
+ InMat1 A, InMat2 B, Triangle t, DiagonalStorage d, InMat3 E,
3501
+ OutMat C);
3502
+ ```
3503
+
3504
+ These functions perform a potentially overwriting matrix-matrix
3505
+ multiply-add, taking into account the `Triangle` and `DiagonalStorage`
3506
+ (if applicable) parameters that apply to the symmetric, Hermitian, or
3507
+ triangular (respectively) matrix `B` [[linalg.general]].
3508
+
3509
+ *Mandates:*
3510
+
3511
+ - If `InMat2` has `layout_blas_packed` layout, then the layout’s
3512
+ `Triangle` template argument has the same type as the function’s
3513
+ `Triangle` template argument; and
3514
+ - *`compatible-static-extents`*`<InMat2, InMat2>(0, 1)` is `true`.
3515
+
3516
+ *Preconditions:* `B.extent(0) == B.extent(1)` is `true`.
3517
+
3518
+ *Effects:* Computes C = E + A B.
3519
+
3520
+ *Remarks:* `C` may alias `E`.
3521
+
3522
+ #### In-place triangular matrix-matrix product <a id="linalg.algs.blas3.trmm">[[linalg.algs.blas3.trmm]]</a>
3523
+
3524
+ These functions perform an in-place matrix-matrix multiply, taking into
3525
+ account the `Triangle` and `DiagonalStorage` parameters that apply to
3526
+ the triangular matrix `A` [[linalg.general]].
3527
+
3528
+ [*Note 1*: These functions correspond to the BLAS function
3529
+ `xTRMM`. — *end note*]
3530
+
3531
+ ``` cpp
3532
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
3533
+ void triangular_matrix_left_product(InMat A, Triangle t, DiagonalStorage d, InOutMat C);
3534
+ template<class ExecutionPolicy,
3535
+ in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
3536
+ void triangular_matrix_left_product(ExecutionPolicy&& exec,
3537
+ InMat A, Triangle t, DiagonalStorage d, InOutMat C);
3538
+ ```
3539
+
3540
+ *Mandates:*
3541
+
3542
+ - If `InMat` has `layout_blas_packed` layout, then the layout’s
3543
+ `Triangle` template argument has the same type as the function’s
3544
+ `Triangle` template argument;
3545
+ - *`possibly-multipliable`*`<InMat, InOutMat, InOutMat>()` is `true`;
3546
+ and
3547
+ - *`compatible-static-extents`*`<InMat, InMat>(0, 1)` is `true`.
3548
+
3549
+ *Preconditions:*
3550
+
3551
+ - *`multipliable`*`(A, C, C)` is `true`, and
3552
+ - `A.extent(0) == A.extent(1)` is `true`.
3553
+
3554
+ *Effects:* Computes a matrix C' such that C' = A C and assigns each
3555
+ element of C' to the corresponding element of C.
3556
+
3557
+ *Complexity:* 𝑂(`A.extent(0)` × `A.extent(1)` × `C.extent(0)`).
3558
+
3559
+ ``` cpp
3560
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
3561
+ void triangular_matrix_right_product(InMat A, Triangle t, DiagonalStorage d, InOutMat C);
3562
+ template<class ExecutionPolicy,
3563
+ in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
3564
+ void triangular_matrix_right_product(ExecutionPolicy&& exec,
3565
+ InMat A, Triangle t, DiagonalStorage d, InOutMat C);
3566
+ ```
3567
+
3568
+ *Mandates:*
3569
+
3570
+ - If `InMat` has `layout_blas_packed` layout, then the layout’s
3571
+ `Triangle` template argument has the same type as the function’s
3572
+ `Triangle` template argument;
3573
+ - *`possibly-multipliable`*`<InOutMat, InMat, InOutMat>()` is `true`;
3574
+ and
3575
+ - *`compatible-static-extents`*`<InMat, InMat>(0, 1)` is `true`.
3576
+
3577
+ *Preconditions:*
3578
+
3579
+ - *`multipliable`*`(C, A, C)` is `true`, and
3580
+ - `A.extent(0) == A.extent(1)` is `true`.
3581
+
3582
+ *Effects:* Computes a matrix C' such that C' = C A and assigns each
3583
+ element of C' to the corresponding element of C.
3584
+
3585
+ *Complexity:* 𝑂(`A.extent(0)` × `A.extent(1)` × `C.extent(0)`).
3586
+
3587
+ #### Rank-k update of a symmetric or Hermitian matrix <a id="linalg.algs.blas3.rankk">[[linalg.algs.blas3.rankk]]</a>
3588
+
3589
+ [*Note 1*: These functions correspond to the BLAS functions `xSYRK` and
3590
+ `xHERK`. — *end note*]
3591
+
3592
+ The following elements apply to all functions in
3593
+ [[linalg.algs.blas3.rankk]].
3594
+
3595
+ *Mandates:*
3596
+
3597
+ - If `InOutMat` has `layout_blas_packed` layout, then the layout’s
3598
+ `Triangle` template argument has the same type as the function’s
3599
+ `Triangle` template argument;
3600
+ - `compatible-static-extents<decltype(A), decltype(A)>(0, 1)` is `true`;
3601
+ - `compatible-static-extents<decltype(C), decltype(C)>(0, 1)` is `true`;
3602
+ and
3603
+ - `compatible-static-extents<decltype(A), decltype(C)>(0, 0)` is `true`.
3604
+
3605
+ *Preconditions:*
3606
+
3607
+ - `A.extent(0)` equals `A.extent(1)`,
3608
+ - `C.extent(0)` equals `C.extent(1)`, and
3609
+ - `A.extent(0)` equals `C.extent(0)`.
3610
+
3611
+ *Complexity:* 𝑂(`A.extent(0)` × `A.extent(1)` × `C.extent(0)`).
3612
+
3613
+ ``` cpp
3614
+ template<class Scalar, in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
3615
+ void symmetric_matrix_rank_k_update(Scalar alpha, InMat A, InOutMat C, Triangle t);
3616
+ template<class ExecutionPolicy, class Scalar,
3617
+ in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
3618
+ void symmetric_matrix_rank_k_update(ExecutionPolicy&& exec,
3619
+ Scalar alpha, InMat A, InOutMat C, Triangle t);
3620
+ ```
3621
+
3622
+ *Effects:* Computes a matrix C' such that $C' = C + \alpha A A^T$, where
3623
+ the scalar α is `alpha`, and assigns each element of C' to the
3624
+ corresponding element of C.
3625
+
3626
+ ``` cpp
3627
+ template<in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
3628
+ void symmetric_matrix_rank_k_update(InMat A, InOutMat C, Triangle t);
3629
+ template<class ExecutionPolicy,
3630
+ in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
3631
+ void symmetric_matrix_rank_k_update(ExecutionPolicy&& exec,
3632
+ InMat A, InOutMat C, Triangle t);
3633
+ ```
3634
+
3635
+ *Effects:* Computes a matrix C' such that $C' = C + A A^T$, and assigns
3636
+ each element of C' to the corresponding element of C.
3637
+
3638
+ ``` cpp
3639
+ template<class Scalar, in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
3640
+ void hermitian_matrix_rank_k_update(Scalar alpha, InMat A, InOutMat C, Triangle t);
3641
+ template<class ExecutionPolicy,
3642
+ class Scalar, in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
3643
+ void hermitian_matrix_rank_k_update(ExecutionPolicy&& exec,
3644
+ Scalar alpha, InMat A, InOutMat C, Triangle t);
3645
+ ```
3646
+
3647
+ *Effects:* Computes a matrix C' such that $C' = C + \alpha A A^H$, where
3648
+ the scalar α is `alpha`, and assigns each element of C' to the
3649
+ corresponding element of C.
3650
+
3651
+ ``` cpp
3652
+ template<in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
3653
+ void hermitian_matrix_rank_k_update(InMat A, InOutMat C, Triangle t);
3654
+ template<class ExecutionPolicy,
3655
+ in-matrix InMat, possibly-packed-inout-matrix InOutMat, class Triangle>
3656
+ void hermitian_matrix_rank_k_update(ExecutionPolicy&& exec,
3657
+ InMat A, InOutMat C, Triangle t);
3658
+ ```
3659
+
3660
+ *Effects:* Computes a matrix C' such that $C' = C + A A^H$, and assigns
3661
+ each element of C' to the corresponding element of C.
3662
+
3663
+ #### Rank-2k update of a symmetric or Hermitian matrix <a id="linalg.algs.blas3.rank2k">[[linalg.algs.blas3.rank2k]]</a>
3664
+
3665
+ [*Note 1*: These functions correspond to the BLAS functions `xSYR2K`
3666
+ and `xHER2K`. — *end note*]
3667
+
3668
+ The following elements apply to all functions in
3669
+ [[linalg.algs.blas3.rank2k]].
3670
+
3671
+ *Mandates:*
3672
+
3673
+ - If `InOutMat` has `layout_blas_packed` layout, then the layout’s
3674
+ `Triangle` template argument has the same type as the function’s
3675
+ `Triangle` template argument;
3676
+ - `possibly-addable<decltype(A), decltype(B), decltype(C)>()` is `true`;
3677
+ and
3678
+ - `compatible-static-extents<decltype(A), decltype(A)>(0, 1)` is `true`.
3679
+
3680
+ *Preconditions:*
3681
+
3682
+ - `addable(A, B, C)` is `true`, and
3683
+ - `A.extent(0)` equals `A.extent(1)`.
3684
+
3685
+ *Complexity:* 𝑂(`A.extent(0)` × `A.extent(1)` × `C.extent(0)`).
3686
+
3687
+ ``` cpp
3688
+ template<in-matrix InMat1, in-matrix InMat2,
3689
+ possibly-packed-inout-matrix InOutMat, class Triangle>
3690
+ void symmetric_matrix_rank_2k_update(InMat1 A, InMat2 B, InOutMat C, Triangle t);
3691
+ template<class ExecutionPolicy, in-matrix InMat1, in-matrix InMat2,
3692
+ possibly-packed-inout-matrix InOutMat, class Triangle>
3693
+ void symmetric_matrix_rank_2k_update(ExecutionPolicy&& exec,
3694
+ InMat1 A, InMat2 B, InOutMat C, Triangle t);
3695
+ ```
3696
+
3697
+ *Effects:* Computes a matrix C' such that $C' = C + A B^T + B A^T$, and
3698
+ assigns each element of C' to the corresponding element of C.
3699
+
3700
+ ``` cpp
3701
+ template<in-matrix InMat1, in-matrix InMat2,
3702
+ possibly-packed-inout-matrix InOutMat, class Triangle>
3703
+ void hermitian_matrix_rank_2k_update(InMat1 A, InMat2 B, InOutMat C, Triangle t);
3704
+ template<class ExecutionPolicy,
3705
+ in-matrix InMat1, in-matrix InMat2,
3706
+ possibly-packed-inout-matrix InOutMat, class Triangle>
3707
+ void hermitian_matrix_rank_2k_update(ExecutionPolicy&& exec,
3708
+ InMat1 A, InMat2 B, InOutMat C, Triangle t);
3709
+ ```
3710
+
3711
+ *Effects:* Computes a matrix C' such that $C' = C + A B^H + B A^H$, and
3712
+ assigns each element of C' to the corresponding element of C.
3713
+
3714
+ #### Solve multiple triangular linear systems <a id="linalg.algs.blas3.trsm">[[linalg.algs.blas3.trsm]]</a>
3715
+
3716
+ [*Note 1*: These functions correspond to the BLAS function
3717
+ `xTRSM`. — *end note*]
3718
+
3719
+ ``` cpp
3720
+ template<in-matrix InMat1, class Triangle, class DiagonalStorage,
3721
+ in-matrix InMat2, out-matrix OutMat, class BinaryDivideOp>
3722
+ void triangular_matrix_matrix_left_solve(InMat1 A, Triangle t, DiagonalStorage d,
3723
+ InMat2 B, OutMat X, BinaryDivideOp divide);
3724
+ template<class ExecutionPolicy,
3725
+ in-matrix InMat1, class Triangle, class DiagonalStorage,
3726
+ in-matrix InMat2, out-matrix OutMat, class BinaryDivideOp>
3727
+ void triangular_matrix_matrix_left_solve(ExecutionPolicy&& exec,
3728
+ InMat1 A, Triangle t, DiagonalStorage d,
3729
+ InMat2 B, OutMat X, BinaryDivideOp divide);
3730
+ ```
3731
+
3732
+ These functions perform multiple matrix solves, taking into account the
3733
+ `Triangle` and `DiagonalStorage` parameters that apply to the triangular
3734
+ matrix `A` [[linalg.general]].
3735
+
3736
+ *Mandates:*
3737
+
3738
+ - If `InMat1` has `layout_blas_packed` layout, then the layout’s
3739
+ `Triangle` template argument has the same type as the function’s
3740
+ `Triangle` template argument;
3741
+ - *`possibly-multipliable`*`<InMat1, OutMat, InMat2>()` is `true`; and
3742
+ - *`compatible-static-extents`*`<InMat1, InMat1>(0, 1)` is `true`.
3743
+
3744
+ *Preconditions:*
3745
+
3746
+ - *`multipliable`*`(A, X, B)` is `true`, and
3747
+ - `A.extent(0) == A.extent(1)` is `true`.
3748
+
3749
+ *Effects:* Computes X' such that AX' = B, and assigns each element of X'
3750
+ to the corresponding element of X. If no such X' exists, then the
3751
+ elements of `X` are valid but unspecified.
3752
+
3753
+ *Complexity:* 𝑂(`A.extent(0)` × `X.extent(1)` × `X.extent(1)`).
3754
+
3755
+ [*Note 2*: Since the triangular matrix is on the left, the desired
3756
+ `divide` implementation in the case of noncommutative multiplication is
3757
+ mathematically equivalent to $y^{-1} x$, where x is the first argument
3758
+ and y is the second argument, and $y^{-1}$ denotes the multiplicative
3759
+ inverse of y. — *end note*]
3760
+
3761
+ ``` cpp
3762
+ template<in-matrix InMat1, class Triangle, class DiagonalStorage,
3763
+ in-matrix InMat2, out-matrix OutMat>
3764
+ void triangular_matrix_matrix_left_solve(InMat1 A, Triangle t, DiagonalStorage d,
3765
+ InMat2 B, OutMat X);
3766
+ ```
3767
+
3768
+ *Effects:* Equivalent to:
3769
+
3770
+ ``` cpp
3771
+ triangular_matrix_matrix_left_solve(A, t, d, B, X, divides<void>{});
3772
+ ```
3773
+
3774
+ ``` cpp
3775
+ template<class ExecutionPolicy, in-matrix InMat1, class Triangle, class DiagonalStorage,
3776
+ in-matrix InMat2, out-matrix OutMat>
3777
+ void triangular_matrix_matrix_left_solve(ExecutionPolicy&& exec,
3778
+ InMat1 A, Triangle t, DiagonalStorage d,
3779
+ InMat2 B, OutMat X);
3780
+ ```
3781
+
3782
+ *Effects:* Equivalent to:
3783
+
3784
+ ``` cpp
3785
+ triangular_matrix_matrix_left_solve(std::forward<ExecutionPolicy>(exec),
3786
+ A, t, d, B, X, divides<void>{});
3787
+ ```
3788
+
3789
+ ``` cpp
3790
+ template<in-matrix InMat1, class Triangle, class DiagonalStorage,
3791
+ in-matrix InMat2, out-matrix OutMat, class BinaryDivideOp>
3792
+ void triangular_matrix_matrix_right_solve(InMat1 A, Triangle t, DiagonalStorage d,
3793
+ InMat2 B, OutMat X, BinaryDivideOp divide);
3794
+ template<class ExecutionPolicy,
3795
+ in-matrix InMat1, class Triangle, class DiagonalStorage,
3796
+ in-matrix InMat2, out-matrix OutMat, class BinaryDivideOp>
3797
+ void triangular_matrix_matrix_right_solve(ExecutionPolicy&& exec,
3798
+ InMat1 A, Triangle t, DiagonalStorage d,
3799
+ InMat2 B, OutMat X, BinaryDivideOp divide);
3800
+ ```
3801
+
3802
+ These functions perform multiple matrix solves, taking into account the
3803
+ `Triangle` and `DiagonalStorage` parameters that apply to the triangular
3804
+ matrix `A` [[linalg.general]].
3805
+
3806
+ *Mandates:*
3807
+
3808
+ - If `InMat1` has `layout_blas_packed` layout, then the layout’s
3809
+ `Triangle` template argument has the same type as the function’s
3810
+ `Triangle` template argument;
3811
+ - *`possibly-multipliable`*`<OutMat, InMat1, InMat2>()` is `true`; and
3812
+ - *`compatible-static-extents`*`<InMat1, InMat1>(0,1)` is `true`.
3813
+
3814
+ *Preconditions:*
3815
+
3816
+ - *`multipliable`*`(X, A, B)` is `true`, and
3817
+ - `A.extent(0) == A.extent(1)` is `true`.
3818
+
3819
+ *Effects:* Computes X' such that X'A = B, and assigns each element of X'
3820
+ to the corresponding element of X. If no such X' exists, then the
3821
+ elements of `X` are valid but unspecified.
3822
+
3823
+ *Complexity:* O( `B.extent(0)` ⋅ `B.extent(1)` ⋅ `A.extent(1)` )
3824
+
3825
+ [*Note 1*: Since the triangular matrix is on the right, the desired
3826
+ `divide` implementation in the case of noncommutative multiplication is
3827
+ mathematically equivalent to $x y^{-1}$, where x is the first argument
3828
+ and y is the second argument, and $y^{-1}$ denotes the multiplicative
3829
+ inverse of y. — *end note*]
3830
+
3831
+ ``` cpp
3832
+ template<in-matrix InMat1, class Triangle, class DiagonalStorage,
3833
+ in-matrix InMat2, out-matrix OutMat>
3834
+ void triangular_matrix_matrix_right_solve(InMat1 A, Triangle t, DiagonalStorage d,
3835
+ InMat2 B, OutMat X);
3836
+ ```
3837
+
3838
+ *Effects:* Equivalent to:
3839
+
3840
+ ``` cpp
3841
+ triangular_matrix_matrix_right_solve(A, t, d, B, X, divides<void>{});
3842
+ ```
3843
+
3844
+ ``` cpp
3845
+ template<class ExecutionPolicy, in-matrix InMat1, class Triangle, class DiagonalStorage,
3846
+ in-matrix InMat2, out-matrix OutMat>
3847
+ void triangular_matrix_matrix_right_solve(ExecutionPolicy&& exec,
3848
+ InMat1 A, Triangle t, DiagonalStorage d,
3849
+ InMat2 B, OutMat X);
3850
+ ```
3851
+
3852
+ *Effects:* Equivalent to:
3853
+
3854
+ ``` cpp
3855
+ triangular_matrix_matrix_right_solve(std::forward<ExecutionPolicy>(exec),
3856
+ A, t, d, B, X, divides<void>{});
3857
+ ```
3858
+
3859
+ #### Solve multiple triangular linear systems in-place <a id="linalg.algs.blas3.inplacetrsm">[[linalg.algs.blas3.inplacetrsm]]</a>
3860
+
3861
+ [*Note 1*: These functions correspond to the BLAS function
3862
+ `xTRSM`. — *end note*]
3863
+
3864
+ ``` cpp
3865
+ template<in-matrix InMat, class Triangle, class DiagonalStorage,
3866
+ inout-matrix InOutMat, class BinaryDivideOp>
3867
+ void triangular_matrix_matrix_left_solve(InMat A, Triangle t, DiagonalStorage d,
3868
+ InOutMat B, BinaryDivideOp divide);
3869
+ template<class ExecutionPolicy, in-matrix InMat, class Triangle, class DiagonalStorage,
3870
+ inout-matrix InOutMat, class BinaryDivideOp>
3871
+ void triangular_matrix_matrix_left_solve(ExecutionPolicy&& exec,
3872
+ InMat A, Triangle t, DiagonalStorage d,
3873
+ InOutMat B, BinaryDivideOp divide);
3874
+ ```
3875
+
3876
+ These functions perform multiple in-place matrix solves, taking into
3877
+ account the `Triangle` and `DiagonalStorage` parameters that apply to
3878
+ the triangular matrix `A` [[linalg.general]].
3879
+
3880
+ [*Note 1*: This algorithm makes it possible to compute factorizations
3881
+ like Cholesky and LU in place. Performing triangular solve in place
3882
+ hinders parallelization. However, other `ExecutionPolicy` specific
3883
+ optimizations, such as vectorization, are still possible. — *end note*]
3884
+
3885
+ *Mandates:*
3886
+
3887
+ - If `InMat` has `layout_blas_packed` layout, then the layout’s
3888
+ `Triangle` template argument has the same type as the function’s
3889
+ `Triangle` template argument;
3890
+ - *`possibly-multipliable`*`<InMat, InOutMat, InOutMat>()` is `true`;
3891
+ and
3892
+ - *`compatible-static-extents`*`<InMat, InMat>(0, 1)` is `true`.
3893
+
3894
+ *Preconditions:*
3895
+
3896
+ - *`multipliable`*`(A, B, B)` is `true`, and
3897
+ - `A.extent(0) == A.extent(1)` is `true`.
3898
+
3899
+ *Effects:* Computes X' such that AX' = B, and assigns each element of X'
3900
+ to the corresponding element of B. If so such X' exists, then the
3901
+ elements of `B` are valid but unspecified.
3902
+
3903
+ *Complexity:* 𝑂(`A.extent(0)` × `A.extent(1)` × `B.extent(1)`).
3904
+
3905
+ ``` cpp
3906
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
3907
+ void triangular_matrix_matrix_left_solve(InMat A, Triangle t, DiagonalStorage d,
3908
+ InOutMat B);
3909
+ ```
3910
+
3911
+ *Effects:* Equivalent to:
3912
+
3913
+ ``` cpp
3914
+ triangular_matrix_matrix_left_solve(A, t, d, B, divides<void>{});
3915
+ ```
3916
+
3917
+ ``` cpp
3918
+ template<class ExecutionPolicy,
3919
+ in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
3920
+ void triangular_matrix_matrix_left_solve(ExecutionPolicy&& exec,
3921
+ InMat A, Triangle t, DiagonalStorage d,
3922
+ InOutMat B);
3923
+ ```
3924
+
3925
+ *Effects:* Equivalent to:
3926
+
3927
+ ``` cpp
3928
+ triangular_matrix_matrix_left_solve(std::forward<ExecutionPolicy>(exec),
3929
+ A, t, d, B, divides<void>{});
3930
+ ```
3931
+
3932
+ ``` cpp
3933
+ template<in-matrix InMat, class Triangle, class DiagonalStorage,
3934
+ inout-matrix InOutMat, class BinaryDivideOp>
3935
+ void triangular_matrix_matrix_right_solve(InMat A, Triangle t, DiagonalStorage d,
3936
+ InOutMat B, BinaryDivideOp divide);
3937
+ template<class ExecutionPolicy, in-matrix InMat, class Triangle, class DiagonalStorage,
3938
+ inout-matrix InOutMat, class BinaryDivideOp>
3939
+ void triangular_matrix_matrix_right_solve(ExecutionPolicy&& exec,
3940
+ InMat A, Triangle t, DiagonalStorage d,
3941
+ InOutMat B, BinaryDivideOp divide);
3942
+ ```
3943
+
3944
+ These functions perform multiple in-place matrix solves, taking into
3945
+ account the `Triangle` and `DiagonalStorage` parameters that apply to
3946
+ the triangular matrix `A` [[linalg.general]].
3947
+
3948
+ [*Note 2*: This algorithm makes it possible to compute factorizations
3949
+ like Cholesky and LU in place. Performing triangular solve in place
3950
+ hinders parallelization. However, other `ExecutionPolicy` specific
3951
+ optimizations, such as vectorization, are still possible. — *end note*]
3952
+
3953
+ *Mandates:*
3954
+
3955
+ - If `InMat` has `layout_blas_packed` layout, then the layout’s
3956
+ `Triangle` template argument has the same type as the function’s
3957
+ `Triangle` template argument;
3958
+ - *`possibly-multipliable`*`<InOutMat, InMat, InOutMat>()` is `true`;
3959
+ and
3960
+ - *`compatible-static-extents`*`<InMat, InMat>(0, 1)` is `true`.
3961
+
3962
+ *Preconditions:*
3963
+
3964
+ - *`multipliable`*`(B, A, B)` is `true`, and
3965
+ - `A.extent(0) == A.extent(1)` is `true`.
3966
+
3967
+ *Effects:* Computes X' such that X'A = B, and assigns each element of X'
3968
+ to the corresponding element of B. If so such X' exists, then the
3969
+ elements of `B` are valid but unspecified.
3970
+
3971
+ *Complexity:* 𝑂(`A.extent(0)` × `A.extent(1)` × `B.extent(1)`).
3972
+
3973
+ ``` cpp
3974
+ template<in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
3975
+ void triangular_matrix_matrix_right_solve(InMat A, Triangle t, DiagonalStorage d, InOutMat B);
3976
+ ```
3977
+
3978
+ *Effects:* Equivalent to:
3979
+
3980
+ ``` cpp
3981
+ triangular_matrix_matrix_right_solve(A, t, d, B, divides<void>{});
3982
+ ```
3983
+
3984
+ ``` cpp
3985
+ template<class ExecutionPolicy,
3986
+ in-matrix InMat, class Triangle, class DiagonalStorage, inout-matrix InOutMat>
3987
+ void triangular_matrix_matrix_right_solve(ExecutionPolicy&& exec,
3988
+ InMat A, Triangle t, DiagonalStorage d,
3989
+ InOutMat B);
3990
+ ```
3991
+
3992
+ *Effects:* Equivalent to:
3993
+
3994
+ ``` cpp
3995
+ triangular_matrix_matrix_right_solve(std::forward<ExecutionPolicy>(exec),
3996
+ A, t, d, B, divides<void>{});
3997
+ ```
3998
+