References

[1]
I. Goodfellow, Y. Bengio and A. Courville. Deep learning (MIT press, Cambridge, MA, 2016).
[2]
E. Hairer, C. Lubich and G. Wanner. Geometric Numerical integration: structure-preserving algorithms for ordinary differential equations (Springer, Heidelberg, 2006).
[3]
J. Nocedal and S. J. Wright. Numerical optimization (Springer, New York, NY, 2006). Second Edition.
[4]
F. Mezzadri. How to generate random matrices from the classical compact groups, arXiv preprint math-ph/0609050 (2006).
[5]
M. J. Kochenderfer and T. A. Wheeler. Algorithms for optimization (Mit Press, 2019).
[6]
P.-A. Absil, R. Mahony and R. Sepulchre. Riemannian geometry of Grassmann manifolds with a view on algorithmic computation. Acta Applicandae Mathematica 80, 199–220 (2004).
[7]
P.-A. Absil, R. Mahony and R. Sepulchre. Optimization algorithms on matrix manifolds (Princeton University Press, Princeton, New Jersey, 2008).
[8]
T. Bendokat, R. Zimmermann and P.-A. Absil. A Grassmann manifold handbook: Basic geometry and computational aspects, arXiv preprint arXiv:2011.13699 (2020).
[9]
T. Bendokat and R. Zimmermann. The real symplectic Stiefel and Grassmann manifolds: metrics, geodesics and applications, arXiv preprint arXiv:2108.12447 (2021).
[10]
B. Brantner. Generalizing Adam To Manifolds For Efficiently Training Transformers, arXiv preprint arXiv:2305.16901 (2023).