Riemannian Manifolds

A Riemannian manifold is a manifold $\mathcal{M}$ that we endow with a mapping $g$ that smoothly[1] assigns a metric $g_x$ to each tangent space $T_x\mathcal{M}$. By a slight abuse of notation we will also refer to this $g$ as a metric.

After having defined a metric $g$ we can associate a length to each curve $\gamma:[0, t] \to \mathcal{M}$ through:

\[L(\gamma) = \int_0^t \sqrt{g_{\gamma(s)}(\gamma'(s), \gamma'(s))}ds.\]
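To make the length integral concrete, here is an illustrative NumPy sketch (not part of GeometricMachineLearning) that approximates $L(\gamma)$ for a quarter of a great circle on $S^2$, where the metric induced from $\mathbb{R}^3$ makes the integrand the Euclidean speed; the exact length is $\pi/2$:

```python
import numpy as np

def curve_length(gamma, t_end, num=10_000):
    """Approximate L(gamma) by summing the lengths of small chords."""
    s = np.linspace(0.0, t_end, num)
    points = np.array([gamma(si) for si in s])
    # For a manifold embedded in R^3 with the induced metric,
    # sqrt(g(gamma', gamma')) ds is the Euclidean arc length element.
    return np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1))

# Quarter of a great circle on the unit sphere; its exact length is pi/2.
gamma = lambda s: np.array([np.cos(s), np.sin(s), 0.0])
length = curve_length(gamma, np.pi / 2)
```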

This $L$ turns $\mathcal{M}$ into a metric space:

Definition

The metric (in the sense of metric spaces) on a Riemannian manifold $\mathcal{M}$ is

\[d(x, y) = \inf_{\gamma(0) = x,\, \gamma(t) = y}L(\gamma),\]

where $t$ can be chosen arbitrarily.

If a curve is minimal with respect to the length functional $L$ we call it a shortest curve or a geodesic. That is, a curve $\gamma:[0, t]\to\mathcal{M}$ is a geodesic if no curve connecting two points in $\gamma([0, t])$ is shorter than the corresponding segment of $\gamma$, i.e.

\[d(\gamma(t_i), \gamma(t_f)) = \int_{t_i}^{t_f}\sqrt{g_{\gamma(s)}(\gamma'(s), \gamma'(s))}ds,\]

for any $t_i, t_f\in[0, t]$.
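This defining property can be checked numerically on the unit sphere, where geodesics are great circles and the distance between two points is $\arccos$ of their inner product. The following NumPy sketch (illustrative only, not part of the library) verifies that the distance between two points on a unit-speed great circle equals the length integral $t_f - t_i$:

```python
import numpy as np

def sphere_distance(x, y):
    # Riemannian distance on the unit sphere: the great-circle distance.
    return np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))

# A unit-speed great circle, i.e. a geodesic on the unit sphere.
gamma = lambda s: np.array([np.cos(s), np.sin(s), 0.0])

t_i, t_f = 0.3, 1.2
# Unit speed means the length integral from t_i to t_f is just t_f - t_i;
# for a geodesic this must agree with the distance between the endpoints.
lhs = sphere_distance(gamma(t_i), gamma(t_f))
```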

An important result of Riemannian geometry states that there exists a vector field $X$ on $T\mathcal{M}$, called the geodesic spray, whose integral curves are derivatives of geodesics.

Geodesic Sprays and the Exponential Map

To every Riemannian manifold we can naturally associate a vector field on $T\mathcal{M}$ called the geodesic spray; its integral curves solve the geodesic equation. For our purposes it is enough to state that this vector field is unique and well-defined [5].

The most important property of the geodesic spray for us is the following:

Theorem

Given an initial point $x$ and an initial velocity $v_x$, an integral curve for the geodesic spray is of the form $t \mapsto (\gamma_{v_x}(t), \gamma_{v_x}'(t))$ where $\gamma_{v_x}$ is a geodesic. We further have the property that the integral curve for the geodesic spray for an initial point $x$ and an initial velocity $\eta\cdot{}v_x$ (where $\eta$ is a scalar) is of the form $t \mapsto (\gamma_{\eta\cdot{}v_x}(t), \gamma_{\eta\cdot{}v_x}'(t)) = (\gamma_{v_x}(\eta\cdot{}t), \eta\cdot\gamma_{v_x}'(\eta\cdot{}t)).$
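The scaling property in the theorem can be verified numerically on the unit sphere, where the geodesic through $x$ with tangent $v$ ($v \perp x$) is $\gamma_v(t) = \cos(t\|v\|)x + \sin(t\|v\|)v/\|v\|$. The NumPy sketch below (illustrative only) checks that $\gamma_{\eta\cdot{}v}(t) = \gamma_v(\eta\cdot{}t)$:

```python
import numpy as np

def sphere_geodesic(x, v, t):
    """gamma_v(t) on the unit sphere; x is a point, v a tangent vector (v perp x)."""
    speed = np.linalg.norm(v)
    return np.cos(t * speed) * x + np.sin(t * speed) * v / speed

x = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 2.0, 1.0])  # tangent to the sphere at x since v is orthogonal to x
eta, t = 0.7, 0.4

gamma_scaled_velocity = sphere_geodesic(x, eta * v, t)  # gamma_{eta*v}(t)
gamma_rescaled_time = sphere_geodesic(x, v, eta * t)    # gamma_v(eta*t)
```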

It is therefore customary to introduce the exponential map $\exp:T_x\mathcal{M}\to\mathcal{M}$ as

\[\exp(v_x) := \gamma_{v_x}(1),\]

and we see that $\gamma_{v_x}(t) = \exp(t\cdot{}v_x)$. In GeometricMachineLearning we denote the exponential map by geodesic to avoid confusion with the matrix exponential map[2]:

\[ \mathtt{geodesic}(x, v_x) \equiv \exp(v_x).\]

We give an example here:

using GeometricMachineLearning

Y = rand(StiefelManifold, 3, 1)

v = 5 * rand(3, 1)
Δ = v - Y * (v' * Y)


fig, ax = set_up_plot(; theme = theme)

# plot a sphere with radius one and origin 0
surface!(ax, Main.sphere(1., [0., 0., 0.])...; alpha = .5, transparency = true)

point_vec = ([Y[1]], [Y[2]], [Y[3]])
scatter!(ax, point_vec...; color = morange, marker = :star5)

arrow_vec = ([Δ[1]], [Δ[2]], [Δ[3]])
arrows!(ax, point_vec..., arrow_vec...; color = mred, linewidth = .02)

We now solve the geodesic spray for $\eta\cdot\Delta$ for $\eta = 0.1, 0.2, 0.3, \ldots, 5.5$ and plot the corresponding points:

Δ_increments = [Δ * η for η in 0.1 : 0.1 : 5.5]

Y_increments = [geodesic(Y, Δ_increment) for Δ_increment in Δ_increments]

fig, ax = set_up_plot(; theme = theme)
for Y_increment in Y_increments
    scatter!(ax, [Y_increment[1]], [Y_increment[2]], [Y_increment[3]];
        color = mred, markersize = 5)
end

fig

So a geodesic can be seen as the manifold equivalent of a straight line. Also note that we drew a random element from StiefelManifold here, not from $S^2$: Stiefel manifolds are more general spaces than the spheres and contain them as special cases ($S^{n-1}$ is the Stiefel manifold $St(n, 1)$).
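The relationship between Stiefel manifolds and spheres can be seen directly: an element of $St(n, k)$ is an $n\times{}k$ matrix with orthonormal columns, and for $k = 1$ this reduces to a unit vector. A small NumPy sketch (illustrative, independent of the library) samples such an element via a QR decomposition:

```python
import numpy as np

rng = np.random.default_rng(1)

# An element of the Stiefel manifold St(n, k) is an n-by-k matrix with
# orthonormal columns; the Q factor of a random matrix gives one.
n, k = 3, 1
Q, _ = np.linalg.qr(rng.normal(size=(n, k)))

# For k = 1 the Stiefel condition Q^T Q = I reduces to ||Q|| = 1,
# i.e. Q is a point on the sphere S^2.
col_norm = np.linalg.norm(Q)
```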

The Riemannian Gradient

The Riemannian gradient of a function $L:\mathcal{M}\to\mathbb{R}$ is a vector field[3] $\mathrm{grad}^gL$ (or simply $\mathrm{grad}L$) for which we have (with $\varphi_U$ a local chart around $x$)

\[ g_x(\mathrm{grad}_x^gL, v_x) = (\nabla_{\varphi_U(x)}(L\circ\varphi_U^{-1}))^T \varphi_U'(v_x), \]

where

\[ \nabla_xf = \begin{pmatrix} \frac{\partial{}f}{\partial{}x_1} \\ \vdots \\ \frac{\partial{}f}{\partial{}x_n} \end{pmatrix},\]

is the Euclidean gradient. By the non-degeneracy of $g$ the Riemannian gradient always exists [3]. We will give specific examples of this when discussing the Stiefel manifold and the Grassmann manifold.
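Before turning to those manifolds, the defining property can be illustrated on the unit sphere with the induced metric: there the Riemannian gradient of a function extended to $\mathbb{R}^3$ is the projection of the Euclidean gradient onto the tangent space, $\mathrm{grad}_xL = (\mathbb{I} - xx^T)\nabla_xL$. The NumPy sketch below (illustrative only) checks that $g_x(\mathrm{grad}_xL, v) = (\nabla_xL)^Tv$ for a tangent vector $v$:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3); x /= np.linalg.norm(x)  # a point on the unit sphere
a = rng.normal(size=3)

# f(x) = a . x extends to all of R^3; its Euclidean gradient is a.
euclid_grad = a
riem_grad = euclid_grad - x * (x @ euclid_grad)  # project onto the tangent space

# Check the defining property for a tangent vector v (v perp x):
v = rng.normal(size=3); v = v - x * (x @ v)
lhs = riem_grad @ v    # g_x(grad f, v): the induced metric is the Euclidean product
rhs = euclid_grad @ v  # directional derivative of f along v
```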

Gradient Flows and Riemannian Optimization

In GeometricMachineLearning we can include weights in neural networks that are part of a manifold. Training such neural networks amounts to Riemannian optimization and hence to solving the gradient flow equation $\dot{x}(s) = X(x(s))$, where the vector field $X$ is given by

\[X(x) = - \mathrm{grad}_xL.\]

Solving this gradient flow equation will then lead us to a local minimum on $\mathcal{M}$. This will be elaborated on when talking about optimizers. In practice we cannot solve the gradient flow equation directly and have to rely on approximations. The most straightforward approximation (and one that serves as a basis for all the optimization algorithms in GeometricMachineLearning) is to take the point $(x, X(x))$ as an initial condition for the geodesic spray and then solve the ODE for a small time step. Such an update rule, i.e.

\[x^{(t)} \leftarrow \gamma_{X(x^{(t-1)})}(\Delta{}t)\text{ with $\Delta{}t$ the time step},\]

we call the gradient optimization scheme.
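The update rule can be demonstrated on the unit sphere (an illustrative NumPy sketch, independent of the optimizers in GeometricMachineLearning): minimizing $L(x) = a^Tx$ over $S^2$ by repeatedly taking a geodesic step in the direction of the negative Riemannian gradient converges to the known minimizer $-a/\|a\|$:

```python
import numpy as np

def sphere_exp(x, v):
    """Exponential map on the unit sphere for a tangent vector v (v perp x)."""
    nv = np.linalg.norm(v)
    if nv < 1e-14:
        return x
    return np.cos(nv) * x + np.sin(nv) * v / nv

a = np.array([1.0, 2.0, 2.0])  # ||a|| = 3; the minimizer of L(x) = a . x is -a/3
x = np.array([1.0, 0.0, 0.0])
dt = 0.1                       # time step of the gradient optimization scheme
for _ in range(200):
    grad = a - x * (x @ a)     # Riemannian gradient of L(x) = a . x on the sphere
    x = sphere_exp(x, -dt * grad)  # geodesic step in the descent direction
```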

Library Functions

GeometricMachineLearning.geodesic — Method
geodesic(Y::Manifold, Δ)

Take as input an element Y of a manifold and a tangent vector Δ in the corresponding tangent space, and compute the geodesic (exponential map).

In different notation: take as input an element $x$ of $\mathcal{M}$ and an element of $T_x\mathcal{M}$ and return $\mathtt{geodesic}(x, v_x) = \exp(v_x).$ For example:

N, n = 5, 3  # example dimensions; any N ≥ n work
Y = rand(StiefelManifold{Float64}, N, n)
Δ = rgrad(Y, rand(N, n))
geodesic(Y, Δ)

See the docstring for rgrad for details on this function.


References

[2]
S. Lang. Fundamentals of differential geometry. Vol. 191 (Springer Science & Business Media, 2012).
[5]
M. P. do Carmo. Riemannian geometry. Translated by F. Flaherty (Birkhäuser, 1992).
  • 1Smooth here refers to the fact that $g:\mathcal{M}\to\text{(Space of Metrics)}$ has to be a smooth map. But in order to discuss this in detail we would have to define a topology on the space of metrics. A more detailed discussion can be found in [2, 3, 5].
  • 2The Riemannian exponential map and the matrix exponential map coincide for many matrix Lie groups.
  • 3We also write $\mathrm{grad}^gL(x) = \mathrm{grad}^g_xL.$