Hamiltonian Model Order Reduction
Hamiltonian PDEs are partial differential equations that, like their ODE counterparts, have a Hamiltonian associated with them. The linear wave equation can be written as such a Hamiltonian PDE with
\[\mathcal{H}(q, p; \mu) := \frac{1}{2}\int_\Omega\mu^2(\partial_\xi{}q(t,\xi;\mu))^2 + p(t,\xi;\mu)^2d\xi.\]
Note that in contrast to the ODE case, where the Hamiltonian is a function $H(\cdot; \mu):\mathbb{R}^{2d}\to\mathbb{R},$ we now have a functional $\mathcal{H}(\cdot, \cdot; \mu):\mathcal{C}^\infty(\mathcal{D})\times\mathcal{C}^\infty(\mathcal{D})\to\mathbb{R}.$ The PDE for this Hamiltonian can be obtained similarly to the ODE case:
\[\partial_t{}q(t,\xi;\mu) = \frac{\delta{}\mathcal{H}}{\delta{}p} = p(t,\xi;\mu), \quad \partial_t{}p(t,\xi;\mu) = -\frac{\delta{}\mathcal{H}}{\delta{}q} = \mu^2\partial_{\xi{}\xi}q(t,\xi;\mu)\]
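The right-hand sides involve variational derivatives instead of gradients. To see where the expression for $\delta\mathcal{H}/\delta{}q$ comes from, perturb $q$ in a direction $\delta{}q$ and integrate by parts, assuming the boundary terms vanish (e.g. for periodic boundary conditions):
\[\left.\frac{d}{d\epsilon}\right|_{\epsilon=0}\mathcal{H}(q + \epsilon\,\delta{}q, p; \mu) = \int_\Omega\mu^2(\partial_\xi{}q)(\partial_\xi\delta{}q)d\xi = -\int_\Omega\mu^2(\partial_{\xi\xi}q)\,\delta{}q\,d\xi, \quad \text{i.e. } \frac{\delta\mathcal{H}}{\delta{}q} = -\mu^2\partial_{\xi\xi}q.\]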
Neglecting the Hamiltonian structure of a system can have grave consequences for the performance of the reduced order model [68–70], which is why all algorithms in GeometricMachineLearning that are designed for producing reduced order models respect the structure of the system.
The Symplectic Solution Manifold
As with regular parametric PDEs, we also associate a solution manifold with Hamiltonian PDEs. This is a finite-dimensional manifold on which the dynamics can be described through a Hamiltonian ODE. The reduced system, with which we approximate this symplectic solution manifold, is a low-dimensional symplectic vector space $\mathbb{R}^{2n}$ together with a reduction $\mathcal{P}$ and a reconstruction $\mathcal{R}.$ If we now take an initial condition on the solution manifold $\hat{u}_0\in\mathcal{M} \approx \mathcal{R}(\mathbb{R}^{2n})$ and project it to the reduced space with $\mathcal{P}$, we get $u_0 = \mathcal{P}(\hat{u}_0).$ We can then integrate it on the reduced space via the induced differential equation, which is of canonical Hamiltonian form, and obtain an orbit $u(t)$ that can be mapped back to an orbit on the solution manifold[1] via $\mathcal{R}.$ Ideally the resulting orbit $\mathcal{R}(u(t))$ coincides with the unique orbit $\hat{u}(t)\in\mathcal{M}$ of the full order model.
For Hamiltonian model order reduction we additionally require that the reduction $\mathcal{P}$ satisfies
\[ \nabla_z\mathcal{P}\mathbb{J}_{2N}(\nabla_z\mathcal{P})^T = \mathbb{J}_{2n} \text{ for $z\in\mathbb{R}^{2N}$}\]
and the reconstruction $\mathcal{R}$ satisfies[2]
\[ (\nabla_z\mathcal{R})^T\mathbb{J}_{2N}\nabla_z\mathcal{R} = \mathbb{J}_{2n}.\]
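For linear maps these conditions can be checked directly, since the gradients are just the matrices themselves. The following plain-Julia sketch (for illustration only; it does not use the GeometricMachineLearning API) verifies both conditions for a cotangent-lift-type reconstruction, which is introduced below, with its transpose serving as reduction:

```julia
using LinearAlgebra

# Poisson tensor 𝕁_2m = [O I; -I O].
J(m) = [zeros(m, m) I(m); -I(m) zeros(m, m)]

N, n = 10, 2
Φ = qr(randn(N, n)).Q[:, 1:n]          # orthonormal columns: Φ ∈ St(n, N)
R = [Φ zeros(N, n); zeros(N, n) Φ]     # linear reconstruction ℛ
P = R'                                  # linear reduction 𝒫

@assert P * J(N) * P' ≈ J(n)            # condition on the reduction
@assert R' * J(N) * R ≈ J(n)            # condition on the reconstruction
```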
With this we can state the following theorem:
A Hamiltonian system on the reduced space $(\mathbb{R}^{2n}, \mathbb{J}_{2n}^T)$ is equivalent to a non-canonical symplectic system $(\mathcal{M}, \mathbb{J}_{2N}^T|_\mathcal{M})$ where
\[ \mathcal{M} = \mathcal{R}(\mathbb{R}^{2n})\]
is an approximation to the solution manifold. We further have
\[ \mathbb{J}_{2N}|_\mathcal{M}(z) = ((\nabla_z\mathcal{R})^+)^T\mathbb{J}_{2n}(\nabla_z\mathcal{R})^+,\]
so the dynamics on $\mathcal{M}$ can be described through a Hamiltonian ODE on $\mathbb{R}^{2n}.$
For the proof we use the fact that $\mathcal{M} = \mathcal{R}(\mathbb{R}^{2n})$ is a manifold whose coordinate chart is the local inverse of $\mathcal{R}$, which we will call $\psi$, i.e. around a point $y\in\mathcal{M}$ we have $\mathcal{R}\circ\psi(y) = y.$[3] We further define the symplectic inverse of a matrix $A\in\mathbb{R}^{2N\times2n}$ as
\[ A^+ = \mathbb{J}_{2n}^TA^T\mathbb{J}_{2N},\]
which satisfies
\[ A^+A = \mathbb{J}_{2n}^TA^T\mathbb{J}_{2N}A = \mathbb{I}_{2n}\]
if and only if $A$ is symplectic, i.e. $A^T\mathbb{J}_{2N}A = \mathbb{J}_{2n}$.
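A minimal sketch of this symplectic inverse in plain Julia (reusing `J`, `R`, `N` and `n` from the sketch above):

```julia
# Symplectic inverse A⁺ = 𝕁ᵀ_2n Aᵀ 𝕁_2N of a matrix A ∈ ℝ^(2N×2n).
function symplectic_inverse(A::AbstractMatrix)
    N, n = size(A, 1) ÷ 2, size(A, 2) ÷ 2
    J(n)' * A' * J(N)
end

# R is symplectic, so its symplectic inverse is a left inverse:
@assert symplectic_inverse(R) * R ≈ I(2n)
```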
Proof
Note that the tangent space to $\mathcal{M}$ at $y = \mathcal{R}(z)$ is:
\[ T_y\mathcal{M} = \{(\nabla_z\mathcal{R})v: v\in\mathbb{R}^{2n}\}.\]
The mapping
\[ \mathcal{M} \to T\mathcal{M}, \quad y \mapsto (\nabla_z\mathcal{R})\mathbb{J}_{2n}\nabla_zH \quad \text{where } z = \psi(y),\]
is clearly a vector field. We now prove that it is symplectic and equal to $\mathbb{J}_{2N}\nabla_y(H\circ\psi).$ For this first note that we have $\mathbb{I} = (\nabla_z\mathcal{R})^+\nabla_z\mathcal{R} = (\nabla_{\mathcal{R}(z)}\psi)\nabla_z\mathcal{R}$ and that the pseudoinverse is unique. We then have:
\[ \mathbb{J}_{2N}\nabla_y(H\circ\psi) = \mathbb{J}_{2N}(\nabla_y\psi)^T\nabla_{\psi(y)}H = \mathbb{J}_{2N}\left((\nabla_{\psi(y)}\mathcal{R})^+\right)^T\nabla_{\psi(y)}H = (\nabla_{\psi(y)}\mathcal{R})\mathbb{J}_{2n}\nabla_{\psi(y)}H,\]
which proves that every Hamiltonian system on $\mathbb{R}^{2n}$ induces a Hamiltonian system on $\mathcal{M}$. Conversely, assume we are given a Hamiltonian vector field whose flow map evolves on $\mathcal{M}$, which we denote by
\[\hat{X}(z) = \mathbb{J}_{2N}\nabla_{z}\hat{H} = (\nabla_{\psi(z)}\mathcal{R})\bar{X}(\psi(z)),\]
where $\bar{X}$ is a vector field on the reduced space. In the last equality we used that the flow map evolves on $\mathcal{M}$, so the corresponding vector field needs to map to $T\mathcal{M}.$ We further have:
\[(\nabla_{\psi(z)}\mathcal{R})^+\hat{X}(z) = \mathbb{J}_{2n}(\nabla_{\psi(z)}\mathcal{R})^T\nabla_z\hat{H} = \mathbb{J}_{2n}\nabla_{\psi(z)}(\hat{H}\circ\mathcal{R}) = \bar{X}(\psi(z)),\]
and we see that the vector field on the reduced space also has to be Hamiltonian. We can thus express a high-dimensional Hamiltonian system on $\mathcal{M}$ with a low-dimensional Hamiltonian system on $\mathbb{R}^{2n}$.
In the proof we used that the pseudoinverse is unique. This is not true in general [68], but holds for the architectures discussed here (proper symplectic decomposition and symplectic autoencoders). We will postpone the proof of this until after we have introduced symplectic autoencoders in detail.
This theorem serves as the basis for Hamiltonian model order reduction via proper symplectic decomposition and symplectic autoencoders. We will now briefly introduce these two approaches[4].
Proper Symplectic Decomposition
For proper symplectic decomposition (PSD) the reduction $\mathcal{P}$ and the reconstruction $\mathcal{R}$ are constrained to be linear, orthonormal and symplectic. Note that the first two of these properties are shared with POD. The easiest way[5] to enforce them is through the so-called "cotangent lift" [68]:
\[\mathcal{R} \equiv \Psi_\mathrm{CL} = \begin{bmatrix} \Phi & \mathbb{O} \\ \mathbb{O} & \Phi \end{bmatrix} \text{ where $\Phi\in{}St(n,N)\subset\mathbb{R}^{N\times{}n}$},\]
i.e. $\Phi$ is an element of the Stiefel manifold $St(n, N)$ (and $\Psi_\mathrm{CL}$ one of $St(2n, 2N)$), and we furthermore have $\Psi_\mathrm{CL}^T\mathbb{J}_{2N}\Psi_\mathrm{CL} = \mathbb{J}_{2n}$, i.e. $\Psi_\mathrm{CL}$ is symplectic. If the snapshot matrix is of the form:
\[M = \left[\begin{array}{c:c:c:c} \hat{q}_1(t_0) & \hat{q}_1(t_1) & \quad\ldots\quad & \hat{q}_1(t_f) \\ \hat{q}_2(t_0) & \hat{q}_2(t_1) & \ldots & \hat{q}_2(t_f) \\ \vdots & \vdots & \ddots & \vdots \\ \hat{q}_N(t_0) & \hat{q}_N(t_1) & \ldots & \hat{q}_N(t_f) \\ \hat{p}_1(t_0) & \hat{p}_1(t_1) & \ldots & \hat{p}_1(t_f) \\ \hat{p}_2(t_0) & \hat{p}_2(t_1) & \ldots & \hat{p}_2(t_f) \\ \vdots & \vdots & \ddots & \vdots \\ \hat{p}_{N}(t_0) & \hat{p}_{N}(t_1) & \ldots & \hat{p}_{N}(t_f) \\ \end{array}\right],\]
where $\mathtt{nts} := f + 1$ denotes the number of snapshots, i.e. the number of columns of $M$, then $\Psi_\mathrm{CL}$ can be computed in a very straightforward manner:
The ideal cotangent lift $\Psi_\mathrm{CL}$ for a snapshot matrix of the form
\[ M = \begin{bmatrix} M_q \\ M_p \end{bmatrix},\]
i.e. the cotangent lift that minimizes the projection error, can be obtained in the following way:
- Rearrange the rows of the matrix $M$ such that we end up with an $N\times(2\mathtt{nts})$ matrix: $\hat{M} := [M_q, M_p]$.
- Perform an SVD: $\hat{M} = U\Sigma{}V^T.$
- Set $\Phi\gets{}U\mathtt{[:,1:n]}.$
$\Psi_\mathrm{CL}$ is then built based on this $\Phi$.
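A minimal sketch of this procedure in plain Julia (the helper `cotangent_lift` is hypothetical and only for illustration; in GeometricMachineLearning this task is handled by `solve!`, see below):

```julia
using LinearAlgebra

# Compute the cotangent lift Ψ_CL from a snapshot matrix M ∈ ℝ^(2N × nts).
function cotangent_lift(M::AbstractMatrix, n::Integer)
    N = size(M, 1) ÷ 2
    M̂ = hcat(M[1:N, :], M[N+1:2N, :])   # rearrange to [M_q, M_p] ∈ ℝ^(N × 2nts)
    U, = svd(M̂)                          # M̂ = UΣVᵀ
    Φ = U[:, 1:n]                         # n dominant left singular vectors
    [Φ zeros(N, n); zeros(N, n) Φ]        # Ψ_CL = blockdiag(Φ, Φ)
end
```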
For details on the cotangent lift (and other methods for linear symplectic model reduction) consult [68]. In GeometricMachineLearning we use the function `solve!` for this task.
Symplectic Autoencoders
Symplectic autoencoders are a type of neural network suitable for treating Hamiltonian parametrized PDEs with slowly-decaying Kolmogorov $n$-width. They are based on PSD and symplectic neural networks[6].
Symplectic autoencoders are motivated similarly to standard autoencoders for model order reduction: PSD suffers from similar shortcomings as regular POD. PSD is a linear map and the approximation space $\tilde{\mathcal{M}}= \{\Psi^\mathrm{dec}(z_r)\in\mathbb{R}^{2N}:z_r\in\mathbb{R}^{2n}\}$ is therefore also linear. For problems with slowly-decaying Kolmogorov $n$-width this leads to very poor approximations.
In order to overcome this difficulty we use neural networks, more specifically SympNets, together with cotangent lift-like matrices. The resulting architecture, the symplectic autoencoder, is discussed in the dedicated section on neural network architectures.
Workflow for Symplectic ROM
As with any other data-driven reduced order modeling technique we first discretize the PDE. This should be done with a structure-preserving scheme, thus yielding a (high-dimensional) Hamiltonian ODE as a result. Going back to the example of the linear wave equation, we can discretize this equation with finite differences to obtain a Hamiltonian ODE:
\[\mathcal{H}_\mathrm{discr}(z(t;\mu);\mu) := \frac{1}{2}z(t;\mu)^T\begin{bmatrix} -\mu^2D_{\xi{}\xi} & \mathbb{O} \\ \mathbb{O} & \mathbb{I} \end{bmatrix} z(t;\mu),\]
where $D_{\xi\xi}$ is a finite-difference approximation of the second spatial derivative $\partial_{\xi\xi}$.
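As an illustration, a minimal sketch of this discrete Hamiltonian in plain Julia (assuming a uniform grid with spacing `Δξ` and homogeneous Dirichlet boundary conditions; this is not necessarily the discretization used elsewhere in the docs):

```julia
using LinearAlgebra

# Discrete Hamiltonian H(z) = ½ zᵀ [-μ²D_ξξ O; O I] z with z = (q, p).
function discrete_hamiltonian(z::AbstractVector, μ::Real, Δξ::Real)
    N = length(z) ÷ 2
    q, p = z[1:N], z[N+1:2N]
    Dξξ = SymTridiagonal(fill(-2.0, N), ones(N - 1)) / Δξ^2
    (-μ^2 * dot(q, Dξξ * q) + dot(p, p)) / 2
end
```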
In Hamiltonian reduced order modeling we try to find a symplectic submanifold in the solution space[7] that captures the dynamics of the full system as well as possible.
Similar to the regular PDE case we again build an encoder $\mathcal{P} \equiv \Psi^\mathrm{enc}$ and a decoder $\mathcal{R} \equiv \Psi^\mathrm{dec}$; but now both these mappings are required to be symplectic.
Concretely this means:
- The encoder is a mapping from a high-dimensional symplectic space to a low-dimensional symplectic space, i.e. $\Psi^\mathrm{enc}:\mathbb{R}^{2N}\to\mathbb{R}^{2n}$ such that $\nabla\Psi^\mathrm{enc}\mathbb{J}_{2N}(\nabla\Psi^\mathrm{enc})^T = \mathbb{J}_{2n}$.
- The decoder is a mapping from a low-dimensional symplectic space to a high-dimensional symplectic space, i.e. $\Psi^\mathrm{dec}:\mathbb{R}^{2n}\to\mathbb{R}^{2N}$ such that $(\nabla\Psi^\mathrm{dec})^T\mathbb{J}_{2N}\nabla\Psi^\mathrm{dec} = \mathbb{J}_{2n}$.
If these two maps are constrained to linear maps this amounts to PSD.
After we have obtained $\Psi^\mathrm{enc}$ and $\Psi^\mathrm{dec}$ we can construct the reduced model. GeometricMachineLearning has a symplectic Galerkin projection implemented. This symplectic Galerkin projection does:
\[ \begin{pmatrix} v_r(q, p) \\ f_r(q, p) \end{pmatrix} = \frac{d}{dt} \begin{pmatrix} q \\ p \end{pmatrix} = \mathbb{J}_{2n}(\nabla_z\mathcal{R})^T\mathbb{J}_{2N}^T \begin{pmatrix} v(\mathcal{R}(q, p)) \\ f(\mathcal{R}(q, p)) \end{pmatrix},\]
where $v$ denotes the first $N$ components of the full vector field and $f$ the second $N$ components. The superscript $r$ indicates a reduced vector field. These reduced vector fields are built with `GeometricMachineLearning.build_v_reduced` and `GeometricMachineLearning.build_f_reduced`.
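The following plain-Julia sketch illustrates what this projection computes (Zygote is used for the Jacobian; `decoder`, `v_full` and `f_full` are placeholders, and the actual implementation in `build_v_reduced` and `build_f_reduced` differs):

```julia
using LinearAlgebra, Zygote

# Poisson tensor 𝕁_2m = [O I; -I O] (as before).
J(m) = [zeros(m, m) I(m); -I(m) zeros(m, m)]

# Reduced vector field żᵣ = 𝕁_2n (∇ℛ)ᵀ 𝕁ᵀ_2N X(ℛ(zᵣ)).
function reduced_vector_field(zᵣ, decoder, v_full, f_full)
    n = length(zᵣ) ÷ 2
    ∇R = Zygote.jacobian(decoder, zᵣ)[1]   # (2N × 2n) Jacobian of ℛ
    N = size(∇R, 1) ÷ 2
    z = decoder(zᵣ)
    J(n) * ∇R' * J(N)' * vcat(v_full(z), f_full(z))
end
```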
Library Functions
GeometricMachineLearning.SymplecticEncoder — Type

SymplecticEncoder <: Encoder

This is the abstract `SymplecticEncoder` type.

See `Encoder` for the supertype and `NonLinearSymplecticEncoder` for a derived struct.
GeometricMachineLearning.SymplecticDecoder — Type

SymplecticDecoder <: Decoder

This is the abstract `SymplecticDecoder` type.

See `Decoder` for the supertype and `NonLinearSymplecticDecoder` for a derived struct.
GeometricMachineLearning.NonLinearSymplecticEncoder — Type

NonLinearSymplecticEncoder

This should not be called directly. Instead use `SymplecticAutoencoder`.

An instance of `NonLinearSymplecticEncoder` can be generated by calling `encoder` on an instance of `SymplecticAutoencoder`.
GeometricMachineLearning.NonLinearSymplecticDecoder — Type

NonLinearSymplecticDecoder

This should not be called directly. Instead use `SymplecticAutoencoder`.

An instance of `NonLinearSymplecticDecoder` can be generated by calling `decoder` on an instance of `SymplecticAutoencoder`.
GeometricMachineLearning.HRedSys — Type

HRedSys

`HRedSys` computes the reconstructed dynamics in the full system based on the reduced one. Optionally it can be compared to the FOM solution.

It can be called using two different constructors.

Constructors

The standard constructor is:
HRedSys(N, n; encoder, decoder, v_full, f_full, v_reduced, f_reduced, parameters, tspan, tstep, ics, projection_error)
where
- `encoder`: a function $\mathbb{R}^{2N}\mapsto{}\mathbb{R}^{2n}$.
- `decoder`: a (differentiable) function $\mathbb{R}^{2n}\mapsto\mathbb{R}^{2N}$.
- `v_full`: a (differentiable) mapping defined the same way as in GeometricIntegrators.
- `f_full`: a (differentiable) mapping defined the same way as in GeometricIntegrators.
- `v_reduced`: a (differentiable) mapping defined the same way as in GeometricIntegrators.
- `f_reduced`: a (differentiable) mapping defined the same way as in GeometricIntegrators.
- `integrator`: used for integrating the reduced system.
- `parameters`: a NamedTuple that parametrizes the vector fields (the same for the full and the reduced vector field).
- `tspan`: a tuple `(t₀, tₗ)` that specifies start and end point of the time interval over which integration is performed.
- `tstep`: the time step.
- `ics`: the initial condition for the big system.
The other, and more convenient, constructor is:

HRedSys(odeensemble, encoder, decoder)

where `odeensemble` can be a `HODEEnsemble` or a `HODEProblem`. `encoder` and `decoder` have to be neural networks. With this constructor one can also pass an integrator via the keyword argument `integrator`; the default is `ImplicitMidpoint()`. Internally this calls the functions `build_v_reduced` and `build_f_reduced` in order to build the reduced vector fields.
GeometricMachineLearning.build_v_reduced — Function

build_v_reduced(v_full, f_full, decoder)
Builds the reduced vector field ($q$ part) based on the full vector field for a Hamiltonian system.
We derive the reduced vector field via the reduced Hamiltonian: $\tilde{H} := H\circ\Psi^\mathrm{dec}$.
We then get
\[\begin{aligned} \mathbb{J}_{2n}\nabla_\xi\tilde{H} = \mathbb{J}_{2n}(\nabla\Psi^\mathrm{dec})^T\mathbb{J}_{2N}^T\mathbb{J}_{2N}\nabla_z{}H & = \mathbb{J}_{2n}(\nabla\Psi^\mathrm{dec})^T\mathbb{J}_{2N}^T \begin{pmatrix} v(z) \\ f(z) \end{pmatrix} \\ & = \begin{pmatrix} - (\nabla_p\Psi_q)^Tf(z) + (\nabla_p\Psi_p)^Tv(z) \\ (\nabla_q\Psi_q)^Tf(z) - (\nabla_q\Psi_p)^Tv(z) \end{pmatrix}. \end{aligned}\]
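Here we used the block forms of the decoder Jacobian and of the Poisson tensor; spelling out the intermediate quantities makes the last equality explicit:
\[\nabla\Psi^\mathrm{dec} = \begin{pmatrix} \nabla_q\Psi_q & \nabla_p\Psi_q \\ \nabla_q\Psi_p & \nabla_p\Psi_p \end{pmatrix}, \qquad \mathbb{J}_{2N}^T\begin{pmatrix} v(z) \\ f(z) \end{pmatrix} = \begin{pmatrix} -f(z) \\ v(z) \end{pmatrix}.\]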
`build_v_reduced` outputs the first half of the entries of this vector field.
GeometricMachineLearning.build_f_reduced — Function

build_f_reduced(v_full, f_full, decoder)

Builds the reduced vector field ($p$ part) based on the full vector field for a Hamiltonian system.

`build_f_reduced` outputs the second half of the entries of this vector field. See `build_v_reduced` for more information.
GeometricMachineLearning.PSDLayer — Type

PSDLayer(input_dim, output_dim)

Make an instance of `PSDLayer`.
This is a PSD-like layer used for symplectic autoencoders. One layer has the following shape:
\[A = \begin{bmatrix} \Phi & \mathbb{O} \\ \mathbb{O} & \Phi \end{bmatrix},\]
where $\Phi$ is an element of the Stiefel manifold $St(n, N)$.
GeometricMachineLearning.PSDArch — Type

PSDArch <: SymplecticCompression

`PSDArch` is the architecture that corresponds to `PSDLayer`.

The architecture

Proper symplectic decomposition (PSD) can be seen as a `SymplecticAutoencoder` for which the decoder and the encoder are both PSD-like matrices (see the docs for `PSDLayer`).

Training

For optimizing the parameters in this architecture no neural network training is necessary (see the docs for `solve!`).

The constructor

The constructor only takes two arguments as input:
- `full_dim::Integer`
- `reduced_dim::Integer`
GeometricMachineLearning.solve! — Function

solve!(nn::NeuralNetwork{<:PSDArch}, input)

Solve the cotangent lift problem for an input.

`PSDArch` does not require neural network training since it is a strictly linear operation that can be solved with singular value decomposition (SVD).
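A hedged usage sketch (the `NeuralNetwork` constructor call is an assumption about the API; consult the package docs for the authoritative usage):

```julia
using GeometricMachineLearning

# Assumption: M is a 2N × nts snapshot matrix of a Hamiltonian system.
N, n = 100, 5
M = randn(2N, 1000)                   # placeholder snapshot data

psd = NeuralNetwork(PSDArch(2N, 2n))  # PSD "network"; no gradient-based training
solve!(psd, M)                        # fits Φ via an SVD (cotangent lift)
```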
References

- [70] P. Buchfink, S. Glas and B. Haasdonk. Symplectic model reduction of Hamiltonian systems on nonlinear manifolds and approximation with weakly symplectic autoencoder. SIAM Journal on Scientific Computing 45, A289–A311 (2023).
- [68] L. Peng and K. Mohseni. Symplectic model reduction of Hamiltonian systems. SIAM Journal on Scientific Computing 38, A1–A27 (2016).
- [1] To be precise, an approximation of the solution manifold $\mathcal{R}(\mathbb{R}^{2n})$, as we are not able to find the solution manifold exactly in practice.
- [2] We should note that satisfying this symplecticity condition is much more important for the reconstruction than for the reduction. There is a lack of research on whether the symplecticity condition for the projection is really needed; in [70], for example, it is entirely ignored.
- [3] A similar proof can be found in [72]. Further note that, if we enforced the condition $\mathcal{P}\circ\mathcal{R} = \mathrm{id}$ exactly, the projection $\mathcal{P}$ would be equal to the local inverse $\psi.$ For the proof here we however only require the existence of $\psi$, not its explicit construction as $\mathcal{P}.$
- [4] We will discuss symplectic autoencoders later in a dedicated section.
- [5] The original PSD paper [68] proposes another approach to building linear reductions and reconstructions, the so-called "complex SVD." In practice this only brings minor advantages over the cotangent lift, however [69].
- [6] We call these SympNets most of the time. This term was coined in [5].
- [7] The submanifold that approximates the solution manifold is $\tilde{\mathcal{M}} = \{\Psi^\mathrm{dec}(z_r)\in\mathbb{R}^{2N}:z_r\in\mathbb{R}^{2n}\}$, where $z_r$ is the reduced state of the system. By a slight abuse of notation we also denote $\tilde{\mathcal{M}}$ by $\mathcal{M}$, as we did previously when showing the equivalence between Hamiltonian vector fields on $\mathbb{R}^{2n}$ and $\mathcal{M}$.