Symplectic Autoencoders and the Toda Lattice

In this tutorial we use a SymplecticAutoencoder to approximate the Toda lattice with a lower-dimensional Hamiltonian model and compare the result with standard proper symplectic decomposition (PSD).

The system

The Toda lattice is a prototypical example of an integrable Hamiltonian system. It is described by the Hamiltonian

\[ H(q, p) = \sum_{n\in\mathbb{Z}}\left( \frac{p_n^2}{2} + \alpha e^{q_n - q_{n+1}} \right).\]

We further assume a finite number of particles $N$ and impose periodic boundary conditions:

\[\begin{aligned} q_{n+N} & \equiv q_n \\ p_{n+N} & \equiv p_n. \end{aligned}\]
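Hamilton's equations for this Hamiltonian give the equations of motion (a standard computation, spelled out here for completeness):

\[\begin{aligned} \dot{q}_n & = \frac{\partial H}{\partial p_n} = p_n, \\ \dot{p}_n & = -\frac{\partial H}{\partial q_n} = \alpha\left( e^{q_{n-1} - q_n} - e^{q_n - q_{n+1}} \right). \end{aligned}\]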

In this tutorial we want to reduce the dimension of the big system by a significant factor with (i) proper symplectic decomposition (PSD) and (ii) symplectic autoencoders. The first approach is strictly linear whereas the second one allows for general nonlinear mappings.
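To make the distinction concrete: PSD reconstructs the full state with a linear symplectic decoder, i.e. (roughly speaking)

\[ x \approx \Psi z, \quad \Psi \in \mathbb{R}^{2N\times 2n}, \quad \Psi^T \mathbb{J}_{2N} \Psi = \mathbb{J}_{2n}, \]

where $\mathbb{J}$ denotes the standard symplectic matrix in the respective dimension, whereas the symplectic autoencoder replaces $\Psi$ with a nonlinear symplectic map built from neural network layers.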

Using the Toda lattice in numerical experiments

In order to use the Toda lattice in numerical experiments we have to pick suitable initial conditions. For this, consider the third-degree spline:

\[h(s) = \begin{cases} 1 - \frac{3}{2}s^2 + \frac{3}{4}s^3 & \text{if } 0 \leq s \leq 1 \\ \frac{1}{4}(2 - s)^3 & \text{if } 1 < s \leq 2 \\ 0 & \text{else.} \end{cases}\]

Plotted on the relevant domain, $h$ is a compactly supported, piecewise-cubic bump; the sketch below reproduces this plot.
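The following is not part of the tutorial's own code; it simply evaluates the piecewise definition of $h$ above using Plots:

using Plots

# evaluate the piecewise cubic spline defined above
h(s) = if 0 ≤ s ≤ 1
    1 - 3s^2 / 2 + 3s^3 / 4
elseif 1 < s ≤ 2
    (2 - s)^3 / 4
else
    zero(s)
end

s_grid = 0:0.01:3
plot(s_grid, h.(s_grid); xlabel = "s", ylabel = "h(s)", label = "third-degree spline")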

We end up with the following choice of parametrized initial conditions:

\[u_0(\mu)(\omega) = h(s(\omega, \mu)), \quad s(\omega, \mu) = 20 \mu |\omega + \frac{\mu}{2}|.\]

For the purposes of this tutorial we will use the default value for $\mu$ provided in GeometricProblems:

using GeometricProblems.TodaLattice: μ

μ
0.3
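With this value of $\mu$ we can sketch the initial condition, reusing h and Plots from the sketch above. The uniform grid on $[-1/2, 1/2]$ is an assumption made purely for illustration; GeometricProblems provides its own discretization:

# sketch of the parametrized initial condition (grid choice is an assumption)
ω_grid = range(-0.5, 0.5; length = 200)
s(ω, μ) = 20μ * abs(ω + μ / 2)
plot(ω_grid, [h(s(ω, μ)) for ω in ω_grid]; xlabel = "ω", label = "u₀(μ)")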

Get the data

The training data can be obtained with the packages GeometricProblems and GeometricIntegrators:

using GeometricProblems.TodaLattice: hodeproblem
using GeometricIntegrators: integrate, ImplicitMidpoint
using GeometricMachineLearning
using Plots
import Random

pr = hodeproblem(; tspan = (0.0, 100.))
sol = integrate(pr, ImplicitMidpoint())
dl = DataLoader(sol; autoencoder = true)

dl.input_dim
400

Here we first integrate the system with the implicit midpoint method and then bring the training data into the right format by calling DataLoader. The dimension of the full system can be queried via dl.input_dim; here it is 400, i.e. the lattice has $N = 200$ particles, each contributing one $q$ and one $p$ degree of freedom. Also note that the keyword autoencoder was set to true.
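Because autoencoder was set to true, the $q$ and $p$ parts of dl.input are stored as third-order tensors. The following quick check is not part of the original tutorial; the sizes in the comments are assumptions for this particular setup (consistent with the reshape used further below, where the second axis has size one):

size(dl.input.q)   # expected: (200, 1, number of time steps)
size(dl.input.p)   # expected: (200, 1, number of time steps)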

Train the network

We now want to compare two different approaches: PSDArch and SymplecticAutoencoder. For this we first have to set up the networks:

const reduced_dim = 2

psd_arch = PSDArch(dl.input_dim, reduced_dim)
sae_arch = SymplecticAutoencoder(dl.input_dim, reduced_dim; n_encoder_blocks = 4, n_decoder_blocks = 4, n_encoder_layers = 4, n_decoder_layers = 1)

Random.seed!(123)
psd_nn = NeuralNetwork(psd_arch)
sae_nn = NeuralNetwork(sae_arch)

Training a neural network in GeometricMachineLearning is usually done by calling an instance of Optimizer. A PSDArch, however, can be solved directly with a singular value decomposition, which is done by calling solve!. The SymplecticAutoencoder, on the other hand, is trained with the AdamOptimizer:

const n_epochs = 8
const batch_size = 16

o = Optimizer(sae_nn, AdamOptimizer(Float64))

psd_error = solve!(psd_nn, dl)
sae_error = o(sae_nn, dl, Batch(batch_size), n_epochs)

hline([psd_error]; color = 2, label = "PSD error")
plot!(sae_error; color = 3, label = "SAE error", xlabel = "epoch", ylabel = "training error")
(Plot: PSD error as a horizontal line compared with the SAE training error per epoch.)

The online stage with a standard integrator

After having trained our neural network we can now evaluate it in the online stage of reduced complexity modeling:

psd_rs = HRedSys(pr, encoder(psd_nn), decoder(psd_nn); integrator = ImplicitMidpoint())
sae_rs = HRedSys(pr, encoder(sae_nn), decoder(sae_nn); integrator = ImplicitMidpoint())

projection_error(psd_rs)
0.6172687774821377
projection_error(sae_rs)
0.10902221691883568
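The projection error measures how well decoder ∘ encoder reproduces the snapshot data; roughly (up to the precise norm used by GeometricMachineLearning),

\[ e_\mathrm{proj} = \frac{\lVert M - \mathcal{R}(\mathcal{P}(M)) \rVert}{\lVert M \rVert}, \]

where $M$ collects the snapshots, $\mathcal{P}$ is the encoder and $\mathcal{R}$ the decoder. The symplectic autoencoder achieves a considerably smaller projection error than PSD here.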

Next we plot a comparison between the PSD prediction and the symplectic autoencoder prediction:

sol_full = integrate_full_system(psd_rs)
sol_psd_reduced = integrate_reduced_system(psd_rs)
sol_sae_reduced = integrate_reduced_system(sae_rs)

const t_steps = 100
plot(sol_full.s.q[t_steps], label = "Implicit Midpoint")
plot!(psd_rs.decoder((q = sol_psd_reduced.s.q[t_steps], p = sol_psd_reduced.s.p[t_steps])).q, label = "PSD")
plot!(sae_rs.decoder((q = sol_sae_reduced.s.q[t_steps], p = sol_sae_reduced.s.p[t_steps])).q, label = "SAE")
(Plot: the full-order implicit midpoint solution compared with the PSD and SAE reconstructions at the final time step.)

We can see that the autoencoder approach has much better approximation capabilities than the PSD approach. The jagged lines are due to the fact that training was done for only 8 epochs.

The online stage with a neural network

Instead of using a standard integrator we can also use a neural network that is trained on the reduced data. For this we first map the training data to the reduced space with the encoder:

data_unprocessed = encoder(sae_nn)(dl.input)
data_processed = (  q = reshape(data_unprocessed.q, reduced_dim ÷ 2, length(data_unprocessed.q)),
                    p = reshape(data_unprocessed.p, reduced_dim ÷ 2, length(data_unprocessed.p))
                    )

dl_reduced = DataLoader(data_processed; autoencoder = false)
integrator_batch_size = 128
integrator_train_epochs = 4

integrator_nn = NeuralNetwork(GSympNet(reduced_dim))
o_integrator = Optimizer(AdamOptimizer(Float64), integrator_nn)
# A custom loss: the reduced model output is decoded before being compared
# with the target snapshot, so the error is measured in the full space.
struct ReducedLoss{ET, DT} <: GeometricMachineLearning.NetworkLoss
    encoder::ET
    decoder::DT
end

function (loss::ReducedLoss)(model::Chain, params::Tuple, input::CT, output::CT) where {AT <: Array, CT <: NamedTuple{(:q, :p), Tuple{AT, AT}}}
    # encode the input, propagate it with the reduced network, decode and compare
    GeometricMachineLearning._compute_loss(loss.decoder(model(loss.encoder(input), params)), output)
end

loss = ReducedLoss(encoder(sae_nn), decoder(sae_nn))
dl_integration = DataLoader((q = reshape(dl.input.q, size(dl.input.q, 1), size(dl.input.q, 3)),
                             p = reshape(dl.input.p, size(dl.input.p, 1), size(dl.input.p, 3)));
                            autoencoder = false
                            )

o_integrator(integrator_nn, dl_integration, Batch(integrator_batch_size), integrator_train_epochs, loss)
[ Info: You have provided a NamedTuple with keys q and p; the data are matrices. This is interpreted as *symplectic data*.
[ Info: You have provided a NamedTuple with keys q and p; the data are matrices. This is interpreted as *symplectic data*.

Progress: 100%|█████████████████████████████████████████| Time: 0:02:38
  TrainingLoss:  4.871005151377493

We can now evaluate the solution:

ics = (q = dl_reduced.input.q[:, 1], p = dl_reduced.input.p[:, 1])
time_series = iterate(integrator_nn, ics; n_points = t_steps)
prediction = (q = time_series.q[:, end], p = time_series.p[:, end])
sol = decoder(sae_nn)(prediction)

plot!(sol.q; label = "Neural Network Integrator")
(Plot: the neural network integrator prediction added to the previous comparison plot.)
