Neural Network Integrators

In GeometricMachineLearning most neural network architectures that are used for applications to physical systems can be divided into two categories: autoencoders and integrators. An integrator, in its most general form, refers to the approximation of the flow of an ODE (see the section on the existence and uniqueness theorem) by a numerical scheme. Traditionally, these numerical schemes were constructed by defining a relationship between a known time step $z^{(t)}$ and an unknown future one $z^{(t+1)}$ [7, 41]:

\[ f(z^{(t)}, z^{(t+1)}) = 0.\]

One usually refers to such a relationship as an "integration scheme". If this relationship can be reformulated as

\[ z^{(t+1)} = g(z^{(t)}),\]

then we call the scheme explicit; if it cannot be reformulated in such a way, we call it implicit. Implicit schemes are typically more expensive to solve than explicit ones. The Julia library GeometricIntegrators [42] offers a wide variety of integration schemes, both explicit and implicit.
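As a small illustration of this difference (plain Julia, independent of GeometricIntegrators and GeometricMachineLearning; all names here are made up for the sketch), consider the harmonic oscillator $\dot{q} = p$, $\dot{p} = -q$: explicit Euler evaluates $g$ directly, whereas the implicit midpoint rule only defines $z^{(t+1)}$ through an equation that has to be solved, here with a naive fixed-point iteration.

```julia
# Vector field of the harmonic oscillator: ż = (p, -q).
f(z) = [z[2], -z[1]]

# Explicit Euler: z⁽ᵗ⁺¹⁾ = g(z⁽ᵗ⁾) can be evaluated directly.
explicit_euler(z, h) = z + h * f(z)

# Implicit midpoint: z⁽ᵗ⁺¹⁾ is only defined through
# z⁽ᵗ⁺¹⁾ - z⁽ᵗ⁾ - h f((z⁽ᵗ⁾ + z⁽ᵗ⁺¹⁾) / 2) = 0,
# solved here with a simple fixed-point iteration.
function implicit_midpoint(z, h; iterations = 50)
    z_next = copy(z)
    for _ in 1:iterations
        z_next = z + h * f((z + z_next) / 2)
    end
    z_next
end

z₀ = [1.0, 0.0]
explicit_euler(z₀, 0.1), implicit_midpoint(z₀, 0.1)
```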

The neural network integrators in GeometricMachineLearning (the corresponding type is NeuralNetworkIntegrator) are all explicit integration schemes where the function $g$ above is modeled with a neural network.

Neural networks, as an alternative to traditional methods, are employed because of (i) potentially superior performance and (ii) an ability to learn unknown dynamics from data.
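Conceptually, an explicit neural network integrator is a learned one-step map that is applied repeatedly to produce a trajectory. The following sketch uses a random stand-in network and hypothetical helper names; it does not show the actual NeuralNetworkIntegrator interface.

```julia
# Toy stand-in for a trained one-step network z⁽ᵗ⁺¹⁾ = g(z⁽ᵗ⁾):
# a random two-layer map, purely for illustration.
W₁, b₁ = randn(10, 2), randn(10)
W₂, b₂ = randn(2, 10), randn(2)
g(z) = W₂ * tanh.(W₁ * z .+ b₁) .+ b₂

# Iterating the explicit map produces a discrete trajectory.
function rollout(g, z₀, n_steps)
    trajectory = [z₀]
    for _ in 1:n_steps
        push!(trajectory, g(trajectory[end]))
    end
    trajectory
end

rollout(g, [1.0, 0.0], 5)
```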

Multi-step Methods

A multi-step method [30, 31] refers to a scheme of the form[1]:

\[ f(z^{(t - \mathtt{sl} + 1)}, z^{(t - \mathtt{sl} + 2)}, \ldots, z^{(t)}, z^{(t + 1)}, \ldots, z^{(t + \mathtt{pw})}) = 0,\]

where $\mathtt{sl}$ is short for sequence length and $\mathtt{pw}$ is short for prediction window. In contrast to traditional single-step methods, $\mathtt{sl}$ and $\mathtt{pw}$ can be greater than 1. An explicit multi-step method has the following form:

\[[z^{(t+1)}, \ldots, z^{(t+\mathtt{pw})}] = g(z^{(t - \mathtt{sl} + 1)}, \ldots, z^{(t)}).\]
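To make the roles of $\mathtt{sl}$ and $\mathtt{pw}$ concrete, the following toy sketch (hypothetical names, not a library function) shows an explicit multi-step map with $\mathtt{sl} = 3$ and $\mathtt{pw} = 2$ that takes the last three states and returns the next two.

```julia
sl, pw = 3, 2

# Toy explicit multi-step rule: predict the next pw states by linear
# extrapolation of the last two known states.
function g(history)                     # history holds the last sl states
    z_prev, z_last = history[end-1], history[end]
    [z_last + k * (z_last - z_prev) for k in 1:pw]
end

history = [[1.0, 0.0], [0.9, -0.1], [0.8, -0.2]]   # z⁽ᵗ⁻²⁾, z⁽ᵗ⁻¹⁾, z⁽ᵗ⁾
g(history)                                          # [z⁽ᵗ⁺¹⁾, z⁽ᵗ⁺²⁾]
```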

There are essentially two ways to construct multi-step methods with neural networks: the older one uses recurrent neural networks such as long short-term memory cells (LSTMs, [43]) and the newer one uses transformer neural networks [26]. Both approaches have been successfully employed to learn multi-step methods (see [34, 39] for the former and [32, 44, 45] for the latter), but because the transformer architecture exhibits superior performance on modern hardware and can be imbued with geometric properties, it is recommended to always use a transformer-derived architecture when dealing with time series[2].

Explicit multi-step methods derived from the transformer are always subtypes of the type TransformerIntegrator in GeometricMachineLearning. In GeometricMachineLearning the standard transformer, the volume-preserving transformer and the linear symplectic transformer are implemented.
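To generate a long trajectory with such a multi-step method, the map can be applied repeatedly to a sliding window of the $\mathtt{sl}$ most recent states. The sketch below illustrates this rollout with a toy stand-in map; it does not reflect the actual TransformerIntegrator interface.

```julia
sl, pw = 3, 2
g(window) = [window[end] for _ in 1:pw]   # toy stand-in for a learned multi-step map

# Apply the multi-step map repeatedly to a sliding window of the sl most
# recent states, appending the pw predicted states after each application.
function rollout(g, initial_history, n_windows)
    trajectory = copy(initial_history)
    for _ in 1:n_windows
        window = trajectory[end-sl+1:end]
        append!(trajectory, g(window))
    end
    trajectory
end

rollout(g, [[1.0, 0.0], [0.9, -0.1], [0.8, -0.2]], 4)
```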

Library Functions

References

[7]
E. Hairer, C. Lubich and G. Wanner. Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations (Springer, 2006).
[41]
B. Leimkuhler and S. Reich. Simulating Hamiltonian Dynamics. No. 14 (Cambridge University Press, 2004).
[42]
M. Kraus. GeometricIntegrators.jl: Geometric Numerical Integration in Julia. https://github.com/JuliaGNI/GeometricIntegrators.jl.
[28]
K. Feng. The step-transition operators for multi-step methods of ODE's. Journal of Computational Mathematics, 193–202 (1998).
[43]
S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation 9, 1735–1780 (1997).
[26]
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser and I. Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017).
[34]
S. Fresca, L. Dede’ and A. Manzoni. A comprehensive deep learning-based approach to reduced order modeling of nonlinear time-dependent parametrized PDEs. Journal of Scientific Computing 87, 1–36 (2021).
[39]
K. Lee and K. T. Carlberg. Model reduction of dynamical systems on nonlinear manifolds using deep convolutional autoencoders. Journal of Computational Physics 404, 108973 (2020).
[44]
A. Hemmasian and A. Barati Farimani. Reduced-order modeling of fluid flows with transformers. Physics of Fluids 35 (2023).
[45]
A. Solera-Rico, C. S. Vila, M. Gómez, Y. Wang, A. Almashjary, S. Dawson and R. Vinuesa. $\beta$-Variational autoencoders and transformers for reduced-order modelling of fluid flows. arXiv preprint arXiv:2304.03571 (2023).
[32]
B. Brantner, G. de Romemont, M. Kraus and Z. Li. Volume-Preserving Transformers for Learning Time Series Data with Structure. arXiv preprint arXiv:2312.11166v2 (2024).
  • [1] We again assume that all the steps up to and including $t$ are known.
  • [2] GeometricMachineLearning also has an LSTM implementation, but this may be deprecated in the future.