Neural Network Integrators

In GeometricMachineLearning we can divide most neural network architectures (that are used for applications to physical systems) into two categories: autoencoders and integrators. An integrator, in its most general form, is a numerical scheme that approximates the flow of an ODE (see the section on the existence and uniqueness theorem). Traditionally these numerical schemes are constructed by defining a relationship between the known state $z^{(t)}$ at time step $t$ and the unknown state $z^{(t+1)}$ at the next time step [7, 41]:

\[ f(z^{(t)}, z^{(t+1)}) = 0.\]

One usually refers to such a relationship as an "integration scheme". If this relationship can be reformulated as

\[ z^{(t+1)} = g(z^{(t)}),\]

then we refer to the scheme as explicit; if it cannot be reformulated in this way, we refer to it as implicit. Implicit schemes are typically more expensive than explicit ones because an equation has to be solved for $z^{(t+1)}$ at every step. The Julia library GeometricIntegrators [42] offers a wide variety of integration schemes, both implicit and explicit.
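The distinction can be illustrated with the explicit and the implicit Euler method applied to the scalar test equation $\dot{z} = \lambda z$. The following is a minimal sketch in plain Python and does not use GeometricIntegrators; the test equation, the value of $\lambda$, the step size `h` and all function names are assumptions made for illustration. The implicit relation is solved here with a simple fixed-point iteration, which only converges for sufficiently small step sizes.

```python
def vector_field(z, lam=-2.0):
    """Right-hand side of the (assumed) test equation dz/dt = lam * z."""
    return lam * z


def explicit_euler_step(z, h):
    """Explicit scheme: z_next = g(z) can be evaluated directly."""
    return z + h * vector_field(z)


def implicit_euler_residual(z, z_next, h):
    """Implicit scheme written as f(z, z_next) = 0."""
    return z_next - z - h * vector_field(z_next)


def implicit_euler_step(z, h, tol=1e-14, max_iter=100):
    """Solve f(z, z_next) = 0 for z_next by fixed-point iteration."""
    z_next = z  # initial guess: the current state
    for _ in range(max_iter):
        z_new = z + h * vector_field(z_next)
        if abs(z_new - z_next) < tol:
            return z_new
        z_next = z_new
    return z_next


z0, h = 1.0, 0.1
z_explicit = explicit_euler_step(z0, h)
z_implicit = implicit_euler_step(z0, h)
```

The explicit step costs a single evaluation of the vector field, whereas the implicit step needs several evaluations inside the iteration. This is the additional expense referred to above.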

The neural network integrators in GeometricMachineLearning (the corresponding type is NeuralNetworkIntegrator) are all explicit integration schemes where the function $g$ above is modeled with a neural network.
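As a rough illustration of what "modeling $g$ with a neural network" means, the sketch below uses a tiny one-hidden-layer network with random, untrained weights as a stand-in for $g$ and applies it repeatedly to produce a trajectory. It is written in plain Python and is not the NeuralNetworkIntegrator API; the names `make_mlp` and `rollout`, the layer sizes and the residual form $z + \mathrm{NN}(z)$ are all assumptions chosen for this example.

```python
import math
import random


def make_mlp(dim, hidden, seed=0):
    """Return a map g: R^dim -> R^dim built from a small, untrained network.

    The weights are random; in practice they would be fitted to data.
    """
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden)] for _ in range(dim)]
    b2 = [0.0] * dim

    def g(z):
        # hidden layer with tanh activation
        h = [math.tanh(sum(w * x for w, x in zip(row, z)) + b)
             for row, b in zip(w1, b1)]
        # linear output layer, added to the input (residual form, an assumption)
        update = [sum(w * x for w, x in zip(row, h)) + b
                  for row, b in zip(w2, b2)]
        return [zi + ui for zi, ui in zip(z, update)]

    return g


def rollout(g, z0, n_steps):
    """Apply the explicit scheme z_next = g(z) repeatedly, starting from z0."""
    trajectory = [list(z0)]
    for _ in range(n_steps):
        trajectory.append(g(trajectory[-1]))
    return trajectory


g = make_mlp(dim=2, hidden=8)
trajectory = rollout(g, [1.0, 0.0], n_steps=5)
```

Because the scheme is explicit, each step is a single forward pass through the network and no equation has to be solved.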

Neural networks, as an alternative to traditional methods, are employed because of (i) potentially superior performance and (ii) an ability to learn unknown dynamics from data.

Multi-step methods

Multi-step methods [30, 31] are schemes of the form[1]:

\[ f(z^{(t - \mathtt{sl} + 1)}, z^{(t - \mathtt{sl} + 2)}, \ldots, z^{(t)}, z^{(t + 1)}, \ldots, z^{(t + \mathtt{pw})}) = 0,\]

where sl is short for sequence length (the number of known steps used as input) and pw is short for prediction window (the number of steps that are predicted). In contrast to traditional single-step methods, sl and pw can be greater than 1. An explicit multi-step method has the following form:

\[[z^{(t+1)}, \ldots, z^{(t+\mathtt{pw})}] = g(z^{(t - \mathtt{sl} + 1)}, \ldots, z^{(t)}).\]
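The way an explicit multi-step method advances a trajectory can be sketched as follows: the last sl known states are fed into $g$, the pw predicted states are appended, and the window slides forward. In this plain-Python sketch, `multi_step_rollout` and `linear_extrapolation` are hypothetical names, and a linear extrapolation of scalar states stands in for $g$ purely for illustration; it is not a neural network and not part of GeometricMachineLearning.

```python
def linear_extrapolation(window, pw):
    """Stand-in for g: extrapolate pw steps from the last two states of the window."""
    last = window[-1]
    slope = window[-1] - window[-2]
    return [last + k * slope for k in range(1, pw + 1)]


def multi_step_rollout(g, initial_states, sl, pw, n_total):
    """Extend a trajectory to n_total states with an explicit multi-step map g.

    g takes the sl most recent states and returns the next pw states.
    """
    if len(initial_states) < sl:
        raise ValueError("need at least sl known states to start")
    trajectory = list(initial_states)
    while len(trajectory) < n_total:
        window = trajectory[-sl:]      # z^(t - sl + 1), ..., z^(t)
        prediction = g(window)         # z^(t + 1), ..., z^(t + pw)
        if len(prediction) != pw:
            raise ValueError("g must return exactly pw states")
        trajectory.extend(prediction)
    return trajectory[:n_total]        # drop any surplus from the last window


sl, pw = 3, 2
trajectory = multi_step_rollout(
    lambda window: linear_extrapolation(window, pw),
    initial_states=[0.0, 1.0, 2.0],
    sl=sl,
    pw=pw,
    n_total=8,
)
```

With pw greater than 1, each evaluation of $g$ advances the trajectory by several steps at once, so fewer evaluations are needed to cover the same time interval.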

There are essentially two ways to construct multi-step methods with neural networks: the older approach uses recurrent neural networks such as long short-term memory cells (LSTMs, [43]); the newer one uses transformer neural networks [26]. Both approaches have been successfully employed to learn multi-step methods (see [34, 39] for the former and [32, 44, 45] for the latter). Because the transformer architecture exhibits superior performance on modern hardware and can be imbued with geometric properties, we recommend always using a transformer-derived architecture when dealing with time series[2].

Explicit multi-step methods derived from the transformer are always subtypes of the type TransformerIntegrator in GeometricMachineLearning. The standard transformer, the volume-preserving transformer and the linear symplectic transformer are implemented.

E. Hairer, C. Lubich and G. Wanner. Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations (Springer, 2006).
B. Leimkuhler and S. Reich. Simulating Hamiltonian Dynamics. No. 14 (Cambridge University Press, 2004).
K. Feng. The step-transition operators for multi-step methods of ODE's. Journal of Computational Mathematics, 193–202 (1998).
S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation 9, 1735–1780 (1997).
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser and I. Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017).
S. Fresca, L. Dede’ and A. Manzoni. A comprehensive deep learning-based approach to reduced order modeling of nonlinear time-dependent parametrized PDEs. Journal of Scientific Computing 87, 1–36 (2021).
K. Lee and K. T. Carlberg. Model reduction of dynamical systems on nonlinear manifolds using deep convolutional autoencoders. Journal of Computational Physics 404, 108973 (2020).
A. Hemmasian and A. Barati Farimani. Reduced-order modeling of fluid flows with transformers. Physics of Fluids 35 (2023).
A. Solera-Rico, C. S. Vila, M. Gómez, Y. Wang, A. Almashjary, S. Dawson and R. Vinuesa. $\beta$-Variational autoencoders and transformers for reduced-order modelling of fluid flows. arXiv preprint arXiv:2304.03571 (2023).
B. Brantner, G. de Romemont, M. Kraus and Z. Li. Volume-Preserving Transformers for Learning Time Series Data with Structure. arXiv preprint arXiv:2312.11166v2 (2024).
  • [1] We again assume that all the steps up to and including $t$ are known.
  • [2] GeometricMachineLearning also has an LSTM implementation, but this may be deprecated in the future.