Adjusting the Loss Function
GeometricMachineLearning provides a few standard loss functions that are used as defaults for specific neural networks.
If these standard losses do not satisfy the user's needs, it is straightforward to implement custom loss functions. Adding terms to the loss function is standard practice in machine learning, either to increase stability [43] or to inform the network about physical properties[1] [71].
We again consider training a SympNet on the data coming from a harmonic oscillator:
using GeometricMachineLearning
using GeometricIntegrators: integrate, ImplicitMidpoint
using GeometricProblems.HarmonicOscillator: hodeproblem
# integrate the harmonic oscillator and store the solution in a DataLoader
sol = integrate(hodeproblem(; tspan = 100), ImplicitMidpoint())
data = DataLoader(sol; suppress_info = true)
nn = NeuralNetwork(GSympNet(2))
# set up the optimizer and train the network with the standard feed-forward loss
o = Optimizer(AdamOptimizer(), nn)
batch = Batch(32)
n_epochs = 30
loss = FeedForwardLoss()
loss_array = o(nn, data, batch, n_epochs, loss; show_progress = false)
print(loss_array[end])
0.002621408481174374
And we see that the loss goes down to a very low value. But the user might want to constrain the norm of the network parameters:
using LinearAlgebra: norm

# norm of the parameters of a single layer
network_parameter_norm(params::NamedTuple) = sum([norm(params[i]) for i in 1:length(params)])

# norm of the parameters of the entire network
function network_parameter_norm(params::NeuralNetworkParameters)
    sum([network_parameter_norm(params[key]) for key in keys(params)])
end
network_parameter_norm(params(nn))
4.638560763436519
We now implement a custom loss such that:
\[ \mathrm{loss}_\mathcal{NN}^\mathrm{custom}(\mathrm{input}, \mathrm{output}) = \mathrm{loss}_\mathcal{NN}^\mathrm{feedforward} + \lambda \mathrm{norm}(\mathcal{NN}\mathtt{.params}).\]
struct CustomLoss <: GeometricMachineLearning.NetworkLoss end

# regularization parameter that weights the parameter-norm penalty
const λ = .1

function (loss::CustomLoss)(model::Chain, params::NeuralNetworkParameters, input::CT, output::CT) where {
        T,
        AT<:AbstractArray{T, 3},
        CT<:@NamedTuple{q::AT, p::AT}
    }
    FeedForwardLoss()(model, params, input, output) + λ * network_parameter_norm(params)
end
And we train a new network, with the same architecture as before, with this new loss:
loss = CustomLoss()
nn_custom = NeuralNetwork(GSympNet(2))
# set up a fresh optimizer instance for the new network
o = Optimizer(AdamOptimizer(), nn_custom)
loss_array = o(nn_custom, data, batch, n_epochs, loss; show_progress = false)
print(loss_array[end])
0.19204312633659473
We see that the norm of the parameters is lower:
network_parameter_norm(params(nn_custom))
1.8597433611578906
We can also compare the solutions of the two networks:
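The following is only a sketch of such a comparison: it assumes the trained SympNets can be rolled out with the iterate routine (taking an initial condition (q, p) and a keyword n_points), the initial condition and rollout length are hypothetical choices, and Plots.jl is used for visualization:

using Plots

# hypothetical initial condition and rollout length
ics = (q = [0.5], p = [0.0])
n_points = 200

# roll out both trained networks from the same initial condition
trajectory_standard = iterate(nn, ics; n_points = n_points)
trajectory_custom = iterate(nn_custom, ics; n_points = n_points)

# compare the predicted q coordinates of the two networks
plot(vec(trajectory_standard.q); label = "FeedForwardLoss")
plot!(vec(trajectory_custom.q); label = "CustomLoss")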
With the second loss function, for which the norm of the resulting network parameters is lower, the network still performs well, albeit slightly worse than the network trained with the first loss.
References
- [71] M. Raissi, P. Perdikaris and G. E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686–707 (2019).

[1] Note however that we discourage using so-called physics-informed neural networks, as they do not preserve any physical properties but only give a potential improvement in stability in the region where we have training data.