Adjusting the Loss Function
GeometricMachineLearning provides a few standard loss functions that are used as defaults for specific neural networks: FeedForwardLoss for feedforward networks, AutoEncoderLoss for autoencoders and TransformerLoss for transformers (see the Library Functions section below).
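All of these are lightweight callable structs. As a rough sketch of how they are constructed (FeedForwardLoss() and TransformerLoss(seq_length, prediction_window) follow the docstrings below; the zero-argument AutoEncoderLoss constructor is an assumption):

```julia
using GeometricMachineLearning

ff_loss = FeedForwardLoss()   # no parameters
tf_loss = TransformerLoss(5)  # seq_length = 5; prediction_window defaults to seq_length
ae_loss = AutoEncoderLoss()   # assumed zero-argument constructor
```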
If these standard losses do not satisfy the user's needs, it is very easy to implement custom loss functions. We again consider training a SympNet, this time on data coming from a harmonic oscillator:
```julia
using GeometricMachineLearning
using GeometricIntegrators: integrate, ImplicitMidpoint
using GeometricProblems.HarmonicOscillator: hodeproblem
import Random
Random.seed!(123)

# integrate the harmonic oscillator with implicit midpoint and wrap the solution in a DataLoader
data = integrate(hodeproblem(; tspan = 100), ImplicitMidpoint()) |> DataLoader

nn = NeuralNetwork(GSympNet(2))
o = Optimizer(AdamOptimizer(), nn)
batch = Batch(32)
n_epochs = 30
loss = FeedForwardLoss()

loss_array = o(nn, data, batch, n_epochs, loss)
print(loss_array[end])
```
```
[ Info: You have provided a NamedTuple with keys q and p; the data are matrices. This is interpreted as *symplectic data*.

Progress: 100%|█████████████████████████████████████████| Time: 0:00:16
TrainingLoss: 0.0025667289897467697

0.0025667289897467697
```
And we see that the loss goes down to a very low value. But the user might want to constrain the norm of the network parameters:
```julia
using LinearAlgebra: norm

# norm of the parameters of a single layer (a NamedTuple of arrays)
network_parameter_norm(params::NamedTuple) = sum([norm(params[i]) for i in 1:length(params)])

# norm of the parameters of the entire network (a collection of such NamedTuples)
network_parameter_norm(params) = sum([network_parameter_norm(param) for param in params])

network_parameter_norm(nn.params)
```
```
4.637679510234861
```
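To see how the two methods play together, here is a small illustrative check on hand-built parameters (the field names weight and bias are hypothetical; only the NamedTuple/collection structure matters):

```julia
# a single layer's parameters: a NamedTuple of arrays
layer_params = (weight = ones(2, 2), bias = zeros(2))
network_parameter_norm(layer_params)                  # 2.0 = norm(ones(2, 2)) + norm(zeros(2))

# the entire network's parameters: a collection of such NamedTuples
network_parameter_norm((layer_params, layer_params))  # 4.0
```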
We now implement a custom loss that adds a penalty on this norm to the standard feedforward loss:

\[\mathrm{loss}^\mathrm{custom}_{\mathcal{NN}}(\mathrm{input}, \mathrm{output}) = \mathrm{loss}^\mathrm{feedforward}_{\mathcal{NN}}(\mathrm{input}, \mathrm{output}) + \lambda\,\mathrm{norm}(\mathcal{NN}\mathtt{.params}),\]

where we pick λ = 0.1:
```julia
struct CustomLoss <: GeometricMachineLearning.NetworkLoss end

function (loss::CustomLoss)(model::Chain, params::Tuple, input::CT, output::CT) where {
            T,
            AT <: AbstractArray{T, 3},
            CT <: @NamedTuple{q::AT, p::AT}
        }
    # standard feedforward loss plus the parameter-norm penalty with λ = 0.1
    FeedForwardLoss()(model, params, input, output) + 0.1 * network_parameter_norm(params)
end
```
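As a quick sanity check we can evaluate the functor directly. The dummy batch below is purely illustrative; its shape (dim ÷ 2) × seq_length × batch_size is an assumption about what the q and p tensors drawn by Batch look like for our two-dimensional system:

```julia
# hypothetical dummy batch: q and p are 1 × 1 × 10 tensors
dummy = (q = rand(1, 1, 10), p = rand(1, 1, 10))
CustomLoss()(nn.model, nn.params, dummy, dummy)  # returns a scalar loss value
```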
```julia
loss = CustomLoss()
nn_custom = NeuralNetwork(GSympNet(2))
# build a fresh optimizer for the new network so no Adam state is carried over from training nn
o = Optimizer(AdamOptimizer(), nn_custom)
loss_array = o(nn_custom, data, batch, n_epochs, loss)
print(loss_array[end])
```
```
Progress: 100%|█████████████████████████████████████████| Time: 0:00:02
TrainingLoss: 0.36431485858976587

0.36431485858976587
```
And we see that the norm of the parameters is a lot lower:
```julia
network_parameter_norm(nn_custom.params)
```

```
3.5511322877258724
```
We can also compare the solutions of the two networks:
```julia
using CairoMakie

textcolor = :black  # the docs set this elsewhere depending on the page theme; we fix it explicitly here

fig = Figure(; backgroundcolor = :transparent)
ax = Axis(fig[1, 1]; backgroundcolor = :transparent,
    bottomspinecolor = textcolor,
    topspinecolor = textcolor,
    leftspinecolor = textcolor,
    rightspinecolor = textcolor,
    xtickcolor = textcolor,
    ytickcolor = textcolor)

# iterate both networks for 100 steps starting from the same initial condition
init_con = [0.5, 0.0]
n_time_steps = 100
prediction1 = zeros(2, n_time_steps + 1)
prediction2 = zeros(2, n_time_steps + 1)
prediction1[:, 1] = init_con
prediction2[:, 1] = init_con

for i in 2:(n_time_steps + 1)
    prediction1[:, i] = nn(prediction1[:, i - 1])
    prediction2[:, i] = nn_custom(prediction2[:, i - 1])
end

lines!(ax, data.input.q[:], data.input.p[:], label = rich("Training Data"; color = textcolor))
lines!(ax, prediction1[1, :], prediction1[2, :], label = rich("FeedForwardLoss"; color = textcolor))
lines!(ax, prediction2[1, :], prediction2[2, :], label = rich("CustomLoss"; color = textcolor))
axislegend(ax)

fig
```
Library Functions
GeometricMachineLearning.NetworkLoss — Type

An abstract type for all the neural network losses. If you want to implement CustomLoss <: NetworkLoss you need to define a functor:

```julia
(loss::CustomLoss)(model, ps, input, output)
```

where model is an instance of an AbstractExplicitLayer or a Chain and ps are the parameters.
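A minimal sketch of this contract (ZeroLoss is a hypothetical name used only for illustration; it ignores its arguments and always returns zero):

```julia
using GeometricMachineLearning

struct ZeroLoss <: GeometricMachineLearning.NetworkLoss end

# the functor maps (model, parameters, input, output) to a scalar
(loss::ZeroLoss)(model, ps, input, output) = 0.0
```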
GeometricMachineLearning.FeedForwardLoss — Type

```julia
FeedForwardLoss()
```

Make an instance of a loss for feedforward neural networks. This doesn't have any parameters.
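A sketch of evaluating it directly, assuming the (model, params, input, output) call used in the tutorial above; the random q/p tensors are illustrative only:

```julia
using GeometricMachineLearning

nn = NeuralNetwork(GSympNet(2))
input  = (q = rand(1, 1, 8), p = rand(1, 1, 8))
output = (q = rand(1, 1, 8), p = rand(1, 1, 8))
FeedForwardLoss()(nn.model, nn.params, input, output)
```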
GeometricMachineLearning.AutoEncoderLoss — Type

This loss should always be used together with a neural network of type AutoEncoder (and it is also the default for training such a network). It simply computes:

\[\mathtt{AutoEncoderLoss}(nn\mathtt{::Loss}, input) = ||nn(input) - input||.\]
GeometricMachineLearning.TransformerLoss — Type

```julia
TransformerLoss(seq_length, prediction_window)
```

Make an instance of the transformer loss. This is the loss for a transformer network (especially a transformer integrator).

Parameters

The prediction_window specifies how many time steps are predicted into the future. It defaults to the value specified for seq_length.
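For example, following the signature above:

```julia
loss_default = TransformerLoss(5)     # prediction_window defaults to seq_length = 5
loss_short   = TransformerLoss(5, 3)  # predict only 3 time steps into the future
```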