Loss Functions (Part 4)

Let’s talk about loss functions. They are a rather abstract, yet extremely common, concept in machine learning. No neural network could work without a loss function. The whole discipline of Bayesian statistics wouldn’t really exist without them.

This is the fourth article of my Learning from Algorithms series. You can find the other parts here. I particularly recommend reading Part 1 on input and output, as it relates heavily to aspects of this article.

The idea behind loss functions is rather straightforward. If the input is changed a little, what happens to the output? Or in real life: if I change my behaviour in a certain manner, how does the situation change overall? Loss functions provide an abstracted answer to those questions that can be used in daily life.

A particularly interesting aspect of loss functions is their shape. We can visualise them, which helps us categorise real-life situations and gives guidance on how we need to approach each of them. We’ll now look at three different loss function shapes: symmetric, asymmetric, and nonlinear, each with its plot and examples.

First up is the symmetric loss function. Here, if the input changes a little, the output changes only a little as well. Crucially, this is true for every input! Let’s take the time at which I start cooking dinner as the input (on the horizontal x-axis). The output is the time at which I eat dinner (shown on the vertical y-axis). If the recipe and everything else stay the same, I will always eat 40 minutes after I start cooking. A few minutes’ delay in the start time only translates to a few minutes’ delay in the eating time, no matter if I originally intended to cook at 5pm or 9pm. We call this symmetric because each change in the input has the same effect on the output, wherever on the x-axis it happens.
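The dinner example can be sketched in a few lines of Python. This is only an illustration: the function name and the choice to measure time in minutes after midnight are assumptions made here, while the 40 minutes come from the example above.

```python
COOKING_MINUTES = 40  # from the example: dinner is ready 40 minutes after I start


def eating_time(start_minutes: int) -> int:
    """Return the eating time in minutes after midnight (hypothetical helper)."""
    return start_minutes + COOKING_MINUTES


# The same 5-minute delay is applied at 5pm and at 9pm.
for start in (17 * 60, 21 * 60):
    shift = eating_time(start + 5) - eating_time(start)
    print(f"start={start} min, delay of 5 min shifts dinner by {shift} min")
```

Whatever start time is chosen, the shift in the output equals the shift in the input, which is the property the symmetric shape describes.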

The second loss function we look at is the asymmetric one. Here, there exists a specific point at which a small change in the input can have a very different effect on the output. The example here is the arrival at the airport as the input, and the arrival at the destination as the output. The plot of this asymmetric loss function has two interesting aspects. First, if I arrive at the airport at 11am instead of noon, I will not arrive an hour early at my destination, as I will still have to take the very same flight. I just have to kill an additional hour by buying a snack at Pret and smelling perfumes, but the output does not change with the input. Second, there exists a cut-off point. If I arrive at the airport after 1pm, then I miss my flight and I won’t arrive at my destination at all! With an asymmetric loss function, a small change in the input can lead either to no change or to a huge shift in the output.
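A minimal Python sketch of the airport example might look like the following. The 1pm cut-off is taken from the text; the landing time of 4pm and the function name are hypothetical values chosen purely for illustration.

```python
from typing import Optional

CUTOFF_HOUR = 13.0   # from the example: arriving after 1pm means missing the flight
LANDING_HOUR = 16.0  # hypothetical landing time, assumed only for this sketch


def destination_arrival(airport_arrival_hour: float) -> Optional[float]:
    """Return the arrival hour at the destination, or None if the flight is missed."""
    if airport_arrival_hour > CUTOFF_HOUR:
        return None  # past the cut-off: no arrival at all
    return LANDING_HOUR  # before the cut-off: same flight, same arrival


for hour in (11.0, 12.0, 13.0, 13.1):
    print(hour, destination_arrival(hour))
```

Moving the input from 11am to noon changes nothing, while moving it a few minutes past the cut-off changes everything.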

The third and last loss function in this article is the nonlinear one. A striking example is the relationship between income and happiness. According to Nobel Prize winner Daniel Kahneman, the relationship is not as clear-cut as you might think. While it is true for lower incomes that more money does indeed rapidly increase happiness, once we reach a higher income the rate of increase slows down. Basic needs are taken care of, and more money therefore leads to a smaller increase in happiness. This is a distinct feature of the nonlinear loss function: the magnitude of change in the output depends on the specific location of the input. In our example, a change of +10k at 20k causes a big increase in happiness, whereas the same +10k does not have such a big influence if the starting salary is 150k.
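To make the diminishing effect concrete, here is a small Python sketch. It assumes, purely for illustration, that happiness grows with the logarithm of income; this curve is a stand-in chosen here and is not taken from Kahneman’s work, and the function names are hypothetical.

```python
import math


def happiness(income: float) -> float:
    """Toy happiness score: the logarithm of income (an illustrative assumption)."""
    return math.log(income)


def gain_from_raise(income: float, increase: float = 10_000) -> float:
    """Change in the toy happiness score after a raise of `increase`."""
    return happiness(income + increase) - happiness(income)


gain_low = gain_from_raise(20_000)    # +10k on a 20k salary
gain_high = gain_from_raise(150_000)  # +10k on a 150k salary
print(f"gain at 20k: {gain_low:.3f}, gain at 150k: {gain_high:.3f}")
```

Under this assumed curve, the same +10k produces a much larger gain at 20k than at 150k, which mirrors the shape described above without claiming to measure real happiness.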

As we have seen, symmetric, asymmetric, and nonlinear loss functions all warrant very different behaviours. If you are driving somewhere by car, it doesn’t matter much if you leave a few minutes late, as you will just arrive a few minutes late (symmetric loss). However, if you are going by train, being a few minutes late might cause you to miss your train and arrive hours late (asymmetric loss). Depending on your current salary, you might put less energy into chasing that promotion because you are already a high earner, or push even harder if you are on a lower salary (nonlinear loss).

While loss functions themselves do not improve your everyday life, they are a useful tool for reflecting on the situations you find yourself in. What kinds of situations challenge you, and what are their associated loss functions? What are their inputs, their outputs, and the shape of the loss function? How do you adapt, or how would you like to adapt, to each scenario? Use the knowledge you just acquired on loss functions to your advantage!
