Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
How uncertainty is used to weigh different losses in tasks where multiple losses are used.
The authors learn the task uncertainty (homoscedastic uncertainty) and use it to weight the losses of the different tasks. So when a training objective combines multiple losses, such as $\mathcal{L}_{\text{total}} = \sum_i w_i \mathcal{L}_i$, the weight $w_i$ is not a hand-tuned hyperparameter as in traditional methods but is derived from a learnable task uncertainty. They demonstrate this with a network that jointly learns semantic segmentation and depth regression, hence a classification and a regression task.
When a network produces multiple outputs, as in this case, the outputs have different units and scales. This raises the problem of how to assign appropriate weights to the losses of the different outputs.
Homoscedastic Uncertainty: It is an aleatoric uncertainty that does not depend on the input data. It is not a model output; rather, it is a quantity that stays constant for all input data and varies between different tasks.
Let $\mathbf{f}^{\mathbf{W}}(\mathbf{x})$ be the output of a neural network with weights $\mathbf{W}$ on input $\mathbf{x}$. For regression tasks, we define our likelihood as a Gaussian with the model output as mean:

$$p\big(\mathbf{y} \mid \mathbf{f}^{\mathbf{W}}(\mathbf{x})\big) = \mathcal{N}\big(\mathbf{f}^{\mathbf{W}}(\mathbf{x}), \sigma^2\big)$$
where $\sigma$ is the observation noise scalar (also the task uncertainty in this case). Maximizing this likelihood is equivalent to minimizing the negative log-likelihood:

$$-\log p\big(\mathbf{y} \mid \mathbf{f}^{\mathbf{W}}(\mathbf{x})\big) \propto \frac{1}{2\sigma^2}\big\lVert \mathbf{y} - \mathbf{f}^{\mathbf{W}}(\mathbf{x}) \big\rVert^2 + \log \sigma
$$
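For completeness, this follows directly from taking the negative log of the Gaussian density; the only piece dropped is the constant term (a standard step, spelled out here for clarity):

$$-\log \mathcal{N}\big(\mathbf{y}; \mathbf{f}^{\mathbf{W}}(\mathbf{x}), \sigma^2\big) = \frac{1}{2\sigma^2}\big\lVert \mathbf{y} - \mathbf{f}^{\mathbf{W}}(\mathbf{x}) \big\rVert^2 + \log \sigma + \frac{1}{2}\log 2\pi$$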
For classification we often squash the model output through a softmax function, and sample from the resulting probability vector:

$$p\big(\mathbf{y} \mid \mathbf{f}^{\mathbf{W}}(\mathbf{x})\big) = \text{Softmax}\big(\mathbf{f}^{\mathbf{W}}(\mathbf{x})\big)$$
Here, for the classification likelihood, they squash a scaled version of the model output through the softmax function:

$$p\big(\mathbf{y} \mid \mathbf{f}^{\mathbf{W}}(\mathbf{x}), \sigma\big) = \text{Softmax}\left(\frac{1}{\sigma^2}\,\mathbf{f}^{\mathbf{W}}(\mathbf{x})\right)$$
This can be interpreted as a Boltzmann distribution (also called Gibbs distribution) where the input is scaled by $\sigma^2$ (often referred to as the temperature). The log-likelihood of this is given as

$$\log p\big(\mathbf{y} = c \mid \mathbf{f}^{\mathbf{W}}(\mathbf{x}), \sigma\big) = \frac{1}{\sigma^2} f_c^{\mathbf{W}}(\mathbf{x}) - \log \sum_{c'} \exp\left(\frac{1}{\sigma^2} f_{c'}^{\mathbf{W}}(\mathbf{x})\right)$$

with $f_c^{\mathbf{W}}(\mathbf{x})$ the $c$-th element of the vector $\mathbf{f}^{\mathbf{W}}(\mathbf{x})$.
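To make the temperature effect concrete, here is a minimal PyTorch sketch; the logits and the value of $\sigma^2$ are made-up illustrative numbers, not taken from the paper:

```python
import torch
import torch.nn.functional as F

# Hypothetical logits f^W(x) for a 4-class problem (illustrative values only).
logits = torch.tensor([2.0, 1.0, 0.5, -1.0])

# Plain softmax: temperature sigma^2 = 1.
print(F.softmax(logits, dim=0))

# Scaled softmax: a larger sigma^2 flattens the distribution,
# reflecting a more uncertain (higher-entropy) prediction.
sigma_sq = 4.0
print(F.softmax(logits / sigma_sq, dim=0))
```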
Now let $\mathbf{y}_1$ and $\mathbf{y}_2$ be the model outputs corresponding to regression and classification. Assuming the tasks are independent, the joint likelihood factorizes:

$$p\big(\mathbf{y}_1, \mathbf{y}_2 \mid \mathbf{f}^{\mathbf{W}}(\mathbf{x})\big) = p\big(\mathbf{y}_1 \mid \mathbf{f}^{\mathbf{W}}(\mathbf{x})\big) \cdot p\big(\mathbf{y}_2 \mid \mathbf{f}^{\mathbf{W}}(\mathbf{x})\big) = \mathcal{N}\big(\mathbf{y}_1; \mathbf{f}^{\mathbf{W}}(\mathbf{x}), \sigma_1^2\big) \cdot \text{Softmax}\big(\mathbf{y}_2 = c; \mathbf{f}^{\mathbf{W}}(\mathbf{x}), \sigma_2\big)$$
The loss for this is the negative log-likelihood of the above, which is given by:

$$\mathcal{L}(\mathbf{W}, \sigma_1, \sigma_2) = -\log p\big(\mathbf{y}_1, \mathbf{y}_2 = c \mid \mathbf{f}^{\mathbf{W}}(\mathbf{x})\big) \approx \frac{1}{2\sigma_1^2}\,\mathcal{L}_1(\mathbf{W}) + \frac{1}{\sigma_2^2}\,\mathcal{L}_2(\mathbf{W}) + \log \sigma_1 + \log \sigma_2$$

where $\mathcal{L}_1(\mathbf{W}) = \lVert \mathbf{y}_1 - \mathbf{f}^{\mathbf{W}}(\mathbf{x}) \rVert^2$ is the Euclidean loss of $\mathbf{y}_1$ and $\mathcal{L}_2(\mathbf{W}) = -\log \text{Softmax}\big(\mathbf{y}_2, \mathbf{f}^{\mathbf{W}}(\mathbf{x})\big)$ is the cross-entropy loss of $\mathbf{y}_2$. The last step uses the simplifying assumption $\frac{1}{\sigma_2} \sum_{c'} \exp\left(\frac{1}{\sigma_2^2} f_{c'}^{\mathbf{W}}(\mathbf{x})\right) \approx \left(\sum_{c'} \exp\left(f_{c'}^{\mathbf{W}}(\mathbf{x})\right)\right)^{\frac{1}{\sigma_2^2}}$, which becomes an equality as $\sigma_2 \to 1$.
Hence, we have mathematically derived a multi-task loss that weights each task by its learned uncertainty. Using this loss formulation for classification and regression, you can extend it to any combination of multi-task losses; a minimal implementation sketch is given below.
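Below is a minimal PyTorch sketch of this loss for one regression and one classification task, assuming the common reparameterization through $s = \log \sigma^2$ for numerical stability; the class and parameter names are my own, not from the paper:

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Combine per-task losses using learned homoscedastic uncertainty.

    We learn s_i = log(sigma_i^2) rather than sigma_i itself, so that
    exp(-s_i) = 1 / sigma_i^2 stays positive and the loss is stable
    even if the optimizer drives the uncertainty toward extreme values.
    """

    def __init__(self):
        super().__init__()
        # One learnable log-variance per task, initialized so sigma = 1.
        self.s_reg = nn.Parameter(torch.zeros(()))  # regression task
        self.s_cls = nn.Parameter(torch.zeros(()))  # classification task

    def forward(self, reg_loss: torch.Tensor, cls_loss: torch.Tensor) -> torch.Tensor:
        # Regression term:     1/(2 sigma_1^2) * L1 + log sigma_1
        # Classification term: 1/sigma_2^2     * L2 + log sigma_2
        # With s = log sigma^2: 1/sigma^2 = exp(-s) and log sigma = s / 2.
        total = 0.5 * torch.exp(-self.s_reg) * reg_loss + 0.5 * self.s_reg
        total = total + torch.exp(-self.s_cls) * cls_loss + 0.5 * self.s_cls
        return total
```

The log-variances must be optimized jointly with the network weights, e.g. `torch.optim.Adam(list(model.parameters()) + list(criterion.parameters()))` with `criterion = UncertaintyWeightedLoss()`.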
Using uncertainty to weigh the losses allows us to handle the different dimensions and units of the different tasks in a multi-task learning problem.
The uncertainties are learnable parameters and depend not on the input but on the task itself.
Intuitively, if the output of a task is more spread out, i.e. has larger units, then its $\sigma$ will be high, normalizing that task's loss. Dividing by $\sigma^2$ essentially normalizes the loss and removes the units from the equation.
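One way to see this normalization explicitly (a standard calculation, not spelled out in the notes above): minimizing the regression term over $\sigma$ gives $\sigma^2 = \mathcal{L}_1$, so

$$\min_{\sigma}\left(\frac{\mathcal{L}_1}{2\sigma^2} + \log \sigma\right) = \frac{1}{2} + \frac{1}{2}\log \mathcal{L}_1$$

i.e. at the optimum, the raw loss magnitude enters only through a logarithm, so tasks with very different scales end up contributing comparable amounts to the total objective.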