This provides a way to do self-paced multi-task learning; or, put differently: for a model with multiple tasks, how should we weight the loss terms? If the weights are picked right, a model with multi-task targets can perform better than one focused on a single task.

This method uses “task uncertainty”, which the paper describes as:

> Task-dependent or homoscedastic uncertainty is aleatoric uncertainty which is not dependent on the input data. It is not a model output, rather it is a quantity which stays constant for all input data and varies between different tasks. It can therefore be described as task-dependent uncertainty.

So basically, even if the input is the same, the output would still vary; that inherent noise, which stays constant across inputs but differs between tasks, is this uncertainty.

The formula

Very, very simple. Recall that the regression loss, MSE (mean squared error), can be interpreted as the MLE of a Gaussian. Let $f^W(x)$ be the output of a neural network with weights $W$ on input $x$, and model the likelihood as $p(y \mid f^W(x)) = \mathcal{N}(f^W(x), \sigma^2)$. Then the negative log-likelihood is

$$-\log p(y \mid f^W(x)) \propto \frac{1}{2\sigma^2} \lVert y - f^W(x) \rVert^2 + \log \sigma.$$
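As a quick sanity check (my own illustration, not from the paper), PyTorch's built-in `GaussianNLLLoss` with a fixed variance is exactly a scaled-and-shifted MSE:

```python
import torch
import torch.nn as nn

# With a fixed variance, Gaussian NLL = MSE / (2*sigma^2) + 0.5*log(sigma^2)
# (up to eps clamping inside GaussianNLLLoss).
pred   = torch.randn(8)
target = torch.randn(8)
var    = torch.full_like(pred, 2.0)  # fixed sigma^2 = 2 for every point

nll = nn.GaussianNLLLoss(reduction="mean")(pred, target, var)
mse = nn.MSELoss(reduction="mean")(pred, target)

print(nll)
print(mse / (2 * 2.0) + 0.5 * torch.log(torch.tensor(2.0)))  # same value
```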

So if we have two regression targets, the joint loss is

$$\mathcal{L}(W, \sigma_1, \sigma_2) = \frac{1}{2\sigma_1^2} \mathcal{L}_1(W) + \frac{1}{2\sigma_2^2} \mathcal{L}_2(W) + \log \sigma_1 \sigma_2,$$

where $\mathcal{L}_i(W) = \lVert y_i - f^W(x) \rVert^2$ is the per-task MSE.

We learn $\sigma_1$ and $\sigma_2$ together with our usual weights $W$. Note that they cannot grow too big, because the last term regularizes them (and they cannot collapse to zero either, or the first terms blow up).
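Here's a minimal PyTorch sketch of this loss (my own sketch, not the authors' code). Following common practice, it learns $s_i = \log \sigma_i^2$ instead of $\sigma_i$ directly, which is numerically more stable:

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Combine per-task losses weighted by learned homoscedastic uncertainty.

    Learns s_i = log(sigma_i^2) per task; the combined loss is
    sum_i 0.5 * (exp(-s_i) * L_i + s_i),
    which matches 1/(2*sigma_i^2) * L_i + log(sigma_i).
    """
    def __init__(self, num_tasks: int):
        super().__init__()
        # Start at s_i = 0, i.e. sigma_i^2 = 1 (all tasks weighted equally).
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for s, loss in zip(self.log_vars, task_losses):
            total = total + 0.5 * (torch.exp(-s) * loss + s)
        return total

# Usage sketch: pass both the model's and the combiner's parameters to the
# optimizer, so the sigmas are learned jointly with W.
combiner = UncertaintyWeightedLoss(num_tasks=2)
loss1 = torch.tensor(0.8)  # stand-in for task-1 MSE
loss2 = torch.tensor(2.5)  # stand-in for task-2 MSE
print(combiner([loss1, loss2]))
```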

If there's a classification task, we can represent it with a temperature-scaled softmax, which is a Boltzmann distribution:

$$p(y \mid f^W(x), \sigma) = \mathrm{Softmax}\!\left(\frac{1}{\sigma^2} f^W(x)\right)$$
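To see what the temperature does (an illustrative snippet of mine, not from the paper): a larger $\sigma^2$ flattens the distribution, encoding higher task uncertainty:

```python
import torch
import torch.nn.functional as F

# Dividing the logits by sigma^2 acts as a temperature: larger sigma^2
# flattens the softmax, i.e. the task is treated as more uncertain.
logits = torch.tensor([2.0, 1.0, 0.1])
for sigma2 in (0.5, 1.0, 4.0):
    probs = F.softmax(logits / sigma2, dim=-1)
    print(f"sigma^2={sigma2}: {probs}")
```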

And you'll get another formula. See the OG paper (Kendall, Gal, and Cipolla, “Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics”, CVPR 2018) for the full derivation.
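For the record, if I'm reading the paper right, with one regression task ($\mathcal{L}_1$, MSE) and one classification task ($\mathcal{L}_2$, cross-entropy) the combined objective comes out as

$$\mathcal{L}(W, \sigma_1, \sigma_2) \approx \frac{1}{2\sigma_1^2} \mathcal{L}_1(W) + \frac{1}{\sigma_2^2} \mathcal{L}_2(W) + \log \sigma_1 + \log \sigma_2,$$

using an approximation that becomes exact as $\sigma_2 \to 1$.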