Label Smoothing
-
Label smoothing prevents the model from becoming overconfident in its predictions.
-
A regularization technique for generalization and calibration.
-
The label smoothing formula is:
\[
y_{\text{smooth}} = y \, (1 - \alpha) + \frac{\alpha}{K}
\]
where \(y\) is the one-hot label, \(K\) is the number of classes, and \(\alpha\) is the smoothing parameter, usually a small value like \(0.1\).
-
When the number of classes is \(K = 5\) and \(\alpha = 0.1\),
the one-hot encoded label \(y = [0, 1, 0, 0, 0]\) becomes \([0.02, 0.92, 0.02, 0.02, 0.02]\).
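Checking the true-class entry explicitly:
\[
1 \cdot (1 - 0.1) + \frac{0.1}{5} = 0.9 + 0.02 = 0.92
\]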
-
The \(\frac{\alpha}{K}\) term spreads a probability mass of \(\alpha\) uniformly across all \(K\) classes, so every class, including the true one, receives at least \(\frac{\alpha}{K}\).
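Equivalently, label smoothing mixes the one-hot label with the uniform distribution over classes:
\[
y_{\text{smooth}} = (1 - \alpha) \, y + \alpha \, u, \qquad u_k = \frac{1}{K}
\]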
-
The cross-entropy loss with label smoothing is:
\[
\mathcal{L}(y, \hat{y}) = -\sum_{k=1}^{K} y_{\text{smooth},k} \, \log(\hat{y}_k)
\]
where \(y_{\text{smooth},k}\) is the smoothed label for class \(k\), and \(\hat{y}_k\) is the predicted probability for class \(k\).
import torch

def smooth_labels(labels: torch.Tensor, k: int, smoothing: float = 0.1) -> torch.Tensor:
    """Applies label smoothing to reduce model overconfidence."""
    # Shrink the one-hot targets by (1 - alpha) and add alpha / k to every class.
    return labels * (1 - smoothing) + smoothing / k
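A minimal usage sketch, assuming the smooth_labels function above and a hypothetical logits tensor standing in for model outputs; it reproduces the \(K = 5\), \(\alpha = 0.1\) example and evaluates the cross-entropy formula:

import torch

# One-hot label from the example above: K = 5 classes, true class at index 1.
labels = torch.tensor([[0., 1., 0., 0., 0.]])
smoothed = smooth_labels(labels, k=5, smoothing=0.1)
# smoothed == [[0.02, 0.92, 0.02, 0.02, 0.02]]

logits = torch.randn(1, 5)                     # hypothetical model outputs
log_probs = torch.log_softmax(logits, dim=-1)  # log(y_hat_k)
loss = -(smoothed * log_probs).sum(dim=-1).mean()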