Inductive Bias

10/03/2024

In machine learning, inductive bias refers to the set of assumptions a learning algorithm makes in order to generalize beyond the data it was trained on. These assumptions guide the algorithm toward the hypothesis or model it considers most likely, given the available training data.

Inductive Bias of Convolutional Neural Networks
- Locality: CNNs assume that nearby pixels in an image are more likely to be related or contain important patterns. This is why convolutional layers use small filters (kernels) that focus on local regions of the input image. The assumption is that the most meaningful features (like edges or textures) can be extracted from these local regions.
- Translation Invariance: CNNs also assume that important features can appear anywhere in the image and still be relevant. Because the same filters (shared weights) are applied across the entire image, a feature is detected in the same way wherever it occurs. Strictly speaking, this makes convolutional layers translation equivariant, and pooling adds approximate invariance on top. As a result, a CNN can recognize an object even if it is shifted to a different part of the image.
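
The two assumptions above can be illustrated with a toy 1D convolution. This is only a sketch in plain Python, not how a real CNN library is implemented: the kernel, the signal, and the helper names `conv1d` and `shift_right` are made up for illustration, and the equality holds exactly here only because the pattern stays away from the borders.

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation: each output sees only len(kernel) neighbours."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def shift_right(values, steps):
    """Shift a list right by `steps`, padding with zeros on the left."""
    return [0] * steps + values[: len(values) - steps]


# Hypothetical edge-like kernel and a toy signal containing one "bump".
kernel = [-1, 0, 1]
signal = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
shifted = shift_right(signal, 3)

# The same kernel (shared weights) is slid over both inputs.
features = conv1d(signal, kernel)
features_shifted = conv1d(shifted, kernel)

# Equivariance: shifting the input shifts the feature map by the same amount.
equivariant = features_shifted == shift_right(features, 3)

# Invariance after global max pooling: the pooled value ignores position.
invariant = max(features) == max(features_shifted)

print(features)
print(features_shifted)
print(equivariant, invariant)
```

Each output value depends on only three neighbouring inputs (locality), and the same three weights are reused at every position (weight sharing).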

Inductive Bias of Transformers
- Attention: In a Transformer, every token can attend to every other token, regardless of their distance in the sequence. The assumption is that any pair of tokens may be related, and the model learns during training which of those relationships matter.
- Positional Encoding: Attention layers by themselves do not assume that token positions matter, so positional encoding is added to give the model information about the order of tokens in a sequence. This lets the model recognize that the meaning of a sentence changes with word order.
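
A small sketch can make the role of positional encoding concrete. It assumes a heavily simplified single-head self-attention with identity query/key/value projections, and a toy two-dimensional sinusoidal position signal; the names `self_attention`, `add_positions`, and `rows_close` are hypothetical helpers, not part of any library. Without position information, reordering the tokens merely reorders the outputs; with it, the outputs change.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head attention with identity Q/K/V projections (a simplification)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Every token scores every other token, regardless of distance.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


def add_positions(tokens):
    """Add a toy position signal [sin(pos), cos(pos)] to 2-dimensional tokens."""
    return [
        [t[0] + math.sin(pos), t[1] + math.cos(pos)]
        for pos, t in enumerate(tokens)
    ]


def rows_close(a, b, tol=1e-9):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
perm = [2, 0, 1]
permuted = [tokens[i] for i in perm]

# Without positions: permuting the input just permutes the output rows.
out = self_attention(tokens)
out_perm = self_attention(permuted)
order_blind = rows_close(out_perm, [out[i] for i in perm])

# With positions: the same reordering now produces different outputs.
out_pos = self_attention(add_positions(tokens))
out_pos_perm = self_attention(add_positions(permuted))
order_aware = not rows_close(out_pos_perm, [out_pos[i] for i in perm])

print(order_blind, order_aware)
```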

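Inductive Bias of MLPs
- Minimal Bias: MLPs make almost no assumptions about the structure of the data (for example, no locality or ordering between input features), which means they can be applied to a wide variety of problems. However, it also means they lack specialized mechanisms for handling certain types of data, like images or sequences, and must learn any such structure from the data itself.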
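
One rough way to see the cost of having so little built-in structure is to count parameters. The sketch below compares a fully connected layer with a convolutional layer on a hypothetical 32x32 RGB input. The layer sizes and the helper names `dense_params` and `conv2d_params` are assumptions chosen for illustration, and the two layers are not equivalent in what they compute.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer: every input feeds every unit."""
    return n_in * n_out + n_out


def conv2d_params(kernel_h, kernel_w, c_in, c_out):
    """Weights plus biases of a conv layer: one small kernel shared across positions."""
    return kernel_h * kernel_w * c_in * c_out + c_out


# Hypothetical input: a 32x32 image with 3 channels.
height, width, channels = 32, 32, 3

# Dense layer with 64 units on the flattened image: no locality, no weight sharing.
dense = dense_params(height * width * channels, 64)

# Conv layer with 64 filters of size 3x3: locality and weight sharing built in.
conv = conv2d_params(3, 3, channels, 64)

print(dense, conv)
```

The dense layer has a separate weight for every pixel and unit pair, so it has to learn from data anything a convolution assumes up front.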