Encoder

Each encoder layer consists of four steps, applied in order:

  • Multi-head self-attention

  • Add and Norm (residual connection, then layer normalization)

  • Feed-forward network: 2 linear layers with a ReLU in between

  • Add and Norm

This encoder layer is stacked 6 times in the original paper, with each layer taking the previous layer's output as its input.
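
The structure above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function names (`encoder_layer`, `init_layer`, and so on) are made up for this example, the weights are random, the sizes are shrunk so it runs quickly, and the learned scale/shift in layer normalization, the attention biases, dropout, and positional encodings are all left out.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    # Normalize each position's feature vector (learned scale/shift omitted).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def multi_head_attention(x, p, n_heads):
    seq_len, d_model = x.shape
    d_head = d_model // n_heads

    def split(t):
        # (seq_len, d_model) -> (n_heads, seq_len, d_head)
        return t.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    # Self-attention: queries, keys and values all come from the same input.
    q, k, v = split(x @ p["wq"]), split(x @ p["wk"]), split(x @ p["wv"])
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (n_heads, seq, seq)
    heads = softmax(scores) @ v                          # (n_heads, seq, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ p["wo"]


def feed_forward(x, p):
    # Two linear layers with a ReLU in between.
    hidden = np.maximum(0.0, x @ p["w1"] + p["b1"])
    return hidden @ p["w2"] + p["b2"]


def encoder_layer(x, p, n_heads):
    x = layer_norm(x + multi_head_attention(x, p, n_heads))  # Add and Norm
    x = layer_norm(x + feed_forward(x, p))                   # Add and Norm
    return x


def init_layer(rng, d_model, d_ff):
    # Hypothetical random initialization, for illustration only.
    def w(*shape):
        return rng.normal(0.0, 0.02, size=shape)

    return {
        "wq": w(d_model, d_model), "wk": w(d_model, d_model),
        "wv": w(d_model, d_model), "wo": w(d_model, d_model),
        "w1": w(d_model, d_ff), "b1": np.zeros(d_ff),
        "w2": w(d_ff, d_model), "b2": np.zeros(d_model),
    }


def encoder(x, layers, n_heads):
    # Each layer consumes the previous layer's output.
    for p in layers:
        x = encoder_layer(x, p, n_heads)
    return x


# The paper uses d_model=512, d_ff=2048, 8 heads; smaller sizes are assumed here.
rng = np.random.default_rng(0)
d_model, d_ff, n_heads, n_layers = 64, 256, 8, 6
layers = [init_layer(rng, d_model, d_ff) for _ in range(n_layers)]
x = rng.normal(size=(10, d_model))  # 10 token embeddings
out = encoder(x, layers, n_heads)
print(out.shape)  # (10, 64)
```

Because every sub-layer maps a (sequence length, model dimension) array to an array of the same shape, the residual additions work without any projection and the same layer can be stacked any number of times.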
