
Encoder

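Each encoder layer consists of the following sublayers, in order:

Multi-head self-attention
Add and Norm (residual connection followed by layer normalization)
Feed-forward network: two linear layers with a ReLU in between
Add and Norm (residual connection followed by layer normalization)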

This encoder layer is stacked 6 times in the original paper, with each layer's output feeding the next.
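
The structure above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names, the parameter dictionary layout, and the small sizes (d_model = 16, d_ff = 64, 4 heads) are assumptions chosen for readability, the weights are random, and dropout, attention masking, and the learned scale and bias of layer normalization are left out.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize each position's feature vector (no learned gain/bias in this sketch).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(x, p, num_heads):
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ p["wq"]), split(x @ p["wk"]), split(x @ p["wv"])
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    heads = softmax(scores) @ v                          # (heads, seq, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ p["wo"]

def feed_forward(x, p):
    # Two linear layers with a ReLU in between.
    hidden = np.maximum(0.0, x @ p["w1"] + p["b1"])
    return hidden @ p["w2"] + p["b2"]

def encoder_layer(x, p, num_heads):
    x = layer_norm(x + multi_head_attention(x, p, num_heads))  # Add and Norm
    x = layer_norm(x + feed_forward(x, p))                     # Add and Norm
    return x

def init_params(rng, d_model, d_ff):
    # Hypothetical random initialization, only so the sketch can run.
    def w(*shape):
        return rng.normal(0.0, 0.02, size=shape)
    return {
        "wq": w(d_model, d_model), "wk": w(d_model, d_model),
        "wv": w(d_model, d_model), "wo": w(d_model, d_model),
        "w1": w(d_model, d_ff), "b1": np.zeros(d_ff),
        "w2": w(d_ff, d_model), "b2": np.zeros(d_model),
    }

def encoder(x, layers, num_heads):
    # Each layer has its own parameters; the output of one feeds the next.
    for p in layers:
        x = encoder_layer(x, p, num_heads)
    return x

rng = np.random.default_rng(0)
d_model, d_ff, num_heads, num_layers = 16, 64, 4, 6
layers = [init_params(rng, d_model, d_ff) for _ in range(num_layers)]
x = rng.normal(size=(5, d_model))  # 5 positions, already embedded
out = encoder(x, layers, num_heads)
print(out.shape)  # (5, 16)
```

The output has the same shape as the input, which is what lets identical layers be stacked one after another.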

