What is positional encoding, and why is it essential in Transformers?
Positional encoding adds position-dependent signals to the input embeddings of a Transformer to convey the order of elements in a sequence. It is essential because self-attention is permutation-invariant: it produces the same set of outputs regardless of how the inputs are ordered, so without positional information the model would treat a sentence as an unordered bag of tokens. Injecting position information into each embedding lets the model distinguish, for example, the first token from the last.
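For concreteness, here is a minimal NumPy sketch of the fixed sinusoidal scheme from the original Transformer paper (Vaswani et al., 2017), where PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)); the function name and shapes are illustrative, and many models instead learn positional embeddings:

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of sinusoidal position encodings.

    Assumes d_model is even, as in the original paper's formulation.
    """
    positions = np.arange(seq_len)[:, np.newaxis]        # shape (seq_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]       # shape (1, d_model / 2)
    angles = positions / np.power(10000.0, dims / d_model)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions use sine
    pe[:, 1::2] = np.cos(angles)   # odd dimensions use cosine
    return pe

# Usage: the encoding is simply added elementwise to the token embeddings
# before the first Transformer layer, e.g.
#   inputs = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)
```

Each position gets a unique pattern of sinusoids at different frequencies, and because sin and cos of shifted positions are linear combinations of each other, the model can in principle attend to relative offsets as well as absolute positions.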