OratioAI

Sequence-to-sequence language translation, implementing the methods outlined in "Attention Is All You Need".

  1. Input Tokenization: The source and target sentences are tokenized using custom WordPiece tokenizers. Tokens are mapped to embeddings via the InputEmbeddings module and scaled by √d_model (a sketch of steps 1 and 2 follows this list).
  2. Positional Encoding: Positional information is added to token embeddings using a fixed sinusoidal encoding strategy.
  3. Encoding Phase: The encoder processes the source sequence, transforming token embeddings into contextual representations using stacked EncoderBlock modules (see the encoder block sketch after this list).
  4. Decoding Phase: The decoder autoregressively generates target tokens, attending to previously generated tokens through masked self-attention and to encoder outputs through cross-attention layers that align source and target sequences (see the masking sketch after this list).
  5. Projection: Final decoder outputs are projected into the target vocabulary space for token prediction.
  6. Output Generation: Decoding is performed with beam search or a greedy approach to produce the final translated sentence (a greedy-decoding sketch follows this list).
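
A minimal sketch of steps 1 and 2, assuming a PyTorch implementation; the module and parameter names mirror the list above but are illustrative, not necessarily the repository's exact code:

```python
import math
import torch
import torch.nn as nn

class InputEmbeddings(nn.Module):
    """Token embeddings scaled by sqrt(d_model), as in the paper (Section 3.4)."""
    def __init__(self, d_model: int, vocab_size: int):
        super().__init__()
        self.d_model = d_model
        self.embedding = nn.Embedding(vocab_size, d_model)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.embedding(tokens) * math.sqrt(self.d_model)

class PositionalEncoding(nn.Module):
    """Fixed sinusoidal positional encoding added to the token embeddings."""
    def __init__(self, d_model: int, max_len: int = 5000):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)                 # (max_len, 1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)                  # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)                  # odd dimensions
        self.register_buffer("pe", pe.unsqueeze(0))                   # (1, max_len, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); add the encoding for the first seq_len positions
        return x + self.pe[:, : x.size(1)]
```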
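For step 3, a sketch of one encoder block under the same assumptions, using PyTorch's built-in nn.MultiheadAttention in place of whatever attention implementation the repository uses (post-norm residuals, as in the original paper):

```python
class EncoderBlock(nn.Module):
    """Self-attention followed by a feed-forward network, each with residual + layer norm."""
    def __init__(self, d_model: int, n_heads: int, d_ff: int, dropout: float = 0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor, src_key_padding_mask=None) -> torch.Tensor:
        # Self-attention sub-layer with residual connection
        attn_out, _ = self.attn(x, x, x, key_padding_mask=src_key_padding_mask)
        x = self.norm1(x + self.dropout(attn_out))
        # Position-wise feed-forward sub-layer with residual connection
        x = self.norm2(x + self.dropout(self.ff(x)))
        return x
```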
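For step 4, the causal (look-ahead) mask that restricts decoder self-attention to previous positions; the helper name causal_mask is hypothetical:

```python
import torch

def causal_mask(size: int) -> torch.Tensor:
    """Boolean mask: position i may attend only to positions <= i."""
    return torch.tril(torch.ones(size, size, dtype=torch.bool))

# Example: a 4-token target sequence.
print(causal_mask(4))
# tensor([[ True, False, False, False],
#         [ True,  True, False, False],
#         [ True,  True,  True, False],
#         [ True,  True,  True,  True]])
```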
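For steps 5 and 6, a greedy-decoding sketch reusing causal_mask from above; the model interface (encode, decode, project) and the special-token ids are assumptions, not the project's confirmed API:

```python
import torch

@torch.no_grad()
def greedy_decode(model, src, src_mask, sos_id: int, eos_id: int, max_len: int = 128):
    """Autoregressively pick the highest-probability token at each step."""
    memory = model.encode(src, src_mask)                    # encoder outputs (step 3)
    ys = torch.tensor([[sos_id]], device=src.device)        # start with <sos>
    for _ in range(max_len - 1):
        tgt_mask = causal_mask(ys.size(1)).to(src.device)
        out = model.decode(memory, src_mask, ys, tgt_mask)  # decoder states (step 4)
        logits = model.project(out[:, -1])                  # vocabulary projection (step 5)
        next_token = logits.argmax(dim=-1, keepdim=True)
        ys = torch.cat([ys, next_token], dim=1)
        if next_token.item() == eos_id:                     # stop at <eos>
            break
    return ys.squeeze(0)
```

Beam search follows the same loop but keeps the k highest-scoring partial hypotheses at each step instead of a single argmax.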
| Resource | Description |
| --- | --- |
| Training Space | Hugging Face Space for training and testing the model. |
| GitHub Source Code | Source code repository for the translation project. |
| Attention Is All You Need | Original paper on the Transformer architecture, published by Google. |
| Dataset | Description |
| --- | --- |
| Dataset | Dataset used for main model training. |