metadata
language: de
license: mit
tags:
- pytorch
- causal-lm
datasets:
- c4
Cedille AI
Cedille is a project to bring large language models to non-English languages.
de-anna
Anna is a 6B parameter autoregressive language model based on the GPT-J architecture and trained using the mesh-transformer-jax codebase.
Anna was trained on German text using a methodology similar to that of Boris, our French model. Training started from GPT-J, which was trained on The Pile. As a consequence, the model still performs well in English. Anna uses the unmodified GPT-2 tokenizer.
How to run
TO DO
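A minimal sketch of how the model might be run with the Hugging Face `transformers` library, assuming the checkpoint is published in a GPT-J compatible format. The repository identifier `Cedille/de-anna`, the helper names `load_model` and `generate_text`, and the sampling values are illustrative assumptions, not settings confirmed by this card.

```python
# Assumed repository identifier; replace it with the actual checkpoint location.
MODEL_ID = "Cedille/de-anna"


def load_model(model_id=MODEL_ID):
    """Load the tokenizer and the causal language model (hypothetical helper)."""
    # Imported here so the heavy dependency is only needed when loading weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # A 6B-parameter model needs a large amount of memory to load.
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return model, tokenizer


def generate_text(model, tokenizer, prompt, max_new_tokens=50):
    """Sample a continuation of `prompt` and return it as a string."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,      # sample instead of greedy decoding
        top_p=0.9,           # assumed value, not tuned for this model
        temperature=0.8,     # assumed value, not tuned for this model
        # The GPT-2 tokenizer has no pad token, so reuse the end-of-text token.
        pad_token_id=tokenizer.eos_token_id,
    )
    # The returned sequence contains the prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    model, tokenizer = load_model()
    print(generate_text(model, tokenizer, "Berlin ist eine Stadt, die"))
```

Because training started from GPT-J, the same call should also accept English prompts.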
Contact us
For any custom development, please contact us at hello@cedille.ai.