language: de
license: mit
tags:
  - pytorch
  - causal-lm
datasets:
  - c4

Cedille AI

Cedille is a project to bring large language models to non-English languages.

de-anna

Anna is a 6B-parameter autoregressive language model based on the GPT-J architecture and trained using the mesh-transformer-jax codebase.

Anna was trained on German text using a methodology similar to that of Boris, our French model. Training started from GPT-J, which was trained on The Pile. As a consequence, the model still performs well in English. Anna uses the unmodified GPT-2 tokenizer.

How to run

TO DO
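
Until official instructions are published, the following is a minimal sketch of how a GPT-J-based causal language model is typically loaded and sampled with the Hugging Face `transformers` library. It rests on assumptions not confirmed by this README: that the weights are available in a Transformers-compatible format, and that the Hub repository ID is `Cedille/de-anna` (a guess based on the repository name). The helper names `generation_settings` and `generate`, as well as the sampling values, are illustrative only.

```python
# Assumption: Hub repository ID; not confirmed by this README.
MODEL_ID = "Cedille/de-anna"


def generation_settings(max_new_tokens=50, temperature=0.8, top_p=0.9):
    """Collect illustrative sampling settings to pass to `model.generate`."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }


def generate(prompt, model_id=MODEL_ID, **overrides):
    """Load the model and return the prompt followed by a sampled continuation."""
    # Imported lazily so the settings helper works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Use half precision on a GPU if one is available, full precision on CPU.
    use_gpu = torch.cuda.is_available()
    device = "cuda" if use_gpu else "cpu"
    dtype = torch.float16 if use_gpu else torch.float32

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
    model.to(device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            # The GPT-2 tokenizer defines no padding token, so reuse end-of-text.
            pad_token_id=tokenizer.eos_token_id,
            **generation_settings(**overrides),
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `generate("Die Hauptstadt von Deutschland ist", max_new_tokens=20)` would download the weights on first use and return the prompt with a sampled continuation. A 6B-parameter model needs roughly 12 GB of memory for the weights alone in half precision, so a GPU with sufficient memory is advisable.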

Contact us

For custom development, please contact us at hello@cedille.ai.

Links