---
language: de
license: mit
tags:
- pytorch
- causal-lm
datasets:
- c4
---
# Cedille AI
Cedille is a project to bring large language models to non-English languages.
## de-anna
Anna is a 6B-parameter autoregressive language model based on the GPT-J architecture and trained using the [mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax) codebase.
Anna was trained on German text with a methodology similar to that of [Boris](https://huggingface.co/Cedille/fr-boris), our French model. We started training from GPT-J, which was trained on [The Pile](https://pile.eleuther.ai/). As a consequence, the model still performs well in English. Anna uses the unmodified GPT-2 tokenizer.
## How to run
TO DO
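Until official instructions are published, here is a minimal sketch of how a GPT-J checkpoint like Anna can typically be loaded with the Hugging Face `transformers` library. It is not an official recipe. It assumes the weights and tokenizer are published under the hub id `Cedille/de-anna` (inferred from this repository's name) and load through the standard `AutoModelForCausalLM` / `AutoTokenizer` classes. The helper name `generate` and the sampling values are illustrative only.

```python
# Assumed hub id, inferred from this repository's name; adjust if it differs.
MODEL_ID = "Cedille/de-anna"


def generate(prompt, max_new_tokens=50, model_id=MODEL_ID):
    """Sample a continuation of `prompt` (hypothetical helper, not an official API)."""
    # Imported lazily so this file can be imported without torch/transformers.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Use half precision on GPU to roughly halve memory; fall back to float32 on CPU.
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"
    dtype = torch.float16 if use_cuda else torch.float32

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
    model.to(device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            do_sample=True,       # illustrative sampling settings, not tuned values
            top_p=0.9,
            temperature=0.8,
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 tokenizer has no pad token
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Berlin ist die Hauptstadt von")` downloads the checkpoint on first use. A 6B-parameter model needs roughly 12 GB of memory in float16 and about twice that in float32, so a GPU is advisable.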
## Contact us
For any custom development please contact us at hello@cedille.ai.
## Links
* [Official website](https://en.cedille.ai/)
* [Blog](https://en.cedille.ai/blog)
* [GitHub](https://github.com/coteries/cedille-ai)
* [Twitter](https://twitter.com/CedilleAI)