---
language:
  - da
license: apache-2.0
---

# daT5-base

A smaller version of Google's [mT5-base](https://huggingface.co/google/mt5-base) model, reduced to include only the Danish embeddings.

## How to use

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("emillykkejensen/daT5-base")
model = AutoModel.from_pretrained("emillykkejensen/daT5-base")
```
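Since `AutoModel` loads the full encoder–decoder, one quick way to try the model is to run the encoder alone and inspect the hidden states. The snippet below is a minimal sketch; the Danish example sentence and the encoder-only call are illustrative and not part of the original card:

```python
import torch

# Illustrative Danish input sentence (assumption, not from the original card)
inputs = tokenizer("Dette er en dansk sætning.", return_tensors="pt")

# Run only the encoder to obtain token-level hidden states
with torch.no_grad():
    encoder_outputs = model.encoder(**inputs)

print(encoder_outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```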

## Further reading

- Gist showing (in Danish) how the embeddings are extracted
- Article by David Dale explaining how to do it
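To give a rough idea of what the linked write-ups describe, the sketch below shows the general vocabulary-trimming approach: keep only the embedding rows for tokens that occur in a Danish corpus and shrink the model accordingly. It is a simplified illustration under assumed names (how `keep_ids` is collected from a corpus is omitted), not the exact script used to build daT5-base:

```python
import torch
from transformers import MT5ForConditionalGeneration, MT5Tokenizer

# Load the original multilingual model (the starting point named in the card)
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-base")

# keep_ids: token ids observed when tokenizing a Danish corpus plus special
# tokens; a single sentence stands in here as a hypothetical placeholder.
keep_ids = sorted(set(tokenizer("Dette er blot et eksempel.")["input_ids"]))

# Copy the surviving rows of the shared input embedding matrix
old_emb = model.get_input_embeddings().weight.data
new_emb = torch.nn.Embedding(len(keep_ids), old_emb.shape[1])
new_emb.weight.data = old_emb[keep_ids]
model.set_input_embeddings(new_emb)

# mT5 uses an untied lm_head, so its rows are trimmed the same way
old_head = model.lm_head.weight.data
model.lm_head = torch.nn.Linear(old_head.shape[1], len(keep_ids), bias=False)
model.lm_head.weight.data = old_head[keep_ids]

model.config.vocab_size = len(keep_ids)
# The tokenizer's SentencePiece vocabulary must be reduced to match as well,
# which the gist and article above cover.
```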

## Also check out

- [daT5-large](https://huggingface.co/emillykkejensen/daT5-large)