---
language: 
- da
license: apache-2.0
---
## daT5-base
A smaller version of [Google's mt5-base](https://huggingface.co/google/mt5-base) model, where the original model is reduced to include only the Danish embeddings.

## How to use
```python
from transformers import AutoTokenizer, AutoModel

# Load the reduced Danish tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("emillykkejensen/daT5-base")
model = AutoModel.from_pretrained("emillykkejensen/daT5-base")
```
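
If the checkpoint retains the seq2seq language-modelling head from mt5-base, it can also be loaded for generation. A minimal sketch, assuming `AutoModelForSeq2SeqLM` resolves this checkpoint; the example sentence is illustrative, and since mT5 is pretrained on span corruption only, the model should be fine-tuned before downstream use:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("emillykkejensen/daT5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("emillykkejensen/daT5-base")

# mT5 is pretrained on span corruption, so raw generation mainly fills in
# masked spans marked with sentinel tokens such as <extra_id_0>.
text = "Danmark er et land i <extra_id_0> Europa."  # "Denmark is a country in <extra_id_0> Europe."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```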

## Further reading

[Gist](https://gist.github.com/emillykkejensen/8bf1b323495efc7252dee966e6bc1b5c) showing (in Danish) how the embeddings are extracted

[Article](https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90) explaining how to do it by [David Dale](https://huggingface.co/cointegrated)
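
For orientation, here is a hedged sketch of the general vocabulary-reduction idea those resources describe: tokenize a Danish corpus, collect the sentencepiece ids that actually occur, and keep only those rows of the embedding matrix. The corpus file name and the steps below are illustrative, not the precise procedure used for this model (which also rebuilds the sentencepiece vocabulary and the LM head).

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-base")

# Always keep the special tokens (pad, eos, unk and the <extra_id_*> sentinels).
keep_ids = set(tokenizer.all_special_ids)

# Hypothetical corpus file: collect every token id the Danish text actually uses.
with open("danish_corpus.txt", encoding="utf-8") as f:
    for line in f:
        keep_ids.update(tokenizer(line.strip())["input_ids"])

keep_ids = sorted(keep_ids)

# Slice the shared embedding matrix down to the kept ids; the real procedure
# also rebuilds the tokenizer vocabulary and the lm_head to match the new ids.
new_embeddings = model.get_input_embeddings().weight.data[keep_ids].clone()
print(f"Kept {len(keep_ids)} of {model.config.vocab_size} embeddings")
```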

## Also check out
[daT5-large](https://huggingface.co/emillykkejensen/daT5-large)