---
license: mit
language:
- xh
- zu
- nr
- ss
---

Usage:

1. For mask prediction

```
import torch
from transformers import AutoTokenizer, XLMRobertaForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("francois-meyer/nguni-xlmr-large")
model = XLMRobertaForMaskedLM.from_pretrained("francois-meyer/nguni-xlmr-large")
# Replace with any sentence in a Nguni language that contains a <mask> token.
text = "A test <mask> for the nguni model."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# Locate the <mask> position(s) and take the highest-scoring token at each one.
mask_token_index = (inputs.input_ids == tokenizer.mask_token_id)[0].nonzero(as_tuple=True)[0]
predicted_token_id = logits[0, mask_token_index].argmax(dim=-1)
print(tokenizer.decode(predicted_token_id))
```

2. For any other task, fine-tune the model the same way you would fine-tune any BERT or XLM-R model.
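
As a rough illustration of item 2, the sketch below runs a single gradient step of sequence classification. It makes several assumptions not stated in this card: the downstream task is text classification, the checkpoint loads through `AutoModelForSequenceClassification`, and a learning rate of `2e-5` is a reasonable starting point. The helper names `build_label_maps` and `fine_tune_step` are hypothetical.

```python
def build_label_maps(labels):
    """Map each distinct label string to an integer id, and back."""
    names = sorted(set(labels))
    label2id = {name: i for i, name in enumerate(names)}
    id2label = {i: name for name, i in label2id.items()}
    return label2id, id2label


def fine_tune_step(texts, labels, model_name="francois-meyer/nguni-xlmr-large"):
    """Run one gradient step of sequence classification on a small batch.

    Hypothetical helper: a real run would loop over many batches and epochs.
    """
    # Imported here so the label helper above works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    label2id, id2label = build_label_maps(labels)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # A new, randomly initialised classification head is placed on the encoder.
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(label2id),
        label2id=label2id,
        id2label=id2label,
    )

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    batch["labels"] = torch.tensor([label2id[label] for label in labels])

    # Assumed learning rate; tune it for your own task and data.
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    loss = model(**batch).loss  # cross-entropy over the new head's logits
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Calling `fine_tune_step(texts, labels)` with your own labelled sentences downloads the checkpoint and returns the loss for that batch. For real training, wrap the step in a loop over a `DataLoader` or use the `Trainer` class from `transformers`.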