---
license: apache-2.0
datasets:
- artemsnegirev/ru-word-games
language:
- ru
metrics:
- exact_match
pipeline_tag: text2text-generation
---

The model was trained on the companion [dataset](artemsnegirev/ru-word-games). Minibob guesses a word from its description, modeling the well-known word game Alias.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

prefix = "guess word:"

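def predict_word(prompt, model, tokenizer):
    """Return a set of candidate words for the given description."""
    # Map any "..." placeholder in the description to the T5 sentinel token.
    prompt = prompt.replace("...", "<extra_id_0>")
    prompt = f"{prefix} {prompt}"

    input_ids = tokenizer([prompt], return_tensors="pt").input_ids

    # Deterministic beam search that returns all 5 beams as candidates.
    outputs = model.generate(
        input_ids.to(model.device),
        num_beams=5,
        max_new_tokens=8,
        do_sample=False,
        num_return_sequences=5,
    )

    # A set drops beams that decode to the same normalized string.
    candidates = set()

    for tokens in outputs:
        candidate = tokenizer.decode(tokens, skip_special_tokens=True)
        candidate = candidate.strip().lower()

        candidates.add(candidate)

    return candidates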

model_name = "artemsnegirev/minibob"

tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# English gloss: "this is an animal with hooves, people ride it"
prompt = "это животное с копытами на нем ездят"

print(predict_word(prompt, model, tokenizer))
# {'верблюд', 'конь', 'коня', 'лошадь', 'пони'}
```
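The metadata lists `exact_match` as the metric, but the card does not say how it is computed over the candidate set. A minimal sketch, assuming a description counts as solved when the target word appears among the returned candidates; `exact_match_at_k`, `normalize`, `stub_predict`, and the second example pair are hypothetical names and data used only for illustration:

```python
from typing import Callable, Iterable, Set, Tuple


def normalize(word: str) -> str:
    # Same normalization as predict_word: trim whitespace and lowercase.
    return word.strip().lower()


def exact_match_at_k(
    examples: Iterable[Tuple[str, str]],
    predict: Callable[[str], Set[str]],
) -> float:
    """Share of (description, target) pairs whose target is among the candidates.

    Assumption: a hit is an exact string match after normalization.
    """
    examples = list(examples)
    if not examples:
        return 0.0

    hits = 0
    for description, target in examples:
        candidates = {normalize(c) for c in predict(description)}
        if normalize(target) in candidates:
            hits += 1
    return hits / len(examples)


# Hypothetical stand-in for the model so the sketch runs without downloading weights.
def stub_predict(description: str) -> Set[str]:
    if "копытами" in description:
        return {"конь", "лошадь"}
    return {"кот"}


examples = [
    ("это животное с копытами на нем ездят", "лошадь"),
    ("домашнее животное которое лает", "собака"),
]

score = exact_match_at_k(examples, stub_predict)
print(score)  # 0.5
```

To score the real model, one could pass `lambda d: predict_word(d, model, tokenizer)` in place of `stub_predict`.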

A detailed GitHub-based [tutorial](https://github.com/artemsnegirev/minibob) covers the pipeline and source code for building Minibob.