---
license: apache-2.0
datasets:
  - artemsnegirev/ru-word-games
language:
  - ru
metrics:
  - exact_match
pipeline_tag: text2text-generation
---

The model was trained on the companion dataset. Minibob guesses a word from its description, modeling the well-known word game Alias.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Prefix prepended to every description before it is passed to the model
prefix = "guess word:"

def predict_word(prompt, model, tokenizer):
    # Replace a "..." gap in the description with the T5 sentinel token
    prompt = prompt.replace("...", "<extra_id_0>")
    prompt = f"{prefix} {prompt}"

    input_ids = tokenizer([prompt], return_tensors="pt").input_ids

    # Beam search with 5 beams, returning all 5 sequences
    outputs = model.generate(
        input_ids.to(model.device),
        num_beams=5,
        max_new_tokens=8,
        do_sample=False,
        num_return_sequences=5,
    )

    # A set drops candidates that become identical after normalization
    candidates = set()

    for tokens in outputs:
        candidate = tokenizer.decode(tokens, skip_special_tokens=True)
        candidate = candidate.strip().lower()

        candidates.add(candidate)

    return candidates

model_name = "artemsnegirev/minibob"

tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# In English: "it is an animal with hooves, people ride it"
prompt = "это животное с копытами на нем ездят"

print(predict_word(prompt, model, tokenizer))
# {'верблюд', 'конь', 'коня', 'лошадь', 'пони'}
# (camel, steed, an inflected form of "steed", horse, pony; set order may vary)
```
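The metadata lists `exact_match` as the metric, but the card does not say how it is computed over the several candidates the model returns. One plausible reading, sketched below as an assumption rather than the author's evaluation code, counts an example as a hit when the target word appears anywhere in the candidate set. The helper name `exact_match_at_k`, the second candidate set, and both target words are hypothetical; the first candidate set is the output shown above.

```python
def exact_match_at_k(predictions, targets):
    """Fraction of examples whose target word is among the candidates.

    predictions: list of candidate sets, shaped like predict_word() output
    targets: list of expected words, one per candidate set
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0

    # Normalize targets the same way predict_word normalizes candidates
    hits = sum(
        target.strip().lower() in candidates
        for candidates, target in zip(predictions, targets)
    )
    return hits / len(targets)


# Candidate sets: the first is the output shown above, the second is made up
predictions = [
    {"верблюд", "конь", "коня", "лошадь", "пони"},
    {"стол", "стул"},
]
# Hypothetical gold answers for illustration only
targets = ["лошадь", "кровать"]

score = exact_match_at_k(predictions, targets)
print(score)  # 0.5
```

A stricter variant would score only the top beam, which would require `predict_word` to return an ordered list instead of a set.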

A detailed GitHub-based tutorial covers the pipeline and source code for building Minibob.