Den4ikAI/FRED-T5-Large-interpreter

Den4ikAI committed on May 21, 2023

Commit 38dac29 • 1 Parent(s): 02f19a9

Update README.md

Files changed (1)

README.md +38 -0

@@ -1,3 +1,41 @@
 ---
 license: mit
+datasets:
+- inkoziev/incomplete_utterance_restoration
+language:
+- ru
+widget:
+- text: '<SC1>- Как тебя зовут?\n- Джульетта Мао\nРазвернутый ответ: <extra_id_0>'
+- text: '<SC1>- А живешь где?\n- В поясе астероидов\nРазвернутый ответ: <extra_id_0>'
+pipeline_tag: text2text-generation
 ---
+# Den4ikAI/FRED-T5-Large-interpreter
+A model for restoring an incomplete utterance from dialogue context (anaphora, ellipsis, gapping), checking spelling, and normalizing the text of dialogue turns.
+More about the task [here](https://huggingface.co/inkoziev/rugpt_interpreter).
+# Usage example
+```python
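+import torch
+from transformers import T5ForConditionalGeneration, GPT2Tokenizer
+
+model_name = 'Den4ikAI/FRED-T5-Large-interpreter'
+tokenizer = GPT2Tokenizer.from_pretrained(model_name)
+# Run on the GPU if one is available, otherwise on the CPU.
+device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
+model.eval()
+
+# Dialogue context followed by the elliptical reply to expand.
+# Turns: "Do you like dogs?" / "I don't like them"; "Развернутый ответ" means "Expanded answer".
+t5_input = '''<SC1>- Ты собак любишь?
+- Не люблю я их
+Развернутый ответ: <extra_id_0>'''
+# The input tensor must live on the same device as the model.
+input_ids = tokenizer(t5_input, return_tensors='pt').input_ids.to(device)
+out_ids = model.generate(input_ids=input_ids, max_length=100, eos_token_id=tokenizer.eos_token_id, early_stopping=True)
+# Skip the first generated token (the decoder start token) before decoding.
+t5_output = tokenizer.decode(out_ids[0][1:])
+print(t5_output)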
+```
+# Citation
+```
+@MISC{FRED-T5-Large-interpreter,
+    author  = {Denis Petrov and Ilya Koziev},
+    title   = {Russian conversations interpreter and normalizer},
+    url     = {https://huggingface.co/Den4ikAI/FRED-T5-Large-interpreter},
+    year    = 2023
+}
+```
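
The prompt layout used in the README example (`<SC1>`, one dialogue turn per line prefixed with `- `, then `Развернутый ответ: <extra_id_0>`) could be assembled with a small helper like the one below. This is only a sketch inferred from the widget and usage strings in the diff: `build_prompt`, `PROMPT_PREFIX`, and `PROMPT_SUFFIX` are hypothetical names, not part of the model repository, and the handling of dialogues longer than two turns is an assumption.

```python
from typing import Sequence

# Markers copied from the README example; the constant names are hypothetical.
PROMPT_PREFIX = "<SC1>"
PROMPT_SUFFIX = "Развернутый ответ: <extra_id_0>"


def build_prompt(turns: Sequence[str]) -> str:
    """Join dialogue turns into the prompt layout shown in the README example."""
    if not turns:
        raise ValueError("at least one dialogue turn is required")
    # Each turn goes on its own line and is prefixed with "- ".
    dialogue = "\n".join(f"- {turn.strip()}" for turn in turns)
    # The last turn is the one the model is asked to expand.
    return f"{PROMPT_PREFIX}{dialogue}\n{PROMPT_SUFFIX}"


prompt = build_prompt(["Ты собак любишь?", "Не люблю я их"])
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the hand-written `t5_input` from the usage example.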