IlyaGusev's picture
Create README.md
2127fb1
|
raw
history blame
1.11 kB
---
language:
- ru
tags:
- summarization
license: apache-2.0
---
# RuT5TelegramHeadlines
## Model description
Based on [rut5-base](https://huggingface.co/cointegrated/rut5-base) model
## Intended uses & limitations
#### How to use
```python
from transformers import AutoTokenizer, T5ForConditionalGeneration
model_name = "IlyaGusev/rut5_telegram_headlines"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
article_text = "..."
input_ids = tokenizer(
[article_text],
max_length=600,
add_special_tokens=True,
padding="max_length",
truncation=True,
return_tensors="pt"
)["input_ids"]
output_ids = model.generate(
input_ids=input_ids,
no_repeat_ngram_size=4
)[0]
headline = tokenizer.decode(output_ids, skip_special_tokens=True)
print(headline)
```
## Training data
- Dataset: [ru_all_split.tar.gz](https://www.dropbox.com/s/ykqk49a8avlmnaf/ru_all_split.tar.gz)
## Training procedure
- Training script: [train.py](https://github.com/IlyaGusev/summarus/blob/master/external/hf_scripts/train.py)