---
language:
- ru
tags:
- summarization
license: apache-2.0
---

# RuT5TelegramHeadlines

## Model description

Based on the [rut5-base](https://huggingface.co/cointegrated/rut5-base) model.

## Intended uses & limitations

#### How to use

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "IlyaGusev/rut5_telegram_headlines"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

article_text = "..."

# Tokenize the article; inputs longer than 600 tokens are truncated.
input_ids = tokenizer(
    [article_text],
    max_length=600,
    add_special_tokens=True,
    padding="max_length",
    truncation=True,
    return_tensors="pt"
)["input_ids"]

# Generate the headline, forbidding repeated 4-grams in the output.
output_ids = model.generate(
    input_ids=input_ids,
    no_repeat_ngram_size=4
)[0]

# Convert the generated token ids back to text.
headline = tokenizer.decode(output_ids, skip_special_tokens=True)
print(headline)
```

## Training data

- Dataset: [ru_all_split.tar.gz](https://www.dropbox.com/s/ykqk49a8avlmnaf/ru_all_split.tar.gz)

## Training procedure

- Training script: [train.py](https://github.com/IlyaGusev/summarus/blob/master/external/hf_scripts/train.py)
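
The single-article snippet in "How to use" can be extended to several articles at once. The following is a minimal sketch, not part of the original model card: `generate_headlines` and its `batch_size` parameter are hypothetical names introduced here, and padding each batch to its longest member while passing the `attention_mask` to `generate` is an assumption about a reasonable setup rather than the author's documented procedure. The `max_length=600` and `no_repeat_ngram_size=4` defaults are taken from the snippet in "How to use".

```python
from typing import Iterable, List


def generate_headlines(
    texts: Iterable[str],
    tokenizer,
    model,
    batch_size: int = 8,          # hypothetical default, tune to available memory
    max_length: int = 600,        # same input limit as the "How to use" snippet
    no_repeat_ngram_size: int = 4,
) -> List[str]:
    """Return one generated headline per input text, processing texts in batches.

    `tokenizer` and `model` are expected to be the objects loaded with
    AutoTokenizer.from_pretrained and T5ForConditionalGeneration.from_pretrained.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    texts = list(texts)
    headlines: List[str] = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # Pad only up to the longest article in this batch (assumption:
        # dynamic padding instead of padding="max_length").
        encoded = tokenizer(
            batch,
            max_length=max_length,
            add_special_tokens=True,
            padding=True,
            truncation=True,
            return_tensors="pt",
        )
        # Pass the attention mask so padded positions are ignored.
        output_ids = model.generate(
            input_ids=encoded["input_ids"],
            attention_mask=encoded["attention_mask"],
            no_repeat_ngram_size=no_repeat_ngram_size,
        )
        headlines.extend(
            tokenizer.batch_decode(output_ids, skip_special_tokens=True)
        )
    return headlines
```

With `tokenizer` and `model` loaded as in "How to use", a call would look like `generate_headlines([article_text], tokenizer, model)`, which returns a list containing one headline string per input article.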