
O3ap-sm: Ukrainian News Summarizer

This repository contains the O3ap-sm model, a Ukrainian news summarization model fine-tuned from T5-small. The model was trained on the Ukrainian CCMatrix corpus for text summarization.

Model Overview

Usage

Installation

pip install transformers torch

Loading the Model

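from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Download (or load from the local cache) the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("d0p3/O3ap-sm")
model = AutoModelForSeq2SeqLM.from_pretrained("d0p3/O3ap-sm")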

Generating Summaries

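news_article = "**YOUR NEWS ARTICLE TEXT IN UKRAINIAN**"

# Convert the article text into a tensor of token IDs
input_ids = tokenizer(news_article, return_tensors="pt").input_ids
# Generate the summary token IDs with the model's default generation settings
output_ids = model.generate(input_ids)

# Decode the first (and only) generated sequence back into text
summary = tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(summary)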
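
The snippet above relies on whatever generation defaults ship with the model's configuration and does not bound the input length. A minimal sketch of a reusable helper follows, assuming a 512-token input limit, a 128-token summary cap, and 4-beam search. These values are illustrative assumptions, not settings documented for this model, and `summarize` is a hypothetical helper name.

```python
def summarize(article, tokenizer, model,
              max_input_tokens=512, max_summary_tokens=128, num_beams=4):
    """Summarize one article with explicit length limits (hypothetical helper).

    The default values are illustrative assumptions, not documented
    settings for O3ap-sm.
    """
    # Truncate long articles so the input never exceeds max_input_tokens.
    inputs = tokenizer(
        article,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    # Pass the attention mask along with the token IDs and cap the summary length.
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_summary_tokens,
        num_beams=num_beams,
        early_stopping=True,
    )
    # Decode the first generated sequence back into text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the tokenizer and model loaded as shown under Loading the Model, calling summarize(news_article, tokenizer, model) returns the summary as a string.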

Limitations

  • The model may not perform optimally on informal or highly colloquial Ukrainian text.
  • As with any language model, it may generate factually incorrect summaries or summaries that reflect biases present in the training data.
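
One inexpensive guard against the factual errors mentioned above is to flag numbers that appear in a summary but not in the source article. The sketch below is only a heuristic under that assumption: `unsupported_numbers` is a hypothetical helper name, the example texts are made up, and the check catches just one narrow class of errors.

```python
import re

def unsupported_numbers(article, summary):
    """Return numbers found in the summary that never occur in the article."""
    # Match integers and numbers with '.' or ',' separators, e.g. 12, 1,5, 10.000
    pattern = r"\d+(?:[.,]\d+)*"
    article_numbers = set(re.findall(pattern, article))
    # Preserve the order of appearance in the summary, skipping duplicates.
    flagged = []
    for number in re.findall(pattern, summary):
        if number not in article_numbers and number not in flagged:
            flagged.append(number)
    return flagged

article = "У 2023 році компанія відкрила 12 нових офісів."
summary = "Компанія відкрила 15 офісів у 2023 році."
print(unsupported_numbers(article, summary))  # ['15']
```

An empty result does not mean the summary is correct; it only means this particular check found nothing to flag.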

Ethical Considerations

  • Transparency: The model is intended for summarizing Ukrainian news articles; its limitations are listed in the Limitations section.
  • Bias: Training data selection and the fine-tuning process may have introduced biases. Apply mitigation strategies where possible.
  • Misuse: The model could be misused, for example to generate misleading summaries. Treat its outputs with caution and evaluate them critically.

Contributing

We welcome contributions and feedback!

License

This model is released under the CC-BY-NC-4.0 license.

Model size: 74.8M params (Safetensors, tensor type F32)