Hungarian Abstractive Summarization with mT5-base model

For further details, see the paper cited below or our demo site.

  • Fine-tuned from the mT5-base model
  • Prefix: "summarize: "
  • Fine-tuned on the HI corpus (hvg.hu + index.hu)
    • Segments: 559162

Limitations

  • Input text must be tokenized beforehand (tokenizer: HuSpaCy)
  • max_source_length = 1024
  • max_target_length = 256
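
The input requirements above can be sketched as a small preparation step. This is a minimal sketch, not the authors' preprocessing code: it assumes that "tokenized input" means tokens joined by single spaces, and the names `prepare_input` and `whitespace_tokenize` are hypothetical. The whitespace splitter is only a stand-in for HuSpaCy.

```python
from typing import Callable, List

# Task prefix stated in this model card.
PREFIX = "summarize: "


def whitespace_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer (assumption): splits on whitespace only."""
    return text.split()


def prepare_input(
    text: str,
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
) -> str:
    """Tokenize the text, join tokens with single spaces, add the prefix."""
    tokens = tokenize(text)
    return PREFIX + " ".join(tokens)


if __name__ == "__main__":
    print(prepare_input("Ez  egy\nteszt ."))
```

To use HuSpaCy instead of the stand-in, a callable such as `lambda s: [t.text for t in nlp(s)]` could be passed as `tokenize`, where `nlp` is an already loaded HuSpaCy (spaCy) pipeline.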

Results

Model     HI
mBART     35.17/16.46/25.61
mT5       33.30/15.97/24.65
PEGASUS   30.36/13.11/21.57

Usage with pipeline

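from transformers import pipeline

summarization = pipeline(task="summarization", model="NYTK/summarization-hi-mt5-base-hungarian")

# Placeholder: replace with the (tokenized) Hungarian text to summarize.
input_text = "..."

# The model expects the "summarize: " prefix.
print(summarization(f"summarize: {input_text}")[0]["summary_text"])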
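
If explicit control over the length limits is needed, the tokenizer and model can also be called directly. The following is a rough sketch under assumptions: it reuses the `max_source_length` and `max_target_length` values from the Limitations section as truncation and generation limits, which may not match the authors' exact inference setup. The helper names `load` and `summarize` are hypothetical.

```python
MODEL_NAME = "NYTK/summarization-hi-mt5-base-hungarian"
PREFIX = "summarize: "
MAX_SOURCE_LENGTH = 1024  # from the Limitations section
MAX_TARGET_LENGTH = 256   # from the Limitations section


def load(model_name: str = MODEL_NAME):
    """Load tokenizer and model (downloads weights on first use)."""
    # Imported lazily so the helpers below can be used without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    return tokenizer, model


def summarize(text: str, tokenizer, model) -> str:
    """Summarize one text, truncating input and capping output length."""
    inputs = tokenizer(
        PREFIX + text,
        max_length=MAX_SOURCE_LENGTH,
        truncation=True,  # drop tokens beyond the source limit
        return_tensors="pt",
    )
    output_ids = model.generate(**inputs, max_length=MAX_TARGET_LENGTH)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    tokenizer, model = load()
    print(summarize("...", tokenizer, model))  # replace "..." with real text
```

Inputs longer than the source limit are silently truncated in this sketch, so very long articles are summarized from their opening part only.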

Citation

If you use this model, please cite the following paper:

@inproceedings {yang-multi-sum,
    title = {{Többnyelvű modellek és PEGASUS finomhangolása magyar nyelvű absztraktív összefoglalás feladatára}},
    booktitle = {XIX. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2023)},
    year = {2023},
    publisher = {Szegedi Tudományegyetem, Informatikai Intézet},
    address = {Szeged, Magyarország},
    author = {Yang, Zijian Győző},
    pages = {381--393}
}