Hungarian Abstractive Summarization BART model

For further models, scripts and details, see our repository or our demo site.

  • BART base model (see Results Table - bold):
    • Pretrained on Webcorpus 2.0
    • Fine-tuned on the NOL corpus (nol.hu)
      • Segments: 397,343

Limitations

  • Input text must be tokenized (tokenizer: HuSpaCy)
  • max_source_length = 512
  • max_target_length = 256
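
The limitations above can be applied at inference time roughly as in the following sketch. It is not taken from the authors' scripts. It assumes that "tokenized input" means the HuSpaCy tokens joined by single spaces, and that both length limits are counted in the model's own subword tokens. The helper names pretokenize and summarize are hypothetical, as is the model_id argument, which you must set to this model's actual Hub identifier or a local path. The nlp argument is any spaCy-style pipeline, for example a loaded HuSpaCy model.

```python
# Limits taken from the model card.
MAX_SOURCE_LENGTH = 512
MAX_TARGET_LENGTH = 256


def pretokenize(text, nlp):
    """Join the tokens of a spaCy-style pipeline with single spaces.

    Assumption: the model expects whitespace-separated HuSpaCy tokens.
    """
    return " ".join(token.text for token in nlp(text))


def summarize(text, nlp, model_id):
    """Summarize `text` with a seq2seq checkpoint (hypothetical helper).

    `model_id` is a placeholder for the Hub id or local path of the model.
    """
    # Imported lazily so that pretokenize() works without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Inputs longer than the source limit are truncated.
    inputs = tokenizer(
        pretokenize(text, nlp),
        max_length=MAX_SOURCE_LENGTH,
        truncation=True,
        return_tensors="pt",
    )
    # Cap the generated summary at the target limit.
    output_ids = model.generate(**inputs, max_length=MAX_TARGET_LENGTH)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Since inputs beyond the source limit are truncated in this sketch, only the beginning of a long article reaches the model.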

Results

Model            HI                  NOL
BART-base-512    30.18/13.86/22.92   46.48/32.40/39.45
BART-base-1024   31.86/14.59/23.79   47.01/32.91/39.97

Citation

If you use this model, please cite the following paper:

@inproceedings {yang-bart,
    title = {{BARTerezzünk! - Messze, messze, messze a világtól, - BART kísérleti modellek magyar nyelvre}},
    booktitle = {XVIII. Magyar Számítógépes Nyelvészeti Konferencia},
    year = {2022},
    publisher = {Szegedi Tudományegyetem, Informatikai Intézet},
    address = {Szeged, Magyarország},
    author = {Yang, Zijian Győző},
    pages = {15--29}
}