## Model Overview

This is the model presented in the paper "MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages". It is `facebook/mbart-large-50` fine-tuned on parallel text detoxification data for Russian, English, Ukrainian, and Spanish.
## How to use

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = 'textdetox/mbart_detox_en_ru_uk_es'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Example inference: detoxify an input sentence (standard seq2seq generation;
# for non-English inputs you may need to set tokenizer.src_lang to the
# corresponding mBART-50 language code, e.g. 'ru_RU')
inputs = tokenizer("your toxic text here", return_tensors='pt')
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Citation

```bibtex
@article{dementieva2024multiparadetox,
  title={MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages},
  author={Dementieva, Daryna and Babakov, Nikolay and Panchenko, Alexander},
  journal={arXiv preprint arXiv:2404.02037},
  year={2024}
}
```