- textdetox/multilingual_paradetox
  Viewer • Updated • 3.6k • 865 • 8
- textdetox/multilingual_paradetox_test
  Viewer • Updated • 6k • 332
- textdetox/xlmr-large-toxicity-classifier-v2
  Text Classification • Updated • 209 • 1
- textdetox/multilingual_toxicity_dataset
  Viewer • Updated • 71.4k • 923 • 24

Multilingual Text Detoxification
AI & ML interests
Text Style Transfer, Text Detoxification, Toxic Speech Detection and Mitigation, Multilingualism
Multilingual Text Detoxification with Parallel Data
Text detoxification, toxicity detection, and explanation for diverse languages: English, Spanish, German, French, Italian, Chinese, Japanese, Arabic, Hebrew, Hindi, Ukrainian, Russian, Tatar, Amharic. A collaboration of researchers from all over the world.
Support for better, safe, and multicultural online spaces.
Read about the project in the press • PyData&CPyConf Berlin 2023 talk
[2025] !!!NOW OPEN!!! TextDetox CLEF2025 shared task website • Starter Kit
[2025] COLING2025: Daryna Dementieva, Nikolay Babakov, Amit Ronen, Abinew Ali Ayele, Naquee Rizwan, Florian Schneider, Xintong Wang, Seid Muhie Yimam, Daniil Alekhseevich Moskovskiy, Elisei Stakovskii, Eran Kaufman, Ashraf Elnagar, Animesh Mukherjee, and Alexander Panchenko. 2025. Multilingual and Explainable Text Detoxification with Parallel Corpora. In Proceedings of the 31st International Conference on Computational Linguistics, pages 7998–8025, Abu Dhabi, UAE. Association for Computational Linguistics. pdf
[2024] TextDetox2024 Report: Daryna Dementieva, Daniil Moskovskiy, Nikolay Babakov, Abinew Ali Ayele, Naquee Rizwan, Florian Schneider, Xintong Wang, Seid Muhie Yimam, Dmitry Ustalov, Elisei Stakovskii, Alisa Smirnova, Ashraf Elnagar, Animesh Mukherjee, and Alexander Panchenko. "Overview of the Multilingual Text Detoxification Task at PAN 2024." Working Notes of CLEF (2024). pdf
[2024] MultiParaDetox @ NAACL2024: Daryna Dementieva, Nikolay Babakov, and Alexander Panchenko. "MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages." Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers). 2024. pdf
[2024] TextDetox CLEF2024 shared task website
[2022] The first parallel text detoxification datasets: English ParaDetox and Russian ParaDetox
Contact
We are happy to extend our research to more languages, cultures, and dimensions.
Please contact Daryna Dementieva.
Collections
5
- Multilingual and Explainable Text Detoxification with Parallel Corpora
  Paper • 2412.11691 • Published • 1
- textdetox/multilingual_toxicity_dataset
  Viewer • Updated • 71.4k • 923 • 24
- textdetox/multilingual_toxic_lexicon
  Viewer • Updated • 176k • 423 • 3
- textdetox/multilingual_paradetox
  Viewer • Updated • 3.6k • 865 • 8
Models
9

textdetox/xlmr-large-toxicity-classifier-v2

textdetox/glot500-toxicity-classifier

textdetox/bert-multilingual-toxicity-classifier

textdetox/twitter-xlmr-toxicity-classifier

textdetox/xlmr-large-toxicity-classifier

textdetox/mbart-detox-baseline

textdetox/mbart_detox_en_ru_uk_es

textdetox/mt5-xl-detox-baseline
