M-Prometheus
Collection
Open multilingual LLM judges for automatic evaluation.
M-Prometheus is a suite of open LLM judges that can natively evaluate multilingual outputs. They were trained on 480k instances of multilingual direct assessment and pairwise comparison data with long-form feedback. They can be prompted in the same way as Prometheus-2. Check out our paper for more details.
@misc{pombal2025mprometheussuiteopenmultilingual,
title={M-Prometheus: A Suite of Open Multilingual LLM Judges},
author={José Pombal and Dongkeun Yoon and Patrick Fernandes and Ian Wu and Seungone Kim and Ricardo Rei and Graham Neubig and André F. T. Martins},
year={2025},
eprint={2504.04953},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2504.04953},
}
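
Since the judges are prompted in the same way as Prometheus-2, a direct assessment call boils down to filling a prompt template and reading a score out of the generated feedback. The sketch below illustrates that flow without loading any model. The template text is an assumption, paraphrased from the Prometheus-2 direct assessment format, and the helper names `build_direct_assessment_prompt` and `parse_judgement` are hypothetical; check the official Prometheus-2 prompts before relying on the exact wording.

```python
import re

# ASSUMPTION: paraphrase of the Prometheus-2 direct assessment template.
# The exact official wording may differ; verify before use.
ABSOLUTE_TEMPLATE = """###Task Description:
An instruction, a response to evaluate, a reference answer that gets a score of 5, and a score rubric are given.
1. Write detailed feedback that assesses the response strictly based on the score rubric.
2. After the feedback, write a score that is an integer between 1 and 5.
3. The output format should be: "Feedback: (feedback for the criteria) [RESULT] (an integer between 1 and 5)"
4. Do not generate any other opening, closing, or explanations.

###The instruction to evaluate:
{instruction}

###Response to evaluate:
{response}

###Reference Answer (Score 5):
{reference_answer}

###Score Rubrics:
{rubric}

###Feedback: """


def build_direct_assessment_prompt(instruction, response, reference_answer, rubric):
    """Fill the (assumed) template with one evaluation instance."""
    return ABSOLUTE_TEMPLATE.format(
        instruction=instruction,
        response=response,
        reference_answer=reference_answer,
        rubric=rubric,
    )


def parse_judgement(output):
    """Split judge output into (feedback, score); score is None if no [RESULT] tag is found."""
    match = re.search(r"\[RESULT\]\s*([1-5])", output)
    if match is None:
        return output.strip(), None
    return output[: match.start()].strip(), int(match.group(1))


if __name__ == "__main__":
    # Illustrative multilingual instance (Portuguese); the content is made up.
    prompt = build_direct_assessment_prompt(
        instruction="Explique o que é fotossíntese em uma frase.",
        response="É o processo pelo qual as plantas produzem energia a partir da luz.",
        reference_answer="Fotossíntese é o processo em que plantas convertem luz, água e CO2 em glicose e oxigênio.",
        rubric="A resposta é correta, completa e escrita em português fluente?",
    )
    print(prompt)
    # A hand-written stand-in for what a judge might return.
    print(parse_judgement("Feedback: Correta, mas incompleta. [RESULT] 3"))
```

The prompt string would be passed to whichever inference stack hosts the judge; parsing the `[RESULT]` tag separately keeps the long-form feedback available alongside the numeric score.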