cstr/wmt21-dense-24-wide-en-x-st

Facebook's WMT21 model facebook/wmt21-dense-24-wide-en-x, converted to safetensors for testing.

So far, translation quality looks quite good; cf. the COMET scores below (higher is better), where the wmt21 output scores highest:

+-----------------------------------------+-----------------+
| File                                    |   Overall Score |
+=========================================+=================+
| Capybara_de_wmt21_scored.jsonl          |        0.848375 |
+-----------------------------------------+-----------------+
| Capybara_de_GPT4_scored.jsonl           |        0.846241 |
+-----------------------------------------+-----------------+
| Capybara_de_Claude-3-Opus_scored.jsonl  |        0.84568  |
+-----------------------------------------+-----------------+
| Capybara_de_deepl_scored.jsonl          |        0.843937 |
+-----------------------------------------+-----------------+
| Capybara_de_GPT3.5_scored.jsonl         |        0.843922 |
+-----------------------------------------+-----------------+
| Capybara_de_occiglot_scored.jsonl       |        0.83135  |
+-----------------------------------------+-----------------+
| Capybara_de_discolm_scored.jsonl        |        0.830676 |
+-----------------------------------------+-----------------+
| Capybara_de_nbbl_scored.jsonl           |        0.829132 |
+-----------------------------------------+-----------------+
| Capybara_de_wmt19_scored.jsonl          |        0.824847 |
+-----------------------------------------+-----------------+
| Capybara_de_t5madlad_scored.jsonl       |        0.818146 |
+-----------------------------------------+-----------------+
| Capybara_de_mixtral_scored.jsonl        |        0.807397 |
+-----------------------------------------+-----------------+
| Capybara_de_TowerInstruct2_scored.jsonl |        0.788971 |
+-----------------------------------------+-----------------+
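
For orientation, scores of this kind could be computed roughly as follows. This is only a sketch: the checkpoint name `Unbabel/wmt22-comet-da`, the use of reference translations, and the helper names `build_comet_inputs` and `comet_system_score` are assumptions for illustration. The table above does not state which COMET model or references were used.

```python
def build_comet_inputs(sources, translations, references=None):
    """Pair sources, machine translations and (optionally) references
    into the list-of-dicts format that COMET's predict() takes."""
    if len(sources) != len(translations):
        raise ValueError("sources and translations must have the same length")
    if references is not None and len(references) != len(sources):
        raise ValueError("references must have the same length as sources")
    rows = []
    for i, (src, mt) in enumerate(zip(sources, translations)):
        row = {"src": src, "mt": mt}
        if references is not None:
            row["ref"] = references[i]
        rows.append(row)
    return rows


def comet_system_score(rows, checkpoint="Unbabel/wmt22-comet-da"):
    """Return the corpus-level COMET score for the given rows.

    Needs `pip install unbabel-comet`; the checkpoint is an assumed choice.
    """
    # Imported lazily so the helper above works without COMET installed.
    from comet import download_model, load_from_checkpoint

    model = load_from_checkpoint(download_model(checkpoint))
    output = model.predict(rows, batch_size=8, gpus=0)  # gpus=0 runs on CPU
    return output.system_score
```

Calling `comet_system_score(build_comet_inputs(sources, translations, references))` downloads the checkpoint on first use, so expect the first run to take a while.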

Also, cf. a comparison on a few snippets: https://huggingface.co/spaces/cstr/compare_translations

Regarding quantization: on Linux or Windows WSL (with accelerate and triton installed), you can use the quantized versions (q8, q4).
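
A minimal sketch of what loading a quantized version might look like. It assumes that q8 and q4 mean 8-bit and 4-bit loading through bitsandbytes in transformers, which is not confirmed here. The helper names `quantization_kwargs`, `load_quantized` and `translate` are hypothetical.

```python
MODEL_ID = "cstr/wmt21-dense-24-wide-en-x-st"
TOKENIZER_ID = "facebook/wmt21-dense-24-wide-en-x"


def quantization_kwargs(bits):
    """Map a bit width to BitsAndBytesConfig keyword arguments (assumed backend)."""
    if bits == 8:
        return {"load_in_8bit": True}
    if bits == 4:
        return {"load_in_4bit": True}
    raise ValueError("bits must be 8 or 4")


def load_quantized(bits=8):
    """Load the tokenizer and a quantized model.

    Needs transformers, accelerate and bitsandbytes (assumption).
    """
    # Imported lazily so quantization_kwargs() can be used on its own.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, BitsAndBytesConfig

    config = BitsAndBytesConfig(**quantization_kwargs(bits))
    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(
        MODEL_ID, quantization_config=config, device_map="auto"
    )
    return tokenizer, model


def translate(text, target_lang, tokenizer, model):
    """Translate English text, forcing the target language as the first token."""
    tokenizer.src_lang = "en"
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    generated = model.generate(
        **inputs, forced_bos_token_id=tokenizer.get_lang_id(target_lang)
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

With this sketch, `tokenizer, model = load_quantized(4)` followed by `translate("Hello world!", "de", tokenizer, model)` would produce a German translation, provided the assumed backend is installed.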

On an Apple Mac (MPS), you can use CTranslate2 instead. First convert the model:

ct2-transformers-converter --model cstr/wmt21-dense-24-wide-en-x-st --quantization int8_float32 --output_dir wmt21ct2_int8

Then run it, e.g. in Python:

import ctranslate2
import transformers

# Load the converted int8 model and the tokenizer of the original model.
translator = ctranslate2.Translator("wmt21ct2_int8")
tokenizer = transformers.AutoTokenizer.from_pretrained("facebook/wmt21-dense-24-wide-en-x")
tokenizer.src_lang = "en"

# CTranslate2 expects token strings, not token ids.
text = "Choose the correct verb form to complete the sentence: The birds ____________ (to fly) to the south for the winter."
source = tokenizer.convert_ids_to_tokens(tokenizer.encode(text))

# Request German output by starting the target with its language token.
target_prefix = [tokenizer.lang_code_to_token["de"]]
results = translator.translate_batch([source], target_prefix=[target_prefix])

# Drop the leading language token from the best hypothesis before decoding.
target = results[0].hypotheses[0][1:]

print(tokenizer.decode(tokenizer.convert_tokens_to_ids(target)))
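
The same steps can be wrapped in a small helper for translating several sentences at once. This is a sketch, not part of the original example: `translate_texts` is a hypothetical name, and it expects `translator` and `tokenizer` objects created as in the snippet above.

```python
def translate_texts(texts, target_lang, translator, tokenizer):
    """Translate a list of English strings into target_lang with CTranslate2."""
    tokenizer.src_lang = "en"

    # CTranslate2 works on token strings, so convert ids back to tokens.
    sources = [tokenizer.convert_ids_to_tokens(tokenizer.encode(t)) for t in texts]

    # One target prefix per input, each holding the language token.
    prefix = [tokenizer.lang_code_to_token[target_lang]]
    prefixes = [prefix for _ in sources]

    results = translator.translate_batch(sources, target_prefix=prefixes)

    outputs = []
    for result in results:
        tokens = result.hypotheses[0][1:]  # skip the leading language token
        outputs.append(tokenizer.decode(tokenizer.convert_tokens_to_ids(tokens)))
    return outputs
```

For example, `translate_texts(["Hello world!", "How are you?"], "de", translator, tokenizer)` would return a list with one German string per input.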