Facebook's WMT21 model facebook/wmt21-dense-24-wide-en-x, converted to safetensors for testing.

So far it looks quite good; cf. the COMET scores:

+-----------------------------------------+-----------------+
| File                                    |   Overall Score |
+=========================================+=================+
| Capybara_de_wmt21_scored.jsonl          |        0.848375 |
+-----------------------------------------+-----------------+
| Capybara_de_GPT4_scored.jsonl           |        0.846241 |
+-----------------------------------------+-----------------+
| Capybara_de_Claude-3-Opus_scored.jsonl  |        0.84568  |
+-----------------------------------------+-----------------+
| Capybara_de_deepl_scored.jsonl          |        0.843937 |
+-----------------------------------------+-----------------+
| Capybara_de_GPT3.5_scored.jsonl         |        0.843922 |
+-----------------------------------------+-----------------+
| Capybara_de_occiglot_scored.jsonl       |        0.83135  |
+-----------------------------------------+-----------------+
| Capybara_de_discolm_scored.jsonl        |        0.830676 |
+-----------------------------------------+-----------------+
| Capybara_de_nbbl_scored.jsonl           |        0.829132 |
+-----------------------------------------+-----------------+
| Capybara_de_wmt19_scored.jsonl          |        0.824847 |
+-----------------------------------------+-----------------+
| Capybara_de_t5madlad_scored.jsonl       |        0.818146 |
+-----------------------------------------+-----------------+
| Capybara_de_mixtral_scored.jsonl        |        0.807397 |
+-----------------------------------------+-----------------+
| Capybara_de_TowerInstruct2_scored.jsonl |        0.788971 |
+-----------------------------------------+-----------------+
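
To read the table in context, it can help to look at the gaps between systems rather than the absolute values. The following is a small sketch, not part of the original evaluation: the scores are copied from the table above, the short system names are just labels derived from the file names, and `gaps_to_best` is a hypothetical helper.

```python
# COMET scores copied from the table above; keys are shortened file names.
scores = {
    "wmt21": 0.848375,
    "GPT4": 0.846241,
    "Claude-3-Opus": 0.84568,
    "deepl": 0.843937,
    "GPT3.5": 0.843922,
    "occiglot": 0.83135,
    "discolm": 0.830676,
    "nbbl": 0.829132,
    "wmt19": 0.824847,
    "t5madlad": 0.818146,
    "mixtral": 0.807397,
    "TowerInstruct2": 0.788971,
}


def gaps_to_best(scores):
    """Return (name, score, gap to the best score) tuples, best system first."""
    best = max(scores.values())
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, score, round(best - score, 6)) for name, score in ranked]


for name, score, gap in gaps_to_best(scores):
    print(f"{name:<16}{score:.6f}  -{gap:.6f}")
```

The top five systems lie within about 0.0045 of each other, so differences that small may not be meaningful without a significance test.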

Also, cf. the comparison on a few snippets: https://huggingface.co/spaces/cstr/compare_translations

Regarding quantization: on Linux or Windows WSL (with accelerate and triton installed), you can use the quantized versions (q8, q4).
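
As a rough guide to why quantization matters here, the weight-only memory footprint can be estimated from the parameter count. This is a back-of-the-envelope sketch under stated assumptions: about 4.69 billion parameters stored as F32 (the figures listed on the model page), and idealized sizes of 1 byte per parameter for q8 and half a byte for q4. `weight_gb` is a hypothetical helper, and real quantized checkpoints are somewhat larger because of scaling factors and layers kept in higher precision.

```python
# Rough weight-only estimate; ignores activations and runtime overhead.
# Assumption: about 4.69e9 parameters, as listed on the model page.
N_PARAMS = 4.69e9

# Idealized storage cost per parameter, in bytes (assumed, not measured).
BYTES_PER_PARAM = {"f32": 4.0, "f16": 2.0, "q8": 1.0, "q4": 0.5}


def weight_gb(n_params, dtype):
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_gb(N_PARAMS, dtype):.1f} GB")
```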

On an Apple Mac (MPS), you can use CTranslate2 like this. First convert the model:

ct2-transformers-converter --model cstr/wmt21-dense-24-wide-en-x-st --quantization int8_float32 --output_dir wmt21ct2_int8

Then run it, e.g. in Python:

import ctranslate2
import transformers

# Load the converted CTranslate2 model and the original tokenizer.
translator = ctranslate2.Translator("wmt21ct2_int8")
tokenizer = transformers.AutoTokenizer.from_pretrained("facebook/wmt21-dense-24-wide-en-x")
tokenizer.src_lang = "en"

# CTranslate2 expects token strings rather than token ids.
source = tokenizer.convert_ids_to_tokens(tokenizer.encode("Choose the correct verb form to complete the sentence: The birds ____________ (to fly) to the south for the winter."))
# Force German output by starting decoding with the target language token.
target_prefix = [tokenizer.lang_code_to_token["de"]]
results = translator.translate_batch([source], target_prefix=[target_prefix])
# Drop the leading language token from the best hypothesis.
target = results[0].hypotheses[0][1:]

print(tokenizer.decode(tokenizer.convert_tokens_to_ids(target)))
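
The `[1:]` slice in the example drops the first hypothesis token, which is the target language token that was passed in as `target_prefix`. When translating several sentences, a small helper can make that step explicit. This is only a sketch in plain Python: `strip_target_prefix` is a hypothetical name, and the `__de__` spelling of the language token in the sample data is an assumption about how the tokenizer renders language codes.

```python
def strip_target_prefix(hypothesis, target_prefix):
    """Remove the leading target_prefix tokens from a hypothesis, if present."""
    n = len(target_prefix)
    if hypothesis[:n] == target_prefix:
        return hypothesis[n:]
    # Leave the hypothesis untouched when it does not start with the prefix.
    return hypothesis


# Illustrative tokens only; real tokens come from translate_batch.
tokens = ["__de__", "▁Die", "▁Vögel", "▁fliegen"]
print(strip_target_prefix(tokens, ["__de__"]))
```

Unlike a fixed slice, this leaves the output intact if a hypothesis ever comes back without the prefix.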