Fix typos
- README.md +5 -17
- sample_text/en2es.m2m100_1.2B.json +123 -1
- sample_text/en2es.m2m100_418M.json +123 -1
README.md
CHANGED
````diff
@@ -13,20 +13,12 @@
 <br>
 </p>
 
-Easy-translate is a script for translating large text files in your machine using
-the [M2M100 models](https://arxiv.org/pdf/2010.11125.pdf) from Facebook/Meta AI.
-We also privide a [script](#evaluate-translations) for Easy-Evaluation of your translations 🥳
+Easy-translate is a script for translating large text files in your machine using the [M2M100 models](https://arxiv.org/pdf/2010.11125.pdf) from Facebook/Meta AI. We also privide a [script](#evaluate-translations) for Easy-Evaluation of your translations 🥳
 
 
-M2M100 is a multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation.
-It was introduced in this [paper](https://arxiv.org/abs/2010.11125) and first released in [this](https://github.com/pytorch/fairseq/tree/master/examples/m2m_100) repository.
-The model that can directly translate between the 9,900 directions of 100 languages.
-
-Easy-Translate is built on top of 🤗HuggingFace's
-[Transformers](https://huggingface.co/docs/transformers/index) and
-🤗HuggingFace's [Accelerate](https://huggingface.co/docs/accelerate/index) library.
-We support:
+M2M100 is a multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation. It was introduced in this [paper](https://arxiv.org/abs/2010.11125) and first released in [this](https://github.com/pytorch/fairseq/tree/master/examples/m2m_100) repository. The model that can directly translate between the 9,900 directions of 100 languages.
 
+Easy-Translate is built on top of 🤗HuggingFace's [Transformers](https://huggingface.co/docs/transformers/index) and 🤗HuggingFace's [Accelerate](https://huggingface.co/docs/accelerate/index) library. We support:
 * CPU / multi-CPU / GPU / multi-GPU / TPU acceleration
 * BF16 / FP16 / FP32 precision.
 * Automatic batch size finder: Forget CUDA OOM errors. Set an initial batch size, if it doesn't fit, we will automatically adjust it.
@@ -80,9 +72,7 @@ accelerate launch translate.py \
 
 #### Multi-GPU:
 See Accelerate documentation for more information (multi-node, TPU, Sharded model...): https://huggingface.co/docs/accelerate/index
-You can use the Accelerate CLI to configure the Accelerate environment (Run
-`accelerate config` in your terminal) instead of using the
-`--multi_gpu and --num_processes` flags.
+You can use the Accelerate CLI to configure the Accelerate environment (Run `accelerate config` in your terminal) instead of using the `--multi_gpu and --num_processes` flags.
 
 ```bash
 accelerate launch --multi_gpu --num_processes 2 --num_machines 1 translate.py \
@@ -94,9 +84,7 @@ accelerate launch --multi_gpu --num_processes 2 --num_machines 1 translate.py \
 ```
 
 #### Automatic batch size finder:
-We will automatically find a batch size that fits in your GPU memory.
-The default initial batch size is 128 (You can set it with the `--starting_batch_size 128` flag).
-If we find an Out Of Memory error, we will automatically decrease the batch size until we find a working one.
+We will automatically find a batch size that fits in your GPU memory. The default initial batch size is 128 (You can set it with the `--starting_batch_size 128` flag). If we find an Out Of Memory error, we will automatically decrease the batch size until we find a working one.
 
 
 
````
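The automatic batch size finder described in the README text above amounts to a retry loop that shrinks the batch size whenever a run hits an out-of-memory error. A minimal, self-contained sketch of that behaviour (function names here are hypothetical, and `MemoryError` stands in for a CUDA OOM exception):

```python
def find_working_batch_size(run_batch, starting_batch_size=128):
    """Halve the batch size on OOM until a run succeeds (hypothetical sketch)."""
    batch_size = starting_batch_size
    while batch_size > 0:
        try:
            # Attempt the workload at the current batch size.
            return batch_size, run_batch(batch_size)
        except MemoryError:  # stand-in for torch.cuda.OutOfMemoryError
            batch_size //= 2  # shrink and retry
    raise RuntimeError("No batch size fits in memory")

# Simulated translation backend that only fits 32 examples at once.
def fake_translate(batch_size):
    if batch_size > 32:
        raise MemoryError
    return f"translated in batches of {batch_size}"
```

Starting from 128, the loop fails at 128 and 64, then succeeds at 32 and returns the result along with the batch size that worked.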
sample_text/en2es.m2m100_1.2B.json
CHANGED
```diff
@@ -1 +1,123 @@
-{
+{
+  "sacrebleu": {
+    "score": 32.101150640281695,
+    "counts": [
+      19160,
+      11392,
+      7558,
+      5186
+    ],
+    "totals": [
+      31477,
+      30479,
+      29481,
+      28485
+    ],
+    "precisions": [
+      60.86984147155066,
+      37.37655434889596,
+      25.636850853091822,
+      18.20607337195015
+    ],
+    "bp": 1.0,
+    "sys_len": 31477,
+    "ref_len": 30102
+  },
+  "rouge": {
+    "rouge1": [
+      [
+        0.5852396804366098,
+        0.6089057437338691,
+        0.5919486437026797
+      ],
+      [
+        0.5964621218261164,
+        0.6200342221830797,
+        0.6029705008756368
+      ],
+      [
+        0.6068321807422377,
+        0.6311106822798185,
+        0.61324805661008
+      ]
+    ],
+    "rouge2": [
+      [
+        0.3710985389559613,
+        0.38708055355385995,
+        0.3761201217327784
+      ],
+      [
+        0.3844850790869714,
+        0.40017782122170353,
+        0.38920434271970195
+      ],
+      [
+        0.3968990790506025,
+        0.41382310483690327,
+        0.4022299418726329
+      ]
+    ],
+    "rougeL": [
+      [
+        0.5351505034410595,
+        0.5564838960633809,
+        0.5410602618870524
+      ],
+      [
+        0.5457898501195475,
+        0.5677049056091881,
+        0.5519189480892548
+      ],
+      [
+        0.5575497491149766,
+        0.5787856637940312,
+        0.5630101422167583
+      ]
+    ],
+    "rougeLsum": [
+      [
+        0.5352116089085267,
+        0.5570236521823667,
+        0.5415939934790461
+      ],
+      [
+        0.5463246235983789,
+        0.5676427704754348,
+        0.5522237812823654
+      ],
+      [
+        0.5581141358005033,
+        0.5796683147249665,
+        0.5630221371759908
+      ]
+    ]
+  },
+  "bleu": {
+    "bleu": 0.2842153038526809,
+    "precisions": [
+      0.5535070989616444,
+      0.33646946844340314,
+      0.22383069265549602,
+      0.15653135365661033
+    ],
+    "brevity_penalty": 1.0,
+    "length_ratio": 1.0469217970049918,
+    "translation_length": 28314,
+    "reference_length": 27045
+  },
+  "meteor": {
+    "meteor": 0.4880039569987408
+  },
+  "ter": {
+    "score": 59.500831946755405,
+    "num_edits": 16092,
+    "ref_length": 27045.0
+  },
+  "bert_score": {
+    "precision": 0.8192488248944283,
+    "recall": 0.8262857750356197,
+    "f1": 0.8223461411595344,
+    "hashcode": "microsoft/deberta-xlarge-mnli_L40_no-idf_version=0.3.11(hug_trans=4.18.0)_fast-tokenizer"
+  }
+}
```
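The `sacrebleu` block in this file is internally consistent: the corpus score is the brevity penalty times the geometric mean of the four n-gram precisions. A quick check of that identity, with the numbers copied from the file above:

```python
import math

# Values copied from sample_text/en2es.m2m100_1.2B.json above.
precisions = [60.86984147155066, 37.37655434889596,
              25.636850853091822, 18.20607337195015]
bp = 1.0
reported_score = 32.101150640281695

# BLEU = bp * exp(mean(log p_n)) over the 1..4-gram precisions.
recomputed = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))
```

With `bp = 1.0` (no brevity penalty, since `sys_len > ref_len`), the recomputed value matches the reported score to floating-point precision.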
sample_text/en2es.m2m100_418M.json
CHANGED
```diff
@@ -1 +1,123 @@
-{
+{
+  "sacrebleu": {
+    "score": 29.035496917461597,
+    "counts": [
+      18582,
+      10514,
+      6681,
+      4387
+    ],
+    "totals": [
+      31477,
+      30479,
+      29481,
+      28485
+    ],
+    "precisions": [
+      59.033580074339994,
+      34.49588241084025,
+      22.662053525999795,
+      15.401088292083553
+    ],
+    "bp": 1.0,
+    "sys_len": 31477,
+    "ref_len": 30388
+  },
+  "rouge": {
+    "rouge1": [
+      [
+        0.5661701202298134,
+        0.5806961045770566,
+        0.5693885562082325
+      ],
+      [
+        0.5768745925790656,
+        0.5926959547911554,
+        0.5803693779677083
+      ],
+      [
+        0.5871085218904836,
+        0.6035331460243276,
+        0.5900979805085623
+      ]
+    ],
+    "rouge2": [
+      [
+        0.34243414046469267,
+        0.35226400857606666,
+        0.34469210847048837
+      ],
+      [
+        0.3545484183384055,
+        0.36470783370743065,
+        0.3569058648048812
+      ],
+      [
+        0.36612813327517263,
+        0.37717476449671,
+        0.3689653665404565
+      ]
+    ],
+    "rougeL": [
+      [
+        0.5129704896656746,
+        0.526995889564155,
+        0.5162056185006965
+      ],
+      [
+        0.523632841460358,
+        0.5375452284094455,
+        0.5267080806612512
+      ],
+      [
+        0.5350158816319085,
+        0.5480980981777757,
+        0.5372302857012781
+      ]
+    ],
+    "rougeLsum": [
+      [
+        0.5126805856827783,
+        0.5265189554049317,
+        0.5155154093959223
+      ],
+      [
+        0.5239559133309495,
+        0.5380410013947112,
+        0.5271022617246641
+      ],
+      [
+        0.5351934954578494,
+        0.5491115103854219,
+        0.5381174565735956
+      ]
+    ]
+  },
+  "bleu": {
+    "bleu": 0.2546886610724999,
+    "precisions": [
+      0.5339761248852158,
+      0.30784155806120955,
+      0.19560013678331242,
+      0.1308640025272469
+    ],
+    "brevity_penalty": 1.0,
+    "length_ratio": 1.0353982300884956,
+    "translation_length": 28314,
+    "reference_length": 27346
+  },
+  "meteor": {
+    "meteor": 0.4630996837124251
+  },
+  "ter": {
+    "score": 61.848167922182405,
+    "num_edits": 16913,
+    "ref_length": 27346.0
+  },
+  "bert_score": {
+    "precision": 0.8128397642374039,
+    "recall": 0.8185485603511333,
+    "f1": 0.8153312988877296,
+    "hashcode": "microsoft/deberta-xlarge-mnli_L40_no-idf_version=0.3.11(hug_trans=4.18.0)_fast-tokenizer"
+  }
+}
```
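Taken together, the two metric files show a consistent gap between the checkpoints: the 1.2B model scores better than the 418M model on every headline metric, including TER, where lower is better. A small comparison sketch, with the headline numbers copied from the two files above:

```python
# Headline scores copied from the two en2es metric files above.
scores = {
    "m2m100_1.2B": {"sacrebleu": 32.101150640281695, "meteor": 0.4880039569987408,
                    "bert_score_f1": 0.8223461411595344, "ter": 59.500831946755405},
    "m2m100_418M": {"sacrebleu": 29.035496917461597, "meteor": 0.4630996837124251,
                    "bert_score_f1": 0.8153312988877296, "ter": 61.848167922182405},
}

def winner(metric, lower_is_better=False):
    """Return the model name with the better score on the given metric."""
    pick = min if lower_is_better else max
    return pick(scores, key=lambda model: scores[model][metric])
```

For example, `winner("sacrebleu")` and `winner("ter", lower_is_better=True)` both pick the 1.2B checkpoint.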