Iker committed
Commit 64b847c
1 Parent(s): 9dcafee

SeamlessM4T examples

README.md CHANGED
@@ -14,7 +14,7 @@
 
  Easy-Translate is a script for translating large text files with a 💥SINGLE COMMAND💥. Easy-Translate is designed to be as easy as possible for **beginners** and as **seamless** and **customizable** as possible for advanced users.
  We support almost any model, including [M2M100](https://arxiv.org/pdf/2010.11125.pdf),
- [NLLB200](https://research.facebook.com/publications/no-language-left-behind/),
+ [NLLB200](https://research.facebook.com/publications/no-language-left-behind/), [SeamlessM4T](https://dl.fbaipublicfiles.com/seamless/seamless_m4t_paper.pdf),
  [LLaMA](https://ai.facebook.com/blog/large-language-model-llama-meta-ai/),
  [Bloom](https://bigscience.notion.site/BLOOM-BigScience-176B-Model-ad073ca07cdf479398d5f95d88e218c4) and more 🥳.
  We also provide a [script](#evaluate-translations) for Easy-Evaluation of your translations 📋
@@ -39,7 +39,7 @@ We currently support:
 
  ## Supported Models
 
- 💥 EasyTranslate now supports any Seq2SeqLM (m2m100, nllb200, small100, mbart, MarianMT, T5, FlanT5, etc.) and any CausalLM (GPT2, LLaMA, Vicuna, Falcon) model from 🤗 Hugging Face's Hub!!
+ 💥 EasyTranslate now supports any Seq2SeqLM (m2m100, nllb200, SeamlessM4T, small100, mbart, MarianMT, T5, FlanT5, etc.) and any CausalLM (GPT2, LLaMA, Vicuna, Falcon, etc.) model from 🤗 Hugging Face's Hub!!
  We still recommend you to use M2M100, NLLB200 or SeamlessM4T for the best results, but you can experiment with any other MT model, as well as prompting LLMs to generate translations (See [Prompting Section](#prompting) for more details).
  You can also see [the examples folder](examples) for examples of how to use EasyTranslate with different models.
 
examples/SeamlessM4T-large_4bit.sh ADDED
@@ -0,0 +1,13 @@
+ # Run the seamless-m4t-large model on sample text. This model requires a GPU with a lot of VRAM, so we use
+ # 4-bit quantization to reduce the required VRAM so that it fits on consumer-grade GPUs. If you have a GPU
+ # with a lot of VRAM, running the model in BF16 should be faster and produce slightly better results;
+ # see examples/SeamlessM4T-large_bf16.sh
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.seamless-m4t-large.txt \
+ --source_lang eng \
+ --target_lang spa \
+ --model_name facebook/hf-seamless-m4t-large \
+ --precision 4 \
+ --starting_batch_size 8
examples/SeamlessM4T-large_bf16.sh ADDED
@@ -0,0 +1,11 @@
+ # Run the seamless-m4t-large model on sample text. We use BF16 precision, which requires a GPU with a lot of VRAM (e.g., NVIDIA A100).
+ # To run this model on consumer-grade GPUs, use 4-bit quantization; see examples/SeamlessM4T-large_4bit.sh
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.seamless-m4t-large.txt \
+ --source_lang eng \
+ --target_lang spa \
+ --model_name facebook/hf-seamless-m4t-large \
+ --precision bf16 \
+ --starting_batch_size 8
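
The two SeamlessM4T-large scripts above differ only in `--precision` (`4` versus `bf16`), so the choice could be made from the amount of GPU memory available. The following is a minimal sketch, not part of this commit: the helper names `choose_precision` and `detect_vram_mib` are hypothetical, and the 40000 MiB threshold is an illustrative assumption rather than a measured requirement of the model. The sketch only prints the resulting command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: pick a --precision value from the GPU memory size (MiB).
# The default threshold of 40000 MiB is an assumption for illustration only.
choose_precision() {
    vram_mib="$1"
    threshold_mib="${2:-40000}"
    if [ "$vram_mib" -ge "$threshold_mib" ]; then
        echo "bf16"
    else
        echo "4"
    fi
}

# Hypothetical helper: total memory of the first GPU in MiB.
# Prints 0 when nvidia-smi is not installed.
detect_vram_mib() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>/dev/null \
            | head -n 1 | tr -d ' ' || true
    else
        echo 0
    fi
}

vram="$(detect_vram_mib)"
# Treat empty or non-numeric output (for example a failed query) as 0 MiB.
case "$vram" in
    ''|*[!0-9]*) vram=0 ;;
esac
precision="$(choose_precision "$vram")"

# Dry run: print the command that would be executed.
echo "python3 translate.py --sentences_path sample_text/en.txt --output_path sample_text/en2es.translation.seamless-m4t-large.txt --source_lang eng --target_lang spa --model_name facebook/hf-seamless-m4t-large --precision ${precision} --starting_batch_size 8"
```

On a machine without a detectable NVIDIA GPU the sketch falls back to `--precision 4`, the lower-memory option of the two scripts.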
examples/SeamlessM4T-medium.sh ADDED
@@ -0,0 +1,8 @@
+ # Run seamless-m4t-medium model on sample text. One GPU, default precision.
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.seamless-m4t-medium.txt \
+ --source_lang eng \
+ --target_lang spa \
+ --model_name facebook/hf-seamless-m4t-medium
examples/SeamlessM4T-medium_2GPUS.sh ADDED
@@ -0,0 +1,8 @@
+ # Run seamless-m4t-medium on sample text. Multi GPU, default precision.
+
+ accelerate launch --multi_gpu --num_processes 2 --num_machines 1 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.seamless-m4t-medium.txt \
+ --source_lang eng \
+ --target_lang spa \
+ --model_name facebook/hf-seamless-m4t-medium
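
The single-GPU and 2-GPU scripts for seamless-m4t-medium differ only in the launcher (`python3` versus `accelerate launch --multi_gpu --num_processes 2 --num_machines 1`). As a rough sketch, and not something this commit provides, the launcher could be derived from the number of visible GPUs. The helpers `count_gpus` and `build_launcher` are hypothetical names, and the sketch assumes a single machine; it prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: count usable GPUs on this machine.
# CUDA_VISIBLE_DEVICES is checked first because `nvidia-smi -L` lists all
# GPUs regardless of that variable.
count_gpus() {
    if [ -n "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | grep -c . || true
    elif command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L 2>/dev/null | grep -c '^GPU ' || true
    else
        echo 0
    fi
}

# Hypothetical helper: choose the launcher prefix for a given GPU count.
# More than one GPU uses the accelerate invocation from the 2-GPU example.
build_launcher() {
    n_gpus="$1"
    if [ "$n_gpus" -gt 1 ]; then
        echo "accelerate launch --multi_gpu --num_processes ${n_gpus} --num_machines 1 translate.py"
    else
        echo "python3 translate.py"
    fi
}

gpus="$(count_gpus)"

# Dry run: print the command that would be executed.
echo "$(build_launcher "$gpus") --sentences_path sample_text/en.txt --output_path sample_text/en2es.translation.seamless-m4t-medium.txt --source_lang eng --target_lang spa --model_name facebook/hf-seamless-m4t-medium"
```

With zero or one GPU the sketch falls back to the plain `python3 translate.py` form used in the single-GPU example.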