reach-vb (HF staff) committed
Commit 9067aa2
1 Parent(s): 00be8f4

Update README.md

Files changed (1):
  1. README.md +19 -19
README.md CHANGED
@@ -37,9 +37,9 @@ We provide extensive evaluation results of SeamlessM4T-Medium and SeamlessM4T-La
 First, load the processor and a checkpoint of the model:
 
 ```python
-from transformers import AutoProcessor, SeamlessM4TModel
-processor = AutoProcessor.from_pretrained("facebook/hf-seamless-m4t-large")
-model = SeamlessM4TModel.from_pretrained("facebook/hf-seamless-m4t-large")
+>>> from transformers import AutoProcessor, SeamlessM4TModel
+>>> processor = AutoProcessor.from_pretrained("facebook/hf-seamless-m4t-large")
+>>> model = SeamlessM4TModel.from_pretrained("facebook/hf-seamless-m4t-large")
 ```
 
 You can seamlessly use this model on text or on audio, to generated either translated text or translated audio.
@@ -47,14 +47,14 @@ You can seamlessly use this model on text or on audio, to generated either trans
 Here is how to use the processor to process text and audio:
 
 ```python
-# let's load an audio sample from an Arabic speech corpus
-from datasets import load_dataset
-dataset = load_dataset("arabic_speech_corpus", split="test", streaming=True)
-audio_sample = next(iter(dataset))["audio"]
-# now, process it
-audio_inputs = processor(audios=audio_sample["array"], return_tensors="pt")
-# now, process some English test as well
-text_inputs = processor(text = "Hello, my dog is cute", src_lang="eng", return_tensors="pt")
+>>> # let's load an audio sample from an Arabic speech corpus
+>>> from datasets import load_dataset
+>>> dataset = load_dataset("arabic_speech_corpus", split="test", streaming=True)
+>>> audio_sample = next(iter(dataset))["audio"]
+>>> # now, process it
+>>> audio_inputs = processor(audios=audio_sample["array"], return_tensors="pt")
+>>> # now, process some English test as well
+>>> text_inputs = processor(text = "Hello, my dog is cute", src_lang="eng", return_tensors="pt")
 ```
 
 
@@ -63,8 +63,8 @@ text_inputs = processor(text = "Hello, my dog is cute", src_lang="eng", return_t
 [`SeamlessM4TModel`](https://huggingface.co/docs/transformers/main/en/model_doc/seamless_m4t#transformers.SeamlessM4TModel) can *seamlessly* generate text or speech with few or no changes. Let's target Russian voice translation:
 
 ```python
-audio_array_from_text = model.generate(**text_inputs, tgt_lang="rus")[0].cpu().numpy().squeeze()
-audio_array_from_audio = model.generate(**audio_inputs, tgt_lang="rus")[0].cpu().numpy().squeeze()
+>>> audio_array_from_text = model.generate(**text_inputs, tgt_lang="rus")[0].cpu().numpy().squeeze()
+>>> audio_array_from_audio = model.generate(**audio_inputs, tgt_lang="rus")[0].cpu().numpy().squeeze()
 ```
 
 With basically the same code, I've translated English text and Arabic speech to Russian speech samples.
@@ -75,12 +75,12 @@ Similarly, you can generate translated text from audio files or from text with t
 This time, let's translate to French.
 
 ```python
-# from audio
-output_tokens = model.generate(**audio_inputs, tgt_lang="fra", generate_speech=False)
-translated_text_from_audio = processor.decode(output_tokens[0].tolist()[0], skip_special_tokens=True)
-# from text
-output_tokens = model.generate(**text_inputs, tgt_lang="fra", generate_speech=False)
-translated_text_from_text = processor.decode(output_tokens[0].tolist()[0], skip_special_tokens=True)
+>>> # from audio
+>>> output_tokens = model.generate(**audio_inputs, tgt_lang="fra", generate_speech=False)
+>>> translated_text_from_audio = processor.decode(output_tokens[0].tolist()[0], skip_special_tokens=True)
+>>> # from text
+>>> output_tokens = model.generate(**text_inputs, tgt_lang="fra", generate_speech=False)
+>>> translated_text_from_text = processor.decode(output_tokens[0].tolist()[0], skip_special_tokens=True)
 ```
 
 
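Beyond the diff itself, a quick usage note: the `model.generate` calls in the snippets above return raw waveform arrays, so a typical next step is to write them to disk. The sketch below is not part of this commit; it assumes the variables from the README snippets, that `scipy` is installed, and that the model config exposes a `sampling_rate` attribute (SeamlessM4T produces 16 kHz speech). The output file names are made up.

```python
# Illustrative only, not part of the commit: save the generated Russian speech to WAV files.
# Assumes `model`, `audio_array_from_text` and `audio_array_from_audio` exist as in the
# README snippets above and that scipy is installed.
import scipy.io.wavfile

# SeamlessM4T's vocoder outputs 16 kHz audio; fall back to 16000 if the config
# does not expose a `sampling_rate` attribute (assumption).
sample_rate = getattr(model.config, "sampling_rate", 16000)

# Hypothetical file names chosen for this example.
scipy.io.wavfile.write("speech_from_text.wav", rate=sample_rate, data=audio_array_from_text)
scipy.io.wavfile.write("speech_from_audio.wav", rate=sample_rate, data=audio_array_from_audio)
```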