Update README.md
README.md
CHANGED
```diff
@@ -229,8 +229,6 @@ language_details: >-
 license: mit
 metrics:
 - bleu
-datasets:
-- mozilla-foundation/common_voice_8_0
 pipeline_tag: automatic-speech-recognition
 tags:
 - zeroswot
```
```diff
@@ -265,7 +263,7 @@ The compression module is a light-weight transformer that takes as input the hid
 
 ## Version
 
-This version of ZeroSwot is trained with ASR data from
+This version of ZeroSwot is trained with ASR data from MuST-C v1.0, and adapts [wav2vec2.0-large](https://huggingface.co/facebook/wav2vec2-large-960h-lv60-self) to the [nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) model.
 
 We have more versions available:
 
```
```diff
@@ -305,9 +303,9 @@ processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-large-960h-lv60
 tokenizer = NllbTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
 
 # Load ZeroSwot Encoder
-commit_hash = "
+commit_hash = "30d17145fd8e040430bbfcf74a011070fa83debd"
 zeroswot_encoder = AutoModel.from_pretrained(
-    "johntsi/ZeroSwot-Medium_asr-
+    "johntsi/ZeroSwot-Medium_asr-mustc_en-to-200", trust_remote_code=True, revision=commit_hash,
 )
 zeroswot_encoder.eval()
 zeroswot_encoder.to("cuda")
```
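The hunk above only covers loading the encoder; the rest of the README's snippet (the wav2vec2.0 processor, the NLLB tokenizer, and the final `print(translation)`) appears only in the diff context. The sketch below is a minimal, hedged reconstruction of how the updated lines are typically wired up end-to-end. The forward signature of `zeroswot_encoder`, the use of `AutoModelForSeq2SeqLM` for NLLB decoding, the audio-loading step, and the target-language code (`deu_Latn`) are illustrative assumptions, not lines taken from the model card.

```python
# Minimal sketch; assumptions are marked in the comments, not copied from the model card.
import torch
import torchaudio
from transformers import (
    AutoModel,
    AutoModelForSeq2SeqLM,
    NllbTokenizer,
    Wav2Vec2Processor,
)

# Components visible in the diff context above.
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-large-960h-lv60-self")
tokenizer = NllbTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")

# Load the ZeroSwot encoder, pinned to the commit referenced in this update.
commit_hash = "30d17145fd8e040430bbfcf74a011070fa83debd"
zeroswot_encoder = AutoModel.from_pretrained(
    "johntsi/ZeroSwot-Medium_asr-mustc_en-to-200", trust_remote_code=True, revision=commit_hash,
)
zeroswot_encoder.eval().to("cuda")

# Assumption: the speech embeddings are decoded with the standard NLLB checkpoint.
nllb_model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")
nllb_model.eval().to("cuda")

# Assumption: 16 kHz mono audio; "path/to/audio.wav" is a placeholder path.
waveform, _ = torchaudio.load("path/to/audio.wav")
inputs = processor(waveform.squeeze(), sampling_rate=16000, return_tensors="pt").to("cuda")

with torch.no_grad():
    # Assumption: the remote-code encoder returns speech embeddings plus an attention mask.
    speech_embeds, attention_mask = zeroswot_encoder(**inputs)
    generated_ids = nllb_model.generate(
        inputs_embeds=speech_embeds,
        attention_mask=attention_mask,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids("deu_Latn"),  # German as example target
        num_beams=5,
    )

translation = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(translation)
```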
```diff
@@ -335,14 +333,24 @@ print(translation)
 
 ## Results
 
-BLEU scores on CoVoST-2 test compared to
+BLEU scores on CoVoST-2 test compared to _supervised_ SOTA models from the literature. You can refer to Table 5 of the Results section in the paper for more details.
+
+| Models | ZS | Size (B) | De | Es | Fr | It | Nl | Pt | Ro | Ru | Average |
+|:-----------------------:|:----:|:----------:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:-------:|
+| Chimera (Han et al., 2021) | ✗ | 0.15 | 27.1 | 30.6 | 35.6 | 25.0 | 29.2 | 30.2 | 24.0 | 17.4 | 27.4 |
+| STEMM (Fang et al., 2022) | ✗ | 0.15 | 28.7 | 31.0 | 37.4 | 25.8 | 30.5 | 31.7 | 24.5 | 17.8 | 28.4 |
+| SpeechUT (Zhang et al., 2022) | ✗ | 0.15 | 30.1 | 33.6 | 41.4 | - | - | - | - | - | - |
+| Siamese-PT (Le et al., 2023) | ✗ | 0.25 | 27.9 | 31.8 | 39.2 | 27.7 | 31.7 | 34.2 | 27.0 | 18.5 | 29.8 |
+| CRESS (Fang and Feng, 2023) | ✗ | 0.15 | 29.4 | 33.2 | 40.1 | 27.6 | 32.2 | 33.6 | 26.4 | 19.7 | 30.3 |
+| SimRegCR (Gao et al., 2023b) | ✗ | 0.15 | 29.2 | 33.0 | 40.0 | 28.2 | 32.7 | 34.2 | 26.7 | 20.1 | 30.5 |
+| LST (LLaMA2-13B) (Zhang et al., 2023) | ✗ | 13 | 30.4 | 35.3 | **41.6** | - | - | - | - | - | - |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| [ZeroSwot-Medium_asr-cv](https://huggingface.co/johntsi/ZeroSwot-Medium_asr-cv_en-to-200) | ✓ | 0.35/0.95 | 24.8 | 30.0 | 32.6 | 24.1 | 28.6 | 28.8 | 22.9 | 16.4 | 26.0 |
+| [ZeroSwot-Medium_asr-mustc](https://huggingface.co/johntsi/ZeroSwot-Medium_asr-mustc_en-to-200) | ✓ | 0.35/0.95 | 28.5 | 33.1 | 37.5 | 28.2 | 32.3 | 32.9 | 26.0 | 18.7 | 29.6 |
+| [ZeroSwot-Medium_asr-mustc_mt-mustc](https://huggingface.co/johntsi/ZeroSwot-Medium_asr-mustc_mt-mustc_en-to-8) | ✓ | 0.35/0.95† | 30.5 | 34.9 | 39.4 | 30.6 | 35.0 | 37.1 | 27.8 | 20.3 | 31.9 |
+| [ZeroSwot-Large_asr-cv](https://huggingface.co/johntsi/ZeroSwot-Large_asr-cv_en-to-200) | ✓ | 0.35/1.65 | 26.5 | 31.1 | 33.5 | 25.4 | 29.9 | 30.6 | 24.3 | 18.0 | 27.4 |
+| [ZeroSwot-Large_asr-mustc](https://huggingface.co/johntsi/ZeroSwot-Large_asr-mustc_en-to-200) | ✓ | 0.35/1.65 | 30.1 | 34.8 | 38.9 | 29.8 | 34.4 | 35.3 | 27.6 | 20.4 | 31.4 |
+| [ZeroSwot-Large_asr-mustc_mt-mustc](https://huggingface.co/johntsi/ZeroSwot-Large_asr-mustc_mt-mustc_en-to-8) | ✓ | 0.35/1.65† | **31.2** | **35.8** | 40.5 | **31.4** | **36.3** | **38.3** | **28.0** | **21.5** | **32.9** |
 
 ## Citation
 
```