fhieni committed on
Commit
6b04967
1 Parent(s): f243e77

Update README.md

Files changed (1)
  1. README.md +8 -70
README.md CHANGED
@@ -1,70 +1,8 @@
- ---
- library_name: fairseq
- task: text-to-speech
- tags:
- - fairseq
- - audio
- - text-to-speech
- language: vi
- datasets:
- - common_voice
- widget:
- - text: "Xin chào, đây là một cuộc chạy thử nghiệm."
-   example_title: "Hello, this is a test run."
- ---
- # tts_transformer-vi-cv7
-
- [Transformer](https://arxiv.org/abs/1809.08895) text-to-speech model from fairseq S^2 ([paper](https://arxiv.org/abs/2109.06912)/[code](https://github.com/pytorch/fairseq/tree/main/examples/speech_synthesis)):
- - Vietnamese
- - Single-speaker male voice
- - Trained on [Common Voice v7](https://commonvoice.mozilla.org/en/datasets)
-
- ## Usage
-
- ```python
- from fairseq.checkpoint_utils import load_model_ensemble_and_task_from_hf_hub
- from fairseq.models.text_to_speech.hub_interface import TTSHubInterface
- import IPython.display as ipd
-
-
- models, cfg, task = load_model_ensemble_and_task_from_hf_hub(
-     "facebook/tts_transformer-vi-cv7",
-     arg_overrides={"vocoder": "hifigan", "fp16": False}
- )
- model = models[0]
- TTSHubInterface.update_cfg_with_data_cfg(cfg, task.data_cfg)
- generator = task.build_generator(model, cfg)
-
- text = "Xin chào, đây là một cuộc chạy thử nghiệm."
-
- sample = TTSHubInterface.get_model_input(task, text)
- wav, rate = TTSHubInterface.get_prediction(task, model, generator, sample)
-
- ipd.Audio(wav, rate=rate)
- ```
-
- See also [fairseq S^2 example](https://github.com/pytorch/fairseq/blob/main/examples/speech_synthesis/docs/common_voice_example.md).
-
- ## Citation
-
- ```bibtex
- @inproceedings{wang-etal-2021-fairseq,
-     title = "fairseq S{\^{}}2: A Scalable and Integrable Speech Synthesis Toolkit",
-     author = "Wang, Changhan and
-       Hsu, Wei-Ning and
-       Adi, Yossi and
-       Polyak, Adam and
-       Lee, Ann and
-       Chen, Peng-Jen and
-       Gu, Jiatao and
-       Pino, Juan",
-     booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
-     month = nov,
-     year = "2021",
-     address = "Online and Punta Cana, Dominican Republic",
-     publisher = "Association for Computational Linguistics",
-     url = "https://aclanthology.org/2021.emnlp-demo.17",
-     doi = "10.18653/v1/2021.emnlp-demo.17",
-     pages = "143--152",
- }
- ```
 
+ title: Vietnam VITS Male Voice TTS
+ colorFrom: red
+ colorTo: gray
+ sdk: gradio
+ sdk_version: 3.40.1
+ app_file: app.py
+ pinned: false
+ license: cc-by-sa-4.0