Dionyssos committed on
Commit c8dd13e · 1 Parent(s): d72b2c3

TTS & sound scene runs

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. README.md +97 -18
  2. assets/{mimic3_foreign → wavs/mimic3_foreign}/af_ZA_google-nwu_0184.wav +0 -0
  3. assets/{mimic3_foreign → wavs/mimic3_foreign}/af_ZA_google-nwu_1919.wav +0 -0
  4. assets/{mimic3_foreign → wavs/mimic3_foreign}/af_ZA_google-nwu_2418.wav +0 -0
  5. assets/{mimic3_foreign → wavs/mimic3_foreign}/af_ZA_google-nwu_6590.wav +0 -0
  6. assets/{mimic3_foreign → wavs/mimic3_foreign}/af_ZA_google-nwu_7130.wav +0 -0
  7. assets/{mimic3_foreign → wavs/mimic3_foreign}/af_ZA_google-nwu_7214.wav +0 -0
  8. assets/{mimic3_foreign → wavs/mimic3_foreign}/af_ZA_google-nwu_8148.wav +0 -0
  9. assets/{mimic3_foreign → wavs/mimic3_foreign}/af_ZA_google-nwu_8924.wav +0 -0
  10. assets/{mimic3_foreign → wavs/mimic3_foreign}/af_ZA_google-nwu_8963.wav +0 -0
  11. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_00737.wav +0 -0
  12. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_00779.wav +0 -0
  13. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_01232.wav +0 -0
  14. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_01701.wav +0 -0
  15. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_02194.wav +0 -0
  16. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_03042.wav +0 -0
  17. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_0834.wav +0 -0
  18. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_1010.wav +0 -0
  19. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_3108.wav +0 -0
  20. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_3713.wav +0 -0
  21. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_3958.wav +0 -0
  22. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_4046.wav +0 -0
  23. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_4811.wav +0 -0
  24. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_5958.wav +0 -0
  25. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_9169.wav +0 -0
  26. assets/{mimic3_foreign → wavs/mimic3_foreign}/bn_multi_rm.wav +0 -0
  27. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_m-ailabs_angela_merkel.wav +0 -0
  28. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_m-ailabs_eva_k.wav +0 -0
  29. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_m-ailabs_karlsson.wav +0 -0
  30. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_m-ailabs_ramona_deininger.wav +0 -0
  31. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_m-ailabs_rebecca_braunert_plunkett.wav +0 -0
  32. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_thorsten-emotion_amused.wav +0 -0
  33. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_thorsten-emotion_angry.wav +0 -0
  34. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_thorsten-emotion_disgusted.wav +0 -0
  35. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_thorsten-emotion_drunk.wav +0 -0
  36. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_thorsten-emotion_neutral.wav +0 -0
  37. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_thorsten-emotion_sleepy.wav +0 -0
  38. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_thorsten-emotion_surprised.wav +0 -0
  39. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_thorsten-emotion_whisper.wav +0 -0
  40. assets/{mimic3_foreign → wavs/mimic3_foreign}/de_DE_thorsten.wav +0 -0
  41. assets/{mimic3_foreign → wavs/mimic3_foreign}/el_GR_rapunzelina.wav +0 -0
  42. assets/{mimic3_foreign → wavs/mimic3_foreign}/es_ES_carlfm.wav +0 -0
  43. assets/{mimic3_foreign → wavs/mimic3_foreign}/es_ES_m-ailabs_karen_savage.wav +0 -0
  44. assets/{mimic3_foreign → wavs/mimic3_foreign}/es_ES_m-ailabs_tux.wav +0 -0
  45. assets/{mimic3_foreign → wavs/mimic3_foreign}/es_ES_m-ailabs_victor_villarraza.wav +0 -0
  46. assets/{mimic3_foreign → wavs/mimic3_foreign}/fa_haaniye.wav +0 -0
  47. assets/{mimic3_foreign → wavs/mimic3_foreign}/fi_FI_harri-tapani-ylilammi.wav +0 -0
  48. assets/{mimic3_foreign → wavs/mimic3_foreign}/fr_FR_m-ailabs_bernard.wav +0 -0
  49. assets/{mimic3_foreign → wavs/mimic3_foreign}/fr_FR_m-ailabs_ezwa.wav +0 -0
  50. assets/{mimic3_foreign → wavs/mimic3_foreign}/fr_FR_m-ailabs_gilles_g_le_blanc.wav +0 -0
README.md CHANGED
@@ -1,29 +1,108 @@
- ---
- license: cc-by-nc-sa-4.0
- language:
- - en
- pipeline_tag: text-to-speech
- tags:
- - audio
- - speech
- - styletts2
- - mimic3
- - speech-emotion-recognition
- - dkounadis
- ---
 
- # Artificial StyleTTS2
 
- Using Mimic-3 Synthetic Speech to Drive StyleTTS2
 
 
 
- **[Listen to Samples](https://huggingface.co/dkounadis/artificial-styletts2/discussions/2)**
 
  ```
 
  ```
 
 
-
- See demo at [SHIFT TTS tool](https://github.com/audeering/shift/tree/main)
+ ----
+ -license:
+ -language:
+ -- en
+ -pipeline_tag: text-to-speech
+ -tags:
+ -- audio
+ -- speech
+ -- styletts2
+ -- mimic3
+ -- dkounadis
+ ----
 
 
+ # Text & Video to Affective Speech
 
+ Affective TTS System for [SHIFT Horizon](https://shift-europe.eu/). Synthesizes affective speech from plain text or subtitles (.srt) & overlays it on videos.
+ - Has 134 built-in voices available, tuned for [StyleTTS2](https://github.com/yl4579/StyleTTS2). Has optional support for [foreign languages](https://github.com/MycroftAI/mimic3-voices) via [mimic3](https://huggingface.co/mukowaty/mimic3-voices/tree/main/voices).
 
+ ### Available Voices
 
+ <a href="https://audeering.github.io/shift/">Listen to available voices!</a>
 
+ ## Flask API
+
+ Install
+
+ ```
+ virtualenv --python=python3 ~/.envs/.my_env
+ source ~/.envs/.my_env/bin/activate
+ cd shift/
+ pip install -r requirements.txt
+ ```
+
+ Start Flask
+
+ ```
+ CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=./hf_home CUDA_VISIBLE_DEVICES=2 python api.py
+ ```
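
The launch line sets three environment variables before starting `api.py`. As a rough sketch (not part of the repository), the same launch could be assembled from Python. The helper names `build_api_env` and `launch_api` are hypothetical; only the variable names and values shown in the README come from the source.

```python
import os
import subprocess
import sys


def build_api_env(gpu_index=2, hf_home="./hf_home", base_env=None):
    """Return a copy of the environment with the variables from the README launch line."""
    env = dict(os.environ if base_env is None else base_env)
    # Enumerate GPUs in PCI bus order, the same ordering nvidia-smi uses.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Expose only the selected GPU to the process.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    # Directory where Hugging Face libraries cache downloaded models.
    env["HF_HOME"] = hf_home
    return env


def launch_api(script="api.py", **env_kwargs):
    """Start the server script as a child process (hypothetical helper)."""
    return subprocess.Popen([sys.executable, script], env=build_api_env(**env_kwargs))
```

Calling `launch_api(gpu_index=0)` from inside the `shift/` directory would start the server on the first GPU, assuming the virtual environment from the install step is active.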
+
+ ## Inference
+
+ The following commands need `api.py` to be running, e.g. `.. on computeXX`.
+
+ **Text 2 Speech**
+
+ ```python
+ # Basic TTS - See Available Voices
+ python tts.py --text sample.txt --voice "en_US/m-ailabs_low#mary_ann" --affective
+
+ # voice cloning
+ python tts.py --text sample.txt --native assets/native_voice.wav
+ ```
+
+ **Image 2 Video**
+
+ ```python
+ # Make video narrating an image - All TTS args above also apply here!
+ python tts.py --text sample.txt --image assets/image_from_T31.jpg
+ ```
+
+ **Video 2 Video**
+
+ ```python
+ # Video Dubbing - from time-stamped subtitles (.srt)
+ python tts.py --text assets/head_of_fortuna_en.srt --video assets/head_of_fortuna.mp4
+
+ # Video narration - from text description (.txt)
+ python tts.py --text assets/head_of_fortuna_GPT.txt --video assets/head_of_fortuna.mp4
  ```
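
The inference commands differ only in which flags are passed to `tts.py`. The sketch below is one possible way to assemble them from Python for batch runs. The function `build_tts_command` is hypothetical and uses only the flags that appear in the README (`--text`, `--voice`, `--affective`, `--native`, `--image`, `--video`); it makes no claim about other options `tts.py` may accept.

```python
import sys


def build_tts_command(text, voice=None, affective=False, native=None,
                      image=None, video=None, script="tts.py"):
    """Assemble an argv list for tts.py using only the flags shown in the README."""
    cmd = [sys.executable, script, "--text", text]
    if voice:
        cmd += ["--voice", voice]        # built-in voice name
    if affective:
        cmd.append("--affective")        # boolean switch, takes no value
    if native:
        cmd += ["--native", native]      # reference wav for voice cloning
    if image:
        cmd += ["--image", image]        # narrate a still image
    if video:
        cmd += ["--video", video]        # dub or narrate an existing video
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell quoting issues with voice names such as `en_US/m-ailabs_low#mary_ann`.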
 
+ ## Examples
+
+ Native voice video
+
+ [![Native voice ANBPR video](assets/native_video_thumb.png)](https://www.youtube.com/watch?v=tmo2UbKYAqc)
+
+ ##
+
+ Same video where Native voice is replaced with English TTS voice with similar emotion
+
+
+ [![Same video w. Native voice replaced with English TTS](assets/tts_video_thumb.png)](https://www.youtube.com/watch?v=geI1Vqn4QpY)
+
+
+ ## Video Dubbing
+
+ [![Review demo SHIFT](assets/review_demo_thumb.png)](https://www.youtube.com/watch?v=bpt7rOBENcQ)
+
+ Generate dubbed video:
+
+
+ ```python
+ python tts.py --text assets/head_of_fortuna_en.srt --video assets/head_of_fortuna.mp4
+
  ```
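
Dubbing relies on the time stamps in the `.srt` file to place each synthesized utterance. To illustrate what those time stamps look like once parsed, here is a minimal sketch of an SRT reader. It is an assumption-laden illustration, not the parser `tts.py` uses, and the name `parse_srt` is hypothetical.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,500".
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h, m, s, ms):
    """Convert hour/minute/second/millisecond strings to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def parse_srt(content):
    """Return a list of (start_seconds, end_seconds, text) cues from SRT text."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                parts = match.groups()
                # Everything after the timing line is the subtitle text.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append((_seconds(*parts[:4]), _seconds(*parts[4:]), text))
                break
    return cues
```

Each cue's start and end times bound the window in which the corresponding speech segment would need to fit in the dubbed video.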
 
 
+ ## Joint Application of D3.1 & D3.2
+
+ [![Captions To Video](assets/caption_to_video_thumb.png)](https://youtu.be/wWC8DpOKVvQ)
+
+ From an image with caption(s) create a video:
+
+ ```python
+
+ python tts.py --text sample.txt --image assets/image_from_T31.jpg
+ ```
The 49 `.wav` files listed above were moved from `assets/mimic3_foreign/` to `assets/wavs/mimic3_foreign/`; each is marked RENAMED, with no content changes.