Spaces: sekhan/demo-lux-tts

sekhan committed on Jun 16

Commit f76c7fa • 1 Parent(s): 7212b60

add app file

Files changed (3)

README.md +38 -5
app.py +54 -0
requirements.txt +4 -0

README.md CHANGED

@@ -1,8 +1,8 @@
 ---
-title: Demo Lux Tts
-emoji: 🌍
-colorFrom: blue
-colorTo: yellow
+title: Demo Lux Piper Tts
+emoji: 🌖
+colorFrom: pink
+colorTo: blue
 sdk: gradio
 sdk_version: 4.36.1
 app_file: app.py
@@ -10,4 +10,37 @@ pinned: false
 license: cc-by-nc-sa-4.0
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# Luxembourgish Text-to-Speech Synthesizer
+This is a text-to-speech synthesizer that uses the Piper TTS model to synthesize Luxembourgish text into speech. The project includes a Gradio interface that allows users to enter Luxembourgish text and hear the synthesized speech.
+## Demo
+You can try out the demo of this project on Hugging Face Spaces [here](https://huggingface.co/spaces/sekhan/demo-lux-piper-tts).
+## Usage
+To use the project, simply enter some Luxembourgish text into the input field and click the "Synthesize" button to hear the synthesized speech.
+## Files
+The project includes the following files:
+- `app.py`: the main Python script that contains the code for the Gradio interface and the text-to-speech synthesizer.
+- `requirements.txt`: a list of the necessary libraries and their versions.
+- `.gitignore`: a file that excludes certain files and directories from being tracked by Git.
+- `lu_rtl_high3000.onnx` and `lu_rtl_high3000.onnx.json`: the Piper TTS model and configuration files for Luxembourgish.
+## License
+This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
+## Acknowledgements
+The Luxembourgish voice was trained on the `rtl.lu : 1257 luxembourgish male samples (© RTL-CLT-UFA)` subset of the `mbarnig/lb-de-fr-en-pt-12800-TTS-CORPUS` dataset.
+## Disclaimer
+This project is a demo and is not intended for production use. The text data entered into the interface is not saved and is only used for the speech synthesis.

app.py ADDED

@@ -0,0 +1,54 @@
+import os
+from huggingface_hub import hf_hub_download
+import gradio as gr
+from piper import PiperVoice
+from io import BytesIO
+import wave
+import numpy as np
+def text_to_speech(text):
+    # Load voice data
+    model_path = hf_hub_download(repo_id="sekhan/luxembourgish-voice",
+                                    repo_type='dataset',
+                                    filename="high/lu_rtl_high3239.onnx",
+                                    token=os.environ['HF_TOKEN'])
+    config_path = hf_hub_download(repo_id="sekhan/luxembourgish-voice",
+                                    repo_type='dataset',
+                                    filename="high/lu_rtl_high3239.onnx.json",
+                                    token=os.environ['HF_TOKEN'])
+    # Load Lux. voice
+    voice = PiperVoice.load(model_path, config_path)
+    buffer = BytesIO()
+    with wave.open(buffer, 'wb') as wav_file:
+        wav_file.setframerate(voice.config.sample_rate)
+        wav_file.setsampwidth(2)
+        wav_file.setnchannels(1)
+        voice.synthesize(text, wav_file, sentence_silence=0.5, length_scale=1.1, noise_scale=0.75)
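+    # Decode the in-memory WAV back to PCM frames so the header is not read as audio
+    buffer.seek(0)
+    with wave.open(buffer, 'rb') as wav_in:
+        frames = wav_in.readframes(wav_in.getnframes())
+    audio_data = np.frombuffer(frames, dtype=np.int16)
+    # gr.Audio(type="numpy") expects a (sample_rate, samples) tuple
+    return (voice.config.sample_rate, audio_data), None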
+# Gradio Interface
+with gr.Blocks(theme=gr.themes.Base(), css="footer {visibility: hidden}") as blocks:
+    gr.Markdown("# Luxembourgish Text-to-Speech Synthesizer")
+    gr.Markdown("Enter Luxembourgish text to synthesize it into speech. This is a very early demo. Your text is not saved and is used only for speech synthesis.")
+    input_text = gr.Textbox(label="Input Text", max_lines=3, placeholder="Enter text here...")
+    submit_button = gr.Button("Synthesize")
+    output_audio = gr.Audio(label="Synthesized Speech", type="numpy", show_download_button=False)
+    output_text = gr.Textbox(label="Output Text", visible=False)
+    def process_and_output(text):
+        audio, message = text_to_speech(text)
+        if message:
+            return audio, message
+        else:
+            return audio, None
+    submit_button.click(process_and_output, inputs=input_text, outputs=[output_audio, output_text])
+blocks.launch()
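
Note on the audio return value: a `gr.Audio` output with `type="numpy"` takes a `(sample_rate, samples)` tuple, so the in-memory WAV written in `text_to_speech` has to be decoded back to PCM samples before it is returned. The sketch below exercises that step without Piper. The sine tone stands in for synthesized speech, and the helper names `make_test_wav` and `wav_bytes_to_numpy` are hypothetical, not part of this commit.

```python
import io
import math
import wave
from typing import Tuple

import numpy as np


def make_test_wav(num_samples: int = 2000, sample_rate: int = 22050) -> bytes:
    """Write a mono 16-bit sine tone to an in-memory WAV (stand-in for Piper output)."""
    tone = [
        int(16000 * math.sin(2 * math.pi * 440 * i / sample_rate))
        for i in range(num_samples)
    ]
    samples = np.array(tone, dtype="<i2")  # little-endian int16, as WAV expects
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(samples.tobytes())
    return buffer.getvalue()


def wav_bytes_to_numpy(wav_bytes: bytes) -> Tuple[int, np.ndarray]:
    """Decode mono 16-bit WAV bytes into (sample_rate, int16 samples)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_in:
        if wav_in.getnchannels() != 1 or wav_in.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM")
        sample_rate = wav_in.getframerate()
        # readframes returns only the PCM payload, not the RIFF header
        frames = wav_in.readframes(wav_in.getnframes())
    return sample_rate, np.frombuffer(frames, dtype="<i2")


if __name__ == "__main__":
    rate, samples = wav_bytes_to_numpy(make_test_wav())
    print(rate, samples.dtype, samples.shape)
```

Reading the frames through `wave` rather than calling `np.frombuffer` on the whole buffer keeps the RIFF header bytes out of the sample array.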

requirements.txt ADDED

@@ -0,0 +1,4 @@
+gradio==4.36.1
+piper-tts==1.2.0
+PyWavelets==1.5.0
+numpy==1.24.4