ylacombe/mms-tam-finetuned-monospeaker


ylacombe (HF staff) committed on Jan 2

Commit 05782e5 • 1 Parent(s): 1c8775f

Create README.md

Files changed (1)

README.md +73 -0

README.md ADDED

	@@ -0,0 +1,73 @@

+---
+library_name: transformers
+pipeline_tag: text-to-speech
+tags:
+- transformers.js
+- mms
+- vits
+license: cc-by-nc-4.0
+datasets:
+- ylacombe/google-tamil
+language:
+- ta
+---
+## Model
+This is a fine-tuned version of the [Tamil checkpoint](https://huggingface.co/facebook/mms-tts-tam) of the Massively Multilingual Speech (MMS) models, which are lightweight, low-latency TTS models based on the [VITS architecture](https://huggingface.co/docs/transformers/model_doc/vits).
+It was trained in around **20 minutes** with as few as **80 to 150 samples**, on this [Tamil dataset](https://huggingface.co/datasets/ylacombe/google-tamil).
+The training recipe is available in this [GitHub repository: **ylacombe/finetune-hf-vits**](https://github.com/ylacombe/finetune-hf-vits).
+## Usage
+### Transformers
+```python
+from transformers import pipeline
+import scipy.io.wavfile  # import the submodule explicitly; a bare `import scipy` may not load it
+model_id = "ylacombe/mms-tam-finetuned-monospeaker"
+synthesiser = pipeline("text-to-speech", model_id) # add device=0 if you want to use a GPU
+speech = synthesiser("வணக்கம், இன்று எப்படி இருக்கிறீர்கள்?")
+scipy.io.wavfile.write("finetuned_output.wav", rate=speech["sampling_rate"], data=speech["audio"].squeeze())  # squeeze drops a leading batch axis if present
+```
+### Transformers.js
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) using:
+```bash
+npm i @xenova/transformers
+```
+**Example:** Generate Tamil speech with `ylacombe/mms-tam-finetuned-monospeaker`.
+```js
+import { pipeline } from '@xenova/transformers';
+// Create a text-to-speech pipeline
+const synthesizer = await pipeline('text-to-speech', 'ylacombe/mms-tam-finetuned-monospeaker', {
+    quantized: false, // Remove this line to use the quantized version (default)
+});
+// Generate speech
+const output = await synthesizer('வணக்கம், இன்று எப்படி இருக்கிறீர்கள்?');
+console.log(output);
+// {
+//   audio: Float32Array(69888) [ ... ],
+//   sampling_rate: 16000
+// }
+```
+Optionally, save the audio to a WAV file (Node.js):
+```js
+import wavefile from 'wavefile';
+import fs from 'fs';
+const wav = new wavefile.WaveFile();
+wav.fromScratch(1, output.sampling_rate, '32f', output.audio);
+fs.writeFileSync('out.wav', wav.toBuffer());
+```
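Both usage examples above save the waveform as floating-point samples. Some audio players and tools only accept 16-bit PCM, so the sketch below shows one way to convert and write such a file using only the Python standard library. The helper name `write_pcm16_wav` and the synthetic test tone are illustrative assumptions and are not part of the model card; the 16000 Hz rate is taken from the Transformers.js output shown above.

```python
import array
import math
import os
import sys
import tempfile
import wave


def write_pcm16_wav(path, samples, sampling_rate):
    """Write mono float samples (expected in [-1.0, 1.0]) as 16-bit PCM WAV.

    Hypothetical helper for illustration; `samples` is any flat iterable of floats.
    """
    pcm = array.array("h")  # signed 16-bit integers
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # clamp out-of-range values
        pcm.append(int(round(clipped * 32767)))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(pcm.tobytes())


# Demo with a synthetic 440 Hz tone (0.1 s) standing in for model output.
sampling_rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sampling_rate)
        for n in range(sampling_rate // 10)]
demo_path = os.path.join(tempfile.gettempdir(), "pcm16_demo.wav")
write_pcm16_wav(demo_path, tone, sampling_rate)
```

With the Transformers pipeline output, and assuming `speech["audio"]` is a NumPy array as in the Python example, the call would look like `write_pcm16_wav("finetuned_output_pcm16.wav", speech["audio"].squeeze().tolist(), speech["sampling_rate"])`.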