Update README.md
README.md
CHANGED
@@ -52,10 +52,10 @@ It models audio as tokens and can generate high-quality audio with consistent st
 ### Key features
 
 1. Extremely small, based on GPT-2 small architecture. The methodology can be extended to any autoregressive transformer-based architecture.
-2. Ultra-fast. Using our [self hosted service option](#self-hosted-service), the model can achieve speeds up to 400 toks/s (4s of audio generation per s) and under 20ms time to first token
-
-
-
+2. Ultra-fast. Using our [self hosted service option](#self-hosted-service), on RTX6000Ada NVIDIA GPU the model can achieve speeds up to 400 toks/s (4s of audio generation per s) and under 20ms time to first token.
+3. On RTX6000Ada, it can support a batch size of 1k with full context length of 1024 tokens
+4. Supports voice cloning with small prompts (<5s).
+5. Code mixing text input in 2 languages - English and Hindi.
 
 ### Details
 
@@ -94,11 +94,30 @@ pipe = pipeline(
 trust_remote_code=True
 )
 
-output = pipe(['Hi, my name is Indri and I like to talk.'])
+output = pipe(['Hi, my name is Indri and I like to talk.'], speaker = '[spkr_63]')
 
 torchaudio.save('output.wav', output[0]['audio'][0], sample_rate=24000)
 ```
 
+**Available speakers**
+
+|Speaker ID|Speaker name|
+|---|---|
+|`[spkr_63]`|🇬🇧 👨 book reader|
+|`[spkr_67]`|🇺🇸 👨 influencer|
+|`[spkr_68]`|🇮🇳 👨 book reader|
+|`[spkr_69]`|🇮🇳 👨 book reader|
+|`[spkr_70]`|🇮🇳 👨 motivational speaker|
+|`[spkr_62]`|🇮🇳 👨 book reader heavy|
+|`[spkr_53]`|🇮🇳 👩 recipe reciter|
+|`[spkr_60]`|🇮🇳 👩 book reader|
+|`[spkr_74]`|🇺🇸 👨 book reader|
+|`[spkr_75]`|🇮🇳 👨 entrepreneur|
+|`[spkr_76]`|🇬🇧 👨 nature lover|
+|`[spkr_77]`|🇮🇳 👨 influencer|
+|`[spkr_66]`|🇮🇳 👨 politician|
+
+
 ### Self hosted service
 
 ```bash
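The key-feature figures in this change (400 toks/s, 4s of audio per second, 1024-token context) can be related with some quick arithmetic. The sketch below is illustrative only and is not part of the commit. It assumes the two quoted rates describe the same run, so their ratio gives tokens per second of audio, and that every context token left after the prompt is an audio token. The function names and the `prompt_tokens` parameter are hypothetical.

```python
# Figures quoted in the README's key features (self-hosted service on an RTX6000Ada).
TOKENS_PER_SECOND = 400          # peak generation speed, tokens per wall-clock second
AUDIO_SECONDS_PER_SECOND = 4     # seconds of audio generated per wall-clock second
CONTEXT_LENGTH = 1024            # full context length, in tokens


def tokens_per_audio_second() -> float:
    """Tokens needed for one second of audio, implied by the two quoted rates."""
    return TOKENS_PER_SECOND / AUDIO_SECONDS_PER_SECOND


def max_audio_seconds(prompt_tokens: int = 0) -> float:
    """Upper bound on the audio that fits in one context window.

    Assumption: every token left after the prompt is an audio token.
    `prompt_tokens` is a hypothetical parameter for the text/speaker prompt.
    """
    if not 0 <= prompt_tokens <= CONTEXT_LENGTH:
        raise ValueError("prompt_tokens must be between 0 and CONTEXT_LENGTH")
    return (CONTEXT_LENGTH - prompt_tokens) / tokens_per_audio_second()


def generation_time_seconds(audio_seconds: float) -> float:
    """Wall-clock time to generate `audio_seconds` of audio at the peak rate."""
    return audio_seconds / AUDIO_SECONDS_PER_SECOND


if __name__ == "__main__":
    print(tokens_per_audio_second())
    print(max_audio_seconds())
    print(generation_time_seconds(10))
```

Under these assumptions a single context window holds at most roughly ten seconds of audio, and less once the text prompt and any voice-cloning prompt are counted. Real throughput depends on batch size and hardware, so treat the numbers as upper bounds.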