sanchit-gandhi HF staff committed on
Commit
33d12bd
1 Parent(s): 80ca0fc
Files changed (1)
  1. app.py +4 -5
app.py CHANGED
@@ -325,8 +325,8 @@ with gr.Blocks(css=css) as block:
 
  <p>Tips for ensuring good generation:
  <ul>
- <li>Include the term "very clear audio" to generate the highest quality audio, and "very noisy audio" for high levels of background noise</li>
- <li>When using the fine-tuned model, include the term "Jenny" to pick out her voice</li>
+ <li>Include the term <b>"very clear audio"</b> to generate the highest quality audio, and "very noisy audio" for high levels of background noise</li>
+ <li>When using the fine-tuned model, include the term <b>"Jenny"</b> to pick out her voice</li>
  <li>Punctuation can be used to control the prosody of the generations, e.g. use commas to add small breaks in speech</li>
  <li>The remaining speech features (gender, speaking rate, pitch and reverberation) can be controlled directly through the prompt</li>
  </ul>
@@ -368,9 +368,8 @@ with gr.Blocks(css=css) as block:
  <p>To improve the prosody and naturalness of the speech further, we're scaling up the amount of training data to 50k hours of speech.
  The v1 release of the model will be trained on this data, as well as inference optimisations, such as flash attention
  and torch compile, that will improve the latency by 2-4x. If you want to find out more about how this model was trained and even fine-tune it yourself, check-out the
- <a href="https://github.com/huggingface/parler-tts"> Parler-TTS</a> repository on GitHub.</p>
-
- <p>The Parler-TTS codebase and its associated checkpoints are licensed under <a href='https://github.com/huggingface/parler-tts?tab=Apache-2.0-1-ov-file#readme'> Apache 2.0</a>.</p>
+ <a href="https://github.com/huggingface/parler-tts"> Parler-TTS</a> repository on GitHub. The Parler-TTS codebase and its
+ associated checkpoints are licensed under <a href='https://github.com/huggingface/parler-tts?tab=Apache-2.0-1-ov-file#readme'> Apache 2.0</a>.</p>
  """
  )
 
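The generation tips touched by this commit can be illustrated with a small prompt-building helper. This is only a sketch: the function `build_description`, its parameters, and the sentence template are hypothetical and are not part of `app.py` or the Parler-TTS API. Only the quoted terms "very clear audio", "very noisy audio" and "Jenny", and the list of controllable features (gender, speaking rate, pitch, reverberation), come from the tips text.

```python
def build_description(
    gender: str = "female",
    pace: str = "moderate",
    pitch: str = "slightly low",
    reverberation: str = "very little",
    clear_audio: bool = True,
    jenny: bool = False,
) -> str:
    """Compose a voice description string (hypothetical helper, not from app.py)."""
    # Per the tips, the term "Jenny" picks out the fine-tuned model's voice;
    # otherwise the speaker is described by gender.
    speaker = "Jenny" if jenny else f"A {gender} speaker"
    # The tips name these two exact phrases for controlling audio quality.
    quality = "very clear audio" if clear_audio else "very noisy audio"
    # The sentence template below is an assumption made for illustration.
    return (
        f"{speaker} speaks at a {pace} pace with a {pitch} pitch "
        f"and {reverberation} reverberation, in {quality}."
    )


if __name__ == "__main__":
    print(build_description())
    print(build_description(jenny=True, pace="fast", clear_audio=False))
```

The returned string would be used as the description prompt, while the text to be spoken carries its own punctuation (for example, commas for short pauses), as the tips suggest.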