sanchit-gandhi HF staff committed on
Commit
4447566
1 Parent(s): 58bf923
Files changed (1)
  1. app.py +4 -4
app.py CHANGED
@@ -177,11 +177,11 @@ if __name__ == "__main__":
  )
  gr.Markdown(
  """
- One of the major claims of the <a href="https://arxiv.org/abs/2311.00430"> Distil-Whisper paper </a> is that
  that Distil-Whisper hallucinates less than Whisper on long-form audio. To demonstrate this, we'll analyse the
- transcriptions generated by <a href="https://huggingface.co/openai/whisper-large-v2"> Whisper </a>
- and <a href="https://huggingface.co/distil-whisper/distil-large-v2"> Distil-Whisper </a> on the
- <a href="https://huggingface.co/datasets/distil-whisper/tedlium-long-form"> TED-LIUM </a> validation set.
 
  To quantify the amount of repetition and hallucination in the predicted transcriptions, we measure the number
  of repeated 5-gram word duplicates (5-Dup.) and the insertion error rate (IER). Analysis is performed on the
 
  )
  gr.Markdown(
  """
+ One of the major claims of the <a href="https://arxiv.org/abs/2311.00430"> Distil-Whisper paper</a> is that
  that Distil-Whisper hallucinates less than Whisper on long-form audio. To demonstrate this, we'll analyse the
+ transcriptions generated by <a href="https://huggingface.co/openai/whisper-large-v2"> Whisper</a>
+ and <a href="https://huggingface.co/distil-whisper/distil-large-v2"> Distil-Whisper</a> on the
+ <a href="https://huggingface.co/datasets/distil-whisper/tedlium-long-form"> TED-LIUM</a> validation set.
 
  To quantify the amount of repetition and hallucination in the predicted transcriptions, we measure the number
  of repeated 5-gram word duplicates (5-Dup.) and the insertion error rate (IER). Analysis is performed on the
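
Note on the metrics named in the Markdown text above: 5-Dup. and IER can be sketched roughly as follows. This is a minimal illustration, not the code from app.py. The function names, the lower-cased whitespace tokenisation, the counting convention (each occurrence of a 5-gram after its first counts once), and the tie-break between equal-cost alignments (fewest insertions) are all assumptions made for the example.

```python
from collections import Counter


def count_ngram_duplicates(text: str, n: int = 5) -> int:
    """Count repeated word n-grams (assumed convention: every occurrence
    of an n-gram after its first one counts as one duplicate)."""
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(ngrams)
    return sum(c - 1 for c in counts.values() if c > 1)


def insertion_error_rate(reference: str, hypothesis: str) -> float:
    """Insertions in a minimum-edit-distance word alignment, divided by
    the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # dp[i][j] = (edit cost, insertions) for aligning ref[:i] with hyp[:j].
    dp = [[(0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0)  # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, j)  # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub_cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            # Tuples compare by cost first, then by insertion count, so ties
            # are broken in favour of fewer insertions (an assumption).
            dp[i][j] = min(
                (dp[i - 1][j - 1][0] + sub_cost, dp[i - 1][j - 1][1]),  # match or substitution
                (dp[i - 1][j][0] + 1, dp[i - 1][j][1]),  # deletion
                (dp[i][j - 1][0] + 1, dp[i][j - 1][1] + 1),  # insertion
            )
    return dp[len(ref)][len(hyp)][1] / max(len(ref), 1)
```

Under these assumptions, a transcription that loops on the same phrase raises the duplicate count, while words added beyond the reference raise the IER; the two figures are intended to capture repetition and hallucination respectively.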