vumichien committed on
Commit
1ae7cce
1 Parent(s): 23fc185

Update README.md

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -27,5 +27,13 @@ movements and the produced sound.
  Audio-Visual Hidden Unit BERT (AV-HuBERT), a self-supervised representation learning framework for audio-visual speech, which masks multi-stream video input and predicts automatically discovered and iteratively refined multimodal hidden units. AV-HuBERT
  learns powerful audio-visual speech representation benefiting both lip-reading and automatic speech recognition.
 
+ ## Example
+
+ <figure>
+ <img src="https://huggingface.co/vumichien/AV-HuBERT/resolve/main/lipreading.gif" alt="Audio-Visual Speech Recognition">
+ <figcaption> Speech Recognition from Lip video
+ </figcaption>
+ </figure>
+
  ## Datasets
  The authors trained the model on lip-reading benchmark LRS3 datasets (433 hours).