Commit 74561b1 by FireRedTeam (verified) · 1 parent: 682d69e

Update README.md

Files changed (1): README.md (+50 -3)
@@ -1,3 +1,50 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - zh
+ - en
+ base_model:
+ - speechbrain/spkrec-ecapa-voxceleb
+ tags:
+ - agent
+ - voice-activity-detection
+ ---
+
+ <div align="center">
+ <h1>FireRedChat-pVAD</h1>
+ </div>
+ <div align="center">
+ <a href="https://fireredteam.github.io/demos/firered_chat/">Demo</a> •
+ <a href="https://arxiv.org/pdf/2509.06502">Paper</a> •
+ <a href="https://huggingface.co/FireRedTeam">Huggingface</a>
+ </div>
+
+ ## Description
+ FireRedChat's personalized Voice Activity Detection (pVAD) model is an open-weight model that detects voice activity and supports speaker embedding updates. [LiveKit plugin available here](https://github.com/fireredchat-submodules/livekit-plugins-fireredchat-pvad).
+
+ - Supports speaker embedding updates for improved voice activity detection.
+ - The plugin requires a compatible LiveKit Agents fork, or a modification that adds an `update_speaker` call after the first user utterance.
+
+ ## Roadmap
+ - [x] 2025/09
+   - [x] Release the pVAD model weights and LiveKit plugin.
+
+ ## Usage
+ For inference, please use the [LiveKit plugin](https://github.com/fireredchat-submodules/livekit-plugins-fireredchat-pvad). After installing the plugin, configure it as follows:
+
+ ```python
+ from livekit.agents import JobProcess
+ from livekit.plugins import fireredchat_pvad as pvad
+
+ def prewarm(proc: JobProcess):
+     # Load the pVAD model once per worker process and cache it for later sessions.
+     proc.userdata["vad"] = pvad.VAD.load(activation_threshold=0.5)
+
+ # After the first utterance (or when the primary speaker switches, based on RMS),
+ # call update_speaker() on the VADStream to update the speaker embedding.
+ ```
+
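The usage note above mentions detecting a primary-speaker switch "based on RMS" without showing how. The sketch below illustrates one way the RMS energy of an audio frame could be computed and compared between frames. It is only an illustration: `frame_rms`, `primary_speaker_changed`, and the `ratio` threshold are hypothetical names and values, not part of the plugin, and the plugin's actual switching rule may differ.

```python
import math
from typing import Sequence


def frame_rms(samples: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def primary_speaker_changed(prev_rms: float, cur_rms: float, ratio: float = 2.0) -> bool:
    """Hypothetical rule: report a switch when loudness rises by `ratio` or more.

    This is an assumption made for illustration, not the plugin's logic.
    """
    if prev_rms <= 0.0:
        # No previous energy to compare against: any non-silent frame counts.
        return cur_rms > 0.0
    return cur_rms / prev_rms >= ratio
```

In an agent, a check like this would decide when to call `update_speaker()` on the VAD stream. The exact arguments of that call are defined by the plugin and are not shown here.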
+ ## License
+ The model weights and plugin code are licensed under the Apache-2.0 license.
+
+ ### Acknowledgment
+ - Speaker embedding model: speechbrain/spkrec-ecapa-voxceleb
+
+ ---