zinc75 committed on
Commit
b4935ec
1 Parent(s): 46de9b7

Update README.md

  1. README.md +59 -3
README.md CHANGED
@@ -1,3 +1,59 @@
1
- ---
2
- license: mit
3
- ---
1
+ ---
2
+ license: mit
3
+ datasets:
4
+ - Cnam-LMSSC/vibravox
5
+ language:
6
+ - fr
7
+ ---
8
+ # Master Model Card: Vibravox Audio Bandwidth Extension Models
9
+
10
+ <p align="center">
11
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6390fc80e6d656eb421bab69/KkZoQQmrn53U6BTLmr0XK.png" />
12
+ </p>
13
+
14
+ ## Overview
15
+
16
+ This master model card serves as an entry point for exploring multiple **audio bandwidth extension** (BWE) models trained on different sensor data from the [Vibravox dataset](https://huggingface.co/datasets/Cnam-LMSSC/vibravox).
17
+
18
+ These models are designed to enhance the audio quality of speech captured by body-conduction sensors, by denoising it and regenerating mid and high frequencies from low-frequency content only.
19
+
20
+ The models are trained on specific sensors to address various audio capture scenarios using **body-conducted** sound and vibration sensors.
21
+
22
+ ## Disclaimer
23
+ Each of these models has been trained for **specific non-conventional speech sensors** and is intended to be used with **in-domain data**.
24
+
25
+ Using a model on data from a sensor other than the one it was trained on may result in suboptimal performance.
26
+
27
+ ## Usage
28
+ All models are trained using [Configurable EBEN](https://github.com/jhauret/vibravox/blob/main/vibravox/torch_modules/dnn/eben_generator.py) (see [publication](https://ieeexplore.ieee.org/document/10244161)) and adapted to different sensor inputs. They are intended to be used at a sample rate of 16 kHz.
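Since the models expect 16 kHz input, it can help to check a recording's sample rate before running inference. The following is a minimal sketch using only the Python standard library; the helper names (`wav_sample_rate`, `needs_resampling`, `make_silent_wav`) are hypothetical and are not part of the vibravox code base.

```python
import io
import wave

# From the model card: the models are intended for 16 kHz audio.
EXPECTED_SAMPLE_RATE = 16_000


def wav_sample_rate(source) -> int:
    """Return the sample rate declared in a WAV header (path or file-like object)."""
    with wave.open(source, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(source, expected: int = EXPECTED_SAMPLE_RATE) -> bool:
    """True when the recording must be resampled before being fed to a model."""
    return wav_sample_rate(source) != expected


def make_silent_wav(sample_rate: int, n_frames: int = 160) -> io.BytesIO:
    """Build a short mono 16-bit silent WAV in memory (used here for demonstration only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * n_frames)
    buffer.seek(0)  # rewind so the buffer can be read back
    return buffer


if __name__ == "__main__":
    for rate in (16_000, 44_100):
        print(rate, "needs resampling:", needs_resampling(make_silent_wav(rate)))
```

This check only reads the header; the resampling itself would be done with an audio library of your choice.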
29
+
30
+ ## Training Procedure
31
+ Detailed instructions for reproducing the experiments are available on the [jhauret/vibravox](https://github.com/jhauret/vibravox) GitHub repository.
32
+
33
+ ## Available Models
34
+
35
+ The following models are available, **each trained on a different sensor** from the [Vibravox dataset](https://huggingface.co/datasets/Cnam-LMSSC/vibravox):
36
+
37
+ | **Transducer** | **Hugging Face model link** | **Training dataset** |
38
+ |:---------------------------|:---------------------|:-------------|
39
+ | In-ear comply foam-embedded microphone | [EBEN BWE Model for in-ear comply foam-embedded microphone](https://huggingface.co/Cnam-LMSSC/EBEN_bwe_soft_in_ear_mic) | `speech_clean` subset of [Cnam-LMSSC/vibravox](https://huggingface.co/datasets/Cnam-LMSSC/vibravox) |
40
+ | In-ear rigid earpiece-embedded microphone | [EBEN BWE Model for in-ear rigid earpiece-embedded microphone](https://huggingface.co/Cnam-LMSSC/EBEN_bwe_rigid_in_ear_mic) | `speech_clean` subset of [Cnam-LMSSC/vibravox](https://huggingface.co/datasets/Cnam-LMSSC/vibravox) |
41
+ | Forehead miniature vibration sensor | [EBEN BWE Model for forehead vibration sensor](https://huggingface.co/Cnam-LMSSC/EBEN_bwe_forehead_accelerometer) | `speech_clean` subset of [Cnam-LMSSC/vibravox](https://huggingface.co/datasets/Cnam-LMSSC/vibravox) |
42
+ | Temple vibration pickup | [EBEN BWE Model for temple vibration pickup](https://huggingface.co/Cnam-LMSSC/EBEN_bwe_temple_vibration_pickup) | `speech_clean` subset of [Cnam-LMSSC/vibravox](https://huggingface.co/datasets/Cnam-LMSSC/vibravox) |
43
+ | Laryngophone | [EBEN BWE Model for laryngophone](https://huggingface.co/Cnam-LMSSC/EBEN_bwe_phonemizer_laryngophone) | `speech_clean` subset of [Cnam-LMSSC/vibravox](https://huggingface.co/datasets/Cnam-LMSSC/vibravox) |
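Because each model is tied to one sensor, a small lookup can guard against picking the wrong checkpoint. The sketch below copies the repository IDs from the table above; the short sensor keys and the `model_url` helper are hypothetical names introduced for illustration.

```python
# Repository IDs copied from the table above; the dictionary keys are
# illustrative short names, not identifiers used by the Vibravox project.
BWE_MODEL_REPOS = {
    "soft_in_ear": "Cnam-LMSSC/EBEN_bwe_soft_in_ear_mic",
    "rigid_in_ear": "Cnam-LMSSC/EBEN_bwe_rigid_in_ear_mic",
    "forehead": "Cnam-LMSSC/EBEN_bwe_forehead_accelerometer",
    "temple": "Cnam-LMSSC/EBEN_bwe_temple_vibration_pickup",
    "laryngophone": "Cnam-LMSSC/EBEN_bwe_phonemizer_laryngophone",
}


def model_url(sensor: str) -> str:
    """Return the Hugging Face URL of the BWE model trained on `sensor`."""
    try:
        repo_id = BWE_MODEL_REPOS[sensor]
    except KeyError:
        known = ", ".join(sorted(BWE_MODEL_REPOS))
        # Refuse unknown sensors: the models are meant for in-domain data only.
        raise ValueError(f"No BWE model for sensor {sensor!r}; known sensors: {known}")
    return f"https://huggingface.co/{repo_id}"


if __name__ == "__main__":
    print(model_url("temple"))
```

Raising an error for unknown sensors, rather than falling back to a default model, reflects the disclaimer that each model is intended for in-domain data.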
44
+
45
+
46
+ ## License
47
+ All these models are released under the MIT License.
48
+
49
+
50
+ ## Sensor positioning and documentation
51
+
52
+
53
+ | **Sensor** | **Image** | **Transducer** | **Online documentation** |
54
+ |:---------------------------|:---------------------|:-------------|:----------------------------------------------------------------------------------------------------------------------|
55
+ | In-ear comply foam-embedded microphone | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65302a613ecbe51d6a6ddcec/OJ4TGk-aH6M0jRUJVdA2B.png) | Knowles FG-23329-P07 | [See documentation on vibravox.cnam.fr](https://vibravox.cnam.fr/documentation/hardware/microphones/soft_inear/index.html) |
56
+ | In-ear rigid earpiece-embedded microphone | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65302a613ecbe51d6a6ddcec/DIuuYaTk7Ba67CsJlplmZ.png) | Knowles SPH1642HT5H | [See documentation on vibravox.cnam.fr](https://vibravox.cnam.fr/documentation/hardware/microphones/rigid_inear/index.html) |
57
+ | Forehead miniature vibration sensor | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65302a613ecbe51d6a6ddcec/ttdc4Bakbf6O3KxNoNERW.png) | Knowles BU23173-000 | [See documentation on vibravox.cnam.fr](https://vibravox.cnam.fr/documentation/hardware/microphones/forehead/index.html) |
58
+ | Temple vibration pickup | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65302a613ecbe51d6a6ddcec/uHZ2mhHAlCmd0e_jdESK6.png) | AKG C411 | [See documentation on vibravox.cnam.fr](https://vibravox.cnam.fr/documentation/hardware/microphones/temple/index.html) |
59
+ | Laryngophone | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65302a613ecbe51d6a6ddcec/KaS7sxV4g4JOSB3ERKnha.png) | iXRadio XVTM822D-D35 | [See documentation on vibravox.cnam.fr](https://vibravox.cnam.fr/documentation/hardware/microphones/throat/index.html) |