Cnam-LMSSC/vibravox_EBEN_models

Audio-to-Audio

French

audio

speech

zinc75 committed on May 31

Commit b4935ec • 1 Parent(s): 46de9b7

Update README.md

Files changed (1)

README.md +59 -3

@@ -1,3 +1,59 @@
----
-license: mit
----

+---
+license: mit
+datasets:
+- Cnam-LMSSC/vibravox
+language:
+- fr
+---
+# Master Model Card: Vibravox Audio Bandwidth Extension Models
+<p align="center">
+  <img src="https://cdn-uploads.huggingface.co/production/uploads/6390fc80e6d656eb421bab69/KkZoQQmrn53U6BTLmr0XK.png" />
+</p>
+## Overview
+This master model card serves as an entry point for exploring multiple **audio bandwidth extension** (BWE) models trained on different sensor data from the [Vibravox dataset](https://huggingface.co/datasets/Cnam-LMSSC/vibravox).
+These models are designed to enhance the audio quality of speech captured by body-conduction sensors, by denoising it and regenerating the mid and high frequencies from the low-frequency content alone.
+The models are trained on specific sensors to address various audio capture scenarios using **body-conducted** sound and vibration sensors.
+## Disclaimer
+Each of these models has been trained for **specific non-conventional speech sensors** and is intended to be used with **in-domain data**.
+Using a model on data from a sensor other than the one it was trained on may result in suboptimal performance.
+## Usage
+All models are trained using [Configurable EBEN](https://github.com/jhauret/vibravox/blob/main/vibravox/torch_modules/dnn/eben_generator.py) (see [publication](https://ieeexplore.ieee.org/document/10244161)) and adapted to different sensor inputs. They are intended to be used at a sample rate of 16 kHz.
+## Training Procedure
+Detailed instructions for reproducing the experiments are available in the [jhauret/vibravox](https://github.com/jhauret/vibravox) GitHub repository.
+## Available Models
+The following models are available, **each trained on a different sensor** from the [Vibravox dataset](https://huggingface.co/datasets/Cnam-LMSSC/vibravox):
+| **Transducer**                 | **Hugging Face model link**  |  **Training dataset**    |
+|:---------------------------|:---------------------|:-------------|
+| In-ear comply foam-embedded microphone |[EBEN BWE Model for in-ear comply foam-embedded microphone](https://huggingface.co/Cnam-LMSSC/EBEN_bwe_soft_in_ear_mic) | `speech_clean` subset of [Cnam-LMSSC/vibravox](https://huggingface.co/datasets/Cnam-LMSSC/vibravox) |
+| In-ear rigid earpiece-embedded microphone | [EBEN BWE Model for in-ear rigid earpiece-embedded microphone](https://huggingface.co/Cnam-LMSSC/EBEN_bwe_rigid_in_ear_mic) | `speech_clean` subset of [Cnam-LMSSC/vibravox](https://huggingface.co/datasets/Cnam-LMSSC/vibravox) |
+| Forehead miniature vibration sensor | [EBEN BWE Model for forehead vibration sensor](https://huggingface.co/Cnam-LMSSC/EBEN_bwe_forehead_accelerometer) |   `speech_clean` subset of [Cnam-LMSSC/vibravox](https://huggingface.co/datasets/Cnam-LMSSC/vibravox) |
+| Temple vibration pickup | [EBEN BWE Model for temple vibration pickup](https://huggingface.co/Cnam-LMSSC/EBEN_bwe_temple_vibration_pickup) | `speech_clean` subset of [Cnam-LMSSC/vibravox](https://huggingface.co/datasets/Cnam-LMSSC/vibravox) |
+| Laryngophone | [EBEN BWE Model for laryngophone](https://huggingface.co/Cnam-LMSSC/EBEN_bwe_phonemizer_laryngophone) | `speech_clean` subset of [Cnam-LMSSC/vibravox](https://huggingface.co/datasets/Cnam-LMSSC/vibravox) |
+## License
+All these models are released under the MIT License.
+## Sensor positioning and documentation
+| **Sensor**                 | **Image** | **Transducer** |  **Online documentation**    |
+|:---------------------------|:---------------------|:-------------|:----------------------------------------------------------------------------------------------------------------------|
+| In-ear comply foam-embedded microphone |![image/png](https://cdn-uploads.huggingface.co/production/uploads/65302a613ecbe51d6a6ddcec/OJ4TGk-aH6M0jRUJVdA2B.png)|  Knowles FG-23329-P07  | [See documentation on vibravox.cnam.fr](https://vibravox.cnam.fr/documentation/hardware/microphones/soft_inear/index.html) |
+| In-ear rigid earpiece-embedded microphone | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65302a613ecbe51d6a6ddcec/DIuuYaTk7Ba67CsJlplmZ.png) | Knowles SPH1642HT5H  | [See documentation on vibravox.cnam.fr](https://vibravox.cnam.fr/documentation/hardware/microphones/rigid_inear/index.html)  |
+| Forehead miniature vibration sensor | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65302a613ecbe51d6a6ddcec/ttdc4Bakbf6O3KxNoNERW.png) | Knowles BU23173-000   | [See documentation on vibravox.cnam.fr](https://vibravox.cnam.fr/documentation/hardware/microphones/forehead/index.html) |
+| Temple vibration pickup | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65302a613ecbe51d6a6ddcec/uHZ2mhHAlCmd0e_jdESK6.png) | AKG C411   | [See documentation on vibravox.cnam.fr](https://vibravox.cnam.fr/documentation/hardware/microphones/temple/index.html) |
+| Laryngophone | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65302a613ecbe51d6a6ddcec/KaS7sxV4g4JOSB3ERKnha.png) | iXRadio XVTM822D-D35  | [See documentation on vibravox.cnam.fr](https://vibravox.cnam.fr/documentation/hardware/microphones/throat/index.html)  |
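
Because each model is tied to one sensor and to a 16 kHz sample rate, downstream code may want to make that pairing explicit before loading anything. The following is a minimal sketch of such a lookup, not part of the vibravox codebase: the repository IDs are copied from the Available Models table in the README, while the short sensor keys, the constant name, and the `select_model` helper are hypothetical names introduced here for illustration.

```python
# Repository IDs copied from the "Available Models" table of the README.
# The dictionary keys are hypothetical short names chosen for this sketch.
EBEN_BWE_MODELS = {
    "soft_in_ear_mic": "Cnam-LMSSC/EBEN_bwe_soft_in_ear_mic",
    "rigid_in_ear_mic": "Cnam-LMSSC/EBEN_bwe_rigid_in_ear_mic",
    "forehead_accelerometer": "Cnam-LMSSC/EBEN_bwe_forehead_accelerometer",
    "temple_vibration_pickup": "Cnam-LMSSC/EBEN_bwe_temple_vibration_pickup",
    "laryngophone": "Cnam-LMSSC/EBEN_bwe_phonemizer_laryngophone",
}

# The README states that the models are intended for 16 kHz audio.
EXPECTED_SAMPLE_RATE_HZ = 16_000


def select_model(sensor: str, sample_rate_hz: int) -> str:
    """Return the model repository ID for a sensor (hypothetical helper).

    Raises KeyError for an unknown sensor and ValueError when the audio
    is not at the sample rate the models are intended for.
    """
    if sensor not in EBEN_BWE_MODELS:
        known = ", ".join(sorted(EBEN_BWE_MODELS))
        raise KeyError(f"Unknown sensor {sensor!r}; expected one of: {known}")
    if sample_rate_hz != EXPECTED_SAMPLE_RATE_HZ:
        raise ValueError(
            f"Audio is at {sample_rate_hz} Hz; resample to "
            f"{EXPECTED_SAMPLE_RATE_HZ} Hz before enhancement"
        )
    return EBEN_BWE_MODELS[sensor]


if __name__ == "__main__":
    print(select_model("laryngophone", 16_000))
```

Failing early on a mismatched sensor or sample rate reflects the README's disclaimer that the models are meant for in-domain data only; how the returned repository ID is then loaded is left to the instructions in each individual model card.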