---
language: "fr"
thumbnail:
tags:
- wav2vec2
license: "apache-2.0"
---

# LeBenchmark: wav2vec2 base model trained on 1K hours of French *female-only* speech
LeBenchmark provides a collection of wav2vec2 models pretrained on different French datasets containing spontaneous, read, and broadcast speech.

For more information about our gender study of self-supervised learning (SSL) models, please refer to our paper: [A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems](https://arxiv.org/abs/2204.01397)

## Model and data descriptions
We release four gender-specific models trained on 1K hours of speech:

- [wav2vec2-FR-1K-Male-large](https://huggingface.co/LeBenchmark/wav2vec-FR-1K-Male-large/)
- [wav2vec2-FR-1K-Male-base](https://huggingface.co/LeBenchmark/wav2vec-FR-1K-Male-base/)
- [wav2vec2-FR-1K-Female-large](https://huggingface.co/LeBenchmark/wav2vec-FR-1K-Female-large/)
- [wav2vec2-FR-1K-Female-base](https://huggingface.co/LeBenchmark/wav2vec-FR-1K-Female-base/)
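
The four checkpoints listed above share one naming pattern in their link targets, so a repository ID can be derived from a gender and a size. The following is a minimal sketch, not an official loading recipe: the helper names `gender_model_id` and `load_gender_model` are hypothetical, and loading through `transformers.Wav2Vec2Model` assumes each repository ships weights in a format that class can read, which this card does not state.

```python
"""Sketch: map (gender, size) to one of the four Hub repo IDs listed above."""

HUB_ORG = "LeBenchmark"
GENDERS = ("female", "male")
SIZES = ("base", "large")


def gender_model_id(gender: str, size: str) -> str:
    """Return the Hub repo ID for a gender-specific 1K-hour checkpoint."""
    gender, size = gender.lower(), size.lower()
    if gender not in GENDERS:
        raise ValueError(f"gender must be one of {GENDERS}, got {gender!r}")
    if size not in SIZES:
        raise ValueError(f"size must be one of {SIZES}, got {size!r}")
    # Repo names follow the link targets above, e.g. wav2vec-FR-1K-Female-base.
    return f"{HUB_ORG}/wav2vec-FR-1K-{gender.capitalize()}-{size}"


def load_gender_model(gender: str = "female", size: str = "base"):
    """Assumption: the repo holds weights loadable by transformers.Wav2Vec2Model."""
    # Imported lazily so the ID helper works without transformers installed.
    from transformers import Wav2Vec2Model

    return Wav2Vec2Model.from_pretrained(gender_model_id(gender, size))


# Usage (downloads weights on first call):
#   model = load_gender_model("female", "base")
```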
## Intended uses & limitations

Pretrained wav2vec2 models are distributed under the Apache-2.0 license. Hence, they can be reused extensively without strict limitations. However, benchmarks and data may be linked to corpora that are not completely open-sourced.

## Referencing our gender-specific models
```
@article{boito2022study,
  title={A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems},
  author={Marcely Zanon Boito and Laurent Besacier and Natalia Tomashenko and Yannick Est{\`e}ve},
  journal={arXiv preprint arXiv:2204.01397},
  year={2022}
}
```

## Referencing LeBenchmark
```
@inproceedings{evain2021task,
  title={Task agnostic and task specific self-supervised learning from speech with \textit{LeBenchmark}},
  author={Evain, Sol{\`e}ne and Nguyen, Ha and Le, Hang and Boito, Marcely Zanon and Mdhaffar, Salima and Alisamir, Sina and Tong, Ziyi and Tomashenko, Natalia and Dinarelli, Marco and Parcollet, Titouan and others},
  booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)},
  year={2021}
}
```