---
license: mit
---

# hubert-base-jtube

This repo provides model weights for a [HuBERT-base model](https://arxiv.org/abs/2106.07447) trained on the [JTubeSpeech](https://github.com/sarulab-speech/jtubespeech) corpus.
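
For reference, the sketch below shows one way to extract frame-level speech features with Hugging Face `transformers`, assuming the checkpoint is published in a `transformers`-compatible format; the repo id `sarulab-speech/hubert-base-jtube` and the use of `HubertModel` / `Wav2Vec2FeatureExtractor` are assumptions for illustration, not confirmed by this card.

```python
# Minimal usage sketch, assuming the weights load with Hugging Face transformers.
# The repo id below is an assumption; substitute the actual id of this repo.
import numpy as np
import torch
from transformers import HubertModel, Wav2Vec2FeatureExtractor

model_id = "sarulab-speech/hubert-base-jtube"  # hypothetical repo id

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
model = HubertModel.from_pretrained(model_id)
model.eval()

# One second of silent 16 kHz audio as a stand-in for real Japanese speech.
waveform = np.zeros(16000, dtype=np.float32)
inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Frame-level representations: (batch, num_frames, hidden_size).
print(outputs.last_hidden_state.shape)
```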

## Dataset

We extracted approximately 2,720 hours of Japanese speech from the single-speaker subset of the JTubeSpeech corpus.

The training data comprises approximately 6,000,000 utterances from about 55,000 speakers.

## Contributors

* [中田 亘](https://wataru-nakata.github.io)
* [関 健太郎](https://trgkpc.github.io/)
* [谷中 瞳](https://hitomiyanaka.mystrikingly.com/)
* [佐伯 高明](https://takaaki-saeki.github.io/)
* [齋藤 佑樹](https://sython.org/)
* [高道 慎之介](https://sites.google.com/site/shinnosuketakamichi/home)

## Acknowledgements

This work was supported by the KAKUSEI project (FY2023) of the National Institute of Advanced Industrial Science and Technology (AIST).