Wav2Vec2-XLS-R-300M-Japanese-Hiragana

facebook/wav2vec2-xls-r-300m fine-tuned to transcribe Japanese speech into Hiragana characters, using JSUT, JVS, Common Voice, and an in-house dataset. The output sentences do not contain word boundaries. Audio input should be sampled at 16 kHz.
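A minimal inference sketch, assuming the checkpoint ships a standard Wav2Vec2 processor and CTC head that load through the transformers library. `MODEL_ID` is a hypothetical placeholder, not the confirmed repository id, and the helper names `check_sample_rate` and `transcribe` are introduced here for illustration only.

```python
TARGET_SAMPLE_RATE = 16_000  # the model card asks for 16 kHz input

# Hypothetical placeholder: replace with the actual Hub repository id of this model.
MODEL_ID = "your-namespace/wav2vec2-xls-r-300m-japanese-hiragana"


def check_sample_rate(sample_rate: int) -> None:
    """Raise ValueError if the audio is not at the rate the model expects."""
    if sample_rate != TARGET_SAMPLE_RATE:
        raise ValueError(
            f"expected {TARGET_SAMPLE_RATE} Hz audio, got {sample_rate} Hz; "
            "resample before calling the model"
        )


def transcribe(waveform, sample_rate: int, model_id: str = MODEL_ID) -> str:
    """Greedy CTC decoding of a mono float waveform into a Hiragana string."""
    # Heavy dependencies are imported lazily so check_sample_rate works without them.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    check_sample_rate(sample_rate)

    # Assumption: the repository contains both processor and model files.
    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    inputs = processor(waveform, sampling_rate=sample_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Pick the most likely token per frame; batch_decode collapses repeats and blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

One way to obtain a suitable waveform is `librosa.load(path, sr=16000)`, which resamples on load and returns the samples together with the sample rate; both can then be passed to `transcribe`. The returned string is continuous Hiragana with no spaces, in line with the note above about word boundaries.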

Test Results

CER: 9.34%
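For readers unfamiliar with the metric, the sketch below shows the usual definition of character error rate: the character-level edit distance between reference and hypothesis, divided by the number of reference characters. The card does not state how the reported figure was computed, so the corpus-level aggregation and the absence of any text normalization here are assumptions, and the function names are illustrative.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # previous[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def cer(references, hypotheses) -> float:
    """Corpus-level CER: total character edits over total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


# One substituted character out of five reference characters.
print(cer(["こんにちは"], ["こんにちわ"]))  # 0.2
```

Under this definition, a CER of 9.34% corresponds to roughly nine character edits per hundred reference characters.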

Training

Trained on JSUT, a subset of JVS, the train and validation sets of Common Voice Japanese, and an in-house Japanese dataset. Tested on the test set of Common Voice Japanese.
