---
language: ja
datasets:
- common_voice
metrics:
- cer
model-index:
- name: wav2vec2-xls-r-300m finetuned on Japanese Hiragana with no word boundaries
  results:
  - task:
      name: Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice Japanese
      type: common_voice
      args: ja
    metrics:
    - name: Test CER
      type: cer
      value: 9.34
---
# Wav2Vec2-XLS-R-300M-Japanese-Hiragana
This model is [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) fine-tuned to transcribe Japanese speech as Hiragana characters, using JSUT, JVS, Common Voice, and an in-house dataset.
The output sentences contain no word boundaries. Audio input should be sampled at 16 kHz.
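
The following is a minimal inference sketch, not an official usage recipe. It assumes the checkpoint loads with the standard `Wav2Vec2Processor` and `Wav2Vec2ForCTC` classes from `transformers` and is decoded greedily; neither is confirmed by this card. `MODEL_ID` is a hypothetical placeholder for this model's Hub repository id, and `transcribe` is a helper name introduced only for illustration.

```python
TARGET_SAMPLE_RATE = 16_000  # the card states that input audio should be sampled at 16 kHz

# Hypothetical placeholder: replace with this model's actual Hub repository id.
MODEL_ID = "your-namespace/your-model-id"


def transcribe(waveform, sampling_rate, model_id=MODEL_ID):
    """Return a Hiragana transcription (no word boundaries) for one mono waveform.

    `waveform` is a 1-D sequence of float samples; `sampling_rate` is in Hz.
    """
    # Reject audio at the wrong rate instead of silently producing poor output.
    if sampling_rate != TARGET_SAMPLE_RATE:
        raise ValueError(
            f"Expected {TARGET_SAMPLE_RATE} Hz audio, got {sampling_rate} Hz; "
            "resample the audio first."
        )

    # Heavy dependencies are imported lazily so the rate check works without them.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Assumption: the repository ships a processor (feature extractor + CTC tokenizer).
    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy CTC decoding: take the most likely token per frame, then let the
    # tokenizer collapse repeats and drop blank tokens.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

One way to obtain a suitable waveform is `librosa.load(path, sr=16000)`, which resamples while loading; the returned array and rate can then be passed to `transcribe`.
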

## Test Results
**CER:** 9.34% (character error rate on the Common Voice Japanese test set)

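
For readers unfamiliar with the metric, the sketch below shows one common way to compute a character error rate: total character-level edit distance divided by the total number of reference characters. This card does not state how the reported figure was aggregated or whether any text normalization was applied, so treat the corpus-level aggregation here as an assumption. The function names `edit_distance` and `cer` are illustrative.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(references, hypotheses) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


# Two Hiragana pairs: one substitution and one deletion over 10 reference characters.
refs = ["こんにちは", "ありがとう"]
hyps = ["こんにちわ", "ありがと"]
print(f"CER: {cer(refs, hyps):.2%}")  # prints "CER: 20.00%"
```

Because the model emits Hiragana without word boundaries, a character-level metric applies directly to its output without a separate word segmentation step.
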
## Training
The model was trained on JSUT, a subset of JVS, the train and validation splits of Common Voice Japanese, and an in-house Japanese dataset. It was evaluated on the test split of Common Voice Japanese.