lsb committed on
Commit
d7a89cb
1 Parent(s): f8ff755

add tokenizer

Files changed (1)
  1. README.md +39 -0
README.md ADDED
@@ -0,0 +1,39 @@
+ ---
+ ---
+ language:
+ - la
+ license: agpl-3.0
+ tags:
+ - robust-speech-event
+ datasets:
+ - lsb/poetaexmachina-mp3-recitations
+ metrics:
+ - wer
+ model-index:
+ - name: wav2vec2-base-it-latin
+   results:
+   - task:
+       type: automatic-speech-recognition
+       name: Speech Recognition
+     dataset:
+       type: lsb/poetaexmachina-mp3-recitations
+       name: Poeta Ex Machina mp3 recitations
+     metrics:
+     - type: wer
+       value: 0.398
+       name: Test WER
+
+ ---
+ ---
+
+ # wav2vec2-base-it-latin
+
+ This model is a fine-tuned version of [wav2vec2-base-it-voxpopuli](https://huggingface.co/facebook/wav2vec2-base-it-voxpopuli).
+
+ The dataset used is [poetaexmachina-mp3-recitations](https://github.com/lsb/poetaexmachina-mp3-recitations):
+ all of the 2-series texts (Vergil) and every tenth 1-series text (words from Poeta Ex Machina's [database](https://github.com/lsb/poetaexmachina/blob/master/merged-scansions.db) of words with scansions).
+
+ It achieves the following [results](https://github.com/lsb/tironiculum/blame/trunk/wav2vec2%20base%20it%20latin.ipynb#L1234) on the evaluation set:
+
+ - Loss: 0.1943
+ - WER: 0.398
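
For readers unfamiliar with the WER figure reported above, the following is a minimal sketch of the standard word error rate definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is not taken from the linked notebook, which may compute the metric with a library and with different text normalization. The function name `word_error_rate` and the sample transcripts are illustrative assumptions, not data from the evaluation set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration only; whitespace tokenization
    is an assumption, not necessarily what the notebook does.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative transcripts (not from the evaluation set):
# one substitution ("troiae" -> "troia") and one deletion ("ab").
reference = "arma virumque cano troiae qui primus ab oris"
hypothesis = "arma virumque cano troia qui primus oris"
print(word_error_rate(reference, hypothesis))  # 2 errors / 8 words = 0.25
```

Under this definition, a WER of 0.398 corresponds to roughly four word-level errors per ten reference words. Because insertions count as errors, the value can exceed 1.0 for very poor hypotheses.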