lighteternal committed on
Commit
e5de2d3
1 Parent(s): 2c4a267

Added new model trained on 60 epochs

README.md CHANGED
@@ -1,12 +1,13 @@
 
  ---
- language:
- - el
+ language: el
+ datasets:
+ - common_voice
  tags:
- - pytorch
- - ASR
-
-
+ - speech
+ - audio
+ - automatic-speech-recognition
+ license: apache-2.0
  ---
 
  # Greek (el) version of the XLSR-Wav2Vec2 automatic speech recognition (ASR) model
@@ -28,6 +29,8 @@ This model was trained on Greek CommonVoice speech data (364MB) for 30 epochs on
 
  ### How to use for inference:
 
+ For live demo, make sure that speech files are sampled at 16kHz.
+
  Instructions to test on CommonVoice extracts are provided in the ASR_Inference.ipynb. Snippet also available below:
 
  ```
wav2vec2-large-xlsr-greek/checkpoint-18400/config.json → config.json RENAMED
File without changes
wav2vec2-large-xlsr-greek/checkpoint-18400/preprocessor_config.json → preprocessor_config.json RENAMED
File without changes
wav2vec2-large-xlsr-greek/checkpoint-18400/pytorch_model.bin → pytorch_model.bin RENAMED
File without changes
wav2vec2-large-xlsr-greek/special_tokens_map.json → special_tokens_map.json RENAMED
File without changes
wav2vec2-large-xlsr-greek/tokenizer_config.json → tokenizer_config.json RENAMED
File without changes
wav2vec2-large-xlsr-greek/checkpoint-18400/trainer_state.json → trainer_state.json RENAMED
File without changes
wav2vec2-large-xlsr-greek/checkpoint-18400/training_args.bin → training_args.bin RENAMED
File without changes
vocab.json CHANGED
@@ -1 +1 @@
- {"\u03c8": 0, "\u03bc": 1, "\u03c3": 2, "\u00b4": 3, "\u0301": 4, "v": 6, "\u03ad": 7, "\u03ac": 8, "\u03ba": 9, "\u03c5": 10, "\u03b3": 11, "\u03b7": 12, "\u03c7": 13, "\u03be": 14, "m": 15, "\u00bb": 16, "\u03b2": 17, "'": 18, "\u03b5": 19, "\u03c6": 20, "\u03b6": 21, "\u03b1": 22, "\u03ae": 23, "\u03c2": 24, "\u2019": 25, "\u03bf": 26, "\u03cd": 27, "\u03b9": 28, "\u03c9": 29, "g": 30, "h": 31, "\u00ab": 32, "\u03cb": 33, "\u03bb": 34, "r": 35, "\u03af": 36, "\u03ca": 37, "\u03b4": 38, "\u0390": 39, "a": 40, "\u03c0": 41, "\u03c4": 42, "e": 43, "o": 44, "n": 45, "\u03b8": 46, "\u03ce": 47, "\u03c1": 48, "\u03cc": 49, "\u03bd": 50, "|": 5, "[UNK]": 51, "[PAD]": 52}
+ {"ώ": 0, "γ": 1, "n": 2, "ϋ": 3, "κ": 4, "e": 5, "ξ": 6, "'": 7, "θ": 8, "": 9, "σ": 10, "η": 11, "ι": 12, "α": 13, "ε": 14, "υ": 15, "v": 16, "μ": 17, "ο": 18, "«": 19, "»": 20, "έ": 21, "ν": 22, "ά": 24, "o": 25, "ζ": 26, "β": 27, "τ": 28, "π": 29, "ή": 30, "ψ": 31, "ΐ": 32, "ό": 33, "h": 34, "ύ": 35, "ω": 36, "´": 37, "χ": 38, "ϊ": 39, "ρ": 40, "a": 41, "ς": 42, "r": 43, "g": 44, "m": 45, "λ": 46, "́": 47, "ί": 48, "φ": 49, "δ": 50, "|": 23, "[UNK]": 51, "[PAD]": 52}
wav2vec2-large-xlsr-greek/checkpoint-18400/optimizer.pt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:272a2cd55e47d89c16aadfc21a5186b3c1ee4f0dd61f67d1dfbe6e325392f208
- size 2490511751
wav2vec2-large-xlsr-greek/checkpoint-18400/scheduler.pt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:191d69c70a4b492d24ee1d59dad04edf9839c09a08b9cc6bd41530cfdaa0092c
- size 623
wav2vec2-large-xlsr-greek/preprocessor_config.json DELETED
@@ -1,8 +0,0 @@
- {
- "do_normalize": true,
- "feature_size": 1,
- "padding_side": "right",
- "padding_value": 0.0,
- "return_attention_mask": true,
- "sampling_rate": 16000
- }
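The preprocessor config shown above sets `sampling_rate` to 16000, and the README change in this commit asks for speech files sampled at 16 kHz. A minimal sketch of checking a WAV file's sample rate before inference, using only Python's standard `wave` module. The function name `is_16khz` and the constant `EXPECTED_RATE` are hypothetical and not part of this repository:

```python
import wave

# Assumed target rate, taken from "sampling_rate" in preprocessor_config.json.
EXPECTED_RATE = 16000


def is_16khz(path):
    """Return True if the WAV file at `path` is sampled at 16 kHz."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate() == EXPECTED_RATE
```

A file that fails this check would need resampling before being passed to the model; the resampling step itself is outside this sketch.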
wav2vec2-large-xlsr-greek/vocab.json DELETED
@@ -1 +0,0 @@
- {"ώ": 0, "γ": 1, "n": 2, "ϋ": 3, "κ": 4, "e": 5, "ξ": 6, "'": 7, "θ": 8, "’": 9, "σ": 10, "η": 11, "ι": 12, "α": 13, "ε": 14, "υ": 15, "v": 16, "μ": 17, "ο": 18, "«": 19, "»": 20, "έ": 21, "ν": 22, "ά": 24, "o": 25, "ζ": 26, "β": 27, "τ": 28, "π": 29, "ή": 30, "ψ": 31, "ΐ": 32, "ό": 33, "h": 34, "ύ": 35, "ω": 36, "´": 37, "χ": 38, "ϊ": 39, "ρ": 40, "a": 41, "ς": 42, "r": 43, "g": 44, "m": 45, "λ": 46, "́": 47, "ί": 48, "φ": 49, "δ": 50, "|": 23, "[UNK]": 51, "[PAD]": 52}
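Each vocab.json version in this diff maps 53 tokens to the ids 0 through 52, with `|`, `[UNK]`, and `[PAD]` listed last. A minimal sketch of a sanity check that a vocabulary's ids are unique and contiguous from 0. The helper name `check_vocab` and the toy vocabulary are illustrative assumptions, not files or code from this repository:

```python
import json


def check_vocab(vocab_json):
    """Parse a vocab.json string and verify its ids are exactly 0..N-1."""
    vocab = json.loads(vocab_json)
    ids = sorted(vocab.values())
    if ids != list(range(len(vocab))):
        raise ValueError("token ids are not unique and contiguous from 0")
    return vocab


# Toy vocabulary for illustration only, not the real file.
toy = '{"α": 0, "β": 1, "|": 2, "[UNK]": 3, "[PAD]": 4}'
vocab = check_vocab(toy)
print(len(vocab))  # 5
```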