---
language:
- uz
license: apache-2.0
tags:
- automatic-speech-recognition
- generated_from_trainer
- hf-asr-leaderboard
- mozilla-foundation/common_voice_8_0
- robust-speech-event
datasets:
- mozilla-foundation/common_voice_8_0
base_model: facebook/wav2vec2-xls-r-300m
model-index:
- name: XLS-R-300M Uzbek CV8
  results:
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: Common Voice 8
      type: mozilla-foundation/common_voice_8_0
      args: uz
    metrics:
    - type: wer
      value: 15.065
      name: Test WER (with LM)
    - type: cer
      value: 3.077
      name: Test CER (with LM)
    - type: wer
      value: 32.88
      name: Test WER (no LM)
    - type: cer
      value: 6.53
      name: Test CER (no LM)
---

# XLS-R-300M Uzbek CV8

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) for Uzbek, trained by transfer learning on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - UZ dataset and paired with an n-gram language model.
It achieves the following results:
- Loss: 0.3063
- Wer: 0.3852
- Cer: 0.0777
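
For readers unfamiliar with the two error metrics, here is a minimal sketch of how word error rate (WER) and character error rate (CER) are conventionally defined: edit distance divided by reference length. The function names and the sample sentences are hypothetical, and the card does not state which library produced the reported numbers, so treat this as an illustration of the definition rather than the evaluation code that was used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] is the distance between the processed prefix of ref
    # and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a fraction of the reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a fraction of the reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical reference/hypothesis pair, for illustration only.
reference = "men kitob o‘qiyman"
hypothesis = "men kitob oqiyman"
print(wer(reference, hypothesis))  # one wrong word out of three
print(cer(reference, hypothesis))  # one missing character out of eighteen
```

Because insertions count as errors, WER can exceed 1.0, which is why a value such as 1.0065 can appear early in training.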

## Model description

For more information about the model architecture, see [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m).

The model vocabulary consists of the modern Latin alphabet for Uzbek, with punctuation removed (https://en.wikipedia.org/wiki/Uzbek_alphabet).
Note that the characters <‘> and <’> are not treated as punctuation; where either character follows \<o\> or \<g\>, it was replaced with <‘>.
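
The vocabulary rules above can be sketched as a small text-normalization function. This is an illustrative reconstruction, not the preprocessing script used for training: the function name `normalize_transcript` is hypothetical, and the exact set of characters that were stripped is an assumption (here, everything that is not a word character, whitespace, <‘> or <’>).

```python
import re


def normalize_transcript(text):
    """Apply the apostrophe rule, then strip punctuation (assumed rules)."""
    # After <o> or <g>, either quote mark is rewritten to <‘> (as in o‘ and g‘).
    text = re.sub(r"([ogOG])[‘’]", r"\1‘", text)
    # Assumption: drop everything that is not a word character,
    # whitespace, <‘> or <’>.
    text = re.sub(r"[^\w\s‘’]", "", text)
    # Collapse any whitespace runs left behind by removed punctuation.
    return " ".join(text.split())


# Hypothetical input sentence, for illustration only.
print(normalize_transcript("O’zbekiston, go‘zal yurt!"))
```

A <’> that does not follow \<o\> or \<g\>, as in "ma’no", passes through unchanged.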

The decoder uses a kenlm language model built from the common_voice text.

## Intended uses & limitations

This model is expected to be useful for the following use cases:
- Video subtitling
- Indexing recorded broadcasts

The model is not well suited to captioning live meetings or broadcasts, and it should not be used in ways that would endanger the privacy of contributors to the Common Voice dataset or of other speakers.

## Training and evaluation data

50% of the official Common Voice `train` split was used as training data, and 50% of the official `dev` split was used as validation data. The full `test` set was used for the final evaluation of the model without the LM, while the model with the LM was evaluated on only 500 examples from the `test` set.

The kenlm language model was compiled from the target sentences of the `train` and `other` dataset splits.

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 100.0
- mixed_precision_training: Native AMP
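
Two of the values above follow from the others, and a short sketch may make the relationship concrete. The code below shows how `total_train_batch_size` results from `train_batch_size` and `gradient_accumulation_steps`, and how a linear schedule with 500 warmup steps shapes the learning rate. The function names are hypothetical, a single training device is assumed, and `TOTAL_STEPS` is an estimate (the card does not state it): roughly 154 steps per epoch, inferred from step 500 falling at epoch 3.25, times 100 epochs.

```python
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 32
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 500
# Assumption: not stated in the card; estimated as ~154 steps/epoch * 100 epochs.
TOTAL_STEPS = 15_400


def total_train_batch_size(per_step_batch, accumulation_steps, num_devices=1):
    """Examples contributing to one optimizer update (num_devices=1 is assumed)."""
    return per_step_batch * accumulation_steps * num_devices


def linear_schedule_lr(step, peak_lr=LEARNING_RATE,
                       warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Linear warmup from 0 to peak_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


print(total_train_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
for step in (0, 250, 500, 8000, 15_400):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the learning rate peaks at 3e-05 at step 500 and then falls steadily for the rest of training.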

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Wer    | Cer    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|
| 3.1401        | 3.25  | 500   | 3.1146          | 1.0    | 1.0    |
| 2.7484        | 6.49  | 1000  | 2.2842          | 1.0065 | 0.7069 |
| 1.0899        | 9.74  | 1500  | 0.5414          | 0.6125 | 0.1351 |
| 0.9465        | 12.99 | 2000  | 0.4566          | 0.5635 | 0.1223 |
| 0.8771        | 16.23 | 2500  | 0.4212          | 0.5366 | 0.1161 |
| 0.8346        | 19.48 | 3000  | 0.3994          | 0.5144 | 0.1102 |
| 0.8127        | 22.73 | 3500  | 0.3819          | 0.4944 | 0.1051 |
| 0.7833        | 25.97 | 4000  | 0.3705          | 0.4798 | 0.1011 |
| 0.7603        | 29.22 | 4500  | 0.3661          | 0.4704 | 0.0992 |
| 0.7424        | 32.47 | 5000  | 0.3529          | 0.4577 | 0.0957 |
| 0.7251        | 35.71 | 5500  | 0.3410          | 0.4473 | 0.0928 |
| 0.7106        | 38.96 | 6000  | 0.3401          | 0.4428 | 0.0919 |
| 0.7027        | 42.21 | 6500  | 0.3355          | 0.4353 | 0.0905 |
| 0.6927        | 45.45 | 7000  | 0.3308          | 0.4296 | 0.0885 |
| 0.6828        | 48.7  | 7500  | 0.3246          | 0.4204 | 0.0863 |
| 0.6706        | 51.95 | 8000  | 0.3250          | 0.4233 | 0.0868 |
| 0.6629        | 55.19 | 8500  | 0.3264          | 0.4159 | 0.0849 |
| 0.6556        | 58.44 | 9000  | 0.3213          | 0.4100 | 0.0835 |
| 0.6484        | 61.69 | 9500  | 0.3182          | 0.4124 | 0.0837 |
| 0.6407        | 64.93 | 10000 | 0.3171          | 0.4050 | 0.0825 |
| 0.6375        | 68.18 | 10500 | 0.3150          | 0.4039 | 0.0822 |
| 0.6363        | 71.43 | 11000 | 0.3129          | 0.3991 | 0.0810 |
| 0.6307        | 74.67 | 11500 | 0.3114          | 0.3986 | 0.0807 |
| 0.6232        | 77.92 | 12000 | 0.3103          | 0.3895 | 0.0790 |
| 0.6216        | 81.17 | 12500 | 0.3086          | 0.3891 | 0.0790 |
| 0.6174        | 84.41 | 13000 | 0.3082          | 0.3881 | 0.0785 |
| 0.6196        | 87.66 | 13500 | 0.3059          | 0.3875 | 0.0782 |
| 0.6174        | 90.91 | 14000 | 0.3084          | 0.3862 | 0.0780 |
| 0.6169        | 94.16 | 14500 | 0.3070          | 0.3860 | 0.0779 |
| 0.6166        | 97.4  | 15000 | 0.3066          | 0.3855 | 0.0778 |


### Framework versions

- Transformers 4.16.2
- Pytorch 1.10.2+cu102
- Datasets 1.18.3
- Tokenizers 0.11.0