End of training
README.md CHANGED
@@ -1,6 +1,10 @@
 ---
+base_model: Samuael/geez_15k_sc_mt5
 tags:
 - generated_from_trainer
+metrics:
+- wer
+- bleu
 model-index:
 - name: geez_15k_sc_mt5
   results: []
@@ -11,7 +15,12 @@ should probably proofread and complete it, then remove this comment. -->
 
 # geez_15k_sc_mt5
 
-This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
+This model is a fine-tuned version of [Samuael/geez_15k_sc_mt5](https://huggingface.co/Samuael/geez_15k_sc_mt5) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.8328
+- Wer: 0.3812
+- Cer: 0.1999
+- Bleu: 56.7220
 
 ## Model description
 
@@ -30,13 +39,79 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.
+- learning_rate: 0.001
 - train_batch_size: 64
-- eval_batch_size:
+- eval_batch_size: 128
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs:
+- num_epochs: 60
+
+### Training results
+
+| Training Loss | Epoch | Step  | Validation Loss | Wer    | Cer    | Bleu    |
+|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:-------:|
+| 1.2083        | 1.0   | 777   | 1.6171          | 2.2784 | 1.9212 | 5.8020  |
+| 0.688         | 2.0   | 1554  | 1.0003          | 1.6859 | 1.3688 | 15.2189 |
+| 0.5511        | 3.0   | 2331  | 0.8504          | 1.2962 | 1.0018 | 22.7065 |
+| 0.4668        | 4.0   | 3108  | 0.7743          | 0.9571 | 0.7079 | 30.4793 |
+| 0.4484        | 5.0   | 3885  | 0.7333          | 0.9577 | 0.7284 | 31.3887 |
+| 0.4025        | 6.0   | 4662  | 0.7124          | 0.8598 | 0.6273 | 34.6893 |
+| 0.4051        | 7.0   | 5439  | 0.6944          | 0.9169 | 0.6990 | 33.8591 |
+| 0.3447        | 8.0   | 6216  | 0.6947          | 0.8415 | 0.6091 | 35.9903 |
+| 0.2795        | 9.0   | 6993  | 0.6986          | 0.7251 | 0.5160 | 39.3956 |
+| 0.2969        | 10.0  | 7770  | 0.6814          | 0.6958 | 0.4835 | 40.8659 |
+| 0.3033        | 11.0  | 8547  | 0.6894          | 0.6601 | 0.4389 | 42.3079 |
+| 0.2893        | 12.0  | 9324  | 0.6619          | 0.6064 | 0.3952 | 44.4939 |
+| 0.2943        | 13.0  | 10101 | 0.6611          | 0.6190 | 0.3981 | 44.3940 |
+| 0.2632        | 14.0  | 10878 | 0.6647          | 0.5360 | 0.3386 | 47.5831 |
+| 0.2145        | 15.0  | 11655 | 0.6586          | 0.6001 | 0.3966 | 45.1441 |
+| 0.2136        | 16.0  | 12432 | 0.6777          | 0.5597 | 0.3603 | 46.7637 |
+| 0.2275        | 17.0  | 13209 | 0.6560          | 0.5175 | 0.3315 | 49.1168 |
+| 0.1672        | 18.0  | 13986 | 0.7088          | 0.5404 | 0.3398 | 47.7639 |
+| 0.1593        | 19.0  | 14763 | 0.6990          | 0.5919 | 0.4052 | 45.8245 |
+| 0.1587        | 20.0  | 15540 | 0.6815          | 0.5078 | 0.3086 | 49.5436 |
+| 0.1546        | 21.0  | 16317 | 0.6927          | 0.5122 | 0.3109 | 49.3813 |
+| 0.1719        | 22.0  | 17094 | 0.6857          | 0.5659 | 0.3721 | 47.1962 |
+| 0.1462        | 23.0  | 17871 | 0.7005          | 0.5081 | 0.2973 | 49.7675 |
+| 0.1297        | 24.0  | 18648 | 0.7067          | 0.4305 | 0.2452 | 53.4919 |
+| 0.1311        | 25.0  | 19425 | 0.7041          | 0.5335 | 0.3327 | 48.9495 |
+| 0.1078        | 26.0  | 20202 | 0.7083          | 0.4138 | 0.2244 | 54.5218 |
+| 0.1208        | 27.0  | 20979 | 0.7141          | 0.4513 | 0.2672 | 52.7653 |
+| 0.1291        | 28.0  | 21756 | 0.7110          | 0.4907 | 0.3055 | 50.7751 |
+| 0.0892        | 29.0  | 22533 | 0.7335          | 0.4447 | 0.2580 | 52.8261 |
+| 0.1139        | 30.0  | 23310 | 0.7295          | 0.4373 | 0.2513 | 53.3666 |
+| 0.0962        | 31.0  | 24087 | 0.7322          | 0.4787 | 0.2929 | 51.6834 |
+| 0.1119        | 32.0  | 24864 | 0.7371          | 0.4630 | 0.2734 | 52.1441 |
+| 0.0929        | 33.0  | 25641 | 0.7569          | 0.4270 | 0.2374 | 53.9869 |
+| 0.1068        | 34.0  | 26418 | 0.7512          | 0.4376 | 0.2539 | 53.5690 |
+| 0.0799        | 35.0  | 27195 | 0.7422          | 0.4066 | 0.2234 | 55.3981 |
+| 0.0883        | 36.0  | 27972 | 0.7552          | 0.4146 | 0.2367 | 54.9842 |
+| 0.0885        | 37.0  | 28749 | 0.7708          | 0.4189 | 0.2351 | 54.4978 |
+| 0.0634        | 38.0  | 29526 | 0.7738          | 0.4755 | 0.2826 | 51.6775 |
+| 0.0738        | 39.0  | 30303 | 0.7601          | 0.4191 | 0.2315 | 54.6229 |
+| 0.0889        | 40.0  | 31080 | 0.7724          | 0.4029 | 0.2191 | 55.2928 |
+| 0.0741        | 41.0  | 31857 | 0.7743          | 0.4336 | 0.2464 | 53.7418 |
+| 0.0798        | 42.0  | 32634 | 0.7771          | 0.4188 | 0.2273 | 54.7897 |
+| 0.0686        | 43.0  | 33411 | 0.7710          | 0.4050 | 0.2247 | 55.6628 |
+| 0.0798        | 44.0  | 34188 | 0.7839          | 0.4328 | 0.2495 | 54.0205 |
+| 0.0564        | 45.0  | 34965 | 0.7814          | 0.4065 | 0.2143 | 55.2743 |
+| 0.0534        | 46.0  | 35742 | 0.7995          | 0.4085 | 0.2306 | 55.3014 |
+| 0.0665        | 47.0  | 36519 | 0.7883          | 0.3745 | 0.1945 | 57.1238 |
+| 0.0622        | 48.0  | 37296 | 0.8022          | 0.4118 | 0.2388 | 55.3067 |
+| 0.0672        | 49.0  | 38073 | 0.7893          | 0.3974 | 0.2265 | 55.8381 |
+| 0.0707        | 50.0  | 38850 | 0.8055          | 0.4160 | 0.2350 | 55.0484 |
+| 0.0705        | 51.0  | 39627 | 0.7914          | 0.3751 | 0.1977 | 57.3312 |
+| 0.0622        | 52.0  | 40404 | 0.8046          | 0.3775 | 0.2001 | 57.2524 |
+| 0.0485        | 53.0  | 41181 | 0.8065          | 0.4389 | 0.2553 | 53.8869 |
+| 0.0412        | 54.0  | 41958 | 0.8196          | 0.3788 | 0.1980 | 56.7680 |
+| 0.0736        | 55.0  | 42735 | 0.8035          | 0.4024 | 0.2145 | 55.9035 |
+| 0.0463        | 56.0  | 43512 | 0.8027          | 0.3636 | 0.1828 | 57.8132 |
+| 0.0545        | 57.0  | 44289 | 0.8421          | 0.3931 | 0.2144 | 56.3515 |
+| 0.0549        | 58.0  | 45066 | 0.8090          | 0.4325 | 0.2522 | 54.1258 |
+| 0.0563        | 59.0  | 45843 | 0.8210          | 0.4088 | 0.2316 | 55.5962 |
+| 0.0343        | 60.0  | 46620 | 0.8328          | 0.3812 | 0.1999 | 56.7220 |
+
 
 ### Framework versions
 
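The Wer and Cer columns in the training-results diff above are word and character error rates. The scoring script used for this run is not part of the card, but as an illustrative sketch, these metrics are conventionally computed as Levenshtein edit distance normalized by reference length, which also explains why Wer can exceed 1.0 in early epochs (e.g. 2.2784 at epoch 1): the distance is normalized, not bounded.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert/delete/substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete from a
                            curr[j - 1] + 1,          # insert into a
                            prev[j - 1] + (x != y)))  # substitute
        prev = curr
    return prev[-1]

def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    return levenshtein(ref, hyp) / len(ref)

def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    return levenshtein(reference, hypothesis) / len(reference)

print(wer("the cat sat", "the cat sat"))  # 0.0
print(wer("the cat sat", "the cat"))      # 0.333...: one missing word out of 3
```

Because the denominator is the reference length only, a hypothesis much longer than the reference drives the rate above 1.0, matching the early-epoch values in the table.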
config.json CHANGED
@@ -1,4 +1,5 @@
 {
+  "_name_or_path": "Samuael/geez_15k_sc_mt5",
   "architectures": [
     "T5ForConditionalGeneration"
   ],
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:12b85eb3b4fa7e63733923e08bf241e9f3035e8687db7e2c6e7407bb49983223
 size 291090192
runs/Mar23_18-23-56_29117dfaa79d/events.out.tfevents.1711218353.29117dfaa79d.3409.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:76b36edf7ec8bb5652ef54cc0b00092e9a56b8913c835226e699f29392fb3d07
+size 7926
runs/Mar23_18-26-27_29117dfaa79d/events.out.tfevents.1711218522.29117dfaa79d.3409.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:25eb2b8fb6bef4a2c7e20c105a0ab9c07e577d7df5f1195d766ff03ed0d783bb
+size 9987533
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:ad545b7e3a875b4d3a93bbdd7eb800a4746f505f1a7f08b19a183b45283d809d
 size 5048
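The model.safetensors, training_args.bin, and runs/ entries in this commit are git-lfs pointer files (version, oid, size triples), not the binaries themselves. As a minimal sketch, assuming Python 3.9+ and a hypothetical local file path, a pointer can be parsed and a downloaded artifact checked against its sha256 oid and byte size like so:

```python
import hashlib

def parse_lfs_pointer(text):
    """Parse a git-lfs pointer ('version'/'oid'/'size' lines) into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "oid": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }

def verify_artifact(path, pointer):
    """Return True if the file at `path` matches the pointer's hash and size."""
    h = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
            size += len(chunk)
    return h.hexdigest() == pointer["oid"] and size == pointer["size"]

# Example: the training_args.bin pointer from this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:ad545b7e3a875b4d3a93bbdd7eb800a4746f505f1a7f08b19a183b45283d809d\n"
    "size 5048\n"
)
print(pointer["size"])  # 5048
```

In practice `git lfs pull` or the Hub's download tooling performs this verification automatically; the sketch is only meant to show what the oid/size fields in the diffs above encode.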