n3wtou committed on
Commit ab9ce41
1 Parent(s): 40a026f

Training in progress epoch 0

Files changed (3)
  1. README.md +8 -23
  2. tf_model.h5 +1 -1
  3. tokenizer.json +2 -2
README.md CHANGED
@@ -5,12 +5,6 @@ tags:
 model-index:
 - name: n3wtou/mt5-small-finedtuned-4-swahili
   results: []
-datasets:
-- csebuetnlp/xlsum
-language:
-- sw
-metrics:
-- rouge
 ---
 
 <!-- This model card has been generated automatically according to the information Keras had access to. You should
@@ -18,15 +12,15 @@ probably proofread and complete it, then remove this comment. -->
 
 # n3wtou/mt5-small-finedtuned-4-swahili
 
-This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the [csebuetnlp/xlsum](https://huggingface.co/datasets/csebuetnlp/xlsum/viewer/swahili/train) dataset.
+This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Train Loss: 3.0006
-- Validation Loss: 2.7015
-- Epoch: 9
+- Train Loss: 7.7378
+- Validation Loss: 4.4308
+- Epoch: 0
 
 ## Model description
 
-This model is a fined-tuned google/mt5-small for Kiswahili abstractive text generation
+More information needed
 
 ## Intended uses & limitations
 
@@ -41,23 +35,14 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0003, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0003, 'decay_steps': 9900, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 100, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
+- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0003, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0003, 'decay_steps': 99900, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 100, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
 - training_precision: mixed_float16
 
 ### Training results
 
 | Train Loss | Validation Loss | Epoch |
 |:----------:|:---------------:|:-----:|
-| 7.0434 | 3.2396 | 0 |
-| 4.3604 | 3.0452 | 1 |
-| 3.9184 | 2.9186 | 2 |
-| 3.6516 | 2.8443 | 3 |
-| 3.4569 | 2.7955 | 4 |
-| 3.3146 | 2.7645 | 5 |
-| 3.2039 | 2.7292 | 6 |
-| 3.1135 | 2.7182 | 7 |
-| 3.0450 | 2.7040 | 8 |
-| 3.0006 | 2.7015 | 9 |
+| 7.7378 | 4.4308 | 0 |
 
 
 ### Framework versions
@@ -65,4 +50,4 @@ The following hyperparameters were used during training:
 - Transformers 4.29.2
 - TensorFlow 2.12.0
 - Datasets 2.12.0
-- Tokenizers 0.13.3
+- Tokenizers 0.13.3
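Note: the serialized optimizer above is the standard `transformers` TensorFlow setup, an AdamWeightDecay optimizer wrapped in a WarmUp/PolynomialDecay schedule. A minimal sketch of how to rebuild it with `transformers.create_optimizer`, assuming the run used that helper; the total step count of 100,000 is inferred from warmup_steps=100 plus decay_steps=99900, and the `model.compile` line refers to a hypothetical Keras model, not anything in this repo:

```python
import tensorflow as tf
from transformers import create_optimizer

# Matches the card's training_precision: mixed_float16
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Values read off the serialized config in the README diff:
#   initial_learning_rate=3e-4, warmup_steps=100, decay_steps=99900,
#   weight_decay_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-8
optimizer, lr_schedule = create_optimizer(
    init_lr=3e-4,
    num_train_steps=100_000,  # warmup_steps + decay_steps (assumed split)
    num_warmup_steps=100,
    weight_decay_rate=0.001,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
# model.compile(optimizer=optimizer)  # 'model' is a hypothetical Keras model
```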
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6900044f1b887bb31c11e27a0d5361d61b05413c19e6ae5de8e7c0d225244480
+oid sha256:e4f68b35a9162f1ba2d5b8d650df6411e2882fde475ee042e5e0517fe378dc9a
 size 2225556280
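The weights change is a Git LFS pointer update only: the byte size is unchanged and only the sha256 oid moves. A quick way to confirm a locally downloaded tf_model.h5 matches this commit is to hash it in streaming fashion; a small sketch, where the local path assumes a clone of the repo:

```python
import hashlib

def lfs_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so a ~2.2 GB checkpoint never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# oid recorded in the pointer file after this commit:
expected = "e4f68b35a9162f1ba2d5b8d650df6411e2882fde475ee042e5e0517fe378dc9a"
print(lfs_sha256("tf_model.h5") == expected)  # path is an assumption
```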
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:93c3578052e1605d8332eb961bc08d72e246071974e4cc54aa6991826b802aa5
-size 16330369
+oid sha256:faaa6405f5f79c9e788c7980874a9a3b5b0aea07b53bd9243bf1abb8f5c49c81
+size 16330467
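With the tokenizer and weights now pointing at the epoch-0 checkpoint, loading this revision for inference follows the usual TensorFlow `transformers` path. A minimal sketch pinned to this commit; the Swahili input text and the generation settings are illustrative placeholders, not values from the repo:

```python
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

repo_id = "n3wtou/mt5-small-finedtuned-4-swahili"
# revision pins the download to this exact commit (ab9ce41)
tokenizer = AutoTokenizer.from_pretrained(repo_id, revision="ab9ce41")
model = TFAutoModelForSeq2SeqLM.from_pretrained(repo_id, revision="ab9ce41")

text = "Habari ndefu ya kufupishwa inaingia hapa ..."  # placeholder input
inputs = tokenizer(text, return_tensors="tf", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```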