mikhail-panzo committed on
Commit 3dbe8a2
1 Parent(s): f642054

End of training

Files changed (1)
  README.md (+9, -13)
README.md CHANGED
@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [mikhail-panzo/malay_full_checkpoint](https://huggingface.co/mikhail-panzo/malay_full_checkpoint) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3935
+- Loss: 0.4735
 
 ## Model description
 
@@ -34,7 +34,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.0001
+- learning_rate: 1e-06
 - train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
@@ -43,23 +43,19 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2000
-- training_steps: 5000
+- training_steps: 3000
 - mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.4871        | 6.76  | 500  | 0.4428          |
-| 0.4518        | 13.51 | 1000 | 0.4171          |
-| 0.4426        | 20.27 | 1500 | 0.4133          |
-| 0.4355        | 27.03 | 2000 | 0.4077          |
-| 0.425         | 33.78 | 2500 | 0.4013          |
-| 0.4167        | 40.54 | 3000 | 0.4020          |
-| 0.4037        | 47.3  | 3500 | 0.4051          |
-| 0.3933        | 54.05 | 4000 | 0.3945          |
-| 0.3875        | 60.81 | 4500 | 0.3928          |
-| 0.3828        | 67.57 | 5000 | 0.3935          |
+| 0.7579        | 6.76  | 500  | 0.7164          |
+| 0.6196        | 13.51 | 1000 | 0.5662          |
+| 0.5622        | 20.27 | 1500 | 0.5077          |
+| 0.5341        | 27.03 | 2000 | 0.4858          |
+| 0.52          | 33.78 | 2500 | 0.4772          |
+| 0.5233        | 40.54 | 3000 | 0.4735          |
 
 
 ### Framework versions
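The new hyperparameters combine `lr_scheduler_type: linear`, `lr_scheduler_warmup_steps: 2000`, and `training_steps: 3000`, so the learning rate ramps from 0 up to 1e-06 over the first 2000 steps and then decays linearly back to 0 over the last 1000. A minimal sketch of that schedule, assuming the standard linear-with-warmup behavior (not the exact `transformers.Trainer` internals):

```python
# Sketch of the linear warmup + linear decay schedule implied by the
# hyperparameters above: learning_rate=1e-06, warmup_steps=2000,
# training_steps=3000. Function name and signature are illustrative.

def linear_warmup_lr(step: int,
                     base_lr: float = 1e-06,
                     warmup_steps: int = 2000,
                     total_steps: int = 3000) -> float:
    """Learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: ramp linearly from base_lr back down to 0.
    return base_lr * max(0, total_steps - step) / max(1, total_steps - warmup_steps)

if __name__ == "__main__":
    for s in (0, 1000, 2000, 2500, 3000):
        print(s, linear_warmup_lr(s))
```

Note that with only 1000 decay steps after a 2000-step warmup, two thirds of the run is spent below the peak learning rate, which is consistent with the slower, steadier loss curve in the new training-results table.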