dtvingg committed
Commit 97c2ad2
1 Parent(s): ad8612d

Training complete

Files changed (1)
  1. README.md +15 -14
README.md CHANGED
@@ -19,8 +19,9 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [vinai/bartpho-syllable-base](https://huggingface.co/vinai/bartpho-syllable-base) on the None dataset.
 It achieves the following results on the evaluation set:
- - Loss: 0.0237
- - Sacrebleu: 98.1436
+ - Loss: 0.3488
+ - Model Preparation Time: 0.0071
+ - Sacrebleu: 92.9401

 ## Model description

@@ -40,11 +41,11 @@ More information needed

 The following hyperparameters were used during training:
 - learning_rate: 1e-05
- - train_batch_size: 24
- - eval_batch_size: 96
+ - train_batch_size: 12
+ - eval_batch_size: 48
 - seed: 42
 - gradient_accumulation_steps: 4
- - total_train_batch_size: 96
+ - total_train_batch_size: 48
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 5
@@ -52,18 +53,18 @@ The following hyperparameters were used during training:

 ### Training results

- | Training Loss | Epoch | Step | Validation Loss | Sacrebleu |
- |:-------------:|:-----:|:----:|:---------------:|:---------:|
- | No log | 1.0 | 179 | 0.0514 | 96.4718 |
- | No log | 2.0 | 358 | 0.0363 | 97.6428 |
- | 0.0916 | 3.0 | 537 | 0.0285 | 97.6959 |
- | 0.0916 | 4.0 | 716 | 0.0252 | 97.8422 |
- | 0.0916 | 5.0 | 895 | 0.0237 | 98.1436 |
+ | Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Sacrebleu |
+ |:-------------:|:------:|:----:|:---------------:|:----------------------:|:---------:|
+ | No log | 0.9231 | 3 | 0.7367 | 0.0071 | 89.5419 |
+ | No log | 1.8462 | 6 | 0.6013 | 0.0071 | 89.5419 |
+ | No log | 2.7692 | 9 | 0.4542 | 0.0071 | 89.5419 |
+ | No log | 4.0 | 13 | 0.3624 | 0.0071 | 92.9401 |
+ | No log | 4.6154 | 15 | 0.3488 | 0.0071 | 92.9401 |


 ### Framework versions

 - Transformers 4.46.3
- - Pytorch 2.4.0
- - Datasets 3.1.0
+ - Pytorch 2.5.1+cu121
+ - Datasets 3.2.0
 - Tokenizers 0.20.3
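
For reference, below is a minimal sketch of how the updated hyperparameters in this commit could be expressed with the `Seq2SeqTrainingArguments` API from `transformers`. This is not the author's actual training script; `output_dir`, `eval_strategy`, and `predict_with_generate` are illustrative assumptions rather than values recorded in the commit.

```python
# Minimal sketch (not the training script used for this commit): mapping the
# README's hyperparameters onto transformers' Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bartpho-finetuned",   # hypothetical output path, not from the commit
    learning_rate=1e-5,
    per_device_train_batch_size=12,   # train_batch_size in the card
    per_device_eval_batch_size=48,    # eval_batch_size in the card
    gradient_accumulation_steps=4,    # 12 * 4 = 48 total train batch size
    num_train_epochs=5,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    seed=42,
    eval_strategy="epoch",            # assumption; the card only reports per-epoch evaluations
    predict_with_generate=True,       # generate predictions so SacreBLEU can be computed
)
```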