---
license: other
base_model: boun-tabi-LMG/TURNA
tags:
- generated_from_trainer
metrics:
- rouge
- bleu
model-index:
- name: TURNA_spell_correction
results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# TURNA_spell_correction
This model is a fine-tuned version of [boun-tabi-LMG/TURNA](https://huggingface.co/boun-tabi-LMG/TURNA) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0425
- Rouge1: 0.8554
- Rouge2: 0.2215
- Rougel: 0.8561
- Rougelsum: 0.8555
- Bleu: 0.9246
- Precisions: [0.8658536585365854, 0.8441558441558441, 1.0, 1.0]
- Brevity Penalty: 1.0
- Length Ratio: 1.0017
- Translation Length: 574
- Reference Length: 573
- Meteor: 0.4890
- Score: 13.4380 (an edit-rate score: Num Edits / Ref Length × 100)
- Num Edits: 77
- Ref Length: 573.0
## Model description
This checkpoint fine-tunes [TURNA](https://huggingface.co/boun-tabi-LMG/TURNA), a Turkish encoder-decoder language model, for spell correction, as the model name indicates. Details of the task formulation and dataset are not provided.
## Intended uses & limitations
More information needed
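Since TURNA is a T5-style encoder-decoder model, this checkpoint can presumably be used through the standard Transformers seq2seq generation API. A minimal sketch (the repository id `Holmeister/TURNA_spell_correction`, the absence of a task prefix, and the generation settings are assumptions, not confirmed by this card):

```python
# Hypothetical inference sketch for this checkpoint.
# Assumptions: repo id, no task prefix, greedy decoding.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Holmeister/TURNA_spell_correction"  # assumed repo id


def correct_spelling(text: str) -> str:
    """Run the fine-tuned model on a single input sentence."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Example Turkish sentence with a deliberate typo ("hatsı" for "hatası").
    print(correct_spelling("Bu cümlede bir yazım hatsı var."))
```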
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
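The hyperparameters above map onto `transformers.Seq2SeqTrainingArguments` roughly as follows; this is a reconstruction for reference, where `output_dir`, the evaluation cadence, and `predict_with_generate` are assumptions not stated in this card:

```python
# Sketch of the listed hyperparameters as Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="TURNA_spell_correction",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    predict_with_generate=True,  # assumed; needed for ROUGE/BLEU eval
)
```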
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Meteor | Score | Num Edits | Ref Length |
|:-------------:|:------:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:------:|:-----------------------------------------------------------------:|:---------------:|:------------:|:------------------:|:----------------:|:------:|:-------:|:---------:|:----------:|
| No log | 0.5013 | 196 | 0.1719 | 0.5130 | 0.1519 | 0.5138 | 0.5132 | 0.0 | [0.5173611111111112, 0.46835443037974683, 0.0, 0.0] | 1.0 | 1.0052 | 576 | 573 | 0.2860 | 49.5637 | 284 | 573.0 |
| 1.8157 | 1.0026 | 392 | 0.0889 | 0.6976 | 0.2011 | 0.6993 | 0.6991 | 0.7805 | [0.7142857142857143, 0.7792207792207793, 0.6666666666666666, 1.0] | 1.0 | 1.0017 | 574 | 573 | 0.4017 | 29.1449 | 167 | 573.0 |
| 1.8157 | 1.5038 | 588 | 0.0701 | 0.7537 | 0.2023 | 0.7545 | 0.7554 | 0.8780 | [0.7688266199649737, 0.7837837837837838, 1.0, 1.0] | 0.9965 | 0.9965 | 571 | 573 | 0.4301 | 23.3857 | 134 | 573.0 |
| 0.0846 | 2.0051 | 784 | 0.0527 | 0.7805 | 0.214 | 0.7829 | 0.7813 | 0.9036 | [0.8017543859649123, 0.8493150684931506, 1.0, 1.0] | 0.9948 | 0.9948 | 570 | 573 | 0.4482 | 20.2443 | 116 | 573.0 |
| 0.0846 | 2.5064 | 980 | 0.0502 | 0.8055 | 0.215 | 0.8075 | 0.8073 | 0.9125 | [0.8181818181818182, 0.8533333333333334, 1.0, 1.0] | 0.9983 | 0.9983 | 572 | 573 | 0.4589 | 18.3246 | 105 | 573.0 |
| 0.0401 | 3.0077 | 1176 | 0.0441 | 0.8229 | 0.223 | 0.8248 | 0.8249 | 0.9214 | [0.8374125874125874, 0.8666666666666667, 1.0, 1.0] | 0.9983 | 0.9983 | 572 | 573 | 0.4693 | 16.4049 | 94 | 573.0 |
| 0.0401 | 3.5090 | 1372 | 0.0435 | 0.8467 | 0.231 | 0.8480 | 0.8473 | 0.9271 | [0.8583916083916084, 0.8666666666666667, 1.0, 1.0] | 0.9983 | 0.9983 | 572 | 573 | 0.4818 | 14.3106 | 82 | 573.0 |
| 0.0228 | 4.0102 | 1568 | 0.0427 | 0.8569 | 0.227 | 0.8591 | 0.8581 | 0.9400 | [0.8671328671328671, 0.9066666666666666, 1.0, 1.0] | 0.9983 | 0.9983 | 572 | 573 | 0.4874 | 13.4380 | 77 | 573.0 |
| 0.0228 | 4.5115 | 1764 | 0.0420 | 0.8539 | 0.224 | 0.8556 | 0.8539 | 0.9321 | [0.8636363636363636, 0.88, 1.0, 1.0] | 0.9983 | 0.9983 | 572 | 573 | 0.4866 | 13.7871 | 79 | 573.0 |
### Framework versions
- Transformers 4.40.1
- Pytorch 2.3.0+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1