
donut_experiment_bayesian_trial_0

This model is a fine-tuned version of naver-clova-ix/donut-base on an unspecified dataset. It achieves the following results on the evaluation set (a sketch for recomputing these metrics follows the list):

  • Loss: 0.4050
  • BLEU: 0.0639
  • Precisions (1- to 4-gram): [0.79957805907173, 0.7386091127098321, 0.7083333333333334, 0.6765676567656765]
  • Brevity Penalty: 0.0876
  • Length Ratio: 0.2912
  • Translation Length: 474
  • Reference Length: 1628
  • CER: 0.7653
  • WER: 0.8371
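
Note that the low BLEU score is dominated by the brevity penalty: with a translation length of 474 against a reference length of 1628, BP = exp(1 - 1628/474) ≈ 0.0876. The metrics above can be recomputed with the Hugging Face evaluate library; in the minimal sketch below, pred_texts and ref_texts are hypothetical placeholders for the decoded model outputs and the ground-truth strings.

```python
# A minimal sketch of recomputing the evaluation metrics above with the
# `evaluate` library; `pred_texts` and `ref_texts` are hypothetical
# placeholders, not part of this repository.
import evaluate

bleu = evaluate.load("bleu")
cer = evaluate.load("cer")
wer = evaluate.load("wer")

pred_texts = ["<decoded model output>"]      # one string per eval sample
ref_texts = ["<ground-truth target text>"]   # matching references

# The BLEU metric returns the same fields reported above: 'bleu',
# 'precisions', 'brevity_penalty', 'length_ratio', 'translation_length',
# and 'reference_length'.
bleu_results = bleu.compute(
    predictions=pred_texts,
    references=[[r] for r in ref_texts],
)
print(bleu_results)

# CER and WER each return a single float.
print(cer.compute(predictions=pred_texts, references=ref_texts))
print(wer.compute(predictions=pred_texts, references=ref_texts))
```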

Model description

More information needed

Intended uses & limitations

More information needed
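
Pending a fuller description, the sketch below shows the standard way to run inference with a Donut checkpoint through the transformers VisionEncoderDecoder API. The repository id and the task prompt token are assumptions, since the card does not state them; substitute the real values.

```python
# A minimal inference sketch for a Donut-style checkpoint. The repo id and
# the task prompt below are hypothetical placeholders.
import torch
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

repo_id = "your-username/donut_experiment_bayesian_trial_0"  # hypothetical
processor = DonutProcessor.from_pretrained(repo_id)
model = VisionEncoderDecoderModel.from_pretrained(repo_id)
model.eval()

image = Image.open("document.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

# Donut conditions generation on a task-specific prompt token; "<s>" is a
# placeholder, since the actual token depends on how the model was fine-tuned.
task_prompt = "<s>"
decoder_input_ids = processor.tokenizer(
    task_prompt, add_special_tokens=False, return_tensors="pt"
).input_ids

with torch.no_grad():
    outputs = model.generate(
        pixel_values,
        decoder_input_ids=decoder_input_ids,
        max_length=model.config.decoder.max_position_embeddings,
        pad_token_id=processor.tokenizer.pad_token_id,
        eos_token_id=processor.tokenizer.eos_token_id,
    )

print(processor.batch_decode(outputs, skip_special_tokens=True)[0])
```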

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent Seq2SeqTrainingArguments follows the list):

  • learning_rate: 1.2045081648781836e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 2
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
  • mixed_precision_training: Native AMP
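
These settings map onto transformers training arguments as sketched below. This is a reconstruction under stated assumptions, not the actual training script; the output directory is a placeholder, and the model and dataset setup are omitted.

```python
# A minimal sketch mapping the hyperparameters above onto transformers'
# Seq2SeqTrainingArguments; the output_dir is a hypothetical placeholder.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./donut_experiment_bayesian_trial_0",  # placeholder
    learning_rate=1.2045081648781836e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size of 2
    lr_scheduler_type="linear",
    num_train_epochs=5,
    fp16=True,                       # "Native AMP" mixed-precision training
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the default
    # AdamW configuration, so no extra optimizer arguments are needed.
)
```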

Training results

| Training Loss | Epoch | Step | Validation Loss | BLEU | Precisions (1- to 4-gram) | Brevity Penalty | Length Ratio | Translation Length | Reference Length | CER | WER |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| 0.9353 | 1.0 | 253 | 0.6228 | 0.0486 | [0.7096774193548387, 0.6053921568627451, 0.5612535612535613, 0.5102040816326531] | 0.0820 | 0.2856 | 465 | 1628 | 0.7751 | 0.8592 |
| 0.462 | 2.0 | 506 | 0.4846 | 0.0568 | [0.7913978494623656, 0.7058823529411765, 0.6609686609686609, 0.6224489795918368] | 0.0820 | 0.2856 | 465 | 1628 | 0.7650 | 0.8423 |
| 0.4071 | 3.0 | 759 | 0.4226 | 0.0626 | [0.7899159663865546, 0.711217183770883, 0.6767955801104972, 0.6459016393442623] | 0.0889 | 0.2924 | 476 | 1628 | 0.7685 | 0.8436 |
| 0.3007 | 4.0 | 1012 | 0.4092 | 0.0638 | [0.7957894736842105, 0.7344497607655502, 0.7008310249307479, 0.6644736842105263] | 0.0883 | 0.2918 | 475 | 1628 | 0.7640 | 0.8397 |
| 0.3114 | 5.0 | 1265 | 0.4050 | 0.0639 | [0.79957805907173, 0.7386091127098321, 0.7083333333333334, 0.6765676567656765] | 0.0876 | 0.2912 | 474 | 1628 | 0.7653 | 0.8371 |

Framework versions

  • Transformers 4.40.0
  • PyTorch 2.1.0
  • Datasets 2.18.0
  • Tokenizers 0.19.1