donut_experiment_bayesian_trial_11

This model is a fine-tuned version of naver-clova-ix/donut-base on an unspecified dataset (the card records the dataset name as None). It achieves the following results on the evaluation set:

  • Loss: 0.4496
  • Bleu: 0.0662
  • Precisions: [0.8162839248434238, 0.7440758293838863, 0.7013698630136986, 0.6623376623376623]
  • Brevity Penalty: 0.0908
  • Length Ratio: 0.2942
  • Translation Length: 479
  • Reference Length: 1628
  • Cer: 0.7615
  • Wer: 0.8328

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.00012678733283601488
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 2
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 4
  • mixed_precision_training: Native AMP
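
To make the batch-size and scheduler settings concrete, the sketch below derives the effective batch size and evaluates a linear learning-rate decay at the end of each epoch. It is a plain-Python illustration, not the training code. It assumes zero warmup steps (the card lists none), a single device, and 1012 total optimizer steps (the final step in the training results); `linear_lr` is a hypothetical helper name.

```python
BASE_LR = 0.00012678733283601488   # learning_rate from the list above
TRAIN_BATCH_SIZE = 1               # train_batch_size
GRAD_ACCUM_STEPS = 2               # gradient_accumulation_steps
TOTAL_STEPS = 1012                 # last step reported in the training results
WARMUP_STEPS = 0                   # assumption: the card does not list warmup

# Effective batch size per optimizer step (assuming a single device).
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

print(f"effective batch size: {effective_batch_size}")
for step in (0, 253, 506, 759, 1012):  # start of training and end of each epoch
    print(f"step {step:4d}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate starts at the configured value, is halved by the end of epoch 2, and reaches zero at step 1012.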

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Cer | Wer |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5513 | 1.0 | 253 | 0.5209 | 0.0666 | [0.7287128712871287, 0.6450892857142857, 0.578005115089514, 0.5269461077844312] | 0.1082 | 0.3102 | 505 | 1628 | 0.7584 | 0.8461 |
| 0.2612 | 2.0 | 506 | 0.5017 | 0.0648 | [0.7962577962577962, 0.7287735849056604, 0.6784741144414169, 0.6225806451612903] | 0.0921 | 0.2955 | 481 | 1628 | 0.7542 | 0.8349 |
| 0.1638 | 3.0 | 759 | 0.4666 | 0.0615 | [0.7995824634655533, 0.6990521327014217, 0.6356164383561644, 0.5909090909090909] | 0.0908 | 0.2942 | 479 | 1628 | 0.7636 | 0.8392 |
| 0.059 | 4.0 | 1012 | 0.4496 | 0.0662 | [0.8162839248434238, 0.7440758293838863, 0.7013698630136986, 0.6623376623376623] | 0.0908 | 0.2942 | 479 | 1628 | 0.7615 | 0.8328 |

Framework versions

  • Transformers 4.40.0
  • PyTorch 2.1.0
  • Datasets 2.18.0
  • Tokenizers 0.19.1