donut-base-vishnu

This model is a fine-tuned version of naver-clova-ix/donut-base on a dataset loaded with the imagefolder builder. At the last logged epoch (41, step 5207), it achieves the following results on the evaluation set:

  • Loss: 0.3250

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
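
As a rough illustration of what the linear scheduler setting implies, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and 127 optimizer steps per epoch, as read from the Step column of the training results; `linear_lr` and the constant names are hypothetical and are not taken from the training script.

```python
# Values from this card; WARMUP_STEPS is an assumption (the card lists no warmup).
LEARNING_RATE = 2e-5
STEPS_PER_EPOCH = 127  # Step column of the training results: 127 steps per epoch
NUM_EPOCHS = 50        # configured num_epochs
WARMUP_STEPS = 0       # assumption

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print the scheduled rate at the start, at the last logged step, and at the end.
    for step in (0, 5207, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the schedule would reach zero only at step 6350, so the rate at the last logged step (5207) would still be a nonzero fraction of the initial value.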

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.9387 | 1.0 | 127 | 2.0237 |
| 0.8381 | 2.0 | 254 | 1.3332 |
| 1.4923 | 3.0 | 381 | 1.1110 |
| 0.9061 | 4.0 | 508 | 0.9530 |
| 0.4627 | 5.0 | 635 | 0.9156 |
| 0.4305 | 6.0 | 762 | 0.7884 |
| 0.4383 | 7.0 | 889 | 0.6936 |
| 0.1852 | 8.0 | 1016 | 0.6715 |
| 0.2348 | 9.0 | 1143 | 0.6209 |
| 0.3975 | 10.0 | 1270 | 0.5614 |
| 0.1548 | 11.0 | 1397 | 0.5152 |
| 0.0377 | 12.0 | 1524 | 0.5135 |
| 0.0430 | 13.0 | 1651 | 0.4759 |
| 0.0698 | 14.0 | 1778 | 0.4697 |
| 0.0292 | 15.0 | 1905 | 0.4243 |
| 0.0516 | 16.0 | 2032 | 0.4594 |
| 0.2062 | 17.0 | 2159 | 0.4332 |
| 0.0307 | 18.0 | 2286 | 0.4030 |
| 0.0775 | 19.0 | 2413 | 0.4069 |
| 0.0157 | 20.0 | 2540 | 0.4111 |
| 0.0137 | 21.0 | 2667 | 0.4072 |
| 0.0148 | 22.0 | 2794 | 0.3938 |
| 0.0454 | 23.0 | 2921 | 0.3789 |
| 0.0023 | 24.0 | 3048 | 0.3864 |
| 0.0033 | 25.0 | 3175 | 0.3750 |
| 0.0292 | 26.0 | 3302 | 0.3847 |
| 0.0087 | 27.0 | 3429 | 0.3592 |
| 0.0032 | 28.0 | 3556 | 0.3665 |
| 0.0048 | 29.0 | 3683 | 0.3372 |
| 0.0035 | 30.0 | 3810 | 0.3349 |
| 0.0197 | 31.0 | 3937 | 0.3591 |
| 0.0006 | 32.0 | 4064 | 0.3504 |
| 0.0016 | 33.0 | 4191 | 0.3450 |
| 0.0006 | 34.0 | 4318 | 0.3505 |
| 0.0046 | 35.0 | 4445 | 0.3332 |
| 0.0045 | 36.0 | 4572 | 0.3206 |
| 0.0006 | 37.0 | 4699 | 0.3361 |
| 0.0039 | 38.0 | 4826 | 0.3348 |
| 0.0059 | 39.0 | 4953 | 0.3328 |
| 0.0039 | 40.0 | 5080 | 0.3406 |
| 0.0014 | 41.0 | 5207 | 0.3250 |
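
To read the validation curve at a glance, the sketch below locates the epoch with the lowest validation loss and compares it with the final logged epoch. The loss values are copied from the results above, one per epoch; the variable names are illustrative only.

```python
# Validation loss per epoch (epochs 1..41), copied from the training results.
val_loss = [
    2.0237, 1.3332, 1.1110, 0.9530, 0.9156, 0.7884, 0.6936, 0.6715, 0.6209,
    0.5614, 0.5152, 0.5135, 0.4759, 0.4697, 0.4243, 0.4594, 0.4332, 0.4030,
    0.4069, 0.4111, 0.4072, 0.3938, 0.3789, 0.3864, 0.3750, 0.3847, 0.3592,
    0.3665, 0.3372, 0.3349, 0.3591, 0.3504, 0.3450, 0.3505, 0.3332, 0.3206,
    0.3361, 0.3348, 0.3328, 0.3406, 0.3250,
]

# Epochs are 1-indexed, so shift the list index by one.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_loss = val_loss[best_epoch - 1]
final_epoch = len(val_loss)
final_loss = val_loss[-1]

if __name__ == "__main__":
    print(f"best:  epoch {best_epoch}, validation loss {best_loss:.4f}")
    print(f"final: epoch {final_epoch}, validation loss {final_loss:.4f}")
```

The reported evaluation loss (0.3250) corresponds to the final logged epoch, while the lowest logged value occurs a few epochs earlier; whether the published weights come from the final or the best checkpoint is not stated in this card.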

Framework versions

  • Transformers 4.35.0
  • PyTorch 2.1.0+cu118
  • Datasets 2.14.6
  • Tokenizers 0.14.1

Model size

  • 202M params (Safetensors; tensor types: I64, F32)