_base_nougat_jawi

This model is a fine-tuned version of facebook/nougat-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4371
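
The card gives no usage snippet, so the following is only a minimal sketch of how a Nougat-style checkpoint is typically loaded with the Transformers library. It assumes that the repository `bustamiyusoef/_base_nougat_jawi` ships its own processor and tokenizer files (if it does not, the processor of `facebook/nougat-base` would be the natural fallback), and the function name `transcribe_page`, the `max_new_tokens` default, and the input file name are hypothetical.

```python
MODEL_ID = "bustamiyusoef/_base_nougat_jawi"  # repository named on this page


def transcribe_page(image_path, max_new_tokens=512):
    """Return the raw decoded text for one page image (illustrative sketch)."""
    # Heavy dependencies are imported lazily so that importing this file stays cheap.
    from PIL import Image
    from transformers import NougatProcessor, VisionEncoderDecoderModel

    # Assumption: the fine-tuned repository includes processor/tokenizer files.
    processor = NougatProcessor.from_pretrained(MODEL_ID)
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
    model.eval()

    # Nougat expects an RGB page image; the processor resizes and normalizes it.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Autoregressive decoding; the unknown token is banned from the output.
    generated = model.generate(
        pixel_values,
        max_new_tokens=max_new_tokens,
        bad_words_ids=[[processor.tokenizer.unk_token_id]],
    )
    return processor.batch_decode(generated, skip_special_tokens=True)[0]


# Example call (hypothetical file name):
# text = transcribe_page("page.png")
```

Since the training data is undocumented, output quality on any particular kind of page should be checked before relying on it.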

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 6
  • total_train_batch_size: 48
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 30
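
The arithmetic behind these settings can be sketched as follows. The snippet recomputes the total train batch size from the per-device batch size and the gradient accumulation steps, and evaluates a linear learning-rate schedule at a given optimizer step. It assumes a single device, zero warmup steps (none are listed in the card), and 84 optimizer steps per epoch as read off the training results; the helper names are illustrative.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 6
NUM_EPOCHS = 30

# Read off the training results: step 84 is reached at epoch 1.0.
STEPS_PER_EPOCH = 84


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr, total_steps, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

print(total_batch)  # 48, matching total_train_batch_size
print(total_steps)  # 2520 planned optimizer steps for 30 epochs
print(linear_lr(2352, LEARNING_RATE, total_steps))  # rate at the last logged step
```

Under these assumptions the learning rate at step 2352 is already below a tenth of its initial value.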

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 12.3256       | 1.0   | 84   | 1.9774          |
| 11.0152       | 2.0   | 168  | 1.7704          |
| 10.0413       | 3.0   | 252  | 1.6441          |
| 9.5176        | 4.0   | 336  | 1.5924          |
| 8.4833        | 5.0   | 420  | 1.5436          |
| 8.6871        | 6.0   | 504  | 1.4766          |
| 8.2367        | 7.0   | 588  | 1.4353          |
| 7.5666        | 8.0   | 672  | 1.3372          |
| 6.3156        | 9.0   | 756  | 1.0868          |
| 4.7622        | 10.0  | 840  | 0.8802          |
| 4.2315        | 11.0  | 924  | 0.7787          |
| 3.6727        | 12.0  | 1008 | 0.6649          |
| 2.944         | 13.0  | 1092 | 0.5857          |
| 2.9873        | 14.0  | 1176 | 0.5698          |
| 2.326         | 15.0  | 1260 | 0.5270          |
| 2.0722        | 16.0  | 1344 | 0.5679          |
| 2.1138        | 17.0  | 1428 | 0.4879          |
| 2.1412        | 18.0  | 1512 | 0.4786          |
| 1.9           | 19.0  | 1596 | 0.4855          |
| 1.8338        | 20.0  | 1680 | 0.4556          |
| 1.8543        | 21.0  | 1764 | 0.4475          |
| 1.5969        | 22.0  | 1848 | 0.4443          |
| 1.7896        | 23.0  | 1932 | 0.4375          |
| 1.5593        | 24.0  | 2016 | 0.4359          |
| 1.5125        | 25.0  | 2100 | 0.4303          |
| 1.672         | 26.0  | 2184 | 0.4348          |
| 1.6588        | 27.0  | 2268 | 0.4370          |
| 1.6671        | 28.0  | 2352 | 0.4371          |
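
As a reading aid for these results, the sketch below locates the epoch with the lowest validation loss and counts how many logged epochs followed it without improvement. The validation losses are copied from the results above; the helper names are illustrative. The card does not say whether early stopping or best-checkpoint selection was used, so this only describes the logged numbers.

```python
# Validation loss per epoch, copied from the training results.
VAL_LOSS_BY_EPOCH = {
    1: 1.9774, 2: 1.7704, 3: 1.6441, 4: 1.5924, 5: 1.5436, 6: 1.4766,
    7: 1.4353, 8: 1.3372, 9: 1.0868, 10: 0.8802, 11: 0.7787, 12: 0.6649,
    13: 0.5857, 14: 0.5698, 15: 0.5270, 16: 0.5679, 17: 0.4879, 18: 0.4786,
    19: 0.4855, 20: 0.4556, 21: 0.4475, 22: 0.4443, 23: 0.4375, 24: 0.4359,
    25: 0.4303, 26: 0.4348, 27: 0.4370, 28: 0.4371,
}


def best_epoch(history):
    """Epoch with the lowest validation loss."""
    return min(history, key=history.get)


def epochs_since_best(history):
    """Number of logged epochs after the best one."""
    return max(history) - best_epoch(history)


best = best_epoch(VAL_LOSS_BY_EPOCH)
print(best, VAL_LOSS_BY_EPOCH[best])         # 25 0.4303
print(epochs_since_best(VAL_LOSS_BY_EPOCH))  # 3
print(VAL_LOSS_BY_EPOCH[max(VAL_LOSS_BY_EPOCH)])  # 0.4371, the loss reported at the top
```

The reported evaluation loss of 0.4371 corresponds to the last logged epoch (28), not to the lowest validation loss in the log, and the log ends before the configured 30 epochs.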

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size

  • Parameters: 349M
  • Format: Safetensors
  • Tensor types: I64, BF16

Model tree for bustamiyusoef/_base_nougat_jawi

  • Base model: facebook/nougat-base
  • This model: fine-tuned from the base model