# dhivehi-nougat-base-text-sen
This model is a fine-tuned version of facebook/nougat-base on the dhivehi-img-txtsen dataset. It achieves the following results on the evaluation set:
- Loss: 0.0796
## Model description

facebook/nougat-base fine-tuned for Dhivehi text recognition on the dhivehi-img-txtsen dataset.
## Usage
```python
from PIL import Image
import torch
from transformers import NougatProcessor, VisionEncoderDecoderModel

# Load the model and processor
processor = NougatProcessor.from_pretrained("alakxender/dhivehi-nougat-base-text-sen")
model = VisionEncoderDecoderModel.from_pretrained(
    "alakxender/dhivehi-nougat-base-text-sen",
    torch_dtype=torch.bfloat16,  # Optional: load in BF16 for faster inference and lower memory usage
    attn_implementation={  # Optional: attention kernel implementations per sub-model
        "decoder": "flash_attention_2",  # FlashAttention-2 for the decoder for improved performance
        "encoder": "eager",  # default ("eager") attention implementation for the encoder
    },
)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

context_length = 128

def predict(img_path):
    # Ensure the image is in RGB format
    image = Image.open(img_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values.to(torch.bfloat16)

    # Generate the prediction
    outputs = model.generate(
        pixel_values.to(device),
        min_length=1,
        max_new_tokens=context_length,
        repetition_penalty=1.5,
        bad_words_ids=[[processor.tokenizer.unk_token_id]],
        eos_token_id=processor.tokenizer.eos_token_id,
    )

    page_sequence = processor.batch_decode(outputs, skip_special_tokens=True)[0]
    return page_sequence

print(predict("DV01-04_31.jpg"))
```
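The `flash_attention_2` option requires the `flash-attn` package and a supported CUDA GPU. If either is unavailable, a minimal fallback is to load the model without the optional arguments, as sketched below (and to skip the BF16 cast on `pixel_values`):

```python
# Fallback: default attention implementation in full precision, e.g. for CPU-only setups.
# When loading this way, drop the .to(torch.bfloat16) cast inside predict().
model = VisionEncoderDecoderModel.from_pretrained("alakxender/dhivehi-nougat-base-text-sen")
```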
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the `Seq2SeqTrainingArguments` sketch after the list):
- learning_rate: 0.0001
- train_batch_size: 3
- eval_batch_size: 3
- seed: 42
- gradient_accumulation_steps: 6
- total_train_batch_size: 18
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 100
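For reference, these values map onto `Seq2SeqTrainingArguments` roughly as follows. This is an illustrative reconstruction, not the exact training script: `output_dir`, `bf16`, and the evaluation/logging cadence are assumptions (the cadence is inferred from the 100-step evaluation interval in the results table below).

```python
from transformers import Seq2SeqTrainingArguments

# Hedged sketch: only the values listed above come from this card;
# everything marked "assumed" is illustrative.
training_args = Seq2SeqTrainingArguments(
    output_dir="dhivehi-nougat-base-text-sen",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=3,
    gradient_accumulation_steps=6,  # effective train batch size: 3 * 6 = 18
    seed=42,
    optim="adamw_torch",            # AdamW with default betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=100,
    eval_strategy="steps",          # assumed: matches the 100-step eval interval
    eval_steps=100,
    logging_steps=100,              # assumed
    bf16=True,                      # assumed: consistent with the BF16 inference example
)
```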
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:---:|:---:|:---:|:---:|
5.0226 | 0.0383 | 100 | 0.7908 |
4.0593 | 0.0766 | 200 | 0.6369 |
3.6743 | 0.1149 | 300 | 0.5734 |
3.4239 | 0.1532 | 400 | 0.5411 |
3.3072 | 0.1915 | 500 | 0.5175 |
3.1666 | 0.2298 | 600 | 0.5048 |
3.0814 | 0.2681 | 700 | 0.4925 |
3.0572 | 0.3064 | 800 | 0.4824 |
2.9389 | 0.3447 | 900 | 0.4746 |
2.9756 | 0.3830 | 1000 | 0.4683 |
2.8457 | 0.4213 | 1100 | 0.4614 |
2.8612 | 0.4597 | 1200 | 0.4561 |
2.9689 | 0.4980 | 1300 | 0.4500 |
2.8069 | 0.5363 | 1400 | 0.4457 |
2.7381 | 0.5746 | 1500 | 0.4413 |
2.7011 | 0.6129 | 1600 | 0.4388 |
2.6893 | 0.6512 | 1700 | 0.4354 |
2.7628 | 0.6895 | 1800 | 0.4320 |
2.6868 | 0.7278 | 1900 | 0.4291 |
2.7244 | 0.7661 | 2000 | 0.4261 |
2.7016 | 0.8044 | 2100 | 0.4257 |
2.6166 | 0.8427 | 2200 | 0.4206 |
2.647 | 0.8810 | 2300 | 0.4187 |
2.687 | 0.9193 | 2400 | 0.4150 |
2.6376 | 0.9576 | 2500 | 0.4144 |
2.5493 | 0.9959 | 2600 | 0.4118 |
2.5871 | 1.0341 | 2700 | 0.4103 |
2.589 | 1.0724 | 2800 | 0.4089 |
2.6471 | 1.1107 | 2900 | 0.4061 |
2.5845 | 1.1490 | 3000 | 0.4055 |
2.5417 | 1.1873 | 3100 | 0.4050 |
2.4787 | 1.2256 | 3200 | 0.4032 |
2.4835 | 1.2639 | 3300 | 0.4002 |
2.4791 | 1.3022 | 3400 | 0.3997 |
2.4897 | 1.3405 | 3500 | 0.3970 |
2.5129 | 1.3788 | 3600 | 0.3967 |
2.5013 | 1.4171 | 3700 | 0.3950 |
2.4323 | 1.4554 | 3800 | 0.3943 |
2.5074 | 1.4937 | 3900 | 0.3929 |
2.4401 | 1.5320 | 4000 | 0.3926 |
2.4195 | 1.5704 | 4100 | 0.3913 |
2.4749 | 1.6087 | 4200 | 0.3898 |
2.4423 | 1.6470 | 4300 | 0.3894 |
2.5008 | 1.6853 | 4400 | 0.3882 |
2.4293 | 1.7236 | 4500 | 0.3866 |
2.3966 | 1.7619 | 4600 | 0.3870 |
2.3954 | 1.8002 | 4700 | 0.3850 |
2.4398 | 1.8385 | 4800 | 0.3839 |
2.4465 | 1.8768 | 4900 | 0.3833 |
2.4152 | 1.9151 | 5000 | 0.3823 |
2.4633 | 1.9534 | 5100 | 0.3815 |
2.3733 | 1.9917 | 5200 | 0.3814 |
2.4842 | 2.0299 | 5300 | 0.3794 |
2.3732 | 2.0682 | 5400 | 0.3797 |
2.3409 | 2.1065 | 5500 | 0.3789 |
2.3788 | 2.1448 | 5600 | 0.3771 |
2.4165 | 2.1831 | 5700 | 0.3757 |
2.3168 | 2.2214 | 5800 | 0.3749 |
2.3661 | 2.2597 | 5900 | 0.3742 |
2.3646 | 2.2980 | 6000 | 0.3731 |
2.3661 | 2.3363 | 6100 | 0.3730 |
2.3396 | 2.3746 | 6200 | 0.3730 |
2.2718 | 2.4129 | 6300 | 0.3712 |
2.3257 | 2.4512 | 6400 | 0.3703 |
2.2976 | 2.4895 | 6500 | 0.3692 |
2.2838 | 2.5278 | 6600 | 0.3679 |
2.273 | 2.5661 | 6700 | 0.3673 |
2.3019 | 2.6044 | 6800 | 0.3663 |
2.2569 | 2.6427 | 6900 | 0.3657 |
2.2991 | 2.6811 | 7000 | 0.3647 |
2.268 | 2.7194 | 7100 | 0.3642 |
2.2132 | 2.7577 | 7200 | 0.3630 |
2.3134 | 2.7960 | 7300 | 0.3613 |
2.2995 | 2.8343 | 7400 | 0.3598 |
2.289 | 2.8726 | 7500 | 0.3598 |
2.2509 | 2.9109 | 7600 | 0.3579 |
2.2367 | 2.9492 | 7700 | 0.3567 |
2.2016 | 2.9875 | 7800 | 0.3544 |
2.2573 | 3.0257 | 7900 | 0.3527 |
2.2029 | 3.0640 | 8000 | 0.3512 |
2.2087 | 3.1023 | 8100 | 0.3500 |
2.1385 | 3.1406 | 8200 | 0.3416 |
2.1084 | 3.1789 | 8300 | 0.3346 |
2.0978 | 3.2172 | 8400 | 0.3258 |
2.0254 | 3.2555 | 8500 | 0.3159 |
1.9649 | 3.2938 | 8600 | 0.3021 |
1.8909 | 3.3321 | 8700 | 0.2877 |
1.8284 | 3.3704 | 8800 | 0.2721 |
1.7419 | 3.4087 | 8900 | 0.2612 |
1.6687 | 3.4470 | 9000 | 0.2510 |
1.6713 | 3.4853 | 9100 | 0.2406 |
1.5075 | 3.5236 | 9200 | 0.2314 |
1.558 | 3.5619 | 9300 | 0.2251 |
1.5508 | 3.6002 | 9400 | 0.2155 |
1.4222 | 3.6385 | 9500 | 0.2093 |
1.4103 | 3.6768 | 9600 | 0.2016 |
1.2759 | 3.7151 | 9700 | 0.1936 |
1.3577 | 3.7534 | 9800 | 0.1888 |
1.2245 | 3.7918 | 9900 | 0.1833 |
1.3226 | 3.8301 | 10000 | 0.1776 |
1.2007 | 3.8684 | 10100 | 0.1743 |
1.1289 | 3.9067 | 10200 | 0.1693 |
1.1646 | 3.9450 | 10300 | 0.1659 |
1.1498 | 3.9833 | 10400 | 0.1619 |
1.1152 | 4.0215 | 10500 | 0.1588 |
1.0254 | 4.0598 | 10600 | 0.1558 |
1.0719 | 4.0981 | 10700 | 0.1527 |
1.103 | 4.1364 | 10800 | 0.1502 |
1.1307 | 4.1747 | 10900 | 0.1474 |
1.0523 | 4.2130 | 11000 | 0.1445 |
0.9377 | 4.2513 | 11100 | 0.1427 |
1.0505 | 4.2896 | 11200 | 0.1399 |
0.9646 | 4.3279 | 11300 | 0.1382 |
0.9571 | 4.3662 | 11400 | 0.1366 |
0.9693 | 4.4045 | 11500 | 0.1343 |
0.9362 | 4.4428 | 11600 | 0.1325 |
0.9162 | 4.4811 | 11700 | 0.1319 |
0.9699 | 4.5194 | 11800 | 0.1299 |
0.9275 | 4.5577 | 11900 | 0.1291 |
0.8864 | 4.5960 | 12000 | 0.1271 |
0.9603 | 4.6343 | 12100 | 0.1263 |
0.9842 | 4.6726 | 12200 | 0.1244 |
0.8629 | 4.7109 | 12300 | 0.1231 |
0.9338 | 4.7492 | 12400 | 0.1234 |
0.8358 | 4.7875 | 12500 | 0.1210 |
0.7986 | 4.8258 | 12600 | 0.1196 |
0.8606 | 4.8641 | 12700 | 0.1188 |
0.801 | 4.9025 | 12800 | 0.1180 |
0.8723 | 4.9408 | 12900 | 0.1166 |
0.8224 | 4.9791 | 13000 | 0.1167 |
0.7655 | 5.0172 | 13100 | 0.1144 |
0.89 | 5.0555 | 13200 | 0.1139 |
0.7515 | 5.0938 | 13300 | 0.1131 |
0.8617 | 5.1322 | 13400 | 0.1129 |
0.8763 | 5.1705 | 13500 | 0.1119 |
0.8394 | 5.2088 | 13600 | 0.1104 |
0.8494 | 5.2471 | 13700 | 0.1097 |
0.7357 | 5.2854 | 13800 | 0.1090 |
0.78 | 5.3237 | 13900 | 0.1080 |
0.7955 | 5.3620 | 14000 | 0.1080 |
0.8194 | 5.4003 | 14100 | 0.1070 |
0.8297 | 5.4386 | 14200 | 0.1069 |
0.697 | 5.4769 | 14300 | 0.1057 |
0.8037 | 5.5152 | 14400 | 0.1051 |
0.7782 | 5.5535 | 14500 | 0.1047 |
0.7672 | 5.5918 | 14600 | 0.1037 |
0.7789 | 5.6301 | 14700 | 0.1031 |
0.7292 | 5.6684 | 14800 | 0.1035 |
0.8318 | 5.7067 | 14900 | 0.1019 |
0.6917 | 5.7450 | 15000 | 0.1016 |
0.7711 | 5.7833 | 15100 | 0.1009 |
0.718 | 5.8216 | 15200 | 0.1003 |
0.8245 | 5.8599 | 15300 | 0.1010 |
0.7005 | 5.8982 | 15400 | 0.0995 |
0.7685 | 5.9365 | 15500 | 0.0991 |
0.6955 | 5.9748 | 15600 | 0.0988 |
0.6962 | 6.0130 | 15700 | 0.0981 |
0.6917 | 6.0513 | 15800 | 0.0974 |
0.8487 | 6.0896 | 15900 | 0.0972 |
0.6653 | 6.1279 | 16000 | 0.0970 |
0.7476 | 6.1662 | 16100 | 0.0966 |
0.682 | 6.2045 | 16200 | 0.0960 |
0.6858 | 6.2428 | 16300 | 0.0958 |
0.696 | 6.2812 | 16400 | 0.0948 |
0.7115 | 6.3195 | 16500 | 0.0949 |
0.7388 | 6.3578 | 16600 | 0.0942 |
0.6637 | 6.3961 | 16700 | 0.0937 |
0.7032 | 6.4344 | 16800 | 0.0934 |
0.6581 | 6.4727 | 16900 | 0.0931 |
0.6609 | 6.5110 | 17000 | 0.0930 |
0.6724 | 6.5493 | 17100 | 0.0921 |
0.629 | 6.5876 | 17200 | 0.0915 |
0.682 | 6.6259 | 17300 | 0.0914 |
0.7201 | 6.6642 | 17400 | 0.0914 |
0.5541 | 6.7025 | 17500 | 0.0914 |
0.6999 | 6.7408 | 17600 | 0.0903 |
0.6552 | 6.7791 | 17700 | 0.0906 |
0.6613 | 6.8174 | 17800 | 0.0897 |
0.7954 | 6.8557 | 17900 | 0.0894 |
0.6358 | 6.8940 | 18000 | 0.0890 |
0.665 | 6.9323 | 18100 | 0.0890 |
0.6274 | 6.9706 | 18200 | 0.0884 |
0.6558 | 7.0088 | 18300 | 0.0880 |
0.6541 | 7.0471 | 18400 | 0.0883 |
0.6568 | 7.0854 | 18500 | 0.0877 |
0.6677 | 7.1237 | 18600 | 0.0873 |
0.7305 | 7.1620 | 18700 | 0.0871 |
0.6118 | 7.2003 | 18800 | 0.0872 |
0.5958 | 7.2386 | 18900 | 0.0865 |
0.6912 | 7.2769 | 19000 | 0.0862 |
0.5643 | 7.3152 | 19100 | 0.0859 |
0.6254 | 7.3535 | 19200 | 0.0856 |
0.6773 | 7.3919 | 19300 | 0.0854 |
0.7044 | 7.4302 | 19400 | 0.0848 |
0.5636 | 7.4685 | 19500 | 0.0847 |
0.5932 | 7.5068 | 19600 | 0.0848 |
0.566 | 7.5451 | 19700 | 0.0846 |
0.6553 | 7.5834 | 19800 | 0.0843 |
0.5729 | 7.6217 | 19900 | 0.0841 |
0.6147 | 7.6600 | 20000 | 0.0836 |
0.6125 | 7.6983 | 20100 | 0.0831 |
0.5793 | 7.7366 | 20200 | 0.0832 |
0.6042 | 7.7749 | 20300 | 0.0832 |
0.604 | 7.8132 | 20400 | 0.0827 |
0.5963 | 7.8515 | 20500 | 0.0826 |
0.5757 | 7.8898 | 20600 | 0.0826 |
0.6194 | 7.9281 | 20700 | 0.0821 |
0.5528 | 7.9664 | 20800 | 0.0817 |
0.7031 | 8.0046 | 20900 | 0.0817 |
0.5997 | 8.0429 | 21000 | 0.0816 |
0.5876 | 8.0812 | 21100 | 0.0814 |
0.5757 | 8.1195 | 21200 | 0.0811 |
0.6033 | 8.1578 | 21300 | 0.0814 |
0.5738 | 8.1961 | 21400 | 0.0807 |
0.6308 | 8.2344 | 21500 | 0.0807 |
0.5583 | 8.2727 | 21600 | 0.0809 |
0.6401 | 8.3110 | 21700 | 0.0804 |
0.5611 | 8.3493 | 21800 | 0.0803 |
0.5526 | 8.3876 | 21900 | 0.0799 |
0.5877 | 8.4259 | 22000 | 0.0796 |
0.6311 | 8.4642 | 22100 | 0.0793 |
0.556 | 8.5026 | 22200 | 0.0799 |
0.5976 | 8.5409 | 22300 | 0.0794 |
0.5851 | 8.5792 | 22400 | 0.0796 |
### Framework versions
- Transformers 4.47.0
- Pytorch 2.6.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0