mistral-Bengali_NER

This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3281
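
If this is the standard token-level cross-entropy loss, it corresponds to an evaluation perplexity of exp(0.3281) ≈ 1.39 (an inference from the value above, not a separately reported metric).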

Model description

More information needed

Intended uses & limitations

More information needed
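
Since this repository ships as a PEFT adapter on top of mistralai/Mistral-7B-v0.1 (see Framework versions below), a minimal inference sketch follows. It is an assumption-laden example, not the author's documented usage: the prompt format is a placeholder, and the half-precision and device settings are ordinary defaults.

```python
# Minimal sketch: load the base model, then apply this PEFT adapter on top.
# Assumptions: fp16 weights and device_map="auto" (requires `accelerate`);
# the prompt format is a placeholder, since the training data is undocumented.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "Debk/mistral-Bengali_NER")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

prompt = "..."  # a Bengali sentence plus whatever NER instruction format was used in training
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```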

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2.5e-05
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: paged_adamw_8bit (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 5
  • training_steps: 1000
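
For reference, the hyperparameters above map onto a transformers.TrainingArguments configuration as sketched below. Only the listed values come from this card; output_dir and every unlisted setting are assumptions, and the paged 8-bit optimizer requires bitsandbytes.

```python
# Reconstruction sketch of the listed hyperparameters as TrainingArguments.
# Only values listed in this card are reproduced; everything else is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="your-project",        # assumption: the card's original placeholder name
    learning_rate=2.5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,    # 2 per device x 4 steps = total train batch size 8
    optim="paged_adamw_8bit",         # betas=(0.9, 0.999) and epsilon=1e-08 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=5,
    max_steps=1000,
)
```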

Training results

Training Loss   Epoch    Step   Validation Loss
0.814           0.0063     50   0.4102
0.3875          0.0127    100   0.3834
0.3645          0.0190    150   0.3681
0.369           0.0253    200   0.3606
0.3434          0.0317    250   0.3550
0.3567          0.0380    300   0.3509
0.3506          0.0443    350   0.3484
0.344           0.0507    400   0.3442
0.339           0.0570    450   0.3426
0.3437          0.0633    500   0.3398
0.3498          0.0697    550   0.3379
0.3319          0.0760    600   0.3358
0.3338          0.0823    650   0.3343
0.3301          0.0887    700   0.3333
0.3323          0.0950    750   0.3317
0.3289          0.1013    800   0.3306
0.3245          0.1077    850   0.3296
0.3189          0.1140    900   0.3290
0.32            0.1203    950   0.3283
0.3254          0.1267   1000   0.3281

Framework versions

  • PEFT 0.13.3.dev0
  • Transformers 4.46.3
  • PyTorch 2.4.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3