mambarim-110m-chat

This model is a PEFT adapter fine-tuned from dominguesm/mambarim-110m on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5904

Model description

More information needed

Intended uses & limitations

More information needed
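
In the absence of further documentation, here is a minimal loading sketch. It assumes this repository is a PEFT adapter on top of dominguesm/mambarim-110m (consistent with the framework versions below) and that the adapter repo id is dominguesm/mambarim-110m-chat; verify both against the actual Hub pages.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and its tokenizer (assumed repo id).
base = AutoModelForCausalLM.from_pretrained("dominguesm/mambarim-110m")
tokenizer = AutoTokenizer.from_pretrained("dominguesm/mambarim-110m")

# Attach this chat adapter (assumed repo id; check the Hub page).
model = PeftModel.from_pretrained(base, "dominguesm/mambarim-110m-chat")
model.eval()

inputs = tokenizer("Olá, tudo bem?", return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```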

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 0.002
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
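
As a reproducibility aid, the settings above map onto transformers TrainingArguments roughly as follows. This is a sketch, not the authors' actual training script; output_dir is a placeholder and all unlisted arguments are left at their defaults.

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="mambarim-110m-chat",  # placeholder
    learning_rate=2e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```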

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 2.8055        | 0.0545 | 1000  | 2.7821          |
| 2.8298        | 0.1089 | 2000  | 2.7619          |
| 2.9104        | 0.1634 | 3000  | 2.7539          |
| 2.6692        | 0.2178 | 4000  | 2.7379          |
| 2.5876        | 0.2723 | 5000  | 2.7325          |
| 2.7439        | 0.3267 | 6000  | 2.7203          |
| 2.7787        | 0.3812 | 7000  | 2.7178          |
| 2.8461        | 0.4356 | 8000  | 2.7117          |
| 2.6929        | 0.4901 | 9000  | 2.7060          |
| 2.7229        | 0.5445 | 10000 | 2.7005          |
| 2.5014        | 0.5990 | 11000 | 2.6948          |
| 2.5046        | 0.6535 | 12000 | 2.6923          |
| 2.6258        | 0.7079 | 13000 | 2.6898          |
| 2.5822        | 0.7624 | 14000 | 2.6847          |
| 2.6399        | 0.8168 | 15000 | 2.6847          |
| 2.5342        | 0.8713 | 16000 | 2.6768          |
| 2.6878        | 0.9257 | 17000 | 2.6726          |
| 2.8872        | 0.9802 | 18000 | 2.6729          |
| 2.6565        | 1.0346 | 19000 | 2.6693          |
| 2.4293        | 1.0891 | 20000 | 2.6672          |
| 2.8411        | 1.1435 | 21000 | 2.6620          |
| 2.7126        | 1.1980 | 22000 | 2.6618          |
| 2.5516        | 1.2525 | 23000 | 2.6609          |
| 2.6093        | 1.3069 | 24000 | 2.6557          |
| 2.6489        | 1.3614 | 25000 | 2.6554          |
| 2.6014        | 1.4158 | 26000 | 2.6522          |
| 2.6185        | 1.4703 | 27000 | 2.6477          |
| 2.6896        | 1.5247 | 28000 | 2.6468          |
| 2.6222        | 1.5792 | 29000 | 2.6433          |
| 2.6227        | 1.6336 | 30000 | 2.6415          |
| 2.5772        | 1.6881 | 31000 | 2.6377          |
| 2.4859        | 1.7425 | 32000 | 2.6356          |
| 2.3725        | 1.7970 | 33000 | 2.6327          |
| 2.5452        | 1.8514 | 34000 | 2.6308          |
| 2.6545        | 1.9059 | 35000 | 2.6281          |
| 2.6109        | 1.9604 | 36000 | 2.6265          |
| 2.5004        | 2.0148 | 37000 | 2.6237          |
| 2.4471        | 2.0693 | 38000 | 2.6236          |
| 2.5242        | 2.1237 | 39000 | 2.6211          |
| 2.6242        | 2.1782 | 40000 | 2.6175          |
| 2.561         | 2.2326 | 41000 | 2.6168          |
| 2.5065        | 2.2871 | 42000 | 2.6149          |
| 2.6165        | 2.3415 | 43000 | 2.6122          |
| 2.4452        | 2.3960 | 44000 | 2.6098          |
| 2.6277        | 2.4504 | 45000 | 2.6075          |
| 2.5547        | 2.5049 | 46000 | 2.6062          |
| 2.5153        | 2.5594 | 47000 | 2.6028          |
| 2.6322        | 2.6138 | 48000 | 2.6020          |
| 2.5263        | 2.6683 | 49000 | 2.5995          |
| 2.7165        | 2.7227 | 50000 | 2.5974          |
| 2.6576        | 2.7772 | 51000 | 2.5956          |
| 2.5471        | 2.8316 | 52000 | 2.5940          |
| 2.7174        | 2.8861 | 53000 | 2.5923          |
| 2.5018        | 2.9405 | 54000 | 2.5910          |
| 2.6201        | 2.9950 | 55000 | 2.5904          |
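
For a rough sense of scale, assuming the reported validation loss is mean per-token cross-entropy in nats, the final value corresponds to a perplexity of about 13.3:

```python
import math

# Perplexity = exp(mean cross-entropy); 2.5904 is the final validation loss.
print(round(math.exp(2.5904), 1))  # -> 13.3
```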

Framework versions

  • PEFT 0.10.0
  • Transformers 4.40.1
  • PyTorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1
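
To confirm a local environment matches the versions above, a quick check (assuming the standard PyPI distributions of each package):

```python
import datasets, peft, tokenizers, torch, transformers

# Compare against the versions listed in this card.
for mod in (peft, transformers, torch, datasets, tokenizers):
    print(mod.__name__, mod.__version__)
```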