synthAIze_telugu_colloquial_trans-20250220-171557

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit; the training dataset is not recorded in the card metadata. It achieves the following results on the evaluation set:

  • Loss: 9.4826

Model description

More information needed

Intended uses & limitations

More information needed
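Since usage is undocumented, and given that the framework versions below list PEFT and the base model is unsloth/tinyllama-chat-bnb-4bit, this repository presumably hosts a PEFT adapter. The following is a minimal inference sketch under that assumption; the example prompt is purely illustrative and not a documented prompt format for this model.

```python
# Minimal inference sketch. Assumptions: this repo hosts a PEFT adapter on top
# of unsloth/tinyllama-chat-bnb-4bit, and the prompt below is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/tinyllama-chat-bnb-4bit"
adapter_id = "sril32996/synthAIze_telugu_colloquial_trans-20250220-171557"

tokenizer = AutoTokenizer.from_pretrained(base_id)
# Loading the 4-bit base model requires bitsandbytes; device_map needs accelerate.
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned adapter weights

prompt = "Translate to colloquial Telugu: How are you?"  # hypothetical example input
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```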

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 7
  • mixed_precision_training: Native AMP
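For reference, the list above maps onto transformers.TrainingArguments roughly as sketched below. The output directory and the fp16 flag (standing in for "Native AMP") are assumptions not recorded in the card; the betas and epsilon shown match the AdamW defaults in Transformers.

```python
# Sketch of the TrainingArguments implied by the hyperparameter list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="synthAIze_telugu_colloquial_trans",  # assumed placeholder, not in the card
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",         # AdamW with betas=(0.9, 0.999), epsilon=1e-08 (defaults)
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=7,
    fp16=True,                   # "Native AMP" mixed precision (fp16 assumed over bf16)
)
```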

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 6.7932        | 0.1524 | 50   | 9.5016          |
| 4.2355        | 0.3049 | 100  | 9.4934          |
| 3.9892        | 0.4573 | 150  | 9.4652          |
| 3.9026        | 0.6098 | 200  | 9.4494          |
| 3.8959        | 0.7622 | 250  | 9.4487          |
| 3.8923        | 0.9146 | 300  | 9.4453          |
| 3.8547        | 1.0671 | 350  | 9.4375          |
| 3.8484        | 1.2195 | 400  | 9.4447          |
| 3.8292        | 1.3720 | 450  | 9.4463          |
| 3.8149        | 1.5244 | 500  | 9.4531          |
| 3.813         | 1.6768 | 550  | 9.4573          |
| 3.8323        | 1.8293 | 600  | 9.4606          |
| 3.7694        | 1.9817 | 650  | 9.4535          |
| 3.7559        | 2.1341 | 700  | 9.4536          |
| 3.7347        | 2.2866 | 750  | 9.4551          |
| 3.7644        | 2.4390 | 800  | 9.4583          |
| 3.7183        | 2.5915 | 850  | 9.4598          |
| 3.7547        | 2.7439 | 900  | 9.4625          |
| 3.7757        | 2.8963 | 950  | 9.4610          |
| 3.6877        | 3.0488 | 1000 | 9.4642          |
| 3.682         | 3.2012 | 1050 | 9.4665          |
| 3.6884        | 3.3537 | 1100 | 9.4719          |
| 3.7081        | 3.5061 | 1150 | 9.4755          |
| 3.7037        | 3.6585 | 1200 | 9.4676          |
| 3.711         | 3.8110 | 1250 | 9.4710          |
| 3.6867        | 3.9634 | 1300 | 9.4724          |
| 3.608         | 4.1159 | 1350 | 9.4704          |
| 3.6853        | 4.2683 | 1400 | 9.4716          |
| 3.6458        | 4.4207 | 1450 | 9.4745          |
| 3.6421        | 4.5732 | 1500 | 9.4811          |
| 3.6456        | 4.7256 | 1550 | 9.4736          |
| 3.6407        | 4.8780 | 1600 | 9.4789          |
| 3.667         | 5.0305 | 1650 | 9.4769          |
| 3.5912        | 5.1829 | 1700 | 9.4801          |
| 3.5854        | 5.3354 | 1750 | 9.4837          |
| 3.6187        | 5.4878 | 1800 | 9.4850          |
| 3.6494        | 5.6402 | 1850 | 9.4836          |
| 3.5671        | 5.7927 | 1900 | 9.4795          |
| 3.6055        | 5.9451 | 1950 | 9.4780          |
| 3.6121        | 6.0976 | 2000 | 9.4821          |
| 3.5694        | 6.25   | 2050 | 9.4817          |
| 3.5671        | 6.4024 | 2100 | 9.4827          |
| 3.6026        | 6.5549 | 2150 | 9.4809          |
| 3.5941        | 6.7073 | 2200 | 9.4819          |
| 3.549         | 6.8598 | 2250 | 9.4826          |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.3.1
  • Tokenizers 0.21.0
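To recreate the environment, the versions above can be pinned in a requirements file, as sketched below. The torch pin drops the +cu124 local build tag, which assumes the matching CUDA wheel is resolved from the usual PyTorch index.

```text
peft==0.14.0
transformers==4.48.3
torch==2.6.0
datasets==3.3.1
tokenizers==0.21.0
```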