# synthAIze_telugu_colloquial_trans-20250220-171557
This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit. The training dataset is not specified in this card. It achieves the following result on the evaluation set:
- Loss: 9.4826
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 7
- mixed_precision_training: Native AMP
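The linear scheduler with 100 warmup steps can be sketched in plain Python as below (matching the shape of `transformers`' `get_linear_schedule_with_warmup`). The total step count is an assumption inferred from the results table (~328 optimizer steps per epoch × 7 epochs ≈ 2296); it is not stated in the card.

```python
# Sketch of a linear warmup + linear decay LR schedule, assuming the
# hyperparameters listed above. TOTAL_STEPS is inferred from the results
# table, not stated explicitly in the card.
BASE_LR = 3e-4
WARMUP_STEPS = 100
TOTAL_STEPS = 2296  # assumption: 7 epochs at ~328 optimizer steps each


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps."""
    if step < WARMUP_STEPS:
        # linear ramp from 0 up to BASE_LR over the warmup period
        return BASE_LR * step / WARMUP_STEPS
    # linear decay from BASE_LR down to 0 over the remaining steps
    remaining = TOTAL_STEPS - step
    return max(0.0, BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS))


print(lr_at(0), lr_at(50), lr_at(100), lr_at(TOTAL_STEPS))
```

So the learning rate peaks at 3e-4 exactly when warmup ends at step 100 and decays linearly to zero by the final step.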
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 6.7932 | 0.1524 | 50 | 9.5016 |
| 4.2355 | 0.3049 | 100 | 9.4934 |
| 3.9892 | 0.4573 | 150 | 9.4652 |
| 3.9026 | 0.6098 | 200 | 9.4494 |
| 3.8959 | 0.7622 | 250 | 9.4487 |
| 3.8923 | 0.9146 | 300 | 9.4453 |
| 3.8547 | 1.0671 | 350 | 9.4375 |
| 3.8484 | 1.2195 | 400 | 9.4447 |
| 3.8292 | 1.3720 | 450 | 9.4463 |
| 3.8149 | 1.5244 | 500 | 9.4531 |
| 3.813 | 1.6768 | 550 | 9.4573 |
| 3.8323 | 1.8293 | 600 | 9.4606 |
| 3.7694 | 1.9817 | 650 | 9.4535 |
| 3.7559 | 2.1341 | 700 | 9.4536 |
| 3.7347 | 2.2866 | 750 | 9.4551 |
| 3.7644 | 2.4390 | 800 | 9.4583 |
| 3.7183 | 2.5915 | 850 | 9.4598 |
| 3.7547 | 2.7439 | 900 | 9.4625 |
| 3.7757 | 2.8963 | 950 | 9.4610 |
| 3.6877 | 3.0488 | 1000 | 9.4642 |
| 3.682 | 3.2012 | 1050 | 9.4665 |
| 3.6884 | 3.3537 | 1100 | 9.4719 |
| 3.7081 | 3.5061 | 1150 | 9.4755 |
| 3.7037 | 3.6585 | 1200 | 9.4676 |
| 3.711 | 3.8110 | 1250 | 9.4710 |
| 3.6867 | 3.9634 | 1300 | 9.4724 |
| 3.608 | 4.1159 | 1350 | 9.4704 |
| 3.6853 | 4.2683 | 1400 | 9.4716 |
| 3.6458 | 4.4207 | 1450 | 9.4745 |
| 3.6421 | 4.5732 | 1500 | 9.4811 |
| 3.6456 | 4.7256 | 1550 | 9.4736 |
| 3.6407 | 4.8780 | 1600 | 9.4789 |
| 3.667 | 5.0305 | 1650 | 9.4769 |
| 3.5912 | 5.1829 | 1700 | 9.4801 |
| 3.5854 | 5.3354 | 1750 | 9.4837 |
| 3.6187 | 5.4878 | 1800 | 9.4850 |
| 3.6494 | 5.6402 | 1850 | 9.4836 |
| 3.5671 | 5.7927 | 1900 | 9.4795 |
| 3.6055 | 5.9451 | 1950 | 9.4780 |
| 3.6121 | 6.0976 | 2000 | 9.4821 |
| 3.5694 | 6.25 | 2050 | 9.4817 |
| 3.5671 | 6.4024 | 2100 | 9.4827 |
| 3.6026 | 6.5549 | 2150 | 9.4809 |
| 3.5941 | 6.7073 | 2200 | 9.4819 |
| 3.549 | 6.8598 | 2250 | 9.4826 |
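The (epoch, step) pairs in the table allow a rough back-of-the-envelope estimate of the training-set size, assuming the train batch size of 8 above and no gradient accumulation (none is listed). These figures are inferred, not stated in the card.

```python
# Back-of-the-envelope consistency check on the results table: infer
# optimizer steps per epoch and, from it, the approximate number of
# training examples. Assumes batch size 8 with no gradient accumulation.
TRAIN_BATCH_SIZE = 8

# (epoch, step) pair taken from the last row of the table
last_epoch, last_step = 6.8598, 2250

steps_per_epoch = round(last_step / last_epoch)       # ~328
approx_examples = steps_per_epoch * TRAIN_BATCH_SIZE  # ~2624

print(steps_per_epoch, approx_examples)
```

This suggests roughly 2,600 training examples, consistent across the logged rows (e.g. step 50 at epoch 0.1524 gives the same ~328 steps per epoch).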
### Framework versions
- PEFT 0.14.0
- Transformers 4.48.3
- Pytorch 2.6.0+cu124
- Datasets 3.3.1
- Tokenizers 0.21.0
## Model tree for sril32996/synthAIze_telugu_colloquial_trans-20250220-171557

- Base model: unsloth/tinyllama-chat-bnb-4bit