---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
base_model: unsloth/llama-3-8b-Instruct-bnb-4bit
---

# Uploaded model

- **Developed by:** Angelectronic
- **License:** apache-2.0
- **Finetuned from model:** unsloth/llama-3-8b-Instruct-bnb-4bit

This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

### Evaluation

- **ViMMRC test set:** 0.8385 accuracy

### Training results

| Training Loss | Accuracy | Step | Validation Loss |
|:-------------:|:---------:|:-----:|:---------------:|
| 1.836600 | 0.822141 | 240 | 2.302049 |
| 1.648300 | 0.827586 | 480 | 2.330861 |
| 1.511700 | 0.833031 | 720 | 2.388702 |
| 1.376200 | 0.833031 | 960 | 2.528673 |
| 1.240900 | 0.833031 | 1200 | 2.592396 |
| 1.069600 | 0.831216 | 1440 | 2.697354 |
| 0.860700 | 0.827586 | 1680 | 2.827819 |
| 0.767000 | 0.838475 | 1920 | 2.826283 |
| 0.677900 | 0.822142 | 2160 | 2.965557 |
| 0.594500 | 0.822142 | 2400 | 2.979151 |
| 0.514500 | 0.820327 | 2640 | 3.109596 |
| 0.406800 | 0.818512 | 2880 | 3.196722 |
| 0.320700 | 0.818512 | 3120 | 3.232843 |
| 0.296100 | 0.822142 | 3360 | 3.294877 |
| 0.273400 | 0.818512 | 3600 | 3.346133 |
| 0.262800 | 0.816697 | 3840 | 3.344488 |
| 0.255100 | 0.818511 | 4080 | 3.349281 |

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 3407
- gradient_accumulation_steps: 4
- eval_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 5
- num_epochs: 3

### Framework versions

- PEFT 0.10.0
- Transformers 4.40.2
- Pytorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.19.1
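
### Reading the reported numbers

As a sanity check on the figures in this card, the sketch below recomputes the total train batch size from the listed hyperparameters and finds the evaluation steps with the highest accuracy and the lowest validation loss in the training results table. It is only an illustration: `num_devices = 1` is an assumption (the card does not state how many devices were used), and all variable names are hypothetical rather than taken from the training script.

```python
# Hyperparameters copied from the "Training hyperparameters" list.
train_batch_size = 16
gradient_accumulation_steps = 4
num_devices = 1  # ASSUMPTION: single-device training; not stated in the card

# Effective (total) train batch size per optimizer step.
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)

# Rows copied from the "Training results" table:
# (step, training_loss, accuracy, validation_loss)
rows = [
    (240, 1.836600, 0.822141, 2.302049),
    (480, 1.648300, 0.827586, 2.330861),
    (720, 1.511700, 0.833031, 2.388702),
    (960, 1.376200, 0.833031, 2.528673),
    (1200, 1.240900, 0.833031, 2.592396),
    (1440, 1.069600, 0.831216, 2.697354),
    (1680, 0.860700, 0.827586, 2.827819),
    (1920, 0.767000, 0.838475, 2.826283),
    (2160, 0.677900, 0.822142, 2.965557),
    (2400, 0.594500, 0.822142, 2.979151),
    (2640, 0.514500, 0.820327, 3.109596),
    (2880, 0.406800, 0.818512, 3.196722),
    (3120, 0.320700, 0.818512, 3.232843),
    (3360, 0.296100, 0.822142, 3.294877),
    (3600, 0.273400, 0.818512, 3.346133),
    (3840, 0.262800, 0.816697, 3.344488),
    (4080, 0.255100, 0.818511, 3.349281),
]

# Evaluation step with the highest accuracy.
best_accuracy_row = max(rows, key=lambda row: row[2])
# Evaluation step with the lowest validation loss.
lowest_val_loss_row = min(rows, key=lambda row: row[3])

print("total_train_batch_size:", total_train_batch_size)
print("best accuracy at step:", best_accuracy_row[0])
print("lowest validation loss at step:", lowest_val_loss_row[0])
```

In the table, validation loss is lowest at the first evaluation (step 240) and rises afterwards while training loss keeps falling, whereas accuracy peaks at step 1920. The two criteria therefore point to different checkpoints, so which one to keep depends on the metric that matters for the intended use.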