---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
base_model: unsloth/llama-3-8b-Instruct-bnb-4bit
---

# Uploaded model

- **Developed by:** Angelectronic
- **License:** apache-2.0
- **Finetuned from model:** unsloth/llama-3-8b-Instruct-bnb-4bit

This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

### Evaluation

- **ViMMRC test set:** 0.8221 accuracy

### Training results

| Training Loss | Accuracy | Step | Validation Loss |
|:-------------:|:---------:|:-----:|:---------------:|
| 1.836600 | 0.822141 | 240 | 2.302049 |
| 1.648300 | 0.827586 | 480 | 2.330861 |
| 1.511700 | 0.833031 | 720 | 2.388702 |
| 0.669600 | 0.767695 | 960 | 1.521454 |
| 0.592300 | 0.771324 | 1200 | 1.590301 |
| 0.496500 | 0.780399 | 1440 | 1.608687 |
| 0.381800 | 0.785843 | 1680 | 1.641979 |
| 0.334100 | 0.769510 | 1920 | 1.629696 |
| 0.285500 | 0.769510 | 2160 | 1.715881 |
| 0.242200 | 0.765880 | 2400 | 1.747410 |
| 0.200000 | 0.773140 | 2640 | 1.813693 |
| 0.146800 | 0.765880 | 2880 | 1.937426 |
| 0.112200 | 0.776769 | 3120 | 1.937926 |
| 0.101500 | 0.765880 | 3360 | 1.997301 |
| 0.094200 | 0.764065 | 3600 | 1.968903 |
| 0.087000 | 0.758621 | 3840 | 2.004644 |
| 0.084600 | 0.762250 | 4080 | 2.010856 |

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 3407
- gradient_accumulation_steps: 4
- eval_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 5
- num_epochs: 3

### Framework versions

- PEFT 0.10.0
- Transformers 4.40.2
- Pytorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.19.1
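
### Hyperparameter sketch

The hyperparameters listed above can be related to each other with a small, dependency-free sketch. It shows how `total_train_batch_size: 64` follows from `train_batch_size: 16` and `gradient_accumulation_steps: 4`, and how a cosine schedule with 5 warmup steps would shape the learning rate. Several things here are assumptions and not taken from the training run: a single device, a total of 4080 optimizer steps (simply the last step logged in the results table; the true total may differ), and the helper names `effective_batch_size` and `cosine_lr`, which are illustrative only. The schedule is written in the usual linear-warmup-then-half-cosine form and is not guaranteed to match the trainer's exact implementation.

```python
import math

# Values copied from the hyperparameter list in this card.
LEARNING_RATE = 2e-4
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 5

# Assumption: total optimizer steps equal the last logged step in the
# results table. The card does not state the real total.
ASSUMED_TOTAL_STEPS = 4080


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples consumed per optimizer update (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def cosine_lr(
    step: int,
    base_lr: float = LEARNING_RATE,
    warmup_steps: int = WARMUP_STEPS,
    total_steps: int = ASSUMED_TOTAL_STEPS,
) -> float:
    """Linear warmup from 0 to base_lr, then half-cosine decay towards 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 5, 240, 2040, 4080):
        print(f"step {step:>4}: lr = {cosine_lr(step):.6e}")
```

With so few warmup steps, the learning rate reaches its peak almost immediately, so nearly the whole run is spent on the cosine decay.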