---
license: other
base_model: Qwen/Qwen1.5-4B
tags:
- generated_from_trainer
datasets:
- tyzhu/lmind_hotpot_train8000_eval7405_v1_qa
metrics:
- accuracy
model-index:
- name: lmind_hotpot_train8000_eval7405_v1_qa_Qwen_Qwen1.5-4B_3e-4_lora2
  results:
  - task:
      name: Causal Language Modeling
      type: text-generation
    dataset:
      name: tyzhu/lmind_hotpot_train8000_eval7405_v1_qa
      type: tyzhu/lmind_hotpot_train8000_eval7405_v1_qa
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.49165079365079367
library_name: peft
---

# lmind_hotpot_train8000_eval7405_v1_qa_Qwen_Qwen1.5-4B_3e-4_lora2

This model is a fine-tuned version of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on the tyzhu/lmind_hotpot_train8000_eval7405_v1_qa dataset.
It achieves the following results on the evaluation set:
- Loss: 3.6823
- Accuracy: 0.4917

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 1
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 20.0

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 2.248         | 1.0   | 250  | 2.3110          | 0.5173   |
| 1.9103        | 2.0   | 500  | 2.3740          | 0.5157   |
| 1.4896        | 3.0   | 750  | 2.5266          | 0.5112   |
| 1.109         | 4.0   | 1000 | 2.7830          | 0.5037   |
| 0.7757        | 5.0   | 1250 | 3.0311          | 0.4987   |
| 0.5994        | 6.0   | 1500 | 3.2256          | 0.4979   |
| 0.4921        | 7.0   | 1750 | 3.3517          | 0.4958   |
| 0.4575        | 8.0   | 2000 | 3.4321          | 0.4946   |
| 0.4233        | 9.0   | 2250 | 3.5151          | 0.4961   |
| 0.4178        | 10.0  | 2500 | 3.5280          | 0.4950   |
| 0.3987        | 11.0  | 2750 | 3.5547          | 0.4951   |
| 0.4033        | 12.0  | 3000 | 3.5601          | 0.4954   |
| 0.3932        | 13.0  | 3250 | 3.5859          | 0.4932   |
| 0.4012        | 14.0  | 3500 | 3.5944          | 0.4927   |
| 0.3895        | 15.0  | 3750 | 3.6038          | 0.4939   |
| 0.396         | 16.0  | 4000 | 3.6504          | 0.4932   |
| 0.3847        | 17.0  | 4250 | 3.6602          | 0.4912   |
| 0.3942        | 18.0  | 4500 | 3.6515          | 0.4914   |
| 0.3809        | 19.0  | 4750 | 3.7304          | 0.4923   |
| 0.3805        | 20.0  | 5000 | 3.6823          | 0.4917   |

### Framework versions

- PEFT 0.5.0
- Transformers 4.40.2
- Pytorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.19.1
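
### Cross-checking the batch-size and step counts

The batch-size totals in the training hyperparameters and the step counts in the training results can be checked against each other with a little arithmetic. The sketch below assumes the usual convention that the total train batch size is the per-device batch size times the number of devices times the gradient accumulation steps, and that the total eval batch size is the per-device eval batch size times the number of devices. The figure of 8000 training examples is an inference from the dataset name (`train8000`), not something this card states directly. The helper name `effective_batch_sizes` is hypothetical and exists only for illustration.

```python
# Values copied from the hyperparameter list and results table in this card.
TRAIN_BATCH_SIZE = 1              # per-device train batch size
EVAL_BATCH_SIZE = 2               # per-device eval batch size
NUM_DEVICES = 4
GRADIENT_ACCUMULATION_STEPS = 8
NUM_EPOCHS = 20
STEPS_PER_EPOCH = 250             # step count logged at epoch 1.0


def effective_batch_sizes(per_device_train, per_device_eval, devices, grad_accum):
    """Return (total_train_batch_size, total_eval_batch_size).

    Hypothetical helper: assumes gradient accumulation applies to training
    only, so evaluation scales with the device count alone.
    """
    total_train = per_device_train * devices * grad_accum
    total_eval = per_device_eval * devices
    return total_train, total_eval


total_train, total_eval = effective_batch_sizes(
    TRAIN_BATCH_SIZE, EVAL_BATCH_SIZE, NUM_DEVICES, GRADIENT_ACCUMULATION_STEPS
)

# Examples consumed per epoch, and optimizer steps over the whole run.
examples_per_epoch = total_train * STEPS_PER_EPOCH
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

print(f"total_train_batch_size = {total_train}")
print(f"total_eval_batch_size  = {total_eval}")
print(f"examples per epoch     = {examples_per_epoch}")
print(f"total optimizer steps  = {total_steps}")
```

Under these assumptions the totals come out to 32 and 8, matching the listed `total_train_batch_size` and `total_eval_batch_size`, and 250 steps of 32 examples give 8000 examples per epoch, consistent with the `train8000` part of the dataset name. Twenty epochs of 250 steps give the 5000 steps shown in the last row of the results table.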