---
license: other
base_model: Qwen/Qwen1.5-4B
tags:
- generated_from_trainer
datasets:
- tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3
metrics:
- accuracy
model-index:
- name: lmind_nq_train6000_eval6489_v1_docidx_v3_Qwen_Qwen1.5-4B_3e-5_lora2
  results:
  - task:
      name: Causal Language Modeling
      type: text-generation
    dataset:
      name: tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3
      type: tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.4408205128205128
library_name: peft
---

# lmind_nq_train6000_eval6489_v1_docidx_v3_Qwen_Qwen1.5-4B_3e-5_lora2

This model is a fine-tuned version of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on the tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3 dataset.
It achieves the following results on the evaluation set:
- Loss: 4.3717
- Accuracy: 0.4408

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 20.0

### Training results

| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
|:-------------:|:-------:|:----:|:---------------:|:--------:|
| 1.9685        | 0.9985  | 341  | 3.0537          | 0.4626   |
| 1.9337        | 2.0     | 683  | 2.9928          | 0.4709   |
| 1.9029        | 2.9985  | 1024 | 3.0243          | 0.4705   |
| 1.856         | 4.0     | 1366 | 3.0766          | 0.4697   |
| 1.8019        | 4.9985  | 1707 | 3.1923          | 0.4696   |
| 1.7406        | 6.0     | 2049 | 3.2573          | 0.4684   |
| 1.6974        | 6.9985  | 2390 | 3.3286          | 0.4672   |
| 1.6249        | 8.0     | 2732 | 3.4775          | 0.4647   |
| 1.5993        | 8.9985  | 3073 | 3.5378          | 0.4636   |
| 1.5449        | 10.0    | 3415 | 3.6347          | 0.4597   |
| 1.4855        | 10.9985 | 3756 | 3.6955          | 0.4553   |
| 1.4205        | 12.0    | 4098 | 3.8478          | 0.4479   |
| 1.3757        | 12.9985 | 4439 | 3.9185          | 0.4487   |
| 1.3098        | 14.0    | 4781 | 3.9575          | 0.4455   |
| 1.2574        | 14.9985 | 5122 | 4.1279          | 0.4457   |
| 1.2049        | 16.0    | 5464 | 4.1540          | 0.4448   |
| 1.1617        | 16.9985 | 5805 | 4.2049          | 0.4454   |
| 1.1046        | 18.0    | 6147 | 4.2909          | 0.4432   |
| 1.043         | 18.9985 | 6488 | 4.3535          | 0.4385   |
| 1.0044        | 19.9707 | 6820 | 4.3717          | 0.4408   |

### Framework versions

- PEFT 0.5.0
- Transformers 4.40.2
- Pytorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.19.1
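
### Checking the reported numbers

As a quick sanity check on the figures in this card, the sketch below recomputes the total batch sizes from the per-device values and looks up the row with the lowest validation loss in the training results. The variable names and the `best_by_validation_loss` helper are illustrative and not part of the original training script. It assumes the total train batch size is the per-device batch size times the number of devices times the gradient accumulation steps, which is consistent with the values listed under the training hyperparameters.

```python
# Hyperparameters copied from the "Training hyperparameters" list.
train_batch_size = 1
eval_batch_size = 2
num_devices = 4
gradient_accumulation_steps = 8

# Assumption: totals are per-device size x devices
# (x accumulation steps for training only).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# (epoch, validation_loss, accuracy) rows copied from "Training results".
results = [
    (0.9985, 3.0537, 0.4626),
    (2.0, 2.9928, 0.4709),
    (2.9985, 3.0243, 0.4705),
    (4.0, 3.0766, 0.4697),
    (4.9985, 3.1923, 0.4696),
    (6.0, 3.2573, 0.4684),
    (6.9985, 3.3286, 0.4672),
    (8.0, 3.4775, 0.4647),
    (8.9985, 3.5378, 0.4636),
    (10.0, 3.6347, 0.4597),
    (10.9985, 3.6955, 0.4553),
    (12.0, 3.8478, 0.4479),
    (12.9985, 3.9185, 0.4487),
    (14.0, 3.9575, 0.4455),
    (14.9985, 4.1279, 0.4457),
    (16.0, 4.1540, 0.4448),
    (16.9985, 4.2049, 0.4454),
    (18.0, 4.2909, 0.4432),
    (18.9985, 4.3535, 0.4385),
    (19.9707, 4.3717, 0.4408),
]


def best_by_validation_loss(rows):
    """Return the (epoch, validation_loss, accuracy) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


best = best_by_validation_loss(results)
print(total_train_batch_size, total_eval_batch_size)
print(best)
```

On these numbers, the lowest validation loss (2.9928) and the highest accuracy (0.4709) both fall at epoch 2.0, while the training loss keeps decreasing through the final epoch. The evaluation results reported at the top of this card (loss 4.3717, accuracy 0.4408) correspond to the last row of the table, not to that earlier row.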