---
license: other
base_model: Qwen/Qwen1.5-4B
tags:
  - generated_from_trainer
datasets:
  - tyzhu/lmind_hotpot_train8000_eval7405_v1_recite_qa
metrics:
  - accuracy
model-index:
  - name: lmind_hotpot_train8000_eval7405_v1_recite_qa_Qwen_Qwen1.5-4B_lora2
    results:
      - task:
          name: Causal Language Modeling
          type: text-generation
        dataset:
          name: tyzhu/lmind_hotpot_train8000_eval7405_v1_recite_qa
          type: tyzhu/lmind_hotpot_train8000_eval7405_v1_recite_qa
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.7780232896652111
library_name: peft
---

# lmind_hotpot_train8000_eval7405_v1_recite_qa_Qwen_Qwen1.5-4B_lora2

This model is a fine-tuned version of Qwen/Qwen1.5-4B on the tyzhu/lmind_hotpot_train8000_eval7405_v1_recite_qa dataset. It achieves the following results on the evaluation set:

- Loss: 0.4804
- Accuracy: 0.7780
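
Since this repository holds a PEFT LoRA adapter rather than full model weights, it must be loaded on top of the base model. A minimal loading sketch, assuming the adapter repo id matches this card's title and a plain-text prompt format (the card does not document the actual prompt template):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and tokenizer.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-4B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-4B")

# Adapter repo id assumed from this card's title.
model = PeftModel.from_pretrained(
    base, "tyzhu/lmind_hotpot_train8000_eval7405_v1_recite_qa_Qwen_Qwen1.5-4B_lora2"
)

# Prompt format is an assumption; adjust to the format used in training.
inputs = tokenizer("Question: Who wrote Hamlet?\nAnswer:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```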

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure
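
The card metadata indicates a PEFT LoRA adapter (`library_name: peft`, and `lora2` in the run name), but the LoRA settings themselves are not recorded here. A hypothetical setup sketch, with rank, alpha, dropout, and target modules all assumed rather than taken from this run:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-4B")

# All LoRA hyperparameters below are illustrative assumptions; the card
# does not record the values actually used for this run.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```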

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 20.0
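
A minimal sketch of these settings expressed through the standard `transformers.TrainingArguments` API; the output directory is a placeholder, and the argument names are the generic Trainer ones, not necessarily those of the original training script:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",               # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=1,  # x 4 GPUs x 8 accumulation steps = 32 total
    per_device_eval_batch_size=2,   # x 4 GPUs = 8 total
    gradient_accumulation_steps=8,
    num_train_epochs=20.0,
    lr_scheduler_type="constant",
    warmup_ratio=0.05,              # reported above, though a constant scheduler does not warm up
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```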

### Training results

| Training Loss | Epoch   | Step  | Accuracy | Validation Loss |
|:--------------|:--------|:------|:---------|:----------------|
| 1.5635        | 0.9998  | 1089  | 0.6796   | 1.4615          |
| 1.4521        | 1.9995  | 2178  | 0.6874   | 1.3626          |
| 1.2848        | 2.9993  | 3267  | 0.6958   | 1.2575          |
| 1.1197        | 4.0     | 4357  | 0.7054   | 1.1527          |
| 0.9756        | 4.9998  | 5446  | 0.7143   | 1.0532          |
| 0.8393        | 5.9995  | 6535  | 0.7241   | 0.9538          |
| 0.7125        | 6.9993  | 7624  | 0.7324   | 0.8674          |
| 0.6144        | 8.0     | 8714  | 0.7404   | 0.7907          |
| 0.5355        | 8.9998  | 9803  | 0.7469   | 0.7288          |
| 0.4584        | 9.9977  | 10890 | 0.7531   | 0.6794          |
| 0.413         | 10.9998 | 11979 | 0.7577   | 0.6292          |
| 0.3731        | 11.9995 | 13068 | 0.7616   | 0.5926          |
| 0.3423        | 12.9993 | 14157 | 0.7656   | 0.5620          |
| 0.3185        | 14.0    | 15247 | 0.7682   | 0.5426          |
| 0.2924        | 14.9998 | 16336 | 0.7708   | 0.5232          |
| 0.2824        | 15.9995 | 17425 | 0.7727   | 0.5129          |
| 0.2669        | 16.9993 | 18514 | 0.7748   | 0.4988          |
| 0.2517        | 18.0    | 19604 | 0.7762   | 0.4892          |
| 0.2376        | 18.9998 | 20693 | 0.7773   | 0.4808          |
| 0.2316        | 19.9977 | 21780 | 0.7780   | 0.4804          |

### Framework versions

- PEFT 0.5.0
- Transformers 4.40.2
- Pytorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.19.1