tyzhu committed
Commit 85393c4 (1 parent: b44510f)

Model save

Files changed (1)
  1. README.md +64 -44
README.md CHANGED
@@ -1,25 +1,14 @@
 ---
-license: llama2
-base_model: meta-llama/Llama-2-7b-hf
+license: other
+base_model: Qwen/Qwen1.5-4B
 tags:
 - generated_from_trainer
-datasets:
-- tyzhu/lmind_hotpot_train8000_eval7405_v1_qa
 metrics:
 - accuracy
 model-index:
 - name: lmind_hotpot_train8000_eval7405_v1_qa_5e-4_lora2
-  results:
-  - task:
-      name: Causal Language Modeling
-      type: text-generation
-    dataset:
-      name: tyzhu/lmind_hotpot_train8000_eval7405_v1_qa
-      type: tyzhu/lmind_hotpot_train8000_eval7405_v1_qa
-    metrics:
-    - name: Accuracy
-      type: accuracy
-      value: 0.5813164556962025
+  results: []
+library_name: peft
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -27,10 +16,10 @@ should probably proofread and complete it, then remove this comment. -->
 
 # lmind_hotpot_train8000_eval7405_v1_qa_5e-4_lora2
 
-This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the tyzhu/lmind_hotpot_train8000_eval7405_v1_qa dataset.
+This model is a fine-tuned version of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.9420
-- Accuracy: 0.5813
+- Loss: 4.0366
+- Accuracy: 0.4784
 
 ## Model description
 
@@ -61,37 +50,68 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
 - lr_scheduler_warmup_ratio: 0.05
-- num_epochs: 20.0
+- num_epochs: 50.0
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss | Accuracy |
-|:-------------:|:-----:|:----:|:---------------:|:--------:|
-| 1.8732 | 1.0 | 250 | 2.0111 | 0.5939 |
-| 1.6142 | 2.0 | 500 | 1.8443 | 0.6051 |
-| 1.206 | 3.0 | 750 | 1.9818 | 0.6007 |
-| 0.8693 | 4.0 | 1000 | 2.2100 | 0.5941 |
-| 0.6023 | 5.0 | 1250 | 2.3756 | 0.5910 |
-| 0.4717 | 6.0 | 1500 | 2.5421 | 0.5896 |
-| 0.3938 | 7.0 | 1750 | 2.6587 | 0.5891 |
-| 0.3697 | 8.0 | 2000 | 2.7532 | 0.5873 |
-| 0.3617 | 9.0 | 2250 | 2.7664 | 0.5870 |
-| 0.3607 | 10.0 | 2500 | 2.8514 | 0.5867 |
-| 0.3414 | 11.0 | 2750 | 2.8932 | 0.5861 |
-| 0.3439 | 12.0 | 3000 | 2.9545 | 0.5855 |
-| 0.335 | 13.0 | 3250 | 2.8991 | 0.5843 |
-| 0.3391 | 14.0 | 3500 | 2.8793 | 0.5840 |
-| 0.328 | 15.0 | 3750 | 2.8954 | 0.5851 |
-| 0.3351 | 16.0 | 4000 | 2.9140 | 0.5838 |
-| 0.3252 | 17.0 | 4250 | 2.9297 | 0.5825 |
-| 0.332 | 18.0 | 4500 | 2.9812 | 0.5834 |
-| 0.324 | 19.0 | 4750 | 2.9823 | 0.5808 |
-| 0.3329 | 20.0 | 5000 | 2.9420 | 0.5813 |
+| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
+|:-------------:|:-----:|:-----:|:---------------:|:--------:|
+| 2.2398 | 1.0 | 250 | 2.3236 | 0.5163 |
+| 1.8301 | 2.0 | 500 | 2.4220 | 0.5124 |
+| 1.3626 | 3.0 | 750 | 2.6153 | 0.5062 |
+| 1.0112 | 4.0 | 1000 | 2.8349 | 0.4997 |
+| 0.7198 | 5.0 | 1250 | 3.0756 | 0.4963 |
+| 0.589 | 6.0 | 1500 | 3.2339 | 0.4943 |
+| 0.4969 | 7.0 | 1750 | 3.3425 | 0.4935 |
+| 0.4786 | 8.0 | 2000 | 3.4198 | 0.4924 |
+| 0.4399 | 9.0 | 2250 | 3.4695 | 0.4911 |
+| 0.4481 | 10.0 | 2500 | 3.5353 | 0.4913 |
+| 0.4166 | 11.0 | 2750 | 3.4938 | 0.4894 |
+| 0.429 | 12.0 | 3000 | 3.5450 | 0.4906 |
+| 0.4193 | 13.0 | 3250 | 3.5636 | 0.4882 |
+| 0.4276 | 14.0 | 3500 | 3.5626 | 0.4890 |
+| 0.4071 | 15.0 | 3750 | 3.6309 | 0.4883 |
+| 0.421 | 16.0 | 4000 | 3.5818 | 0.4890 |
+| 0.4065 | 17.0 | 4250 | 3.6167 | 0.4869 |
+| 0.4188 | 18.0 | 4500 | 3.6926 | 0.4857 |
+| 0.3994 | 19.0 | 4750 | 3.6533 | 0.4863 |
+| 0.4103 | 20.0 | 5000 | 3.6891 | 0.4864 |
+| 0.397 | 21.0 | 5250 | 3.6973 | 0.4851 |
+| 0.4118 | 22.0 | 5500 | 3.7214 | 0.4859 |
+| 0.3944 | 23.0 | 5750 | 3.7193 | 0.4851 |
+| 0.4036 | 24.0 | 6000 | 3.7567 | 0.4845 |
+| 0.3939 | 25.0 | 6250 | 3.7891 | 0.4841 |
+| 0.401 | 26.0 | 6500 | 3.7671 | 0.4828 |
+| 0.3871 | 27.0 | 6750 | 3.7838 | 0.4835 |
+| 0.4005 | 28.0 | 7000 | 3.8041 | 0.4831 |
+| 0.3854 | 29.0 | 7250 | 3.8603 | 0.4830 |
+| 0.3942 | 30.0 | 7500 | 3.8247 | 0.4812 |
+| 0.3837 | 31.0 | 7750 | 3.8497 | 0.4815 |
+| 0.3896 | 32.0 | 8000 | 3.8705 | 0.4836 |
+| 0.3817 | 33.0 | 8250 | 3.8643 | 0.4818 |
+| 0.3928 | 34.0 | 8500 | 3.9378 | 0.4807 |
+| 0.3839 | 35.0 | 8750 | 3.9542 | 0.4810 |
+| 0.3942 | 36.0 | 9000 | 3.9250 | 0.4806 |
+| 0.381 | 37.0 | 9250 | 3.9220 | 0.4792 |
+| 0.3918 | 38.0 | 9500 | 3.9584 | 0.4781 |
+| 0.3787 | 39.0 | 9750 | 3.9241 | 0.4776 |
+| 0.3897 | 40.0 | 10000 | 3.9434 | 0.4773 |
+| 0.3786 | 41.0 | 10250 | 3.9411 | 0.4793 |
+| 0.3864 | 42.0 | 10500 | 3.9933 | 0.4766 |
+| 0.377 | 43.0 | 10750 | 4.0015 | 0.4787 |
+| 0.3887 | 44.0 | 11000 | 3.9979 | 0.4788 |
+| 0.3805 | 45.0 | 11250 | 3.9764 | 0.4796 |
+| 0.3827 | 46.0 | 11500 | 3.9990 | 0.4786 |
+| 0.3737 | 47.0 | 11750 | 4.0059 | 0.4792 |
+| 0.3807 | 48.0 | 12000 | 4.0746 | 0.4798 |
+| 0.3772 | 49.0 | 12250 | 4.0123 | 0.4776 |
+| 0.3808 | 50.0 | 12500 | 4.0366 | 0.4784 |
 
 
 ### Framework versions
 
-- Transformers 4.34.0
+- PEFT 0.5.0
+- Transformers 4.41.1
 - Pytorch 2.1.0+cu121
-- Datasets 2.18.0
-- Tokenizers 0.14.1
+- Datasets 2.19.1
++ Tokenizers 0.19.1
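
Two notes on reproducing this run. The diff records the optimizer, scheduler, and epoch count, but not the learning rate or batch sizes; the `5e-4` in the run name suggests the learning rate. A minimal `TrainingArguments` sketch under those assumptions, with placeholder values marked:

```python
# Sketch only: mirrors the hyperparameters visible in this card.
# learning_rate is inferred from the run name, and the batch-size
# value is a placeholder -- neither appears in the diff above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="lmind_hotpot_train8000_eval7405_v1_qa_5e-4_lora2",
    learning_rate=5e-4,             # assumed from the "5e-4" in the run name
    adam_beta1=0.9,                 # "Adam with betas=(0.9,0.999)"
    adam_beta2=0.999,
    adam_epsilon=1e-08,             # "and epsilon=1e-08"
    lr_scheduler_type="constant",
    warmup_ratio=0.05,
    num_train_epochs=50.0,          # raised from 20.0 in this commit
    evaluation_strategy="epoch",    # the table logs one eval per epoch
    logging_strategy="epoch",
    per_device_train_batch_size=8,  # placeholder: not recorded above
)
```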
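Second, the new card declares `library_name: peft` with `base_model: Qwen/Qwen1.5-4B`, so the repository holds a LoRA adapter rather than full model weights. A minimal loading sketch, assuming the adapter is published as `tyzhu/lmind_hotpot_train8000_eval7405_v1_qa_5e-4_lora2` (the model-index name; adjust the repo id if it differs):

```python
# Sketch only: attach the LoRA adapter to the Qwen1.5-4B base model.
# The adapter repo id below is an assumption based on the model-index name.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "Qwen/Qwen1.5-4B"
ADAPTER_ID = "tyzhu/lmind_hotpot_train8000_eval7405_v1_qa_5e-4_lora2"  # assumed

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(
    BASE_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER_ID)  # loads adapter weights only
model.eval()

# HotpotQA-style prompt; the exact training prompt format is not in this diff.
prompt = "Question: Who directed the film Jaws?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

If the reported eval loss is mean per-token cross-entropy, it corresponds to a perplexity of roughly exp(4.0366) ≈ 56.6, versus exp(2.9420) ≈ 19.0 for the Llama-2-7b-hf run this commit overwrites.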