Model save
Browse files
- README.md +148 -0
- adapter_model.safetensors +1 -1
- all_results.json +8 -0
- train_results.json +8 -0
- trainer_state.json +0 -0

README.md
ADDED
@@ -0,0 +1,148 @@
---
license: apache-2.0
library_name: peft
tags:
- trl
- dpo
- generated_from_trainer
base_model: mistralai/Mistral-7B-v0.1
model-index:
- name: zephyr-7b-dpo-qlora
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# zephyr-7b-dpo-qlora

This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4886
- Rewards/chosen: -3.3028
- Rewards/rejected: -4.6175
- Rewards/accuracies: 0.7520
- Rewards/margins: 1.3147
- Logps/rejected: -706.3290
- Logps/chosen: -594.9012
- Logits/rejected: 1.7559
- Logits/chosen: 1.0126

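As a sanity check on the evaluation numbers above, the reported Rewards/margins is simply the gap between the chosen and rejected rewards. A minimal sketch in plain Python, with the values copied from this card:

```python
# DPO reward margin = reward of the chosen response minus reward of the
# rejected response; values taken from the evaluation results above.
rewards_chosen = -3.3028
rewards_rejected = -4.6175

margin = rewards_chosen - rewards_rejected
print(round(margin, 4))  # 1.3147, matching the reported Rewards/margins
```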
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 2
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1

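The total_train_batch_size in the list above follows from the per-device batch size and gradient accumulation. A small sketch of that arithmetic, assuming a single training process (which is consistent with the reported total of 8):

```python
# Effective (total) train batch size = per-device batch size
# * gradient accumulation steps * number of processes.
train_batch_size = 2             # per device, from the hyperparameters above
gradient_accumulation_steps = 4
num_processes = 1                # assumption: consistent with the total of 8

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_processes
print(total_train_batch_size)  # 8
```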
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6885 | 0.01 | 100 | 0.6887 | 0.0401 | 0.0310 | 0.6155 | 0.0091 | -241.4763 | -260.6096 | -2.3013 | -2.3864 |
| 0.6826 | 0.03 | 200 | 0.6777 | 0.0538 | 0.0208 | 0.6555 | 0.0329 | -242.4942 | -259.2415 | -2.2939 | -2.3792 |
| 0.6623 | 0.04 | 300 | 0.6578 | -0.0931 | -0.1758 | 0.6735 | 0.0827 | -262.1588 | -273.9337 | -2.2310 | -2.3202 |
| 0.6619 | 0.05 | 400 | 0.6455 | -0.2994 | -0.4240 | 0.6610 | 0.1245 | -286.9754 | -294.5644 | -2.0309 | -2.1441 |
| 0.6257 | 0.07 | 500 | 0.6194 | -0.3522 | -0.5612 | 0.6850 | 0.2089 | -300.6967 | -299.8442 | -2.0400 | -2.1485 |
| 0.6114 | 0.08 | 600 | 0.6004 | -0.6308 | -0.9602 | 0.6755 | 0.3295 | -340.6012 | -327.6964 | -1.5503 | -1.7200 |
| 0.5394 | 0.09 | 700 | 0.6103 | -1.5690 | -1.9843 | 0.6635 | 0.4153 | -443.0096 | -421.5208 | -0.6532 | -0.9309 |
| 0.6171 | 0.1 | 800 | 0.6372 | -1.7546 | -2.0641 | 0.6405 | 0.3095 | -450.9858 | -440.0762 | 0.0235 | -0.3349 |
| 0.5553 | 0.12 | 900 | 0.5687 | -1.3500 | -1.8540 | 0.6930 | 0.5041 | -429.9809 | -399.6168 | 2.6187 | 1.9978 |
| 0.6299 | 0.13 | 1000 | 0.5620 | -1.1629 | -1.7464 | 0.6975 | 0.5835 | -419.2182 | -380.9113 | 3.4192 | 2.7155 |
| 0.5898 | 0.14 | 1100 | 0.5619 | -2.4368 | -3.0963 | 0.7090 | 0.6594 | -554.2042 | -508.3033 | 5.3078 | 4.4134 |
| 0.4782 | 0.16 | 1200 | 0.5594 | -1.5060 | -2.2383 | 0.7090 | 0.7323 | -468.4132 | -415.2229 | 4.0187 | 3.1485 |
| 0.5709 | 0.17 | 1300 | 0.5481 | -1.7316 | -2.3668 | 0.7245 | 0.6352 | -481.2582 | -437.7783 | 4.1315 | 3.2570 |
| 0.5181 | 0.18 | 1400 | 0.5454 | -2.4857 | -3.3898 | 0.7140 | 0.9042 | -583.5640 | -513.1900 | 4.6977 | 3.6944 |
| 0.5495 | 0.2 | 1500 | 0.5428 | -2.5602 | -3.3574 | 0.7205 | 0.7972 | -580.3215 | -520.6432 | 4.1847 | 3.2888 |
| 0.574 | 0.21 | 1600 | 0.5638 | -2.7101 | -3.5446 | 0.7190 | 0.8346 | -599.0428 | -535.6277 | 4.9219 | 3.9304 |
| 0.4901 | 0.22 | 1700 | 0.5284 | -2.4900 | -3.3577 | 0.7335 | 0.8677 | -580.3493 | -513.6201 | 3.8220 | 2.9305 |
| 0.5149 | 0.24 | 1800 | 0.5408 | -1.7507 | -2.4663 | 0.7215 | 0.7156 | -491.2047 | -439.6899 | 2.0262 | 1.2751 |
| 0.6382 | 0.25 | 1900 | 0.5325 | -2.1268 | -2.9548 | 0.7255 | 0.8279 | -540.0542 | -477.3052 | 2.4039 | 1.4990 |
| 0.5178 | 0.26 | 2000 | 0.5276 | -1.4221 | -2.1526 | 0.7305 | 0.7305 | -459.8390 | -406.8324 | 1.5288 | 0.8157 |
| 0.524 | 0.27 | 2100 | 0.5663 | -2.7101 | -3.7077 | 0.7110 | 0.9976 | -615.3445 | -535.6266 | 2.5955 | 1.6625 |
| 0.523 | 0.29 | 2200 | 0.5422 | -2.2871 | -3.3438 | 0.7230 | 1.0567 | -578.9616 | -493.3343 | 3.5955 | 2.5436 |
| 0.5431 | 0.3 | 2300 | 0.5253 | -2.1932 | -3.2183 | 0.7340 | 1.0252 | -566.4124 | -483.9387 | 4.2433 | 3.2004 |
| 0.5147 | 0.31 | 2400 | 0.5132 | -2.8441 | -3.8795 | 0.7315 | 1.0354 | -632.5286 | -549.0342 | 4.6772 | 3.6861 |
| 0.4198 | 0.33 | 2500 | 0.5214 | -2.1756 | -3.1443 | 0.7290 | 0.9687 | -559.0054 | -482.1783 | 2.7950 | 1.8511 |
| 0.5994 | 0.34 | 2600 | 0.5188 | -3.1314 | -4.1849 | 0.7290 | 1.0535 | -663.0683 | -577.7604 | 3.4511 | 2.4450 |
| 0.4812 | 0.35 | 2700 | 0.5139 | -3.0136 | -4.1060 | 0.7455 | 1.0924 | -655.1821 | -565.9851 | 3.7760 | 2.7916 |
| 0.4696 | 0.37 | 2800 | 0.5137 | -2.2305 | -3.2368 | 0.7355 | 1.0063 | -568.2574 | -487.6709 | 2.6757 | 1.8289 |
| 0.5418 | 0.38 | 2900 | 0.5177 | -2.0641 | -3.1462 | 0.7345 | 1.0822 | -559.2020 | -471.0270 | 2.0189 | 1.1899 |
| 0.5068 | 0.39 | 3000 | 0.5096 | -2.4564 | -3.5648 | 0.7400 | 1.1084 | -601.0543 | -510.2569 | 2.8679 | 2.0023 |
| 0.4429 | 0.41 | 3100 | 0.5324 | -2.7544 | -3.8869 | 0.7180 | 1.1325 | -633.2682 | -540.0566 | 1.3309 | 0.6491 |
| 0.5977 | 0.42 | 3200 | 0.4963 | -2.8842 | -3.9825 | 0.7425 | 1.0983 | -642.8285 | -553.0416 | 2.0170 | 1.2328 |
| 0.5281 | 0.43 | 3300 | 0.5074 | -2.4254 | -3.5511 | 0.7325 | 1.1257 | -599.6907 | -507.1647 | 1.1826 | 0.4294 |
| 0.5114 | 0.44 | 3400 | 0.5197 | -2.8424 | -4.0833 | 0.7255 | 1.2409 | -652.9095 | -548.8630 | 2.1493 | 1.2128 |
| 0.4984 | 0.46 | 3500 | 0.5002 | -3.1997 | -4.4222 | 0.7450 | 1.2225 | -686.7951 | -584.5864 | 3.3502 | 2.4203 |
| 0.5723 | 0.47 | 3600 | 0.5010 | -3.0065 | -4.2439 | 0.7410 | 1.2374 | -668.9721 | -565.2749 | 3.1534 | 2.2598 |
| 0.5496 | 0.48 | 3700 | 0.5015 | -3.0581 | -4.3336 | 0.7395 | 1.2755 | -677.9391 | -570.4304 | 3.3120 | 2.4472 |
| 0.5106 | 0.5 | 3800 | 0.5013 | -3.5077 | -4.8209 | 0.7395 | 1.3132 | -726.6729 | -615.3915 | 2.7134 | 1.8547 |
| 0.376 | 0.51 | 3900 | 0.4995 | -3.2636 | -4.5260 | 0.7415 | 1.2624 | -697.1753 | -590.9803 | 2.7739 | 1.9628 |
| 0.4935 | 0.52 | 4000 | 0.4916 | -2.8251 | -3.9628 | 0.7465 | 1.1377 | -640.8605 | -547.1311 | 2.2899 | 1.5516 |
| 0.445 | 0.54 | 4100 | 0.4959 | -3.1300 | -4.4063 | 0.7480 | 1.2763 | -685.2046 | -577.6177 | 2.5949 | 1.8263 |
| 0.443 | 0.55 | 4200 | 0.5039 | -2.6104 | -3.9167 | 0.7345 | 1.3063 | -636.2510 | -525.6652 | 2.5643 | 1.7637 |
| 0.517 | 0.56 | 4300 | 0.5042 | -3.0608 | -4.4485 | 0.7375 | 1.3877 | -689.4330 | -570.7054 | 2.6212 | 1.8545 |
| 0.3693 | 0.58 | 4400 | 0.4969 | -3.2698 | -4.5598 | 0.7470 | 1.2900 | -700.5564 | -591.6002 | 2.5178 | 1.8051 |
| 0.481 | 0.59 | 4500 | 0.4893 | -2.8076 | -3.9614 | 0.7445 | 1.1537 | -640.7148 | -545.3853 | 2.0329 | 1.3648 |
| 0.4696 | 0.6 | 4600 | 0.4945 | -3.3369 | -4.5983 | 0.7465 | 1.2614 | -704.4065 | -598.3125 | 2.6733 | 1.9401 |
| 0.4437 | 0.62 | 4700 | 0.4940 | -2.8130 | -4.0860 | 0.7445 | 1.2730 | -653.1788 | -545.9229 | 2.0547 | 1.2696 |
| 0.4492 | 0.63 | 4800 | 0.4963 | -2.7727 | -4.0657 | 0.7465 | 1.2930 | -651.1524 | -541.8960 | 2.3393 | 1.5355 |
| 0.5163 | 0.64 | 4900 | 0.5017 | -3.3498 | -4.7649 | 0.7465 | 1.4150 | -721.0643 | -599.6019 | 2.0201 | 1.2216 |
| 0.488 | 0.65 | 5000 | 0.4917 | -3.2508 | -4.5623 | 0.7480 | 1.3115 | -700.8107 | -589.7007 | 1.9166 | 1.1418 |
| 0.3606 | 0.67 | 5100 | 0.4905 | -2.9757 | -4.2308 | 0.7460 | 1.2551 | -667.6595 | -562.1877 | 1.5031 | 0.7813 |
| 0.58 | 0.68 | 5200 | 0.4897 | -2.8783 | -4.1021 | 0.75 | 1.2239 | -654.7924 | -552.4492 | 1.2839 | 0.5850 |
| 0.5788 | 0.69 | 5300 | 0.4900 | -3.0607 | -4.2816 | 0.7490 | 1.2209 | -672.7391 | -570.6943 | 1.4059 | 0.7114 |
| 0.4138 | 0.71 | 5400 | 0.4910 | -3.3493 | -4.6193 | 0.7515 | 1.2701 | -706.5120 | -599.5464 | 1.6121 | 0.8970 |
| 0.5737 | 0.72 | 5500 | 0.4898 | -3.1843 | -4.4515 | 0.7480 | 1.2672 | -689.7249 | -583.0511 | 1.4061 | 0.6955 |
| 0.4249 | 0.73 | 5600 | 0.4918 | -3.3448 | -4.6778 | 0.7490 | 1.3330 | -712.3564 | -599.0980 | 1.7110 | 0.9558 |
| 0.5457 | 0.75 | 5700 | 0.4897 | -3.2784 | -4.5741 | 0.75 | 1.2957 | -701.9877 | -592.4562 | 1.7372 | 0.9922 |
| 0.5287 | 0.76 | 5800 | 0.4920 | -3.3167 | -4.6600 | 0.7495 | 1.3433 | -710.5778 | -596.2890 | 1.9802 | 1.2037 |
| 0.5286 | 0.77 | 5900 | 0.4919 | -3.2305 | -4.5655 | 0.7465 | 1.3350 | -701.1276 | -587.6722 | 1.9038 | 1.1361 |
| 0.5147 | 0.79 | 6000 | 0.4910 | -3.3145 | -4.6435 | 0.7505 | 1.3290 | -708.9319 | -596.0760 | 1.9303 | 1.1726 |
| 0.4478 | 0.8 | 6100 | 0.4886 | -3.2069 | -4.5013 | 0.7480 | 1.2944 | -694.7131 | -585.3105 | 1.7621 | 1.0186 |
| 0.5236 | 0.81 | 6200 | 0.4901 | -3.3207 | -4.6497 | 0.7495 | 1.3290 | -709.5499 | -596.6957 | 1.8309 | 1.0794 |
| 0.5079 | 0.82 | 6300 | 0.4890 | -3.3084 | -4.6220 | 0.7495 | 1.3137 | -706.7820 | -595.4583 | 1.7747 | 1.0322 |
| 0.4942 | 0.84 | 6400 | 0.4891 | -3.2621 | -4.5672 | 0.7495 | 1.3051 | -701.3010 | -590.8314 | 1.7716 | 1.0268 |
| 0.4688 | 0.85 | 6500 | 0.4891 | -3.2863 | -4.5956 | 0.7505 | 1.3093 | -704.1410 | -593.2547 | 1.7863 | 1.0402 |
| 0.5062 | 0.86 | 6600 | 0.4889 | -3.2923 | -4.6029 | 0.7485 | 1.3106 | -704.8691 | -593.8478 | 1.7695 | 1.0261 |
| 0.574 | 0.88 | 6700 | 0.4887 | -3.2779 | -4.5886 | 0.7495 | 1.3108 | -703.4429 | -592.4089 | 1.7573 | 1.0140 |
| 0.5737 | 0.89 | 6800 | 0.4887 | -3.2917 | -4.6042 | 0.7510 | 1.3124 | -704.9940 | -593.7938 | 1.7560 | 1.0126 |
| 0.4298 | 0.9 | 6900 | 0.4889 | -3.2985 | -4.6115 | 0.7505 | 1.3131 | -705.7332 | -594.4664 | 1.7563 | 1.0130 |
| 0.55 | 0.92 | 7000 | 0.4889 | -3.2997 | -4.6137 | 0.7505 | 1.3140 | -705.9527 | -594.5901 | 1.7567 | 1.0132 |
| 0.4123 | 0.93 | 7100 | 0.4889 | -3.3026 | -4.6168 | 0.7515 | 1.3142 | -706.2578 | -594.8819 | 1.7586 | 1.0151 |
| 0.5207 | 0.94 | 7200 | 0.4887 | -3.3049 | -4.6192 | 0.75 | 1.3143 | -706.5007 | -595.1128 | 1.7557 | 1.0126 |
| 0.4618 | 0.96 | 7300 | 0.4888 | -3.3019 | -4.6165 | 0.7515 | 1.3145 | -706.2247 | -594.8143 | 1.7552 | 1.0116 |
| 0.4826 | 0.97 | 7400 | 0.4889 | -3.3035 | -4.6177 | 0.7510 | 1.3142 | -706.3512 | -594.9731 | 1.7538 | 1.0108 |
| 0.3856 | 0.98 | 7500 | 0.4887 | -3.3043 | -4.6187 | 0.7515 | 1.3144 | -706.4486 | -595.0473 | 1.7544 | 1.0114 |
| 0.5369 | 0.99 | 7600 | 0.4886 | -3.3028 | -4.6175 | 0.7520 | 1.3147 | -706.3290 | -594.9012 | 1.7559 | 1.0126 |

### Framework versions

- PEFT 0.8.2
- Transformers 4.38.1
- Pytorch 2.2.0
- Datasets 2.17.1
- Tokenizers 0.15.2

adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:bf2a6cef266f481b68b030667cb2640c6984fb6646b3cd806e5ab8c5ae4edee8
 size 671150064

all_results.json
ADDED
@@ -0,0 +1,8 @@
{
  "epoch": 1.0,
  "train_loss": 0.5201432038994555,
  "train_runtime": 240641.1375,
  "train_samples": 61135,
  "train_samples_per_second": 0.254,
  "train_steps_per_second": 0.032
}
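The throughput figures in this file are derivable from one another: train_samples_per_second is just the sample count divided by the runtime in seconds. A quick sketch checking that, with the values copied from all_results.json:

```python
# train_samples_per_second should equal train_samples / train_runtime
# (both values taken from the JSON above).
train_samples = 61135
train_runtime = 240641.1375  # seconds

samples_per_second = train_samples / train_runtime
print(round(samples_per_second, 3))  # 0.254, matching the reported value
```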

train_results.json
ADDED
@@ -0,0 +1,8 @@
{
  "epoch": 1.0,
  "train_loss": 0.5201432038994555,
  "train_runtime": 240641.1375,
  "train_samples": 61135,
  "train_samples_per_second": 0.254,
  "train_steps_per_second": 0.032
}

trainer_state.json
ADDED
The diff for this file is too large to render. See raw diff.