afrideva committed on
Commit
d0266f2
1 Parent(s): 22ea0c8

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +142 -0
README.md ADDED
---
base_model: BEE-spoke-data/zephyr-220m-dpo-full
datasets:
- HuggingFaceH4/ultrafeedback_binarized
inference: false
license: apache-2.0
model-index:
- name: zephyr-220m-dpo-full
  results: []
model_creator: BEE-spoke-data
model_name: zephyr-220m-dpo-full
pipeline_tag: text-generation
quantized_by: afrideva
tags:
- generated_from_trainer
- gguf
- ggml
- quantized
- q2_k
- q3_k_m
- q4_k_m
- q5_k_m
- q6_k
- q8_0
---

# BEE-spoke-data/zephyr-220m-dpo-full-GGUF
Quantized GGUF model files for [zephyr-220m-dpo-full](https://huggingface.co/BEE-spoke-data/zephyr-220m-dpo-full) from [BEE-spoke-data](https://huggingface.co/BEE-spoke-data).
| Name | Quant method | Size |
| ---- | ---- | ---- |
| [zephyr-220m-dpo-full.fp16.gguf](https://huggingface.co/afrideva/zephyr-220m-dpo-full-GGUF/resolve/main/zephyr-220m-dpo-full.fp16.gguf) | fp16 | 436.50 MB |
| [zephyr-220m-dpo-full.q2_k.gguf](https://huggingface.co/afrideva/zephyr-220m-dpo-full-GGUF/resolve/main/zephyr-220m-dpo-full.q2_k.gguf) | q2_k | 94.43 MB |
| [zephyr-220m-dpo-full.q3_k_m.gguf](https://huggingface.co/afrideva/zephyr-220m-dpo-full-GGUF/resolve/main/zephyr-220m-dpo-full.q3_k_m.gguf) | q3_k_m | 114.65 MB |
| [zephyr-220m-dpo-full.q4_k_m.gguf](https://huggingface.co/afrideva/zephyr-220m-dpo-full-GGUF/resolve/main/zephyr-220m-dpo-full.q4_k_m.gguf) | q4_k_m | 137.58 MB |
| [zephyr-220m-dpo-full.q5_k_m.gguf](https://huggingface.co/afrideva/zephyr-220m-dpo-full-GGUF/resolve/main/zephyr-220m-dpo-full.q5_k_m.gguf) | q5_k_m | 157.91 MB |
| [zephyr-220m-dpo-full.q6_k.gguf](https://huggingface.co/afrideva/zephyr-220m-dpo-full-GGUF/resolve/main/zephyr-220m-dpo-full.q6_k.gguf) | q6_k | 179.52 MB |
| [zephyr-220m-dpo-full.q8_0.gguf](https://huggingface.co/afrideva/zephyr-220m-dpo-full-GGUF/resolve/main/zephyr-220m-dpo-full.q8_0.gguf) | q8_0 | 232.28 MB |
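## Usage (llama-cpp-python)

A minimal sketch of downloading one of the files above and running it locally. The choice of llama-cpp-python, the `q4_k_m` quant, the context size, and the Zephyr-style chat markup are illustrative assumptions, not recommendations from this repo:

```python
# pip install huggingface_hub llama-cpp-python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch one quant from this repo; any filename from the table above works.
model_path = hf_hub_download(
    repo_id="afrideva/zephyr-220m-dpo-full-GGUF",
    filename="zephyr-220m-dpo-full.q4_k_m.gguf",
)

# Load the GGUF file on CPU with a modest context window.
llm = Llama(model_path=model_path, n_ctx=2048)

# Assumed Zephyr-style prompt format (inherited from the SFT base model);
# verify against the upstream tokenizer config before relying on it.
prompt = (
    "<|system|>\nYou are a helpful assistant.</s>\n"
    "<|user|>\nExplain GGUF quantization in one sentence.</s>\n"
    "<|assistant|>\n"
)
out = llm(prompt, max_tokens=128, stop=["</s>"])
print(out["choices"][0]["text"])
```

As a rule of thumb from the table, the smaller quants (q2_k, q3_k_m) trade output quality for memory, while q8_0 stays closest to the fp16 file at roughly half its size.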
## Original Model Card:

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# zephyr-220m-dpo-full

This model is a fine-tuned version of [amazingvince/zephyr-220m-sft-full](https://huggingface.co/amazingvince/zephyr-220m-sft-full) on the HuggingFaceH4/ultrafeedback_binarized dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5608
- Rewards/chosen: 0.4691
- Rewards/rejected: -0.0455
- Rewards/accuracies: 0.6930
- Rewards/margins: 0.5145
- Logps/rejected: -438.4595
- Logps/chosen: -544.6858
- Logits/rejected: -4.0092
- Logits/chosen: -3.9839
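As a key to these columns (assuming the usual DPO bookkeeping, e.g. trl's `DPOTrainer`; the card itself does not name the implementation), the implicit reward of a completion is the scaled log-probability ratio between the tuned policy and the frozen reference model:

$$
r_\theta(x, y) = \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\text{ref}}(y \mid x)},
\qquad
\mathcal{L}_{\text{DPO}} = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}\!\left[\log \sigma\big(r_\theta(x, y_w) - r_\theta(x, y_l)\big)\right]
$$

`Rewards/chosen` and `Rewards/rejected` are the mean implicit rewards of the preferred and rejected completions, `Rewards/margins` is the mean difference between them, and `Rewards/accuracies` is the fraction of pairs where the chosen completion scores higher; the value of the scaling coefficient β is not reported in this card.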
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch of the equivalent trainer arguments follows the list):
- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 16
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
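A hypothetical reconstruction of these settings as `transformers.TrainingArguments` (this is not the author's training script; the Adam betas and epsilon listed above are also the library defaults):

```python
# pip install transformers
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="zephyr-220m-dpo-full",  # hypothetical output path
    learning_rate=5e-7,
    per_device_train_batch_size=8,      # x 2 GPUs -> total_train_batch_size 16
    per_device_eval_batch_size=4,       # x 2 GPUs -> total_eval_batch_size 8
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=1,
)
```

The DPO-specific pieces (the β coefficient and the frozen reference model) live in the preference trainer itself and are not recorded in this card.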
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6906 | 0.03 | 100 | 0.6932 | 0.0008 | 0.0007 | 0.4860 | 0.0002 | -437.9984 | -549.3683 | -4.0893 | -4.0515 |
| 0.6844 | 0.05 | 200 | 0.6855 | 0.0323 | 0.0173 | 0.5640 | 0.0150 | -437.8319 | -549.0540 | -4.0871 | -4.0501 |
| 0.6685 | 0.08 | 300 | 0.6675 | 0.1075 | 0.0537 | 0.6160 | 0.0538 | -437.4682 | -548.3016 | -4.0788 | -4.0432 |
| 0.6579 | 0.1 | 400 | 0.6426 | 0.2153 | 0.0941 | 0.6430 | 0.1212 | -437.0637 | -547.2234 | -4.0645 | -4.0309 |
| 0.6331 | 0.13 | 500 | 0.6241 | 0.2980 | 0.1106 | 0.6430 | 0.1874 | -436.8989 | -546.3970 | -4.0525 | -4.0221 |
| 0.6229 | 0.15 | 600 | 0.6138 | 0.3428 | 0.1103 | 0.6580 | 0.2325 | -436.9023 | -545.9487 | -4.0402 | -4.0116 |
| 0.6008 | 0.18 | 700 | 0.6053 | 0.3822 | 0.0970 | 0.6560 | 0.2852 | -437.0354 | -545.5550 | -4.0301 | -4.0042 |
| 0.5751 | 0.21 | 800 | 0.5998 | 0.4077 | 0.0879 | 0.6540 | 0.3198 | -437.1260 | -545.2994 | -4.0359 | -4.0099 |
| 0.6485 | 0.23 | 900 | 0.5922 | 0.4208 | 0.0655 | 0.6600 | 0.3553 | -437.3501 | -545.1683 | -4.0167 | -3.9936 |
| 0.6164 | 0.26 | 1000 | 0.5880 | 0.4046 | 0.0287 | 0.6620 | 0.3759 | -437.7182 | -545.3309 | -4.0092 | -3.9869 |
| 0.6225 | 0.28 | 1100 | 0.5852 | 0.4058 | 0.0110 | 0.6680 | 0.3948 | -437.8951 | -545.3189 | -4.0240 | -3.9984 |
| 0.6289 | 0.31 | 1200 | 0.5824 | 0.4127 | 0.0078 | 0.6670 | 0.4048 | -437.9265 | -545.2498 | -4.0253 | -3.9994 |
| 0.5818 | 0.34 | 1300 | 0.5818 | 0.4222 | 0.0097 | 0.6680 | 0.4125 | -437.9080 | -545.1544 | -4.0212 | -3.9953 |
| 0.567 | 0.36 | 1400 | 0.5797 | 0.4098 | -0.0141 | 0.6730 | 0.4238 | -438.1456 | -545.2791 | -4.0333 | -4.0062 |
| 0.5659 | 0.39 | 1500 | 0.5790 | 0.4204 | -0.0154 | 0.6780 | 0.4358 | -438.1591 | -545.1725 | -4.0245 | -3.9963 |
| 0.5993 | 0.41 | 1600 | 0.5783 | 0.4161 | -0.0285 | 0.6720 | 0.4446 | -438.2904 | -545.2161 | -4.0185 | -3.9907 |
| 0.5999 | 0.44 | 1700 | 0.5767 | 0.4067 | -0.0468 | 0.6840 | 0.4535 | -438.4729 | -545.3095 | -4.0207 | -3.9935 |
| 0.6004 | 0.46 | 1800 | 0.5731 | 0.4233 | -0.0394 | 0.6830 | 0.4627 | -438.3991 | -545.1437 | -4.0219 | -3.9944 |
| 0.5349 | 0.49 | 1900 | 0.5720 | 0.4285 | -0.0429 | 0.6830 | 0.4714 | -438.4335 | -545.0914 | -4.0295 | -4.0012 |
| 0.5377 | 0.52 | 2000 | 0.5702 | 0.4255 | -0.0540 | 0.6850 | 0.4795 | -438.5449 | -545.1220 | -4.0290 | -4.0009 |
| 0.4988 | 0.54 | 2100 | 0.5713 | 0.4347 | -0.0548 | 0.6840 | 0.4895 | -438.5533 | -545.0299 | -4.0317 | -4.0039 |
| 0.6093 | 0.57 | 2200 | 0.5706 | 0.4464 | -0.0456 | 0.6810 | 0.4920 | -438.4607 | -544.9128 | -4.0288 | -4.0014 |
| 0.5356 | 0.59 | 2300 | 0.5689 | 0.4484 | -0.0486 | 0.6880 | 0.4971 | -438.4912 | -544.8922 | -4.0257 | -3.9986 |
| 0.5753 | 0.62 | 2400 | 0.5681 | 0.4596 | -0.0441 | 0.6850 | 0.5037 | -438.4457 | -544.7802 | -4.0100 | -3.9846 |
| 0.5709 | 0.65 | 2500 | 0.5673 | 0.4693 | -0.0387 | 0.6910 | 0.5081 | -438.3924 | -544.6835 | -4.0100 | -3.9849 |
| 0.5565 | 0.67 | 2600 | 0.5665 | 0.4692 | -0.0401 | 0.6820 | 0.5092 | -438.4054 | -544.6850 | -4.0096 | -3.9843 |
| 0.585 | 0.7 | 2700 | 0.5650 | 0.4780 | -0.0351 | 0.6940 | 0.5131 | -438.3558 | -544.5962 | -4.0074 | -3.9820 |
| 0.5883 | 0.72 | 2800 | 0.5670 | 0.4914 | -0.0151 | 0.6880 | 0.5066 | -438.1562 | -544.4624 | -3.9894 | -3.9669 |
| 0.624 | 0.75 | 2900 | 0.5663 | 0.4877 | -0.0191 | 0.6840 | 0.5068 | -438.1958 | -544.4997 | -3.9935 | -3.9705 |
| 0.5347 | 0.77 | 3000 | 0.5644 | 0.4757 | -0.0335 | 0.6850 | 0.5092 | -438.3401 | -544.6199 | -4.0019 | -3.9777 |
| 0.5837 | 0.8 | 3100 | 0.5637 | 0.4783 | -0.0302 | 0.6830 | 0.5085 | -438.3073 | -544.5936 | -3.9976 | -3.9742 |
| 0.5293 | 0.83 | 3200 | 0.5634 | 0.4715 | -0.0363 | 0.6890 | 0.5078 | -438.3679 | -544.6616 | -4.0023 | -3.9778 |
| 0.5128 | 0.85 | 3300 | 0.5620 | 0.4745 | -0.0387 | 0.6880 | 0.5131 | -438.3917 | -544.6319 | -4.0053 | -3.9804 |
| 0.6204 | 0.88 | 3400 | 0.5625 | 0.4679 | -0.0442 | 0.6860 | 0.5121 | -438.4469 | -544.6978 | -4.0067 | -3.9815 |
| 0.5469 | 0.9 | 3500 | 0.5618 | 0.4612 | -0.0491 | 0.6860 | 0.5102 | -438.4956 | -544.7651 | -4.0098 | -3.9843 |
| 0.5807 | 0.93 | 3600 | 0.5615 | 0.4675 | -0.0454 | 0.6890 | 0.5129 | -438.4584 | -544.7015 | -4.0068 | -3.9818 |
| 0.5265 | 0.96 | 3700 | 0.5620 | 0.4675 | -0.0435 | 0.6880 | 0.5110 | -438.4403 | -544.7019 | -4.0082 | -3.9833 |
| 0.5484 | 0.98 | 3800 | 0.5615 | 0.4685 | -0.0449 | 0.6930 | 0.5133 | -438.4536 | -544.6919 | -4.0103 | -3.9851 |
### Framework versions

- Transformers 4.37.0.dev0
- PyTorch 2.1.2+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0

Training logs: https://wandb.ai/amazingvince/huggingface/runs/z71h0hc3?workspace=user-amazingvince