---
license: mit
base_model: databricks/dolly-v2-7b
tags:
- generated_from_trainer
model-index:
- name: dolly-v2-7b-dpo-full-3-epoch-hydrox-safe
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# dolly-v2-7b-dpo-full-3-epoch-hydrox-safe

This model is a fine-tuned version of [databricks/dolly-v2-7b](https://huggingface.co/databricks/dolly-v2-7b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0060
- Rewards/chosen: 3.6820
- Rewards/rejected: -10.6709
- Rewards/accuracies: 0.9966
- Rewards/margins: 14.3529
- Logps/rejected: -666.2253
- Logps/chosen: -383.1022
- Logits/rejected: -1.3595
- Logits/chosen: -1.5884
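
The reward figures above are related by simple arithmetic. As a minimal sketch, assuming the metrics follow the common DPO logging convention (as in TRL's `DPOTrainer`, which this card does not name explicitly), the margin is the chosen reward minus the rejected reward. The variable names are illustrative only.

```python
# Evaluation-set values copied from the list above.
rewards_chosen = 3.6820
rewards_rejected = -10.6709
reported_margin = 14.3529

# Assumed convention: margin = chosen reward - rejected reward.
margin = rewards_chosen - rewards_rejected

print(f"computed margin: {margin:.4f}")
print(f"matches reported value: {abs(margin - reported_margin) < 1e-4}")
```

A large positive margin together with a reward accuracy near 1.0 indicates that the model ranks the chosen response above the rejected one on almost every evaluation pair.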

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 64
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
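
The batch totals and the learning-rate schedule follow from the values above. The sketch below is an illustration under stated assumptions, not the training script: it assumes no gradient accumulation (none is listed), and it uses 8,800, the last step logged in the results table, as a stand-in for the true total step count, which the card does not state. `linear_lr` is a hypothetical helper that mirrors the usual linear warmup and decay shape.

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 8
eval_batch_size = 4
num_devices = 8
peak_lr = 5e-07
warmup_ratio = 0.1

# Assumption: no gradient accumulation (the card does not list any).
grad_accum_steps = 1

total_train = train_batch_size * num_devices * grad_accum_steps
total_eval = eval_batch_size * num_devices


def linear_lr(step, total_steps, warmup_steps, peak):
    """Linear warmup from 0 to `peak`, then linear decay back to 0."""
    if step < warmup_steps:
        return peak * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak * remaining / max(1, total_steps - warmup_steps)


# Assumption: 8,800 is only the last logged step, used here as a stand-in.
total_steps = 8800
warmup_steps = math.ceil(total_steps * warmup_ratio)

print("total train batch size:", total_train)
print("total eval batch size:", total_eval)
for step in (0, warmup_steps, total_steps):
    print(step, linear_lr(step, total_steps, warmup_steps, peak_lr))
```

Under these assumptions the learning rate peaks at 5e-07 after roughly the first tenth of training and decays linearly to zero by the final step.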

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.873         | 0.03  | 100  | 0.7547          | 0.2247         | -0.0238          | 0.5918             | 0.2485          | -559.7548      | -417.6762    | -1.1954         | -1.4929       |
| 0.6069        | 0.07  | 200  | 0.5675          | 0.7273         | -0.0433          | 0.7407             | 0.7706          | -559.9498      | -412.6499    | -1.1932         | -1.4956       |
| 0.3668        | 0.1   | 300  | 0.3913          | 1.3855         | -0.2301          | 0.8552             | 1.6156          | -561.8173      | -406.0676    | -1.1768         | -1.4862       |
| 0.2547        | 0.14  | 400  | 0.2942          | 2.0422         | -0.3359          | 0.8897             | 2.3781          | -562.875       | -399.5007    | -1.1603         | -1.4768       |
| 0.2496        | 0.17  | 500  | 0.2323          | 2.5759         | -0.5597          | 0.9184             | 3.1357          | -565.1138      | -394.1635    | -1.1394         | -1.4661       |
| 0.2099        | 0.2   | 600  | 0.1979          | 3.0353         | -0.7414          | 0.9242             | 3.7767          | -566.9301      | -389.5694    | -1.1137         | -1.4513       |
| 0.123         | 0.24  | 700  | 0.1624          | 3.4398         | -1.1264          | 0.9436             | 4.5662          | -570.7800      | -385.5248    | -1.1147         | -1.4531       |
| 0.1211        | 0.27  | 800  | 0.1404          | 3.7877         | -1.3826          | 0.9453             | 5.1703          | -573.3425      | -382.0456    | -1.1126         | -1.4555       |
| 0.1398        | 0.31  | 900  | 0.1305          | 4.1188         | -1.5720          | 0.9545             | 5.6908          | -575.2359      | -378.7344    | -1.1145         | -1.4568       |
| 0.1161        | 0.34  | 1000 | 0.1066          | 4.3055         | -1.7418          | 0.9646             | 6.0473          | -576.9345      | -376.8678    | -1.1217         | -1.4605       |
| 0.1109        | 0.37  | 1100 | 0.1006          | 4.4233         | -2.0049          | 0.9621             | 6.4282          | -579.5653      | -375.6897    | -1.1334         | -1.4683       |
| 0.0983        | 0.41  | 1200 | 0.0881          | 4.3080         | -2.5628          | 0.9638             | 6.8708          | -585.1442      | -376.8426    | -1.1544         | -1.4813       |
| 0.0965        | 0.44  | 1300 | 0.0778          | 4.3457         | -2.6685          | 0.9621             | 7.0142          | -586.2010      | -376.4656    | -1.1651         | -1.4955       |
| 0.0542        | 0.48  | 1400 | 0.0705          | 4.3768         | -3.1529          | 0.9739             | 7.5297          | -591.0455      | -376.1544    | -1.1767         | -1.4972       |
| 0.053         | 0.51  | 1500 | 0.0659          | 4.4009         | -3.2268          | 0.9781             | 7.6278          | -591.7845      | -375.9133    | -1.1797         | -1.5057       |
| 0.0653        | 0.54  | 1600 | 0.0680          | 4.3566         | -3.1994          | 0.9781             | 7.5559          | -591.5099      | -376.3570    | -1.1682         | -1.4980       |
| 0.0634        | 0.58  | 1700 | 0.0553          | 4.2444         | -3.7967          | 0.9764             | 8.0411          | -597.4832      | -377.4786    | -1.1988         | -1.5176       |
| 0.0574        | 0.61  | 1800 | 0.0490          | 4.3076         | -3.9340          | 0.9790             | 8.2416          | -598.8566      | -376.8465    | -1.2222         | -1.5299       |
| 0.0518        | 0.65  | 1900 | 0.0424          | 4.2820         | -4.0325          | 0.9857             | 8.3145          | -599.8412      | -377.1030    | -1.2285         | -1.5390       |
| 0.0376        | 0.68  | 2000 | 0.0423          | 4.2925         | -4.1097          | 0.9840             | 8.4022          | -600.6129      | -376.9975    | -1.2280         | -1.5311       |
| 0.0339        | 0.71  | 2100 | 0.0424          | 4.2423         | -4.3969          | 0.9882             | 8.6393          | -603.4858      | -377.4996    | -1.2371         | -1.5422       |
| 0.0323        | 0.75  | 2200 | 0.0418          | 4.3016         | -4.3550          | 0.9832             | 8.6566          | -603.0663      | -376.9068    | -1.2198         | -1.5286       |
| 0.0267        | 0.78  | 2300 | 0.0386          | 4.1635         | -4.6663          | 0.9882             | 8.8297          | -606.1791      | -378.2882    | -1.2158         | -1.5230       |
| 0.0296        | 0.82  | 2400 | 0.0316          | 3.9990         | -5.4019          | 0.9907             | 9.4009          | -613.5353      | -379.9330    | -1.2347         | -1.5268       |
| 0.0289        | 0.85  | 2500 | 0.0315          | 4.1064         | -5.2099          | 0.9907             | 9.3164          | -611.6158      | -378.8586    | -1.2152         | -1.5109       |
| 0.0326        | 0.88  | 2600 | 0.0280          | 4.0899         | -5.4030          | 0.9907             | 9.4929          | -613.5463      | -379.0233    | -1.2434         | -1.5354       |
| 0.025         | 0.92  | 2700 | 0.0333          | 4.0463         | -5.2395          | 0.9924             | 9.2857          | -611.9110      | -379.4600    | -1.2283         | -1.5268       |
| 0.0273        | 0.95  | 2800 | 0.0259          | 4.1046         | -5.4947          | 0.9975             | 9.5993          | -614.4639      | -378.8770    | -1.2253         | -1.5271       |
| 0.0197        | 0.99  | 2900 | 0.0360          | 4.1642         | -5.3436          | 0.9907             | 9.5078          | -612.9525      | -378.2808    | -1.2320         | -1.5321       |
| 0.0196        | 1.02  | 3000 | 0.0267          | 3.8748         | -5.9868          | 0.9949             | 9.8616          | -619.3846      | -381.1749    | -1.2358         | -1.5308       |
| 0.0188        | 1.05  | 3100 | 0.0268          | 3.8452         | -6.0908          | 0.9949             | 9.9361          | -620.4247      | -381.4705    | -1.2365         | -1.5361       |
| 0.0172        | 1.09  | 3200 | 0.0231          | 3.7735         | -6.3630          | 0.9907             | 10.1365         | -623.1463      | -382.1877    | -1.2627         | -1.5561       |
| 0.0099        | 1.12  | 3300 | 0.0218          | 3.7491         | -6.5816          | 0.9958             | 10.3307         | -625.3326      | -382.4322    | -1.2410         | -1.5316       |
| 0.0113        | 1.16  | 3400 | 0.0189          | 3.7109         | -6.6907          | 0.9958             | 10.4017         | -626.4235      | -382.8133    | -1.2519         | -1.5387       |
| 0.0146        | 1.19  | 3500 | 0.0191          | 3.6138         | -7.1128          | 0.9941             | 10.7266         | -630.6445      | -383.7852    | -1.2702         | -1.5462       |
| 0.0108        | 1.22  | 3600 | 0.0175          | 3.5940         | -7.3181          | 0.9949             | 10.9121         | -632.6978      | -383.9829    | -1.2642         | -1.5481       |
| 0.0175        | 1.26  | 3700 | 0.0183          | 3.4786         | -7.8254          | 0.9949             | 11.3039         | -637.7700      | -385.1370    | -1.2904         | -1.5503       |
| 0.0147        | 1.29  | 3800 | 0.0153          | 3.2734         | -8.1715          | 0.9966             | 11.4449         | -641.2316      | -387.1888    | -1.3082         | -1.5667       |
| 0.0113        | 1.33  | 3900 | 0.0153          | 3.3033         | -8.3504          | 0.9966             | 11.6537         | -643.0201      | -386.8899    | -1.2907         | -1.5525       |
| 0.0284        | 1.36  | 4000 | 0.0270          | 3.5241         | -8.1571          | 0.9924             | 11.6812         | -641.0871      | -384.6817    | -1.2917         | -1.5474       |
| 0.0101        | 1.39  | 4100 | 0.0138          | 3.3142         | -8.9443          | 0.9941             | 12.2585         | -648.9590      | -386.7809    | -1.3039         | -1.5402       |
| 0.0093        | 1.43  | 4200 | 0.0159          | 3.3533         | -9.0499          | 0.9966             | 12.4032         | -650.0153      | -386.3899    | -1.3067         | -1.5543       |
| 0.0083        | 1.46  | 4300 | 0.0149          | 3.4209         | -8.8296          | 0.9958             | 12.2505         | -647.8128      | -385.7142    | -1.3104         | -1.5558       |
| 0.0068        | 1.5   | 4400 | 0.0123          | 3.2700         | -9.3033          | 0.9975             | 12.5733         | -652.5496      | -387.2229    | -1.3257         | -1.5680       |
| 0.0093        | 1.53  | 4500 | 0.0122          | 3.5894         | -8.8354          | 0.9983             | 12.4248         | -647.8701      | -384.0288    | -1.3217         | -1.5701       |
| 0.0065        | 1.56  | 4600 | 0.0117          | 3.4515         | -8.7814          | 0.9975             | 12.2329         | -647.3306      | -385.4080    | -1.3381         | -1.5838       |
| 0.0132        | 1.6   | 4700 | 0.0119          | 3.4540         | -8.4518          | 0.9975             | 11.9058         | -644.0345      | -385.3825    | -1.3352         | -1.5862       |
| 0.0085        | 1.63  | 4800 | 0.0113          | 3.3970         | -8.7353          | 0.9966             | 12.1323         | -646.8692      | -385.9526    | -1.3331         | -1.5766       |
| 0.0096        | 1.67  | 4900 | 0.0121          | 3.2728         | -9.0713          | 0.9966             | 12.3442         | -650.2295      | -387.1943    | -1.3552         | -1.5969       |
| 0.0042        | 1.7   | 5000 | 0.0106          | 3.1699         | -9.4193          | 0.9975             | 12.5892         | -653.7093      | -388.2237    | -1.3307         | -1.5739       |
| 0.0116        | 1.73  | 5100 | 0.0096          | 3.2716         | -9.0292          | 0.9958             | 12.3008         | -649.8085      | -387.2067    | -1.3274         | -1.5748       |
| 0.0093        | 1.77  | 5200 | 0.0103          | 3.2228         | -9.3477          | 0.9983             | 12.5706         | -652.9938      | -387.6946    | -1.3153         | -1.5495       |
| 0.0058        | 1.8   | 5300 | 0.0103          | 3.1251         | -9.6052          | 0.9966             | 12.7303         | -655.5681      | -388.6714    | -1.3273         | -1.5594       |
| 0.0066        | 1.84  | 5400 | 0.0094          | 3.5167         | -9.0559          | 0.9983             | 12.5726         | -650.0754      | -384.7553    | -1.3330         | -1.5721       |
| 0.0038        | 1.87  | 5500 | 0.0093          | 3.5884         | -9.0262          | 0.9983             | 12.6146         | -649.7783      | -384.0386    | -1.3171         | -1.5599       |
| 0.0134        | 1.9   | 5600 | 0.0093          | 3.0874         | -9.8027          | 0.9983             | 12.8901         | -657.5432      | -389.0488    | -1.3368         | -1.5645       |
| 0.0059        | 1.94  | 5700 | 0.0098          | 3.4393         | -9.7104          | 0.9975             | 13.1497         | -656.6204      | -385.5294    | -1.3526         | -1.5716       |
| 0.0057        | 1.97  | 5800 | 0.0080          | 3.5892         | -9.4003          | 0.9983             | 12.9896         | -653.5198      | -384.0307    | -1.3593         | -1.5880       |
| 0.0015        | 2.01  | 5900 | 0.0102          | 3.4266         | -9.8551          | 0.9966             | 13.2816         | -658.0669      | -385.6569    | -1.3552         | -1.5837       |
| 0.0019        | 2.04  | 6000 | 0.0105          | 3.5092         | -9.9457          | 0.9983             | 13.4549         | -658.9734      | -384.8311    | -1.3418         | -1.5734       |
| 0.0049        | 2.07  | 6100 | 0.0083          | 3.4872         | -10.1039         | 0.9983             | 13.5911         | -660.5549      | -385.0504    | -1.3269         | -1.5633       |
| 0.0056        | 2.11  | 6200 | 0.0089          | 3.3922         | -10.3713         | 0.9975             | 13.7635         | -663.2297      | -386.0008    | -1.3437         | -1.5700       |
| 0.0041        | 2.14  | 6300 | 0.0078          | 3.5705         | -10.1344         | 0.9983             | 13.7049         | -660.8607      | -384.2182    | -1.3527         | -1.5831       |
| 0.0039        | 2.18  | 6400 | 0.0092          | 3.3798         | -10.7994         | 0.9975             | 14.1792         | -667.5103      | -386.1252    | -1.3748         | -1.5843       |
| 0.0018        | 2.21  | 6500 | 0.0076          | 3.5825         | -10.5328         | 0.9983             | 14.1153         | -664.8441      | -384.0977    | -1.3583         | -1.5744       |
| 0.0037        | 2.24  | 6600 | 0.0075          | 3.5553         | -10.3432         | 0.9983             | 13.8984         | -662.9481      | -384.3702    | -1.3604         | -1.5848       |
| 0.0021        | 2.28  | 6700 | 0.0082          | 3.7310         | -10.3324         | 0.9992             | 14.0634         | -662.8404      | -382.6127    | -1.3437         | -1.5693       |
| 0.0025        | 2.31  | 6800 | 0.0074          | 3.5582         | -10.6710         | 0.9975             | 14.2292         | -666.2263      | -384.3409    | -1.3487         | -1.5658       |
| 0.0112        | 2.35  | 6900 | 0.0076          | 3.5915         | -10.7786         | 0.9966             | 14.3700         | -667.3019      | -384.0081    | -1.3470         | -1.5688       |
| 0.0022        | 2.38  | 7000 | 0.0080          | 3.6060         | -10.6007         | 0.9975             | 14.2067         | -665.5234      | -383.8625    | -1.3536         | -1.5774       |
| 0.0012        | 2.41  | 7100 | 0.0063          | 3.5627         | -10.8773         | 0.9975             | 14.4400         | -668.2891      | -384.2953    | -1.3445         | -1.5681       |
| 0.0018        | 2.45  | 7200 | 0.0070          | 3.4237         | -11.0692         | 0.9975             | 14.4928         | -670.2083      | -385.6862    | -1.3656         | -1.5819       |
| 0.0084        | 2.48  | 7300 | 0.0079          | 3.7091         | -10.3477         | 0.9983             | 14.0569         | -662.9936      | -382.8314    | -1.3539         | -1.5873       |
| 0.0031        | 2.52  | 7400 | 0.0064          | 3.5680         | -10.4848         | 0.9983             | 14.0528         | -664.3639      | -384.2423    | -1.3510         | -1.5829       |
| 0.0027        | 2.55  | 7500 | 0.0069          | 3.5130         | -10.6612         | 0.9983             | 14.1741         | -666.1280      | -384.7932    | -1.3666         | -1.5947       |
| 0.0051        | 2.58  | 7600 | 0.0066          | 3.5461         | -10.7595         | 0.9983             | 14.3056         | -667.1109      | -384.4612    | -1.3600         | -1.5872       |
| 0.001         | 2.62  | 7700 | 0.0076          | 3.5633         | -10.7690         | 0.9983             | 14.3323         | -667.2067      | -384.2903    | -1.3486         | -1.5750       |
| 0.0021        | 2.65  | 7800 | 0.0066          | 3.6662         | -10.7670         | 0.9983             | 14.4332         | -667.1862      | -383.2607    | -1.3604         | -1.5892       |
| 0.004         | 2.69  | 7900 | 0.0067          | 3.7915         | -10.4856         | 0.9983             | 14.2771         | -664.3723      | -382.0074    | -1.3540         | -1.5830       |
| 0.0022        | 2.72  | 8000 | 0.0066          | 3.8259         | -10.5371         | 0.9983             | 14.3630         | -664.8873      | -381.6641    | -1.3510         | -1.5812       |
| 0.0018        | 2.75  | 8100 | 0.0071          | 3.7228         | -10.6783         | 0.9983             | 14.4011         | -666.2990      | -382.6946    | -1.3470         | -1.5789       |
| 0.0015        | 2.79  | 8200 | 0.0065          | 3.7032         | -10.7685         | 0.9983             | 14.4717         | -667.2010      | -382.8909    | -1.3501         | -1.5791       |
| 0.0015        | 2.82  | 8300 | 0.0072          | 3.7173         | -10.7747         | 0.9975             | 14.4920         | -667.2634      | -382.7499    | -1.3574         | -1.5864       |
| 0.0016        | 2.86  | 8400 | 0.0064          | 3.7268         | -10.7100         | 0.9983             | 14.4368         | -666.6169      | -382.6550    | -1.3626         | -1.5899       |
| 0.0022        | 2.89  | 8500 | 0.0062          | 3.6175         | -10.9238         | 0.9992             | 14.5413         | -668.7542      | -383.7477    | -1.3531         | -1.5792       |
| 0.0032        | 2.92  | 8600 | 0.0059          | 3.7174         | -10.7382         | 0.9983             | 14.4556         | -666.8983      | -382.7484    | -1.3576         | -1.5869       |
| 0.0035        | 2.96  | 8700 | 0.0062          | 3.5749         | -11.0859         | 0.9975             | 14.6608         | -670.3754      | -384.1739    | -1.3459         | -1.5728       |
| 0.0024        | 2.99  | 8800 | 0.0062          | 3.6894         | -10.8042         | 0.9983             | 14.4936         | -667.5587      | -383.0290    | -1.3618         | -1.5891       |
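
To compare checkpoints programmatically, the table rows can be parsed into records. The sketch below uses only five rows copied from the table above; the column keys and the `parse_row` helper are illustrative names, not part of any library.

```python
# Short keys for the twelve table columns, in order (names are illustrative).
COLUMNS = [
    "train_loss", "epoch", "step", "val_loss",
    "rewards_chosen", "rewards_rejected", "rewards_accuracies",
    "rewards_margins", "logps_rejected", "logps_chosen",
    "logits_rejected", "logits_chosen",
]

# A subset of rows copied verbatim from the table above.
TABLE_ROWS = """
| 0.873 | 0.03 | 100 | 0.7547 | 0.2247 | -0.0238 | 0.5918 | 0.2485 | -559.7548 | -417.6762 | -1.1954 | -1.4929 |
| 0.0197 | 0.99 | 2900 | 0.0360 | 4.1642 | -5.3436 | 0.9907 | 9.5078 | -612.9525 | -378.2808 | -1.2320 | -1.5321 |
| 0.0057 | 1.97 | 5800 | 0.0080 | 3.5892 | -9.4003 | 0.9983 | 12.9896 | -653.5198 | -384.0307 | -1.3593 | -1.5880 |
| 0.0032 | 2.92 | 8600 | 0.0059 | 3.7174 | -10.7382 | 0.9983 | 14.4556 | -666.8983 | -382.7484 | -1.3576 | -1.5869 |
| 0.0024 | 2.99 | 8800 | 0.0062 | 3.6894 | -10.8042 | 0.9983 | 14.4936 | -667.5587 | -383.0290 | -1.3618 | -1.5891 |
"""


def parse_row(line):
    """Split one Markdown table row into a dict of floats keyed by COLUMNS."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    return dict(zip(COLUMNS, (float(cell) for cell in cells)))


records = [parse_row(line) for line in TABLE_ROWS.strip().splitlines()]

# Pick the checkpoint with the lowest validation loss among the parsed rows.
best = min(records, key=lambda r: r["val_loss"])
print(int(best["step"]), best["val_loss"])
```

Among these rows, and in the full table, the lowest validation loss (0.0059) is logged at step 8600.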


### Framework versions

- Transformers 4.35.0
- PyTorch 2.1.1+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1