wzhouad committed
Commit bc43d54
1 Parent(s): cb48cc2

Model save

README.md CHANGED
@@ -14,16 +14,6 @@ should probably proofread and complete it, then remove this comment. -->
  # zephyr-7b-dpo-full
 
  This model was trained from scratch on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.5286
- - Rewards/chosen: -1.7068
- - Rewards/rejected: -3.1572
- - Rewards/accuracies: 0.7695
- - Rewards/margins: 1.4504
- - Logps/rejected: -627.3446
- - Logps/chosen: -474.2680
- - Logits/rejected: -0.7503
- - Logits/chosen: -0.5802
 
  ## Model description
 
@@ -43,12 +33,12 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 1e-06
- - train_batch_size: 4
+ - train_batch_size: 2
  - eval_batch_size: 8
  - seed: 5
  - distributed_type: multi-GPU
  - num_devices: 8
- - gradient_accumulation_steps: 4
+ - gradient_accumulation_steps: 8
  - total_train_batch_size: 128
  - total_eval_batch_size: 64
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
@@ -58,17 +48,6 @@ The following hyperparameters were used during training:
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
- |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
- | 0.6254 | 0.21 | 100 | 0.6260 | -0.1988 | -0.5392 | 0.6992 | 0.3404 | -365.5406 | -323.4606 | 0.3870 | 0.3426 |
- | 0.5841 | 0.42 | 200 | 0.5597 | -0.4205 | -1.0478 | 0.7305 | 0.6273 | -416.4040 | -345.6356 | 0.1437 | 0.0907 |
- | 0.5389 | 0.63 | 300 | 0.5285 | -0.6859 | -1.5998 | 0.7773 | 0.9140 | -471.6094 | -372.1726 | 0.2331 | 0.2134 |
- | 0.5188 | 0.84 | 400 | 0.5197 | -0.7311 | -1.7606 | 0.7852 | 1.0295 | -487.6861 | -376.6970 | -0.1165 | -0.1000 |
- | 0.3402 | 1.05 | 500 | 0.5344 | -1.5025 | -2.9522 | 0.7773 | 1.4497 | -606.8411 | -453.8337 | -0.4375 | -0.3802 |
- | 0.3141 | 1.26 | 600 | 0.5426 | -1.7806 | -3.3337 | 0.7539 | 1.5531 | -644.9940 | -481.6454 | -0.6733 | -0.5363 |
- | 0.3324 | 1.47 | 700 | 0.5322 | -1.7213 | -3.2147 | 0.7773 | 1.4934 | -633.0936 | -475.7130 | -0.9457 | -0.7473 |
- | 0.3372 | 1.67 | 800 | 0.5313 | -1.7652 | -3.2295 | 0.7656 | 1.4643 | -634.5750 | -480.1067 | -0.7581 | -0.5822 |
- | 0.3058 | 1.88 | 900 | 0.5286 | -1.7068 | -3.1572 | 0.7695 | 1.4504 | -627.3446 | -474.2680 | -0.7503 | -0.5802 |
 
 
  ### Framework versions
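Note that the README change keeps the effective batch size constant: the per-device batch size is halved while gradient accumulation is doubled, with 8 GPUs throughout. A minimal sketch of that arithmetic (the function and variable names are illustrative, not taken from the training code):

```python
# Effective train batch size = per-device batch * number of GPUs * gradient accumulation steps.
def total_train_batch_size(per_device: int, num_devices: int, grad_accum_steps: int) -> int:
    return per_device * num_devices * grad_accum_steps

old = total_train_batch_size(per_device=4, num_devices=8, grad_accum_steps=4)  # 4 * 8 * 4 = 128
new = total_train_batch_size(per_device=2, num_devices=8, grad_accum_steps=8)  # 2 * 8 * 8 = 128
assert old == new == 128  # matches "total_train_batch_size: 128" in the model card
```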
all_results.json CHANGED
@@ -1,8 +1,8 @@
  {
  "epoch": 2.0,
- "train_loss": 0.44928585458351633,
- "train_runtime": 8755.9554,
- "train_samples": 61134,
- "train_samples_per_second": 13.964,
- "train_steps_per_second": 0.109
+ "train_loss": 0.11559809408500558,
+ "train_runtime": 23407.6305,
+ "train_samples": 106682,
+ "train_samples_per_second": 9.115,
+ "train_steps_per_second": 0.071
  }
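The new throughput figures are mutually consistent. A quick check, assuming `train_samples` counts one pass over the training set and using the effective batch size of 128 from the model card:

```python
# Rough consistency check for the updated all_results.json values (illustrative only).
train_samples = 106682
epochs = 2.0
train_runtime = 23407.6305       # seconds
total_train_batch_size = 128     # from the README hyperparameters

samples_per_second = train_samples * epochs / train_runtime
steps_per_second = samples_per_second / total_train_batch_size

print(round(samples_per_second, 3))  # ~9.115, as reported
print(round(steps_per_second, 3))    # ~0.071, as reported
```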
model-00001-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:bb6dae14a8a45dc2dba2faa238cabeb3dd475cf38582ca55e0db21e1bcba298b
+ oid sha256:e0e47c941b4dacd35c121d5e4bc0286cf8b5e81054c59e2d56144e860df9597c
  size 4976698672
model-00002-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d4a9b139ac61337a3f9804165615d4b66c3ac23c8aa785fe62d6ed1e6762d8c7
+ oid sha256:dca0065a230087c3b55705c6a6f7054e64f0b81d4cd9031df0f8aad008ede197
  size 4999802720
model-00003-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:891c89339231c57c3d361d8e3316ff43c01cf327000837bc937ebf6b25a6bc24
+ oid sha256:1d1d4980829c9d3d06ef7a81e02fffa0e40201c4532305ec6a04aba5f5e1b689
  size 4915916176
model-00004-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c1a6e1fa0a8e051df6b214baf25d5e7ecf4e020f3d12403c1e06564d15c9d5dd
+ oid sha256:cdae88e9a153704c05b7398a5c9237f862194fe810bc9fe187286c41d9552058
  size 1168138808
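The weight shards are stored as Git LFS pointers, so only the `oid sha256:` digest changes in each of these diffs while the shard sizes stay the same. After downloading a shard, its integrity can be checked against the pointer's digest; a minimal sketch, assuming the shard sits in the current directory:

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file so a multi-GB shard never has to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Expected digest taken from the updated LFS pointer for the first shard (local path is an assumption).
expected = "e0e47c941b4dacd35c121d5e4bc0286cf8b5e81054c59e2d56144e860df9597c"
assert sha256_of("model-00001-of-00004.safetensors") == expected
```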
train_results.json CHANGED
@@ -1,8 +1,8 @@
  {
  "epoch": 2.0,
- "train_loss": 0.44928585458351633,
- "train_runtime": 8755.9554,
- "train_samples": 61134,
- "train_samples_per_second": 13.964,
- "train_steps_per_second": 0.109
+ "train_loss": 0.11559809408500558,
+ "train_runtime": 23407.6305,
+ "train_samples": 106682,
+ "train_samples_per_second": 9.115,
+ "train_steps_per_second": 0.071
  }
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2a6b1b9121f5e07a38d89a34290e13b8df30cc0da45f64840437b68179a1db9a
+ oid sha256:53e04de78c43c6d34212d0523f5accae0cd42011de8bac7befb2084cd87faa70
  size 6648
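`training_args.bin` is the pickled `TrainingArguments` object that the `transformers` Trainer saves alongside a run, so it can be unpickled locally to inspect the exact configuration behind this commit. A minimal sketch, assuming the file has been downloaded and that `transformers` is installed so the class can be unpickled; recent PyTorch versions require `weights_only=False` to load arbitrary objects, so only do this for files you trust:

```python
import torch

# Unpickle the saved TrainingArguments (trusted file only; needs transformers installed).
args = torch.load("training_args.bin", weights_only=False)

# Spot-check the fields that changed in this commit's README.
print(args.per_device_train_batch_size)   # expected: 2
print(args.gradient_accumulation_steps)   # expected: 8
print(args.learning_rate)                 # expected: 1e-06
```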