lole25 committed
Commit 1d92b30
1 Parent(s): baefeb6

Model save

README.md CHANGED
@@ -1,15 +1,11 @@
  ---
- license: apache-2.0
+ license: mit
  library_name: peft
  tags:
- - alignment-handbook
- - generated_from_trainer
  - trl
  - dpo
  - generated_from_trainer
  base_model: DUAL-GPO/phi-2-gpo-new-i0
- datasets:
- - HuggingFaceH4/ultrafeedback_binarized
  model-index:
  - name: phi-2-gpo-v6-i1
    results: []
@@ -20,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  # phi-2-gpo-v6-i1
 
- This model is a fine-tuned version of [DUAL-GPO/phi-2-gpo-new-i0](https://huggingface.co/DUAL-GPO/phi-2-gpo-new-i0) on the HuggingFaceH4/ultrafeedback_binarized dataset.
+ This model is a fine-tuned version of [DUAL-GPO/phi-2-gpo-new-i0](https://huggingface.co/DUAL-GPO/phi-2-gpo-new-i0) on the None dataset.
 
  ## Model description
 
@@ -44,12 +40,14 @@ The following hyperparameters were used during training:
  - eval_batch_size: 4
  - seed: 42
  - distributed_type: multi-GPU
+ - num_devices: 3
  - gradient_accumulation_steps: 4
- - total_train_batch_size: 16
+ - total_train_batch_size: 48
+ - total_eval_batch_size: 12
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_ratio: 0.1
- - num_epochs: 1
+ - num_epochs: 2
 
  ### Training results
 
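The updated totals are internally consistent: with three devices and four gradient-accumulation steps, a total train batch size of 48 implies a per-device batch size of 4. The per-device value itself is not shown in this hunk, so treating it as 4 is an inference. A minimal sketch of the arithmetic:

```python
# Effective batch size implied by the updated card. The per-device batch
# size is not listed in the diff above; deriving it from the totals is an
# inference, not a value taken from the card.
num_devices = 3
gradient_accumulation_steps = 4
total_train_batch_size = 48

per_device_train_batch_size = total_train_batch_size // (
    num_devices * gradient_accumulation_steps
)
print(per_device_train_batch_size)  # 4

# The old total of 16 matches the same per-device size on a single device:
# 4 * 1 * 4 == 16.
```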
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b7af97ea380712e402312cd49eff36da55468c74765e36c98f52e0e0ac294dff
+ oid sha256:7b259973d90c50e4d7da05879aa5daf8af4b2cb88c49cc1ee7d4b375d3b074cd
  size 167807296
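Only the pointer's oid changes; the size stays at 167807296 bytes, as expected when a LoRA adapter of fixed shape is retrained. A minimal sketch, assuming a locally downloaded copy of the file, for checking it against the oid recorded above:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256, as Git LFS does to compute its oid."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

expected = "7b259973d90c50e4d7da05879aa5daf8af4b2cb88c49cc1ee7d4b375d3b074cd"
# Assumes the adapter file has been downloaded to the current directory.
assert sha256_of("adapter_model.safetensors") == expected
```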
all_results.json CHANGED
@@ -1,8 +1,8 @@
  {
- "epoch": 1.0,
- "train_loss": 0.24134081795175627,
- "train_runtime": 12108.13,
- "train_samples": 21000,
- "train_samples_per_second": 1.734,
- "train_steps_per_second": 0.108
+ "epoch": 2.0,
+ "train_loss": 0.27172684411589915,
+ "train_runtime": 11567.6763,
+ "train_samples": 20000,
+ "train_samples_per_second": 3.458,
+ "train_steps_per_second": 0.072
  }
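The throughput fields follow from the others, assuming `train_samples` counts one epoch (so two epochs cover 40000 examples) and using the total train batch size of 48 from the updated card; a quick check:

```python
epochs = 2.0
train_samples = 20000          # examples per epoch (assumed)
train_runtime = 11567.6763     # seconds
total_train_batch_size = 48    # from the updated README

samples_per_second = epochs * train_samples / train_runtime
steps_per_second = samples_per_second / total_train_batch_size
print(round(samples_per_second, 3))  # 3.458
print(round(steps_per_second, 3))    # 0.072
```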
runs/May15_15-27-49_gpu4-119-5/events.out.tfevents.1715750995.gpu4-119-5.3097660.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:cfcbe207bb9600ba418e6086d5a50b60651ecde83e206b41bacdc355db7289cc
- size 56037
+ oid sha256:63307d23b36188e99c93745be59a64b5614b33715f51682ad83e6f075aeefd92
+ size 58293
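The tfevents file is the TensorBoard log for this run. A sketch of reading its scalars offline; the `train/loss` tag is an assumption about what the trainer logged, so list the available tags first:

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

log = EventAccumulator(
    "runs/May15_15-27-49_gpu4-119-5/"
    "events.out.tfevents.1715750995.gpu4-119-5.3097660.0"
)
log.Reload()
print(log.Tags()["scalars"])             # scalar tags actually present
for event in log.Scalars("train/loss"):  # hypothetical tag name
    print(event.step, event.value)
```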
train_results.json CHANGED
@@ -1,8 +1,8 @@
  {
- "epoch": 1.0,
- "train_loss": 0.24134081795175627,
- "train_runtime": 12108.13,
- "train_samples": 21000,
- "train_samples_per_second": 1.734,
- "train_steps_per_second": 0.108
+ "epoch": 2.0,
+ "train_loss": 0.27172684411589915,
+ "train_runtime": 11567.6763,
+ "train_samples": 20000,
+ "train_samples_per_second": 3.458,
+ "train_steps_per_second": 0.072
  }
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff