cat-searcher committed

Commit f28247d
1 Parent(s): e4e339a

Model save
README.md CHANGED
@@ -2,15 +2,10 @@
  license: gemma
  base_model: google/gemma-1.1-2b-it
  tags:
- - alignment-handbook
- - trl
- - dpo
- - generated_from_trainer
  - trl
  - dpo
+ - alignment-handbook
  - generated_from_trainer
- datasets:
- - cat-searcher/responses-gemma-1.1-2b-it-split-0-evol-mixed-pair
  model-index:
  - name: gemma-1.1-2b-it-sppo-iter0-evol-mixed
    results: []
@@ -19,10 +14,10 @@ model-index:
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/the-dream-machine/huggingface/runs/ciqulebv)
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/the-dream-machine/huggingface/runs/4dtfeber)
  # gemma-1.1-2b-it-sppo-iter0-evol-mixed
 
- This model is a fine-tuned version of [google/gemma-1.1-2b-it](https://huggingface.co/google/gemma-1.1-2b-it) on the cat-searcher/responses-gemma-1.1-2b-it-split-0-evol-mixed-pair dataset.
+ This model is a fine-tuned version of [google/gemma-1.1-2b-it](https://huggingface.co/google/gemma-1.1-2b-it) on an unknown dataset.
 
  ## Model description
 
@@ -53,7 +48,7 @@ The following hyperparameters were used during training:
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_ratio: 0.1
- - num_epochs: 18.0
+ - num_epochs: 36.0
 
  ### Training results
 
all_results.json CHANGED
@@ -1,9 +1,9 @@
  {
- "epoch": 17.954430379746835,
+ "epoch": 35.95443037974684,
  "total_flos": 0.0,
- "train_loss": 44832.452316430485,
- "train_runtime": 5475.4345,
+ "train_loss": 6426.156043451248,
+ "train_runtime": 5724.5218,
  "train_samples": 12624,
- "train_samples_per_second": 41.5,
- "train_steps_per_second": 0.648
+ "train_samples_per_second": 79.389,
+ "train_steps_per_second": 1.239
  }
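As a rough cross-check, the new throughput figures are internally consistent with the reported epoch count and runtime. A minimal sketch, using the values from the "+" side of the diff above; the consistency formula itself is an assumption, not something the commit states:

```python
# Sanity-check the updated all_results.json metrics against one another.
# Values copied from the "+" side of the diff; the formula below is an
# assumed relationship, not part of the commit.
results = {
    "epoch": 35.95443037974684,
    "train_runtime": 5724.5218,        # seconds
    "train_samples": 12624,            # samples per epoch
    "train_samples_per_second": 79.389,
    "train_steps_per_second": 1.239,
}

# Throughput should roughly equal (epochs * samples per epoch) / runtime.
derived = results["epoch"] * results["train_samples"] / results["train_runtime"]
print(f"derived samples/sec: {derived:.3f}")
```

The derived value lands within about 0.1 samples/sec of the reported 79.389, so the doubled epoch count and the roughly doubled throughput line up with the near-constant runtime.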
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:bed814dd0961606bf8f136893b5ac6db5b34996942df61cdd742fa8b39675918
+ oid sha256:9c98819bd7ebf9f4f84613a6ce592559ee60e298079c8647484f587b8ddfaa9c
  size 4945242264
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:cd7983011ab46577b5812dee2914f35c1577b2064d355f477d8d01787725d5a5
+ oid sha256:9b55fe97c7727d4fb1e21e08eab5da530266a45a4ca32cfe38f34a121d96c30f
  size 67121608
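The safetensors entries above are Git LFS pointer files: three `key value` lines (version, oid, size) checked into the repo in place of the multi-gigabyte weights, so only the sha256 and size change when the model is re-saved. A minimal parser sketch, assuming the three-line pointer format shown in the diff (`parse_lfs_pointer` is a hypothetical helper; real workflows use `git lfs` itself):

```python
# Parse a Git LFS pointer file of the kind shown in the diffs above.
# parse_lfs_pointer is a hypothetical helper, not part of any Git tooling.
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of an LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The new pointer contents for model-00002-of-00002.safetensors.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9b55fe97c7727d4fb1e21e08eab5da530266a45a4ca32cfe38f34a121d96c30f
size 67121608
"""

info = parse_lfs_pointer(pointer)
print(info["oid"])   # sha256 of the actual ~67 MB shard
print(info["size"])  # size in bytes
```

Because only the pointer is versioned, the diff stays three lines no matter how large the underlying shard is.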
train_results.json CHANGED
@@ -1,9 +1,9 @@
  {
- "epoch": 17.954430379746835,
+ "epoch": 35.95443037974684,
  "total_flos": 0.0,
- "train_loss": 44832.452316430485,
- "train_runtime": 5475.4345,
+ "train_loss": 6426.156043451248,
+ "train_runtime": 5724.5218,
  "train_samples": 12624,
- "train_samples_per_second": 41.5,
- "train_steps_per_second": 0.648
+ "train_samples_per_second": 79.389,
+ "train_steps_per_second": 1.239
  }
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff