EllieS committed
Commit 0d24b92 · verified · 1 Parent(s): 59bf908

Model save
README.md ADDED
@@ -0,0 +1,76 @@
+ ---
+ license: apache-2.0
+ library_name: peft
+ tags:
+ - trl
+ - dpo
+ - generated_from_trainer
+ base_model: alignment-handbook/zephyr-7b-sft-full
+ model-index:
+ - name: Temp-L1-SFT-L2-KTO
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # Temp-L1-SFT-L2-KTO
+
+ This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.2213
+ - Rewards/chosen: 0.2579
+ - Rewards/rejected: -6.0725
+ - Rewards/accuracies: 1.0
+ - Rewards/margins: 6.3304
+ - Logps/rejected: -652.1185
+ - Logps/chosen: -0.1197
+ - Logits/rejected: -2.6590
+ - Logits/chosen: -2.5711
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-06
+ - train_batch_size: 2
+ - eval_batch_size: 2
+ - seed: 42
+ - distributed_type: multi-GPU
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 4
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 1
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+ |:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+ | 0.2255 | 0.2497 | 1000 | 0.2230 | 0.2551 | -5.4032 | 1.0 | 5.6583 | -585.1871 | -0.3988 | -2.6372 | -2.5514 |
+ | 0.2252 | 0.4994 | 2000 | 0.2215 | 0.2576 | -5.9860 | 1.0 | 6.2436 | -643.4705 | -0.1526 | -2.6560 | -2.5690 |
+ | 0.2264 | 0.7492 | 3000 | 0.2213 | 0.2579 | -6.0565 | 1.0 | 6.3144 | -650.5204 | -0.1267 | -2.6590 | -2.5715 |
+ | 0.2262 | 0.9989 | 4000 | 0.2213 | 0.2579 | -6.0725 | 1.0 | 6.3304 | -652.1185 | -0.1197 | -2.6590 | -2.5711 |
+
+
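The step and epoch columns in the training results are consistent with the hyperparameters above. A quick sanity check, assuming a single GPU (which the total_train_batch_size of 4 implies, given per-device batch 2 and gradient accumulation 2):

```python
# Effective batch size = per-device batch x gradient accumulation x device count.
per_device_batch = 2
grad_accum = 2
num_devices = 1  # assumption: total_train_batch_size of 4 implies one device
effective_batch = per_device_batch * grad_accum * num_devices
assert effective_batch == 4  # matches total_train_batch_size in the card

# Epoch fraction at a logged step = step / (train_samples / effective batch).
train_samples = 16017  # from all_results.json
steps_per_epoch = train_samples / effective_batch  # ~= 4004.25
print(round(1000 / steps_per_epoch, 4))  # ~= 0.2497, the epoch logged at step 1000
```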
+ ### Framework versions
+
+ - PEFT 0.7.1
+ - Transformers 4.40.2
+ - Pytorch 2.1.2+cu121
+ - Datasets 2.18.0
+ - Tokenizers 0.19.1
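Since the card names PEFT as the library and zephyr-7b-sft-full as the base model, loading the adapter could look roughly like the sketch below. The hub id `EllieS/Temp-L1-SFT-L2-KTO` is an assumption pieced together from the committer and model name, not something the card states.

```python
ADAPTER_ID = "EllieS/Temp-L1-SFT-L2-KTO"           # hypothetical repo id (assumption)
BASE_ID = "alignment-handbook/zephyr-7b-sft-full"  # base model named in the card


def load_model_and_tokenizer():
    # Imports kept local so the sketch only needs peft/transformers when called.
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    # AutoPeftModelForCausalLM reads adapter_config.json, pulls the base model,
    # and attaches the adapter weights (adapter_model.safetensors, ~84 MB here).
    model = AutoPeftModelForCausalLM.from_pretrained(ADAPTER_ID)
    tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
    return model, tokenizer
```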
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:991529b79f049d9318897294df9fbe5df6ce4bf81ae59cbcf8fad79b48ed38e8
+ oid sha256:94a5bf35062dd0a608f12322930cd60fdf0eedc09cdc68cac2c9bbb8c9d2ec59
  size 83946192
all_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+     "epoch": 0.9998751404669747,
+     "total_flos": 0.0,
+     "train_loss": 0.2426841035559699,
+     "train_runtime": 8271.4989,
+     "train_samples": 16017,
+     "train_samples_per_second": 1.936,
+     "train_steps_per_second": 0.484
+ }
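The throughput figures in all_results.json are internally consistent: samples per second is train_samples over train_runtime, and steps per second is that divided by the effective batch size of 4 from the model card. A quick check:

```python
train_samples = 16017
train_runtime_s = 8271.4989
samples_per_second = train_samples / train_runtime_s
print(round(samples_per_second, 3))   # 1.936, as reported

effective_batch = 4  # total_train_batch_size from the model card
steps_per_second = samples_per_second / effective_batch
print(round(steps_per_second, 3))     # 0.484, as reported
```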
runs/May09_06-20-11_612e66badb5c/events.out.tfevents.1715235708.612e66badb5c.49874.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2e0274c91818012596401b9577216b1bc141463bce9b0dec46c46e1f5d404446
- size 283559
+ oid sha256:56199e2a91b39b52155a3c66e087790ad1c09356e4aec319863aec7a9ec568f2
+ size 283913
train_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+     "epoch": 0.9998751404669747,
+     "total_flos": 0.0,
+     "train_loss": 0.2426841035559699,
+     "train_runtime": 8271.4989,
+     "train_samples": 16017,
+     "train_samples_per_second": 1.936,
+     "train_steps_per_second": 0.484
+ }
trainer_state.json ADDED
The diff for this file is too large to render. See raw diff