martimfasantos committed on
Commit
44b1dd3
1 Parent(s): 6be9751

Model save

README.md ADDED
@@ -0,0 +1,110 @@
+ ---
+ license: apache-2.0
+ library_name: peft
+ tags:
+ - trl
+ - dpo
+ - generated_from_trainer
+ base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
+ model-index:
+ - name: tinyllama-1.1b-chat-dpo-qlora
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # tinyllama-1.1b-chat-dpo-qlora
+
+ This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.6085
+ - Rewards/chosen: -1.0876
+ - Rewards/rejected: -1.3914
+ - Rewards/accuracies: 0.6580
+ - Rewards/margins: 0.3038
+ - Logps/rejected: -490.8211
+ - Logps/chosen: -504.9807
+ - Logits/rejected: -2.6096
+ - Logits/chosen: -2.6425
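
A quick consistency check on the evaluation metrics listed above: in DPO-style training the reward margin is usually the chosen reward minus the rejected reward. The card does not state this definition, so treat it as an assumption. The sketch below uses only the numbers reported in the card; the variable names are illustrative.

```python
import math

# Values copied from the evaluation summary in the model card.
rewards_chosen = -1.0876
rewards_rejected = -1.3914
reported_margin = 0.3038

# Assumed definition: margin = chosen reward - rejected reward.
margin = rewards_chosen - rewards_rejected

# The card rounds to four decimals, so allow a small tolerance.
margins_agree = math.isclose(margin, reported_margin, abs_tol=5e-4)
print(f"computed margin = {margin:.4f}, matches card: {margins_agree}")
```

Both rewards are negative, so the positive margin comes from the rejected completions being penalised more than the chosen ones, not from the chosen ones gaining reward.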
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-06
+ - train_batch_size: 4
+ - eval_batch_size: 8
+ - seed: 42
+ - distributed_type: multi-GPU
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 1
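
To make the hyperparameters above more concrete, here is a minimal sketch of two things they imply: the effective batch size, and the shape of a cosine schedule with linear warmup. The card lists no device count, so the sketch assumes it contributes a factor of 1 to the batch size. The function `lr_at` and the step count passed to it are hypothetical and only imitate the usual warmup-then-cosine shape; they are not taken from the training script.

```python
import math

# Values copied from the hyperparameter list in the model card.
learning_rate = 5e-6
warmup_ratio = 0.1
train_batch_size = 4
gradient_accumulation_steps = 4
reported_total_batch_size = 16

# Effective batch size per optimizer step (device count assumed to be 1).
effective_batch_size = train_batch_size * gradient_accumulation_steps


def lr_at(step, total_steps, peak_lr=learning_rate, warmup=warmup_ratio):
    """Hypothetical helper: linear warmup to peak_lr, then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Illustrative run length; the real number of optimizer steps is not in the card.
example_total_steps = 1000
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at(step, example_total_steps))
```

With a warmup ratio of 0.1, the learning rate climbs for the first tenth of training, peaks at 5e-06, and then decays smoothly toward zero.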
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+ |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+ | 0.6921 | 0.03 | 100 | 0.6923 | 0.0160 | 0.0142 | 0.5645 | 0.0018 | -350.2683 | -394.6286 | -2.7841 | -2.8363 |
+ | 0.6894 | 0.05 | 200 | 0.6894 | 0.0433 | 0.0353 | 0.5920 | 0.0080 | -348.1495 | -391.8949 | -2.7811 | -2.8333 |
+ | 0.6815 | 0.08 | 300 | 0.6844 | 0.0806 | 0.0609 | 0.6025 | 0.0196 | -345.5898 | -388.1692 | -2.7838 | -2.8349 |
+ | 0.6869 | 0.1 | 400 | 0.6788 | 0.0607 | 0.0269 | 0.6125 | 0.0339 | -348.9979 | -390.1522 | -2.7931 | -2.8423 |
+ | 0.6744 | 0.13 | 500 | 0.6724 | 0.0243 | -0.0249 | 0.6210 | 0.0492 | -354.1764 | -393.7983 | -2.7889 | -2.8371 |
+ | 0.6679 | 0.16 | 600 | 0.6625 | -0.0566 | -0.1346 | 0.6265 | 0.0780 | -365.1402 | -401.8826 | -2.7709 | -2.8179 |
+ | 0.637 | 0.18 | 700 | 0.6555 | -0.2568 | -0.3654 | 0.6290 | 0.1086 | -388.2211 | -421.9038 | -2.7596 | -2.8051 |
+ | 0.6166 | 0.21 | 800 | 0.6488 | -0.3935 | -0.5223 | 0.6320 | 0.1288 | -403.9116 | -435.5756 | -2.7523 | -2.7961 |
+ | 0.6335 | 0.24 | 900 | 0.6458 | -0.4516 | -0.6042 | 0.6380 | 0.1527 | -412.1083 | -441.3798 | -2.7325 | -2.7764 |
+ | 0.6286 | 0.26 | 1000 | 0.6406 | -0.8692 | -1.0442 | 0.625 | 0.1750 | -456.1026 | -483.1429 | -2.7123 | -2.7531 |
+ | 0.669 | 0.29 | 1100 | 0.6406 | -0.3445 | -0.4984 | 0.6365 | 0.1538 | -401.5222 | -430.6789 | -2.6946 | -2.7354 |
+ | 0.6723 | 0.31 | 1200 | 0.6358 | -0.4619 | -0.6430 | 0.6425 | 0.1811 | -415.9841 | -442.4163 | -2.6701 | -2.7077 |
+ | 0.605 | 0.34 | 1300 | 0.6297 | -0.6894 | -0.8903 | 0.6435 | 0.2009 | -440.7144 | -465.1627 | -2.6764 | -2.7122 |
+ | 0.6361 | 0.37 | 1400 | 0.6267 | -0.7144 | -0.9307 | 0.6505 | 0.2163 | -444.7496 | -467.6648 | -2.6711 | -2.7091 |
+ | 0.6085 | 0.39 | 1500 | 0.6213 | -1.0532 | -1.3084 | 0.6490 | 0.2552 | -482.5256 | -501.5469 | -2.6435 | -2.6797 |
+ | 0.6317 | 0.42 | 1600 | 0.6197 | -1.1246 | -1.3825 | 0.6490 | 0.2579 | -489.9323 | -508.6858 | -2.6172 | -2.6506 |
+ | 0.6702 | 0.44 | 1700 | 0.6182 | -1.0036 | -1.2644 | 0.6530 | 0.2609 | -478.1268 | -496.5815 | -2.6407 | -2.6762 |
+ | 0.5658 | 0.47 | 1800 | 0.6219 | -1.3479 | -1.6348 | 0.6445 | 0.2869 | -515.1606 | -531.0145 | -2.5866 | -2.6182 |
+ | 0.6039 | 0.5 | 1900 | 0.6154 | -0.9014 | -1.1716 | 0.6630 | 0.2702 | -468.8458 | -486.3656 | -2.6376 | -2.6742 |
+ | 0.6173 | 0.52 | 2000 | 0.6121 | -1.1535 | -1.4470 | 0.6575 | 0.2934 | -496.3810 | -511.5793 | -2.6232 | -2.6580 |
+ | 0.62 | 0.55 | 2100 | 0.6116 | -1.1600 | -1.4523 | 0.6650 | 0.2923 | -496.9117 | -512.2247 | -2.6278 | -2.6629 |
+ | 0.5957 | 0.58 | 2200 | 0.6132 | -0.9592 | -1.2431 | 0.6655 | 0.2839 | -475.9958 | -492.1489 | -2.6317 | -2.6674 |
+ | 0.6093 | 0.6 | 2300 | 0.6138 | -1.0935 | -1.3811 | 0.6625 | 0.2876 | -489.7906 | -505.5738 | -2.6283 | -2.6619 |
+ | 0.6009 | 0.63 | 2400 | 0.6108 | -1.0519 | -1.3479 | 0.6610 | 0.2959 | -486.4695 | -501.4175 | -2.6088 | -2.6432 |
+ | 0.5988 | 0.65 | 2500 | 0.6108 | -1.0427 | -1.3419 | 0.6590 | 0.2992 | -485.8730 | -500.4982 | -2.6143 | -2.6477 |
+ | 0.606 | 0.68 | 2600 | 0.6112 | -1.0188 | -1.3192 | 0.6545 | 0.3003 | -483.6013 | -498.1078 | -2.5974 | -2.6304 |
+ | 0.6118 | 0.71 | 2700 | 0.6106 | -1.0808 | -1.3857 | 0.6595 | 0.3049 | -490.2562 | -504.3045 | -2.5945 | -2.6274 |
+ | 0.6134 | 0.73 | 2800 | 0.6096 | -1.1549 | -1.4635 | 0.6585 | 0.3086 | -498.0366 | -511.7179 | -2.5978 | -2.6303 |
+ | 0.6159 | 0.76 | 2900 | 0.6097 | -1.0550 | -1.3509 | 0.6585 | 0.2959 | -486.7739 | -501.7256 | -2.6175 | -2.6500 |
+ | 0.5815 | 0.79 | 3000 | 0.6091 | -1.1025 | -1.4048 | 0.6570 | 0.3023 | -492.1650 | -506.4727 | -2.6089 | -2.6420 |
+ | 0.5885 | 0.81 | 3100 | 0.6089 | -1.0977 | -1.4006 | 0.6595 | 0.3029 | -491.7444 | -505.9960 | -2.6001 | -2.6337 |
+ | 0.6074 | 0.84 | 3200 | 0.6086 | -1.0982 | -1.4029 | 0.6605 | 0.3047 | -491.9724 | -506.0455 | -2.6056 | -2.6388 |
+ | 0.5981 | 0.86 | 3300 | 0.6087 | -1.0853 | -1.3881 | 0.6610 | 0.3028 | -490.4915 | -504.7571 | -2.6117 | -2.6442 |
+ | 0.5944 | 0.89 | 3400 | 0.6087 | -1.0897 | -1.3931 | 0.6580 | 0.3034 | -490.9887 | -505.1947 | -2.6026 | -2.6360 |
+ | 0.5979 | 0.92 | 3500 | 0.6085 | -1.0922 | -1.3962 | 0.6595 | 0.3040 | -491.3070 | -505.4438 | -2.6136 | -2.6460 |
+ | 0.6154 | 0.94 | 3600 | 0.6086 | -1.0905 | -1.3946 | 0.6595 | 0.3040 | -491.1413 | -505.2781 | -2.6066 | -2.6397 |
+ | 0.6053 | 0.97 | 3700 | 0.6086 | -1.0907 | -1.3946 | 0.6550 | 0.3039 | -491.1405 | -505.2943 | -2.6094 | -2.6423 |
+ | 0.602 | 0.99 | 3800 | 0.6085 | -1.0876 | -1.3914 | 0.6580 | 0.3038 | -490.8211 | -504.9807 | -2.6096 | -2.6425 |
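
The table's loss column starts near 0.69 and falls as the reward margin grows. One way to read that, assuming the run used the standard sigmoid DPO loss (the card does not say which loss variant was used), is that the per-example loss is `-log(sigmoid(margin))`, which equals `log 2 ≈ 0.693` at zero margin. The sketch below illustrates this relationship; `dpo_sigmoid_loss` is an illustrative helper, not a function from any library.

```python
import math


def dpo_sigmoid_loss(margin):
    """Illustrative per-example loss: -log(sigmoid(margin)) = log(1 + exp(-margin))."""
    return math.log1p(math.exp(-margin))


# At zero margin the loss is log(2), which is where the table starts.
loss_at_zero = dpo_sigmoid_loss(0.0)

# Margins taken from the first and last rows of the table.
loss_first_row = dpo_sigmoid_loss(0.0018)
loss_last_row = dpo_sigmoid_loss(0.3038)

print(f"zero margin: {loss_at_zero:.4f}")
print(f"first-row margin: {loss_first_row:.4f}")
print(f"last-row margin: {loss_last_row:.4f}")
```

The reported loss is an average over examples, whereas this applies the formula to the average margin, so the final figure will not match the table exactly. It only shows why a loss close to 0.69 corresponds to a model that barely separates chosen from rejected completions.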
+
+
+ ### Framework versions
+
+ - PEFT 0.7.1
+ - Transformers 4.39.3
+ - Pytorch 2.1.2
+ - Datasets 2.18.0
+ - Tokenizers 0.15.2
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4b7021412048931e0e05f6dc749e7e44cd8ba7bef8d7df5942dd4a598fce1ed4
+ oid sha256:b011251198adf4be2eea66558165d4087c69b22538e7c24f24de982f5dc1b4a9
  size 201892728
all_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+ "epoch": 1.0,
+ "train_loss": 0.6288731582999011,
+ "train_runtime": 37165.2285,
+ "train_samples": 61134,
+ "train_samples_per_second": 1.645,
+ "train_steps_per_second": 0.103
+ }
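
The throughput fields in this JSON can be cross-checked against each other. The sketch below assumes `train_runtime` is in seconds and that, with `epoch` at 1.0, each training sample was processed once; neither is stated in the file. The step count it derives is an estimate from a rounded rate, not a logged value.

```python
# Values copied from all_results.json.
train_runtime = 37165.2285          # assumed to be seconds
train_samples = 61134
reported_samples_per_second = 1.645
reported_steps_per_second = 0.103

# Samples per second recomputed from the raw counts.
samples_per_second = train_samples / train_runtime

# Rough optimizer-step estimate; the reported rate is rounded to 3 decimals.
approx_steps = reported_steps_per_second * train_runtime

# Wall-clock duration in hours.
runtime_hours = train_runtime / 3600

print(f"samples/s = {samples_per_second:.3f}")
print(f"approx. optimizer steps = {approx_steps:.0f}")
print(f"runtime = {runtime_hours:.2f} h")
```

The estimated step count is close to the last logged step (3800 at epoch 0.99) in the README's training table, so the two files tell a consistent story.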
runs/Apr24_01-31-15_poseidon/events.out.tfevents.1713922305.poseidon.732971.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0fa8aaeeaa1482a16732883f23cdf9351a44c27eee0beeaca66cfb3eafe1c28c
- size 295458
+ oid sha256:3ca3481bafbf60e98dfc20469b03b8f791af91614b0b7cddb8e2ff9c953ba548
+ size 297188
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+ "epoch": 1.0,
+ "train_loss": 0.6288731582999011,
+ "train_runtime": 37165.2285,
+ "train_samples": 61134,
+ "train_samples_per_second": 1.645,
+ "train_steps_per_second": 0.103
+ }
trainer_state.json ADDED
The diff for this file is too large to render. See raw diff