jkazdan committed
Commit 4d65213 · verified · 1 Parent(s): d990e8e

End of training

README.md CHANGED
@@ -17,8 +17,8 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 1.0973
- - Num Input Tokens Seen: 13863168
+ - Loss: 1.0975
+ - Num Input Tokens Seen: 13721160
 
  ## Model description
 
@@ -53,54 +53,54 @@ The following hyperparameters were used during training:
  | Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
  |:-------------:|:------:|:----:|:---------------:|:-----------------:|
  | No log | 0 | 0 | 1.3956 | 0 |
- | 1.6142 | 0.0206 | 5 | 1.3570 | 288880 |
- | 1.3922 | 0.0412 | 10 | 1.2425 | 573624 |
- | 1.4016 | 0.0619 | 15 | 1.1746 | 857208 |
- | 1.2026 | 0.0825 | 20 | 1.1473 | 1144096 |
- | 1.1446 | 0.1031 | 25 | 1.1209 | 1432168 |
- | 1.1335 | 0.1237 | 30 | 1.1230 | 1721168 |
- | 1.027 | 0.1444 | 35 | 1.1209 | 2008664 |
- | 0.9789 | 0.1650 | 40 | 1.1262 | 2300376 |
- | 0.9798 | 0.1856 | 45 | 1.1304 | 2586904 |
- | 0.8549 | 0.2062 | 50 | 1.1362 | 2867488 |
- | 0.9991 | 0.2269 | 55 | 1.1320 | 3153448 |
- | 0.8995 | 0.2475 | 60 | 1.1568 | 3437336 |
- | 0.9192 | 0.2681 | 65 | 1.1373 | 3721112 |
- | 0.9104 | 0.2887 | 70 | 1.1574 | 4009912 |
- | 0.8218 | 0.3094 | 75 | 1.1374 | 4291144 |
- | 0.8078 | 0.3300 | 80 | 1.1471 | 4579160 |
- | 0.6849 | 0.3506 | 85 | 1.1485 | 4869416 |
- | 0.7227 | 0.3712 | 90 | 1.1359 | 5151616 |
- | 0.729 | 0.3919 | 95 | 1.1416 | 5439744 |
- | 0.8242 | 0.4125 | 100 | 1.1353 | 5730688 |
- | 0.7574 | 0.4331 | 105 | 1.1352 | 6015912 |
- | 0.7674 | 0.4537 | 110 | 1.1299 | 6302000 |
- | 0.7354 | 0.4743 | 115 | 1.1294 | 6584864 |
- | 0.7473 | 0.4950 | 120 | 1.1316 | 6874248 |
- | 0.7143 | 0.5156 | 125 | 1.1266 | 7155920 |
- | 0.6607 | 0.5362 | 130 | 1.1257 | 7442952 |
- | 0.7997 | 0.5568 | 135 | 1.1235 | 7726712 |
- | 0.6338 | 0.5775 | 140 | 1.1218 | 8008008 |
- | 0.6154 | 0.5981 | 145 | 1.1214 | 8293264 |
- | 0.6411 | 0.6187 | 150 | 1.1172 | 8578632 |
- | 0.6703 | 0.6393 | 155 | 1.1160 | 8859880 |
- | 0.6846 | 0.6600 | 160 | 1.1133 | 9150712 |
- | 0.6297 | 0.6806 | 165 | 1.1121 | 9437568 |
- | 0.6007 | 0.7012 | 170 | 1.1087 | 9723304 |
- | 0.6342 | 0.7218 | 175 | 1.1092 | 10011776 |
- | 0.5632 | 0.7425 | 180 | 1.1061 | 10303088 |
- | 0.5759 | 0.7631 | 185 | 1.1068 | 10591632 |
- | 0.6698 | 0.7837 | 190 | 1.1056 | 10874944 |
- | 0.6129 | 0.8043 | 195 | 1.1041 | 11162024 |
- | 0.6078 | 0.8250 | 200 | 1.1038 | 11447568 |
- | 0.6199 | 0.8456 | 205 | 1.1024 | 11731208 |
- | 0.6476 | 0.8662 | 210 | 1.0992 | 12020816 |
- | 0.4918 | 0.8868 | 215 | 1.1029 | 12310360 |
- | 0.5794 | 0.9075 | 220 | 1.0972 | 12602920 |
- | 0.6134 | 0.9281 | 225 | 1.0971 | 12890312 |
- | 0.4519 | 0.9487 | 230 | 1.1004 | 13177336 |
- | 0.6182 | 0.9693 | 235 | 1.0947 | 13464096 |
- | 0.4862 | 0.9899 | 240 | 1.0963 | 13749592 |
+ | 1.5421 | 0.0206 | 5 | 1.3563 | 284760 |
+ | 1.4213 | 0.0412 | 10 | 1.2364 | 571568 |
+ | 1.3773 | 0.0618 | 15 | 1.1718 | 845064 |
+ | 1.2116 | 0.0824 | 20 | 1.1443 | 1127704 |
+ | 1.1315 | 0.1030 | 25 | 1.1199 | 1412496 |
+ | 1.1024 | 0.1236 | 30 | 1.1226 | 1698920 |
+ | 1.0443 | 0.1441 | 35 | 1.1252 | 1986472 |
+ | 1.0363 | 0.1647 | 40 | 1.1266 | 2267632 |
+ | 1.0423 | 0.1853 | 45 | 1.1341 | 2547936 |
+ | 0.9706 | 0.2059 | 50 | 1.1300 | 2830576 |
+ | 0.9604 | 0.2265 | 55 | 1.1429 | 3118224 |
+ | 0.9255 | 0.2471 | 60 | 1.1355 | 3404464 |
+ | 0.9483 | 0.2677 | 65 | 1.1537 | 3688352 |
+ | 0.8534 | 0.2883 | 70 | 1.1419 | 3977080 |
+ | 0.8731 | 0.3089 | 75 | 1.1393 | 4258200 |
+ | 0.8774 | 0.3295 | 80 | 1.1458 | 4542712 |
+ | 0.8021 | 0.3501 | 85 | 1.1396 | 4833248 |
+ | 0.7919 | 0.3707 | 90 | 1.1405 | 5110392 |
+ | 0.765 | 0.3912 | 95 | 1.1369 | 5394440 |
+ | 0.6146 | 0.4118 | 100 | 1.1466 | 5677160 |
+ | 0.7264 | 0.4324 | 105 | 1.1348 | 5959104 |
+ | 0.6176 | 0.4530 | 110 | 1.1390 | 6236792 |
+ | 0.718 | 0.4736 | 115 | 1.1362 | 6522184 |
+ | 0.6601 | 0.4942 | 120 | 1.1386 | 6805272 |
+ | 0.7045 | 0.5148 | 125 | 1.1291 | 7080584 |
+ | 0.6125 | 0.5354 | 130 | 1.1355 | 7359048 |
+ | 0.7828 | 0.5560 | 135 | 1.1299 | 7639800 |
+ | 0.7475 | 0.5766 | 140 | 1.1292 | 7925000 |
+ | 0.7263 | 0.5972 | 145 | 1.1283 | 8212784 |
+ | 0.591 | 0.6178 | 150 | 1.1274 | 8498984 |
+ | 0.6697 | 0.6384 | 155 | 1.1224 | 8783480 |
+ | 0.6356 | 0.6589 | 160 | 1.1216 | 9069640 |
+ | 0.6016 | 0.6795 | 165 | 1.1205 | 9358968 |
+ | 0.5734 | 0.7001 | 170 | 1.1175 | 9644264 |
+ | 0.5932 | 0.7207 | 175 | 1.1157 | 9934824 |
+ | 0.5129 | 0.7413 | 180 | 1.1148 | 10221456 |
+ | 0.6567 | 0.7619 | 185 | 1.1130 | 10498184 |
+ | 0.6554 | 0.7825 | 190 | 1.1117 | 10777688 |
+ | 0.5459 | 0.8031 | 195 | 1.1105 | 11062480 |
+ | 0.6166 | 0.8237 | 200 | 1.1069 | 11343448 |
+ | 0.6983 | 0.8443 | 205 | 1.1061 | 11620888 |
+ | 0.5964 | 0.8649 | 210 | 1.1052 | 11908944 |
+ | 0.5881 | 0.8855 | 215 | 1.1031 | 12192472 |
+ | 0.5667 | 0.9060 | 220 | 1.1026 | 12474256 |
+ | 0.5131 | 0.9266 | 225 | 1.1018 | 12762728 |
+ | 0.5854 | 0.9472 | 230 | 1.0999 | 13045696 |
+ | 0.6179 | 0.9678 | 235 | 1.1003 | 13323080 |
+ | 0.5287 | 0.9884 | 240 | 1.0984 | 13609776 |
 
 
  ### Framework versions
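
For readers who want to summarize a training log like the one in this README, here is a minimal sketch that parses rows in the same Markdown table layout and reports the step with the lowest validation loss and the average number of input tokens per step. The helper name `parse_log_rows` and the `SAMPLE` string are illustrative assumptions, not part of the repository; the sample rows are copied from the updated table.

```python
def parse_log_rows(table_text):
    """Parse README-style log rows into dicts; a 'No log' training loss becomes None."""
    rows = []
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip the header, the alignment row, and anything malformed.
        if len(cells) != 5 or not cells[2].isdigit():
            continue
        rows.append({
            "train_loss": None if cells[0] == "No log" else float(cells[0]),
            "epoch": float(cells[1]),
            "step": int(cells[2]),
            "val_loss": float(cells[3]),
            "tokens": int(cells[4]),
        })
    return rows


# Hypothetical sample: four rows taken from the updated table.
SAMPLE = """
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log | 0 | 0 | 1.3956 | 0 |
| 1.1315 | 0.1030 | 25 | 1.1199 | 1412496 |
| 0.9483 | 0.2677 | 65 | 1.1537 | 3688352 |
| 0.5287 | 0.9884 | 240 | 1.0984 | 13609776 |
"""

rows = parse_log_rows(SAMPLE)
best = min(rows, key=lambda r: r["val_loss"])  # row with the lowest validation loss
last = rows[-1]
tokens_per_step = last["tokens"] / last["step"]  # average over the whole run

print(f"best step: {best['step']} (validation loss {best['val_loss']})")
print(f"average tokens per step: {tokens_per_step:.0f}")
```

Running the same parser over the full table, rather than this four-row sample, would give the actual best checkpoint step for the run.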
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:339921570f5e0f3424cf24c4eb23b4015a7d4a87b0afaa94cbace5c692c24e1c
+ oid sha256:b732294d00481fba81e3639996327bd84c337a5999859b317ea73a6550e84872
  size 4988025760
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5a7460ed9d38f89d05e232967cdd2ad00ea61c7fbbfa0040e944778c1811ada6
+ oid sha256:e1a3a5bff6bc8b3189900192ad4e8f24a29e2e1b4e02fcd2877c35b51c5678a1
  size 240691728
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:717403d8488756f2c33f707a8f468d8bc988750d5b4f5570037a80308924a2c7
+ oid sha256:9525c975956656d5362213d0a343586a2acf386b1d0a3c3a3c764c967a5df424
  size 5560
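
The three binary files above are stored as Git LFS pointers, which record only a SHA-256 `oid` and a byte `size`. As a minimal sketch, assuming a downloaded copy of a file is available locally, the following checks that copy against its pointer. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this illustration; the demo at the end uses a throwaway temporary file rather than the real weights.

```python
import hashlib
import os
import tempfile


def parse_lfs_pointer(text):
    """Return (oid_hex, size) from the text of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, oid = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo}")
    return oid, int(fields["size"])


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Check a local file's size and SHA-256 digest against an LFS pointer."""
    oid, size = parse_lfs_pointer(pointer_text)
    if os.path.getsize(path) != size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Hash in chunks so multi-gigabyte shards do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == oid


# New pointer for training_args.bin, copied from this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:9525c975956656d5362213d0a343586a2acf386b1d0a3c3a3c764c967a5df424
size 5560
"""
oid, size = parse_lfs_pointer(POINTER)

# Self-check with a temporary file and a pointer built for its contents.
payload = b"hello"
demo_pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(payload).hexdigest()}\n"
    f"size {len(payload)}\n"
)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
demo_ok = matches_pointer(tmp.name, demo_pointer)
os.remove(tmp.name)
```

Because the old and new pointers list identical sizes for each file, the size check alone cannot tell the two versions apart; the digest comparison is what distinguishes them.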