Mithil committed on
Commit 0fc0548
1 Parent(s): 058bce2

End of training

README.md CHANGED
@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 4.3524
+ - Loss: 3.6530
 
  ## Model description
 
@@ -34,7 +34,7 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 0.0005
+ - learning_rate: 0.0001
  - train_batch_size: 8
  - eval_batch_size: 8
  - seed: 42
@@ -48,26 +48,25 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | No log | 0.98 | 6 | 8.2912 |
- | 9.0416 | 1.96 | 12 | 6.3294 |
- | 9.0416 | 2.94 | 18 | 5.7787 |
- | 6.7966 | 3.92 | 24 | 5.4608 |
- | 6.5563 | 4.9 | 30 | 5.3249 |
- | 6.5563 | 5.88 | 36 | 5.1962 |
- | 6.2756 | 6.86 | 42 | 5.1209 |
- | 6.2756 | 8.0 | 49 | 4.9701 |
- | 6.3126 | 8.98 | 55 | 4.8793 |
- | 6.237 | 9.96 | 61 | 4.7837 |
- | 6.237 | 10.94 | 67 | 4.7102 |
- | 5.9722 | 11.92 | 73 | 4.5721 |
- | 5.9722 | 12.9 | 79 | 4.5170 |
- | 5.9883 | 13.88 | 85 | 4.4562 |
- | 5.8828 | 14.86 | 91 | 4.4168 |
- | 5.8828 | 16.0 | 98 | 4.3880 |
- | 5.8493 | 16.98 | 104 | 4.3684 |
- | 5.8112 | 17.96 | 110 | 4.3570 |
- | 5.8112 | 18.94 | 116 | 4.3528 |
- | 5.6628 | 19.59 | 120 | 4.3524 |
+ | No log | 0.93 | 7 | 8.2816 |
+ | 9.7888 | 2.0 | 15 | 6.6539 |
+ | 8.253 | 2.93 | 22 | 5.9839 |
+ | 7.4098 | 4.0 | 30 | 5.5296 |
+ | 7.4098 | 4.93 | 37 | 5.1792 |
+ | 6.6836 | 6.0 | 45 | 4.8581 |
+ | 6.2698 | 6.93 | 52 | 4.6282 |
+ | 5.8092 | 8.0 | 60 | 4.4243 |
+ | 5.8092 | 8.93 | 67 | 4.2668 |
+ | 5.3803 | 10.0 | 75 | 4.1214 |
+ | 5.3501 | 10.93 | 82 | 4.0024 |
+ | 5.1278 | 12.0 | 90 | 3.8835 |
+ | 5.1278 | 12.93 | 97 | 3.8106 |
+ | 4.9471 | 14.0 | 105 | 3.7422 |
+ | 4.9279 | 14.93 | 112 | 3.7098 |
+ | 4.8129 | 16.0 | 120 | 3.6740 |
+ | 4.8129 | 16.93 | 127 | 3.6601 |
+ | 4.7258 | 18.0 | 135 | 3.6535 |
+ | 4.8643 | 18.67 | 140 | 3.6530 |
 
 
  ### Framework versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b672c8c562af88d3224a2bbb5365350a8a62de023117f2d96c094b72b9299da4
+ oid sha256:6cdecdb980e769385dcf2d23c05a8a603b54635cf445984d61a3614e0acf77bc
  size 497777280
runs/May21_17-01-33_04500254204e/events.out.tfevents.1716310898.04500254204e.34.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3588bd4aa508e71cd922ca91148c9004d50631c387d9c80d1714988152ba1b36
+ size 17926
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7d814d2398436b0823107338673b666c0f55a1ce0d76410bc11af67edf461fe6
+ oid sha256:eba9042156f687078e30db9d1e502d3bab2ab4e1fb58bc65f1be38afd512ad73
  size 4920
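
The evaluation losses in the README.md diff above can be easier to compare as perplexities. The following is a minimal sketch, assuming the reported loss is a mean per-token cross-entropy in nats (the usual case for a causal language model such as gpt2, but not confirmed by this commit). The helper name `perplexity` is hypothetical and not part of the repository.

```python
import math


def perplexity(cross_entropy_loss: float) -> float:
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(cross_entropy_loss)


# Final evaluation losses taken from the README.md diff in this commit.
old_loss = 4.3524  # previous run, learning_rate 0.0005
new_loss = 3.6530  # this run, learning_rate 0.0001

old_ppl = perplexity(old_loss)
new_ppl = perplexity(new_loss)

print(f"previous run: loss={old_loss:.4f}, perplexity={old_ppl:.2f}")
print(f"this run:     loss={new_loss:.4f}, perplexity={new_ppl:.2f}")
print(f"perplexity ratio (new / old): {new_ppl / old_ppl:.3f}")
```

Under that assumption, the lower loss in this commit corresponds to roughly half the perplexity of the previous run. The two runs also differ in step count and epoch schedule, so the learning rate may not be the only factor.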