Mithil committed on
Commit
a87558e
1 Parent(s): ac383cc

End of training

README.md CHANGED
@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
  
  This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 6.4423
+ - Loss: 4.6477
  
  ## Model description
  
@@ -42,20 +42,32 @@ The following hyperparameters were used during training:
  - total_train_batch_size: 64
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
- - num_epochs: 8
+ - num_epochs: 20
  
  ### Training results
  
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | No log | 0.96 | 3 | 7.7950 |
- | No log | 1.92 | 6 | 6.9879 |
- | No log | 2.88 | 9 | 6.6631 |
- | 7.658 | 3.84 | 12 | 6.5423 |
- | 7.658 | 4.8 | 15 | 6.4882 |
- | 7.658 | 5.76 | 18 | 6.4637 |
- | 6.5857 | 6.72 | 21 | 6.4457 |
- | 6.5857 | 7.68 | 24 | 6.4423 |
+ | No log | 0.98 | 6 | 10.0906 |
+ | 9.1486 | 1.96 | 12 | 6.7033 |
+ | 9.1486 | 2.94 | 18 | 5.8931 |
+ | 6.8583 | 3.92 | 24 | 5.6861 |
+ | 6.6191 | 4.9 | 30 | 5.6068 |
+ | 6.6191 | 5.88 | 36 | 5.5426 |
+ | 6.3742 | 6.86 | 42 | 5.4848 |
+ | 6.3742 | 8.0 | 49 | 5.3937 |
+ | 6.3911 | 8.98 | 55 | 5.3093 |
+ | 6.3445 | 9.96 | 61 | 5.2252 |
+ | 6.3445 | 10.94 | 67 | 5.1060 |
+ | 6.1275 | 11.92 | 73 | 4.9838 |
+ | 6.1275 | 12.9 | 79 | 4.8939 |
+ | 6.108 | 13.88 | 85 | 4.8286 |
+ | 5.9782 | 14.86 | 91 | 4.7518 |
+ | 5.9782 | 16.0 | 98 | 4.7024 |
+ | 5.9191 | 16.98 | 104 | 4.6739 |
+ | 5.869 | 17.96 | 110 | 4.6599 |
+ | 5.869 | 18.94 | 116 | 4.6488 |
+ | 5.7407 | 19.59 | 120 | 4.6477 |
  
  
  ### Framework versions
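Reviewer note: the drop in evaluation loss from 6.4423 to 4.6477 is easier to judge as perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language model fine-tuning but is not stated in this diff. The helper name `perplexity` is illustrative only.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Final validation losses taken from the README.md diff above.
old_loss, new_loss = 6.4423, 4.6477

old_ppl = perplexity(old_loss)
new_ppl = perplexity(new_loss)

print(f"perplexity before this commit: {old_ppl:.1f}")
print(f"perplexity after this commit:  {new_ppl:.1f}")
print(f"reduction factor:              {old_ppl / new_ppl:.2f}")
```

Under that assumption the perplexity falls by roughly a factor of six. The two runs also report different step counts per epoch, so they may not be directly comparable.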
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:933699a9b7c2f094e47eeccbaf9a51566ed49e2e27c25012e374b0123bcc1316
+ oid sha256:bfa4447b378eec9c913be4daa5273c384c53af848a654bdcee1908a69e8dd35f
  size 497777280
runs/May21_08-27-55_e261ca047b2e/events.out.tfevents.1716280081.e261ca047b2e.34.2 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:44739e41d7186ecd48bbd01b02aab3888eaa9a7ae90b7d58531d0f02f1805626
+ size 16568
runs/May21_09-13-44_e261ca047b2e/events.out.tfevents.1716282828.e261ca047b2e.34.3 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:15703648ceaf6c3b808606d1bbfb5d85614886a86798613c75f0cf7e9181e3f3
+ size 17754
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1fd184e8ac67157d06e0ba504579693b3ef687794176186fc2d9a4a5615154b5
+ oid sha256:54bb2bda263ecf9eff76d0435dfea29cce1f553560be1f5a38f4365f73b97087
  size 4920
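Reviewer note: the binary files in this commit are stored as Git LFS pointers, each with a `version`, an `oid sha256:...`, and a `size` line. The sketch below shows one way to parse such a pointer and check a downloaded file against it. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this note, not part of any Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if the bytes have the size and SHA-256 digest named by the pointer."""
    if pointer["hash_algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['hash_algo']}")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer for the new training_args.bin, copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:54bb2bda263ecf9eff76d0435dfea29cce1f553560be1f5a38f4365f73b97087\n"
    "size 4920\n"
)
print(pointer["hash_algo"], pointer["size"])
```

To verify a downloaded artifact, read it as bytes and pass it to `matches_pointer` together with the parsed pointer. For a file as large as `model.safetensors`, hashing in chunks would be preferable to loading it whole.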