JT000 committed on
Commit 9152bdf
1 parent: fbf61c1

End of training

Files changed (2)
  1. README.md +43 -43
  2. model.safetensors +1 -1
README.md CHANGED
@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [uer/gpt2-chinese-cluecorpussmall](https://huggingface.co/uer/gpt2-chinese-cluecorpussmall) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.1071
+ - Loss: 0.1054
 
  ## Model description
 
@@ -34,8 +34,8 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 2e-05
- - train_batch_size: 30
- - eval_batch_size: 30
+ - train_batch_size: 18
+ - eval_batch_size: 18
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
@@ -47,46 +47,46 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | No log | 1.0 | 13 | 0.6871 |
- | No log | 2.0 | 26 | 0.6221 |
- | No log | 3.0 | 39 | 0.5198 |
- | No log | 4.0 | 52 | 0.3920 |
- | No log | 5.0 | 65 | 0.2557 |
- | No log | 6.0 | 78 | 0.1539 |
- | No log | 7.0 | 91 | 0.1292 |
- | No log | 8.0 | 104 | 0.1262 |
- | No log | 9.0 | 117 | 0.1223 |
- | No log | 10.0 | 130 | 0.1229 |
- | No log | 11.0 | 143 | 0.1222 |
- | No log | 12.0 | 156 | 0.1201 |
- | No log | 13.0 | 169 | 0.1208 |
- | No log | 14.0 | 182 | 0.1196 |
- | No log | 15.0 | 195 | 0.1153 |
- | No log | 16.0 | 208 | 0.1145 |
- | No log | 17.0 | 221 | 0.1107 |
- | No log | 18.0 | 234 | 0.1181 |
- | No log | 19.0 | 247 | 0.1049 |
- | No log | 20.0 | 260 | 0.1058 |
- | No log | 21.0 | 273 | 0.1050 |
- | No log | 22.0 | 286 | 0.1043 |
- | No log | 23.0 | 299 | 0.1011 |
- | No log | 24.0 | 312 | 0.1020 |
- | No log | 25.0 | 325 | 0.1011 |
- | No log | 26.0 | 338 | 0.1024 |
- | No log | 27.0 | 351 | 0.1005 |
- | No log | 28.0 | 364 | 0.0998 |
- | No log | 29.0 | 377 | 0.1002 |
- | No log | 30.0 | 390 | 0.0986 |
- | No log | 31.0 | 403 | 0.1000 |
- | No log | 32.0 | 416 | 0.1027 |
- | No log | 33.0 | 429 | 0.1035 |
- | No log | 34.0 | 442 | 0.1053 |
- | No log | 35.0 | 455 | 0.1083 |
- | No log | 36.0 | 468 | 0.1068 |
- | No log | 37.0 | 481 | 0.1071 |
- | No log | 38.0 | 494 | 0.1052 |
- | 0.1393 | 39.0 | 507 | 0.1115 |
- | 0.1393 | 40.0 | 520 | 0.1071 |
+ | No log | 1.0 | 21 | 0.7690 |
+ | No log | 2.0 | 42 | 0.6252 |
+ | No log | 3.0 | 63 | 0.4116 |
+ | No log | 4.0 | 84 | 0.1728 |
+ | No log | 5.0 | 105 | 0.1211 |
+ | No log | 6.0 | 126 | 0.1188 |
+ | No log | 7.0 | 147 | 0.1166 |
+ | No log | 8.0 | 168 | 0.1113 |
+ | No log | 9.0 | 189 | 0.1090 |
+ | No log | 10.0 | 210 | 0.1100 |
+ | No log | 11.0 | 231 | 0.1029 |
+ | No log | 12.0 | 252 | 0.1016 |
+ | No log | 13.0 | 273 | 0.0963 |
+ | No log | 14.0 | 294 | 0.0998 |
+ | No log | 15.0 | 315 | 0.0935 |
+ | No log | 16.0 | 336 | 0.0956 |
+ | No log | 17.0 | 357 | 0.0925 |
+ | No log | 18.0 | 378 | 0.0913 |
+ | No log | 19.0 | 399 | 0.0993 |
+ | No log | 20.0 | 420 | 0.0981 |
+ | No log | 21.0 | 441 | 0.0946 |
+ | No log | 22.0 | 462 | 0.1039 |
+ | No log | 23.0 | 483 | 0.0984 |
+ | 0.1655 | 24.0 | 504 | 0.0977 |
+ | 0.1655 | 25.0 | 525 | 0.1018 |
+ | 0.1655 | 26.0 | 546 | 0.1040 |
+ | 0.1655 | 27.0 | 567 | 0.0988 |
+ | 0.1655 | 28.0 | 588 | 0.1047 |
+ | 0.1655 | 29.0 | 609 | 0.1059 |
+ | 0.1655 | 30.0 | 630 | 0.1061 |
+ | 0.1655 | 31.0 | 651 | 0.1064 |
+ | 0.1655 | 32.0 | 672 | 0.1049 |
+ | 0.1655 | 33.0 | 693 | 0.1038 |
+ | 0.1655 | 34.0 | 714 | 0.1054 |
+ | 0.1655 | 35.0 | 735 | 0.1016 |
+ | 0.1655 | 36.0 | 756 | 0.1076 |
+ | 0.1655 | 37.0 | 777 | 0.1047 |
+ | 0.1655 | 38.0 | 798 | 0.1055 |
+ | 0.1655 | 39.0 | 819 | 0.1056 |
+ | 0.1655 | 40.0 | 840 | 0.1054 |
 
 
  ### Framework versions
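The step counts in both result tables are internally consistent with one small training set: at the old batch size of 30 each epoch took 13 optimizer steps (40 epochs → 520 steps), and at the new batch size of 18 each epoch takes 21 steps (40 epochs → 840 steps). The card does not state the dataset size, but the two step counts together bound it; a minimal sketch of that arithmetic (the derived range is an inference, not something stated in the card):

```python
import math

def steps_per_epoch(n_examples: int, batch_size: int) -> int:
    # One optimizer step per batch; the final partial batch still counts.
    return math.ceil(n_examples / batch_size)

# Dataset sizes compatible with both runs:
# 13 steps/epoch at batch 30, and 21 steps/epoch at batch 18.
candidates = [
    n for n in range(1, 2000)
    if steps_per_epoch(n, 30) == 13 and steps_per_epoch(n, 18) == 21
]
print(min(candidates), max(candidates))  # 361 378
print(40 * 13, 40 * 21)                  # 520 840, matching the final Step columns
```

So the training set appears to hold between 361 and 378 examples, which also explains the "No log" entries: with a default `logging_steps=500`, no training loss is reported until step 500 is reached.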
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1e16c1cc2ecdb9050e6ca4674a9376a603370203000d730d9d851633347b2cc0
+ oid sha256:96673dee63efb688f212714df7de66837e70d8271f97e2dc0c876217adc6c2fa
  size 408366800
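The model.safetensors entry above is a Git LFS pointer file, not the weights themselves: three `key value` lines giving the pointer spec version, the SHA-256 digest of the actual blob, and its size in bytes (here unchanged at 408366800, since only the weight values changed). A small sketch of parsing that format, using the new oid from this commit:

```python
def parse_lfs_pointer(text: str) -> dict:
    # Each non-empty line is "key value"; the oid value is "<algo>:<digest>".
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:96673dee63efb688f212714df7de66837e70d8271f97e2dc0c876217adc6c2fa
size 408366800
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size_bytes"])  # sha256 408366800
```

Because the pointer stores only the digest, a one-line diff like the one above is how every retraining shows up in the repository history, regardless of how many weights actually changed.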