stefan-it committed on
Commit
d0a092e
1 Parent(s): 13d7766

Upload folder using huggingface_hub

best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:72710eaf88b0495a8774d24ddcaae4ba48ebc7a25e83b6f41f674af2ad143871
+ size 870841135
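best-model.pt is stored as a Git LFS pointer, so the ~870 MB checkpoint is only materialized after `git lfs pull` or a Hub download. A minimal sketch of loading it with Flair (the example sentence and the "ner" label type are illustrative assumptions):

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the fine-tuned tagger from the downloaded checkpoint
# (run `git lfs pull` or download best-model.pt from the Hub first).
tagger = SequenceTagger.load("best-model.pt")

# Tag a hypothetical German sentence and print the predicted spans.
sentence = Sentence("Johann Wolfgang von Goethe wurde in Frankfurt geboren .")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span)
```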
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 18:31:57 0.0001 1.7931 0.4706 0.0000 0.0000 0.0000 0.0000
+ 2 18:34:59 0.0001 0.3534 0.2465 0.4365 0.4894 0.4615 0.3434
+ 3 18:38:03 0.0001 0.2006 0.1748 0.6836 0.6505 0.6667 0.5296
+ 4 18:41:07 0.0001 0.1161 0.1636 0.7406 0.7099 0.7250 0.5927
+ 5 18:44:13 0.0001 0.0719 0.1596 0.7544 0.7396 0.7469 0.6191
+ 6 18:47:20 0.0001 0.0481 0.1803 0.7395 0.7568 0.7481 0.6177
+ 7 18:50:27 0.0001 0.0338 0.1989 0.7641 0.7545 0.7592 0.6299
+ 8 18:53:34 0.0000 0.0249 0.2040 0.7626 0.7561 0.7593 0.6283
+ 9 18:56:42 0.0000 0.0201 0.2132 0.7454 0.7553 0.7503 0.6176
+ 10 18:59:45 0.0000 0.0180 0.2214 0.7526 0.7490 0.7508 0.6177
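loss.tsv is the per-epoch metrics table Flair writes during training (tab-separated, one row per epoch). A small sketch for inspecting it, assuming pandas is available:

```python
import pandas as pd

# One row per epoch: train loss, dev loss and dev precision/recall/F1/accuracy.
metrics = pd.read_csv("loss.tsv", sep="\t")

# The best dev F1 in this run is 0.7593, reached after epoch 8.
best = metrics.loc[metrics["DEV_F1"].idxmax()]
print(best[["EPOCH", "TRAIN_LOSS", "DEV_LOSS", "DEV_F1"]])
```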
runs/events.out.tfevents.1697308136.d3463e005216.2433.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7dcfa0d0d7cb11e125b9615921ab2d570129eed18a8e4f1135e0b1fd15f8aace
+ size 253592
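The runs/ folder holds the TensorBoard event file written by the TensorboardLogger plugin listed in training.log. A sketch for reading the logged scalars without launching TensorBoard, assuming the tensorboard package is installed (the exact tag names depend on Flair's logger):

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Point the accumulator at the directory containing the event file.
acc = EventAccumulator("runs")
acc.Reload()

# List the available scalar tags, then print one series as (step, value) pairs.
tags = acc.Tags()["scalars"]
print(tags)
for event in acc.Scalars(tags[0]):
    print(event.step, event.value)
```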
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,266 @@
+ 2023-10-14 18:28:56,599 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 18:28:56,600 Model: "SequenceTagger(
+   (embeddings): ByT5Embeddings(
+     (model): T5EncoderModel(
+       (shared): Embedding(384, 1472)
+       (encoder): T5Stack(
+         (embed_tokens): Embedding(384, 1472)
+         (block): ModuleList(
+           (0): T5Block(
+             (layer): ModuleList(
+               (0): T5LayerSelfAttention(
+                 (SelfAttention): T5Attention(
+                   (q): Linear(in_features=1472, out_features=384, bias=False)
+                   (k): Linear(in_features=1472, out_features=384, bias=False)
+                   (v): Linear(in_features=1472, out_features=384, bias=False)
+                   (o): Linear(in_features=384, out_features=1472, bias=False)
+                   (relative_attention_bias): Embedding(32, 6)
+                 )
+                 (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (1): T5LayerFF(
+                 (DenseReluDense): T5DenseGatedActDense(
+                   (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
+                   (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
+                   (wo): Linear(in_features=3584, out_features=1472, bias=False)
+                   (dropout): Dropout(p=0.1, inplace=False)
+                   (act): NewGELUActivation()
+                 )
+                 (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+           )
+           (1-11): 11 x T5Block(
+             (layer): ModuleList(
+               (0): T5LayerSelfAttention(
+                 (SelfAttention): T5Attention(
+                   (q): Linear(in_features=1472, out_features=384, bias=False)
+                   (k): Linear(in_features=1472, out_features=384, bias=False)
+                   (v): Linear(in_features=1472, out_features=384, bias=False)
+                   (o): Linear(in_features=384, out_features=1472, bias=False)
+                 )
+                 (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (1): T5LayerFF(
+                 (DenseReluDense): T5DenseGatedActDense(
+                   (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
+                   (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
+                   (wo): Linear(in_features=3584, out_features=1472, bias=False)
+                   (dropout): Dropout(p=0.1, inplace=False)
+                   (act): NewGELUActivation()
+                 )
+                 (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+           )
+         )
+         (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=1472, out_features=21, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-14 18:28:56,600 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 18:28:56,600 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
+  - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
+ 2023-10-14 18:28:56,600 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 18:28:56,600 Train: 3575 sentences
+ 2023-10-14 18:28:56,600 (train_with_dev=False, train_with_test=False)
+ 2023-10-14 18:28:56,600 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 18:28:56,600 Training Params:
+ 2023-10-14 18:28:56,600 - learning_rate: "0.00015"
+ 2023-10-14 18:28:56,601 - mini_batch_size: "8"
+ 2023-10-14 18:28:56,601 - max_epochs: "10"
+ 2023-10-14 18:28:56,601 - shuffle: "True"
+ 2023-10-14 18:28:56,601 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 18:28:56,601 Plugins:
+ 2023-10-14 18:28:56,601 - TensorboardLogger
+ 2023-10-14 18:28:56,601 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-14 18:28:56,601 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 18:28:56,601 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-14 18:28:56,601 - metric: "('micro avg', 'f1-score')"
+ 2023-10-14 18:28:56,601 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 18:28:56,601 Computation:
+ 2023-10-14 18:28:56,601 - compute on device: cuda:0
+ 2023-10-14 18:28:56,601 - embedding storage: none
+ 2023-10-14 18:28:56,601 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 18:28:56,601 Model training base path: "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1"
+ 2023-10-14 18:28:56,601 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 18:28:56,601 ----------------------------------------------------------------------------------------------------
+ 2023-10-14 18:28:56,601 Logging anything other than scalars to TensorBoard is currently not supported.
98
+ 2023-10-14 18:29:13,026 epoch 1 - iter 44/447 - loss 3.04968209 - time (sec): 16.42 - samples/sec: 558.83 - lr: 0.000014 - momentum: 0.000000
99
+ 2023-10-14 18:29:27,978 epoch 1 - iter 88/447 - loss 3.03220833 - time (sec): 31.38 - samples/sec: 550.16 - lr: 0.000029 - momentum: 0.000000
100
+ 2023-10-14 18:29:43,363 epoch 1 - iter 132/447 - loss 2.98006060 - time (sec): 46.76 - samples/sec: 545.03 - lr: 0.000044 - momentum: 0.000000
101
+ 2023-10-14 18:29:58,651 epoch 1 - iter 176/447 - loss 2.85571004 - time (sec): 62.05 - samples/sec: 546.60 - lr: 0.000059 - momentum: 0.000000
102
+ 2023-10-14 18:30:13,328 epoch 1 - iter 220/447 - loss 2.71064544 - time (sec): 76.73 - samples/sec: 540.41 - lr: 0.000073 - momentum: 0.000000
103
+ 2023-10-14 18:30:28,322 epoch 1 - iter 264/447 - loss 2.53638057 - time (sec): 91.72 - samples/sec: 539.60 - lr: 0.000088 - momentum: 0.000000
104
+ 2023-10-14 18:30:44,206 epoch 1 - iter 308/447 - loss 2.32818224 - time (sec): 107.60 - samples/sec: 545.92 - lr: 0.000103 - momentum: 0.000000
105
+ 2023-10-14 18:30:59,374 epoch 1 - iter 352/447 - loss 2.14903375 - time (sec): 122.77 - samples/sec: 546.91 - lr: 0.000118 - momentum: 0.000000
106
+ 2023-10-14 18:31:17,104 epoch 1 - iter 396/447 - loss 1.94372981 - time (sec): 140.50 - samples/sec: 551.15 - lr: 0.000133 - momentum: 0.000000
107
+ 2023-10-14 18:31:32,315 epoch 1 - iter 440/447 - loss 1.81241981 - time (sec): 155.71 - samples/sec: 547.13 - lr: 0.000147 - momentum: 0.000000
108
+ 2023-10-14 18:31:34,703 ----------------------------------------------------------------------------------------------------
109
+ 2023-10-14 18:31:34,703 EPOCH 1 done: loss 1.7931 - lr: 0.000147
110
+ 2023-10-14 18:31:57,276 DEV : loss 0.4706111252307892 - f1-score (micro avg) 0.0
111
+ 2023-10-14 18:31:57,301 ----------------------------------------------------------------------------------------------------
112
+ 2023-10-14 18:32:12,593 epoch 2 - iter 44/447 - loss 0.51221402 - time (sec): 15.29 - samples/sec: 553.41 - lr: 0.000148 - momentum: 0.000000
113
+ 2023-10-14 18:32:27,785 epoch 2 - iter 88/447 - loss 0.49327832 - time (sec): 30.48 - samples/sec: 552.35 - lr: 0.000147 - momentum: 0.000000
114
+ 2023-10-14 18:32:43,466 epoch 2 - iter 132/447 - loss 0.45252612 - time (sec): 46.16 - samples/sec: 567.85 - lr: 0.000145 - momentum: 0.000000
115
+ 2023-10-14 18:33:00,409 epoch 2 - iter 176/447 - loss 0.42537516 - time (sec): 63.11 - samples/sec: 564.16 - lr: 0.000143 - momentum: 0.000000
116
+ 2023-10-14 18:33:15,852 epoch 2 - iter 220/447 - loss 0.40524817 - time (sec): 78.55 - samples/sec: 562.93 - lr: 0.000142 - momentum: 0.000000
117
+ 2023-10-14 18:33:31,601 epoch 2 - iter 264/447 - loss 0.38456250 - time (sec): 94.30 - samples/sec: 561.96 - lr: 0.000140 - momentum: 0.000000
118
+ 2023-10-14 18:33:46,668 epoch 2 - iter 308/447 - loss 0.38166992 - time (sec): 109.37 - samples/sec: 557.13 - lr: 0.000139 - momentum: 0.000000
119
+ 2023-10-14 18:34:02,008 epoch 2 - iter 352/447 - loss 0.37044403 - time (sec): 124.71 - samples/sec: 557.20 - lr: 0.000137 - momentum: 0.000000
120
+ 2023-10-14 18:34:17,604 epoch 2 - iter 396/447 - loss 0.36100765 - time (sec): 140.30 - samples/sec: 555.12 - lr: 0.000135 - momentum: 0.000000
121
+ 2023-10-14 18:34:32,442 epoch 2 - iter 440/447 - loss 0.35295013 - time (sec): 155.14 - samples/sec: 551.20 - lr: 0.000134 - momentum: 0.000000
122
+ 2023-10-14 18:34:34,704 ----------------------------------------------------------------------------------------------------
123
+ 2023-10-14 18:34:34,705 EPOCH 2 done: loss 0.3534 - lr: 0.000134
124
+ 2023-10-14 18:34:59,145 DEV : loss 0.24648237228393555 - f1-score (micro avg) 0.4615
125
+ 2023-10-14 18:34:59,170 saving best model
126
+ 2023-10-14 18:34:59,966 ----------------------------------------------------------------------------------------------------
127
+ 2023-10-14 18:35:15,514 epoch 3 - iter 44/447 - loss 0.28031449 - time (sec): 15.55 - samples/sec: 529.00 - lr: 0.000132 - momentum: 0.000000
128
+ 2023-10-14 18:35:30,601 epoch 3 - iter 88/447 - loss 0.24660222 - time (sec): 30.63 - samples/sec: 533.31 - lr: 0.000130 - momentum: 0.000000
129
+ 2023-10-14 18:35:46,488 epoch 3 - iter 132/447 - loss 0.24011989 - time (sec): 46.52 - samples/sec: 535.83 - lr: 0.000128 - momentum: 0.000000
130
+ 2023-10-14 18:36:01,757 epoch 3 - iter 176/447 - loss 0.23680025 - time (sec): 61.79 - samples/sec: 538.05 - lr: 0.000127 - momentum: 0.000000
131
+ 2023-10-14 18:36:19,186 epoch 3 - iter 220/447 - loss 0.22759569 - time (sec): 79.22 - samples/sec: 547.03 - lr: 0.000125 - momentum: 0.000000
132
+ 2023-10-14 18:36:34,393 epoch 3 - iter 264/447 - loss 0.22429489 - time (sec): 94.43 - samples/sec: 545.33 - lr: 0.000124 - momentum: 0.000000
133
+ 2023-10-14 18:36:49,787 epoch 3 - iter 308/447 - loss 0.21942290 - time (sec): 109.82 - samples/sec: 543.54 - lr: 0.000122 - momentum: 0.000000
134
+ 2023-10-14 18:37:04,772 epoch 3 - iter 352/447 - loss 0.21213137 - time (sec): 124.80 - samples/sec: 541.12 - lr: 0.000120 - momentum: 0.000000
135
+ 2023-10-14 18:37:20,495 epoch 3 - iter 396/447 - loss 0.20683047 - time (sec): 140.53 - samples/sec: 543.46 - lr: 0.000119 - momentum: 0.000000
136
+ 2023-10-14 18:37:35,931 epoch 3 - iter 440/447 - loss 0.20185030 - time (sec): 155.96 - samples/sec: 545.25 - lr: 0.000117 - momentum: 0.000000
137
+ 2023-10-14 18:37:38,436 ----------------------------------------------------------------------------------------------------
138
+ 2023-10-14 18:37:38,437 EPOCH 3 done: loss 0.2006 - lr: 0.000117
139
+ 2023-10-14 18:38:03,004 DEV : loss 0.17484012246131897 - f1-score (micro avg) 0.6667
140
+ 2023-10-14 18:38:03,030 saving best model
141
+ 2023-10-14 18:38:03,867 ----------------------------------------------------------------------------------------------------
142
+ 2023-10-14 18:38:19,427 epoch 4 - iter 44/447 - loss 0.16075713 - time (sec): 15.56 - samples/sec: 535.32 - lr: 0.000115 - momentum: 0.000000
143
+ 2023-10-14 18:38:34,365 epoch 4 - iter 88/447 - loss 0.15141283 - time (sec): 30.50 - samples/sec: 526.90 - lr: 0.000113 - momentum: 0.000000
144
+ 2023-10-14 18:38:49,421 epoch 4 - iter 132/447 - loss 0.14651235 - time (sec): 45.55 - samples/sec: 528.07 - lr: 0.000112 - momentum: 0.000000
145
+ 2023-10-14 18:39:04,810 epoch 4 - iter 176/447 - loss 0.14574267 - time (sec): 60.94 - samples/sec: 528.80 - lr: 0.000110 - momentum: 0.000000
146
+ 2023-10-14 18:39:19,959 epoch 4 - iter 220/447 - loss 0.13882476 - time (sec): 76.09 - samples/sec: 527.66 - lr: 0.000109 - momentum: 0.000000
147
+ 2023-10-14 18:39:35,657 epoch 4 - iter 264/447 - loss 0.13265973 - time (sec): 91.79 - samples/sec: 535.86 - lr: 0.000107 - momentum: 0.000000
148
+ 2023-10-14 18:39:50,775 epoch 4 - iter 308/447 - loss 0.12641890 - time (sec): 106.91 - samples/sec: 535.20 - lr: 0.000105 - momentum: 0.000000
149
+ 2023-10-14 18:40:05,921 epoch 4 - iter 352/447 - loss 0.12439053 - time (sec): 122.05 - samples/sec: 535.52 - lr: 0.000104 - momentum: 0.000000
150
+ 2023-10-14 18:40:23,362 epoch 4 - iter 396/447 - loss 0.12260346 - time (sec): 139.49 - samples/sec: 539.84 - lr: 0.000102 - momentum: 0.000000
151
+ 2023-10-14 18:40:39,750 epoch 4 - iter 440/447 - loss 0.11757667 - time (sec): 155.88 - samples/sec: 544.25 - lr: 0.000100 - momentum: 0.000000
152
+ 2023-10-14 18:40:42,378 ----------------------------------------------------------------------------------------------------
153
+ 2023-10-14 18:40:42,378 EPOCH 4 done: loss 0.1161 - lr: 0.000100
154
+ 2023-10-14 18:41:07,153 DEV : loss 0.16356223821640015 - f1-score (micro avg) 0.725
155
+ 2023-10-14 18:41:07,179 saving best model
156
+ 2023-10-14 18:41:11,734 ----------------------------------------------------------------------------------------------------
157
+ 2023-10-14 18:41:26,859 epoch 5 - iter 44/447 - loss 0.07023867 - time (sec): 15.12 - samples/sec: 506.11 - lr: 0.000098 - momentum: 0.000000
158
+ 2023-10-14 18:41:42,294 epoch 5 - iter 88/447 - loss 0.06478001 - time (sec): 30.56 - samples/sec: 523.37 - lr: 0.000097 - momentum: 0.000000
159
+ 2023-10-14 18:41:58,072 epoch 5 - iter 132/447 - loss 0.06412258 - time (sec): 46.34 - samples/sec: 536.84 - lr: 0.000095 - momentum: 0.000000
160
+ 2023-10-14 18:42:13,295 epoch 5 - iter 176/447 - loss 0.07024363 - time (sec): 61.56 - samples/sec: 538.35 - lr: 0.000094 - momentum: 0.000000
161
+ 2023-10-14 18:42:28,441 epoch 5 - iter 220/447 - loss 0.06745593 - time (sec): 76.70 - samples/sec: 541.44 - lr: 0.000092 - momentum: 0.000000
162
+ 2023-10-14 18:42:45,723 epoch 5 - iter 264/447 - loss 0.07148378 - time (sec): 93.99 - samples/sec: 544.05 - lr: 0.000090 - momentum: 0.000000
163
+ 2023-10-14 18:43:00,596 epoch 5 - iter 308/447 - loss 0.07209831 - time (sec): 108.86 - samples/sec: 542.81 - lr: 0.000089 - momentum: 0.000000
164
+ 2023-10-14 18:43:15,817 epoch 5 - iter 352/447 - loss 0.07099360 - time (sec): 124.08 - samples/sec: 544.94 - lr: 0.000087 - momentum: 0.000000
165
+ 2023-10-14 18:43:31,278 epoch 5 - iter 396/447 - loss 0.07091432 - time (sec): 139.54 - samples/sec: 548.72 - lr: 0.000085 - momentum: 0.000000
166
+ 2023-10-14 18:43:46,569 epoch 5 - iter 440/447 - loss 0.07134499 - time (sec): 154.83 - samples/sec: 550.12 - lr: 0.000084 - momentum: 0.000000
167
+ 2023-10-14 18:43:48,967 ----------------------------------------------------------------------------------------------------
168
+ 2023-10-14 18:43:48,967 EPOCH 5 done: loss 0.0719 - lr: 0.000084
169
+ 2023-10-14 18:44:13,629 DEV : loss 0.1595887392759323 - f1-score (micro avg) 0.7469
170
+ 2023-10-14 18:44:13,655 saving best model
171
+ 2023-10-14 18:44:18,225 ----------------------------------------------------------------------------------------------------
172
+ 2023-10-14 18:44:33,929 epoch 6 - iter 44/447 - loss 0.02958638 - time (sec): 15.70 - samples/sec: 542.37 - lr: 0.000082 - momentum: 0.000000
173
+ 2023-10-14 18:44:49,057 epoch 6 - iter 88/447 - loss 0.03872661 - time (sec): 30.83 - samples/sec: 545.68 - lr: 0.000080 - momentum: 0.000000
174
+ 2023-10-14 18:45:04,426 epoch 6 - iter 132/447 - loss 0.04324198 - time (sec): 46.20 - samples/sec: 546.14 - lr: 0.000079 - momentum: 0.000000
175
+ 2023-10-14 18:45:19,712 epoch 6 - iter 176/447 - loss 0.04411436 - time (sec): 61.48 - samples/sec: 549.13 - lr: 0.000077 - momentum: 0.000000
176
+ 2023-10-14 18:45:34,792 epoch 6 - iter 220/447 - loss 0.04536235 - time (sec): 76.56 - samples/sec: 545.39 - lr: 0.000075 - momentum: 0.000000
177
+ 2023-10-14 18:45:51,942 epoch 6 - iter 264/447 - loss 0.04612584 - time (sec): 93.71 - samples/sec: 546.82 - lr: 0.000074 - momentum: 0.000000
178
+ 2023-10-14 18:46:07,599 epoch 6 - iter 308/447 - loss 0.04566728 - time (sec): 109.37 - samples/sec: 551.32 - lr: 0.000072 - momentum: 0.000000
179
+ 2023-10-14 18:46:23,331 epoch 6 - iter 352/447 - loss 0.04571960 - time (sec): 125.10 - samples/sec: 549.12 - lr: 0.000070 - momentum: 0.000000
180
+ 2023-10-14 18:46:38,316 epoch 6 - iter 396/447 - loss 0.04798319 - time (sec): 140.09 - samples/sec: 546.83 - lr: 0.000069 - momentum: 0.000000
181
+ 2023-10-14 18:46:53,874 epoch 6 - iter 440/447 - loss 0.04832325 - time (sec): 155.65 - samples/sec: 547.34 - lr: 0.000067 - momentum: 0.000000
182
+ 2023-10-14 18:46:56,276 ----------------------------------------------------------------------------------------------------
183
+ 2023-10-14 18:46:56,277 EPOCH 6 done: loss 0.0481 - lr: 0.000067
184
+ 2023-10-14 18:47:20,963 DEV : loss 0.1803191751241684 - f1-score (micro avg) 0.7481
185
+ 2023-10-14 18:47:20,988 saving best model
186
+ 2023-10-14 18:47:25,327 ----------------------------------------------------------------------------------------------------
187
+ 2023-10-14 18:47:42,568 epoch 7 - iter 44/447 - loss 0.04512986 - time (sec): 17.24 - samples/sec: 561.18 - lr: 0.000065 - momentum: 0.000000
188
+ 2023-10-14 18:47:58,148 epoch 7 - iter 88/447 - loss 0.03940830 - time (sec): 32.82 - samples/sec: 558.82 - lr: 0.000064 - momentum: 0.000000
189
+ 2023-10-14 18:48:13,106 epoch 7 - iter 132/447 - loss 0.04551032 - time (sec): 47.78 - samples/sec: 547.80 - lr: 0.000062 - momentum: 0.000000
190
+ 2023-10-14 18:48:28,259 epoch 7 - iter 176/447 - loss 0.04250563 - time (sec): 62.93 - samples/sec: 547.24 - lr: 0.000060 - momentum: 0.000000
191
+ 2023-10-14 18:48:43,733 epoch 7 - iter 220/447 - loss 0.03900809 - time (sec): 78.40 - samples/sec: 549.82 - lr: 0.000059 - momentum: 0.000000
192
+ 2023-10-14 18:48:59,886 epoch 7 - iter 264/447 - loss 0.03682204 - time (sec): 94.56 - samples/sec: 549.51 - lr: 0.000057 - momentum: 0.000000
193
+ 2023-10-14 18:49:15,131 epoch 7 - iter 308/447 - loss 0.03689685 - time (sec): 109.80 - samples/sec: 549.10 - lr: 0.000055 - momentum: 0.000000
194
+ 2023-10-14 18:49:30,170 epoch 7 - iter 352/447 - loss 0.03473793 - time (sec): 124.84 - samples/sec: 548.47 - lr: 0.000054 - momentum: 0.000000
195
+ 2023-10-14 18:49:45,552 epoch 7 - iter 396/447 - loss 0.03496683 - time (sec): 140.22 - samples/sec: 550.10 - lr: 0.000052 - momentum: 0.000000
196
+ 2023-10-14 18:50:00,748 epoch 7 - iter 440/447 - loss 0.03374463 - time (sec): 155.42 - samples/sec: 548.61 - lr: 0.000050 - momentum: 0.000000
197
+ 2023-10-14 18:50:03,115 ----------------------------------------------------------------------------------------------------
198
+ 2023-10-14 18:50:03,116 EPOCH 7 done: loss 0.0338 - lr: 0.000050
199
+ 2023-10-14 18:50:27,933 DEV : loss 0.1989319771528244 - f1-score (micro avg) 0.7592
200
+ 2023-10-14 18:50:27,958 saving best model
201
+ 2023-10-14 18:50:32,398 ----------------------------------------------------------------------------------------------------
202
+ 2023-10-14 18:50:47,481 epoch 8 - iter 44/447 - loss 0.02816068 - time (sec): 15.08 - samples/sec: 542.31 - lr: 0.000049 - momentum: 0.000000
203
+ 2023-10-14 18:51:03,151 epoch 8 - iter 88/447 - loss 0.03674012 - time (sec): 30.75 - samples/sec: 546.65 - lr: 0.000047 - momentum: 0.000000
204
+ 2023-10-14 18:51:18,134 epoch 8 - iter 132/447 - loss 0.03190669 - time (sec): 45.73 - samples/sec: 540.12 - lr: 0.000045 - momentum: 0.000000
205
+ 2023-10-14 18:51:33,859 epoch 8 - iter 176/447 - loss 0.02926721 - time (sec): 61.46 - samples/sec: 552.77 - lr: 0.000044 - momentum: 0.000000
206
+ 2023-10-14 18:51:49,605 epoch 8 - iter 220/447 - loss 0.02779927 - time (sec): 77.21 - samples/sec: 558.46 - lr: 0.000042 - momentum: 0.000000
207
+ 2023-10-14 18:52:05,003 epoch 8 - iter 264/447 - loss 0.02652766 - time (sec): 92.60 - samples/sec: 551.05 - lr: 0.000040 - momentum: 0.000000
208
+ 2023-10-14 18:52:21,852 epoch 8 - iter 308/447 - loss 0.02801775 - time (sec): 109.45 - samples/sec: 549.38 - lr: 0.000039 - momentum: 0.000000
209
+ 2023-10-14 18:52:36,870 epoch 8 - iter 352/447 - loss 0.02681118 - time (sec): 124.47 - samples/sec: 548.52 - lr: 0.000037 - momentum: 0.000000
210
+ 2023-10-14 18:52:52,122 epoch 8 - iter 396/447 - loss 0.02608349 - time (sec): 139.72 - samples/sec: 548.04 - lr: 0.000035 - momentum: 0.000000
211
+ 2023-10-14 18:53:07,402 epoch 8 - iter 440/447 - loss 0.02505040 - time (sec): 155.00 - samples/sec: 549.68 - lr: 0.000034 - momentum: 0.000000
212
+ 2023-10-14 18:53:09,819 ----------------------------------------------------------------------------------------------------
213
+ 2023-10-14 18:53:09,819 EPOCH 8 done: loss 0.0249 - lr: 0.000034
214
+ 2023-10-14 18:53:34,610 DEV : loss 0.20397181808948517 - f1-score (micro avg) 0.7593
215
+ 2023-10-14 18:53:34,635 saving best model
216
+ 2023-10-14 18:53:38,825 ----------------------------------------------------------------------------------------------------
217
+ 2023-10-14 18:53:56,216 epoch 9 - iter 44/447 - loss 0.03413553 - time (sec): 17.39 - samples/sec: 559.05 - lr: 0.000032 - momentum: 0.000000
218
+ 2023-10-14 18:54:12,137 epoch 9 - iter 88/447 - loss 0.02537551 - time (sec): 33.31 - samples/sec: 563.45 - lr: 0.000030 - momentum: 0.000000
219
+ 2023-10-14 18:54:27,605 epoch 9 - iter 132/447 - loss 0.02230186 - time (sec): 48.78 - samples/sec: 561.25 - lr: 0.000029 - momentum: 0.000000
220
+ 2023-10-14 18:54:43,093 epoch 9 - iter 176/447 - loss 0.02191161 - time (sec): 64.27 - samples/sec: 562.30 - lr: 0.000027 - momentum: 0.000000
221
+ 2023-10-14 18:54:57,966 epoch 9 - iter 220/447 - loss 0.02003936 - time (sec): 79.14 - samples/sec: 553.96 - lr: 0.000025 - momentum: 0.000000
222
+ 2023-10-14 18:55:13,466 epoch 9 - iter 264/447 - loss 0.02206598 - time (sec): 94.64 - samples/sec: 550.03 - lr: 0.000024 - momentum: 0.000000
223
+ 2023-10-14 18:55:28,451 epoch 9 - iter 308/447 - loss 0.02063661 - time (sec): 109.62 - samples/sec: 545.60 - lr: 0.000022 - momentum: 0.000000
224
+ 2023-10-14 18:55:43,831 epoch 9 - iter 352/447 - loss 0.02036125 - time (sec): 125.00 - samples/sec: 545.41 - lr: 0.000020 - momentum: 0.000000
225
+ 2023-10-14 18:55:59,355 epoch 9 - iter 396/447 - loss 0.01930381 - time (sec): 140.53 - samples/sec: 546.14 - lr: 0.000019 - momentum: 0.000000
226
+ 2023-10-14 18:56:14,934 epoch 9 - iter 440/447 - loss 0.02015557 - time (sec): 156.11 - samples/sec: 545.94 - lr: 0.000017 - momentum: 0.000000
227
+ 2023-10-14 18:56:17,335 ----------------------------------------------------------------------------------------------------
228
+ 2023-10-14 18:56:17,335 EPOCH 9 done: loss 0.0201 - lr: 0.000017
229
+ 2023-10-14 18:56:42,347 DEV : loss 0.2131572663784027 - f1-score (micro avg) 0.7503
230
+ 2023-10-14 18:56:42,372 ----------------------------------------------------------------------------------------------------
231
+ 2023-10-14 18:56:57,773 epoch 10 - iter 44/447 - loss 0.02127213 - time (sec): 15.40 - samples/sec: 568.28 - lr: 0.000015 - momentum: 0.000000
232
+ 2023-10-14 18:57:12,525 epoch 10 - iter 88/447 - loss 0.01811628 - time (sec): 30.15 - samples/sec: 544.58 - lr: 0.000014 - momentum: 0.000000
233
+ 2023-10-14 18:57:27,600 epoch 10 - iter 132/447 - loss 0.01611064 - time (sec): 45.23 - samples/sec: 545.36 - lr: 0.000012 - momentum: 0.000000
234
+ 2023-10-14 18:57:43,311 epoch 10 - iter 176/447 - loss 0.01554251 - time (sec): 60.94 - samples/sec: 551.21 - lr: 0.000010 - momentum: 0.000000
235
+ 2023-10-14 18:58:01,072 epoch 10 - iter 220/447 - loss 0.01884512 - time (sec): 78.70 - samples/sec: 556.12 - lr: 0.000009 - momentum: 0.000000
236
+ 2023-10-14 18:58:16,786 epoch 10 - iter 264/447 - loss 0.01782274 - time (sec): 94.41 - samples/sec: 551.69 - lr: 0.000007 - momentum: 0.000000
237
+ 2023-10-14 18:58:31,821 epoch 10 - iter 308/447 - loss 0.01717610 - time (sec): 109.45 - samples/sec: 547.71 - lr: 0.000005 - momentum: 0.000000
238
+ 2023-10-14 18:58:46,654 epoch 10 - iter 352/447 - loss 0.01630284 - time (sec): 124.28 - samples/sec: 543.75 - lr: 0.000004 - momentum: 0.000000
239
+ 2023-10-14 18:59:02,118 epoch 10 - iter 396/447 - loss 0.01612965 - time (sec): 139.74 - samples/sec: 545.16 - lr: 0.000002 - momentum: 0.000000
240
+ 2023-10-14 18:59:18,081 epoch 10 - iter 440/447 - loss 0.01766576 - time (sec): 155.71 - samples/sec: 547.01 - lr: 0.000001 - momentum: 0.000000
241
+ 2023-10-14 18:59:20,506 ----------------------------------------------------------------------------------------------------
242
+ 2023-10-14 18:59:20,506 EPOCH 10 done: loss 0.0180 - lr: 0.000001
243
+ 2023-10-14 18:59:45,871 DEV : loss 0.22141988575458527 - f1-score (micro avg) 0.7508
244
+ 2023-10-14 18:59:46,688 ----------------------------------------------------------------------------------------------------
245
+ 2023-10-14 18:59:46,689 Loading model from best epoch ...
246
+ 2023-10-14 18:59:49,776 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
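The 21 tags are the BIOES scheme over the five HIPE entity types plus the outside tag: 5 types x 4 positional prefixes + O = 21. A tiny illustration:

```python
# Reconstruct the tag dictionary printed above: BIOES prefixes over 5 entity types.
entity_types = ["loc", "pers", "org", "prod", "time"]
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]
assert len(tags) == 21
print(tags)
```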
+ 2023-10-14 19:00:11,248
+ Results:
+ - F-score (micro) 0.7519
+ - F-score (macro) 0.6501
+ - Accuracy 0.6163
+
+ By class:
+               precision    recall  f1-score   support
+
+          loc     0.8339    0.8674    0.8503       596
+         pers     0.6772    0.7748    0.7227       333
+          org     0.5263    0.5303    0.5283       132
+         prod     0.6296    0.5152    0.5667        66
+         time     0.5556    0.6122    0.5825        49
+
+    micro avg     0.7319    0.7730    0.7519      1176
+    macro avg     0.6445    0.6600    0.6501      1176
+ weighted avg     0.7319    0.7730    0.7510      1176
+
+ 2023-10-14 19:00:11,249 ----------------------------------------------------------------------------------------------------
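For reference, the run can be approximated from the hyperparameters logged above (learning rate 0.00015, mini-batch size 8, 10 epochs, linear schedule with 10 % warmup, subtoken pooling "first", last layer only, no CRF). The sketch below is not the exact hmBench script: the ByT5Embeddings printed in the log is a custom wrapper, so the stock TransformerWordEmbeddings stands in for it, and the backbone identifier is inferred from the training base path; treat both as assumptions.

```python
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2020 German corpus with document separators, as logged above
# (flag name as in recent flair releases).
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="de", add_document_separator=True)
label_dict = corpus.make_label_dictionary(label_type="ner")

# Stand-in for the ByT5Embeddings shown in the log: pooling "first", last layer only,
# fine-tuned end to end. The model id is inferred from the base path above.
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,              # unused without an RNN/CRF, required by the constructor
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() applies a linear LR schedule with warmup, matching the
# "LinearScheduler | warmup_fraction: '0.1'" plugin entry in the log.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1",
    learning_rate=0.00015,
    mini_batch_size=8,
    max_epochs=10,
)
```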