Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697667582.46dc0c540dd0.3571.5 +3 -0
- test.tsv +0 -0
- training.log +244 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:88b3c16cfcccd7cd2ae8f901764b7d79b6b67e087a14cfeb8f26bd64d4d49b9a
+size 19045922
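The three added lines are a Git LFS pointer, not the model weights themselves: they record the pointer spec version, the SHA-256 digest of the real file, and its size in bytes. As a minimal sketch of how such a pointer can be read (the helper name `parse_lfs_pointer` is hypothetical and not part of any library):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as '<hash-algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the best-model.pt diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:88b3c16cfcccd7cd2ae8f901764b7d79b6b67e087a14cfeb8f26bd64d4d49b9a
size 19045922
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algorithm"], info["size_bytes"])
```

The size field puts the actual checkpoint at roughly 19 MB.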
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      22:20:07   0.0000         0.6319      0.2580    0.5714         0.0165      0.0321  0.0163
+2      22:20:33   0.0000         0.2018      0.2216    0.5143         0.3533      0.4189  0.2767
+3      22:20:59   0.0000         0.1743      0.2401    0.6168         0.2727      0.3782  0.2393
+4      22:21:25   0.0000         0.1564      0.1863    0.5332         0.4308      0.4766  0.3255
+5      22:21:51   0.0000         0.1420      0.1872    0.5277         0.4618      0.4926  0.3420
+6      22:22:17   0.0000         0.1338      0.1842    0.4944         0.5000      0.4972  0.3470
+7      22:22:43   0.0000         0.1285      0.1963    0.5589         0.4556      0.5020  0.3462
+8      22:23:09   0.0000         0.1209      0.1914    0.5757         0.5145      0.5434  0.3866
+9      22:23:35   0.0000         0.1177      0.1922    0.5496         0.5093      0.5287  0.3732
+10     22:24:01   0.0000         0.1154      0.1951    0.5640         0.5010      0.5306  0.3737
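The per-epoch rows in loss.tsv can be scanned to see which epoch scored best on the dev set. The following is a small sketch that embeds the rows above as a string (a real script would read the file from disk; the helper name `read_rows` is hypothetical). It also shows that the epoch with the highest DEV_F1 is not the one with the lowest DEV_LOSS.

```python
# Rows copied from loss.tsv; columns are whitespace-separated.
LOSS_TSV = """EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 22:20:07 0.0000 0.6319 0.2580 0.5714 0.0165 0.0321 0.0163
2 22:20:33 0.0000 0.2018 0.2216 0.5143 0.3533 0.4189 0.2767
3 22:20:59 0.0000 0.1743 0.2401 0.6168 0.2727 0.3782 0.2393
4 22:21:25 0.0000 0.1564 0.1863 0.5332 0.4308 0.4766 0.3255
5 22:21:51 0.0000 0.1420 0.1872 0.5277 0.4618 0.4926 0.3420
6 22:22:17 0.0000 0.1338 0.1842 0.4944 0.5000 0.4972 0.3470
7 22:22:43 0.0000 0.1285 0.1963 0.5589 0.4556 0.5020 0.3462
8 22:23:09 0.0000 0.1209 0.1914 0.5757 0.5145 0.5434 0.3866
9 22:23:35 0.0000 0.1177 0.1922 0.5496 0.5093 0.5287 0.3732
10 22:24:01 0.0000 0.1154 0.1951 0.5640 0.5010 0.5306 0.3737
"""


def read_rows(text: str) -> list[dict]:
    """Turn the header line plus data lines into a list of dicts."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = read_rows(LOSS_TSV)
best_f1 = max(rows, key=lambda r: float(r["DEV_F1"]))
lowest_loss = min(rows, key=lambda r: float(r["DEV_LOSS"]))

print("best DEV_F1:", best_f1["EPOCH"], best_f1["DEV_F1"])          # epoch 8, 0.5434
print("lowest DEV_LOSS:", lowest_loss["EPOCH"], lowest_loss["DEV_LOSS"])  # epoch 6, 0.1842
```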
runs/events.out.tfevents.1697667582.46dc0c540dd0.3571.5
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:299ed097312b5e4e3e949169ba14889a58a5f1dcd6c07133649f2b2250116833
+size 808480
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,244 @@
+2023-10-18 22:19:42,151 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:19:42,151 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 128)
+        (position_embeddings): Embedding(512, 128)
+        (token_type_embeddings): Embedding(2, 128)
+        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-1): 2 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=128, out_features=128, bias=True)
+                (key): Linear(in_features=128, out_features=128, bias=True)
+                (value): Linear(in_features=128, out_features=128, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=128, out_features=128, bias=True)
+                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=128, out_features=512, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=512, out_features=128, bias=True)
+              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=128, out_features=128, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=128, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-18 22:19:42,151 ----------------------------------------------------------------------------------------------------
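The layer shapes printed in the model dump are enough to estimate the number of trainable parameters. The sketch below is a back-of-the-envelope count, not output from the training run. It assumes the usual PyTorch conventions: a `Linear` with bias holds `in * out + out` weights, a `LayerNorm` with `elementwise_affine=True` holds `2 * dim`, and an `Embedding` holds `rows * dim`. Dropout, activation, and loss modules carry no parameters, and non-trainable buffers are ignored. The helper names are hypothetical.

```python
def linear(n_in: int, n_out: int) -> int:
    """Weights plus bias of a Linear layer."""
    return n_in * n_out + n_out


def layer_norm(dim: int) -> int:
    """Scale and shift vectors of an affine LayerNorm."""
    return 2 * dim


def embedding(rows: int, dim: int) -> int:
    """One dim-sized vector per row."""
    return rows * dim


# Sizes read directly from the model dump.
HIDDEN, FFN, LAYERS, TAGS = 128, 512, 2, 13

embeddings = (
    embedding(32001, HIDDEN)   # word_embeddings
    + embedding(512, HIDDEN)   # position_embeddings
    + embedding(2, HIDDEN)     # token_type_embeddings
    + layer_norm(HIDDEN)
)
bert_layer = (
    4 * linear(HIDDEN, HIDDEN)  # query, key, value, attention output dense
    + layer_norm(HIDDEN)        # attention output LayerNorm
    + linear(HIDDEN, FFN)       # intermediate dense
    + linear(FFN, HIDDEN)       # output dense
    + layer_norm(HIDDEN)        # output LayerNorm
)
pooler = linear(HIDDEN, HIDDEN)
tagger_head = linear(HIDDEN, TAGS)

total = embeddings + LAYERS * bert_layer + pooler + tagger_head
print(f"embeddings={embeddings:,} per_layer={bert_layer:,} total={total:,}")
```

Under these assumptions, about 91% of the weights sit in the embedding tables, which is typical for a two-layer encoder with a 32k vocabulary.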
+2023-10-18 22:19:42,151 MultiCorpus: 5777 train + 722 dev + 723 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
+2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:19:42,152 Train: 5777 sentences
+2023-10-18 22:19:42,152 (train_with_dev=False, train_with_test=False)
+2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:19:42,152 Training Params:
+2023-10-18 22:19:42,152 - learning_rate: "5e-05"
+2023-10-18 22:19:42,152 - mini_batch_size: "4"
+2023-10-18 22:19:42,152 - max_epochs: "10"
+2023-10-18 22:19:42,152 - shuffle: "True"
+2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:19:42,152 Plugins:
+2023-10-18 22:19:42,152 - TensorboardLogger
+2023-10-18 22:19:42,152 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
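The training parameters and the LinearScheduler plugin together determine the `lr` values that appear in the per-iteration log lines. The sketch below is an illustration, not Flair's implementation. It assumes the common linear-warmup shape: the rate rises linearly from 0 to the peak over the first `warmup_fraction` of all steps, then decays linearly to 0. The function name `linear_schedule_lr` is hypothetical.

```python
import math


def linear_schedule_lr(step: int, total_steps: int, peak_lr: float,
                       warmup_fraction: float) -> float:
    """Learning rate after `step` optimizer steps (assumed schedule shape)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup: ramp up linearly from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay: ramp down linearly from peak_lr to 0 at the last step.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


# Values taken from the log: 5777 training sentences, mini_batch_size 4,
# max_epochs 10, learning_rate 5e-05, warmup_fraction 0.1.
steps_per_epoch = math.ceil(5777 / 4)   # matches the "iter .../1445" counter
total_steps = steps_per_epoch * 10

for epoch, it in [(1, 144), (1, 1440), (2, 144), (10, 1440)]:
    step = (epoch - 1) * steps_per_epoch + it
    lr = linear_schedule_lr(step, total_steps, 5e-05, 0.1)
    print(f"epoch {epoch} iter {it}: lr {lr:.6f}")
```

With ten epochs and a warmup fraction of 0.1, the warmup under this assumption spans exactly the first epoch, which is consistent with the logged rate climbing to 0.000050 by the end of epoch 1 and falling afterwards.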
+2023-10-18 22:19:42,152 Final evaluation on model from best epoch (best-model.pt)
+2023-10-18 22:19:42,152 - metric: "('micro avg', 'f1-score')"
+2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:19:42,152 Computation:
+2023-10-18 22:19:42,152 - compute on device: cuda:0
+2023-10-18 22:19:42,152 - embedding storage: none
+2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:19:42,152 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
+2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:19:42,152 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-18 22:19:44,520 epoch 1 - iter 144/1445 - loss 2.32460041 - time (sec): 2.37 - samples/sec: 7843.84 - lr: 0.000005 - momentum: 0.000000
+2023-10-18 22:19:46,876 epoch 1 - iter 288/1445 - loss 1.92642473 - time (sec): 4.72 - samples/sec: 7464.80 - lr: 0.000010 - momentum: 0.000000
+2023-10-18 22:19:49,279 epoch 1 - iter 432/1445 - loss 1.48629302 - time (sec): 7.13 - samples/sec: 7314.71 - lr: 0.000015 - momentum: 0.000000
+2023-10-18 22:19:51,709 epoch 1 - iter 576/1445 - loss 1.18873264 - time (sec): 9.56 - samples/sec: 7369.69 - lr: 0.000020 - momentum: 0.000000
+2023-10-18 22:19:54,136 epoch 1 - iter 720/1445 - loss 1.01230605 - time (sec): 11.98 - samples/sec: 7321.67 - lr: 0.000025 - momentum: 0.000000
+2023-10-18 22:19:56,577 epoch 1 - iter 864/1445 - loss 0.89552495 - time (sec): 14.42 - samples/sec: 7303.88 - lr: 0.000030 - momentum: 0.000000
+2023-10-18 22:19:59,023 epoch 1 - iter 1008/1445 - loss 0.80131175 - time (sec): 16.87 - samples/sec: 7321.53 - lr: 0.000035 - momentum: 0.000000
+2023-10-18 22:20:01,418 epoch 1 - iter 1152/1445 - loss 0.73544654 - time (sec): 19.26 - samples/sec: 7307.75 - lr: 0.000040 - momentum: 0.000000
+2023-10-18 22:20:03,856 epoch 1 - iter 1296/1445 - loss 0.67844139 - time (sec): 21.70 - samples/sec: 7311.07 - lr: 0.000045 - momentum: 0.000000
+2023-10-18 22:20:06,242 epoch 1 - iter 1440/1445 - loss 0.63343893 - time (sec): 24.09 - samples/sec: 7293.28 - lr: 0.000050 - momentum: 0.000000
+2023-10-18 22:20:06,318 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:20:06,318 EPOCH 1 done: loss 0.6319 - lr: 0.000050
+2023-10-18 22:20:07,617 DEV : loss 0.2580135762691498 - f1-score (micro avg) 0.0321
+2023-10-18 22:20:07,631 saving best model
+2023-10-18 22:20:07,663 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:20:10,040 epoch 2 - iter 144/1445 - loss 0.25850124 - time (sec): 2.38 - samples/sec: 6901.52 - lr: 0.000049 - momentum: 0.000000
+2023-10-18 22:20:12,456 epoch 2 - iter 288/1445 - loss 0.21607017 - time (sec): 4.79 - samples/sec: 7204.82 - lr: 0.000049 - momentum: 0.000000
+2023-10-18 22:20:14,896 epoch 2 - iter 432/1445 - loss 0.21285833 - time (sec): 7.23 - samples/sec: 7316.52 - lr: 0.000048 - momentum: 0.000000
+2023-10-18 22:20:17,318 epoch 2 - iter 576/1445 - loss 0.20677689 - time (sec): 9.65 - samples/sec: 7313.33 - lr: 0.000048 - momentum: 0.000000
+2023-10-18 22:20:19,730 epoch 2 - iter 720/1445 - loss 0.20570996 - time (sec): 12.07 - samples/sec: 7274.59 - lr: 0.000047 - momentum: 0.000000
+2023-10-18 22:20:22,104 epoch 2 - iter 864/1445 - loss 0.20513499 - time (sec): 14.44 - samples/sec: 7277.59 - lr: 0.000047 - momentum: 0.000000
+2023-10-18 22:20:24,416 epoch 2 - iter 1008/1445 - loss 0.20635190 - time (sec): 16.75 - samples/sec: 7260.09 - lr: 0.000046 - momentum: 0.000000
+2023-10-18 22:20:26,712 epoch 2 - iter 1152/1445 - loss 0.20450060 - time (sec): 19.05 - samples/sec: 7370.36 - lr: 0.000046 - momentum: 0.000000
+2023-10-18 22:20:29,099 epoch 2 - iter 1296/1445 - loss 0.20474662 - time (sec): 21.43 - samples/sec: 7327.74 - lr: 0.000045 - momentum: 0.000000
+2023-10-18 22:20:31,543 epoch 2 - iter 1440/1445 - loss 0.20185081 - time (sec): 23.88 - samples/sec: 7358.25 - lr: 0.000044 - momentum: 0.000000
+2023-10-18 22:20:31,615 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:20:31,615 EPOCH 2 done: loss 0.2018 - lr: 0.000044
+2023-10-18 22:20:33,720 DEV : loss 0.22160635888576508 - f1-score (micro avg) 0.4189
+2023-10-18 22:20:33,734 saving best model
+2023-10-18 22:20:33,772 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:20:36,151 epoch 3 - iter 144/1445 - loss 0.17588072 - time (sec): 2.38 - samples/sec: 7116.26 - lr: 0.000044 - momentum: 0.000000
+2023-10-18 22:20:38,467 epoch 3 - iter 288/1445 - loss 0.18389477 - time (sec): 4.69 - samples/sec: 7249.76 - lr: 0.000043 - momentum: 0.000000
+2023-10-18 22:20:41,000 epoch 3 - iter 432/1445 - loss 0.18625268 - time (sec): 7.23 - samples/sec: 7422.60 - lr: 0.000043 - momentum: 0.000000
+2023-10-18 22:20:43,401 epoch 3 - iter 576/1445 - loss 0.18382187 - time (sec): 9.63 - samples/sec: 7308.74 - lr: 0.000042 - momentum: 0.000000
+2023-10-18 22:20:45,750 epoch 3 - iter 720/1445 - loss 0.18063974 - time (sec): 11.98 - samples/sec: 7350.99 - lr: 0.000042 - momentum: 0.000000
+2023-10-18 22:20:48,188 epoch 3 - iter 864/1445 - loss 0.17872188 - time (sec): 14.42 - samples/sec: 7299.23 - lr: 0.000041 - momentum: 0.000000
+2023-10-18 22:20:50,696 epoch 3 - iter 1008/1445 - loss 0.17759296 - time (sec): 16.92 - samples/sec: 7346.30 - lr: 0.000041 - momentum: 0.000000
+2023-10-18 22:20:53,006 epoch 3 - iter 1152/1445 - loss 0.17684439 - time (sec): 19.23 - samples/sec: 7314.90 - lr: 0.000040 - momentum: 0.000000
+2023-10-18 22:20:55,421 epoch 3 - iter 1296/1445 - loss 0.17496314 - time (sec): 21.65 - samples/sec: 7340.69 - lr: 0.000039 - momentum: 0.000000
+2023-10-18 22:20:57,832 epoch 3 - iter 1440/1445 - loss 0.17443954 - time (sec): 24.06 - samples/sec: 7302.03 - lr: 0.000039 - momentum: 0.000000
+2023-10-18 22:20:57,909 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:20:57,909 EPOCH 3 done: loss 0.1743 - lr: 0.000039
+2023-10-18 22:20:59,671 DEV : loss 0.24006225168704987 - f1-score (micro avg) 0.3782
+2023-10-18 22:20:59,685 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:21:02,053 epoch 4 - iter 144/1445 - loss 0.15704023 - time (sec): 2.37 - samples/sec: 7426.25 - lr: 0.000038 - momentum: 0.000000
+2023-10-18 22:21:04,482 epoch 4 - iter 288/1445 - loss 0.14220206 - time (sec): 4.80 - samples/sec: 7531.94 - lr: 0.000038 - momentum: 0.000000
+2023-10-18 22:21:06,890 epoch 4 - iter 432/1445 - loss 0.14888125 - time (sec): 7.20 - samples/sec: 7379.14 - lr: 0.000037 - momentum: 0.000000
+2023-10-18 22:21:09,305 epoch 4 - iter 576/1445 - loss 0.14940799 - time (sec): 9.62 - samples/sec: 7363.11 - lr: 0.000037 - momentum: 0.000000
+2023-10-18 22:21:11,606 epoch 4 - iter 720/1445 - loss 0.15192125 - time (sec): 11.92 - samples/sec: 7313.30 - lr: 0.000036 - momentum: 0.000000
+2023-10-18 22:21:13,981 epoch 4 - iter 864/1445 - loss 0.15366328 - time (sec): 14.30 - samples/sec: 7376.79 - lr: 0.000036 - momentum: 0.000000
+2023-10-18 22:21:16,386 epoch 4 - iter 1008/1445 - loss 0.15610653 - time (sec): 16.70 - samples/sec: 7371.99 - lr: 0.000035 - momentum: 0.000000
+2023-10-18 22:21:18,758 epoch 4 - iter 1152/1445 - loss 0.15517198 - time (sec): 19.07 - samples/sec: 7361.65 - lr: 0.000034 - momentum: 0.000000
+2023-10-18 22:21:20,878 epoch 4 - iter 1296/1445 - loss 0.15593427 - time (sec): 21.19 - samples/sec: 7424.63 - lr: 0.000034 - momentum: 0.000000
+2023-10-18 22:21:23,147 epoch 4 - iter 1440/1445 - loss 0.15639925 - time (sec): 23.46 - samples/sec: 7485.62 - lr: 0.000033 - momentum: 0.000000
+2023-10-18 22:21:23,246 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:21:23,246 EPOCH 4 done: loss 0.1564 - lr: 0.000033
+2023-10-18 22:21:25,049 DEV : loss 0.18633760511875153 - f1-score (micro avg) 0.4766
+2023-10-18 22:21:25,065 saving best model
+2023-10-18 22:21:25,102 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:21:27,566 epoch 5 - iter 144/1445 - loss 0.14453777 - time (sec): 2.46 - samples/sec: 6981.73 - lr: 0.000033 - momentum: 0.000000
+2023-10-18 22:21:30,047 epoch 5 - iter 288/1445 - loss 0.13657981 - time (sec): 4.94 - samples/sec: 7330.13 - lr: 0.000032 - momentum: 0.000000
+2023-10-18 22:21:32,419 epoch 5 - iter 432/1445 - loss 0.13339998 - time (sec): 7.32 - samples/sec: 7300.07 - lr: 0.000032 - momentum: 0.000000
+2023-10-18 22:21:34,892 epoch 5 - iter 576/1445 - loss 0.13696365 - time (sec): 9.79 - samples/sec: 7313.46 - lr: 0.000031 - momentum: 0.000000
+2023-10-18 22:21:37,351 epoch 5 - iter 720/1445 - loss 0.13780642 - time (sec): 12.25 - samples/sec: 7348.95 - lr: 0.000031 - momentum: 0.000000
+2023-10-18 22:21:39,733 epoch 5 - iter 864/1445 - loss 0.13896165 - time (sec): 14.63 - samples/sec: 7378.00 - lr: 0.000030 - momentum: 0.000000
+2023-10-18 22:21:42,160 epoch 5 - iter 1008/1445 - loss 0.13873270 - time (sec): 17.06 - samples/sec: 7319.29 - lr: 0.000029 - momentum: 0.000000
+2023-10-18 22:21:44,629 epoch 5 - iter 1152/1445 - loss 0.14038297 - time (sec): 19.53 - samples/sec: 7340.56 - lr: 0.000029 - momentum: 0.000000
+2023-10-18 22:21:46,979 epoch 5 - iter 1296/1445 - loss 0.14262285 - time (sec): 21.88 - samples/sec: 7313.89 - lr: 0.000028 - momentum: 0.000000
+2023-10-18 22:21:49,338 epoch 5 - iter 1440/1445 - loss 0.14218793 - time (sec): 24.24 - samples/sec: 7241.77 - lr: 0.000028 - momentum: 0.000000
+2023-10-18 22:21:49,415 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:21:49,416 EPOCH 5 done: loss 0.1420 - lr: 0.000028
+2023-10-18 22:21:51,587 DEV : loss 0.1872384250164032 - f1-score (micro avg) 0.4926
+2023-10-18 22:21:51,602 saving best model
+2023-10-18 22:21:51,637 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:21:54,062 epoch 6 - iter 144/1445 - loss 0.14976542 - time (sec): 2.42 - samples/sec: 7553.71 - lr: 0.000027 - momentum: 0.000000
+2023-10-18 22:21:56,442 epoch 6 - iter 288/1445 - loss 0.14270623 - time (sec): 4.80 - samples/sec: 7364.78 - lr: 0.000027 - momentum: 0.000000
+2023-10-18 22:21:58,860 epoch 6 - iter 432/1445 - loss 0.14528908 - time (sec): 7.22 - samples/sec: 7275.36 - lr: 0.000026 - momentum: 0.000000
+2023-10-18 22:22:01,266 epoch 6 - iter 576/1445 - loss 0.13874679 - time (sec): 9.63 - samples/sec: 7350.00 - lr: 0.000026 - momentum: 0.000000
+2023-10-18 22:22:03,716 epoch 6 - iter 720/1445 - loss 0.13598692 - time (sec): 12.08 - samples/sec: 7368.26 - lr: 0.000025 - momentum: 0.000000
+2023-10-18 22:22:06,081 epoch 6 - iter 864/1445 - loss 0.13518192 - time (sec): 14.44 - samples/sec: 7323.88 - lr: 0.000024 - momentum: 0.000000
+2023-10-18 22:22:08,376 epoch 6 - iter 1008/1445 - loss 0.13600947 - time (sec): 16.74 - samples/sec: 7333.86 - lr: 0.000024 - momentum: 0.000000
+2023-10-18 22:22:10,772 epoch 6 - iter 1152/1445 - loss 0.13696181 - time (sec): 19.13 - samples/sec: 7338.30 - lr: 0.000023 - momentum: 0.000000
+2023-10-18 22:22:13,239 epoch 6 - iter 1296/1445 - loss 0.13477708 - time (sec): 21.60 - samples/sec: 7319.71 - lr: 0.000023 - momentum: 0.000000
+2023-10-18 22:22:15,372 epoch 6 - iter 1440/1445 - loss 0.13372540 - time (sec): 23.73 - samples/sec: 7394.25 - lr: 0.000022 - momentum: 0.000000
+2023-10-18 22:22:15,481 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:22:15,481 EPOCH 6 done: loss 0.1338 - lr: 0.000022
+2023-10-18 22:22:17,258 DEV : loss 0.18421348929405212 - f1-score (micro avg) 0.4972
+2023-10-18 22:22:17,272 saving best model
+2023-10-18 22:22:17,311 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:22:19,730 epoch 7 - iter 144/1445 - loss 0.12880539 - time (sec): 2.42 - samples/sec: 6843.39 - lr: 0.000022 - momentum: 0.000000
+2023-10-18 22:22:22,186 epoch 7 - iter 288/1445 - loss 0.12742790 - time (sec): 4.87 - samples/sec: 7312.81 - lr: 0.000021 - momentum: 0.000000
+2023-10-18 22:22:24,578 epoch 7 - iter 432/1445 - loss 0.12845746 - time (sec): 7.27 - samples/sec: 7399.15 - lr: 0.000021 - momentum: 0.000000
+2023-10-18 22:22:26,964 epoch 7 - iter 576/1445 - loss 0.12798111 - time (sec): 9.65 - samples/sec: 7358.81 - lr: 0.000020 - momentum: 0.000000
+2023-10-18 22:22:29,289 epoch 7 - iter 720/1445 - loss 0.13016094 - time (sec): 11.98 - samples/sec: 7385.20 - lr: 0.000019 - momentum: 0.000000
+2023-10-18 22:22:31,764 epoch 7 - iter 864/1445 - loss 0.12801634 - time (sec): 14.45 - samples/sec: 7342.76 - lr: 0.000019 - momentum: 0.000000
+2023-10-18 22:22:34,149 epoch 7 - iter 1008/1445 - loss 0.12857662 - time (sec): 16.84 - samples/sec: 7337.88 - lr: 0.000018 - momentum: 0.000000
+2023-10-18 22:22:36,569 epoch 7 - iter 1152/1445 - loss 0.13054183 - time (sec): 19.26 - samples/sec: 7408.41 - lr: 0.000018 - momentum: 0.000000
+2023-10-18 22:22:38,945 epoch 7 - iter 1296/1445 - loss 0.13113824 - time (sec): 21.63 - samples/sec: 7358.19 - lr: 0.000017 - momentum: 0.000000
+2023-10-18 22:22:41,336 epoch 7 - iter 1440/1445 - loss 0.12852518 - time (sec): 24.02 - samples/sec: 7317.06 - lr: 0.000017 - momentum: 0.000000
+2023-10-18 22:22:41,408 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:22:41,408 EPOCH 7 done: loss 0.1285 - lr: 0.000017
+2023-10-18 22:22:43,179 DEV : loss 0.19626782834529877 - f1-score (micro avg) 0.502
+2023-10-18 22:22:43,193 saving best model
+2023-10-18 22:22:43,230 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:22:45,753 epoch 8 - iter 144/1445 - loss 0.14102370 - time (sec): 2.52 - samples/sec: 7557.82 - lr: 0.000016 - momentum: 0.000000
+2023-10-18 22:22:48,152 epoch 8 - iter 288/1445 - loss 0.13225445 - time (sec): 4.92 - samples/sec: 7426.91 - lr: 0.000016 - momentum: 0.000000
+2023-10-18 22:22:50,490 epoch 8 - iter 432/1445 - loss 0.12664049 - time (sec): 7.26 - samples/sec: 7311.35 - lr: 0.000015 - momentum: 0.000000
+2023-10-18 22:22:52,853 epoch 8 - iter 576/1445 - loss 0.12982041 - time (sec): 9.62 - samples/sec: 7233.61 - lr: 0.000014 - momentum: 0.000000
+2023-10-18 22:22:55,227 epoch 8 - iter 720/1445 - loss 0.12689940 - time (sec): 12.00 - samples/sec: 7186.26 - lr: 0.000014 - momentum: 0.000000
+2023-10-18 22:22:57,640 epoch 8 - iter 864/1445 - loss 0.12659690 - time (sec): 14.41 - samples/sec: 7191.10 - lr: 0.000013 - momentum: 0.000000
+2023-10-18 22:23:00,061 epoch 8 - iter 1008/1445 - loss 0.12444468 - time (sec): 16.83 - samples/sec: 7235.97 - lr: 0.000013 - momentum: 0.000000
+2023-10-18 22:23:02,526 epoch 8 - iter 1152/1445 - loss 0.12377697 - time (sec): 19.29 - samples/sec: 7266.53 - lr: 0.000012 - momentum: 0.000000
+2023-10-18 22:23:04,948 epoch 8 - iter 1296/1445 - loss 0.12275030 - time (sec): 21.72 - samples/sec: 7270.44 - lr: 0.000012 - momentum: 0.000000
+2023-10-18 22:23:07,360 epoch 8 - iter 1440/1445 - loss 0.12073718 - time (sec): 24.13 - samples/sec: 7281.52 - lr: 0.000011 - momentum: 0.000000
+2023-10-18 22:23:07,443 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:23:07,443 EPOCH 8 done: loss 0.1209 - lr: 0.000011
+2023-10-18 22:23:09,575 DEV : loss 0.19142475724220276 - f1-score (micro avg) 0.5434
+2023-10-18 22:23:09,590 saving best model
+2023-10-18 22:23:09,625 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:23:12,078 epoch 9 - iter 144/1445 - loss 0.10959720 - time (sec): 2.45 - samples/sec: 6974.25 - lr: 0.000011 - momentum: 0.000000
+2023-10-18 22:23:14,476 epoch 9 - iter 288/1445 - loss 0.10667037 - time (sec): 4.85 - samples/sec: 7252.19 - lr: 0.000010 - momentum: 0.000000
+2023-10-18 22:23:16,919 epoch 9 - iter 432/1445 - loss 0.11178967 - time (sec): 7.29 - samples/sec: 7120.08 - lr: 0.000009 - momentum: 0.000000
+2023-10-18 22:23:19,305 epoch 9 - iter 576/1445 - loss 0.11103154 - time (sec): 9.68 - samples/sec: 7277.83 - lr: 0.000009 - momentum: 0.000000
+2023-10-18 22:23:21,730 epoch 9 - iter 720/1445 - loss 0.11192122 - time (sec): 12.10 - samples/sec: 7281.87 - lr: 0.000008 - momentum: 0.000000
+2023-10-18 22:23:24,086 epoch 9 - iter 864/1445 - loss 0.11451662 - time (sec): 14.46 - samples/sec: 7238.30 - lr: 0.000008 - momentum: 0.000000
+2023-10-18 22:23:26,606 epoch 9 - iter 1008/1445 - loss 0.11503792 - time (sec): 16.98 - samples/sec: 7204.60 - lr: 0.000007 - momentum: 0.000000
+2023-10-18 22:23:29,063 epoch 9 - iter 1152/1445 - loss 0.11784699 - time (sec): 19.44 - samples/sec: 7246.04 - lr: 0.000007 - momentum: 0.000000
+2023-10-18 22:23:31,381 epoch 9 - iter 1296/1445 - loss 0.11854548 - time (sec): 21.75 - samples/sec: 7285.23 - lr: 0.000006 - momentum: 0.000000
+2023-10-18 22:23:33,674 epoch 9 - iter 1440/1445 - loss 0.11760527 - time (sec): 24.05 - samples/sec: 7305.87 - lr: 0.000006 - momentum: 0.000000
+2023-10-18 22:23:33,747 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:23:33,747 EPOCH 9 done: loss 0.1177 - lr: 0.000006
+2023-10-18 22:23:35,527 DEV : loss 0.19222453236579895 - f1-score (micro avg) 0.5287
+2023-10-18 22:23:35,541 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:23:37,961 epoch 10 - iter 144/1445 - loss 0.14194788 - time (sec): 2.42 - samples/sec: 7135.18 - lr: 0.000005 - momentum: 0.000000
+2023-10-18 22:23:40,379 epoch 10 - iter 288/1445 - loss 0.12535540 - time (sec): 4.84 - samples/sec: 7232.39 - lr: 0.000004 - momentum: 0.000000
+2023-10-18 22:23:42,883 epoch 10 - iter 432/1445 - loss 0.11875922 - time (sec): 7.34 - samples/sec: 7294.08 - lr: 0.000004 - momentum: 0.000000
+2023-10-18 22:23:45,300 epoch 10 - iter 576/1445 - loss 0.11839753 - time (sec): 9.76 - samples/sec: 7234.05 - lr: 0.000003 - momentum: 0.000000
+2023-10-18 22:23:47,747 epoch 10 - iter 720/1445 - loss 0.11701466 - time (sec): 12.20 - samples/sec: 7228.56 - lr: 0.000003 - momentum: 0.000000
+2023-10-18 22:23:50,178 epoch 10 - iter 864/1445 - loss 0.11883233 - time (sec): 14.64 - samples/sec: 7197.45 - lr: 0.000002 - momentum: 0.000000
+2023-10-18 22:23:52,601 epoch 10 - iter 1008/1445 - loss 0.11719932 - time (sec): 17.06 - samples/sec: 7194.57 - lr: 0.000002 - momentum: 0.000000
+2023-10-18 22:23:55,010 epoch 10 - iter 1152/1445 - loss 0.11486660 - time (sec): 19.47 - samples/sec: 7252.97 - lr: 0.000001 - momentum: 0.000000
+2023-10-18 22:23:57,481 epoch 10 - iter 1296/1445 - loss 0.11531310 - time (sec): 21.94 - samples/sec: 7272.64 - lr: 0.000001 - momentum: 0.000000
+2023-10-18 22:23:59,863 epoch 10 - iter 1440/1445 - loss 0.11535435 - time (sec): 24.32 - samples/sec: 7227.50 - lr: 0.000000 - momentum: 0.000000
+2023-10-18 22:23:59,939 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:23:59,939 EPOCH 10 done: loss 0.1154 - lr: 0.000000
+2023-10-18 22:24:01,714 DEV : loss 0.19508427381515503 - f1-score (micro avg) 0.5306
+2023-10-18 22:24:01,760 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:24:01,761 Loading model from best epoch ...
+2023-10-18 22:24:01,844 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
+2023-10-18 22:24:03,182
+Results:
+- F-score (micro) 0.5675
+- F-score (macro) 0.4004
+- Accuracy 0.4057
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.6191    0.6638    0.6407       458
+         PER     0.5795    0.4917    0.5320       482
+         ORG     1.0000    0.0145    0.0286        69
+
+   micro avg     0.6016    0.5372    0.5675      1009
+   macro avg     0.7329    0.3900    0.4004      1009
+weighted avg     0.6262    0.5372    0.5469      1009
+
+2023-10-18 22:24:03,182 ----------------------------------------------------------------------------------------------------
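The summary rows of the final report can be cross-checked against the per-class rows. The sketch below recomputes the macro and weighted averages from the printed per-class values, and the micro F1 from the printed micro precision and recall. Because it starts from figures already rounded to four decimals, the results should agree with the log only to within rounding. Variable names are hypothetical.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
per_class = {
    "LOC": (0.6191, 0.6638, 0.6407, 458),
    "PER": (0.5795, 0.4917, 0.5320, 482),
    "ORG": (1.0000, 0.0145, 0.0286, 69),
}

n_classes = len(per_class)
total_support = sum(row[3] for row in per_class.values())

# Macro average: every class counts equally, regardless of support.
macro_precision = sum(row[0] for row in per_class.values()) / n_classes
macro_recall = sum(row[1] for row in per_class.values()) / n_classes
macro_f1 = sum(row[2] for row in per_class.values()) / n_classes

# Weighted average: each class is weighted by its support.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro F1 is the harmonic mean of the pooled precision and recall.
micro_precision, micro_recall = 0.6016, 0.5372
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"macro  P={macro_precision:.4f} R={macro_recall:.4f} F1={macro_f1:.4f}")
print(f"weighted F1={weighted_f1:.4f}  micro F1={micro_f1:.4f}")
```

The gap between macro precision (0.7329) and macro F1 (0.4004) comes almost entirely from the ORG row: its precision of 1.0000 raises the precision average, while its recall of 0.0145 on 69 gold entities pulls its F1 down to 0.0286.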