Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- test.tsv +0 -0
- training.log +241 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e03242c86fdd5b176e24d55fecf4657355554e9f8c541ac4452e8182b8f5445e
size 443311111
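The three lines added for best-model.pt are a Git LFS pointer rather than the model weights themselves. As a minimal sketch (the helper name `parse_lfs_pointer` is hypothetical and not part of any library), the pointer can be read into a small dictionary to check the object hash and the size of the real file:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file (key/value lines) into a dict.

    Hypothetical helper for illustration; it only handles the three
    keys that appear in the pointer shown in this commit.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>", separated by one space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:e03242c86fdd5b176e24d55fecf4657355554e9f8c541ac4452e8182b8f5445e
size 443311111
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size_bytes"])
```

The `size` field is a byte count, so the checkpoint behind this pointer is roughly 443 MB.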
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
1      08:03:13   0.0000         0.3561      0.1541    0.7038         0.4442      0.5446  0.3752
2      08:04:31   0.0000         0.1148      0.1283    0.8294         0.6126      0.7047  0.5511
3      08:05:48   0.0000         0.0738      0.1015    0.7992         0.7810      0.7900  0.6708
4      08:07:05   0.0000         0.0513      0.1252    0.8070         0.7386      0.7713  0.6418
5      08:08:22   0.0000         0.0372      0.1704    0.8198         0.7097      0.7608  0.6297
6      08:09:38   0.0000         0.0282      0.1647    0.8377         0.7304      0.7804  0.6504
7      08:10:54   0.0000         0.0198      0.1839    0.8465         0.7521      0.7965  0.6747
8      08:12:12   0.0000         0.0133      0.1841    0.8541         0.7500      0.7987  0.6753
9      08:13:29   0.0000         0.0095      0.1866    0.8352         0.7748      0.8039  0.6843
10     08:14:47   0.0000         0.0063      0.2062    0.8458         0.7593      0.8002  0.6806
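The loss.tsv rows show dev loss reaching its minimum early while dev F1 keeps improving in later epochs. A minimal sketch for finding both epochs follows; it assumes the rows are pasted as whitespace-separated text (the real file is tab-separated), and `load_rows` is a hypothetical helper name:

```python
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 08:03:13 0.0000 0.3561 0.1541 0.7038 0.4442 0.5446 0.3752
2 08:04:31 0.0000 0.1148 0.1283 0.8294 0.6126 0.7047 0.5511
3 08:05:48 0.0000 0.0738 0.1015 0.7992 0.7810 0.7900 0.6708
4 08:07:05 0.0000 0.0513 0.1252 0.8070 0.7386 0.7713 0.6418
5 08:08:22 0.0000 0.0372 0.1704 0.8198 0.7097 0.7608 0.6297
6 08:09:38 0.0000 0.0282 0.1647 0.8377 0.7304 0.7804 0.6504
7 08:10:54 0.0000 0.0198 0.1839 0.8465 0.7521 0.7965 0.6747
8 08:12:12 0.0000 0.0133 0.1841 0.8541 0.7500 0.7987 0.6753
9 08:13:29 0.0000 0.0095 0.1866 0.8352 0.7748 0.8039 0.6843
10 08:14:47 0.0000 0.0063 0.2062 0.8458 0.7593 0.8002 0.6806
"""


def load_rows(text: str) -> list[dict]:
    """Split the header and data lines on whitespace and zip them into dicts."""
    header, *lines = [line.split() for line in text.strip().splitlines()]
    return [dict(zip(header, line)) for line in lines]


rows = load_rows(LOSS_TSV)

# Epoch with the highest dev micro F1 (the metric used for model selection).
best_f1 = max(rows, key=lambda r: float(r["DEV_F1"]))
# Epoch with the lowest dev loss, for comparison.
best_loss = min(rows, key=lambda r: float(r["DEV_LOSS"]))

print("best DEV_F1:  epoch", best_f1["EPOCH"], best_f1["DEV_F1"])
print("best DEV_LOSS: epoch", best_loss["EPOCH"], best_loss["DEV_LOSS"])
```

The two criteria disagree here: dev loss is lowest at epoch 3, while dev F1 peaks at epoch 9, which is the last epoch for which training.log reports "saving best model".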
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,241 @@
2023-10-14 08:01:57,233 ----------------------------------------------------------------------------------------------------
2023-10-14 08:01:57,234 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 08:01:57,234 ----------------------------------------------------------------------------------------------------
2023-10-14 08:01:57,235 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 08:01:57,235 ----------------------------------------------------------------------------------------------------
2023-10-14 08:01:57,235 Train: 5777 sentences
2023-10-14 08:01:57,235 (train_with_dev=False, train_with_test=False)
2023-10-14 08:01:57,235 ----------------------------------------------------------------------------------------------------
2023-10-14 08:01:57,235 Training Params:
2023-10-14 08:01:57,235 - learning_rate: "3e-05"
2023-10-14 08:01:57,235 - mini_batch_size: "4"
2023-10-14 08:01:57,235 - max_epochs: "10"
2023-10-14 08:01:57,235 - shuffle: "True"
2023-10-14 08:01:57,235 ----------------------------------------------------------------------------------------------------
2023-10-14 08:01:57,235 Plugins:
2023-10-14 08:01:57,235 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 08:01:57,235 ----------------------------------------------------------------------------------------------------
2023-10-14 08:01:57,235 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 08:01:57,235 - metric: "('micro avg', 'f1-score')"
2023-10-14 08:01:57,235 ----------------------------------------------------------------------------------------------------
2023-10-14 08:01:57,235 Computation:
2023-10-14 08:01:57,235 - compute on device: cuda:0
2023-10-14 08:01:57,235 - embedding storage: none
2023-10-14 08:01:57,235 ----------------------------------------------------------------------------------------------------
2023-10-14 08:01:57,235 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-14 08:01:57,235 ----------------------------------------------------------------------------------------------------
2023-10-14 08:01:57,235 ----------------------------------------------------------------------------------------------------
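The training parameters (learning_rate 3e-05, max_epochs 10) and the LinearScheduler plugin with warmup_fraction 0.1 suggest a linear warmup over the first 10% of steps followed by a linear decay to zero. The sketch below is an assumption about the schedule's shape, not Flair's actual implementation; the function name `linear_schedule_lr` is hypothetical, and the 1445 steps per epoch are taken from the "iter .../1445" counters in the log.

```python
def linear_schedule_lr(step: int, total_steps: int, peak_lr: float,
                       warmup_fraction: float) -> float:
    """Learning rate after `step` optimizer steps under linear warmup + linear decay.

    Assumed shape only: ramp from 0 to peak_lr during warmup, then
    decay linearly so that the rate reaches 0 at total_steps.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


STEPS_PER_EPOCH = 1445          # from "iter 144/1445" in the log
TOTAL_STEPS = STEPS_PER_EPOCH * 10
PEAK_LR = 3e-5
WARMUP_FRACTION = 0.1


def lr_at(epoch: int, iteration: int) -> str:
    """Format the rate the way the log does (six decimal places)."""
    step = (epoch - 1) * STEPS_PER_EPOCH + iteration
    return f"{linear_schedule_lr(step, TOTAL_STEPS, PEAK_LR, WARMUP_FRACTION):.6f}"


for epoch, iteration in [(1, 144), (1, 1440), (2, 1440), (10, 1440)]:
    print(f"epoch {epoch} iter {iteration}: lr {lr_at(epoch, iteration)}")
```

Under these assumptions the warmup covers exactly the first epoch (1445 of 14450 steps), which is consistent with the logged rate rising to 0.000030 by the end of epoch 1 and falling toward 0.000000 by the end of epoch 10.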
2023-10-14 08:02:05,605 epoch 1 - iter 144/1445 - loss 1.98365673 - time (sec): 8.37 - samples/sec: 2024.95 - lr: 0.000003 - momentum: 0.000000
2023-10-14 08:02:12,644 epoch 1 - iter 288/1445 - loss 1.12353678 - time (sec): 15.41 - samples/sec: 2193.16 - lr: 0.000006 - momentum: 0.000000
2023-10-14 08:02:19,943 epoch 1 - iter 432/1445 - loss 0.80002002 - time (sec): 22.71 - samples/sec: 2300.53 - lr: 0.000009 - momentum: 0.000000
2023-10-14 08:02:27,153 epoch 1 - iter 576/1445 - loss 0.64820380 - time (sec): 29.92 - samples/sec: 2334.66 - lr: 0.000012 - momentum: 0.000000
2023-10-14 08:02:34,153 epoch 1 - iter 720/1445 - loss 0.55704517 - time (sec): 36.92 - samples/sec: 2349.72 - lr: 0.000015 - momentum: 0.000000
2023-10-14 08:02:41,087 epoch 1 - iter 864/1445 - loss 0.50191366 - time (sec): 43.85 - samples/sec: 2331.26 - lr: 0.000018 - momentum: 0.000000
2023-10-14 08:02:48,174 epoch 1 - iter 1008/1445 - loss 0.45051786 - time (sec): 50.94 - samples/sec: 2371.87 - lr: 0.000021 - momentum: 0.000000
2023-10-14 08:02:55,702 epoch 1 - iter 1152/1445 - loss 0.41254886 - time (sec): 58.47 - samples/sec: 2374.94 - lr: 0.000024 - momentum: 0.000000
2023-10-14 08:03:02,970 epoch 1 - iter 1296/1445 - loss 0.38359668 - time (sec): 65.73 - samples/sec: 2386.92 - lr: 0.000027 - momentum: 0.000000
2023-10-14 08:03:10,330 epoch 1 - iter 1440/1445 - loss 0.35707304 - time (sec): 73.09 - samples/sec: 2401.86 - lr: 0.000030 - momentum: 0.000000
2023-10-14 08:03:10,587 ----------------------------------------------------------------------------------------------------
2023-10-14 08:03:10,587 EPOCH 1 done: loss 0.3561 - lr: 0.000030
2023-10-14 08:03:13,488 DEV : loss 0.15405000746250153 - f1-score (micro avg) 0.5446
2023-10-14 08:03:13,502 saving best model
2023-10-14 08:03:13,875 ----------------------------------------------------------------------------------------------------
2023-10-14 08:03:21,212 epoch 2 - iter 144/1445 - loss 0.14057591 - time (sec): 7.34 - samples/sec: 2367.01 - lr: 0.000030 - momentum: 0.000000
2023-10-14 08:03:28,665 epoch 2 - iter 288/1445 - loss 0.13227084 - time (sec): 14.79 - samples/sec: 2328.11 - lr: 0.000029 - momentum: 0.000000
2023-10-14 08:03:36,066 epoch 2 - iter 432/1445 - loss 0.13180492 - time (sec): 22.19 - samples/sec: 2347.30 - lr: 0.000029 - momentum: 0.000000
2023-10-14 08:03:43,233 epoch 2 - iter 576/1445 - loss 0.12555163 - time (sec): 29.36 - samples/sec: 2360.57 - lr: 0.000029 - momentum: 0.000000
2023-10-14 08:03:50,688 epoch 2 - iter 720/1445 - loss 0.12318631 - time (sec): 36.81 - samples/sec: 2381.45 - lr: 0.000028 - momentum: 0.000000
2023-10-14 08:03:57,836 epoch 2 - iter 864/1445 - loss 0.12027431 - time (sec): 43.96 - samples/sec: 2382.69 - lr: 0.000028 - momentum: 0.000000
2023-10-14 08:04:05,205 epoch 2 - iter 1008/1445 - loss 0.12246549 - time (sec): 51.33 - samples/sec: 2391.19 - lr: 0.000028 - momentum: 0.000000
2023-10-14 08:04:12,194 epoch 2 - iter 1152/1445 - loss 0.11929573 - time (sec): 58.32 - samples/sec: 2377.13 - lr: 0.000027 - momentum: 0.000000
2023-10-14 08:04:19,819 epoch 2 - iter 1296/1445 - loss 0.11665455 - time (sec): 65.94 - samples/sec: 2393.24 - lr: 0.000027 - momentum: 0.000000
2023-10-14 08:04:27,010 epoch 2 - iter 1440/1445 - loss 0.11480563 - time (sec): 73.13 - samples/sec: 2401.85 - lr: 0.000027 - momentum: 0.000000
2023-10-14 08:04:27,254 ----------------------------------------------------------------------------------------------------
2023-10-14 08:04:27,254 EPOCH 2 done: loss 0.1148 - lr: 0.000027
2023-10-14 08:04:31,083 DEV : loss 0.1282866895198822 - f1-score (micro avg) 0.7047
2023-10-14 08:04:31,097 saving best model
2023-10-14 08:04:31,626 ----------------------------------------------------------------------------------------------------
2023-10-14 08:04:38,949 epoch 3 - iter 144/1445 - loss 0.08251436 - time (sec): 7.32 - samples/sec: 2420.04 - lr: 0.000026 - momentum: 0.000000
2023-10-14 08:04:46,376 epoch 3 - iter 288/1445 - loss 0.07784727 - time (sec): 14.75 - samples/sec: 2411.42 - lr: 0.000026 - momentum: 0.000000
2023-10-14 08:04:53,664 epoch 3 - iter 432/1445 - loss 0.07783536 - time (sec): 22.04 - samples/sec: 2414.24 - lr: 0.000026 - momentum: 0.000000
2023-10-14 08:05:00,965 epoch 3 - iter 576/1445 - loss 0.07514099 - time (sec): 29.34 - samples/sec: 2414.20 - lr: 0.000025 - momentum: 0.000000
2023-10-14 08:05:08,267 epoch 3 - iter 720/1445 - loss 0.07581881 - time (sec): 36.64 - samples/sec: 2416.46 - lr: 0.000025 - momentum: 0.000000
2023-10-14 08:05:15,292 epoch 3 - iter 864/1445 - loss 0.07472479 - time (sec): 43.66 - samples/sec: 2430.01 - lr: 0.000025 - momentum: 0.000000
2023-10-14 08:05:22,640 epoch 3 - iter 1008/1445 - loss 0.07638855 - time (sec): 51.01 - samples/sec: 2411.58 - lr: 0.000024 - momentum: 0.000000
2023-10-14 08:05:29,996 epoch 3 - iter 1152/1445 - loss 0.07480564 - time (sec): 58.37 - samples/sec: 2419.81 - lr: 0.000024 - momentum: 0.000000
2023-10-14 08:05:37,136 epoch 3 - iter 1296/1445 - loss 0.07451773 - time (sec): 65.51 - samples/sec: 2425.62 - lr: 0.000024 - momentum: 0.000000
2023-10-14 08:05:44,547 epoch 3 - iter 1440/1445 - loss 0.07377630 - time (sec): 72.92 - samples/sec: 2410.74 - lr: 0.000023 - momentum: 0.000000
2023-10-14 08:05:44,765 ----------------------------------------------------------------------------------------------------
2023-10-14 08:05:44,765 EPOCH 3 done: loss 0.0738 - lr: 0.000023
2023-10-14 08:05:48,189 DEV : loss 0.10148890316486359 - f1-score (micro avg) 0.79
2023-10-14 08:05:48,203 saving best model
2023-10-14 08:05:48,718 ----------------------------------------------------------------------------------------------------
2023-10-14 08:05:56,573 epoch 4 - iter 144/1445 - loss 0.04469757 - time (sec): 7.85 - samples/sec: 2296.48 - lr: 0.000023 - momentum: 0.000000
2023-10-14 08:06:03,927 epoch 4 - iter 288/1445 - loss 0.04147054 - time (sec): 15.21 - samples/sec: 2407.13 - lr: 0.000023 - momentum: 0.000000
2023-10-14 08:06:11,043 epoch 4 - iter 432/1445 - loss 0.04588436 - time (sec): 22.32 - samples/sec: 2382.90 - lr: 0.000022 - momentum: 0.000000
2023-10-14 08:06:18,317 epoch 4 - iter 576/1445 - loss 0.04687950 - time (sec): 29.60 - samples/sec: 2408.53 - lr: 0.000022 - momentum: 0.000000
2023-10-14 08:06:25,425 epoch 4 - iter 720/1445 - loss 0.05047665 - time (sec): 36.71 - samples/sec: 2414.44 - lr: 0.000022 - momentum: 0.000000
2023-10-14 08:06:32,454 epoch 4 - iter 864/1445 - loss 0.05060324 - time (sec): 43.73 - samples/sec: 2406.39 - lr: 0.000021 - momentum: 0.000000
2023-10-14 08:06:39,525 epoch 4 - iter 1008/1445 - loss 0.04884015 - time (sec): 50.81 - samples/sec: 2400.36 - lr: 0.000021 - momentum: 0.000000
2023-10-14 08:06:46,972 epoch 4 - iter 1152/1445 - loss 0.04928212 - time (sec): 58.25 - samples/sec: 2411.64 - lr: 0.000021 - momentum: 0.000000
2023-10-14 08:06:54,267 epoch 4 - iter 1296/1445 - loss 0.05041401 - time (sec): 65.55 - samples/sec: 2399.42 - lr: 0.000020 - momentum: 0.000000
2023-10-14 08:07:01,642 epoch 4 - iter 1440/1445 - loss 0.05090107 - time (sec): 72.92 - samples/sec: 2409.83 - lr: 0.000020 - momentum: 0.000000
2023-10-14 08:07:01,879 ----------------------------------------------------------------------------------------------------
2023-10-14 08:07:01,879 EPOCH 4 done: loss 0.0513 - lr: 0.000020
2023-10-14 08:07:05,346 DEV : loss 0.12521180510520935 - f1-score (micro avg) 0.7713
2023-10-14 08:07:05,361 ----------------------------------------------------------------------------------------------------
2023-10-14 08:07:12,830 epoch 5 - iter 144/1445 - loss 0.03689043 - time (sec): 7.47 - samples/sec: 2376.83 - lr: 0.000020 - momentum: 0.000000
2023-10-14 08:07:20,377 epoch 5 - iter 288/1445 - loss 0.04141324 - time (sec): 15.02 - samples/sec: 2371.46 - lr: 0.000019 - momentum: 0.000000
2023-10-14 08:07:27,312 epoch 5 - iter 432/1445 - loss 0.03820414 - time (sec): 21.95 - samples/sec: 2390.95 - lr: 0.000019 - momentum: 0.000000
2023-10-14 08:07:34,526 epoch 5 - iter 576/1445 - loss 0.03805636 - time (sec): 29.16 - samples/sec: 2397.87 - lr: 0.000019 - momentum: 0.000000
2023-10-14 08:07:41,652 epoch 5 - iter 720/1445 - loss 0.03728520 - time (sec): 36.29 - samples/sec: 2415.63 - lr: 0.000018 - momentum: 0.000000
2023-10-14 08:07:49,124 epoch 5 - iter 864/1445 - loss 0.03609096 - time (sec): 43.76 - samples/sec: 2419.45 - lr: 0.000018 - momentum: 0.000000
2023-10-14 08:07:56,226 epoch 5 - iter 1008/1445 - loss 0.03551319 - time (sec): 50.86 - samples/sec: 2412.87 - lr: 0.000018 - momentum: 0.000000
2023-10-14 08:08:03,611 epoch 5 - iter 1152/1445 - loss 0.03524412 - time (sec): 58.25 - samples/sec: 2413.61 - lr: 0.000017 - momentum: 0.000000
2023-10-14 08:08:11,110 epoch 5 - iter 1296/1445 - loss 0.03695923 - time (sec): 65.75 - samples/sec: 2410.80 - lr: 0.000017 - momentum: 0.000000
2023-10-14 08:08:18,082 epoch 5 - iter 1440/1445 - loss 0.03728152 - time (sec): 72.72 - samples/sec: 2413.40 - lr: 0.000017 - momentum: 0.000000
2023-10-14 08:08:18,376 ----------------------------------------------------------------------------------------------------
2023-10-14 08:08:18,376 EPOCH 5 done: loss 0.0372 - lr: 0.000017
2023-10-14 08:08:22,161 DEV : loss 0.17038105428218842 - f1-score (micro avg) 0.7608
2023-10-14 08:08:22,176 ----------------------------------------------------------------------------------------------------
2023-10-14 08:08:29,368 epoch 6 - iter 144/1445 - loss 0.02545181 - time (sec): 7.19 - samples/sec: 2500.25 - lr: 0.000016 - momentum: 0.000000
2023-10-14 08:08:36,614 epoch 6 - iter 288/1445 - loss 0.02562755 - time (sec): 14.44 - samples/sec: 2504.38 - lr: 0.000016 - momentum: 0.000000
2023-10-14 08:08:43,778 epoch 6 - iter 432/1445 - loss 0.02526942 - time (sec): 21.60 - samples/sec: 2499.22 - lr: 0.000016 - momentum: 0.000000
2023-10-14 08:08:51,447 epoch 6 - iter 576/1445 - loss 0.02775801 - time (sec): 29.27 - samples/sec: 2455.98 - lr: 0.000015 - momentum: 0.000000
2023-10-14 08:08:58,506 epoch 6 - iter 720/1445 - loss 0.02706181 - time (sec): 36.33 - samples/sec: 2439.93 - lr: 0.000015 - momentum: 0.000000
2023-10-14 08:09:05,788 epoch 6 - iter 864/1445 - loss 0.02599246 - time (sec): 43.61 - samples/sec: 2421.51 - lr: 0.000015 - momentum: 0.000000
2023-10-14 08:09:13,141 epoch 6 - iter 1008/1445 - loss 0.02794849 - time (sec): 50.96 - samples/sec: 2420.40 - lr: 0.000014 - momentum: 0.000000
2023-10-14 08:09:20,518 epoch 6 - iter 1152/1445 - loss 0.02869252 - time (sec): 58.34 - samples/sec: 2428.10 - lr: 0.000014 - momentum: 0.000000
2023-10-14 08:09:27,666 epoch 6 - iter 1296/1445 - loss 0.02856538 - time (sec): 65.49 - samples/sec: 2417.95 - lr: 0.000014 - momentum: 0.000000
2023-10-14 08:09:34,870 epoch 6 - iter 1440/1445 - loss 0.02820532 - time (sec): 72.69 - samples/sec: 2416.69 - lr: 0.000013 - momentum: 0.000000
2023-10-14 08:09:35,113 ----------------------------------------------------------------------------------------------------
2023-10-14 08:09:35,113 EPOCH 6 done: loss 0.0282 - lr: 0.000013
2023-10-14 08:09:38,563 DEV : loss 0.1647178679704666 - f1-score (micro avg) 0.7804
2023-10-14 08:09:38,579 ----------------------------------------------------------------------------------------------------
2023-10-14 08:09:45,895 epoch 7 - iter 144/1445 - loss 0.01300211 - time (sec): 7.31 - samples/sec: 2379.65 - lr: 0.000013 - momentum: 0.000000
2023-10-14 08:09:53,075 epoch 7 - iter 288/1445 - loss 0.01696728 - time (sec): 14.50 - samples/sec: 2354.75 - lr: 0.000013 - momentum: 0.000000
2023-10-14 08:10:00,712 epoch 7 - iter 432/1445 - loss 0.01849889 - time (sec): 22.13 - samples/sec: 2370.41 - lr: 0.000012 - momentum: 0.000000
2023-10-14 08:10:07,982 epoch 7 - iter 576/1445 - loss 0.01797361 - time (sec): 29.40 - samples/sec: 2398.07 - lr: 0.000012 - momentum: 0.000000
2023-10-14 08:10:15,199 epoch 7 - iter 720/1445 - loss 0.01798847 - time (sec): 36.62 - samples/sec: 2403.56 - lr: 0.000012 - momentum: 0.000000
2023-10-14 08:10:22,580 epoch 7 - iter 864/1445 - loss 0.01952435 - time (sec): 44.00 - samples/sec: 2420.08 - lr: 0.000011 - momentum: 0.000000
2023-10-14 08:10:29,751 epoch 7 - iter 1008/1445 - loss 0.01984725 - time (sec): 51.17 - samples/sec: 2407.97 - lr: 0.000011 - momentum: 0.000000
2023-10-14 08:10:37,085 epoch 7 - iter 1152/1445 - loss 0.01930156 - time (sec): 58.51 - samples/sec: 2417.50 - lr: 0.000011 - momentum: 0.000000
2023-10-14 08:10:44,067 epoch 7 - iter 1296/1445 - loss 0.01973561 - time (sec): 65.49 - samples/sec: 2413.04 - lr: 0.000010 - momentum: 0.000000
2023-10-14 08:10:51,271 epoch 7 - iter 1440/1445 - loss 0.01980517 - time (sec): 72.69 - samples/sec: 2417.01 - lr: 0.000010 - momentum: 0.000000
2023-10-14 08:10:51,497 ----------------------------------------------------------------------------------------------------
2023-10-14 08:10:51,497 EPOCH 7 done: loss 0.0198 - lr: 0.000010
2023-10-14 08:10:54,970 DEV : loss 0.1839003711938858 - f1-score (micro avg) 0.7965
2023-10-14 08:10:54,986 saving best model
2023-10-14 08:10:55,568 ----------------------------------------------------------------------------------------------------
2023-10-14 08:11:02,709 epoch 8 - iter 144/1445 - loss 0.01405551 - time (sec): 7.14 - samples/sec: 2442.76 - lr: 0.000010 - momentum: 0.000000
2023-10-14 08:11:10,323 epoch 8 - iter 288/1445 - loss 0.01175523 - time (sec): 14.75 - samples/sec: 2391.64 - lr: 0.000009 - momentum: 0.000000
2023-10-14 08:11:17,600 epoch 8 - iter 432/1445 - loss 0.01129704 - time (sec): 22.03 - samples/sec: 2402.18 - lr: 0.000009 - momentum: 0.000000
2023-10-14 08:11:25,174 epoch 8 - iter 576/1445 - loss 0.01292683 - time (sec): 29.60 - samples/sec: 2370.48 - lr: 0.000009 - momentum: 0.000000
2023-10-14 08:11:32,582 epoch 8 - iter 720/1445 - loss 0.01230325 - time (sec): 37.01 - samples/sec: 2406.86 - lr: 0.000008 - momentum: 0.000000
2023-10-14 08:11:39,737 epoch 8 - iter 864/1445 - loss 0.01183301 - time (sec): 44.17 - samples/sec: 2397.89 - lr: 0.000008 - momentum: 0.000000
2023-10-14 08:11:46,776 epoch 8 - iter 1008/1445 - loss 0.01208905 - time (sec): 51.21 - samples/sec: 2412.73 - lr: 0.000008 - momentum: 0.000000
2023-10-14 08:11:53,875 epoch 8 - iter 1152/1445 - loss 0.01318623 - time (sec): 58.31 - samples/sec: 2403.05 - lr: 0.000007 - momentum: 0.000000
2023-10-14 08:12:01,512 epoch 8 - iter 1296/1445 - loss 0.01358646 - time (sec): 65.94 - samples/sec: 2402.71 - lr: 0.000007 - momentum: 0.000000
2023-10-14 08:12:08,839 epoch 8 - iter 1440/1445 - loss 0.01334733 - time (sec): 73.27 - samples/sec: 2400.15 - lr: 0.000007 - momentum: 0.000000
2023-10-14 08:12:09,064 ----------------------------------------------------------------------------------------------------
2023-10-14 08:12:09,064 EPOCH 8 done: loss 0.0133 - lr: 0.000007
2023-10-14 08:12:12,861 DEV : loss 0.18413744866847992 - f1-score (micro avg) 0.7987
2023-10-14 08:12:12,876 saving best model
2023-10-14 08:12:13,455 ----------------------------------------------------------------------------------------------------
2023-10-14 08:12:20,659 epoch 9 - iter 144/1445 - loss 0.01146554 - time (sec): 7.20 - samples/sec: 2530.75 - lr: 0.000006 - momentum: 0.000000
2023-10-14 08:12:28,224 epoch 9 - iter 288/1445 - loss 0.01300968 - time (sec): 14.77 - samples/sec: 2489.92 - lr: 0.000006 - momentum: 0.000000
2023-10-14 08:12:35,547 epoch 9 - iter 432/1445 - loss 0.01004910 - time (sec): 22.09 - samples/sec: 2531.98 - lr: 0.000006 - momentum: 0.000000
2023-10-14 08:12:42,577 epoch 9 - iter 576/1445 - loss 0.00980441 - time (sec): 29.12 - samples/sec: 2477.06 - lr: 0.000005 - momentum: 0.000000
2023-10-14 08:12:50,040 epoch 9 - iter 720/1445 - loss 0.00958906 - time (sec): 36.58 - samples/sec: 2482.19 - lr: 0.000005 - momentum: 0.000000
2023-10-14 08:12:56,931 epoch 9 - iter 864/1445 - loss 0.00931059 - time (sec): 43.47 - samples/sec: 2463.93 - lr: 0.000005 - momentum: 0.000000
2023-10-14 08:13:04,320 epoch 9 - iter 1008/1445 - loss 0.00931590 - time (sec): 50.86 - samples/sec: 2440.57 - lr: 0.000004 - momentum: 0.000000
2023-10-14 08:13:11,276 epoch 9 - iter 1152/1445 - loss 0.00917441 - time (sec): 57.82 - samples/sec: 2423.84 - lr: 0.000004 - momentum: 0.000000
2023-10-14 08:13:18,525 epoch 9 - iter 1296/1445 - loss 0.00925497 - time (sec): 65.07 - samples/sec: 2420.64 - lr: 0.000004 - momentum: 0.000000
2023-10-14 08:13:25,900 epoch 9 - iter 1440/1445 - loss 0.00952078 - time (sec): 72.44 - samples/sec: 2424.96 - lr: 0.000003 - momentum: 0.000000
2023-10-14 08:13:26,131 ----------------------------------------------------------------------------------------------------
2023-10-14 08:13:26,131 EPOCH 9 done: loss 0.0095 - lr: 0.000003
2023-10-14 08:13:29,573 DEV : loss 0.186563640832901 - f1-score (micro avg) 0.8039
2023-10-14 08:13:29,588 saving best model
2023-10-14 08:13:30,149 ----------------------------------------------------------------------------------------------------
2023-10-14 08:13:37,534 epoch 10 - iter 144/1445 - loss 0.00663741 - time (sec): 7.38 - samples/sec: 2488.81 - lr: 0.000003 - momentum: 0.000000
2023-10-14 08:13:45,111 epoch 10 - iter 288/1445 - loss 0.00559908 - time (sec): 14.96 - samples/sec: 2318.32 - lr: 0.000003 - momentum: 0.000000
2023-10-14 08:13:52,608 epoch 10 - iter 432/1445 - loss 0.00730824 - time (sec): 22.46 - samples/sec: 2344.28 - lr: 0.000002 - momentum: 0.000000
2023-10-14 08:14:00,148 epoch 10 - iter 576/1445 - loss 0.00695631 - time (sec): 30.00 - samples/sec: 2363.90 - lr: 0.000002 - momentum: 0.000000
2023-10-14 08:14:07,251 epoch 10 - iter 720/1445 - loss 0.00719157 - time (sec): 37.10 - samples/sec: 2367.92 - lr: 0.000002 - momentum: 0.000000
2023-10-14 08:14:14,804 epoch 10 - iter 864/1445 - loss 0.00696365 - time (sec): 44.65 - samples/sec: 2398.27 - lr: 0.000001 - momentum: 0.000000
2023-10-14 08:14:21,883 epoch 10 - iter 1008/1445 - loss 0.00688362 - time (sec): 51.73 - samples/sec: 2399.15 - lr: 0.000001 - momentum: 0.000000
2023-10-14 08:14:29,091 epoch 10 - iter 1152/1445 - loss 0.00665468 - time (sec): 58.94 - samples/sec: 2394.67 - lr: 0.000001 - momentum: 0.000000
2023-10-14 08:14:36,092 epoch 10 - iter 1296/1445 - loss 0.00650502 - time (sec): 65.94 - samples/sec: 2393.15 - lr: 0.000000 - momentum: 0.000000
2023-10-14 08:14:43,476 epoch 10 - iter 1440/1445 - loss 0.00627816 - time (sec): 73.33 - samples/sec: 2398.42 - lr: 0.000000 - momentum: 0.000000
2023-10-14 08:14:43,689 ----------------------------------------------------------------------------------------------------
2023-10-14 08:14:43,689 EPOCH 10 done: loss 0.0063 - lr: 0.000000
2023-10-14 08:14:47,154 DEV : loss 0.20616155862808228 - f1-score (micro avg) 0.8002
2023-10-14 08:14:47,596 ----------------------------------------------------------------------------------------------------
2023-10-14 08:14:47,597 Loading model from best epoch ...
2023-10-14 08:14:49,401 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 08:14:52,528
Results:
- F-score (micro) 0.8004
- F-score (macro) 0.6926
- Accuracy 0.6828

By class:
              precision    recall  f1-score   support

         PER     0.8282    0.8299    0.8290       482
         LOC     0.8786    0.7904    0.8322       458
         ORG     0.4000    0.4348    0.4167        69

   micro avg     0.8165    0.7849    0.8004      1009
   macro avg     0.7023    0.6850    0.6926      1009
weighted avg     0.8218    0.7849    0.8023      1009

2023-10-14 08:14:52,528 ----------------------------------------------------------------------------------------------------
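The averaged rows of the final report can be cross-checked against the per-class rows. The sketch below uses only the numbers printed in the log; the aggregation formulas are the standard definitions of macro, weighted, and micro averages and are assumed (not confirmed by the log) to match what the evaluation code computes.

```python
# Per-class test results copied from the report: (precision, recall, f1, support).
per_class = {
    "PER": (0.8282, 0.8299, 0.8290, 482),
    "LOC": (0.8786, 0.7904, 0.8322, 458),
    "ORG": (0.4000, 0.4348, 0.4167, 69),
}

total_support = sum(support for *_, support in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by the number of gold entities.
weighted_f1 = sum(f1 * support for _, _, f1, support in per_class.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall rows.
micro_p, micro_r = 0.8165, 0.7849
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"support  {total_support}")
print(f"macro    {macro_f1:.4f}")
print(f"weighted {weighted_f1:.4f}")
print(f"micro    {micro_f1:.4f}")
```

The gap between micro F1 (0.8004) and macro F1 (0.6926) comes from the ORG class, which has only 69 of the 1009 test entities and an F1 of 0.4167; it pulls the unweighted mean down while barely affecting the micro average.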