Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- test.tsv +0 -0
- training.log +243 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1c5ffe75333cc4a60c5cc9da29d6149c8fd948969dc1576258b1c3f1e4606f39
+size 443311111
dev.tsv
ADDED
The diff for this file is too large to render. See raw diff.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+1 09:55:49 0.0000 0.2833 0.1179 0.8053 0.6963 0.7468 0.6150
+2 09:57:07 0.0000 0.1077 0.1063 0.7057 0.7779 0.7400 0.6063
+3 09:58:24 0.0000 0.0739 0.1096 0.8357 0.7200 0.7736 0.6484
+4 09:59:43 0.0000 0.0579 0.1358 0.7944 0.7624 0.7781 0.6485
+5 10:01:02 0.0000 0.0432 0.1354 0.8430 0.7655 0.8024 0.6848
+6 10:02:20 0.0000 0.0337 0.1495 0.8247 0.7872 0.8055 0.6902
+7 10:03:38 0.0000 0.0218 0.1777 0.8522 0.7862 0.8178 0.7033
+8 10:04:56 0.0000 0.0154 0.1747 0.8709 0.7738 0.8195 0.7073
+9 10:06:13 0.0000 0.0098 0.1775 0.8786 0.7624 0.8164 0.7002
+10 10:07:31 0.0000 0.0068 0.1839 0.8766 0.7779 0.8243 0.7158
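The "saving best model" lines in training.log show that epochs are selected by dev micro F1. A minimal sketch of recovering that choice from the loss.tsv rows above (data rows copied verbatim; column order as in the header):

```python
# Columns: EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS
#          DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
rows = """\
1 09:55:49 0.0000 0.2833 0.1179 0.8053 0.6963 0.7468 0.6150
2 09:57:07 0.0000 0.1077 0.1063 0.7057 0.7779 0.7400 0.6063
3 09:58:24 0.0000 0.0739 0.1096 0.8357 0.7200 0.7736 0.6484
4 09:59:43 0.0000 0.0579 0.1358 0.7944 0.7624 0.7781 0.6485
5 10:01:02 0.0000 0.0432 0.1354 0.8430 0.7655 0.8024 0.6848
6 10:02:20 0.0000 0.0337 0.1495 0.8247 0.7872 0.8055 0.6902
7 10:03:38 0.0000 0.0218 0.1777 0.8522 0.7862 0.8178 0.7033
8 10:04:56 0.0000 0.0154 0.1747 0.8709 0.7738 0.8195 0.7073
9 10:06:13 0.0000 0.0098 0.1775 0.8786 0.7624 0.8164 0.7002
10 10:07:31 0.0000 0.0068 0.1839 0.8766 0.7779 0.8243 0.7158
""".splitlines()

# DEV_F1 is the 8th column (index 7); pick the epoch that maximizes it.
best = max(rows, key=lambda r: float(r.split()[7]))
best_epoch, best_dev_f1 = best.split()[0], float(best.split()[7])

print(best_epoch, best_dev_f1)  # epoch 10, DEV_F1 0.8243
```

Epoch 10 is indeed the last epoch at which the log records a new best model.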
test.tsv
ADDED
The diff for this file is too large to render. See raw diff.
training.log
ADDED
@@ -0,0 +1,243 @@
+2023-10-14 09:54:33,483 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:54:33,484 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=768, out_features=768, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-14 09:54:33,484 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:54:33,484 MultiCorpus: 5777 train + 722 dev + 723 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
+2023-10-14 09:54:33,484 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:54:33,484 Train: 5777 sentences
+2023-10-14 09:54:33,484 (train_with_dev=False, train_with_test=False)
+2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:54:33,485 Training Params:
+2023-10-14 09:54:33,485 - learning_rate: "5e-05"
+2023-10-14 09:54:33,485 - mini_batch_size: "4"
+2023-10-14 09:54:33,485 - max_epochs: "10"
+2023-10-14 09:54:33,485 - shuffle: "True"
+2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:54:33,485 Plugins:
+2023-10-14 09:54:33,485 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:54:33,485 Final evaluation on model from best epoch (best-model.pt)
+2023-10-14 09:54:33,485 - metric: "('micro avg', 'f1-score')"
+2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:54:33,485 Computation:
+2023-10-14 09:54:33,485 - compute on device: cuda:0
+2023-10-14 09:54:33,485 - embedding storage: none
+2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:54:33,485 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
+2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:54:40,631 epoch 1 - iter 144/1445 - loss 1.32151919 - time (sec): 7.14 - samples/sec: 2426.87 - lr: 0.000005 - momentum: 0.000000
+2023-10-14 09:54:47,906 epoch 1 - iter 288/1445 - loss 0.79135229 - time (sec): 14.42 - samples/sec: 2414.73 - lr: 0.000010 - momentum: 0.000000
+2023-10-14 09:54:54,888 epoch 1 - iter 432/1445 - loss 0.58960751 - time (sec): 21.40 - samples/sec: 2424.36 - lr: 0.000015 - momentum: 0.000000
+2023-10-14 09:55:02,060 epoch 1 - iter 576/1445 - loss 0.49079879 - time (sec): 28.57 - samples/sec: 2430.02 - lr: 0.000020 - momentum: 0.000000
+2023-10-14 09:55:09,558 epoch 1 - iter 720/1445 - loss 0.41600711 - time (sec): 36.07 - samples/sec: 2465.79 - lr: 0.000025 - momentum: 0.000000
+2023-10-14 09:55:16,802 epoch 1 - iter 864/1445 - loss 0.37774428 - time (sec): 43.32 - samples/sec: 2443.10 - lr: 0.000030 - momentum: 0.000000
+2023-10-14 09:55:23,834 epoch 1 - iter 1008/1445 - loss 0.34717009 - time (sec): 50.35 - samples/sec: 2433.37 - lr: 0.000035 - momentum: 0.000000
+2023-10-14 09:55:31,157 epoch 1 - iter 1152/1445 - loss 0.32075520 - time (sec): 57.67 - samples/sec: 2436.34 - lr: 0.000040 - momentum: 0.000000
+2023-10-14 09:55:38,446 epoch 1 - iter 1296/1445 - loss 0.29995277 - time (sec): 64.96 - samples/sec: 2440.65 - lr: 0.000045 - momentum: 0.000000
+2023-10-14 09:55:45,688 epoch 1 - iter 1440/1445 - loss 0.28332814 - time (sec): 72.20 - samples/sec: 2436.27 - lr: 0.000050 - momentum: 0.000000
+2023-10-14 09:55:45,899 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:55:45,900 EPOCH 1 done: loss 0.2833 - lr: 0.000050
+2023-10-14 09:55:49,489 DEV : loss 0.11788433790206909 - f1-score (micro avg)  0.7468
+2023-10-14 09:55:49,514 saving best model
+2023-10-14 09:55:49,872 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:55:57,243 epoch 2 - iter 144/1445 - loss 0.13020671 - time (sec): 7.37 - samples/sec: 2424.95 - lr: 0.000049 - momentum: 0.000000
+2023-10-14 09:56:04,400 epoch 2 - iter 288/1445 - loss 0.12131136 - time (sec): 14.53 - samples/sec: 2414.49 - lr: 0.000049 - momentum: 0.000000
+2023-10-14 09:56:11,439 epoch 2 - iter 432/1445 - loss 0.11672384 - time (sec): 21.56 - samples/sec: 2425.04 - lr: 0.000048 - momentum: 0.000000
+2023-10-14 09:56:18,951 epoch 2 - iter 576/1445 - loss 0.11924226 - time (sec): 29.08 - samples/sec: 2436.14 - lr: 0.000048 - momentum: 0.000000
+2023-10-14 09:56:26,351 epoch 2 - iter 720/1445 - loss 0.11484030 - time (sec): 36.48 - samples/sec: 2450.06 - lr: 0.000047 - momentum: 0.000000
+2023-10-14 09:56:33,706 epoch 2 - iter 864/1445 - loss 0.10994458 - time (sec): 43.83 - samples/sec: 2433.06 - lr: 0.000047 - momentum: 0.000000
+2023-10-14 09:56:40,743 epoch 2 - iter 1008/1445 - loss 0.10976559 - time (sec): 50.87 - samples/sec: 2422.72 - lr: 0.000046 - momentum: 0.000000
+2023-10-14 09:56:48,032 epoch 2 - iter 1152/1445 - loss 0.10785653 - time (sec): 58.16 - samples/sec: 2425.32 - lr: 0.000046 - momentum: 0.000000
+2023-10-14 09:56:55,326 epoch 2 - iter 1296/1445 - loss 0.10867392 - time (sec): 65.45 - samples/sec: 2423.28 - lr: 0.000045 - momentum: 0.000000
+2023-10-14 09:57:02,445 epoch 2 - iter 1440/1445 - loss 0.10774871 - time (sec): 72.57 - samples/sec: 2421.80 - lr: 0.000044 - momentum: 0.000000
+2023-10-14 09:57:02,665 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:57:02,666 EPOCH 2 done: loss 0.1077 - lr: 0.000044
+2023-10-14 09:57:07,129 DEV : loss 0.10633940249681473 - f1-score (micro avg)  0.74
+2023-10-14 09:57:07,146 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:57:15,050 epoch 3 - iter 144/1445 - loss 0.09035149 - time (sec): 7.90 - samples/sec: 2218.66 - lr: 0.000044 - momentum: 0.000000
+2023-10-14 09:57:22,733 epoch 3 - iter 288/1445 - loss 0.08638552 - time (sec): 15.59 - samples/sec: 2237.53 - lr: 0.000043 - momentum: 0.000000
+2023-10-14 09:57:30,022 epoch 3 - iter 432/1445 - loss 0.07945969 - time (sec): 22.87 - samples/sec: 2313.62 - lr: 0.000043 - momentum: 0.000000
+2023-10-14 09:57:37,519 epoch 3 - iter 576/1445 - loss 0.07900069 - time (sec): 30.37 - samples/sec: 2328.85 - lr: 0.000042 - momentum: 0.000000
+2023-10-14 09:57:44,928 epoch 3 - iter 720/1445 - loss 0.07585140 - time (sec): 37.78 - samples/sec: 2331.64 - lr: 0.000042 - momentum: 0.000000
+2023-10-14 09:57:52,073 epoch 3 - iter 864/1445 - loss 0.07259362 - time (sec): 44.93 - samples/sec: 2350.70 - lr: 0.000041 - momentum: 0.000000
+2023-10-14 09:57:59,957 epoch 3 - iter 1008/1445 - loss 0.07342512 - time (sec): 52.81 - samples/sec: 2349.05 - lr: 0.000041 - momentum: 0.000000
+2023-10-14 09:58:06,869 epoch 3 - iter 1152/1445 - loss 0.07356550 - time (sec): 59.72 - samples/sec: 2355.16 - lr: 0.000040 - momentum: 0.000000
+2023-10-14 09:58:13,892 epoch 3 - iter 1296/1445 - loss 0.07388942 - time (sec): 66.74 - samples/sec: 2374.81 - lr: 0.000039 - momentum: 0.000000
+2023-10-14 09:58:20,897 epoch 3 - iter 1440/1445 - loss 0.07387594 - time (sec): 73.75 - samples/sec: 2382.74 - lr: 0.000039 - momentum: 0.000000
+2023-10-14 09:58:21,112 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:58:21,112 EPOCH 3 done: loss 0.0739 - lr: 0.000039
+2023-10-14 09:58:24,726 DEV : loss 0.10955189168453217 - f1-score (micro avg)  0.7736
+2023-10-14 09:58:24,753 saving best model
+2023-10-14 09:58:25,237 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:58:33,415 epoch 4 - iter 144/1445 - loss 0.05575292 - time (sec): 8.17 - samples/sec: 2111.86 - lr: 0.000038 - momentum: 0.000000
+2023-10-14 09:58:40,972 epoch 4 - iter 288/1445 - loss 0.05433728 - time (sec): 15.73 - samples/sec: 2209.10 - lr: 0.000038 - momentum: 0.000000
+2023-10-14 09:58:48,740 epoch 4 - iter 432/1445 - loss 0.05126783 - time (sec): 23.50 - samples/sec: 2174.54 - lr: 0.000037 - momentum: 0.000000
+2023-10-14 09:58:55,970 epoch 4 - iter 576/1445 - loss 0.05255992 - time (sec): 30.73 - samples/sec: 2256.80 - lr: 0.000037 - momentum: 0.000000
+2023-10-14 09:59:03,172 epoch 4 - iter 720/1445 - loss 0.05383510 - time (sec): 37.93 - samples/sec: 2293.78 - lr: 0.000036 - momentum: 0.000000
+2023-10-14 09:59:10,572 epoch 4 - iter 864/1445 - loss 0.05800646 - time (sec): 45.33 - samples/sec: 2327.58 - lr: 0.000036 - momentum: 0.000000
+2023-10-14 09:59:17,841 epoch 4 - iter 1008/1445 - loss 0.05787781 - time (sec): 52.60 - samples/sec: 2357.32 - lr: 0.000035 - momentum: 0.000000
+2023-10-14 09:59:24,960 epoch 4 - iter 1152/1445 - loss 0.05816878 - time (sec): 59.72 - samples/sec: 2351.87 - lr: 0.000034 - momentum: 0.000000
+2023-10-14 09:59:31,975 epoch 4 - iter 1296/1445 - loss 0.05645199 - time (sec): 66.73 - samples/sec: 2360.65 - lr: 0.000034 - momentum: 0.000000
+2023-10-14 09:59:39,299 epoch 4 - iter 1440/1445 - loss 0.05795378 - time (sec): 74.06 - samples/sec: 2374.45 - lr: 0.000033 - momentum: 0.000000
+2023-10-14 09:59:39,517 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:59:39,517 EPOCH 4 done: loss 0.0579 - lr: 0.000033
+2023-10-14 09:59:43,265 DEV : loss 0.13582421839237213 - f1-score (micro avg)  0.7781
+2023-10-14 09:59:43,289 saving best model
+2023-10-14 09:59:43,862 ----------------------------------------------------------------------------------------------------
+2023-10-14 09:59:52,276 epoch 5 - iter 144/1445 - loss 0.03810618 - time (sec): 8.41 - samples/sec: 2223.59 - lr: 0.000033 - momentum: 0.000000
+2023-10-14 10:00:00,013 epoch 5 - iter 288/1445 - loss 0.03940126 - time (sec): 16.15 - samples/sec: 2221.95 - lr: 0.000032 - momentum: 0.000000
+2023-10-14 10:00:07,583 epoch 5 - iter 432/1445 - loss 0.04118564 - time (sec): 23.72 - samples/sec: 2276.97 - lr: 0.000032 - momentum: 0.000000
+2023-10-14 10:00:14,965 epoch 5 - iter 576/1445 - loss 0.04133400 - time (sec): 31.10 - samples/sec: 2304.36 - lr: 0.000031 - momentum: 0.000000
+2023-10-14 10:00:22,239 epoch 5 - iter 720/1445 - loss 0.04096234 - time (sec): 38.37 - samples/sec: 2322.89 - lr: 0.000031 - momentum: 0.000000
+2023-10-14 10:00:29,460 epoch 5 - iter 864/1445 - loss 0.04169367 - time (sec): 45.60 - samples/sec: 2339.62 - lr: 0.000030 - momentum: 0.000000
+2023-10-14 10:00:36,493 epoch 5 - iter 1008/1445 - loss 0.04137275 - time (sec): 52.63 - samples/sec: 2344.75 - lr: 0.000029 - momentum: 0.000000
+2023-10-14 10:00:43,616 epoch 5 - iter 1152/1445 - loss 0.04174783 - time (sec): 59.75 - samples/sec: 2357.54 - lr: 0.000029 - momentum: 0.000000
+2023-10-14 10:00:50,767 epoch 5 - iter 1296/1445 - loss 0.04150397 - time (sec): 66.90 - samples/sec: 2368.59 - lr: 0.000028 - momentum: 0.000000
+2023-10-14 10:00:58,092 epoch 5 - iter 1440/1445 - loss 0.04328645 - time (sec): 74.23 - samples/sec: 2366.61 - lr: 0.000028 - momentum: 0.000000
+2023-10-14 10:00:58,319 ----------------------------------------------------------------------------------------------------
+2023-10-14 10:00:58,319 EPOCH 5 done: loss 0.0432 - lr: 0.000028
+2023-10-14 10:01:02,496 DEV : loss 0.13541918992996216 - f1-score (micro avg)  0.8024
+2023-10-14 10:01:02,522 saving best model
+2023-10-14 10:01:03,190 ----------------------------------------------------------------------------------------------------
+2023-10-14 10:01:11,193 epoch 6 - iter 144/1445 - loss 0.02701007 - time (sec): 8.00 - samples/sec: 2186.87 - lr: 0.000027 - momentum: 0.000000
+2023-10-14 10:01:19,114 epoch 6 - iter 288/1445 - loss 0.02878942 - time (sec): 15.92 - samples/sec: 2281.52 - lr: 0.000027 - momentum: 0.000000
+2023-10-14 10:01:26,397 epoch 6 - iter 432/1445 - loss 0.02954965 - time (sec): 23.21 - samples/sec: 2310.29 - lr: 0.000026 - momentum: 0.000000
+2023-10-14 10:01:33,678 epoch 6 - iter 576/1445 - loss 0.03123260 - time (sec): 30.49 - samples/sec: 2330.56 - lr: 0.000026 - momentum: 0.000000
+2023-10-14 10:01:40,900 epoch 6 - iter 720/1445 - loss 0.03333774 - time (sec): 37.71 - samples/sec: 2343.36 - lr: 0.000025 - momentum: 0.000000
+2023-10-14 10:01:48,262 epoch 6 - iter 864/1445 - loss 0.03315317 - time (sec): 45.07 - samples/sec: 2370.89 - lr: 0.000024 - momentum: 0.000000
+2023-10-14 10:01:55,480 epoch 6 - iter 1008/1445 - loss 0.03407127 - time (sec): 52.29 - samples/sec: 2374.76 - lr: 0.000024 - momentum: 0.000000
+2023-10-14 10:02:02,421 epoch 6 - iter 1152/1445 - loss 0.03360862 - time (sec): 59.23 - samples/sec: 2376.83 - lr: 0.000023 - momentum: 0.000000
+2023-10-14 10:02:09,545 epoch 6 - iter 1296/1445 - loss 0.03318720 - time (sec): 66.35 - samples/sec: 2369.79 - lr: 0.000023 - momentum: 0.000000
+2023-10-14 10:02:16,875 epoch 6 - iter 1440/1445 - loss 0.03384080 - time (sec): 73.68 - samples/sec: 2382.28 - lr: 0.000022 - momentum: 0.000000
+2023-10-14 10:02:17,142 ----------------------------------------------------------------------------------------------------
+2023-10-14 10:02:17,142 EPOCH 6 done: loss 0.0337 - lr: 0.000022
+2023-10-14 10:02:20,761 DEV : loss 0.14951762557029724 - f1-score (micro avg)  0.8055
+2023-10-14 10:02:20,782 saving best model
+2023-10-14 10:02:21,326 ----------------------------------------------------------------------------------------------------
+2023-10-14 10:02:28,530 epoch 7 - iter 144/1445 - loss 0.01859679 - time (sec): 7.20 - samples/sec: 2409.56 - lr: 0.000022 - momentum: 0.000000
+2023-10-14 10:02:36,495 epoch 7 - iter 288/1445 - loss 0.01862360 - time (sec): 15.17 - samples/sec: 2313.65 - lr: 0.000021 - momentum: 0.000000
+2023-10-14 10:02:43,691 epoch 7 - iter 432/1445 - loss 0.02122236 - time (sec): 22.36 - samples/sec: 2339.13 - lr: 0.000021 - momentum: 0.000000
+2023-10-14 10:02:51,023 epoch 7 - iter 576/1445 - loss 0.02018797 - time (sec): 29.69 - samples/sec: 2366.08 - lr: 0.000020 - momentum: 0.000000
+2023-10-14 10:02:58,172 epoch 7 - iter 720/1445 - loss 0.02085339 - time (sec): 36.84 - samples/sec: 2377.25 - lr: 0.000019 - momentum: 0.000000
+2023-10-14 10:03:05,594 epoch 7 - iter 864/1445 - loss 0.02188189 - time (sec): 44.27 - samples/sec: 2386.72 - lr: 0.000019 - momentum: 0.000000
+2023-10-14 10:03:12,809 epoch 7 - iter 1008/1445 - loss 0.02164479 - time (sec): 51.48 - samples/sec: 2388.79 - lr: 0.000018 - momentum: 0.000000
+2023-10-14 10:03:19,872 epoch 7 - iter 1152/1445 - loss 0.02215737 - time (sec): 58.54 - samples/sec: 2389.64 - lr: 0.000018 - momentum: 0.000000
+2023-10-14 10:03:27,552 epoch 7 - iter 1296/1445 - loss 0.02159639 - time (sec): 66.22 - samples/sec: 2387.54 - lr: 0.000017 - momentum: 0.000000
+2023-10-14 10:03:34,922 epoch 7 - iter 1440/1445 - loss 0.02175986 - time (sec): 73.59 - samples/sec: 2388.59 - lr: 0.000017 - momentum: 0.000000
+2023-10-14 10:03:35,171 ----------------------------------------------------------------------------------------------------
+2023-10-14 10:03:35,172 EPOCH 7 done: loss 0.0218 - lr: 0.000017
+2023-10-14 10:03:38,829 DEV : loss 0.1777421534061432 - f1-score (micro avg)  0.8178
+2023-10-14 10:03:38,851 saving best model
+2023-10-14 10:03:39,474 ----------------------------------------------------------------------------------------------------
+2023-10-14 10:03:46,871 epoch 8 - iter 144/1445 - loss 0.01548083 - time (sec): 7.40 - samples/sec: 2445.84 - lr: 0.000016 - momentum: 0.000000
+2023-10-14 10:03:53,941 epoch 8 - iter 288/1445 - loss 0.01427568 - time (sec): 14.47 - samples/sec: 2431.92 - lr: 0.000016 - momentum: 0.000000
+2023-10-14 10:04:01,854 epoch 8 - iter 432/1445 - loss 0.01655715 - time (sec): 22.38 - samples/sec: 2449.47 - lr: 0.000015 - momentum: 0.000000
+2023-10-14 10:04:08,682 epoch 8 - iter 576/1445 - loss 0.01523833 - time (sec): 29.21 - samples/sec: 2379.70 - lr: 0.000014 - momentum: 0.000000
+2023-10-14 10:04:16,110 epoch 8 - iter 720/1445 - loss 0.01549776 - time (sec): 36.63 - samples/sec: 2408.93 - lr: 0.000014 - momentum: 0.000000
+2023-10-14 10:04:23,692 epoch 8 - iter 864/1445 - loss 0.01513458 - time (sec): 44.22 - samples/sec: 2413.23 - lr: 0.000013 - momentum: 0.000000
+2023-10-14 10:04:30,915 epoch 8 - iter 1008/1445 - loss 0.01430193 - time (sec): 51.44 - samples/sec: 2407.19 - lr: 0.000013 - momentum: 0.000000
+2023-10-14 10:04:38,088 epoch 8 - iter 1152/1445 - loss 0.01514747 - time (sec): 58.61 - samples/sec: 2405.46 - lr: 0.000012 - momentum: 0.000000
+2023-10-14 10:04:45,354 epoch 8 - iter 1296/1445 - loss 0.01503901 - time (sec): 65.88 - samples/sec: 2412.79 - lr: 0.000012 - momentum: 0.000000
+2023-10-14 10:04:52,563 epoch 8 - iter 1440/1445 - loss 0.01540928 - time (sec): 73.09 - samples/sec: 2404.82 - lr: 0.000011 - momentum: 0.000000
+2023-10-14 10:04:52,787 ----------------------------------------------------------------------------------------------------
+2023-10-14 10:04:52,787 EPOCH 8 done: loss 0.0154 - lr: 0.000011
+2023-10-14 10:04:56,876 DEV : loss 0.17469239234924316 - f1-score (micro avg)  0.8195
+2023-10-14 10:04:56,901 saving best model
+2023-10-14 10:04:57,410 ----------------------------------------------------------------------------------------------------
+2023-10-14 10:05:04,739 epoch 9 - iter 144/1445 - loss 0.00719715 - time (sec): 7.32 - samples/sec: 2462.89 - lr: 0.000011 - momentum: 0.000000
+2023-10-14 10:05:11,984 epoch 9 - iter 288/1445 - loss 0.00849542 - time (sec): 14.57 - samples/sec: 2445.28 - lr: 0.000010 - momentum: 0.000000
+2023-10-14 10:05:19,098 epoch 9 - iter 432/1445 - loss 0.00729865 - time (sec): 21.68 - samples/sec: 2425.52 - lr: 0.000009 - momentum: 0.000000
+2023-10-14 10:05:26,573 epoch 9 - iter 576/1445 - loss 0.00850589 - time (sec): 29.15 - samples/sec: 2435.29 - lr: 0.000009 - momentum: 0.000000
+2023-10-14 10:05:33,750 epoch 9 - iter 720/1445 - loss 0.00957931 - time (sec): 36.33 - samples/sec: 2419.21 - lr: 0.000008 - momentum: 0.000000
+2023-10-14 10:05:41,317 epoch 9 - iter 864/1445 - loss 0.00998317 - time (sec): 43.90 - samples/sec: 2439.16 - lr: 0.000008 - momentum: 0.000000
+2023-10-14 10:05:48,361 epoch 9 - iter 1008/1445 - loss 0.00949987 - time (sec): 50.94 - samples/sec: 2424.28 - lr: 0.000007 - momentum: 0.000000
+2023-10-14 10:05:55,632 epoch 9 - iter 1152/1445 - loss 0.00972046 - time (sec): 58.21 - samples/sec: 2432.51 - lr: 0.000007 - momentum: 0.000000
+2023-10-14 10:06:02,678 epoch 9 - iter 1296/1445 - loss 0.01012022 - time (sec): 65.26 - samples/sec: 2429.75 - lr: 0.000006 - momentum: 0.000000
+2023-10-14 10:06:09,749 epoch 9 - iter 1440/1445 - loss 0.00980864 - time (sec): 72.33 - samples/sec: 2431.32 - lr: 0.000006 - momentum: 0.000000
+2023-10-14 10:06:09,984 ----------------------------------------------------------------------------------------------------
+2023-10-14 10:06:09,984 EPOCH 9 done: loss 0.0098 - lr: 0.000006
+2023-10-14 10:06:13,690 DEV : loss 0.17753612995147705 - f1-score (micro avg)  0.8164
+2023-10-14 10:06:13,708 ----------------------------------------------------------------------------------------------------
+2023-10-14 10:06:20,952 epoch 10 - iter 144/1445 - loss 0.00578931 - time (sec): 7.24 - samples/sec: 2299.68 - lr: 0.000005 - momentum: 0.000000
+2023-10-14 10:06:28,812 epoch 10 - iter 288/1445 - loss 0.00515456 - time (sec): 15.10 - samples/sec: 2349.36 - lr: 0.000004 - momentum: 0.000000
+2023-10-14 10:06:36,007 epoch 10 - iter 432/1445 - loss 0.00857074 - time (sec): 22.30 - samples/sec: 2386.36 - lr: 0.000004 - momentum: 0.000000
+2023-10-14 10:06:43,072 epoch 10 - iter 576/1445 - loss 0.00818790 - time (sec): 29.36 - samples/sec: 2377.72 - lr: 0.000003 - momentum: 0.000000
+2023-10-14 10:06:50,686 epoch 10 - iter 720/1445 - loss 0.00811613 - time (sec): 36.98 - samples/sec: 2386.90 - lr: 0.000003 - momentum: 0.000000
+2023-10-14 10:06:58,142 epoch 10 - iter 864/1445 - loss 0.00732934 - time (sec): 44.43 - samples/sec: 2373.05 - lr: 0.000002 - momentum: 0.000000
+2023-10-14 10:07:05,597 epoch 10 - iter 1008/1445 - loss 0.00736235 - time (sec): 51.89 - samples/sec: 2389.94 - lr: 0.000002 - momentum: 0.000000
+2023-10-14 10:07:12,759 epoch 10 - iter 1152/1445 - loss 0.00677836 - time (sec): 59.05 - samples/sec: 2393.33 - lr: 0.000001 - momentum: 0.000000
+2023-10-14 10:07:19,810 epoch 10 - iter 1296/1445 - loss 0.00657795 - time (sec): 66.10 - samples/sec: 2392.12 - lr: 0.000001 - momentum: 0.000000
+2023-10-14 10:07:27,162 epoch 10 - iter 1440/1445 - loss 0.00682360 - time (sec): 73.45 - samples/sec: 2388.96 - lr: 0.000000 - momentum: 0.000000
+2023-10-14 10:07:27,463 ----------------------------------------------------------------------------------------------------
+2023-10-14 10:07:27,464 EPOCH 10 done: loss 0.0068 - lr: 0.000000
+2023-10-14 10:07:31,044 DEV : loss 0.18386265635490417 - f1-score (micro avg)  0.8243
+2023-10-14 10:07:31,069 saving best model
+2023-10-14 10:07:32,182 ----------------------------------------------------------------------------------------------------
+2023-10-14 10:07:32,184 Loading model from best epoch ...
+2023-10-14 10:07:33,955 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
+2023-10-14 10:07:37,400
+Results:
+- F-score (micro) 0.818
+- F-score (macro) 0.7206
+- Accuracy 0.7037
+
+By class:
+              precision    recall  f1-score   support
+
+         PER     0.8142    0.8091    0.8117       482
+         LOC     0.9071    0.8319    0.8679       458
+         ORG     0.6279    0.3913    0.4821        69
+
+   micro avg     0.8471    0.7909    0.8180      1009
+   macro avg     0.7831    0.6774    0.7206      1009
+weighted avg     0.8436    0.7909    0.8146      1009
+
+2023-10-14 10:07:37,400 ----------------------------------------------------------------------------------------------------
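The macro and weighted averages in the final evaluation table follow directly from the per-class rows. A minimal sketch recomputing them from the rounded per-class scores above (the weighted F1 matches the logged 0.8146 only to within rounding, since the log averages the unrounded scores):

```python
# Per-class (precision, recall, f1, support) from the test-set table.
per_class = {
    "PER": (0.8142, 0.8091, 0.8117, 482),
    "LOC": (0.9071, 0.8319, 0.8679, 458),
    "ORG": (0.6279, 0.3913, 0.4821, 69),
}

total_support = sum(s for *_, s in per_class.values())

# Macro average: unweighted mean over classes.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: mean over classes weighted by support.
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

print(round(macro_f1, 4))     # 0.7206, as reported for "F-score (macro)"
print(round(weighted_f1, 4))  # ~0.8146 (0.8147 from the rounded inputs)
```

The micro average cannot be recovered this way, since it needs the raw TP/FP/FN counts rather than per-class ratios.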