Upload folder using huggingface_hub

- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- test.tsv +0 -0
- training.log +242 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:85ae2eaea40b6ba1b60342acbe13a9e3fd1c8d04be3adc199d4b70176db81f94
+size 443323527
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH	TIMESTAMP	LEARNING_RATE	TRAIN_LOSS	DEV_LOSS	DEV_PRECISION	DEV_RECALL	DEV_F1	DEV_ACCURACY
+1	11:26:24	0.0000	0.4246	0.1049	0.2183	0.4205	0.2874	0.1682
+2	11:29:43	0.0000	0.1457	0.1157	0.2544	0.4621	0.3282	0.1966
+3	11:33:02	0.0000	0.0944	0.2599	0.2207	0.6515	0.3297	0.1983
+4	11:36:21	0.0000	0.0655	0.3399	0.2383	0.6648	0.3508	0.2135
+5	11:39:42	0.0000	0.0488	0.3383	0.2873	0.5436	0.3759	0.2324
+6	11:42:59	0.0000	0.0347	0.3767	0.2823	0.5625	0.3759	0.2326
+7	11:46:19	0.0000	0.0288	0.4418	0.2550	0.6042	0.3586	0.2192
+8	11:49:36	0.0000	0.0201	0.4072	0.2760	0.5739	0.3727	0.2304
+9	11:52:56	0.0000	0.0133	0.4837	0.2658	0.6061	0.3695	0.2276
+10	11:56:14	0.0000	0.0097	0.5250	0.2594	0.6004	0.3623	0.2221
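Not part of the commit itself, but as a reader aid: the per-epoch metrics in loss.tsv are plain tab-separated values, so the best dev epoch can be picked out with the standard library. A minimal sketch, with the rows copied verbatim from the table above (note epochs 5 and 6 tie on dev F1, and `max()` keeps the first):

```python
import csv
import io

# Contents of loss.tsv as written by the trainer (tab-separated; the
# LEARNING_RATE column is truncated to four decimals, hence 0.0000).
header = "EPOCH\tTIMESTAMP\tLEARNING_RATE\tTRAIN_LOSS\tDEV_LOSS\tDEV_PRECISION\tDEV_RECALL\tDEV_F1\tDEV_ACCURACY"
data = [
    "1\t11:26:24\t0.0000\t0.4246\t0.1049\t0.2183\t0.4205\t0.2874\t0.1682",
    "2\t11:29:43\t0.0000\t0.1457\t0.1157\t0.2544\t0.4621\t0.3282\t0.1966",
    "3\t11:33:02\t0.0000\t0.0944\t0.2599\t0.2207\t0.6515\t0.3297\t0.1983",
    "4\t11:36:21\t0.0000\t0.0655\t0.3399\t0.2383\t0.6648\t0.3508\t0.2135",
    "5\t11:39:42\t0.0000\t0.0488\t0.3383\t0.2873\t0.5436\t0.3759\t0.2324",
    "6\t11:42:59\t0.0000\t0.0347\t0.3767\t0.2823\t0.5625\t0.3759\t0.2326",
    "7\t11:46:19\t0.0000\t0.0288\t0.4418\t0.2550\t0.6042\t0.3586\t0.2192",
    "8\t11:49:36\t0.0000\t0.0201\t0.4072\t0.2760\t0.5739\t0.3727\t0.2304",
    "9\t11:52:56\t0.0000\t0.0133\t0.4837\t0.2658\t0.6061\t0.3695\t0.2276",
    "10\t11:56:14\t0.0000\t0.0097\t0.5250\t0.2594\t0.6004\t0.3623\t0.2221",
]

rows = list(csv.DictReader(io.StringIO("\n".join([header] + data)), delimiter="\t"))
best = max(rows, key=lambda r: float(r["DEV_F1"]))  # first row on ties
print(f"best dev F1 {best['DEV_F1']} at epoch {best['EPOCH']}")
```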
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,242 @@
+2023-10-15 11:23:10,757 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:23:10,758 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=768, out_features=768, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=17, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-15 11:23:10,758 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:23:10,758 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
+ - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
+2023-10-15 11:23:10,758 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:23:10,758 Train:  20847 sentences
+2023-10-15 11:23:10,758 (train_with_dev=False, train_with_test=False)
+2023-10-15 11:23:10,758 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:23:10,758 Training Params:
+2023-10-15 11:23:10,758  - learning_rate: "3e-05"
+2023-10-15 11:23:10,758  - mini_batch_size: "8"
+2023-10-15 11:23:10,758  - max_epochs: "10"
+2023-10-15 11:23:10,758  - shuffle: "True"
+2023-10-15 11:23:10,758 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:23:10,758 Plugins:
+2023-10-15 11:23:10,758  - LinearScheduler | warmup_fraction: '0.1'
+2023-10-15 11:23:10,758 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:23:10,758 Final evaluation on model from best epoch (best-model.pt)
+2023-10-15 11:23:10,758  - metric: "('micro avg', 'f1-score')"
+2023-10-15 11:23:10,758 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:23:10,758 Computation:
+2023-10-15 11:23:10,758  - compute on device: cuda:0
+2023-10-15 11:23:10,758  - embedding storage: none
+2023-10-15 11:23:10,758 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:23:10,758 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
+2023-10-15 11:23:10,758 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:23:10,758 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:23:30,871 epoch 1 - iter 260/2606 - loss 1.95608573 - time (sec): 20.11 - samples/sec: 1821.03 - lr: 0.000003 - momentum: 0.000000
+2023-10-15 11:23:49,246 epoch 1 - iter 520/2606 - loss 1.20498957 - time (sec): 38.49 - samples/sec: 1902.63 - lr: 0.000006 - momentum: 0.000000
+2023-10-15 11:24:08,718 epoch 1 - iter 780/2606 - loss 0.88774684 - time (sec): 57.96 - samples/sec: 1938.13 - lr: 0.000009 - momentum: 0.000000
+2023-10-15 11:24:26,997 epoch 1 - iter 1040/2606 - loss 0.74089882 - time (sec): 76.24 - samples/sec: 1944.19 - lr: 0.000012 - momentum: 0.000000
+2023-10-15 11:24:46,543 epoch 1 - iter 1300/2606 - loss 0.62868238 - time (sec): 95.78 - samples/sec: 1963.38 - lr: 0.000015 - momentum: 0.000000
+2023-10-15 11:25:05,186 epoch 1 - iter 1560/2606 - loss 0.56241762 - time (sec): 114.43 - samples/sec: 1968.23 - lr: 0.000018 - momentum: 0.000000
+2023-10-15 11:25:22,602 epoch 1 - iter 1820/2606 - loss 0.51681442 - time (sec): 131.84 - samples/sec: 1961.99 - lr: 0.000021 - momentum: 0.000000
+2023-10-15 11:25:41,872 epoch 1 - iter 2080/2606 - loss 0.47721131 - time (sec): 151.11 - samples/sec: 1963.11 - lr: 0.000024 - momentum: 0.000000
+2023-10-15 11:26:00,694 epoch 1 - iter 2340/2606 - loss 0.44717332 - time (sec): 169.93 - samples/sec: 1957.85 - lr: 0.000027 - momentum: 0.000000
+2023-10-15 11:26:18,704 epoch 1 - iter 2600/2606 - loss 0.42510976 - time (sec): 187.94 - samples/sec: 1951.67 - lr: 0.000030 - momentum: 0.000000
+2023-10-15 11:26:19,102 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:26:19,103 EPOCH 1 done: loss 0.4246 - lr: 0.000030
+2023-10-15 11:26:24,777 DEV : loss 0.10493393242359161 - f1-score (micro avg)  0.2874
+2023-10-15 11:26:24,804 saving best model
+2023-10-15 11:26:25,279 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:26:44,170 epoch 2 - iter 260/2606 - loss 0.15135290 - time (sec): 18.89 - samples/sec: 2012.96 - lr: 0.000030 - momentum: 0.000000
+2023-10-15 11:27:02,300 epoch 2 - iter 520/2606 - loss 0.15584007 - time (sec): 37.02 - samples/sec: 1928.81 - lr: 0.000029 - momentum: 0.000000
+2023-10-15 11:27:21,347 epoch 2 - iter 780/2606 - loss 0.15466883 - time (sec): 56.07 - samples/sec: 1940.15 - lr: 0.000029 - momentum: 0.000000
+2023-10-15 11:27:40,743 epoch 2 - iter 1040/2606 - loss 0.15722047 - time (sec): 75.46 - samples/sec: 1945.92 - lr: 0.000029 - momentum: 0.000000
+2023-10-15 11:27:59,737 epoch 2 - iter 1300/2606 - loss 0.15371850 - time (sec): 94.46 - samples/sec: 1945.89 - lr: 0.000028 - momentum: 0.000000
+2023-10-15 11:28:17,978 epoch 2 - iter 1560/2606 - loss 0.15077449 - time (sec): 112.70 - samples/sec: 1928.06 - lr: 0.000028 - momentum: 0.000000
+2023-10-15 11:28:37,463 epoch 2 - iter 1820/2606 - loss 0.15121932 - time (sec): 132.18 - samples/sec: 1928.51 - lr: 0.000028 - momentum: 0.000000
+2023-10-15 11:28:56,764 epoch 2 - iter 2080/2606 - loss 0.14626589 - time (sec): 151.48 - samples/sec: 1935.22 - lr: 0.000027 - momentum: 0.000000
+2023-10-15 11:29:16,655 epoch 2 - iter 2340/2606 - loss 0.14497160 - time (sec): 171.37 - samples/sec: 1942.69 - lr: 0.000027 - momentum: 0.000000
+2023-10-15 11:29:34,519 epoch 2 - iter 2600/2606 - loss 0.14582616 - time (sec): 189.24 - samples/sec: 1937.24 - lr: 0.000027 - momentum: 0.000000
+2023-10-15 11:29:34,896 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:29:34,896 EPOCH 2 done: loss 0.1457 - lr: 0.000027
+2023-10-15 11:29:43,811 DEV : loss 0.1157253161072731 - f1-score (micro avg)  0.3282
+2023-10-15 11:29:43,842 saving best model
+2023-10-15 11:29:44,383 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:30:04,129 epoch 3 - iter 260/2606 - loss 0.09672823 - time (sec): 19.74 - samples/sec: 1911.15 - lr: 0.000026 - momentum: 0.000000
+2023-10-15 11:30:22,485 epoch 3 - iter 520/2606 - loss 0.09813069 - time (sec): 38.10 - samples/sec: 1920.33 - lr: 0.000026 - momentum: 0.000000
+2023-10-15 11:30:41,096 epoch 3 - iter 780/2606 - loss 0.09789230 - time (sec): 56.71 - samples/sec: 1899.82 - lr: 0.000026 - momentum: 0.000000
+2023-10-15 11:30:59,847 epoch 3 - iter 1040/2606 - loss 0.09970218 - time (sec): 75.46 - samples/sec: 1901.46 - lr: 0.000025 - momentum: 0.000000
+2023-10-15 11:31:18,566 epoch 3 - iter 1300/2606 - loss 0.10029814 - time (sec): 94.18 - samples/sec: 1913.54 - lr: 0.000025 - momentum: 0.000000
+2023-10-15 11:31:37,456 epoch 3 - iter 1560/2606 - loss 0.09911648 - time (sec): 113.07 - samples/sec: 1925.81 - lr: 0.000025 - momentum: 0.000000
+2023-10-15 11:31:56,736 epoch 3 - iter 1820/2606 - loss 0.09770698 - time (sec): 132.35 - samples/sec: 1940.18 - lr: 0.000024 - momentum: 0.000000
+2023-10-15 11:32:14,998 epoch 3 - iter 2080/2606 - loss 0.09548429 - time (sec): 150.61 - samples/sec: 1943.35 - lr: 0.000024 - momentum: 0.000000
+2023-10-15 11:32:34,739 epoch 3 - iter 2340/2606 - loss 0.09541300 - time (sec): 170.35 - samples/sec: 1936.42 - lr: 0.000024 - momentum: 0.000000
+2023-10-15 11:32:53,656 epoch 3 - iter 2600/2606 - loss 0.09437391 - time (sec): 189.27 - samples/sec: 1938.48 - lr: 0.000023 - momentum: 0.000000
+2023-10-15 11:32:54,014 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:32:54,014 EPOCH 3 done: loss 0.0944 - lr: 0.000023
+2023-10-15 11:33:02,873 DEV : loss 0.2598646879196167 - f1-score (micro avg)  0.3297
+2023-10-15 11:33:02,898 saving best model
+2023-10-15 11:33:03,466 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:33:21,480 epoch 4 - iter 260/2606 - loss 0.06904616 - time (sec): 18.01 - samples/sec: 1954.12 - lr: 0.000023 - momentum: 0.000000
+2023-10-15 11:33:40,589 epoch 4 - iter 520/2606 - loss 0.06431747 - time (sec): 37.12 - samples/sec: 1901.00 - lr: 0.000023 - momentum: 0.000000
+2023-10-15 11:34:00,177 epoch 4 - iter 780/2606 - loss 0.06516267 - time (sec): 56.71 - samples/sec: 1918.47 - lr: 0.000022 - momentum: 0.000000
+2023-10-15 11:34:18,744 epoch 4 - iter 1040/2606 - loss 0.06696138 - time (sec): 75.28 - samples/sec: 1935.08 - lr: 0.000022 - momentum: 0.000000
+2023-10-15 11:34:36,926 epoch 4 - iter 1300/2606 - loss 0.06644839 - time (sec): 93.46 - samples/sec: 1934.94 - lr: 0.000022 - momentum: 0.000000
+2023-10-15 11:34:55,744 epoch 4 - iter 1560/2606 - loss 0.06837482 - time (sec): 112.28 - samples/sec: 1945.03 - lr: 0.000021 - momentum: 0.000000
+2023-10-15 11:35:15,091 epoch 4 - iter 1820/2606 - loss 0.06759837 - time (sec): 131.62 - samples/sec: 1938.44 - lr: 0.000021 - momentum: 0.000000
+2023-10-15 11:35:33,800 epoch 4 - iter 2080/2606 - loss 0.06569611 - time (sec): 150.33 - samples/sec: 1949.43 - lr: 0.000021 - momentum: 0.000000
+2023-10-15 11:35:52,303 epoch 4 - iter 2340/2606 - loss 0.06595273 - time (sec): 168.84 - samples/sec: 1941.72 - lr: 0.000020 - momentum: 0.000000
+2023-10-15 11:36:12,192 epoch 4 - iter 2600/2606 - loss 0.06525384 - time (sec): 188.72 - samples/sec: 1941.11 - lr: 0.000020 - momentum: 0.000000
+2023-10-15 11:36:12,591 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:36:12,591 EPOCH 4 done: loss 0.0655 - lr: 0.000020
+2023-10-15 11:36:21,481 DEV : loss 0.3398613929748535 - f1-score (micro avg)  0.3508
+2023-10-15 11:36:21,508 saving best model
+2023-10-15 11:36:22,101 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:36:40,837 epoch 5 - iter 260/2606 - loss 0.04681142 - time (sec): 18.73 - samples/sec: 1912.70 - lr: 0.000020 - momentum: 0.000000
+2023-10-15 11:36:59,251 epoch 5 - iter 520/2606 - loss 0.04619425 - time (sec): 37.15 - samples/sec: 1898.26 - lr: 0.000019 - momentum: 0.000000
+2023-10-15 11:37:18,191 epoch 5 - iter 780/2606 - loss 0.04916841 - time (sec): 56.09 - samples/sec: 1906.49 - lr: 0.000019 - momentum: 0.000000
+2023-10-15 11:37:36,945 epoch 5 - iter 1040/2606 - loss 0.04835895 - time (sec): 74.84 - samples/sec: 1920.06 - lr: 0.000019 - momentum: 0.000000
+2023-10-15 11:37:56,295 epoch 5 - iter 1300/2606 - loss 0.04935859 - time (sec): 94.19 - samples/sec: 1907.75 - lr: 0.000018 - momentum: 0.000000
+2023-10-15 11:38:15,599 epoch 5 - iter 1560/2606 - loss 0.05069348 - time (sec): 113.50 - samples/sec: 1916.18 - lr: 0.000018 - momentum: 0.000000
+2023-10-15 11:38:35,508 epoch 5 - iter 1820/2606 - loss 0.04985800 - time (sec): 133.41 - samples/sec: 1908.04 - lr: 0.000018 - momentum: 0.000000
+2023-10-15 11:38:55,541 epoch 5 - iter 2080/2606 - loss 0.04931997 - time (sec): 153.44 - samples/sec: 1916.40 - lr: 0.000017 - momentum: 0.000000
+2023-10-15 11:39:15,029 epoch 5 - iter 2340/2606 - loss 0.04925847 - time (sec): 172.93 - samples/sec: 1923.96 - lr: 0.000017 - momentum: 0.000000
+2023-10-15 11:39:32,811 epoch 5 - iter 2600/2606 - loss 0.04882081 - time (sec): 190.71 - samples/sec: 1922.34 - lr: 0.000017 - momentum: 0.000000
+2023-10-15 11:39:33,271 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:39:33,271 EPOCH 5 done: loss 0.0488 - lr: 0.000017
+2023-10-15 11:39:42,111 DEV : loss 0.33827313780784607 - f1-score (micro avg)  0.3759
+2023-10-15 11:39:42,135 saving best model
+2023-10-15 11:39:42,636 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:40:02,083 epoch 6 - iter 260/2606 - loss 0.03216757 - time (sec): 19.44 - samples/sec: 2018.67 - lr: 0.000016 - momentum: 0.000000
+2023-10-15 11:40:20,796 epoch 6 - iter 520/2606 - loss 0.03377516 - time (sec): 38.16 - samples/sec: 2000.21 - lr: 0.000016 - momentum: 0.000000
+2023-10-15 11:40:40,451 epoch 6 - iter 780/2606 - loss 0.03450669 - time (sec): 57.81 - samples/sec: 2000.27 - lr: 0.000016 - momentum: 0.000000
+2023-10-15 11:40:58,899 epoch 6 - iter 1040/2606 - loss 0.03447335 - time (sec): 76.26 - samples/sec: 1969.32 - lr: 0.000015 - momentum: 0.000000
+2023-10-15 11:41:16,192 epoch 6 - iter 1300/2606 - loss 0.03445124 - time (sec): 93.55 - samples/sec: 1955.94 - lr: 0.000015 - momentum: 0.000000
+2023-10-15 11:41:35,241 epoch 6 - iter 1560/2606 - loss 0.03426452 - time (sec): 112.60 - samples/sec: 1961.66 - lr: 0.000015 - momentum: 0.000000
+2023-10-15 11:41:53,190 epoch 6 - iter 1820/2606 - loss 0.03449286 - time (sec): 130.55 - samples/sec: 1948.04 - lr: 0.000014 - momentum: 0.000000
+2023-10-15 11:42:12,877 epoch 6 - iter 2080/2606 - loss 0.03459348 - time (sec): 150.24 - samples/sec: 1954.68 - lr: 0.000014 - momentum: 0.000000
+2023-10-15 11:42:32,418 epoch 6 - iter 2340/2606 - loss 0.03491351 - time (sec): 169.78 - samples/sec: 1952.20 - lr: 0.000014 - momentum: 0.000000
+2023-10-15 11:42:50,282 epoch 6 - iter 2600/2606 - loss 0.03468906 - time (sec): 187.64 - samples/sec: 1953.90 - lr: 0.000013 - momentum: 0.000000
+2023-10-15 11:42:50,712 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:42:50,712 EPOCH 6 done: loss 0.0347 - lr: 0.000013
+2023-10-15 11:42:59,578 DEV : loss 0.37671998143196106 - f1-score (micro avg)  0.3759
+2023-10-15 11:42:59,603 saving best model
+2023-10-15 11:43:00,168 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:43:18,395 epoch 7 - iter 260/2606 - loss 0.02095137 - time (sec): 18.22 - samples/sec: 1932.76 - lr: 0.000013 - momentum: 0.000000
+2023-10-15 11:43:37,265 epoch 7 - iter 520/2606 - loss 0.02458808 - time (sec): 37.09 - samples/sec: 1940.53 - lr: 0.000013 - momentum: 0.000000
+2023-10-15 11:43:56,200 epoch 7 - iter 780/2606 - loss 0.02842129 - time (sec): 56.03 - samples/sec: 1954.51 - lr: 0.000012 - momentum: 0.000000
+2023-10-15 11:44:14,945 epoch 7 - iter 1040/2606 - loss 0.02983051 - time (sec): 74.78 - samples/sec: 1954.56 - lr: 0.000012 - momentum: 0.000000
+2023-10-15 11:44:33,678 epoch 7 - iter 1300/2606 - loss 0.02902919 - time (sec): 93.51 - samples/sec: 1952.03 - lr: 0.000012 - momentum: 0.000000
+2023-10-15 11:44:53,466 epoch 7 - iter 1560/2606 - loss 0.02891612 - time (sec): 113.30 - samples/sec: 1937.76 - lr: 0.000011 - momentum: 0.000000
+2023-10-15 11:45:14,253 epoch 7 - iter 1820/2606 - loss 0.02831922 - time (sec): 134.08 - samples/sec: 1924.72 - lr: 0.000011 - momentum: 0.000000
+2023-10-15 11:45:32,859 epoch 7 - iter 2080/2606 - loss 0.02779547 - time (sec): 152.69 - samples/sec: 1925.35 - lr: 0.000011 - momentum: 0.000000
+2023-10-15 11:45:51,573 epoch 7 - iter 2340/2606 - loss 0.02854297 - time (sec): 171.40 - samples/sec: 1925.01 - lr: 0.000010 - momentum: 0.000000
+2023-10-15 11:46:10,274 epoch 7 - iter 2600/2606 - loss 0.02881930 - time (sec): 190.10 - samples/sec: 1926.52 - lr: 0.000010 - momentum: 0.000000
+2023-10-15 11:46:10,807 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:46:10,807 EPOCH 7 done: loss 0.0288 - lr: 0.000010
+2023-10-15 11:46:19,724 DEV : loss 0.4418008029460907 - f1-score (micro avg)  0.3586
+2023-10-15 11:46:19,749 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:46:38,755 epoch 8 - iter 260/2606 - loss 0.01765218 - time (sec): 19.01 - samples/sec: 1966.68 - lr: 0.000010 - momentum: 0.000000
+2023-10-15 11:46:56,956 epoch 8 - iter 520/2606 - loss 0.02088866 - time (sec): 37.21 - samples/sec: 1945.95 - lr: 0.000009 - momentum: 0.000000
+2023-10-15 11:47:16,035 epoch 8 - iter 780/2606 - loss 0.02077464 - time (sec): 56.28 - samples/sec: 1935.26 - lr: 0.000009 - momentum: 0.000000
+2023-10-15 11:47:34,625 epoch 8 - iter 1040/2606 - loss 0.01935007 - time (sec): 74.87 - samples/sec: 1931.34 - lr: 0.000009 - momentum: 0.000000
+2023-10-15 11:47:53,113 epoch 8 - iter 1300/2606 - loss 0.01961430 - time (sec): 93.36 - samples/sec: 1933.92 - lr: 0.000008 - momentum: 0.000000
+2023-10-15 11:48:12,999 epoch 8 - iter 1560/2606 - loss 0.02100904 - time (sec): 113.25 - samples/sec: 1946.62 - lr: 0.000008 - momentum: 0.000000
+2023-10-15 11:48:31,646 epoch 8 - iter 1820/2606 - loss 0.02141028 - time (sec): 131.90 - samples/sec: 1953.85 - lr: 0.000008 - momentum: 0.000000
+2023-10-15 11:48:50,341 epoch 8 - iter 2080/2606 - loss 0.02087340 - time (sec): 150.59 - samples/sec: 1939.85 - lr: 0.000007 - momentum: 0.000000
+2023-10-15 11:49:09,307 epoch 8 - iter 2340/2606 - loss 0.02019028 - time (sec): 169.56 - samples/sec: 1943.76 - lr: 0.000007 - momentum: 0.000000
+2023-10-15 11:49:28,050 epoch 8 - iter 2600/2606 - loss 0.02013491 - time (sec): 188.30 - samples/sec: 1945.41 - lr: 0.000007 - momentum: 0.000000
+2023-10-15 11:49:28,555 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:49:28,555 EPOCH 8 done: loss 0.0201 - lr: 0.000007
+2023-10-15 11:49:36,740 DEV : loss 0.40716004371643066 - f1-score (micro avg)  0.3727
+2023-10-15 11:49:36,765 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:49:57,592 epoch 9 - iter 260/2606 - loss 0.01784026 - time (sec): 20.83 - samples/sec: 1894.37 - lr: 0.000006 - momentum: 0.000000
+2023-10-15 11:50:16,456 epoch 9 - iter 520/2606 - loss 0.01374666 - time (sec): 39.69 - samples/sec: 1939.75 - lr: 0.000006 - momentum: 0.000000
+2023-10-15 11:50:34,107 epoch 9 - iter 780/2606 - loss 0.01346939 - time (sec): 57.34 - samples/sec: 1927.06 - lr: 0.000006 - momentum: 0.000000
+2023-10-15 11:50:53,332 epoch 9 - iter 1040/2606 - loss 0.01388641 - time (sec): 76.57 - samples/sec: 1934.95 - lr: 0.000005 - momentum: 0.000000
+2023-10-15 11:51:12,267 epoch 9 - iter 1300/2606 - loss 0.01330099 - time (sec): 95.50 - samples/sec: 1932.69 - lr: 0.000005 - momentum: 0.000000
+2023-10-15 11:51:30,969 epoch 9 - iter 1560/2606 - loss 0.01318018 - time (sec): 114.20 - samples/sec: 1927.10 - lr: 0.000005 - momentum: 0.000000
+2023-10-15 11:51:49,982 epoch 9 - iter 1820/2606 - loss 0.01362242 - time (sec): 133.22 - samples/sec: 1919.43 - lr: 0.000004 - momentum: 0.000000
+2023-10-15 11:52:09,059 epoch 9 - iter 2080/2606 - loss 0.01379042 - time (sec): 152.29 - samples/sec: 1920.32 - lr: 0.000004 - momentum: 0.000000
+2023-10-15 11:52:27,174 epoch 9 - iter 2340/2606 - loss 0.01323694 - time (sec): 170.41 - samples/sec: 1916.43 - lr: 0.000004 - momentum: 0.000000
+2023-10-15 11:52:47,281 epoch 9 - iter 2600/2606 - loss 0.01336791 - time (sec): 190.51 - samples/sec: 1921.44 - lr: 0.000003 - momentum: 0.000000
+2023-10-15 11:52:47,906 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:52:47,906 EPOCH 9 done: loss 0.0133 - lr: 0.000003
+2023-10-15 11:52:56,091 DEV : loss 0.4836600720882416 - f1-score (micro avg)  0.3695
+2023-10-15 11:52:56,116 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:53:15,511 epoch 10 - iter 260/2606 - loss 0.00761937 - time (sec): 19.39 - samples/sec: 1973.81 - lr: 0.000003 - momentum: 0.000000
+2023-10-15 11:53:34,937 epoch 10 - iter 520/2606 - loss 0.01059122 - time (sec): 38.82 - samples/sec: 1950.47 - lr: 0.000003 - momentum: 0.000000
+2023-10-15 11:53:54,907 epoch 10 - iter 780/2606 - loss 0.00959745 - time (sec): 58.79 - samples/sec: 1956.66 - lr: 0.000002 - momentum: 0.000000
+2023-10-15 11:54:14,832 epoch 10 - iter 1040/2606 - loss 0.00974413 - time (sec): 78.72 - samples/sec: 1933.32 - lr: 0.000002 - momentum: 0.000000
+2023-10-15 11:54:33,515 epoch 10 - iter 1300/2606 - loss 0.00955880 - time (sec): 97.40 - samples/sec: 1932.05 - lr: 0.000002 - momentum: 0.000000
+2023-10-15 11:54:51,472 epoch 10 - iter 1560/2606 - loss 0.00930499 - time (sec): 115.36 - samples/sec: 1919.99 - lr: 0.000001 - momentum: 0.000000
+2023-10-15 11:55:09,895 epoch 10 - iter 1820/2606 - loss 0.00896874 - time (sec): 133.78 - samples/sec: 1923.78 - lr: 0.000001 - momentum: 0.000000
+2023-10-15 11:55:28,885 epoch 10 - iter 2080/2606 - loss 0.00899681 - time (sec): 152.77 - samples/sec: 1925.01 - lr: 0.000001 - momentum: 0.000000
+2023-10-15 11:55:48,156 epoch 10 - iter 2340/2606 - loss 0.00934111 - time (sec): 172.04 - samples/sec: 1926.74 - lr: 0.000000 - momentum: 0.000000
+2023-10-15 11:56:06,065 epoch 10 - iter 2600/2606 - loss 0.00967478 - time (sec): 189.95 - samples/sec: 1929.90 - lr: 0.000000 - momentum: 0.000000
+2023-10-15 11:56:06,427 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:56:06,427 EPOCH 10 done: loss 0.0097 - lr: 0.000000
+2023-10-15 11:56:14,658 DEV : loss 0.5250458717346191 - f1-score (micro avg)  0.3623
+2023-10-15 11:56:15,086 ----------------------------------------------------------------------------------------------------
+2023-10-15 11:56:15,087 Loading model from best epoch ...
+2023-10-15 11:56:16,909 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
+2023-10-15 11:56:31,670
+Results:
+- F-score (micro) 0.4262
+- F-score (macro) 0.2901
+- Accuracy 0.2744
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.4727    0.5066    0.4891      1214
+         PER     0.3735    0.4369    0.4027       808
+         ORG     0.2800    0.2578    0.2684       353
+   HumanProd     0.0000    0.0000    0.0000        15
+
+   micro avg     0.4106    0.4431    0.4262      2390
+   macro avg     0.2816    0.3003    0.2901      2390
+weighted avg     0.4078    0.4431    0.4242      2390
+
+2023-10-15 11:56:31,671 ----------------------------------------------------------------------------------------------------
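Not part of the commit, but as a sanity check the averaged scores in the final report can be reproduced from the per-class rows. A minimal sketch (all numbers copied from the table above; micro F1 is recomputed from the reported micro precision/recall, macro F1 as the unweighted mean of the per-class F1 values):

```python
# Per-class test-set results from the report above: (precision, recall, f1, support).
classes = {
    "LOC":       (0.4727, 0.5066, 0.4891, 1214),
    "PER":       (0.3735, 0.4369, 0.4027, 808),
    "ORG":       (0.2800, 0.2578, 0.2684, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}

# Micro average: harmonic mean of the reported micro precision and recall.
micro_p, micro_r = 0.4106, 0.4431
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in classes.values()) / len(classes)

print(round(micro_f1, 4), round(macro_f1, 4))
```

Both values agree with the reported "F-score (micro) 0.4262" and "F-score (macro) 0.2901" up to rounding.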