Upload ./training.log with huggingface_hub
training.log ADDED (+244 -0)
@@ -0,0 +1,244 @@
+2023-11-16 08:54:28,192 ----------------------------------------------------------------------------------------------------
+2023-11-16 08:54:28,194 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): XLMRobertaModel(
+      (embeddings): XLMRobertaEmbeddings(
+        (word_embeddings): Embedding(250003, 1024)
+        (position_embeddings): Embedding(514, 1024, padding_idx=1)
+        (token_type_embeddings): Embedding(1, 1024)
+        (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): XLMRobertaEncoder(
+        (layer): ModuleList(
+          (0-23): 24 x XLMRobertaLayer(
+            (attention): XLMRobertaAttention(
+              (self): XLMRobertaSelfAttention(
+                (query): Linear(in_features=1024, out_features=1024, bias=True)
+                (key): Linear(in_features=1024, out_features=1024, bias=True)
+                (value): Linear(in_features=1024, out_features=1024, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): XLMRobertaSelfOutput(
+                (dense): Linear(in_features=1024, out_features=1024, bias=True)
+                (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): XLMRobertaIntermediate(
+              (dense): Linear(in_features=1024, out_features=4096, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): XLMRobertaOutput(
+              (dense): Linear(in_features=4096, out_features=1024, bias=True)
+              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): XLMRobertaPooler(
+        (dense): Linear(in_features=1024, out_features=1024, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=1024, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-11-16 08:54:28,194 ----------------------------------------------------------------------------------------------------
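The model summary lists every layer shape, so the trainable parameter count can be estimated by hand. The following is a minimal sketch that uses only the shapes printed in the log; the helper names `linear` and `layer_norm` are hypothetical and are not part of Flair or PyTorch.

```python
# Parameter count derived only from the layer shapes printed in the model summary.
# The helper names are illustrative assumptions, not library functions.

def linear(n_in: int, n_out: int) -> int:
    """Weights plus bias of Linear(in_features=n_in, out_features=n_out, bias=True)."""
    return n_in * n_out + n_out


def layer_norm(dim: int) -> int:
    """Scale and shift vectors of a LayerNorm with elementwise_affine=True."""
    return 2 * dim


HIDDEN, FFN, LAYERS, TAGS = 1024, 4096, 24, 13

embeddings = (
    250003 * HIDDEN        # word_embeddings
    + 514 * HIDDEN         # position_embeddings
    + 1 * HIDDEN           # token_type_embeddings
    + layer_norm(HIDDEN)   # embedding LayerNorm
)
per_layer = (
    4 * linear(HIDDEN, HIDDEN)  # query, key, value, attention output dense
    + layer_norm(HIDDEN)        # attention output LayerNorm
    + linear(HIDDEN, FFN)       # intermediate dense
    + linear(FFN, HIDDEN)       # output dense
    + layer_norm(HIDDEN)        # output LayerNorm
)
pooler = linear(HIDDEN, HIDDEN)
tagger_head = linear(HIDDEN, TAGS)  # the (linear) layer on top, one logit per tag

encoder_total = embeddings + LAYERS * per_layer + pooler
total = encoder_total + tagger_head
print(f"encoder: {encoder_total:,}  head: {tagger_head:,}  total: {total:,}")
```

Almost all of the roughly 560M parameters sit in the XLM-R encoder; the tagging head on top contributes only about 13 thousand.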
+2023-11-16 08:54:28,194 MultiCorpus: 30000 train + 10000 dev + 10000 test sentences
+ - ColumnCorpus Corpus: 20000 train + 0 dev + 0 test sentences - /root/.flair/datasets/ner_multi_xtreme/en
+ - ColumnCorpus Corpus: 10000 train + 10000 dev + 10000 test sentences - /root/.flair/datasets/ner_multi_xtreme/ka
+2023-11-16 08:54:28,194 ----------------------------------------------------------------------------------------------------
+2023-11-16 08:54:28,194 Train: 30000 sentences
+2023-11-16 08:54:28,194 (train_with_dev=False, train_with_test=False)
+2023-11-16 08:54:28,194 ----------------------------------------------------------------------------------------------------
+2023-11-16 08:54:28,194 Training Params:
+2023-11-16 08:54:28,194 - learning_rate: "5e-06"
+2023-11-16 08:54:28,194 - mini_batch_size: "4"
+2023-11-16 08:54:28,194 - max_epochs: "10"
+2023-11-16 08:54:28,194 - shuffle: "True"
+2023-11-16 08:54:28,194 ----------------------------------------------------------------------------------------------------
+2023-11-16 08:54:28,194 Plugins:
+2023-11-16 08:54:28,194 - TensorboardLogger
+2023-11-16 08:54:28,194 - LinearScheduler | warmup_fraction: '0.1'
+2023-11-16 08:54:28,194 ----------------------------------------------------------------------------------------------------
+2023-11-16 08:54:28,194 Final evaluation on model from best epoch (best-model.pt)
+2023-11-16 08:54:28,194 - metric: "('micro avg', 'f1-score')"
+2023-11-16 08:54:28,194 ----------------------------------------------------------------------------------------------------
+2023-11-16 08:54:28,195 Computation:
+2023-11-16 08:54:28,195 - compute on device: cuda:0
+2023-11-16 08:54:28,195 - embedding storage: none
+2023-11-16 08:54:28,195 ----------------------------------------------------------------------------------------------------
+2023-11-16 08:54:28,195 Model training base path: "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-5"
+2023-11-16 08:54:28,195 ----------------------------------------------------------------------------------------------------
+2023-11-16 08:54:28,195 ----------------------------------------------------------------------------------------------------
+2023-11-16 08:54:28,195 Logging anything other than scalars to TensorBoard is currently not supported.
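The training parameters above determine both the number of iterations per epoch and the shape of the learning-rate curve in the `lr:` column of the epoch lines. The sketch below assumes the scheduler ramps linearly from zero to the peak rate over the first 10% of all steps and then decays linearly to zero; that is an assumption about what `LinearScheduler` with `warmup_fraction: '0.1'` does, not a copy of Flair's implementation, and the function names are hypothetical.

```python
# Sketch of the step arithmetic and an assumed linear warmup/decay schedule.
# Function names are illustrative; this is not Flair's scheduler code.

def steps_per_epoch(n_sentences: int, batch_size: int) -> int:
    """Number of mini-batches needed to see every sentence once (ceiling division)."""
    return -(-n_sentences // batch_size)


def linear_warmup_lr(step: int, total_steps: int, peak_lr: float, warmup_fraction: float) -> float:
    """Learning rate after `step` optimizer steps under linear warmup, then linear decay."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


iters = steps_per_epoch(30000, 4)   # 7500, matching "iter 7500/7500" in the log
total_steps = iters * 10            # max_epochs = 10

for epoch in (1, 2, 5, 10):
    lr = linear_warmup_lr(epoch * iters, total_steps, 5e-06, 0.1)
    print(f"end of epoch {epoch}: lr = {lr:.6f}")
```

Under this assumption the warmup ends exactly at the end of epoch 1, which is consistent with the log reporting its highest value, `lr: 0.000005`, at that point and `lr: 0.000000` by the end of epoch 10.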
+2023-11-16 08:56:03,042 epoch 1 - iter 750/7500 - loss 2.73514177 - time (sec): 94.85 - samples/sec: 251.28 - lr: 0.000000 - momentum: 0.000000
+2023-11-16 08:57:34,557 epoch 1 - iter 1500/7500 - loss 2.27024326 - time (sec): 186.36 - samples/sec: 258.14 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 08:59:07,587 epoch 1 - iter 2250/7500 - loss 1.96361468 - time (sec): 279.39 - samples/sec: 259.21 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 09:00:41,392 epoch 1 - iter 3000/7500 - loss 1.71842298 - time (sec): 373.20 - samples/sec: 259.19 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 09:02:13,057 epoch 1 - iter 3750/7500 - loss 1.50867437 - time (sec): 464.86 - samples/sec: 260.24 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 09:03:44,283 epoch 1 - iter 4500/7500 - loss 1.34975880 - time (sec): 556.09 - samples/sec: 261.24 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 09:05:17,789 epoch 1 - iter 5250/7500 - loss 1.23153815 - time (sec): 649.59 - samples/sec: 261.16 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 09:06:51,967 epoch 1 - iter 6000/7500 - loss 1.13901611 - time (sec): 743.77 - samples/sec: 260.11 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:08:25,989 epoch 1 - iter 6750/7500 - loss 1.06518907 - time (sec): 837.79 - samples/sec: 259.14 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:10:02,189 epoch 1 - iter 7500/7500 - loss 1.00335008 - time (sec): 933.99 - samples/sec: 257.81 - lr: 0.000005 - momentum: 0.000000
+2023-11-16 09:10:02,191 ----------------------------------------------------------------------------------------------------
+2023-11-16 09:10:02,191 EPOCH 1 done: loss 1.0034 - lr: 0.000005
+2023-11-16 09:10:29,151 DEV : loss 0.25209933519363403 - f1-score (micro avg)  0.818
+2023-11-16 09:10:30,799 saving best model
+2023-11-16 09:10:32,550 ----------------------------------------------------------------------------------------------------
+2023-11-16 09:12:05,182 epoch 2 - iter 750/7500 - loss 0.40758191 - time (sec): 92.63 - samples/sec: 255.89 - lr: 0.000005 - momentum: 0.000000
+2023-11-16 09:13:40,646 epoch 2 - iter 1500/7500 - loss 0.41354608 - time (sec): 188.09 - samples/sec: 252.51 - lr: 0.000005 - momentum: 0.000000
+2023-11-16 09:15:13,127 epoch 2 - iter 2250/7500 - loss 0.41469889 - time (sec): 280.57 - samples/sec: 256.53 - lr: 0.000005 - momentum: 0.000000
+2023-11-16 09:16:46,617 epoch 2 - iter 3000/7500 - loss 0.40864404 - time (sec): 374.06 - samples/sec: 257.65 - lr: 0.000005 - momentum: 0.000000
+2023-11-16 09:18:19,301 epoch 2 - iter 3750/7500 - loss 0.40728475 - time (sec): 466.75 - samples/sec: 258.79 - lr: 0.000005 - momentum: 0.000000
+2023-11-16 09:19:50,423 epoch 2 - iter 4500/7500 - loss 0.40431483 - time (sec): 557.87 - samples/sec: 259.91 - lr: 0.000005 - momentum: 0.000000
+2023-11-16 09:21:22,320 epoch 2 - iter 5250/7500 - loss 0.40010819 - time (sec): 649.77 - samples/sec: 260.46 - lr: 0.000005 - momentum: 0.000000
+2023-11-16 09:22:53,693 epoch 2 - iter 6000/7500 - loss 0.39994412 - time (sec): 741.14 - samples/sec: 260.74 - lr: 0.000005 - momentum: 0.000000
+2023-11-16 09:24:27,071 epoch 2 - iter 6750/7500 - loss 0.40164576 - time (sec): 834.52 - samples/sec: 260.23 - lr: 0.000005 - momentum: 0.000000
+2023-11-16 09:26:00,518 epoch 2 - iter 7500/7500 - loss 0.40139034 - time (sec): 927.97 - samples/sec: 259.49 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:26:00,523 ----------------------------------------------------------------------------------------------------
+2023-11-16 09:26:00,523 EPOCH 2 done: loss 0.4014 - lr: 0.000004
+2023-11-16 09:26:28,590 DEV : loss 0.27011075615882874 - f1-score (micro avg)  0.8685
+2023-11-16 09:26:30,477 saving best model
+2023-11-16 09:26:32,489 ----------------------------------------------------------------------------------------------------
+2023-11-16 09:28:06,865 epoch 3 - iter 750/7500 - loss 0.34664813 - time (sec): 94.37 - samples/sec: 259.00 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:29:40,525 epoch 3 - iter 1500/7500 - loss 0.35802918 - time (sec): 188.03 - samples/sec: 256.41 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:31:13,248 epoch 3 - iter 2250/7500 - loss 0.34949160 - time (sec): 280.76 - samples/sec: 259.11 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:32:44,431 epoch 3 - iter 3000/7500 - loss 0.34400653 - time (sec): 371.94 - samples/sec: 261.74 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:34:18,607 epoch 3 - iter 3750/7500 - loss 0.34736997 - time (sec): 466.12 - samples/sec: 259.21 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:35:51,979 epoch 3 - iter 4500/7500 - loss 0.34780959 - time (sec): 559.49 - samples/sec: 258.64 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:37:25,102 epoch 3 - iter 5250/7500 - loss 0.34688026 - time (sec): 652.61 - samples/sec: 258.12 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:38:58,439 epoch 3 - iter 6000/7500 - loss 0.34475842 - time (sec): 745.95 - samples/sec: 257.81 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:40:34,113 epoch 3 - iter 6750/7500 - loss 0.34349763 - time (sec): 841.62 - samples/sec: 256.89 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:42:10,676 epoch 3 - iter 7500/7500 - loss 0.34117216 - time (sec): 938.18 - samples/sec: 256.66 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:42:10,679 ----------------------------------------------------------------------------------------------------
+2023-11-16 09:42:10,679 EPOCH 3 done: loss 0.3412 - lr: 0.000004
+2023-11-16 09:42:37,676 DEV : loss 0.2926769554615021 - f1-score (micro avg)  0.8843
+2023-11-16 09:42:39,581 saving best model
+2023-11-16 09:42:41,582 ----------------------------------------------------------------------------------------------------
+2023-11-16 09:44:14,986 epoch 4 - iter 750/7500 - loss 0.28935096 - time (sec): 93.40 - samples/sec: 255.70 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:45:47,989 epoch 4 - iter 1500/7500 - loss 0.29093752 - time (sec): 186.40 - samples/sec: 257.90 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:47:19,359 epoch 4 - iter 2250/7500 - loss 0.29888625 - time (sec): 277.77 - samples/sec: 259.38 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:48:51,599 epoch 4 - iter 3000/7500 - loss 0.30114670 - time (sec): 370.01 - samples/sec: 258.77 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:50:23,345 epoch 4 - iter 3750/7500 - loss 0.30007515 - time (sec): 461.76 - samples/sec: 259.19 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:51:56,749 epoch 4 - iter 4500/7500 - loss 0.29780444 - time (sec): 555.16 - samples/sec: 259.74 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:53:29,783 epoch 4 - iter 5250/7500 - loss 0.29632423 - time (sec): 648.20 - samples/sec: 260.29 - lr: 0.000004 - momentum: 0.000000
+2023-11-16 09:55:03,332 epoch 4 - iter 6000/7500 - loss 0.29715629 - time (sec): 741.75 - samples/sec: 259.64 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 09:56:39,271 epoch 4 - iter 6750/7500 - loss 0.29789543 - time (sec): 837.69 - samples/sec: 258.76 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 09:58:14,979 epoch 4 - iter 7500/7500 - loss 0.30080354 - time (sec): 933.39 - samples/sec: 257.98 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 09:58:14,982 ----------------------------------------------------------------------------------------------------
+2023-11-16 09:58:14,982 EPOCH 4 done: loss 0.3008 - lr: 0.000003
+2023-11-16 09:58:42,391 DEV : loss 0.24168777465820312 - f1-score (micro avg)  0.8958
+2023-11-16 09:58:44,896 saving best model
+2023-11-16 09:58:47,890 ----------------------------------------------------------------------------------------------------
+2023-11-16 10:00:21,193 epoch 5 - iter 750/7500 - loss 0.24325762 - time (sec): 93.30 - samples/sec: 254.74 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:01:54,947 epoch 5 - iter 1500/7500 - loss 0.24699916 - time (sec): 187.05 - samples/sec: 256.42 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:03:33,373 epoch 5 - iter 2250/7500 - loss 0.24105182 - time (sec): 285.48 - samples/sec: 251.76 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:05:10,591 epoch 5 - iter 3000/7500 - loss 0.24548635 - time (sec): 382.70 - samples/sec: 250.75 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:06:45,160 epoch 5 - iter 3750/7500 - loss 0.24697996 - time (sec): 477.27 - samples/sec: 252.00 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:08:18,840 epoch 5 - iter 4500/7500 - loss 0.24902921 - time (sec): 570.95 - samples/sec: 252.17 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:09:54,495 epoch 5 - iter 5250/7500 - loss 0.24900570 - time (sec): 666.60 - samples/sec: 251.98 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:11:28,132 epoch 5 - iter 6000/7500 - loss 0.25246330 - time (sec): 760.24 - samples/sec: 253.26 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:13:02,532 epoch 5 - iter 6750/7500 - loss 0.25090384 - time (sec): 854.64 - samples/sec: 253.77 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:14:33,980 epoch 5 - iter 7500/7500 - loss 0.25096782 - time (sec): 946.09 - samples/sec: 254.52 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:14:33,983 ----------------------------------------------------------------------------------------------------
+2023-11-16 10:14:33,983 EPOCH 5 done: loss 0.2510 - lr: 0.000003
+2023-11-16 10:15:01,770 DEV : loss 0.30133897066116333 - f1-score (micro avg)  0.8909
+2023-11-16 10:15:04,505 ----------------------------------------------------------------------------------------------------
+2023-11-16 10:16:43,786 epoch 6 - iter 750/7500 - loss 0.21043668 - time (sec): 99.28 - samples/sec: 246.62 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:18:21,438 epoch 6 - iter 1500/7500 - loss 0.21551137 - time (sec): 196.93 - samples/sec: 244.08 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:19:58,967 epoch 6 - iter 2250/7500 - loss 0.21671764 - time (sec): 294.46 - samples/sec: 244.90 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:21:36,357 epoch 6 - iter 3000/7500 - loss 0.21410789 - time (sec): 391.85 - samples/sec: 245.03 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:23:11,820 epoch 6 - iter 3750/7500 - loss 0.21190957 - time (sec): 487.31 - samples/sec: 246.63 - lr: 0.000003 - momentum: 0.000000
+2023-11-16 10:24:44,519 epoch 6 - iter 4500/7500 - loss 0.21724671 - time (sec): 580.01 - samples/sec: 248.11 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:26:19,240 epoch 6 - iter 5250/7500 - loss 0.21517436 - time (sec): 674.73 - samples/sec: 249.52 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:27:52,483 epoch 6 - iter 6000/7500 - loss 0.21502133 - time (sec): 767.97 - samples/sec: 250.59 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:29:26,618 epoch 6 - iter 6750/7500 - loss 0.21284089 - time (sec): 862.11 - samples/sec: 251.34 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:31:02,401 epoch 6 - iter 7500/7500 - loss 0.21376183 - time (sec): 957.89 - samples/sec: 251.38 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:31:02,403 ----------------------------------------------------------------------------------------------------
+2023-11-16 10:31:02,404 EPOCH 6 done: loss 0.2138 - lr: 0.000002
+2023-11-16 10:31:29,893 DEV : loss 0.2858603894710541 - f1-score (micro avg)  0.9013
+2023-11-16 10:31:32,354 saving best model
+2023-11-16 10:31:34,717 ----------------------------------------------------------------------------------------------------
+2023-11-16 10:33:10,947 epoch 7 - iter 750/7500 - loss 0.18560997 - time (sec): 96.22 - samples/sec: 252.05 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:34:45,274 epoch 7 - iter 1500/7500 - loss 0.18388762 - time (sec): 190.55 - samples/sec: 254.81 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:36:20,608 epoch 7 - iter 2250/7500 - loss 0.17300910 - time (sec): 285.89 - samples/sec: 254.67 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:37:56,122 epoch 7 - iter 3000/7500 - loss 0.18185895 - time (sec): 381.40 - samples/sec: 253.44 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:39:29,201 epoch 7 - iter 3750/7500 - loss 0.18240739 - time (sec): 474.48 - samples/sec: 253.00 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:41:03,810 epoch 7 - iter 4500/7500 - loss 0.18167213 - time (sec): 569.09 - samples/sec: 253.57 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:42:34,871 epoch 7 - iter 5250/7500 - loss 0.18305956 - time (sec): 660.15 - samples/sec: 255.17 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:44:06,443 epoch 7 - iter 6000/7500 - loss 0.18397991 - time (sec): 751.72 - samples/sec: 256.24 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:45:40,244 epoch 7 - iter 6750/7500 - loss 0.18284928 - time (sec): 845.52 - samples/sec: 256.33 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:47:13,133 epoch 7 - iter 7500/7500 - loss 0.18356346 - time (sec): 938.41 - samples/sec: 256.60 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:47:13,135 ----------------------------------------------------------------------------------------------------
+2023-11-16 10:47:13,136 EPOCH 7 done: loss 0.1836 - lr: 0.000002
+2023-11-16 10:47:40,667 DEV : loss 0.3011305034160614 - f1-score (micro avg)  0.8987
+2023-11-16 10:47:42,779 ----------------------------------------------------------------------------------------------------
+2023-11-16 10:49:16,539 epoch 8 - iter 750/7500 - loss 0.14330999 - time (sec): 93.76 - samples/sec: 256.81 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:50:50,862 epoch 8 - iter 1500/7500 - loss 0.14160047 - time (sec): 188.08 - samples/sec: 252.69 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:52:23,414 epoch 8 - iter 2250/7500 - loss 0.14478770 - time (sec): 280.63 - samples/sec: 256.60 - lr: 0.000002 - momentum: 0.000000
+2023-11-16 10:53:56,497 epoch 8 - iter 3000/7500 - loss 0.14930840 - time (sec): 373.71 - samples/sec: 258.22 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 10:55:28,192 epoch 8 - iter 3750/7500 - loss 0.15175926 - time (sec): 465.41 - samples/sec: 258.55 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 10:57:00,742 epoch 8 - iter 4500/7500 - loss 0.15482872 - time (sec): 557.96 - samples/sec: 259.33 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 10:58:32,426 epoch 8 - iter 5250/7500 - loss 0.15110368 - time (sec): 649.64 - samples/sec: 259.55 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:00:02,971 epoch 8 - iter 6000/7500 - loss 0.15064249 - time (sec): 740.19 - samples/sec: 260.02 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:01:35,563 epoch 8 - iter 6750/7500 - loss 0.15119609 - time (sec): 832.78 - samples/sec: 260.42 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:03:09,780 epoch 8 - iter 7500/7500 - loss 0.15276122 - time (sec): 927.00 - samples/sec: 259.76 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:03:09,783 ----------------------------------------------------------------------------------------------------
+2023-11-16 11:03:09,783 EPOCH 8 done: loss 0.1528 - lr: 0.000001
+2023-11-16 11:03:37,801 DEV : loss 0.31595587730407715 - f1-score (micro avg)  0.9048
+2023-11-16 11:03:40,424 saving best model
+2023-11-16 11:03:43,075 ----------------------------------------------------------------------------------------------------
+2023-11-16 11:05:19,854 epoch 9 - iter 750/7500 - loss 0.12299891 - time (sec): 96.77 - samples/sec: 253.49 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:06:56,996 epoch 9 - iter 1500/7500 - loss 0.12314833 - time (sec): 193.92 - samples/sec: 248.21 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:08:31,883 epoch 9 - iter 2250/7500 - loss 0.13008764 - time (sec): 288.80 - samples/sec: 250.26 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:10:08,385 epoch 9 - iter 3000/7500 - loss 0.13080561 - time (sec): 385.30 - samples/sec: 250.44 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:11:42,064 epoch 9 - iter 3750/7500 - loss 0.13273049 - time (sec): 478.98 - samples/sec: 251.53 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:13:16,037 epoch 9 - iter 4500/7500 - loss 0.13360348 - time (sec): 572.96 - samples/sec: 251.77 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:14:49,720 epoch 9 - iter 5250/7500 - loss 0.13299160 - time (sec): 666.64 - samples/sec: 252.02 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:16:23,602 epoch 9 - iter 6000/7500 - loss 0.13133134 - time (sec): 760.52 - samples/sec: 252.84 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:17:56,438 epoch 9 - iter 6750/7500 - loss 0.13363917 - time (sec): 853.36 - samples/sec: 253.29 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:19:29,919 epoch 9 - iter 7500/7500 - loss 0.13122225 - time (sec): 946.84 - samples/sec: 254.32 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:19:29,921 ----------------------------------------------------------------------------------------------------
+2023-11-16 11:19:29,921 EPOCH 9 done: loss 0.1312 - lr: 0.000001
+2023-11-16 11:19:57,777 DEV : loss 0.33471381664276123 - f1-score (micro avg)  0.9024
+2023-11-16 11:20:00,060 ----------------------------------------------------------------------------------------------------
+2023-11-16 11:21:34,000 epoch 10 - iter 750/7500 - loss 0.11743559 - time (sec): 93.94 - samples/sec: 260.10 - lr: 0.000001 - momentum: 0.000000
+2023-11-16 11:23:09,621 epoch 10 - iter 1500/7500 - loss 0.12274941 - time (sec): 189.56 - samples/sec: 254.16 - lr: 0.000000 - momentum: 0.000000
+2023-11-16 11:24:43,426 epoch 10 - iter 2250/7500 - loss 0.11588386 - time (sec): 283.36 - samples/sec: 255.95 - lr: 0.000000 - momentum: 0.000000
+2023-11-16 11:26:20,492 epoch 10 - iter 3000/7500 - loss 0.11238624 - time (sec): 380.43 - samples/sec: 255.77 - lr: 0.000000 - momentum: 0.000000
+2023-11-16 11:27:52,662 epoch 10 - iter 3750/7500 - loss 0.11433172 - time (sec): 472.60 - samples/sec: 256.50 - lr: 0.000000 - momentum: 0.000000
+2023-11-16 11:29:26,109 epoch 10 - iter 4500/7500 - loss 0.11525478 - time (sec): 566.05 - samples/sec: 256.49 - lr: 0.000000 - momentum: 0.000000
+2023-11-16 11:30:58,606 epoch 10 - iter 5250/7500 - loss 0.11793983 - time (sec): 658.54 - samples/sec: 256.98 - lr: 0.000000 - momentum: 0.000000
+2023-11-16 11:32:30,455 epoch 10 - iter 6000/7500 - loss 0.11586937 - time (sec): 750.39 - samples/sec: 257.62 - lr: 0.000000 - momentum: 0.000000
+2023-11-16 11:34:03,956 epoch 10 - iter 6750/7500 - loss 0.11407474 - time (sec): 843.89 - samples/sec: 257.08 - lr: 0.000000 - momentum: 0.000000
+2023-11-16 11:35:37,383 epoch 10 - iter 7500/7500 - loss 0.11399531 - time (sec): 937.32 - samples/sec: 256.90 - lr: 0.000000 - momentum: 0.000000
+2023-11-16 11:35:37,386 ----------------------------------------------------------------------------------------------------
+2023-11-16 11:35:37,386 EPOCH 10 done: loss 0.1140 - lr: 0.000000
+2023-11-16 11:36:05,240 DEV : loss 0.3250023126602173 - f1-score (micro avg)  0.9045
+2023-11-16 11:36:08,837 ----------------------------------------------------------------------------------------------------
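The per-epoch `DEV` lines are enough to recover which checkpoint was kept as `best-model.pt`. Below is a minimal sketch that extracts the dev loss and micro-averaged F1 from those lines with a regular expression; the function name `dev_scores` is hypothetical, and epochs are numbered simply by the order in which `DEV` lines appear.

```python
import re

# Matches the DEV summary line written after each epoch in this log.
DEV_RE = re.compile(r"DEV : loss (?P<loss>[\d.]+) - f1-score \(micro avg\)\s+(?P<f1>[\d.]+)")


def dev_scores(lines):
    """Return [(epoch, dev_loss, dev_f1), ...], numbering DEV lines in order of appearance."""
    scores = []
    for line in lines:
        match = DEV_RE.search(line)
        if match:
            scores.append((len(scores) + 1, float(match["loss"]), float(match["f1"])))
    return scores


# The ten DEV lines copied from the log.
sample = [
    "2023-11-16 09:10:29,151 DEV : loss 0.25209933519363403 - f1-score (micro avg)  0.818",
    "2023-11-16 09:26:28,590 DEV : loss 0.27011075615882874 - f1-score (micro avg)  0.8685",
    "2023-11-16 09:42:37,676 DEV : loss 0.2926769554615021 - f1-score (micro avg)  0.8843",
    "2023-11-16 09:58:42,391 DEV : loss 0.24168777465820312 - f1-score (micro avg)  0.8958",
    "2023-11-16 10:15:01,770 DEV : loss 0.30133897066116333 - f1-score (micro avg)  0.8909",
    "2023-11-16 10:31:29,893 DEV : loss 0.2858603894710541 - f1-score (micro avg)  0.9013",
    "2023-11-16 10:47:40,667 DEV : loss 0.3011305034160614 - f1-score (micro avg)  0.8987",
    "2023-11-16 11:03:37,801 DEV : loss 0.31595587730407715 - f1-score (micro avg)  0.9048",
    "2023-11-16 11:19:57,777 DEV : loss 0.33471381664276123 - f1-score (micro avg)  0.9024",
    "2023-11-16 11:36:05,240 DEV : loss 0.3250023126602173 - f1-score (micro avg)  0.9045",
]

scores = dev_scores(sample)
best = max(scores, key=lambda s: s[2])  # select on dev F1, the metric named in the log
print(f"best epoch: {best[0]}  dev loss: {best[1]:.4f}  dev F1: {best[2]:.4f}")
```

Selecting on F1 rather than dev loss matters here: the lowest dev loss occurs in epoch 4, while the highest dev F1, and the last "saving best model" message, occur in epoch 8.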
+2023-11-16 11:36:08,840 Loading model from best epoch ...
+2023-11-16 11:36:16,806 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER
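The 13 tags follow the BIOES scheme: `S-` marks a single-token entity, `B-`/`I-`/`E-` mark the beginning, inside, and end of a multi-token entity, and `O` marks tokens outside any entity. As an illustration of how such tags map to entity spans, here is a strict decoder sketch; the function name `bioes_to_spans` is hypothetical, and this is not Flair's own decoding logic, which may treat malformed sequences differently.

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (start, end_exclusive, label) spans.

    Strict variant: an entity is emitted only for S-X, or for B-X ... E-X with a
    consistent label; anything malformed is dropped.
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        prefix, _, typ = tag.partition("-")
        if prefix == "S":
            spans.append((i, i + 1, typ))
            start, label = None, None
        elif prefix == "B":
            start, label = i, typ
        elif prefix == "E" and start is not None and typ == label:
            spans.append((start, i + 1, typ))
            start, label = None, None
        elif prefix == "I" and start is not None and typ == label:
            continue  # still inside the current entity
        else:
            start, label = None, None  # "O" or an inconsistent tag closes any open entity
    return spans


example_tags = ["B-PER", "E-PER", "O", "S-LOC", "O", "B-ORG", "I-ORG", "E-ORG"]
print(bioes_to_spans(example_tags))
```

With three entity types (LOC, ORG, PER) and four positional prefixes each, plus `O`, the scheme yields the 13 output classes of the final linear layer.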
+2023-11-16 11:36:45,465
+Results:
+- F-score (micro) 0.9042
+- F-score (macro) 0.9031
+- Accuracy 0.8544
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.9076    0.9106    0.9091      5288
+         PER     0.9288    0.9419    0.9353      3962
+         ORG     0.8606    0.8695    0.8650      3807
+
+   micro avg     0.9004    0.9081    0.9042     13057
+   macro avg     0.8990    0.9073    0.9031     13057
+weighted avg     0.9004    0.9081    0.9042     13057
+
+2023-11-16 11:36:45,466 ----------------------------------------------------------------------------------------------------
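The averaged rows of the report can be cross-checked against the per-class rows. The sketch below recomputes the macro, weighted, and micro F1 from the rounded values printed in the table, so agreement is only expected to four decimal places; the variable names are illustrative.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
rows = {
    "LOC": (0.9076, 0.9106, 0.9091, 5288),
    "PER": (0.9288, 0.9419, 0.9353, 3962),
    "ORG": (0.8606, 0.8695, 0.8650, 3807),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(row[3] for row in rows.values())

# Macro: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in rows.values()) / len(rows)

# Weighted: per-class F1 weighted by the number of gold entities in each class.
weighted_f1 = sum(row[2] * row[3] for row in rows.values()) / total_support

# Micro: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = f1(0.9004, 0.9081)

print(f"support: {total_support}")
print(f"macro F1: {macro_f1:.4f}  weighted F1: {weighted_f1:.4f}  micro F1: {micro_f1:.4f}")
```

ORG is the weakest class in all three columns, which is why the macro average (0.9031) sits slightly below the micro average (0.9042).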