Upload ./training.log with huggingface_hub
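For context, a commit like this one is typically produced straight from a training script or notebook. Below is a minimal sketch using huggingface_hub's upload_file; the repo_id is a placeholder, not the actual repository name.

from huggingface_hub import HfApi

# Minimal sketch of how a training log can be pushed to the Hub.
# "username/model-repo" is a placeholder; substitute the real repository id.
api = HfApi()  # picks up the token from `huggingface-cli login` by default
api.upload_file(
    path_or_fileobj="./training.log",   # local file produced by the trainer
    path_in_repo="training.log",        # destination path inside the repo
    repo_id="username/model-repo",
    commit_message="Upload ./training.log with huggingface_hub",
)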
training.log +242 -0
ADDED
@@ -0,0 +1,242 @@
2023-10-25 17:00:22,362 ----------------------------------------------------------------------------------------------------
2023-10-25 17:00:22,363 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 17:00:22,363 ----------------------------------------------------------------------------------------------------
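The module dump above is a Flair SequenceTagger: transformer word embeddings from the historic multilingual BERT checkpoint named in the training base path further down, feeding a locked-dropout layer and a single 768-to-17 linear projection trained with cross-entropy (no CRF, no RNN). The following sketch shows how such a tagger could be assembled; the pooling and layer settings are read off the run name, and the label dictionary is filled with the 17 tags listed near the end of this log (in a real run it comes from corpus.make_label_dictionary("ner")).

from flair.data import Dictionary
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# Label dictionary with the 17 BIOES tags reported at the end of this log;
# normally built with corpus.make_label_dictionary("ner").
label_dict = Dictionary(add_unk=False)
for tag in [
    "O", "S-PER", "B-PER", "E-PER", "I-PER",
    "S-LOC", "B-LOC", "E-LOC", "I-LOC",
    "S-ORG", "B-ORG", "E-ORG", "I-ORG",
    "S-HumanProd", "B-HumanProd", "E-HumanProd", "I-HumanProd",
]:
    label_dict.add_item(tag)

# Settings inferred from the run name (poolingfirst, layers -1, crfFalse).
embeddings = TransformerWordEmbeddings(
    "dbmdz/bert-base-historic-multilingual-64k-td-cased",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,   # only used by the RNN, which is disabled here
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,     # plain linear head + CrossEntropyLoss, as in the dump
    use_rnn=False,
)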
2023-10-25 17:00:22,364 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 17:00:22,364 ----------------------------------------------------------------------------------------------------
2023-10-25 17:00:22,364 Train: 7142 sentences
2023-10-25 17:00:22,364 (train_with_dev=False, train_with_test=False)
2023-10-25 17:00:22,364 ----------------------------------------------------------------------------------------------------
2023-10-25 17:00:22,364 Training Params:
2023-10-25 17:00:22,364 - learning_rate: "3e-05"
2023-10-25 17:00:22,364 - mini_batch_size: "8"
2023-10-25 17:00:22,364 - max_epochs: "10"
2023-10-25 17:00:22,364 - shuffle: "True"
2023-10-25 17:00:22,364 ----------------------------------------------------------------------------------------------------
2023-10-25 17:00:22,364 Plugins:
2023-10-25 17:00:22,364 - TensorboardLogger
2023-10-25 17:00:22,364 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 17:00:22,364 ----------------------------------------------------------------------------------------------------
2023-10-25 17:00:22,364 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 17:00:22,364 - metric: "('micro avg', 'f1-score')"
2023-10-25 17:00:22,364 ----------------------------------------------------------------------------------------------------
2023-10-25 17:00:22,364 Computation:
2023-10-25 17:00:22,364 - compute on device: cuda:0
2023-10-25 17:00:22,364 - embedding storage: none
2023-10-25 17:00:22,364 ----------------------------------------------------------------------------------------------------
2023-10-25 17:00:22,364 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 17:00:22,364 ----------------------------------------------------------------------------------------------------
2023-10-25 17:00:22,364 ----------------------------------------------------------------------------------------------------
2023-10-25 17:00:22,365 Logging anything other than scalars to TensorBoard is currently not supported.
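The parameters above (learning rate 3e-05, mini-batch size 8, 10 epochs, linear scheduler with 10% warmup, model selection on micro F1 from best-model.pt) correspond to Flair's fine-tuning entry point. A rough sketch follows, reusing the tagger from the earlier snippet; the NER_HIPE_2022 loader arguments are an assumption based on the dataset path above, not copied from the original script.

from flair.datasets import NER_HIPE_2022
from flair.trainers import ModelTrainer

# HIPE-2022 NewsEye French split, per the dataset path logged above
# (argument names are assumed, not taken from the original run).
corpus = NER_HIPE_2022(dataset_name="newseye", language="fr")

trainer = ModelTrainer(tagger, corpus)

# fine_tune() defaults to AdamW with a linear warmup/decay schedule,
# which matches the LinearScheduler plugin and the lr column below.
trainer.fine_tune(
    "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=3e-05,
    mini_batch_size=8,
    max_epochs=10,
)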
2023-10-25 17:00:28,672 epoch 1 - iter 89/893 - loss 2.19742480 - time (sec): 6.31 - samples/sec: 4004.07 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:00:35,048 epoch 1 - iter 178/893 - loss 1.39693997 - time (sec): 12.68 - samples/sec: 4002.88 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:00:41,309 epoch 1 - iter 267/893 - loss 1.07633123 - time (sec): 18.94 - samples/sec: 3947.26 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:00:47,306 epoch 1 - iter 356/893 - loss 0.87429157 - time (sec): 24.94 - samples/sec: 3993.37 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:00:53,202 epoch 1 - iter 445/893 - loss 0.74799460 - time (sec): 30.84 - samples/sec: 4002.33 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:00:59,211 epoch 1 - iter 534/893 - loss 0.65718250 - time (sec): 36.85 - samples/sec: 4018.73 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:01:05,222 epoch 1 - iter 623/893 - loss 0.58410501 - time (sec): 42.86 - samples/sec: 4035.30 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:01:11,176 epoch 1 - iter 712/893 - loss 0.52802697 - time (sec): 48.81 - samples/sec: 4073.71 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:01:17,243 epoch 1 - iter 801/893 - loss 0.48658916 - time (sec): 54.88 - samples/sec: 4080.04 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:01:23,202 epoch 1 - iter 890/893 - loss 0.45333206 - time (sec): 60.84 - samples/sec: 4071.24 - lr: 0.000030 - momentum: 0.000000
2023-10-25 17:01:23,417 ----------------------------------------------------------------------------------------------------
2023-10-25 17:01:23,417 EPOCH 1 done: loss 0.4518 - lr: 0.000030
2023-10-25 17:01:27,249 DEV : loss 0.0998985692858696 - f1-score (micro avg) 0.7288
2023-10-25 17:01:27,270 saving best model
2023-10-25 17:01:27,743 ----------------------------------------------------------------------------------------------------
2023-10-25 17:01:33,963 epoch 2 - iter 89/893 - loss 0.11010133 - time (sec): 6.22 - samples/sec: 3972.66 - lr: 0.000030 - momentum: 0.000000
2023-10-25 17:01:40,098 epoch 2 - iter 178/893 - loss 0.10005667 - time (sec): 12.35 - samples/sec: 3975.15 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:01:46,248 epoch 2 - iter 267/893 - loss 0.10041275 - time (sec): 18.50 - samples/sec: 4054.32 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:01:52,378 epoch 2 - iter 356/893 - loss 0.10308505 - time (sec): 24.63 - samples/sec: 4108.35 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:01:58,565 epoch 2 - iter 445/893 - loss 0.10163209 - time (sec): 30.82 - samples/sec: 4120.74 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:02:04,485 epoch 2 - iter 534/893 - loss 0.10147740 - time (sec): 36.74 - samples/sec: 4101.34 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:02:10,425 epoch 2 - iter 623/893 - loss 0.10276131 - time (sec): 42.68 - samples/sec: 4107.64 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:02:16,395 epoch 2 - iter 712/893 - loss 0.10195994 - time (sec): 48.65 - samples/sec: 4123.54 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:02:22,164 epoch 2 - iter 801/893 - loss 0.10209269 - time (sec): 54.42 - samples/sec: 4096.53 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:02:28,238 epoch 2 - iter 890/893 - loss 0.10100025 - time (sec): 60.49 - samples/sec: 4102.24 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:02:28,441 ----------------------------------------------------------------------------------------------------
2023-10-25 17:02:28,441 EPOCH 2 done: loss 0.1009 - lr: 0.000027
2023-10-25 17:02:33,319 DEV : loss 0.09367502480745316 - f1-score (micro avg) 0.7629
2023-10-25 17:02:33,342 saving best model
2023-10-25 17:02:34,008 ----------------------------------------------------------------------------------------------------
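The lr column traces the schedule announced above: with 893 mini-batches per epoch over 10 epochs there are 8,930 optimizer steps, and the first 10% of them (roughly all of epoch 1) ramp the rate linearly up to 3e-05, after which it decays linearly to zero. A small, self-contained illustration of that shape, assuming warmup starts from zero:

def linear_lr(step, total_steps=893 * 10, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero (illustrative only)."""
    warmup_steps = int(total_steps * warmup_fraction)   # 893 steps, i.e. epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# First logged iteration of epoch 1 (step 89) and of epoch 2 (step 893 + 89):
print(f"{linear_lr(89):.6f}")    # 0.000003 - matches "epoch 1 - iter 89" above
print(f"{linear_lr(982):.6f}")   # 0.000030 - matches "epoch 2 - iter 89" above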
2023-10-25 17:02:39,971 epoch 3 - iter 89/893 - loss 0.06332527 - time (sec): 5.96 - samples/sec: 3937.76 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:02:46,251 epoch 3 - iter 178/893 - loss 0.06194394 - time (sec): 12.24 - samples/sec: 4036.39 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:02:52,083 epoch 3 - iter 267/893 - loss 0.06017200 - time (sec): 18.07 - samples/sec: 4109.10 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:02:58,255 epoch 3 - iter 356/893 - loss 0.06164862 - time (sec): 24.24 - samples/sec: 4086.99 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:03:04,363 epoch 3 - iter 445/893 - loss 0.06151607 - time (sec): 30.35 - samples/sec: 4106.56 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:03:10,344 epoch 3 - iter 534/893 - loss 0.06067296 - time (sec): 36.33 - samples/sec: 4120.16 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:03:16,234 epoch 3 - iter 623/893 - loss 0.06086561 - time (sec): 42.22 - samples/sec: 4129.45 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:03:22,147 epoch 3 - iter 712/893 - loss 0.06099317 - time (sec): 48.13 - samples/sec: 4094.37 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:03:28,280 epoch 3 - iter 801/893 - loss 0.06051995 - time (sec): 54.27 - samples/sec: 4119.00 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:03:34,251 epoch 3 - iter 890/893 - loss 0.06220034 - time (sec): 60.24 - samples/sec: 4117.78 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:03:34,451 ----------------------------------------------------------------------------------------------------
2023-10-25 17:03:34,451 EPOCH 3 done: loss 0.0624 - lr: 0.000023
2023-10-25 17:03:39,555 DEV : loss 0.10349678248167038 - f1-score (micro avg) 0.7851
2023-10-25 17:03:39,573 saving best model
2023-10-25 17:03:40,237 ----------------------------------------------------------------------------------------------------
2023-10-25 17:03:46,372 epoch 4 - iter 89/893 - loss 0.03754159 - time (sec): 6.13 - samples/sec: 4230.43 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:03:52,281 epoch 4 - iter 178/893 - loss 0.04483007 - time (sec): 12.04 - samples/sec: 4282.44 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:03:57,977 epoch 4 - iter 267/893 - loss 0.04464268 - time (sec): 17.74 - samples/sec: 4228.06 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:04:04,177 epoch 4 - iter 356/893 - loss 0.04410251 - time (sec): 23.94 - samples/sec: 4134.91 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:04:10,450 epoch 4 - iter 445/893 - loss 0.04290747 - time (sec): 30.21 - samples/sec: 4118.64 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:04:16,490 epoch 4 - iter 534/893 - loss 0.04337588 - time (sec): 36.25 - samples/sec: 4135.74 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:04:22,526 epoch 4 - iter 623/893 - loss 0.04450695 - time (sec): 42.29 - samples/sec: 4100.04 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:04:28,608 epoch 4 - iter 712/893 - loss 0.04474457 - time (sec): 48.37 - samples/sec: 4104.66 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:04:34,686 epoch 4 - iter 801/893 - loss 0.04576971 - time (sec): 54.45 - samples/sec: 4112.57 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:04:40,591 epoch 4 - iter 890/893 - loss 0.04490676 - time (sec): 60.35 - samples/sec: 4097.87 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:04:40,899 ----------------------------------------------------------------------------------------------------
2023-10-25 17:04:40,904 EPOCH 4 done: loss 0.0447 - lr: 0.000020
2023-10-25 17:04:45,230 DEV : loss 0.14620383083820343 - f1-score (micro avg) 0.8037
2023-10-25 17:04:45,256 saving best model
2023-10-25 17:04:46,044 ----------------------------------------------------------------------------------------------------
2023-10-25 17:04:52,073 epoch 5 - iter 89/893 - loss 0.03530497 - time (sec): 6.03 - samples/sec: 3848.91 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:04:58,050 epoch 5 - iter 178/893 - loss 0.03402744 - time (sec): 12.00 - samples/sec: 4001.34 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:05:04,150 epoch 5 - iter 267/893 - loss 0.03369793 - time (sec): 18.10 - samples/sec: 4023.43 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:05:10,344 epoch 5 - iter 356/893 - loss 0.03388932 - time (sec): 24.30 - samples/sec: 4021.69 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:05:16,425 epoch 5 - iter 445/893 - loss 0.03377847 - time (sec): 30.38 - samples/sec: 4048.47 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:05:22,583 epoch 5 - iter 534/893 - loss 0.03360074 - time (sec): 36.53 - samples/sec: 4053.71 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:05:28,654 epoch 5 - iter 623/893 - loss 0.03242307 - time (sec): 42.61 - samples/sec: 4046.48 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:05:34,820 epoch 5 - iter 712/893 - loss 0.03229538 - time (sec): 48.77 - samples/sec: 4034.35 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:05:40,937 epoch 5 - iter 801/893 - loss 0.03238963 - time (sec): 54.89 - samples/sec: 4065.34 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:05:47,041 epoch 5 - iter 890/893 - loss 0.03261197 - time (sec): 60.99 - samples/sec: 4063.33 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:05:47,251 ----------------------------------------------------------------------------------------------------
2023-10-25 17:05:47,251 EPOCH 5 done: loss 0.0325 - lr: 0.000017
2023-10-25 17:05:52,885 DEV : loss 0.1633528769016266 - f1-score (micro avg) 0.797
2023-10-25 17:05:52,915 ----------------------------------------------------------------------------------------------------
2023-10-25 17:05:59,081 epoch 6 - iter 89/893 - loss 0.03029989 - time (sec): 6.16 - samples/sec: 3842.51 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:06:05,172 epoch 6 - iter 178/893 - loss 0.02564591 - time (sec): 12.26 - samples/sec: 3799.19 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:06:11,310 epoch 6 - iter 267/893 - loss 0.02415048 - time (sec): 18.39 - samples/sec: 3923.45 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:06:17,327 epoch 6 - iter 356/893 - loss 0.02531047 - time (sec): 24.41 - samples/sec: 3979.94 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:06:23,353 epoch 6 - iter 445/893 - loss 0.02540534 - time (sec): 30.44 - samples/sec: 4031.53 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:06:29,489 epoch 6 - iter 534/893 - loss 0.02638207 - time (sec): 36.57 - samples/sec: 4054.77 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:06:35,690 epoch 6 - iter 623/893 - loss 0.02582057 - time (sec): 42.77 - samples/sec: 4044.98 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:06:41,917 epoch 6 - iter 712/893 - loss 0.02512173 - time (sec): 49.00 - samples/sec: 4050.05 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:06:48,057 epoch 6 - iter 801/893 - loss 0.02591665 - time (sec): 55.14 - samples/sec: 4038.91 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:06:54,250 epoch 6 - iter 890/893 - loss 0.02583713 - time (sec): 61.33 - samples/sec: 4048.55 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:06:54,451 ----------------------------------------------------------------------------------------------------
2023-10-25 17:06:54,452 EPOCH 6 done: loss 0.0259 - lr: 0.000013
2023-10-25 17:06:59,824 DEV : loss 0.18684536218643188 - f1-score (micro avg) 0.7976
2023-10-25 17:06:59,848 ----------------------------------------------------------------------------------------------------
2023-10-25 17:07:06,024 epoch 7 - iter 89/893 - loss 0.01485614 - time (sec): 6.17 - samples/sec: 3881.67 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:07:12,130 epoch 7 - iter 178/893 - loss 0.01598830 - time (sec): 12.28 - samples/sec: 3958.05 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:07:18,166 epoch 7 - iter 267/893 - loss 0.01783078 - time (sec): 18.32 - samples/sec: 4076.48 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:07:24,070 epoch 7 - iter 356/893 - loss 0.01936177 - time (sec): 24.22 - samples/sec: 4105.59 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:07:30,059 epoch 7 - iter 445/893 - loss 0.01988732 - time (sec): 30.21 - samples/sec: 4146.37 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:07:36,108 epoch 7 - iter 534/893 - loss 0.01928793 - time (sec): 36.26 - samples/sec: 4162.19 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:07:42,411 epoch 7 - iter 623/893 - loss 0.02039273 - time (sec): 42.56 - samples/sec: 4121.89 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:07:48,261 epoch 7 - iter 712/893 - loss 0.02001692 - time (sec): 48.41 - samples/sec: 4092.95 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:07:54,353 epoch 7 - iter 801/893 - loss 0.02017325 - time (sec): 54.50 - samples/sec: 4087.57 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:08:00,473 epoch 7 - iter 890/893 - loss 0.01991182 - time (sec): 60.62 - samples/sec: 4094.74 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:08:00,656 ----------------------------------------------------------------------------------------------------
2023-10-25 17:08:00,657 EPOCH 7 done: loss 0.0199 - lr: 0.000010
2023-10-25 17:08:05,217 DEV : loss 0.2105928510427475 - f1-score (micro avg) 0.8011
2023-10-25 17:08:05,237 ----------------------------------------------------------------------------------------------------
2023-10-25 17:08:11,326 epoch 8 - iter 89/893 - loss 0.01743845 - time (sec): 6.09 - samples/sec: 4235.38 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:08:17,447 epoch 8 - iter 178/893 - loss 0.01794746 - time (sec): 12.21 - samples/sec: 4130.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:08:23,444 epoch 8 - iter 267/893 - loss 0.01517857 - time (sec): 18.21 - samples/sec: 4109.13 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:08:29,466 epoch 8 - iter 356/893 - loss 0.01542169 - time (sec): 24.23 - samples/sec: 4050.01 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:08:35,548 epoch 8 - iter 445/893 - loss 0.01465711 - time (sec): 30.31 - samples/sec: 4032.48 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:08:41,861 epoch 8 - iter 534/893 - loss 0.01456755 - time (sec): 36.62 - samples/sec: 4029.83 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:08:47,636 epoch 8 - iter 623/893 - loss 0.01409835 - time (sec): 42.40 - samples/sec: 4055.86 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:08:53,646 epoch 8 - iter 712/893 - loss 0.01384014 - time (sec): 48.41 - samples/sec: 4051.49 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:08:59,620 epoch 8 - iter 801/893 - loss 0.01422010 - time (sec): 54.38 - samples/sec: 4078.14 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:09:05,944 epoch 8 - iter 890/893 - loss 0.01456695 - time (sec): 60.71 - samples/sec: 4085.39 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:09:06,139 ----------------------------------------------------------------------------------------------------
2023-10-25 17:09:06,140 EPOCH 8 done: loss 0.0146 - lr: 0.000007
2023-10-25 17:09:11,159 DEV : loss 0.21266496181488037 - f1-score (micro avg) 0.7947
2023-10-25 17:09:11,180 ----------------------------------------------------------------------------------------------------
2023-10-25 17:09:17,250 epoch 9 - iter 89/893 - loss 0.00472214 - time (sec): 6.07 - samples/sec: 4171.11 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:09:23,216 epoch 9 - iter 178/893 - loss 0.00879912 - time (sec): 12.03 - samples/sec: 4177.79 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:09:29,325 epoch 9 - iter 267/893 - loss 0.01001564 - time (sec): 18.14 - samples/sec: 4086.48 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:09:35,358 epoch 9 - iter 356/893 - loss 0.01086924 - time (sec): 24.18 - samples/sec: 4140.04 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:09:41,382 epoch 9 - iter 445/893 - loss 0.01063271 - time (sec): 30.20 - samples/sec: 4151.32 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:09:47,352 epoch 9 - iter 534/893 - loss 0.01049232 - time (sec): 36.17 - samples/sec: 4114.54 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:09:53,532 epoch 9 - iter 623/893 - loss 0.01032612 - time (sec): 42.35 - samples/sec: 4133.27 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:09:59,456 epoch 9 - iter 712/893 - loss 0.01040970 - time (sec): 48.27 - samples/sec: 4105.70 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:10:05,526 epoch 9 - iter 801/893 - loss 0.01045359 - time (sec): 54.34 - samples/sec: 4091.22 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:10:11,623 epoch 9 - iter 890/893 - loss 0.01060696 - time (sec): 60.44 - samples/sec: 4100.12 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:10:11,820 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:11,820 EPOCH 9 done: loss 0.0106 - lr: 0.000003
2023-10-25 17:10:17,087 DEV : loss 0.2295289933681488 - f1-score (micro avg) 0.8011
2023-10-25 17:10:17,112 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:23,006 epoch 10 - iter 89/893 - loss 0.01002804 - time (sec): 5.89 - samples/sec: 4117.60 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:10:29,006 epoch 10 - iter 178/893 - loss 0.01019071 - time (sec): 11.89 - samples/sec: 3983.06 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:10:35,268 epoch 10 - iter 267/893 - loss 0.00952875 - time (sec): 18.15 - samples/sec: 4042.88 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:10:41,474 epoch 10 - iter 356/893 - loss 0.00888864 - time (sec): 24.36 - samples/sec: 4047.46 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:10:47,407 epoch 10 - iter 445/893 - loss 0.00924368 - time (sec): 30.29 - samples/sec: 4025.62 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:10:53,592 epoch 10 - iter 534/893 - loss 0.00922564 - time (sec): 36.48 - samples/sec: 4056.96 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:10:59,611 epoch 10 - iter 623/893 - loss 0.00891649 - time (sec): 42.50 - samples/sec: 4071.51 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:11:05,620 epoch 10 - iter 712/893 - loss 0.00840213 - time (sec): 48.51 - samples/sec: 4045.77 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:11:11,776 epoch 10 - iter 801/893 - loss 0.00803404 - time (sec): 54.66 - samples/sec: 4059.14 - lr: 0.000000 - momentum: 0.000000
2023-10-25 17:11:18,055 epoch 10 - iter 890/893 - loss 0.00778595 - time (sec): 60.94 - samples/sec: 4068.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 17:11:18,253 ----------------------------------------------------------------------------------------------------
2023-10-25 17:11:18,253 EPOCH 10 done: loss 0.0078 - lr: 0.000000
2023-10-25 17:11:22,853 DEV : loss 0.23914724588394165 - f1-score (micro avg) 0.7997
2023-10-25 17:11:23,516 ----------------------------------------------------------------------------------------------------
2023-10-25 17:11:23,517 Loading model from best epoch ...
2023-10-25 17:11:25,628 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 17:11:37,463
Results:
- F-score (micro) 0.6825
- F-score (macro) 0.5925
- Accuracy 0.5411

By class:
              precision    recall  f1-score   support

         LOC     0.7044    0.6813    0.6927      1095
         PER     0.7967    0.7628    0.7794      1012
         ORG     0.3908    0.5966    0.4723       357
   HumanProd     0.3279    0.6061    0.4255        33

   micro avg     0.6648    0.7012    0.6825      2497
   macro avg     0.5550    0.6617    0.5925      2497
weighted avg     0.6920    0.7012    0.6928      2497

2023-10-25 17:11:37,464 ----------------------------------------------------------------------------------------------------
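The figures above come from scoring best-model.pt on the 2,570-sentence test split (micro F1 0.6825). For completeness, a minimal sketch of loading that checkpoint and tagging a sentence; the example text is invented and only illustrates the prediction API.

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint saved under the training base path.
tagger = SequenceTagger.load("best-model.pt")

# Invented example sentence; the model was fine-tuned on historic French news.
sentence = Sentence("Victor Hugo est né à Besançon en 1802 .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value)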