Upload ./training.log with huggingface_hub
2023-10-25 15:07:16,252 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,253 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 15:07:16,253 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Train:  7142 sentences
2023-10-25 15:07:16,254         (train_with_dev=False, train_with_test=False)
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Training Params:
2023-10-25 15:07:16,254  - learning_rate: "3e-05"
2023-10-25 15:07:16,254  - mini_batch_size: "8"
2023-10-25 15:07:16,254  - max_epochs: "10"
2023-10-25 15:07:16,254  - shuffle: "True"
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Plugins:
2023-10-25 15:07:16,254  - TensorboardLogger
2023-10-25 15:07:16,254  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 15:07:16,254  - metric: "('micro avg', 'f1-score')"
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Computation:
2023-10-25 15:07:16,254  - compute on device: cuda:0
2023-10-25 15:07:16,254  - embedding storage: none
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 15:07:16,255 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,255 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,255 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 15:07:22,287 epoch 1 - iter 89/893 - loss 2.32679756 - time (sec): 6.03 - samples/sec: 4228.16 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:07:27,984 epoch 1 - iter 178/893 - loss 1.51396462 - time (sec): 11.73 - samples/sec: 4163.39 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:07:33,714 epoch 1 - iter 267/893 - loss 1.14478279 - time (sec): 17.46 - samples/sec: 4138.90 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:07:39,758 epoch 1 - iter 356/893 - loss 0.92624548 - time (sec): 23.50 - samples/sec: 4118.92 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:07:45,647 epoch 1 - iter 445/893 - loss 0.77920214 - time (sec): 29.39 - samples/sec: 4149.48 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:07:51,499 epoch 1 - iter 534/893 - loss 0.67179663 - time (sec): 35.24 - samples/sec: 4208.38 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:07:57,019 epoch 1 - iter 623/893 - loss 0.60130729 - time (sec): 40.76 - samples/sec: 4247.79 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:08:02,490 epoch 1 - iter 712/893 - loss 0.54657076 - time (sec): 46.23 - samples/sec: 4282.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:08:07,955 epoch 1 - iter 801/893 - loss 0.50230078 - time (sec): 51.70 - samples/sec: 4306.93 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:08:13,549 epoch 1 - iter 890/893 - loss 0.46653814 - time (sec): 57.29 - samples/sec: 4330.30 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:08:13,712 ----------------------------------------------------------------------------------------------------
2023-10-25 15:08:13,712 EPOCH 1 done: loss 0.4656 - lr: 0.000030
2023-10-25 15:08:17,345 DEV : loss 0.1060444563627243 - f1-score (micro avg)  0.7387
2023-10-25 15:08:17,369 saving best model
2023-10-25 15:08:17,905 ----------------------------------------------------------------------------------------------------
2023-10-25 15:08:23,847 epoch 2 - iter 89/893 - loss 0.11786204 - time (sec): 5.94 - samples/sec: 4321.04 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:08:29,333 epoch 2 - iter 178/893 - loss 0.11830957 - time (sec): 11.43 - samples/sec: 4126.75 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:08:35,534 epoch 2 - iter 267/893 - loss 0.11117703 - time (sec): 17.63 - samples/sec: 4187.28 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:08:41,384 epoch 2 - iter 356/893 - loss 0.11088492 - time (sec): 23.48 - samples/sec: 4194.66 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:08:47,307 epoch 2 - iter 445/893 - loss 0.10753992 - time (sec): 29.40 - samples/sec: 4218.91 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:08:53,058 epoch 2 - iter 534/893 - loss 0.10789428 - time (sec): 35.15 - samples/sec: 4232.20 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:08:58,673 epoch 2 - iter 623/893 - loss 0.10558977 - time (sec): 40.77 - samples/sec: 4289.97 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:09:04,088 epoch 2 - iter 712/893 - loss 0.10375529 - time (sec): 46.18 - samples/sec: 4271.55 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:09:09,613 epoch 2 - iter 801/893 - loss 0.10372405 - time (sec): 51.71 - samples/sec: 4305.77 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:09:15,240 epoch 2 - iter 890/893 - loss 0.10324515 - time (sec): 57.33 - samples/sec: 4321.41 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:09:15,429 ----------------------------------------------------------------------------------------------------
2023-10-25 15:09:15,430 EPOCH 2 done: loss 0.1031 - lr: 0.000027
2023-10-25 15:09:20,245 DEV : loss 0.09593858569860458 - f1-score (micro avg)  0.777
2023-10-25 15:09:20,268 saving best model
2023-10-25 15:09:20,917 ----------------------------------------------------------------------------------------------------
2023-10-25 15:09:26,476 epoch 3 - iter 89/893 - loss 0.06107853 - time (sec): 5.56 - samples/sec: 4578.32 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:09:31,970 epoch 3 - iter 178/893 - loss 0.05655927 - time (sec): 11.05 - samples/sec: 4428.83 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:09:37,621 epoch 3 - iter 267/893 - loss 0.05765556 - time (sec): 16.70 - samples/sec: 4489.95 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:09:43,134 epoch 3 - iter 356/893 - loss 0.05899019 - time (sec): 22.22 - samples/sec: 4487.90 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:09:48,684 epoch 3 - iter 445/893 - loss 0.06059285 - time (sec): 27.77 - samples/sec: 4430.84 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:09:54,437 epoch 3 - iter 534/893 - loss 0.06203883 - time (sec): 33.52 - samples/sec: 4406.62 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:10:00,394 epoch 3 - iter 623/893 - loss 0.06228959 - time (sec): 39.48 - samples/sec: 4403.41 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:10:06,271 epoch 3 - iter 712/893 - loss 0.06223592 - time (sec): 45.35 - samples/sec: 4403.81 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:10:12,116 epoch 3 - iter 801/893 - loss 0.06176464 - time (sec): 51.20 - samples/sec: 4397.44 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:10:17,781 epoch 3 - iter 890/893 - loss 0.06140542 - time (sec): 56.86 - samples/sec: 4364.69 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:10:17,965 ----------------------------------------------------------------------------------------------------
2023-10-25 15:10:17,965 EPOCH 3 done: loss 0.0613 - lr: 0.000023
2023-10-25 15:10:22,849 DEV : loss 0.10392870754003525 - f1-score (micro avg)  0.7824
2023-10-25 15:10:22,870 saving best model
2023-10-25 15:10:23,572 ----------------------------------------------------------------------------------------------------
2023-10-25 15:10:29,407 epoch 4 - iter 89/893 - loss 0.04410301 - time (sec): 5.83 - samples/sec: 4278.04 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:10:35,183 epoch 4 - iter 178/893 - loss 0.04544728 - time (sec): 11.61 - samples/sec: 4301.00 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:10:40,777 epoch 4 - iter 267/893 - loss 0.04597693 - time (sec): 17.20 - samples/sec: 4268.80 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:10:46,358 epoch 4 - iter 356/893 - loss 0.04537082 - time (sec): 22.78 - samples/sec: 4353.99 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:10:52,237 epoch 4 - iter 445/893 - loss 0.04624815 - time (sec): 28.66 - samples/sec: 4335.26 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:10:58,294 epoch 4 - iter 534/893 - loss 0.04475990 - time (sec): 34.72 - samples/sec: 4348.02 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:11:04,125 epoch 4 - iter 623/893 - loss 0.04559760 - time (sec): 40.55 - samples/sec: 4316.02 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:11:09,950 epoch 4 - iter 712/893 - loss 0.04461195 - time (sec): 46.38 - samples/sec: 4271.43 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:11:15,961 epoch 4 - iter 801/893 - loss 0.04371497 - time (sec): 52.39 - samples/sec: 4285.47 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:11:21,628 epoch 4 - iter 890/893 - loss 0.04355757 - time (sec): 58.05 - samples/sec: 4275.23 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:11:21,806 ----------------------------------------------------------------------------------------------------
2023-10-25 15:11:21,807 EPOCH 4 done: loss 0.0437 - lr: 0.000020
2023-10-25 15:11:25,871 DEV : loss 0.1405394971370697 - f1-score (micro avg)  0.7739
2023-10-25 15:11:25,895 ----------------------------------------------------------------------------------------------------
2023-10-25 15:11:31,806 epoch 5 - iter 89/893 - loss 0.03086483 - time (sec): 5.91 - samples/sec: 4018.56 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:11:37,763 epoch 5 - iter 178/893 - loss 0.03408901 - time (sec): 11.87 - samples/sec: 4169.34 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:11:43,934 epoch 5 - iter 267/893 - loss 0.03368610 - time (sec): 18.04 - samples/sec: 4168.56 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:11:49,742 epoch 5 - iter 356/893 - loss 0.03342150 - time (sec): 23.84 - samples/sec: 4171.56 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:11:55,551 epoch 5 - iter 445/893 - loss 0.03366136 - time (sec): 29.65 - samples/sec: 4169.40 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:12:01,573 epoch 5 - iter 534/893 - loss 0.03390282 - time (sec): 35.68 - samples/sec: 4202.78 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:12:07,368 epoch 5 - iter 623/893 - loss 0.03357706 - time (sec): 41.47 - samples/sec: 4197.80 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:12:13,096 epoch 5 - iter 712/893 - loss 0.03266215 - time (sec): 47.20 - samples/sec: 4199.37 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:12:18,874 epoch 5 - iter 801/893 - loss 0.03391376 - time (sec): 52.98 - samples/sec: 4207.00 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:12:24,683 epoch 5 - iter 890/893 - loss 0.03419362 - time (sec): 58.79 - samples/sec: 4219.53 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:12:24,885 ----------------------------------------------------------------------------------------------------
2023-10-25 15:12:24,885 EPOCH 5 done: loss 0.0341 - lr: 0.000017
2023-10-25 15:12:29,919 DEV : loss 0.16618064045906067 - f1-score (micro avg)  0.8051
2023-10-25 15:12:29,940 saving best model
2023-10-25 15:12:30,620 ----------------------------------------------------------------------------------------------------
2023-10-25 15:12:36,386 epoch 6 - iter 89/893 - loss 0.01934906 - time (sec): 5.76 - samples/sec: 4367.06 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:12:42,034 epoch 6 - iter 178/893 - loss 0.01971571 - time (sec): 11.41 - samples/sec: 4277.35 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:12:48,099 epoch 6 - iter 267/893 - loss 0.01891099 - time (sec): 17.48 - samples/sec: 4240.69 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:12:54,052 epoch 6 - iter 356/893 - loss 0.02463843 - time (sec): 23.43 - samples/sec: 4232.23 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:12:59,977 epoch 6 - iter 445/893 - loss 0.02485032 - time (sec): 29.35 - samples/sec: 4266.10 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:13:05,810 epoch 6 - iter 534/893 - loss 0.02602922 - time (sec): 35.19 - samples/sec: 4217.86 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:13:11,965 epoch 6 - iter 623/893 - loss 0.02550398 - time (sec): 41.34 - samples/sec: 4206.73 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:13:17,850 epoch 6 - iter 712/893 - loss 0.02490406 - time (sec): 47.23 - samples/sec: 4219.72 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:13:23,687 epoch 6 - iter 801/893 - loss 0.02516364 - time (sec): 53.06 - samples/sec: 4220.25 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:13:29,433 epoch 6 - iter 890/893 - loss 0.02571510 - time (sec): 58.81 - samples/sec: 4220.50 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:13:29,621 ----------------------------------------------------------------------------------------------------
2023-10-25 15:13:29,622 EPOCH 6 done: loss 0.0258 - lr: 0.000013
2023-10-25 15:13:34,396 DEV : loss 0.17371046543121338 - f1-score (micro avg)  0.8112
2023-10-25 15:13:34,417 saving best model
2023-10-25 15:13:35,072 ----------------------------------------------------------------------------------------------------
2023-10-25 15:13:41,193 epoch 7 - iter 89/893 - loss 0.02011470 - time (sec): 6.12 - samples/sec: 4355.79 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:13:47,009 epoch 7 - iter 178/893 - loss 0.02325248 - time (sec): 11.94 - samples/sec: 4245.96 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:13:52,906 epoch 7 - iter 267/893 - loss 0.02032808 - time (sec): 17.83 - samples/sec: 4206.61 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:13:58,744 epoch 7 - iter 356/893 - loss 0.02038788 - time (sec): 23.67 - samples/sec: 4235.40 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:14:04,629 epoch 7 - iter 445/893 - loss 0.01997941 - time (sec): 29.56 - samples/sec: 4217.35 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:14:10,590 epoch 7 - iter 534/893 - loss 0.01925592 - time (sec): 35.52 - samples/sec: 4185.73 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:14:16,316 epoch 7 - iter 623/893 - loss 0.01978486 - time (sec): 41.24 - samples/sec: 4162.69 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:14:22,572 epoch 7 - iter 712/893 - loss 0.01991371 - time (sec): 47.50 - samples/sec: 4168.20 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:14:28,265 epoch 7 - iter 801/893 - loss 0.01994353 - time (sec): 53.19 - samples/sec: 4181.97 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:14:34,108 epoch 7 - iter 890/893 - loss 0.02008156 - time (sec): 59.03 - samples/sec: 4202.26 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:14:34,306 ----------------------------------------------------------------------------------------------------
2023-10-25 15:14:34,306 EPOCH 7 done: loss 0.0200 - lr: 0.000010
2023-10-25 15:14:38,320 DEV : loss 0.17937816679477692 - f1-score (micro avg)  0.8123
2023-10-25 15:14:38,342 saving best model
2023-10-25 15:14:39,015 ----------------------------------------------------------------------------------------------------
2023-10-25 15:14:44,770 epoch 8 - iter 89/893 - loss 0.01881002 - time (sec): 5.75 - samples/sec: 4163.23 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:14:50,519 epoch 8 - iter 178/893 - loss 0.01785540 - time (sec): 11.50 - samples/sec: 4223.67 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:14:56,466 epoch 8 - iter 267/893 - loss 0.01663373 - time (sec): 17.45 - samples/sec: 4239.61 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:15:02,200 epoch 8 - iter 356/893 - loss 0.01641502 - time (sec): 23.18 - samples/sec: 4212.11 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:15:08,033 epoch 8 - iter 445/893 - loss 0.01638939 - time (sec): 29.02 - samples/sec: 4217.12 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:15:14,046 epoch 8 - iter 534/893 - loss 0.01532943 - time (sec): 35.03 - samples/sec: 4218.47 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:15:20,415 epoch 8 - iter 623/893 - loss 0.01502360 - time (sec): 41.40 - samples/sec: 4207.79 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:15:26,296 epoch 8 - iter 712/893 - loss 0.01549364 - time (sec): 47.28 - samples/sec: 4180.53 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:15:32,060 epoch 8 - iter 801/893 - loss 0.01600174 - time (sec): 53.04 - samples/sec: 4193.62 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:15:38,072 epoch 8 - iter 890/893 - loss 0.01593321 - time (sec): 59.06 - samples/sec: 4200.02 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:15:38,267 ----------------------------------------------------------------------------------------------------
2023-10-25 15:15:38,267 EPOCH 8 done: loss 0.0159 - lr: 0.000007
2023-10-25 15:15:43,263 DEV : loss 0.21227356791496277 - f1-score (micro avg)  0.7971
2023-10-25 15:15:43,284 ----------------------------------------------------------------------------------------------------
2023-10-25 15:15:49,118 epoch 9 - iter 89/893 - loss 0.00920994 - time (sec): 5.83 - samples/sec: 4231.87 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:15:54,872 epoch 9 - iter 178/893 - loss 0.00891932 - time (sec): 11.59 - samples/sec: 4226.85 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:16:00,508 epoch 9 - iter 267/893 - loss 0.01039436 - time (sec): 17.22 - samples/sec: 4278.15 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:16:06,782 epoch 9 - iter 356/893 - loss 0.01056567 - time (sec): 23.50 - samples/sec: 4283.75 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:16:12,664 epoch 9 - iter 445/893 - loss 0.01205154 - time (sec): 29.38 - samples/sec: 4283.65 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:16:18,609 epoch 9 - iter 534/893 - loss 0.01179973 - time (sec): 35.32 - samples/sec: 4285.62 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:16:24,456 epoch 9 - iter 623/893 - loss 0.01148151 - time (sec): 41.17 - samples/sec: 4253.07 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:16:30,157 epoch 9 - iter 712/893 - loss 0.01108116 - time (sec): 46.87 - samples/sec: 4279.34 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:16:35,747 epoch 9 - iter 801/893 - loss 0.01083303 - time (sec): 52.46 - samples/sec: 4280.96 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:16:41,190 epoch 9 - iter 890/893 - loss 0.01082633 - time (sec): 57.90 - samples/sec: 4284.21 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:16:41,365 ----------------------------------------------------------------------------------------------------
2023-10-25 15:16:41,366 EPOCH 9 done: loss 0.0108 - lr: 0.000003
2023-10-25 15:16:46,208 DEV : loss 0.21176157891750336 - f1-score (micro avg)  0.8104
2023-10-25 15:16:46,230 ----------------------------------------------------------------------------------------------------
2023-10-25 15:16:51,777 epoch 10 - iter 89/893 - loss 0.01317723 - time (sec): 5.55 - samples/sec: 4308.65 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:16:57,614 epoch 10 - iter 178/893 - loss 0.00887501 - time (sec): 11.38 - samples/sec: 4360.90 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:17:03,568 epoch 10 - iter 267/893 - loss 0.00690912 - time (sec): 17.34 - samples/sec: 4311.16 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:17:09,586 epoch 10 - iter 356/893 - loss 0.00707851 - time (sec): 23.35 - samples/sec: 4278.17 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:17:15,125 epoch 10 - iter 445/893 - loss 0.00657208 - time (sec): 28.89 - samples/sec: 4225.74 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:17:20,950 epoch 10 - iter 534/893 - loss 0.00656213 - time (sec): 34.72 - samples/sec: 4263.47 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:17:26,715 epoch 10 - iter 623/893 - loss 0.00748677 - time (sec): 40.48 - samples/sec: 4268.60 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:17:32,254 epoch 10 - iter 712/893 - loss 0.00756648 - time (sec): 46.02 - samples/sec: 4296.21 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:17:37,995 epoch 10 - iter 801/893 - loss 0.00771944 - time (sec): 51.76 - samples/sec: 4279.89 - lr: 0.000000 - momentum: 0.000000
2023-10-25 15:17:43,827 epoch 10 - iter 890/893 - loss 0.00812425 - time (sec): 57.60 - samples/sec: 4307.35 - lr: 0.000000 - momentum: 0.000000
2023-10-25 15:17:43,999 ----------------------------------------------------------------------------------------------------
2023-10-25 15:17:43,999 EPOCH 10 done: loss 0.0081 - lr: 0.000000
2023-10-25 15:17:47,978 DEV : loss 0.21720275282859802 - f1-score (micro avg)  0.8147
2023-10-25 15:17:47,999 saving best model
2023-10-25 15:17:49,119 ----------------------------------------------------------------------------------------------------
2023-10-25 15:17:49,120 Loading model from best epoch ...
2023-10-25 15:17:51,035 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 15:18:03,575 
Results:
- F-score (micro) 0.6992
- F-score (macro) 0.6245
- Accuracy 0.5561

By class:
              precision    recall  f1-score   support

         LOC     0.7038    0.6986    0.7012      1095
         PER     0.7808    0.7816    0.7812      1012
         ORG     0.4549    0.5798    0.5099       357
   HumanProd     0.4074    0.6667    0.5057        33

   micro avg     0.6842    0.7149    0.6992      2497
   macro avg     0.5867    0.6817    0.6245      2497
weighted avg     0.6955    0.7149    0.7037      2497

2023-10-25 15:18:03,576 ----------------------------------------------------------------------------------------------------