2023-10-13 09:43:04,570 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:04,571 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
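As a back-of-the-envelope sanity check on the summary above, the shapes printed for each module can be tallied into a total parameter count. This counts only the modules shown (the BERT backbone plus the 25-tag linear head); the constant names below are illustrative, not part of Flair:

```python
# Tally parameter counts from the layer shapes printed in the model summary.
H, FF, VOCAB, POS, TYPES, TAGS, LAYERS = 768, 3072, 32001, 512, 2, 25, 12

def linear(in_f, out_f):
    # weight + bias of an nn.Linear(in_f, out_f, bias=True)
    return in_f * out_f + out_f

def layer_norm(h):
    # weight + bias of a LayerNorm over h features
    return 2 * h

embeddings = VOCAB * H + POS * H + TYPES * H + layer_norm(H)

per_layer = (
    3 * linear(H, H)                  # query / key / value projections
    + linear(H, H) + layer_norm(H)    # self-attention output block
    + linear(H, FF)                   # intermediate (feed-forward up)
    + linear(FF, H) + layer_norm(H)   # feed-forward down + output norm
)

pooler = linear(H, H)
tag_head = linear(H, TAGS)            # (linear): Linear(768 -> 25)

total = embeddings + LAYERS * per_layer + pooler + tag_head
print(f"{total:,} parameters")        # roughly 110.6M for this configuration
```

The dropout and activation modules contribute no parameters, which is why they do not appear in the tally.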
2023-10-13 09:43:04,571 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:04,571 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-13 09:43:04,571 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:04,571 Train: 1214 sentences
2023-10-13 09:43:04,571 (train_with_dev=False, train_with_test=False)
2023-10-13 09:43:04,571 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:04,571 Training Params:
2023-10-13 09:43:04,571 - learning_rate: "3e-05"
2023-10-13 09:43:04,571 - mini_batch_size: "4"
2023-10-13 09:43:04,571 - max_epochs: "10"
2023-10-13 09:43:04,571 - shuffle: "True"
2023-10-13 09:43:04,571 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:04,571 Plugins:
2023-10-13 09:43:04,572 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 09:43:04,572 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:04,572 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 09:43:04,572 - metric: "('micro avg', 'f1-score')"
2023-10-13 09:43:04,572 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:04,572 Computation:
2023-10-13 09:43:04,572 - compute on device: cuda:0
2023-10-13 09:43:04,572 - embedding storage: none
2023-10-13 09:43:04,572 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:04,572 Model training base path: "hmbench-ajmc/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-13 09:43:04,572 ----------------------------------------------------------------------------------------------------
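The LinearScheduler plugin with warmup_fraction 0.1 explains the per-iteration `lr` values in the epochs below: the rate climbs to the peak of 3e-05 over the first ~10% of steps (the whole of epoch 1, at 304 iterations per epoch over 10 epochs) and then decays linearly to zero. A minimal sketch of that schedule, assuming the standard warmup-then-linear-decay form (the function name is illustrative, not Flair's API):

```python
def linear_warmup_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay to zero at total_steps."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 304 * 10                          # 304 iterations/epoch x 10 epochs
print(linear_warmup_lr(152, total))       # mid-warmup: half of peak
print(linear_warmup_lr(304, total))       # end of epoch 1: peak, 3e-05
print(linear_warmup_lr(total, total))     # final step: 0.0
```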
2023-10-13 09:43:04,572 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:05,888 epoch 1 - iter 30/304 - loss 3.18777132 - time (sec): 1.32 - samples/sec: 2145.66 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:43:07,227 epoch 1 - iter 60/304 - loss 2.73681219 - time (sec): 2.65 - samples/sec: 2358.08 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:43:08,537 epoch 1 - iter 90/304 - loss 2.09750424 - time (sec): 3.96 - samples/sec: 2381.10 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:43:09,839 epoch 1 - iter 120/304 - loss 1.78399400 - time (sec): 5.27 - samples/sec: 2373.95 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:43:11,127 epoch 1 - iter 150/304 - loss 1.56438139 - time (sec): 6.55 - samples/sec: 2345.68 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:43:12,432 epoch 1 - iter 180/304 - loss 1.38262168 - time (sec): 7.86 - samples/sec: 2325.17 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:43:13,788 epoch 1 - iter 210/304 - loss 1.23673635 - time (sec): 9.21 - samples/sec: 2305.97 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:43:15,154 epoch 1 - iter 240/304 - loss 1.10795345 - time (sec): 10.58 - samples/sec: 2313.02 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:43:16,476 epoch 1 - iter 270/304 - loss 1.01589125 - time (sec): 11.90 - samples/sec: 2307.23 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:43:17,839 epoch 1 - iter 300/304 - loss 0.93379970 - time (sec): 13.27 - samples/sec: 2312.35 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:43:18,016 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:18,017 EPOCH 1 done: loss 0.9263 - lr: 0.000030
2023-10-13 09:43:19,099 DEV : loss 0.2200542837381363 - f1-score (micro avg) 0.5681
2023-10-13 09:43:19,106 saving best model
2023-10-13 09:43:19,511 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:20,888 epoch 2 - iter 30/304 - loss 0.17307732 - time (sec): 1.38 - samples/sec: 2174.47 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:43:22,243 epoch 2 - iter 60/304 - loss 0.17320414 - time (sec): 2.73 - samples/sec: 2146.06 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:43:23,621 epoch 2 - iter 90/304 - loss 0.18760075 - time (sec): 4.11 - samples/sec: 2173.24 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:43:24,985 epoch 2 - iter 120/304 - loss 0.19205634 - time (sec): 5.47 - samples/sec: 2201.31 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:43:26,290 epoch 2 - iter 150/304 - loss 0.18439761 - time (sec): 6.78 - samples/sec: 2199.32 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:43:27,746 epoch 2 - iter 180/304 - loss 0.16674569 - time (sec): 8.23 - samples/sec: 2174.12 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:43:29,264 epoch 2 - iter 210/304 - loss 0.16494412 - time (sec): 9.75 - samples/sec: 2175.89 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:43:30,785 epoch 2 - iter 240/304 - loss 0.16424428 - time (sec): 11.27 - samples/sec: 2162.64 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:43:32,294 epoch 2 - iter 270/304 - loss 0.15967076 - time (sec): 12.78 - samples/sec: 2140.71 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:43:33,856 epoch 2 - iter 300/304 - loss 0.15519499 - time (sec): 14.34 - samples/sec: 2132.80 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:43:34,071 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:34,072 EPOCH 2 done: loss 0.1558 - lr: 0.000027
2023-10-13 09:43:35,012 DEV : loss 0.1596992164850235 - f1-score (micro avg) 0.7454
2023-10-13 09:43:35,022 saving best model
2023-10-13 09:43:35,504 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:37,119 epoch 3 - iter 30/304 - loss 0.11300790 - time (sec): 1.61 - samples/sec: 1984.96 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:43:38,744 epoch 3 - iter 60/304 - loss 0.09132321 - time (sec): 3.24 - samples/sec: 1885.50 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:43:40,347 epoch 3 - iter 90/304 - loss 0.09134842 - time (sec): 4.84 - samples/sec: 1952.61 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:43:41,938 epoch 3 - iter 120/304 - loss 0.08989897 - time (sec): 6.43 - samples/sec: 1899.90 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:43:43,528 epoch 3 - iter 150/304 - loss 0.08942096 - time (sec): 8.02 - samples/sec: 1918.25 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:43:45,111 epoch 3 - iter 180/304 - loss 0.08417011 - time (sec): 9.60 - samples/sec: 1902.32 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:43:46,663 epoch 3 - iter 210/304 - loss 0.08708482 - time (sec): 11.16 - samples/sec: 1928.86 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:43:48,217 epoch 3 - iter 240/304 - loss 0.08485546 - time (sec): 12.71 - samples/sec: 1920.44 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:43:49,712 epoch 3 - iter 270/304 - loss 0.08273766 - time (sec): 14.21 - samples/sec: 1935.56 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:43:51,278 epoch 3 - iter 300/304 - loss 0.08175386 - time (sec): 15.77 - samples/sec: 1944.07 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:43:51,463 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:51,463 EPOCH 3 done: loss 0.0811 - lr: 0.000023
2023-10-13 09:43:52,423 DEV : loss 0.15745516121387482 - f1-score (micro avg) 0.823
2023-10-13 09:43:52,431 saving best model
2023-10-13 09:43:52,949 ----------------------------------------------------------------------------------------------------
2023-10-13 09:43:54,354 epoch 4 - iter 30/304 - loss 0.08037408 - time (sec): 1.40 - samples/sec: 2217.79 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:43:55,799 epoch 4 - iter 60/304 - loss 0.05185332 - time (sec): 2.84 - samples/sec: 2213.19 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:43:57,124 epoch 4 - iter 90/304 - loss 0.04267741 - time (sec): 4.17 - samples/sec: 2166.53 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:43:58,511 epoch 4 - iter 120/304 - loss 0.05381535 - time (sec): 5.56 - samples/sec: 2160.11 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:43:59,928 epoch 4 - iter 150/304 - loss 0.05611766 - time (sec): 6.97 - samples/sec: 2158.07 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:44:01,264 epoch 4 - iter 180/304 - loss 0.05710902 - time (sec): 8.31 - samples/sec: 2185.65 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:44:02,592 epoch 4 - iter 210/304 - loss 0.05513965 - time (sec): 9.64 - samples/sec: 2228.90 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:44:03,883 epoch 4 - iter 240/304 - loss 0.05318763 - time (sec): 10.93 - samples/sec: 2247.68 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:44:05,277 epoch 4 - iter 270/304 - loss 0.05833081 - time (sec): 12.32 - samples/sec: 2251.97 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:44:06,628 epoch 4 - iter 300/304 - loss 0.05753093 - time (sec): 13.67 - samples/sec: 2241.49 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:44:06,828 ----------------------------------------------------------------------------------------------------
2023-10-13 09:44:06,828 EPOCH 4 done: loss 0.0589 - lr: 0.000020
2023-10-13 09:44:07,775 DEV : loss 0.17054308950901031 - f1-score (micro avg) 0.8459
2023-10-13 09:44:07,782 saving best model
2023-10-13 09:44:08,224 ----------------------------------------------------------------------------------------------------
2023-10-13 09:44:09,687 epoch 5 - iter 30/304 - loss 0.03275953 - time (sec): 1.46 - samples/sec: 2395.60 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:44:11,050 epoch 5 - iter 60/304 - loss 0.03803949 - time (sec): 2.82 - samples/sec: 2265.93 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:44:12,387 epoch 5 - iter 90/304 - loss 0.04627479 - time (sec): 4.16 - samples/sec: 2209.66 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:44:13,739 epoch 5 - iter 120/304 - loss 0.04780246 - time (sec): 5.51 - samples/sec: 2238.04 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:44:15,140 epoch 5 - iter 150/304 - loss 0.04629657 - time (sec): 6.91 - samples/sec: 2224.18 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:44:16,498 epoch 5 - iter 180/304 - loss 0.05397125 - time (sec): 8.27 - samples/sec: 2215.68 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:44:17,827 epoch 5 - iter 210/304 - loss 0.05390279 - time (sec): 9.60 - samples/sec: 2218.25 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:44:19,155 epoch 5 - iter 240/304 - loss 0.04991622 - time (sec): 10.92 - samples/sec: 2251.74 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:44:20,494 epoch 5 - iter 270/304 - loss 0.04695046 - time (sec): 12.26 - samples/sec: 2257.09 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:44:21,829 epoch 5 - iter 300/304 - loss 0.04815118 - time (sec): 13.60 - samples/sec: 2259.06 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:44:22,005 ----------------------------------------------------------------------------------------------------
2023-10-13 09:44:22,005 EPOCH 5 done: loss 0.0478 - lr: 0.000017
2023-10-13 09:44:22,946 DEV : loss 0.18562708795070648 - f1-score (micro avg) 0.8517
2023-10-13 09:44:22,953 saving best model
2023-10-13 09:44:23,391 ----------------------------------------------------------------------------------------------------
2023-10-13 09:44:24,744 epoch 6 - iter 30/304 - loss 0.02949098 - time (sec): 1.35 - samples/sec: 2254.15 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:44:26,088 epoch 6 - iter 60/304 - loss 0.03277319 - time (sec): 2.70 - samples/sec: 2278.42 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:44:27,446 epoch 6 - iter 90/304 - loss 0.03536898 - time (sec): 4.05 - samples/sec: 2303.35 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:44:28,791 epoch 6 - iter 120/304 - loss 0.03228230 - time (sec): 5.40 - samples/sec: 2319.29 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:44:30,203 epoch 6 - iter 150/304 - loss 0.03069445 - time (sec): 6.81 - samples/sec: 2277.68 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:44:31,549 epoch 6 - iter 180/304 - loss 0.03440282 - time (sec): 8.16 - samples/sec: 2277.90 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:44:32,979 epoch 6 - iter 210/304 - loss 0.03271867 - time (sec): 9.59 - samples/sec: 2272.83 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:44:34,389 epoch 6 - iter 240/304 - loss 0.03193408 - time (sec): 11.00 - samples/sec: 2250.68 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:44:35,723 epoch 6 - iter 270/304 - loss 0.03421196 - time (sec): 12.33 - samples/sec: 2247.74 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:44:37,066 epoch 6 - iter 300/304 - loss 0.03653127 - time (sec): 13.67 - samples/sec: 2243.82 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:44:37,238 ----------------------------------------------------------------------------------------------------
2023-10-13 09:44:37,238 EPOCH 6 done: loss 0.0363 - lr: 0.000013
2023-10-13 09:44:38,411 DEV : loss 0.19996623694896698 - f1-score (micro avg) 0.8463
2023-10-13 09:44:38,418 ----------------------------------------------------------------------------------------------------
2023-10-13 09:44:39,833 epoch 7 - iter 30/304 - loss 0.01248742 - time (sec): 1.41 - samples/sec: 2184.05 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:44:41,231 epoch 7 - iter 60/304 - loss 0.01048119 - time (sec): 2.81 - samples/sec: 2249.78 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:44:42,675 epoch 7 - iter 90/304 - loss 0.01767536 - time (sec): 4.26 - samples/sec: 2175.25 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:44:44,095 epoch 7 - iter 120/304 - loss 0.01439525 - time (sec): 5.68 - samples/sec: 2155.56 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:44:45,501 epoch 7 - iter 150/304 - loss 0.02138702 - time (sec): 7.08 - samples/sec: 2142.59 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:44:46,880 epoch 7 - iter 180/304 - loss 0.02495787 - time (sec): 8.46 - samples/sec: 2151.11 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:44:48,362 epoch 7 - iter 210/304 - loss 0.02267407 - time (sec): 9.94 - samples/sec: 2122.44 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:44:49,823 epoch 7 - iter 240/304 - loss 0.02105063 - time (sec): 11.40 - samples/sec: 2115.61 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:44:51,208 epoch 7 - iter 270/304 - loss 0.02507215 - time (sec): 12.79 - samples/sec: 2168.01 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:44:52,530 epoch 7 - iter 300/304 - loss 0.02827736 - time (sec): 14.11 - samples/sec: 2178.98 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:44:52,712 ----------------------------------------------------------------------------------------------------
2023-10-13 09:44:52,712 EPOCH 7 done: loss 0.0281 - lr: 0.000010
2023-10-13 09:44:53,657 DEV : loss 0.20399750769138336 - f1-score (micro avg) 0.8386
2023-10-13 09:44:53,664 ----------------------------------------------------------------------------------------------------
2023-10-13 09:44:54,987 epoch 8 - iter 30/304 - loss 0.01453405 - time (sec): 1.32 - samples/sec: 2474.56 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:44:56,442 epoch 8 - iter 60/304 - loss 0.02522980 - time (sec): 2.78 - samples/sec: 2328.53 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:44:57,906 epoch 8 - iter 90/304 - loss 0.02720838 - time (sec): 4.24 - samples/sec: 2201.84 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:44:59,292 epoch 8 - iter 120/304 - loss 0.02254900 - time (sec): 5.63 - samples/sec: 2230.48 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:45:00,696 epoch 8 - iter 150/304 - loss 0.02353663 - time (sec): 7.03 - samples/sec: 2213.46 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:45:02,058 epoch 8 - iter 180/304 - loss 0.02081346 - time (sec): 8.39 - samples/sec: 2242.90 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:45:03,440 epoch 8 - iter 210/304 - loss 0.02477093 - time (sec): 9.78 - samples/sec: 2225.11 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:45:04,806 epoch 8 - iter 240/304 - loss 0.02426797 - time (sec): 11.14 - samples/sec: 2210.29 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:45:06,184 epoch 8 - iter 270/304 - loss 0.02211778 - time (sec): 12.52 - samples/sec: 2230.27 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:45:07,636 epoch 8 - iter 300/304 - loss 0.02288395 - time (sec): 13.97 - samples/sec: 2200.17 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:45:07,811 ----------------------------------------------------------------------------------------------------
2023-10-13 09:45:07,811 EPOCH 8 done: loss 0.0227 - lr: 0.000007
2023-10-13 09:45:08,827 DEV : loss 0.2056579738855362 - f1-score (micro avg) 0.8511
2023-10-13 09:45:08,838 ----------------------------------------------------------------------------------------------------
2023-10-13 09:45:10,362 epoch 9 - iter 30/304 - loss 0.00578469 - time (sec): 1.52 - samples/sec: 1837.33 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:45:11,764 epoch 9 - iter 60/304 - loss 0.00992627 - time (sec): 2.92 - samples/sec: 2008.99 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:45:13,092 epoch 9 - iter 90/304 - loss 0.00992139 - time (sec): 4.25 - samples/sec: 2120.90 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:45:14,406 epoch 9 - iter 120/304 - loss 0.01727825 - time (sec): 5.57 - samples/sec: 2144.26 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:45:15,728 epoch 9 - iter 150/304 - loss 0.02120179 - time (sec): 6.89 - samples/sec: 2158.08 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:45:17,045 epoch 9 - iter 180/304 - loss 0.01809081 - time (sec): 8.21 - samples/sec: 2217.94 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:45:18,395 epoch 9 - iter 210/304 - loss 0.01948613 - time (sec): 9.55 - samples/sec: 2223.18 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:45:19,741 epoch 9 - iter 240/304 - loss 0.02088858 - time (sec): 10.90 - samples/sec: 2258.04 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:45:21,201 epoch 9 - iter 270/304 - loss 0.01990736 - time (sec): 12.36 - samples/sec: 2236.94 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:45:22,610 epoch 9 - iter 300/304 - loss 0.01800169 - time (sec): 13.77 - samples/sec: 2223.96 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:45:22,781 ----------------------------------------------------------------------------------------------------
2023-10-13 09:45:22,781 EPOCH 9 done: loss 0.0178 - lr: 0.000003
2023-10-13 09:45:23,767 DEV : loss 0.21016356348991394 - f1-score (micro avg) 0.8494
2023-10-13 09:45:23,777 ----------------------------------------------------------------------------------------------------
2023-10-13 09:45:25,335 epoch 10 - iter 30/304 - loss 0.00232505 - time (sec): 1.56 - samples/sec: 1989.11 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:45:26,954 epoch 10 - iter 60/304 - loss 0.01192489 - time (sec): 3.18 - samples/sec: 1920.03 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:45:28,564 epoch 10 - iter 90/304 - loss 0.01414615 - time (sec): 4.79 - samples/sec: 1997.32 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:45:30,139 epoch 10 - iter 120/304 - loss 0.01329637 - time (sec): 6.36 - samples/sec: 2014.38 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:45:31,689 epoch 10 - iter 150/304 - loss 0.01259800 - time (sec): 7.91 - samples/sec: 1979.91 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:45:33,228 epoch 10 - iter 180/304 - loss 0.01223902 - time (sec): 9.45 - samples/sec: 1969.71 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:45:34,750 epoch 10 - iter 210/304 - loss 0.01243686 - time (sec): 10.97 - samples/sec: 1992.22 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:45:36,233 epoch 10 - iter 240/304 - loss 0.01176342 - time (sec): 12.46 - samples/sec: 1968.81 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:45:37,770 epoch 10 - iter 270/304 - loss 0.01375307 - time (sec): 13.99 - samples/sec: 1974.36 - lr: 0.000000 - momentum: 0.000000
2023-10-13 09:45:39,279 epoch 10 - iter 300/304 - loss 0.01592456 - time (sec): 15.50 - samples/sec: 1970.68 - lr: 0.000000 - momentum: 0.000000
2023-10-13 09:45:39,469 ----------------------------------------------------------------------------------------------------
2023-10-13 09:45:39,469 EPOCH 10 done: loss 0.0157 - lr: 0.000000
2023-10-13 09:45:40,430 DEV : loss 0.21193119883537292 - f1-score (micro avg) 0.8461
2023-10-13 09:45:40,765 ----------------------------------------------------------------------------------------------------
2023-10-13 09:45:40,767 Loading model from best epoch ...
2023-10-13 09:45:42,274 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
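The 25 tags above form a BIOES scheme: O plus S-/B-/E-/I- prefixes for each of the six entity types (scope, pers, work, loc, date, object). A small illustrative decoder for such tag sequences, to show how they map back to entity spans (this is a sketch, not Flair's internal decoder):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, ent = tag.split("-", 1)
        if prefix == "S":                 # single-token entity
            spans.append((ent, i, i + 1))
            start = None
        elif prefix == "B":               # entity begins
            start, label = i, ent
        elif prefix == "E" and start is not None and ent == label:
            spans.append((ent, start, i + 1))   # entity ends
            start = None
        # "I" continues an open span; nothing to record yet
    return spans

print(bioes_to_spans(["S-pers", "O", "B-work", "I-work", "E-work"]))
```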
2023-10-13 09:45:43,361
Results:
- F-score (micro) 0.7936
- F-score (macro) 0.6136
- Accuracy 0.6637

By class:
              precision    recall  f1-score   support

       scope     0.7610    0.8013    0.7806       151
        pers     0.7395    0.9167    0.8186        96
        work     0.7265    0.8947    0.8019        95
         loc     0.6667    0.6667    0.6667         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7437    0.8506    0.7936       348
   macro avg     0.5787    0.6559    0.6136       348
weighted avg     0.7383    0.8506    0.7892       348
2023-10-13 09:45:43,362 ----------------------------------------------------------------------------------------------------
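The aggregate rows of the table above follow from the per-class numbers: micro F1 is the harmonic mean of the micro precision and recall, macro F1 is the unweighted mean of the per-class F1 scores, and the weighted average weights each class F1 by its support. A quick check against the reported values:

```python
# (precision, recall, f1, support) per class, copied from the table above
per_class = {
    "scope": (0.7610, 0.8013, 0.7806, 151),
    "pers":  (0.7395, 0.9167, 0.8186, 96),
    "work":  (0.7265, 0.8947, 0.8019, 95),
    "loc":   (0.6667, 0.6667, 0.6667, 3),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}

micro_p, micro_r = 0.7437, 0.8506
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)   # harmonic mean

f1s = [f1 for _, _, f1, _ in per_class.values()]
macro_f1 = sum(f1s) / len(f1s)                           # unweighted mean

total = sum(s for _, _, _, s in per_class.values())      # 348 entities
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
# -> 0.7936 0.6136 0.7892, matching the micro / macro / weighted avg rows
```

Note how the two unsupported `date` predictions drag the macro average well below the micro average, since every class counts equally there regardless of support.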