2023-10-18 18:42:19,552 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,552 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
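The module printout above pins down every tensor shape, so the size of this tiny tagger can be tallied directly from it. A minimal sketch (the totals are derived only from the shapes printed in the log, counting weight plus bias for each Linear and LayerNorm; it is an illustration, not an official parameter count):

```python
# Tally parameters from the shapes in the SequenceTagger printout above.
def linear(n_in, n_out):
    return n_in * n_out + n_out       # weight + bias

def layer_norm(n):
    return 2 * n                      # weight + bias

embeddings = 32001 * 128 + 512 * 128 + 2 * 128 + layer_norm(128)

bert_layer = (
    4 * linear(128, 128)              # query, key, value, self-output dense
    + layer_norm(128)                 # attention output LayerNorm
    + linear(128, 512)                # intermediate
    + linear(512, 128)                # output
    + layer_norm(128)                 # output LayerNorm
)

pooler = linear(128, 128)
tag_head = linear(128, 21)            # projection onto the 21-tag space

total = embeddings + 2 * bert_layer + pooler + tag_head
print(total)                          # 4577941 by this tally (~4.6M)
```

Nearly 90% of the budget sits in the 32001 x 128 word-embedding matrix, which is typical for "tiny" BERT variants.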
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 Train:  5901 sentences
2023-10-18 18:42:19,553         (train_with_dev=False, train_with_test=False)
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 Training Params:
2023-10-18 18:42:19,553  - learning_rate: "3e-05"
2023-10-18 18:42:19,553  - mini_batch_size: "4"
2023-10-18 18:42:19,553  - max_epochs: "10"
2023-10-18 18:42:19,553  - shuffle: "True"
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 Plugins:
2023-10-18 18:42:19,553  - TensorboardLogger
2023-10-18 18:42:19,553  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
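The lr column in the iteration lines below follows from these two blocks: with 1476 mini-batches per epoch over 10 epochs, the LinearScheduler ramps the rate up over the first 10% of the 14760 total steps and then decays it linearly to zero. A minimal sketch consistent with the logged values (an illustration of the schedule, not Flair's internal scheduler code):

```python
# Linear warmup + linear decay matching learning_rate=3e-05,
# warmup_fraction=0.1, 1476 batches/epoch, 10 epochs.
PEAK_LR = 3e-05
STEPS_PER_EPOCH = 1476
TOTAL_STEPS = STEPS_PER_EPOCH * 10       # 14760
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)    # 1476 = exactly epoch 1

def lr_at(step):
    """Learning rate after `step` optimizer updates."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Epoch 1 ramps 0.000003 -> 0.000030; the remaining nine epochs
# decay back toward 0.000000, as in the iteration lines below.
print(round(lr_at(147), 6), round(lr_at(1470), 6), round(lr_at(14700), 6))
```

Because warmup_fraction is 0.1 and there are exactly 10 epochs, the warmup occupies precisely the first epoch, which is why epoch 1 is the only one where the logged lr increases.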
2023-10-18 18:42:19,553 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:42:19,553  - metric: "('micro avg', 'f1-score')"
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 Computation:
2023-10-18 18:42:19,553  - compute on device: cuda:0
2023-10-18 18:42:19,553  - embedding storage: none
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:19,554 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 18:42:22,483 epoch 1 - iter 147/1476 - loss 3.78556163 - time (sec): 2.93 - samples/sec: 5682.81 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:42:24,846 epoch 1 - iter 294/1476 - loss 3.49556145 - time (sec): 5.29 - samples/sec: 6046.59 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:42:27,226 epoch 1 - iter 441/1476 - loss 3.01465804 - time (sec): 7.67 - samples/sec: 6603.04 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:42:29,575 epoch 1 - iter 588/1476 - loss 2.53729222 - time (sec): 10.02 - samples/sec: 6748.99 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:42:31,898 epoch 1 - iter 735/1476 - loss 2.20001287 - time (sec): 12.34 - samples/sec: 6749.73 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:42:34,215 epoch 1 - iter 882/1476 - loss 1.96299069 - time (sec): 14.66 - samples/sec: 6740.22 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:42:36,563 epoch 1 - iter 1029/1476 - loss 1.76317650 - time (sec): 17.01 - samples/sec: 6825.64 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:42:38,861 epoch 1 - iter 1176/1476 - loss 1.61885567 - time (sec): 19.31 - samples/sec: 6848.58 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:42:41,179 epoch 1 - iter 1323/1476 - loss 1.50834098 - time (sec): 21.63 - samples/sec: 6821.58 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:42:43,583 epoch 1 - iter 1470/1476 - loss 1.39708712 - time (sec): 24.03 - samples/sec: 6903.66 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:42:43,811 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:43,811 EPOCH 1 done: loss 1.3940 - lr: 0.000030
2023-10-18 18:42:46,113 DEV : loss 0.41898399591445923 - f1-score (micro avg)  0.0443
2023-10-18 18:42:46,137 saving best model
2023-10-18 18:42:46,171 ----------------------------------------------------------------------------------------------------
2023-10-18 18:42:48,566 epoch 2 - iter 147/1476 - loss 0.47548170 - time (sec): 2.39 - samples/sec: 6840.57 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:42:51,002 epoch 2 - iter 294/1476 - loss 0.48872686 - time (sec): 4.83 - samples/sec: 7549.30 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:42:53,397 epoch 2 - iter 441/1476 - loss 0.47995919 - time (sec): 7.23 - samples/sec: 7321.02 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:42:55,762 epoch 2 - iter 588/1476 - loss 0.47927875 - time (sec): 9.59 - samples/sec: 7245.34 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:42:58,037 epoch 2 - iter 735/1476 - loss 0.47349392 - time (sec): 11.87 - samples/sec: 7110.07 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:43:00,393 epoch 2 - iter 882/1476 - loss 0.47046838 - time (sec): 14.22 - samples/sec: 7145.73 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:43:02,764 epoch 2 - iter 1029/1476 - loss 0.45588573 - time (sec): 16.59 - samples/sec: 7206.55 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:43:05,048 epoch 2 - iter 1176/1476 - loss 0.45406832 - time (sec): 18.88 - samples/sec: 7118.13 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:43:07,287 epoch 2 - iter 1323/1476 - loss 0.44810182 - time (sec): 21.12 - samples/sec: 7086.86 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:43:09,615 epoch 2 - iter 1470/1476 - loss 0.44252694 - time (sec): 23.44 - samples/sec: 7074.16 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:43:09,700 ----------------------------------------------------------------------------------------------------
2023-10-18 18:43:09,700 EPOCH 2 done: loss 0.4425 - lr: 0.000027
2023-10-18 18:43:17,042 DEV : loss 0.31516316533088684 - f1-score (micro avg)  0.2966
2023-10-18 18:43:17,065 saving best model
2023-10-18 18:43:17,099 ----------------------------------------------------------------------------------------------------
2023-10-18 18:43:19,471 epoch 3 - iter 147/1476 - loss 0.37653743 - time (sec): 2.37 - samples/sec: 7173.15 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:43:21,856 epoch 3 - iter 294/1476 - loss 0.38356178 - time (sec): 4.76 - samples/sec: 7220.20 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:43:24,194 epoch 3 - iter 441/1476 - loss 0.38400063 - time (sec): 7.09 - samples/sec: 7109.22 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:43:26,526 epoch 3 - iter 588/1476 - loss 0.37943481 - time (sec): 9.43 - samples/sec: 7064.00 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:43:28,778 epoch 3 - iter 735/1476 - loss 0.38111007 - time (sec): 11.68 - samples/sec: 6886.94 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:43:31,090 epoch 3 - iter 882/1476 - loss 0.37841769 - time (sec): 13.99 - samples/sec: 6911.81 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:43:33,464 epoch 3 - iter 1029/1476 - loss 0.36828271 - time (sec): 16.36 - samples/sec: 6962.24 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:43:35,777 epoch 3 - iter 1176/1476 - loss 0.37002507 - time (sec): 18.68 - samples/sec: 7063.49 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:43:38,053 epoch 3 - iter 1323/1476 - loss 0.37070190 - time (sec): 20.95 - samples/sec: 7070.45 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:43:40,424 epoch 3 - iter 1470/1476 - loss 0.37243905 - time (sec): 23.32 - samples/sec: 7107.03 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:43:40,511 ----------------------------------------------------------------------------------------------------
2023-10-18 18:43:40,511 EPOCH 3 done: loss 0.3731 - lr: 0.000023
2023-10-18 18:43:47,546 DEV : loss 0.28877052664756775 - f1-score (micro avg)  0.3558
2023-10-18 18:43:47,570 saving best model
2023-10-18 18:43:47,607 ----------------------------------------------------------------------------------------------------
2023-10-18 18:43:50,005 epoch 4 - iter 147/1476 - loss 0.34367342 - time (sec): 2.40 - samples/sec: 7349.88 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:43:52,283 epoch 4 - iter 294/1476 - loss 0.32751733 - time (sec): 4.68 - samples/sec: 7008.91 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:43:54,604 epoch 4 - iter 441/1476 - loss 0.33681949 - time (sec): 7.00 - samples/sec: 6977.62 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:43:56,781 epoch 4 - iter 588/1476 - loss 0.33417208 - time (sec): 9.17 - samples/sec: 7228.72 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:43:58,893 epoch 4 - iter 735/1476 - loss 0.32979133 - time (sec): 11.29 - samples/sec: 7480.23 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:44:01,204 epoch 4 - iter 882/1476 - loss 0.33930172 - time (sec): 13.60 - samples/sec: 7433.45 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:44:03,533 epoch 4 - iter 1029/1476 - loss 0.33934197 - time (sec): 15.93 - samples/sec: 7378.33 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:44:05,902 epoch 4 - iter 1176/1476 - loss 0.33984314 - time (sec): 18.29 - samples/sec: 7307.51 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:44:08,208 epoch 4 - iter 1323/1476 - loss 0.33833182 - time (sec): 20.60 - samples/sec: 7277.07 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:44:10,567 epoch 4 - iter 1470/1476 - loss 0.33694420 - time (sec): 22.96 - samples/sec: 7221.17 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:44:10,651 ----------------------------------------------------------------------------------------------------
2023-10-18 18:44:10,651 EPOCH 4 done: loss 0.3368 - lr: 0.000020
2023-10-18 18:44:17,690 DEV : loss 0.28429877758026123 - f1-score (micro avg)  0.3935
2023-10-18 18:44:17,715 saving best model
2023-10-18 18:44:17,749 ----------------------------------------------------------------------------------------------------
2023-10-18 18:44:20,063 epoch 5 - iter 147/1476 - loss 0.32343183 - time (sec): 2.31 - samples/sec: 7010.69 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:44:22,406 epoch 5 - iter 294/1476 - loss 0.31723666 - time (sec): 4.66 - samples/sec: 6933.10 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:44:24,747 epoch 5 - iter 441/1476 - loss 0.31507297 - time (sec): 7.00 - samples/sec: 6946.03 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:44:27,237 epoch 5 - iter 588/1476 - loss 0.31066895 - time (sec): 9.49 - samples/sec: 7185.66 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:44:29,596 epoch 5 - iter 735/1476 - loss 0.31208579 - time (sec): 11.85 - samples/sec: 7206.13 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:44:32,023 epoch 5 - iter 882/1476 - loss 0.31223448 - time (sec): 14.27 - samples/sec: 7080.60 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:44:34,374 epoch 5 - iter 1029/1476 - loss 0.31346772 - time (sec): 16.62 - samples/sec: 6926.93 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:44:36,703 epoch 5 - iter 1176/1476 - loss 0.31351190 - time (sec): 18.95 - samples/sec: 6893.34 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:44:39,107 epoch 5 - iter 1323/1476 - loss 0.31194294 - time (sec): 21.36 - samples/sec: 6974.05 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:44:41,419 epoch 5 - iter 1470/1476 - loss 0.31206302 - time (sec): 23.67 - samples/sec: 7004.06 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:44:41,510 ----------------------------------------------------------------------------------------------------
2023-10-18 18:44:41,511 EPOCH 5 done: loss 0.3121 - lr: 0.000017
2023-10-18 18:44:48,628 DEV : loss 0.26452046632766724 - f1-score (micro avg)  0.4284
2023-10-18 18:44:48,654 saving best model
2023-10-18 18:44:48,691 ----------------------------------------------------------------------------------------------------
2023-10-18 18:44:51,083 epoch 6 - iter 147/1476 - loss 0.30324497 - time (sec): 2.39 - samples/sec: 7532.60 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:44:53,512 epoch 6 - iter 294/1476 - loss 0.28835079 - time (sec): 4.82 - samples/sec: 7249.21 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:44:55,909 epoch 6 - iter 441/1476 - loss 0.28662018 - time (sec): 7.22 - samples/sec: 7110.57 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:44:58,264 epoch 6 - iter 588/1476 - loss 0.29081123 - time (sec): 9.57 - samples/sec: 7068.00 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:45:00,689 epoch 6 - iter 735/1476 - loss 0.28695726 - time (sec): 12.00 - samples/sec: 7051.66 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:45:03,003 epoch 6 - iter 882/1476 - loss 0.29455832 - time (sec): 14.31 - samples/sec: 6931.39 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:45:05,403 epoch 6 - iter 1029/1476 - loss 0.29671075 - time (sec): 16.71 - samples/sec: 6946.77 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:45:07,722 epoch 6 - iter 1176/1476 - loss 0.29458444 - time (sec): 19.03 - samples/sec: 6913.91 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:45:10,046 epoch 6 - iter 1323/1476 - loss 0.29287569 - time (sec): 21.35 - samples/sec: 6903.39 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:45:12,430 epoch 6 - iter 1470/1476 - loss 0.29310511 - time (sec): 23.74 - samples/sec: 6974.93 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:45:12,531 ----------------------------------------------------------------------------------------------------
2023-10-18 18:45:12,531 EPOCH 6 done: loss 0.2924 - lr: 0.000013
2023-10-18 18:45:19,620 DEV : loss 0.25417211651802063 - f1-score (micro avg)  0.4487
2023-10-18 18:45:19,645 saving best model
2023-10-18 18:45:19,689 ----------------------------------------------------------------------------------------------------
2023-10-18 18:45:22,089 epoch 7 - iter 147/1476 - loss 0.26935127 - time (sec): 2.40 - samples/sec: 6983.45 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:45:24,410 epoch 7 - iter 294/1476 - loss 0.28105637 - time (sec): 4.72 - samples/sec: 6999.27 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:45:26,729 epoch 7 - iter 441/1476 - loss 0.26761446 - time (sec): 7.04 - samples/sec: 6984.77 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:45:29,056 epoch 7 - iter 588/1476 - loss 0.27708379 - time (sec): 9.37 - samples/sec: 6995.49 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:45:31,434 epoch 7 - iter 735/1476 - loss 0.27588589 - time (sec): 11.74 - samples/sec: 7045.15 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:45:33,886 epoch 7 - iter 882/1476 - loss 0.27477675 - time (sec): 14.20 - samples/sec: 6994.89 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:45:36,251 epoch 7 - iter 1029/1476 - loss 0.27432064 - time (sec): 16.56 - samples/sec: 6982.17 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:45:38,753 epoch 7 - iter 1176/1476 - loss 0.27174772 - time (sec): 19.06 - samples/sec: 6939.55 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:45:41,236 epoch 7 - iter 1323/1476 - loss 0.27271128 - time (sec): 21.55 - samples/sec: 6912.42 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:45:43,620 epoch 7 - iter 1470/1476 - loss 0.27459010 - time (sec): 23.93 - samples/sec: 6932.62 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:45:43,716 ----------------------------------------------------------------------------------------------------
2023-10-18 18:45:43,717 EPOCH 7 done: loss 0.2748 - lr: 0.000010
2023-10-18 18:45:50,837 DEV : loss 0.2561444044113159 - f1-score (micro avg)  0.4575
2023-10-18 18:45:50,862 saving best model
2023-10-18 18:45:50,900 ----------------------------------------------------------------------------------------------------
2023-10-18 18:45:53,369 epoch 8 - iter 147/1476 - loss 0.27413252 - time (sec): 2.47 - samples/sec: 8275.39 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:45:55,733 epoch 8 - iter 294/1476 - loss 0.27388149 - time (sec): 4.83 - samples/sec: 7719.31 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:45:58,088 epoch 8 - iter 441/1476 - loss 0.27274651 - time (sec): 7.19 - samples/sec: 7394.54 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:46:00,809 epoch 8 - iter 588/1476 - loss 0.27643641 - time (sec): 9.91 - samples/sec: 7042.61 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:46:03,084 epoch 8 - iter 735/1476 - loss 0.27009636 - time (sec): 12.18 - samples/sec: 6960.18 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:46:05,465 epoch 8 - iter 882/1476 - loss 0.26957002 - time (sec): 14.56 - samples/sec: 6919.68 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:46:07,775 epoch 8 - iter 1029/1476 - loss 0.26747275 - time (sec): 16.87 - samples/sec: 6925.56 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:46:10,253 epoch 8 - iter 1176/1476 - loss 0.26851629 - time (sec): 19.35 - samples/sec: 6915.17 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:46:12,590 epoch 8 - iter 1323/1476 - loss 0.26905409 - time (sec): 21.69 - samples/sec: 6894.10 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:46:14,955 epoch 8 - iter 1470/1476 - loss 0.26863538 - time (sec): 24.05 - samples/sec: 6896.96 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:46:15,045 ----------------------------------------------------------------------------------------------------
2023-10-18 18:46:15,045 EPOCH 8 done: loss 0.2688 - lr: 0.000007
2023-10-18 18:46:22,285 DEV : loss 0.24966885149478912 - f1-score (micro avg)  0.4711
2023-10-18 18:46:22,310 saving best model
2023-10-18 18:46:22,350 ----------------------------------------------------------------------------------------------------
2023-10-18 18:46:24,624 epoch 9 - iter 147/1476 - loss 0.26497575 - time (sec): 2.27 - samples/sec: 6837.90 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:46:26,935 epoch 9 - iter 294/1476 - loss 0.25295697 - time (sec): 4.58 - samples/sec: 6872.19 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:46:29,313 epoch 9 - iter 441/1476 - loss 0.26741752 - time (sec): 6.96 - samples/sec: 7248.47 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:46:31,663 epoch 9 - iter 588/1476 - loss 0.26622468 - time (sec): 9.31 - samples/sec: 7194.86 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:46:33,964 epoch 9 - iter 735/1476 - loss 0.26781353 - time (sec): 11.61 - samples/sec: 7123.94 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:46:36,257 epoch 9 - iter 882/1476 - loss 0.26776521 - time (sec): 13.91 - samples/sec: 7083.45 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:46:38,629 epoch 9 - iter 1029/1476 - loss 0.26644524 - time (sec): 16.28 - samples/sec: 7099.69 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:46:40,957 epoch 9 - iter 1176/1476 - loss 0.26542276 - time (sec): 18.61 - samples/sec: 7091.97 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:46:43,289 epoch 9 - iter 1323/1476 - loss 0.26480558 - time (sec): 20.94 - samples/sec: 7023.52 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:46:45,685 epoch 9 - iter 1470/1476 - loss 0.26287148 - time (sec): 23.33 - samples/sec: 7090.80 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:46:45,789 ----------------------------------------------------------------------------------------------------
2023-10-18 18:46:45,789 EPOCH 9 done: loss 0.2622 - lr: 0.000003
2023-10-18 18:46:52,888 DEV : loss 0.2507059872150421 - f1-score (micro avg)  0.4744
2023-10-18 18:46:52,913 saving best model
2023-10-18 18:46:52,955 ----------------------------------------------------------------------------------------------------
2023-10-18 18:46:55,376 epoch 10 - iter 147/1476 - loss 0.22586996 - time (sec): 2.42 - samples/sec: 7307.73 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:46:57,716 epoch 10 - iter 294/1476 - loss 0.23115650 - time (sec): 4.76 - samples/sec: 7185.30 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:47:00,090 epoch 10 - iter 441/1476 - loss 0.24322112 - time (sec): 7.13 - samples/sec: 7118.39 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:47:02,428 epoch 10 - iter 588/1476 - loss 0.25190140 - time (sec): 9.47 - samples/sec: 6993.68 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:47:04,802 epoch 10 - iter 735/1476 - loss 0.25558653 - time (sec): 11.85 - samples/sec: 7004.81 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:47:07,197 epoch 10 - iter 882/1476 - loss 0.25425158 - time (sec): 14.24 - samples/sec: 7056.10 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:47:09,373 epoch 10 - iter 1029/1476 - loss 0.25465722 - time (sec): 16.42 - samples/sec: 7091.58 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:47:11,469 epoch 10 - iter 1176/1476 - loss 0.25966368 - time (sec): 18.51 - samples/sec: 7244.08 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:47:13,530 epoch 10 - iter 1323/1476 - loss 0.25860524 - time (sec): 20.57 - samples/sec: 7315.70 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:47:15,573 epoch 10 - iter 1470/1476 - loss 0.25742169 - time (sec): 22.62 - samples/sec: 7336.12 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:47:15,652 ----------------------------------------------------------------------------------------------------
2023-10-18 18:47:15,652 EPOCH 10 done: loss 0.2572 - lr: 0.000000
2023-10-18 18:47:22,783 DEV : loss 0.24806223809719086 - f1-score (micro avg)  0.4774
2023-10-18 18:47:22,808 saving best model
2023-10-18 18:47:22,872 ----------------------------------------------------------------------------------------------------
2023-10-18 18:47:22,872 Loading model from best epoch ...
2023-10-18 18:47:22,947 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-18 18:47:25,503
Results:
- F-score (micro) 0.5021
- F-score (macro) 0.2868
- Accuracy 0.3569

By class:
              precision    recall  f1-score   support

         loc     0.5109    0.7343    0.6026       858
        pers     0.3793    0.4767    0.4224       537
         org     0.2000    0.0303    0.0526       132
        time     0.3830    0.3333    0.3564        54
        prod     0.0000    0.0000    0.0000        61

   micro avg     0.4597    0.5530    0.5021      1642
   macro avg     0.2946    0.3149    0.2868      1642
weighted avg     0.4197    0.5530    0.4690      1642
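The averaged rows of the test report can be sanity-checked from the per-class numbers. A small sketch (the per-class values are copied from the log; micro precision/recall are taken from the micro-avg row itself, since the raw TP/FP counts are not logged):

```python
# Recompute the averaged F1 scores from the per-class report above.
# Per-class (precision, recall, f1, support), copied from the log.
per_class = {
    "loc":  (0.5109, 0.7343, 0.6026, 858),
    "pers": (0.3793, 0.4767, 0.4224, 537),
    "org":  (0.2000, 0.0303, 0.0526, 132),
    "time": (0.3830, 0.3333, 0.3564, 54),
    "prod": (0.0000, 0.0000, 0.0000, 61),
}

def f1(p, r):
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(v[2] for v in per_class.values()) / len(per_class)

# Micro average: harmonic mean of the pooled precision/recall.
micro_f1 = f1(0.4597, 0.5530)

print(macro_f1, micro_f1)  # ~0.2868 and ~0.5021, matching the report
```

Note how the macro average is dragged down by the rare org and prod classes (0 of 61 prod entities found), while the micro average is dominated by the frequent loc and pers classes.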
2023-10-18 18:47:25,503 ----------------------------------------------------------------------------------------------------