2023-10-19 00:55:08,399 ----------------------------------------------------------------------------------------------------
2023-10-19 00:55:08,399 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 00:55:08,399 ----------------------------------------------------------------------------------------------------
2023-10-19 00:55:08,399 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-19 00:55:08,399 ----------------------------------------------------------------------------------------------------
2023-10-19 00:55:08,399 Train: 14465 sentences
2023-10-19 00:55:08,399 (train_with_dev=False, train_with_test=False)
2023-10-19 00:55:08,399 ----------------------------------------------------------------------------------------------------
2023-10-19 00:55:08,399 Training Params:
2023-10-19 00:55:08,399 - learning_rate: "3e-05"
2023-10-19 00:55:08,399 - mini_batch_size: "8"
2023-10-19 00:55:08,400 - max_epochs: "10"
2023-10-19 00:55:08,400 - shuffle: "True"
2023-10-19 00:55:08,400 ----------------------------------------------------------------------------------------------------
2023-10-19 00:55:08,400 Plugins:
2023-10-19 00:55:08,400 - TensorboardLogger
2023-10-19 00:55:08,400 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 00:55:08,400 ----------------------------------------------------------------------------------------------------
2023-10-19 00:55:08,400 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 00:55:08,400 - metric: "('micro avg', 'f1-score')"
2023-10-19 00:55:08,400 ----------------------------------------------------------------------------------------------------
2023-10-19 00:55:08,400 Computation:
2023-10-19 00:55:08,400 - compute on device: cuda:0
2023-10-19 00:55:08,400 - embedding storage: none
2023-10-19 00:55:08,400 ----------------------------------------------------------------------------------------------------
2023-10-19 00:55:08,400 Model training base path: "hmbench-letemps/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-19 00:55:08,400 ----------------------------------------------------------------------------------------------------
2023-10-19 00:55:08,400 ----------------------------------------------------------------------------------------------------
2023-10-19 00:55:08,400 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 00:55:12,651 epoch 1 - iter 180/1809 - loss 3.02059904 - time (sec): 4.25 - samples/sec: 9147.25 - lr: 0.000003 - momentum: 0.000000
2023-10-19 00:55:16,860 epoch 1 - iter 360/1809 - loss 2.59240564 - time (sec): 8.46 - samples/sec: 8968.75 - lr: 0.000006 - momentum: 0.000000
2023-10-19 00:55:21,151 epoch 1 - iter 540/1809 - loss 2.03891534 - time (sec): 12.75 - samples/sec: 8948.76 - lr: 0.000009 - momentum: 0.000000
2023-10-19 00:55:25,390 epoch 1 - iter 720/1809 - loss 1.63251111 - time (sec): 16.99 - samples/sec: 8927.94 - lr: 0.000012 - momentum: 0.000000
2023-10-19 00:55:29,608 epoch 1 - iter 900/1809 - loss 1.36226435 - time (sec): 21.21 - samples/sec: 9018.73 - lr: 0.000015 - momentum: 0.000000
2023-10-19 00:55:33,791 epoch 1 - iter 1080/1809 - loss 1.18357407 - time (sec): 25.39 - samples/sec: 9057.79 - lr: 0.000018 - momentum: 0.000000
2023-10-19 00:55:37,901 epoch 1 - iter 1260/1809 - loss 1.06438656 - time (sec): 29.50 - samples/sec: 9006.09 - lr: 0.000021 - momentum: 0.000000
2023-10-19 00:55:42,120 epoch 1 - iter 1440/1809 - loss 0.96650038 - time (sec): 33.72 - samples/sec: 8998.06 - lr: 0.000024 - momentum: 0.000000
2023-10-19 00:55:46,289 epoch 1 - iter 1620/1809 - loss 0.88699189 - time (sec): 37.89 - samples/sec: 8986.47 - lr: 0.000027 - momentum: 0.000000
2023-10-19 00:55:50,495 epoch 1 - iter 1800/1809 - loss 0.82250609 - time (sec): 42.10 - samples/sec: 8995.27 - lr: 0.000030 - momentum: 0.000000
2023-10-19 00:55:50,684 ----------------------------------------------------------------------------------------------------
2023-10-19 00:55:50,684 EPOCH 1 done: loss 0.8204 - lr: 0.000030
2023-10-19 00:55:52,944 DEV : loss 0.18831811845302582 - f1-score (micro avg) 0.1008
2023-10-19 00:55:52,971 saving best model
2023-10-19 00:55:53,001 ----------------------------------------------------------------------------------------------------
2023-10-19 00:55:57,197 epoch 2 - iter 180/1809 - loss 0.23379582 - time (sec): 4.20 - samples/sec: 8960.12 - lr: 0.000030 - momentum: 0.000000
2023-10-19 00:56:01,382 epoch 2 - iter 360/1809 - loss 0.22598926 - time (sec): 8.38 - samples/sec: 9041.69 - lr: 0.000029 - momentum: 0.000000
2023-10-19 00:56:06,265 epoch 2 - iter 540/1809 - loss 0.21606913 - time (sec): 13.26 - samples/sec: 8611.22 - lr: 0.000029 - momentum: 0.000000
2023-10-19 00:56:10,362 epoch 2 - iter 720/1809 - loss 0.21128402 - time (sec): 17.36 - samples/sec: 8626.96 - lr: 0.000029 - momentum: 0.000000
2023-10-19 00:56:14,495 epoch 2 - iter 900/1809 - loss 0.20916640 - time (sec): 21.49 - samples/sec: 8655.64 - lr: 0.000028 - momentum: 0.000000
2023-10-19 00:56:18,779 epoch 2 - iter 1080/1809 - loss 0.20755304 - time (sec): 25.78 - samples/sec: 8741.45 - lr: 0.000028 - momentum: 0.000000
2023-10-19 00:56:22,933 epoch 2 - iter 1260/1809 - loss 0.20505981 - time (sec): 29.93 - samples/sec: 8787.58 - lr: 0.000028 - momentum: 0.000000
2023-10-19 00:56:27,097 epoch 2 - iter 1440/1809 - loss 0.20371547 - time (sec): 34.10 - samples/sec: 8800.24 - lr: 0.000027 - momentum: 0.000000
2023-10-19 00:56:31,349 epoch 2 - iter 1620/1809 - loss 0.20010663 - time (sec): 38.35 - samples/sec: 8831.85 - lr: 0.000027 - momentum: 0.000000
2023-10-19 00:56:35,594 epoch 2 - iter 1800/1809 - loss 0.19871096 - time (sec): 42.59 - samples/sec: 8875.93 - lr: 0.000027 - momentum: 0.000000
2023-10-19 00:56:35,803 ----------------------------------------------------------------------------------------------------
2023-10-19 00:56:35,803 EPOCH 2 done: loss 0.1987 - lr: 0.000027
2023-10-19 00:56:38,999 DEV : loss 0.1624661087989807 - f1-score (micro avg) 0.3303
2023-10-19 00:56:39,026 saving best model
2023-10-19 00:56:39,058 ----------------------------------------------------------------------------------------------------
2023-10-19 00:56:43,380 epoch 3 - iter 180/1809 - loss 0.16077504 - time (sec): 4.32 - samples/sec: 8719.75 - lr: 0.000026 - momentum: 0.000000
2023-10-19 00:56:47,669 epoch 3 - iter 360/1809 - loss 0.16133266 - time (sec): 8.61 - samples/sec: 8780.24 - lr: 0.000026 - momentum: 0.000000
2023-10-19 00:56:52,002 epoch 3 - iter 540/1809 - loss 0.16796854 - time (sec): 12.94 - samples/sec: 8782.24 - lr: 0.000026 - momentum: 0.000000
2023-10-19 00:56:56,217 epoch 3 - iter 720/1809 - loss 0.17031261 - time (sec): 17.16 - samples/sec: 8823.88 - lr: 0.000025 - momentum: 0.000000
2023-10-19 00:57:00,632 epoch 3 - iter 900/1809 - loss 0.16734894 - time (sec): 21.57 - samples/sec: 8814.43 - lr: 0.000025 - momentum: 0.000000
2023-10-19 00:57:04,969 epoch 3 - iter 1080/1809 - loss 0.16844113 - time (sec): 25.91 - samples/sec: 8772.84 - lr: 0.000025 - momentum: 0.000000
2023-10-19 00:57:09,257 epoch 3 - iter 1260/1809 - loss 0.16814282 - time (sec): 30.20 - samples/sec: 8801.78 - lr: 0.000024 - momentum: 0.000000
2023-10-19 00:57:13,430 epoch 3 - iter 1440/1809 - loss 0.16710162 - time (sec): 34.37 - samples/sec: 8831.62 - lr: 0.000024 - momentum: 0.000000
2023-10-19 00:57:17,630 epoch 3 - iter 1620/1809 - loss 0.16523955 - time (sec): 38.57 - samples/sec: 8838.63 - lr: 0.000024 - momentum: 0.000000
2023-10-19 00:57:21,905 epoch 3 - iter 1800/1809 - loss 0.16521378 - time (sec): 42.85 - samples/sec: 8826.30 - lr: 0.000023 - momentum: 0.000000
2023-10-19 00:57:22,108 ----------------------------------------------------------------------------------------------------
2023-10-19 00:57:22,108 EPOCH 3 done: loss 0.1650 - lr: 0.000023
2023-10-19 00:57:25,860 DEV : loss 0.1545405089855194 - f1-score (micro avg) 0.3642
2023-10-19 00:57:25,887 saving best model
2023-10-19 00:57:25,925 ----------------------------------------------------------------------------------------------------
2023-10-19 00:57:30,131 epoch 4 - iter 180/1809 - loss 0.15355927 - time (sec): 4.21 - samples/sec: 8784.05 - lr: 0.000023 - momentum: 0.000000
2023-10-19 00:57:34,395 epoch 4 - iter 360/1809 - loss 0.15216298 - time (sec): 8.47 - samples/sec: 8948.25 - lr: 0.000023 - momentum: 0.000000
2023-10-19 00:57:38,701 epoch 4 - iter 540/1809 - loss 0.15726329 - time (sec): 12.78 - samples/sec: 8876.64 - lr: 0.000022 - momentum: 0.000000
2023-10-19 00:57:42,969 epoch 4 - iter 720/1809 - loss 0.15406047 - time (sec): 17.04 - samples/sec: 8893.59 - lr: 0.000022 - momentum: 0.000000
2023-10-19 00:57:47,289 epoch 4 - iter 900/1809 - loss 0.15371008 - time (sec): 21.36 - samples/sec: 8883.62 - lr: 0.000022 - momentum: 0.000000
2023-10-19 00:57:51,506 epoch 4 - iter 1080/1809 - loss 0.15297029 - time (sec): 25.58 - samples/sec: 8871.45 - lr: 0.000021 - momentum: 0.000000
2023-10-19 00:57:55,625 epoch 4 - iter 1260/1809 - loss 0.15162532 - time (sec): 29.70 - samples/sec: 8847.34 - lr: 0.000021 - momentum: 0.000000
2023-10-19 00:57:59,903 epoch 4 - iter 1440/1809 - loss 0.15023300 - time (sec): 33.98 - samples/sec: 8887.31 - lr: 0.000021 - momentum: 0.000000
2023-10-19 00:58:04,111 epoch 4 - iter 1620/1809 - loss 0.14985131 - time (sec): 38.19 - samples/sec: 8936.38 - lr: 0.000020 - momentum: 0.000000
2023-10-19 00:58:08,343 epoch 4 - iter 1800/1809 - loss 0.14967964 - time (sec): 42.42 - samples/sec: 8912.53 - lr: 0.000020 - momentum: 0.000000
2023-10-19 00:58:08,544 ----------------------------------------------------------------------------------------------------
2023-10-19 00:58:08,544 EPOCH 4 done: loss 0.1497 - lr: 0.000020
2023-10-19 00:58:11,752 DEV : loss 0.15396162867546082 - f1-score (micro avg) 0.4096
2023-10-19 00:58:11,781 saving best model
2023-10-19 00:58:11,817 ----------------------------------------------------------------------------------------------------
2023-10-19 00:58:16,031 epoch 5 - iter 180/1809 - loss 0.15330329 - time (sec): 4.21 - samples/sec: 8377.02 - lr: 0.000020 - momentum: 0.000000
2023-10-19 00:58:20,408 epoch 5 - iter 360/1809 - loss 0.14613552 - time (sec): 8.59 - samples/sec: 8656.53 - lr: 0.000019 - momentum: 0.000000
2023-10-19 00:58:24,697 epoch 5 - iter 540/1809 - loss 0.13629708 - time (sec): 12.88 - samples/sec: 8629.68 - lr: 0.000019 - momentum: 0.000000
2023-10-19 00:58:29,017 epoch 5 - iter 720/1809 - loss 0.13634111 - time (sec): 17.20 - samples/sec: 8693.16 - lr: 0.000019 - momentum: 0.000000
2023-10-19 00:58:33,228 epoch 5 - iter 900/1809 - loss 0.13677079 - time (sec): 21.41 - samples/sec: 8693.05 - lr: 0.000018 - momentum: 0.000000
2023-10-19 00:58:37,395 epoch 5 - iter 1080/1809 - loss 0.13705700 - time (sec): 25.58 - samples/sec: 8799.68 - lr: 0.000018 - momentum: 0.000000
2023-10-19 00:58:41,676 epoch 5 - iter 1260/1809 - loss 0.13678143 - time (sec): 29.86 - samples/sec: 8871.29 - lr: 0.000018 - momentum: 0.000000
2023-10-19 00:58:45,897 epoch 5 - iter 1440/1809 - loss 0.13770878 - time (sec): 34.08 - samples/sec: 8873.33 - lr: 0.000017 - momentum: 0.000000
2023-10-19 00:58:50,129 epoch 5 - iter 1620/1809 - loss 0.13719641 - time (sec): 38.31 - samples/sec: 8900.11 - lr: 0.000017 - momentum: 0.000000
2023-10-19 00:58:54,302 epoch 5 - iter 1800/1809 - loss 0.13755607 - time (sec): 42.48 - samples/sec: 8901.51 - lr: 0.000017 - momentum: 0.000000
2023-10-19 00:58:54,513 ----------------------------------------------------------------------------------------------------
2023-10-19 00:58:54,513 EPOCH 5 done: loss 0.1376 - lr: 0.000017
2023-10-19 00:58:58,318 DEV : loss 0.15693862736225128 - f1-score (micro avg) 0.4322
2023-10-19 00:58:58,347 saving best model
2023-10-19 00:58:58,387 ----------------------------------------------------------------------------------------------------
2023-10-19 00:59:02,681 epoch 6 - iter 180/1809 - loss 0.12354942 - time (sec): 4.29 - samples/sec: 8938.29 - lr: 0.000016 - momentum: 0.000000
2023-10-19 00:59:06,923 epoch 6 - iter 360/1809 - loss 0.12329277 - time (sec): 8.54 - samples/sec: 8790.19 - lr: 0.000016 - momentum: 0.000000
2023-10-19 00:59:11,152 epoch 6 - iter 540/1809 - loss 0.12382227 - time (sec): 12.76 - samples/sec: 8880.49 - lr: 0.000016 - momentum: 0.000000
2023-10-19 00:59:15,382 epoch 6 - iter 720/1809 - loss 0.12812576 - time (sec): 16.99 - samples/sec: 8824.66 - lr: 0.000015 - momentum: 0.000000
2023-10-19 00:59:19,569 epoch 6 - iter 900/1809 - loss 0.12995777 - time (sec): 21.18 - samples/sec: 8882.48 - lr: 0.000015 - momentum: 0.000000
2023-10-19 00:59:23,761 epoch 6 - iter 1080/1809 - loss 0.13162934 - time (sec): 25.37 - samples/sec: 8897.00 - lr: 0.000015 - momentum: 0.000000
2023-10-19 00:59:28,006 epoch 6 - iter 1260/1809 - loss 0.13296655 - time (sec): 29.62 - samples/sec: 8933.26 - lr: 0.000014 - momentum: 0.000000
2023-10-19 00:59:32,166 epoch 6 - iter 1440/1809 - loss 0.13302667 - time (sec): 33.78 - samples/sec: 8916.02 - lr: 0.000014 - momentum: 0.000000
2023-10-19 00:59:36,493 epoch 6 - iter 1620/1809 - loss 0.13312201 - time (sec): 38.11 - samples/sec: 8905.70 - lr: 0.000014 - momentum: 0.000000
2023-10-19 00:59:40,713 epoch 6 - iter 1800/1809 - loss 0.13199896 - time (sec): 42.33 - samples/sec: 8919.97 - lr: 0.000013 - momentum: 0.000000
2023-10-19 00:59:40,938 ----------------------------------------------------------------------------------------------------
2023-10-19 00:59:40,938 EPOCH 6 done: loss 0.1319 - lr: 0.000013
2023-10-19 00:59:44,140 DEV : loss 0.15839844942092896 - f1-score (micro avg) 0.4447
2023-10-19 00:59:44,170 saving best model
2023-10-19 00:59:44,208 ----------------------------------------------------------------------------------------------------
2023-10-19 00:59:48,386 epoch 7 - iter 180/1809 - loss 0.12883141 - time (sec): 4.18 - samples/sec: 9157.87 - lr: 0.000013 - momentum: 0.000000
2023-10-19 00:59:52,409 epoch 7 - iter 360/1809 - loss 0.12400669 - time (sec): 8.20 - samples/sec: 9283.73 - lr: 0.000013 - momentum: 0.000000
2023-10-19 00:59:56,667 epoch 7 - iter 540/1809 - loss 0.12259146 - time (sec): 12.46 - samples/sec: 9152.68 - lr: 0.000012 - momentum: 0.000000
2023-10-19 01:00:00,980 epoch 7 - iter 720/1809 - loss 0.12350739 - time (sec): 16.77 - samples/sec: 9051.03 - lr: 0.000012 - momentum: 0.000000
2023-10-19 01:00:05,186 epoch 7 - iter 900/1809 - loss 0.12312114 - time (sec): 20.98 - samples/sec: 9025.80 - lr: 0.000012 - momentum: 0.000000
2023-10-19 01:00:09,488 epoch 7 - iter 1080/1809 - loss 0.12684864 - time (sec): 25.28 - samples/sec: 8956.39 - lr: 0.000011 - momentum: 0.000000
2023-10-19 01:00:13,954 epoch 7 - iter 1260/1809 - loss 0.12759953 - time (sec): 29.74 - samples/sec: 8905.46 - lr: 0.000011 - momentum: 0.000000
2023-10-19 01:00:18,107 epoch 7 - iter 1440/1809 - loss 0.12647126 - time (sec): 33.90 - samples/sec: 8894.61 - lr: 0.000011 - momentum: 0.000000
2023-10-19 01:00:22,429 epoch 7 - iter 1620/1809 - loss 0.12603946 - time (sec): 38.22 - samples/sec: 8877.74 - lr: 0.000010 - momentum: 0.000000
2023-10-19 01:00:26,698 epoch 7 - iter 1800/1809 - loss 0.12487945 - time (sec): 42.49 - samples/sec: 8893.36 - lr: 0.000010 - momentum: 0.000000
2023-10-19 01:00:26,911 ----------------------------------------------------------------------------------------------------
2023-10-19 01:00:26,911 EPOCH 7 done: loss 0.1249 - lr: 0.000010
2023-10-19 01:00:30,696 DEV : loss 0.15553328394889832 - f1-score (micro avg) 0.4456
2023-10-19 01:00:30,724 saving best model
2023-10-19 01:00:30,757 ----------------------------------------------------------------------------------------------------
2023-10-19 01:00:35,031 epoch 8 - iter 180/1809 - loss 0.11513408 - time (sec): 4.27 - samples/sec: 9108.93 - lr: 0.000010 - momentum: 0.000000
2023-10-19 01:00:39,276 epoch 8 - iter 360/1809 - loss 0.11403661 - time (sec): 8.52 - samples/sec: 9164.60 - lr: 0.000009 - momentum: 0.000000
2023-10-19 01:00:43,549 epoch 8 - iter 540/1809 - loss 0.11605998 - time (sec): 12.79 - samples/sec: 9073.43 - lr: 0.000009 - momentum: 0.000000
2023-10-19 01:00:47,776 epoch 8 - iter 720/1809 - loss 0.11460389 - time (sec): 17.02 - samples/sec: 9048.75 - lr: 0.000009 - momentum: 0.000000
2023-10-19 01:00:52,038 epoch 8 - iter 900/1809 - loss 0.11892077 - time (sec): 21.28 - samples/sec: 9056.00 - lr: 0.000008 - momentum: 0.000000
2023-10-19 01:00:56,269 epoch 8 - iter 1080/1809 - loss 0.11833919 - time (sec): 25.51 - samples/sec: 9032.48 - lr: 0.000008 - momentum: 0.000000
2023-10-19 01:01:00,443 epoch 8 - iter 1260/1809 - loss 0.11862510 - time (sec): 29.68 - samples/sec: 9007.98 - lr: 0.000008 - momentum: 0.000000
2023-10-19 01:01:04,678 epoch 8 - iter 1440/1809 - loss 0.11840134 - time (sec): 33.92 - samples/sec: 8974.47 - lr: 0.000007 - momentum: 0.000000
2023-10-19 01:01:08,829 epoch 8 - iter 1620/1809 - loss 0.11883557 - time (sec): 38.07 - samples/sec: 8985.37 - lr: 0.000007 - momentum: 0.000000
2023-10-19 01:01:13,012 epoch 8 - iter 1800/1809 - loss 0.11942175 - time (sec): 42.25 - samples/sec: 8943.81 - lr: 0.000007 - momentum: 0.000000
2023-10-19 01:01:13,228 ----------------------------------------------------------------------------------------------------
2023-10-19 01:01:13,228 EPOCH 8 done: loss 0.1192 - lr: 0.000007
2023-10-19 01:01:16,438 DEV : loss 0.16024847328662872 - f1-score (micro avg) 0.4581
2023-10-19 01:01:16,466 saving best model
2023-10-19 01:01:16,497 ----------------------------------------------------------------------------------------------------
2023-10-19 01:01:20,726 epoch 9 - iter 180/1809 - loss 0.11259705 - time (sec): 4.23 - samples/sec: 9054.25 - lr: 0.000006 - momentum: 0.000000
2023-10-19 01:01:24,835 epoch 9 - iter 360/1809 - loss 0.11599603 - time (sec): 8.34 - samples/sec: 9031.02 - lr: 0.000006 - momentum: 0.000000
2023-10-19 01:01:29,005 epoch 9 - iter 540/1809 - loss 0.11430380 - time (sec): 12.51 - samples/sec: 9048.96 - lr: 0.000006 - momentum: 0.000000
2023-10-19 01:01:33,297 epoch 9 - iter 720/1809 - loss 0.11420665 - time (sec): 16.80 - samples/sec: 8952.07 - lr: 0.000005 - momentum: 0.000000
2023-10-19 01:01:37,154 epoch 9 - iter 900/1809 - loss 0.11560138 - time (sec): 20.66 - samples/sec: 9153.90 - lr: 0.000005 - momentum: 0.000000
2023-10-19 01:01:41,366 epoch 9 - iter 1080/1809 - loss 0.11787546 - time (sec): 24.87 - samples/sec: 9093.09 - lr: 0.000005 - momentum: 0.000000
2023-10-19 01:01:45,609 epoch 9 - iter 1260/1809 - loss 0.11693200 - time (sec): 29.11 - samples/sec: 9069.64 - lr: 0.000004 - momentum: 0.000000
2023-10-19 01:01:49,850 epoch 9 - iter 1440/1809 - loss 0.11685936 - time (sec): 33.35 - samples/sec: 9041.80 - lr: 0.000004 - momentum: 0.000000
2023-10-19 01:01:54,173 epoch 9 - iter 1620/1809 - loss 0.11583111 - time (sec): 37.68 - samples/sec: 8987.30 - lr: 0.000004 - momentum: 0.000000
2023-10-19 01:01:58,485 epoch 9 - iter 1800/1809 - loss 0.11702665 - time (sec): 41.99 - samples/sec: 9014.85 - lr: 0.000003 - momentum: 0.000000
2023-10-19 01:01:58,686 ----------------------------------------------------------------------------------------------------
2023-10-19 01:01:58,686 EPOCH 9 done: loss 0.1172 - lr: 0.000003
2023-10-19 01:02:02,531 DEV : loss 0.16129888594150543 - f1-score (micro avg) 0.465
2023-10-19 01:02:02,559 saving best model
2023-10-19 01:02:02,596 ----------------------------------------------------------------------------------------------------
2023-10-19 01:02:06,956 epoch 10 - iter 180/1809 - loss 0.11628266 - time (sec): 4.36 - samples/sec: 8815.42 - lr: 0.000003 - momentum: 0.000000
2023-10-19 01:02:11,145 epoch 10 - iter 360/1809 - loss 0.11391073 - time (sec): 8.55 - samples/sec: 8972.40 - lr: 0.000003 - momentum: 0.000000
2023-10-19 01:02:15,290 epoch 10 - iter 540/1809 - loss 0.12021392 - time (sec): 12.69 - samples/sec: 8842.68 - lr: 0.000002 - momentum: 0.000000
2023-10-19 01:02:19,563 epoch 10 - iter 720/1809 - loss 0.11652333 - time (sec): 16.97 - samples/sec: 8884.38 - lr: 0.000002 - momentum: 0.000000
2023-10-19 01:02:23,789 epoch 10 - iter 900/1809 - loss 0.11604266 - time (sec): 21.19 - samples/sec: 8932.15 - lr: 0.000002 - momentum: 0.000000
2023-10-19 01:02:28,092 epoch 10 - iter 1080/1809 - loss 0.11441938 - time (sec): 25.50 - samples/sec: 8944.45 - lr: 0.000001 - momentum: 0.000000
2023-10-19 01:02:32,319 epoch 10 - iter 1260/1809 - loss 0.11448108 - time (sec): 29.72 - samples/sec: 8913.31 - lr: 0.000001 - momentum: 0.000000
2023-10-19 01:02:36,556 epoch 10 - iter 1440/1809 - loss 0.11365218 - time (sec): 33.96 - samples/sec: 8955.55 - lr: 0.000001 - momentum: 0.000000
2023-10-19 01:02:40,682 epoch 10 - iter 1620/1809 - loss 0.11589731 - time (sec): 38.09 - samples/sec: 8991.07 - lr: 0.000000 - momentum: 0.000000
2023-10-19 01:02:44,829 epoch 10 - iter 1800/1809 - loss 0.11657828 - time (sec): 42.23 - samples/sec: 8950.50 - lr: 0.000000 - momentum: 0.000000
2023-10-19 01:02:45,039 ----------------------------------------------------------------------------------------------------
2023-10-19 01:02:45,039 EPOCH 10 done: loss 0.1165 - lr: 0.000000
2023-10-19 01:02:48,223 DEV : loss 0.1601785123348236 - f1-score (micro avg) 0.4656
2023-10-19 01:02:48,251 saving best model
2023-10-19 01:02:48,313 ----------------------------------------------------------------------------------------------------
2023-10-19 01:02:48,313 Loading model from best epoch ...
2023-10-19 01:02:48,391 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-19 01:02:52,385
Results:
- F-score (micro) 0.4939
- F-score (macro) 0.3255
- Accuracy 0.3406
By class:
              precision    recall  f1-score   support

         loc     0.5032    0.6650    0.5729       591
        pers     0.3771    0.4342    0.4036       357
         org     0.0000    0.0000    0.0000        79

   micro avg     0.4597    0.5336    0.4939      1027
   macro avg     0.2934    0.3664    0.3255      1027
weighted avg     0.4207    0.5336    0.4700      1027
2023-10-19 01:02:52,386 ----------------------------------------------------------------------------------------------------