2023-10-18 22:27:37,466 ----------------------------------------------------------------------------------------------------
2023-10-18 22:27:37,467 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
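For orientation, the layer shapes printed in the dump above can be tallied into a parameter count. The sketch below redoes that arithmetic in plain Python; the shapes come from the dump, but the totals are computed here, not reported anywhere in the log.

```python
# Parameter tally for the model dump above (bert-tiny: 2 layers, hidden size 128).
# Shapes are read off the printed modules; totals are computed, not logged.
hidden, inter, vocab, max_pos, types, layers, tags = 128, 512, 32001, 512, 2, 2, 13

def linear(n_in, n_out):
    """Parameters of a Linear(in, out) with bias: weight matrix + bias vector."""
    return n_in * n_out + n_out

embeddings = (vocab + max_pos + types) * hidden + 2 * hidden   # + LayerNorm (gamma, beta)
per_layer = (
    4 * linear(hidden, hidden)    # query, key, value, attention-output dense
    + 2 * hidden                  # attention LayerNorm
    + linear(hidden, inter)       # intermediate dense (128 -> 512)
    + linear(inter, hidden)       # output dense (512 -> 128)
    + 2 * hidden                  # output LayerNorm
)
pooler = linear(hidden, hidden)
head = linear(hidden, tags)       # the final (linear) tagging layer, 13 tags

bert_total = embeddings + layers * per_layer + pooler
print(bert_total, bert_total + head)
```

At roughly 4.6M parameters this is tiny by BERT standards, which is consistent with the ~10,000 samples/sec throughput in the iteration lines further down.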
2023-10-18 22:27:37,467 ----------------------------------------------------------------------------------------------------
2023-10-18 22:27:37,467 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:27:37,467 ----------------------------------------------------------------------------------------------------
2023-10-18 22:27:37,467 Train: 5777 sentences
2023-10-18 22:27:37,467 (train_with_dev=False, train_with_test=False)
2023-10-18 22:27:37,467 ----------------------------------------------------------------------------------------------------
2023-10-18 22:27:37,467 Training Params:
2023-10-18 22:27:37,467 - learning_rate: "5e-05"
2023-10-18 22:27:37,467 - mini_batch_size: "8"
2023-10-18 22:27:37,467 - max_epochs: "10"
2023-10-18 22:27:37,467 - shuffle: "True"
2023-10-18 22:27:37,467 ----------------------------------------------------------------------------------------------------
2023-10-18 22:27:37,467 Plugins:
2023-10-18 22:27:37,467 - TensorboardLogger
2023-10-18 22:27:37,467 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:27:37,467 ----------------------------------------------------------------------------------------------------
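The lr column in the iteration lines below follows the LinearScheduler's shape: linear warmup over the first 10% of optimizer steps up to the peak of 5e-05, then linear decay to zero. A minimal sketch of that shape (my own reimplementation for illustration, not Flair's scheduler code; 723 iterations per epoch × 10 epochs = 7230 steps is taken from the log):

```python
PEAK_LR = 5e-05                   # learning_rate from the training params
TOTAL_STEPS = 723 * 10            # 723 iterations/epoch x 10 epochs (from the log)
WARMUP = int(0.1 * TOTAL_STEPS)   # warmup_fraction: '0.1' -> 723 steps

def linear_schedule(step: int) -> float:
    """LR after `step` optimizer steps: linear warmup, then linear decay to 0."""
    if step <= WARMUP:
        return PEAK_LR * step / WARMUP
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP)

# Matches the log: at iter 72 of epoch 1 the printed lr is 0.000005, it peaks
# at 0.000050 at the end of epoch 1, and decays to 0 by the final iteration.
print(round(linear_schedule(72), 6), linear_schedule(723), linear_schedule(7230))
```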
2023-10-18 22:27:37,467 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:27:37,467 - metric: "('micro avg', 'f1-score')"
2023-10-18 22:27:37,467 ----------------------------------------------------------------------------------------------------
2023-10-18 22:27:37,467 Computation:
2023-10-18 22:27:37,468 - compute on device: cuda:0
2023-10-18 22:27:37,468 - embedding storage: none
2023-10-18 22:27:37,468 ----------------------------------------------------------------------------------------------------
2023-10-18 22:27:37,468 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 22:27:37,468 ----------------------------------------------------------------------------------------------------
2023-10-18 22:27:37,468 ----------------------------------------------------------------------------------------------------
2023-10-18 22:27:37,468 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 22:27:39,324 epoch 1 - iter 72/723 - loss 2.36511468 - time (sec): 1.86 - samples/sec: 10003.84 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:27:41,106 epoch 1 - iter 144/723 - loss 2.09982531 - time (sec): 3.64 - samples/sec: 9691.83 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:27:42,929 epoch 1 - iter 216/723 - loss 1.74834944 - time (sec): 5.46 - samples/sec: 9544.86 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:27:44,728 epoch 1 - iter 288/723 - loss 1.40804798 - time (sec): 7.26 - samples/sec: 9700.77 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:27:46,506 epoch 1 - iter 360/723 - loss 1.19010718 - time (sec): 9.04 - samples/sec: 9708.17 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:27:48,253 epoch 1 - iter 432/723 - loss 1.04622066 - time (sec): 10.78 - samples/sec: 9768.63 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:27:50,025 epoch 1 - iter 504/723 - loss 0.93297948 - time (sec): 12.56 - samples/sec: 9836.59 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:27:51,878 epoch 1 - iter 576/723 - loss 0.85587372 - time (sec): 14.41 - samples/sec: 9769.86 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:27:53,701 epoch 1 - iter 648/723 - loss 0.78715867 - time (sec): 16.23 - samples/sec: 9774.91 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:27:55,500 epoch 1 - iter 720/723 - loss 0.73238157 - time (sec): 18.03 - samples/sec: 9743.47 - lr: 0.000050 - momentum: 0.000000
2023-10-18 22:27:55,557 ----------------------------------------------------------------------------------------------------
2023-10-18 22:27:55,557 EPOCH 1 done: loss 0.7306 - lr: 0.000050
2023-10-18 22:27:56,785 DEV : loss 0.2867981195449829 - f1-score (micro avg)  0.0
2023-10-18 22:27:56,799 ----------------------------------------------------------------------------------------------------
2023-10-18 22:27:58,934 epoch 2 - iter 72/723 - loss 0.25942088 - time (sec): 2.13 - samples/sec: 7682.53 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:28:00,739 epoch 2 - iter 144/723 - loss 0.21906746 - time (sec): 3.94 - samples/sec: 8765.46 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:28:02,561 epoch 2 - iter 216/723 - loss 0.21678165 - time (sec): 5.76 - samples/sec: 9184.23 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:28:04,341 epoch 2 - iter 288/723 - loss 0.21038271 - time (sec): 7.54 - samples/sec: 9362.65 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:28:06,138 epoch 2 - iter 360/723 - loss 0.21027297 - time (sec): 9.34 - samples/sec: 9400.01 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:28:07,875 epoch 2 - iter 432/723 - loss 0.20832767 - time (sec): 11.07 - samples/sec: 9489.01 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:28:09,619 epoch 2 - iter 504/723 - loss 0.20875209 - time (sec): 12.82 - samples/sec: 9487.37 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:28:11,521 epoch 2 - iter 576/723 - loss 0.20840629 - time (sec): 14.72 - samples/sec: 9536.30 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:28:13,355 epoch 2 - iter 648/723 - loss 0.20860571 - time (sec): 16.56 - samples/sec: 9487.43 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:28:15,162 epoch 2 - iter 720/723 - loss 0.20557568 - time (sec): 18.36 - samples/sec: 9568.85 - lr: 0.000044 - momentum: 0.000000
2023-10-18 22:28:15,219 ----------------------------------------------------------------------------------------------------
2023-10-18 22:28:15,219 EPOCH 2 done: loss 0.2055 - lr: 0.000044
2023-10-18 22:28:16,981 DEV : loss 0.2222003936767578 - f1-score (micro avg)  0.2793
2023-10-18 22:28:16,997 saving best model
2023-10-18 22:28:17,029 ----------------------------------------------------------------------------------------------------
2023-10-18 22:28:18,836 epoch 3 - iter 72/723 - loss 0.18051981 - time (sec): 1.81 - samples/sec: 9364.80 - lr: 0.000044 - momentum: 0.000000
2023-10-18 22:28:20,585 epoch 3 - iter 144/723 - loss 0.19116936 - time (sec): 3.56 - samples/sec: 9571.67 - lr: 0.000043 - momentum: 0.000000
2023-10-18 22:28:22,447 epoch 3 - iter 216/723 - loss 0.19470015 - time (sec): 5.42 - samples/sec: 9900.63 - lr: 0.000043 - momentum: 0.000000
2023-10-18 22:28:24,187 epoch 3 - iter 288/723 - loss 0.19333086 - time (sec): 7.16 - samples/sec: 9830.63 - lr: 0.000042 - momentum: 0.000000
2023-10-18 22:28:25,964 epoch 3 - iter 360/723 - loss 0.19006666 - time (sec): 8.93 - samples/sec: 9854.38 - lr: 0.000042 - momentum: 0.000000
2023-10-18 22:28:27,678 epoch 3 - iter 432/723 - loss 0.18721035 - time (sec): 10.65 - samples/sec: 9880.88 - lr: 0.000041 - momentum: 0.000000
2023-10-18 22:28:29,610 epoch 3 - iter 504/723 - loss 0.18435836 - time (sec): 12.58 - samples/sec: 9882.06 - lr: 0.000041 - momentum: 0.000000
2023-10-18 22:28:31,224 epoch 3 - iter 576/723 - loss 0.18328049 - time (sec): 14.20 - samples/sec: 9910.72 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:28:32,726 epoch 3 - iter 648/723 - loss 0.18043363 - time (sec): 15.70 - samples/sec: 10123.60 - lr: 0.000039 - momentum: 0.000000
2023-10-18 22:28:34,514 epoch 3 - iter 720/723 - loss 0.17920459 - time (sec): 17.49 - samples/sec: 10047.57 - lr: 0.000039 - momentum: 0.000000
2023-10-18 22:28:34,576 ----------------------------------------------------------------------------------------------------
2023-10-18 22:28:34,576 EPOCH 3 done: loss 0.1791 - lr: 0.000039
2023-10-18 22:28:36,677 DEV : loss 0.20642346143722534 - f1-score (micro avg)  0.409
2023-10-18 22:28:36,693 saving best model
2023-10-18 22:28:36,729 ----------------------------------------------------------------------------------------------------
2023-10-18 22:28:38,542 epoch 4 - iter 72/723 - loss 0.16700376 - time (sec): 1.81 - samples/sec: 9704.05 - lr: 0.000038 - momentum: 0.000000
2023-10-18 22:28:40,316 epoch 4 - iter 144/723 - loss 0.14874843 - time (sec): 3.59 - samples/sec: 10076.03 - lr: 0.000038 - momentum: 0.000000
2023-10-18 22:28:42,101 epoch 4 - iter 216/723 - loss 0.15586357 - time (sec): 5.37 - samples/sec: 9899.34 - lr: 0.000037 - momentum: 0.000000
2023-10-18 22:28:43,920 epoch 4 - iter 288/723 - loss 0.15457710 - time (sec): 7.19 - samples/sec: 9850.30 - lr: 0.000037 - momentum: 0.000000
2023-10-18 22:28:45,432 epoch 4 - iter 360/723 - loss 0.15618795 - time (sec): 8.70 - samples/sec: 10017.83 - lr: 0.000036 - momentum: 0.000000
2023-10-18 22:28:47,143 epoch 4 - iter 432/723 - loss 0.15772027 - time (sec): 10.41 - samples/sec: 10127.23 - lr: 0.000036 - momentum: 0.000000
2023-10-18 22:28:48,908 epoch 4 - iter 504/723 - loss 0.16057992 - time (sec): 12.18 - samples/sec: 10109.88 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:28:50,659 epoch 4 - iter 576/723 - loss 0.15979312 - time (sec): 13.93 - samples/sec: 10079.75 - lr: 0.000034 - momentum: 0.000000
2023-10-18 22:28:52,454 epoch 4 - iter 648/723 - loss 0.16008903 - time (sec): 15.72 - samples/sec: 10006.95 - lr: 0.000034 - momentum: 0.000000
2023-10-18 22:28:54,339 epoch 4 - iter 720/723 - loss 0.16083710 - time (sec): 17.61 - samples/sec: 9973.12 - lr: 0.000033 - momentum: 0.000000
2023-10-18 22:28:54,404 ----------------------------------------------------------------------------------------------------
2023-10-18 22:28:54,404 EPOCH 4 done: loss 0.1607 - lr: 0.000033
2023-10-18 22:28:56,169 DEV : loss 0.187706857919693 - f1-score (micro avg)  0.48
2023-10-18 22:28:56,184 saving best model
2023-10-18 22:28:56,219 ----------------------------------------------------------------------------------------------------
2023-10-18 22:28:57,982 epoch 5 - iter 72/723 - loss 0.14954716 - time (sec): 1.76 - samples/sec: 9754.75 - lr: 0.000033 - momentum: 0.000000
2023-10-18 22:28:59,834 epoch 5 - iter 144/723 - loss 0.14578875 - time (sec): 3.62 - samples/sec: 10024.33 - lr: 0.000032 - momentum: 0.000000
2023-10-18 22:29:01,604 epoch 5 - iter 216/723 - loss 0.14351982 - time (sec): 5.39 - samples/sec: 9918.01 - lr: 0.000032 - momentum: 0.000000
2023-10-18 22:29:03,404 epoch 5 - iter 288/723 - loss 0.14842176 - time (sec): 7.18 - samples/sec: 9965.00 - lr: 0.000031 - momentum: 0.000000
2023-10-18 22:29:05,241 epoch 5 - iter 360/723 - loss 0.14796514 - time (sec): 9.02 - samples/sec: 9976.67 - lr: 0.000031 - momentum: 0.000000
2023-10-18 22:29:07,005 epoch 5 - iter 432/723 - loss 0.14927873 - time (sec): 10.79 - samples/sec: 10007.53 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:29:08,799 epoch 5 - iter 504/723 - loss 0.14865267 - time (sec): 12.58 - samples/sec: 9924.79 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:29:10,967 epoch 5 - iter 576/723 - loss 0.14988695 - time (sec): 14.75 - samples/sec: 9719.28 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:29:12,739 epoch 5 - iter 648/723 - loss 0.15031619 - time (sec): 16.52 - samples/sec: 9685.72 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:29:14,502 epoch 5 - iter 720/723 - loss 0.14995440 - time (sec): 18.28 - samples/sec: 9599.76 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:29:14,572 ----------------------------------------------------------------------------------------------------
2023-10-18 22:29:14,573 EPOCH 5 done: loss 0.1498 - lr: 0.000028
2023-10-18 22:29:16,347 DEV : loss 0.18636666238307953 - f1-score (micro avg)  0.4798
2023-10-18 22:29:16,361 ----------------------------------------------------------------------------------------------------
2023-10-18 22:29:18,218 epoch 6 - iter 72/723 - loss 0.16355756 - time (sec): 1.86 - samples/sec: 9865.26 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:29:20,042 epoch 6 - iter 144/723 - loss 0.15511150 - time (sec): 3.68 - samples/sec: 9614.80 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:29:21,873 epoch 6 - iter 216/723 - loss 0.15860935 - time (sec): 5.51 - samples/sec: 9533.18 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:29:23,744 epoch 6 - iter 288/723 - loss 0.15159689 - time (sec): 7.38 - samples/sec: 9585.84 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:29:25,589 epoch 6 - iter 360/723 - loss 0.14744356 - time (sec): 9.23 - samples/sec: 9644.63 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:29:27,393 epoch 6 - iter 432/723 - loss 0.14621311 - time (sec): 11.03 - samples/sec: 9589.04 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:29:29,211 epoch 6 - iter 504/723 - loss 0.14600430 - time (sec): 12.85 - samples/sec: 9553.28 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:29:31,077 epoch 6 - iter 576/723 - loss 0.14547259 - time (sec): 14.72 - samples/sec: 9542.05 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:29:32,937 epoch 6 - iter 648/723 - loss 0.14365435 - time (sec): 16.58 - samples/sec: 9538.93 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:29:34,690 epoch 6 - iter 720/723 - loss 0.14159301 - time (sec): 18.33 - samples/sec: 9574.94 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:29:34,755 ----------------------------------------------------------------------------------------------------
2023-10-18 22:29:34,755 EPOCH 6 done: loss 0.1416 - lr: 0.000022
2023-10-18 22:29:36,533 DEV : loss 0.18104662001132965 - f1-score (micro avg)  0.5036
2023-10-18 22:29:36,547 saving best model
2023-10-18 22:29:36,583 ----------------------------------------------------------------------------------------------------
2023-10-18 22:29:38,313 epoch 7 - iter 72/723 - loss 0.13697695 - time (sec): 1.73 - samples/sec: 9569.18 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:29:40,219 epoch 7 - iter 144/723 - loss 0.13702263 - time (sec): 3.64 - samples/sec: 9803.62 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:29:42,345 epoch 7 - iter 216/723 - loss 0.13456961 - time (sec): 5.76 - samples/sec: 9331.59 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:29:44,122 epoch 7 - iter 288/723 - loss 0.13344226 - time (sec): 7.54 - samples/sec: 9422.56 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:29:45,905 epoch 7 - iter 360/723 - loss 0.13696669 - time (sec): 9.32 - samples/sec: 9489.72 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:29:47,741 epoch 7 - iter 432/723 - loss 0.13467303 - time (sec): 11.16 - samples/sec: 9511.45 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:29:49,559 epoch 7 - iter 504/723 - loss 0.13559814 - time (sec): 12.97 - samples/sec: 9522.22 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:29:51,395 epoch 7 - iter 576/723 - loss 0.13750076 - time (sec): 14.81 - samples/sec: 9632.04 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:29:53,104 epoch 7 - iter 648/723 - loss 0.13728059 - time (sec): 16.52 - samples/sec: 9635.25 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:29:54,878 epoch 7 - iter 720/723 - loss 0.13441463 - time (sec): 18.29 - samples/sec: 9608.82 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:29:54,941 ----------------------------------------------------------------------------------------------------
2023-10-18 22:29:54,941 EPOCH 7 done: loss 0.1345 - lr: 0.000017
2023-10-18 22:29:56,731 DEV : loss 0.18202269077301025 - f1-score (micro avg)  0.5085
2023-10-18 22:29:56,746 saving best model
2023-10-18 22:29:56,780 ----------------------------------------------------------------------------------------------------
2023-10-18 22:29:58,697 epoch 8 - iter 72/723 - loss 0.15191020 - time (sec): 1.92 - samples/sec: 9945.36 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:30:00,545 epoch 8 - iter 144/723 - loss 0.14023371 - time (sec): 3.77 - samples/sec: 9707.45 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:30:02,346 epoch 8 - iter 216/723 - loss 0.13344774 - time (sec): 5.57 - samples/sec: 9536.00 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:30:04,134 epoch 8 - iter 288/723 - loss 0.13592961 - time (sec): 7.35 - samples/sec: 9464.42 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:30:05,888 epoch 8 - iter 360/723 - loss 0.13386073 - time (sec): 9.11 - samples/sec: 9465.62 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:30:07,421 epoch 8 - iter 432/723 - loss 0.13276603 - time (sec): 10.64 - samples/sec: 9737.72 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:30:09,158 epoch 8 - iter 504/723 - loss 0.13042166 - time (sec): 12.38 - samples/sec: 9839.06 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:30:10,903 epoch 8 - iter 576/723 - loss 0.13048395 - time (sec): 14.12 - samples/sec: 9928.03 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:30:12,715 epoch 8 - iter 648/723 - loss 0.12897279 - time (sec): 15.93 - samples/sec: 9908.71 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:30:14,483 epoch 8 - iter 720/723 - loss 0.12713101 - time (sec): 17.70 - samples/sec: 9925.22 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:30:14,543 ----------------------------------------------------------------------------------------------------
2023-10-18 22:30:14,543 EPOCH 8 done: loss 0.1273 - lr: 0.000011
2023-10-18 22:30:16,659 DEV : loss 0.18342925608158112 - f1-score (micro avg)  0.521
2023-10-18 22:30:16,674 saving best model
2023-10-18 22:30:16,709 ----------------------------------------------------------------------------------------------------
2023-10-18 22:30:18,255 epoch 9 - iter 72/723 - loss 0.12500718 - time (sec): 1.55 - samples/sec: 11066.37 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:30:20,019 epoch 9 - iter 144/723 - loss 0.11775500 - time (sec): 3.31 - samples/sec: 10629.45 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:30:21,821 epoch 9 - iter 216/723 - loss 0.12154064 - time (sec): 5.11 - samples/sec: 10159.32 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:30:23,593 epoch 9 - iter 288/723 - loss 0.12150161 - time (sec): 6.88 - samples/sec: 10234.46 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:30:25,410 epoch 9 - iter 360/723 - loss 0.12240116 - time (sec): 8.70 - samples/sec: 10130.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:30:27,129 epoch 9 - iter 432/723 - loss 0.12399957 - time (sec): 10.42 - samples/sec: 10045.14 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:30:28,883 epoch 9 - iter 504/723 - loss 0.12498672 - time (sec): 12.17 - samples/sec: 10049.18 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:30:30,709 epoch 9 - iter 576/723 - loss 0.12693121 - time (sec): 14.00 - samples/sec: 10060.43 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:30:32,437 epoch 9 - iter 648/723 - loss 0.12709227 - time (sec): 15.73 - samples/sec: 10077.70 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:30:34,220 epoch 9 - iter 720/723 - loss 0.12633684 - time (sec): 17.51 - samples/sec: 10033.34 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:30:34,281 ----------------------------------------------------------------------------------------------------
2023-10-18 22:30:34,281 EPOCH 9 done: loss 0.1264 - lr: 0.000006
2023-10-18 22:30:36,067 DEV : loss 0.18146364390850067 - f1-score (micro avg)  0.5221
2023-10-18 22:30:36,082 saving best model
2023-10-18 22:30:36,118 ----------------------------------------------------------------------------------------------------
2023-10-18 22:30:37,935 epoch 10 - iter 72/723 - loss 0.14193419 - time (sec): 1.82 - samples/sec: 9505.93 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:30:39,689 epoch 10 - iter 144/723 - loss 0.13120399 - time (sec): 3.57 - samples/sec: 9799.69 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:30:41,263 epoch 10 - iter 216/723 - loss 0.12735289 - time (sec): 5.14 - samples/sec: 10409.29 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:30:42,774 epoch 10 - iter 288/723 - loss 0.12730359 - time (sec): 6.66 - samples/sec: 10606.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:30:44,541 epoch 10 - iter 360/723 - loss 0.12509208 - time (sec): 8.42 - samples/sec: 10474.88 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:30:46,295 epoch 10 - iter 432/723 - loss 0.12757956 - time (sec): 10.18 - samples/sec: 10351.85 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:30:48,027 epoch 10 - iter 504/723 - loss 0.12553789 - time (sec): 11.91 - samples/sec: 10306.51 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:30:49,784 epoch 10 - iter 576/723 - loss 0.12306870 - time (sec): 13.67 - samples/sec: 10332.47 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:30:51,579 epoch 10 - iter 648/723 - loss 0.12391591 - time (sec): 15.46 - samples/sec: 10320.27 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:30:53,317 epoch 10 - iter 720/723 - loss 0.12374139 - time (sec): 17.20 - samples/sec: 10220.84 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:30:53,379 ----------------------------------------------------------------------------------------------------
2023-10-18 22:30:53,379 EPOCH 10 done: loss 0.1239 - lr: 0.000000
2023-10-18 22:30:55,484 DEV : loss 0.18225204944610596 - f1-score (micro avg)  0.5287
2023-10-18 22:30:55,498 saving best model
2023-10-18 22:30:55,560 ----------------------------------------------------------------------------------------------------
2023-10-18 22:30:55,561 Loading model from best epoch ...
2023-10-18 22:30:55,641 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
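The 13 tags are the BIOES encoding of the three entity types (LOC, PER, ORG): S- marks a single-token entity, B-/I-/E- the beginning, inside, and end of a multi-token one, and O a non-entity token. A minimal decoder sketch, as an illustration of how such tag sequences map back to entity spans (this helper is hypothetical, not part of Flair's API):

```python
def decode_bioes(tags):
    """Turn a BIOES tag sequence into (type, start, end) spans, end inclusive."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":                 # single-token entity
            spans.append((label, i, i))
        elif prefix == "B":               # open a multi-token entity
            start, etype = i, label
        elif prefix == "E" and etype == label and start is not None:
            spans.append((label, start, i))
            start, etype = None, None
        elif prefix != "I":               # "O" or a malformed sequence: reset
            start, etype = None, None
    return spans

print(decode_bioes(["S-LOC", "O", "B-PER", "I-PER", "E-PER"]))
```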
2023-10-18 22:30:57,008
Results:
- F-score (micro) 0.5609
- F-score (macro) 0.3857
- Accuracy 0.4021

By class:
              precision    recall  f1-score   support

         LOC     0.5915    0.6703    0.6285       458
         PER     0.6098    0.4668    0.5288       482
         ORG     0.0000    0.0000    0.0000        69

   micro avg     0.5991    0.5273    0.5609      1009
   macro avg     0.4004    0.3790    0.3857      1009
weighted avg     0.5598    0.5273    0.5379      1009

2023-10-18 22:30:57,008 ----------------------------------------------------------------------------------------------------
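As a sanity check, the micro and macro averages in the table above can be reproduced from the per-class rows. The true-positive and prediction counts below are back-derived from the reported precision/recall and supports (TP ≈ recall × support, predicted ≈ TP / precision), so they are reconstructions, not logged values:

```python
# Back-derived from the table above: LOC 307 TP of 519 predicted,
# PER 225 TP of 369 predicted, ORG 0 TP of 0 predicted.
classes = {           # label: (tp, predicted, support)
    "LOC": (307, 519, 458),
    "PER": (225, 369, 482),
    "ORG": (0, 0, 69),
}

def f1(p, r):
    """Harmonic mean of precision and recall, 0 when both are 0."""
    return 2 * p * r / (p + r) if p + r else 0.0

tp = sum(t for t, _, _ in classes.values())
pred = sum(p for _, p, _ in classes.values())
supp = sum(s for _, _, s in classes.values())

micro_f1 = f1(tp / pred, tp / supp)                      # pools all decisions
macro_f1 = sum(                                          # unweighted class mean
    f1(t / p if p else 0.0, t / s) for t, p, s in classes.values()
) / len(classes)

print(round(micro_f1, 4), round(macro_f1, 4))
```

The gap between the two (0.5609 micro vs. 0.3857 macro) is driven almost entirely by the ORG class, whose 69 gold mentions yield no true positives at all.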