2023-10-25 18:09:17,563 ----------------------------------------------------------------------------------------------------
2023-10-25 18:09:17,564 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 18:09:17,564 ----------------------------------------------------------------------------------------------------
2023-10-25 18:09:17,564 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 18:09:17,564 ----------------------------------------------------------------------------------------------------
2023-10-25 18:09:17,564 Train: 7142 sentences
2023-10-25 18:09:17,564 (train_with_dev=False, train_with_test=False)
2023-10-25 18:09:17,564 ----------------------------------------------------------------------------------------------------
2023-10-25 18:09:17,564 Training Params:
2023-10-25 18:09:17,564 - learning_rate: "5e-05"
2023-10-25 18:09:17,565 - mini_batch_size: "8"
2023-10-25 18:09:17,565 - max_epochs: "10"
2023-10-25 18:09:17,565 - shuffle: "True"
2023-10-25 18:09:17,565 ----------------------------------------------------------------------------------------------------
2023-10-25 18:09:17,565 Plugins:
2023-10-25 18:09:17,565 - TensorboardLogger
2023-10-25 18:09:17,565 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 18:09:17,565 ----------------------------------------------------------------------------------------------------
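Note: the `lr` values in the iteration lines of this log are consistent with a linear warmup over the first 10% of all steps, followed by a linear decay to zero. The following is a minimal sketch of such a schedule, assuming 893 iterations per epoch and 10 epochs (8930 steps in total) as reported in this log. The function name `linear_warmup_decay_lr` is hypothetical, and this is an illustration rather than Flair's actual LinearScheduler implementation.

```python
def linear_warmup_decay_lr(step, peak_lr=5e-5, total_steps=8930, warmup_fraction=0.1):
    """Learning rate after `step` updates (hypothetical helper, not Flair code)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: fall linearly from peak_lr down to 0 at total_steps.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


steps_per_epoch = 893  # assumption taken from the "iter N/893" lines
for epoch, it in [(1, 89), (1, 890), (2, 89), (5, 445), (10, 890)]:
    step = (epoch - 1) * steps_per_epoch + it
    print(f"epoch {epoch} - iter {it}/893 - lr: {linear_warmup_decay_lr(step):.6f}")
```

Formatted to six decimals, these values can be compared directly against the `lr` field of the corresponding iteration lines.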
2023-10-25 18:09:17,565 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 18:09:17,565 - metric: "('micro avg', 'f1-score')"
2023-10-25 18:09:17,565 ----------------------------------------------------------------------------------------------------
2023-10-25 18:09:17,565 Computation:
2023-10-25 18:09:17,565 - compute on device: cuda:0
2023-10-25 18:09:17,565 - embedding storage: none
2023-10-25 18:09:17,565 ----------------------------------------------------------------------------------------------------
2023-10-25 18:09:17,565 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 18:09:17,565 ----------------------------------------------------------------------------------------------------
2023-10-25 18:09:17,565 ----------------------------------------------------------------------------------------------------
2023-10-25 18:09:17,565 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 18:09:23,250 epoch 1 - iter 89/893 - loss 1.65132140 - time (sec): 5.68 - samples/sec: 4015.11 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:09:28,853 epoch 1 - iter 178/893 - loss 1.06278864 - time (sec): 11.29 - samples/sec: 4140.27 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:09:34,607 epoch 1 - iter 267/893 - loss 0.81561733 - time (sec): 17.04 - samples/sec: 4180.48 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:09:40,608 epoch 1 - iter 356/893 - loss 0.65844535 - time (sec): 23.04 - samples/sec: 4269.01 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:09:46,286 epoch 1 - iter 445/893 - loss 0.56413544 - time (sec): 28.72 - samples/sec: 4274.79 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:09:52,088 epoch 1 - iter 534/893 - loss 0.49928189 - time (sec): 34.52 - samples/sec: 4238.74 - lr: 0.000030 - momentum: 0.000000
2023-10-25 18:09:57,858 epoch 1 - iter 623/893 - loss 0.45029812 - time (sec): 40.29 - samples/sec: 4240.69 - lr: 0.000035 - momentum: 0.000000
2023-10-25 18:10:03,775 epoch 1 - iter 712/893 - loss 0.41037659 - time (sec): 46.21 - samples/sec: 4249.87 - lr: 0.000040 - momentum: 0.000000
2023-10-25 18:10:09,732 epoch 1 - iter 801/893 - loss 0.37745804 - time (sec): 52.17 - samples/sec: 4269.90 - lr: 0.000045 - momentum: 0.000000
2023-10-25 18:10:15,606 epoch 1 - iter 890/893 - loss 0.35448580 - time (sec): 58.04 - samples/sec: 4271.30 - lr: 0.000050 - momentum: 0.000000
2023-10-25 18:10:15,835 ----------------------------------------------------------------------------------------------------
2023-10-25 18:10:15,835 EPOCH 1 done: loss 0.3537 - lr: 0.000050
2023-10-25 18:10:20,017 DEV : loss 0.10159404575824738 - f1-score (micro avg) 0.7117
2023-10-25 18:10:20,039 saving best model
2023-10-25 18:10:20,518 ----------------------------------------------------------------------------------------------------
2023-10-25 18:10:26,313 epoch 2 - iter 89/893 - loss 0.10501087 - time (sec): 5.79 - samples/sec: 4461.81 - lr: 0.000049 - momentum: 0.000000
2023-10-25 18:10:32,121 epoch 2 - iter 178/893 - loss 0.10733813 - time (sec): 11.60 - samples/sec: 4175.85 - lr: 0.000049 - momentum: 0.000000
2023-10-25 18:10:38,067 epoch 2 - iter 267/893 - loss 0.10809746 - time (sec): 17.55 - samples/sec: 4178.06 - lr: 0.000048 - momentum: 0.000000
2023-10-25 18:10:44,188 epoch 2 - iter 356/893 - loss 0.10864389 - time (sec): 23.67 - samples/sec: 4185.37 - lr: 0.000048 - momentum: 0.000000
2023-10-25 18:10:50,075 epoch 2 - iter 445/893 - loss 0.10989896 - time (sec): 29.56 - samples/sec: 4130.98 - lr: 0.000047 - momentum: 0.000000
2023-10-25 18:10:55,957 epoch 2 - iter 534/893 - loss 0.10852416 - time (sec): 35.44 - samples/sec: 4144.15 - lr: 0.000047 - momentum: 0.000000
2023-10-25 18:11:01,862 epoch 2 - iter 623/893 - loss 0.10744061 - time (sec): 41.34 - samples/sec: 4149.06 - lr: 0.000046 - momentum: 0.000000
2023-10-25 18:11:07,809 epoch 2 - iter 712/893 - loss 0.10583880 - time (sec): 47.29 - samples/sec: 4198.13 - lr: 0.000046 - momentum: 0.000000
2023-10-25 18:11:13,624 epoch 2 - iter 801/893 - loss 0.10472964 - time (sec): 53.10 - samples/sec: 4200.19 - lr: 0.000045 - momentum: 0.000000
2023-10-25 18:11:19,614 epoch 2 - iter 890/893 - loss 0.10478980 - time (sec): 59.09 - samples/sec: 4197.41 - lr: 0.000044 - momentum: 0.000000
2023-10-25 18:11:19,790 ----------------------------------------------------------------------------------------------------
2023-10-25 18:11:19,790 EPOCH 2 done: loss 0.1048 - lr: 0.000044
2023-10-25 18:11:25,332 DEV : loss 0.09946658462285995 - f1-score (micro avg) 0.765
2023-10-25 18:11:25,354 saving best model
2023-10-25 18:11:25,992 ----------------------------------------------------------------------------------------------------
2023-10-25 18:11:31,956 epoch 3 - iter 89/893 - loss 0.05522842 - time (sec): 5.96 - samples/sec: 4202.31 - lr: 0.000044 - momentum: 0.000000
2023-10-25 18:11:38,008 epoch 3 - iter 178/893 - loss 0.06062471 - time (sec): 12.01 - samples/sec: 4096.34 - lr: 0.000043 - momentum: 0.000000
2023-10-25 18:11:43,870 epoch 3 - iter 267/893 - loss 0.06593504 - time (sec): 17.88 - samples/sec: 4116.95 - lr: 0.000043 - momentum: 0.000000
2023-10-25 18:11:49,717 epoch 3 - iter 356/893 - loss 0.06303396 - time (sec): 23.72 - samples/sec: 4161.60 - lr: 0.000042 - momentum: 0.000000
2023-10-25 18:11:55,572 epoch 3 - iter 445/893 - loss 0.06478063 - time (sec): 29.58 - samples/sec: 4183.60 - lr: 0.000042 - momentum: 0.000000
2023-10-25 18:12:01,310 epoch 3 - iter 534/893 - loss 0.06506442 - time (sec): 35.32 - samples/sec: 4221.56 - lr: 0.000041 - momentum: 0.000000
2023-10-25 18:12:07,098 epoch 3 - iter 623/893 - loss 0.06368569 - time (sec): 41.10 - samples/sec: 4253.09 - lr: 0.000041 - momentum: 0.000000
2023-10-25 18:12:12,565 epoch 3 - iter 712/893 - loss 0.06430331 - time (sec): 46.57 - samples/sec: 4226.93 - lr: 0.000040 - momentum: 0.000000
2023-10-25 18:12:18,426 epoch 3 - iter 801/893 - loss 0.06458432 - time (sec): 52.43 - samples/sec: 4239.18 - lr: 0.000039 - momentum: 0.000000
2023-10-25 18:12:24,096 epoch 3 - iter 890/893 - loss 0.06445304 - time (sec): 58.10 - samples/sec: 4267.55 - lr: 0.000039 - momentum: 0.000000
2023-10-25 18:12:24,293 ----------------------------------------------------------------------------------------------------
2023-10-25 18:12:24,294 EPOCH 3 done: loss 0.0644 - lr: 0.000039
2023-10-25 18:12:28,773 DEV : loss 0.12554588913917542 - f1-score (micro avg) 0.7652
2023-10-25 18:12:28,797 saving best model
2023-10-25 18:12:29,446 ----------------------------------------------------------------------------------------------------
2023-10-25 18:12:35,361 epoch 4 - iter 89/893 - loss 0.03479416 - time (sec): 5.91 - samples/sec: 4162.26 - lr: 0.000038 - momentum: 0.000000
2023-10-25 18:12:41,460 epoch 4 - iter 178/893 - loss 0.04205946 - time (sec): 12.01 - samples/sec: 4193.39 - lr: 0.000038 - momentum: 0.000000
2023-10-25 18:12:47,435 epoch 4 - iter 267/893 - loss 0.04443524 - time (sec): 17.99 - samples/sec: 4142.35 - lr: 0.000037 - momentum: 0.000000
2023-10-25 18:12:53,146 epoch 4 - iter 356/893 - loss 0.04557293 - time (sec): 23.70 - samples/sec: 4173.08 - lr: 0.000037 - momentum: 0.000000
2023-10-25 18:12:59,134 epoch 4 - iter 445/893 - loss 0.04373872 - time (sec): 29.68 - samples/sec: 4148.50 - lr: 0.000036 - momentum: 0.000000
2023-10-25 18:13:05,160 epoch 4 - iter 534/893 - loss 0.04544751 - time (sec): 35.71 - samples/sec: 4163.70 - lr: 0.000036 - momentum: 0.000000
2023-10-25 18:13:11,006 epoch 4 - iter 623/893 - loss 0.04656032 - time (sec): 41.56 - samples/sec: 4159.60 - lr: 0.000035 - momentum: 0.000000
2023-10-25 18:13:16,899 epoch 4 - iter 712/893 - loss 0.04719361 - time (sec): 47.45 - samples/sec: 4181.65 - lr: 0.000034 - momentum: 0.000000
2023-10-25 18:13:22,625 epoch 4 - iter 801/893 - loss 0.04702508 - time (sec): 53.18 - samples/sec: 4201.77 - lr: 0.000034 - momentum: 0.000000
2023-10-25 18:13:28,404 epoch 4 - iter 890/893 - loss 0.04610922 - time (sec): 58.95 - samples/sec: 4208.06 - lr: 0.000033 - momentum: 0.000000
2023-10-25 18:13:28,580 ----------------------------------------------------------------------------------------------------
2023-10-25 18:13:28,581 EPOCH 4 done: loss 0.0461 - lr: 0.000033
2023-10-25 18:13:34,002 DEV : loss 0.1372321993112564 - f1-score (micro avg) 0.7862
2023-10-25 18:13:34,024 saving best model
2023-10-25 18:13:34,692 ----------------------------------------------------------------------------------------------------
2023-10-25 18:13:40,502 epoch 5 - iter 89/893 - loss 0.04635040 - time (sec): 5.81 - samples/sec: 4069.25 - lr: 0.000033 - momentum: 0.000000
2023-10-25 18:13:46,432 epoch 5 - iter 178/893 - loss 0.03653266 - time (sec): 11.74 - samples/sec: 4288.50 - lr: 0.000032 - momentum: 0.000000
2023-10-25 18:13:52,285 epoch 5 - iter 267/893 - loss 0.03520206 - time (sec): 17.59 - samples/sec: 4270.09 - lr: 0.000032 - momentum: 0.000000
2023-10-25 18:13:58,092 epoch 5 - iter 356/893 - loss 0.03411594 - time (sec): 23.40 - samples/sec: 4288.79 - lr: 0.000031 - momentum: 0.000000
2023-10-25 18:14:04,128 epoch 5 - iter 445/893 - loss 0.03382321 - time (sec): 29.43 - samples/sec: 4274.27 - lr: 0.000031 - momentum: 0.000000
2023-10-25 18:14:09,938 epoch 5 - iter 534/893 - loss 0.03422126 - time (sec): 35.24 - samples/sec: 4232.30 - lr: 0.000030 - momentum: 0.000000
2023-10-25 18:14:15,808 epoch 5 - iter 623/893 - loss 0.03418581 - time (sec): 41.11 - samples/sec: 4243.40 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:14:21,817 epoch 5 - iter 712/893 - loss 0.03392476 - time (sec): 47.12 - samples/sec: 4211.80 - lr: 0.000029 - momentum: 0.000000
2023-10-25 18:14:27,540 epoch 5 - iter 801/893 - loss 0.03385011 - time (sec): 52.85 - samples/sec: 4199.33 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:14:33,522 epoch 5 - iter 890/893 - loss 0.03490586 - time (sec): 58.83 - samples/sec: 4212.08 - lr: 0.000028 - momentum: 0.000000
2023-10-25 18:14:33,722 ----------------------------------------------------------------------------------------------------
2023-10-25 18:14:33,723 EPOCH 5 done: loss 0.0348 - lr: 0.000028
2023-10-25 18:14:37,895 DEV : loss 0.14049120247364044 - f1-score (micro avg) 0.7836
2023-10-25 18:14:37,916 ----------------------------------------------------------------------------------------------------
2023-10-25 18:14:44,341 epoch 6 - iter 89/893 - loss 0.02763906 - time (sec): 6.42 - samples/sec: 3921.78 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:14:50,155 epoch 6 - iter 178/893 - loss 0.02912153 - time (sec): 12.24 - samples/sec: 4172.07 - lr: 0.000027 - momentum: 0.000000
2023-10-25 18:14:55,713 epoch 6 - iter 267/893 - loss 0.02581420 - time (sec): 17.80 - samples/sec: 4180.14 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:15:01,227 epoch 6 - iter 356/893 - loss 0.02606856 - time (sec): 23.31 - samples/sec: 4259.32 - lr: 0.000026 - momentum: 0.000000
2023-10-25 18:15:06,921 epoch 6 - iter 445/893 - loss 0.02492310 - time (sec): 29.00 - samples/sec: 4310.50 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:15:12,527 epoch 6 - iter 534/893 - loss 0.02555709 - time (sec): 34.61 - samples/sec: 4331.19 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:15:18,003 epoch 6 - iter 623/893 - loss 0.02495625 - time (sec): 40.08 - samples/sec: 4365.30 - lr: 0.000024 - momentum: 0.000000
2023-10-25 18:15:23,840 epoch 6 - iter 712/893 - loss 0.02450773 - time (sec): 45.92 - samples/sec: 4318.83 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:15:29,872 epoch 6 - iter 801/893 - loss 0.02444535 - time (sec): 51.95 - samples/sec: 4336.59 - lr: 0.000023 - momentum: 0.000000
2023-10-25 18:15:35,272 epoch 6 - iter 890/893 - loss 0.02562554 - time (sec): 57.35 - samples/sec: 4325.96 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:15:35,451 ----------------------------------------------------------------------------------------------------
2023-10-25 18:15:35,452 EPOCH 6 done: loss 0.0257 - lr: 0.000022
2023-10-25 18:15:39,796 DEV : loss 0.17027460038661957 - f1-score (micro avg) 0.7798
2023-10-25 18:15:39,819 ----------------------------------------------------------------------------------------------------
2023-10-25 18:15:45,625 epoch 7 - iter 89/893 - loss 0.02266238 - time (sec): 5.80 - samples/sec: 4467.24 - lr: 0.000022 - momentum: 0.000000
2023-10-25 18:15:50,967 epoch 7 - iter 178/893 - loss 0.01912464 - time (sec): 11.15 - samples/sec: 4417.31 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:15:56,616 epoch 7 - iter 267/893 - loss 0.01869102 - time (sec): 16.80 - samples/sec: 4420.10 - lr: 0.000021 - momentum: 0.000000
2023-10-25 18:16:02,234 epoch 7 - iter 356/893 - loss 0.02015385 - time (sec): 22.41 - samples/sec: 4407.80 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:16:08,267 epoch 7 - iter 445/893 - loss 0.02014003 - time (sec): 28.45 - samples/sec: 4409.43 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:16:14,221 epoch 7 - iter 534/893 - loss 0.01999305 - time (sec): 34.40 - samples/sec: 4383.95 - lr: 0.000019 - momentum: 0.000000
2023-10-25 18:16:19,845 epoch 7 - iter 623/893 - loss 0.01930448 - time (sec): 40.02 - samples/sec: 4416.99 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:16:25,243 epoch 7 - iter 712/893 - loss 0.01963156 - time (sec): 45.42 - samples/sec: 4377.22 - lr: 0.000018 - momentum: 0.000000
2023-10-25 18:16:30,988 epoch 7 - iter 801/893 - loss 0.01985202 - time (sec): 51.17 - samples/sec: 4364.53 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:16:36,726 epoch 7 - iter 890/893 - loss 0.01990503 - time (sec): 56.91 - samples/sec: 4354.46 - lr: 0.000017 - momentum: 0.000000
2023-10-25 18:16:36,915 ----------------------------------------------------------------------------------------------------
2023-10-25 18:16:36,915 EPOCH 7 done: loss 0.0199 - lr: 0.000017
2023-10-25 18:16:42,672 DEV : loss 0.19226676225662231 - f1-score (micro avg) 0.7799
2023-10-25 18:16:42,695 ----------------------------------------------------------------------------------------------------
2023-10-25 18:16:48,797 epoch 8 - iter 89/893 - loss 0.02127649 - time (sec): 6.10 - samples/sec: 4059.70 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:16:54,491 epoch 8 - iter 178/893 - loss 0.01835301 - time (sec): 11.79 - samples/sec: 4108.32 - lr: 0.000016 - momentum: 0.000000
2023-10-25 18:17:00,306 epoch 8 - iter 267/893 - loss 0.01566630 - time (sec): 17.61 - samples/sec: 4176.93 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:17:06,122 epoch 8 - iter 356/893 - loss 0.01498561 - time (sec): 23.42 - samples/sec: 4231.40 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:17:11,874 epoch 8 - iter 445/893 - loss 0.01437023 - time (sec): 29.18 - samples/sec: 4237.49 - lr: 0.000014 - momentum: 0.000000
2023-10-25 18:17:17,393 epoch 8 - iter 534/893 - loss 0.01500300 - time (sec): 34.70 - samples/sec: 4237.39 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:17:22,870 epoch 8 - iter 623/893 - loss 0.01579148 - time (sec): 40.17 - samples/sec: 4270.60 - lr: 0.000013 - momentum: 0.000000
2023-10-25 18:17:28,500 epoch 8 - iter 712/893 - loss 0.01534715 - time (sec): 45.80 - samples/sec: 4292.33 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:17:34,330 epoch 8 - iter 801/893 - loss 0.01512829 - time (sec): 51.63 - samples/sec: 4323.70 - lr: 0.000012 - momentum: 0.000000
2023-10-25 18:17:40,007 epoch 8 - iter 890/893 - loss 0.01492451 - time (sec): 57.31 - samples/sec: 4327.28 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:17:40,180 ----------------------------------------------------------------------------------------------------
2023-10-25 18:17:40,181 EPOCH 8 done: loss 0.0149 - lr: 0.000011
2023-10-25 18:17:45,426 DEV : loss 0.20437157154083252 - f1-score (micro avg) 0.8035
2023-10-25 18:17:45,449 saving best model
2023-10-25 18:17:46,125 ----------------------------------------------------------------------------------------------------
2023-10-25 18:17:52,051 epoch 9 - iter 89/893 - loss 0.00720147 - time (sec): 5.92 - samples/sec: 4304.66 - lr: 0.000011 - momentum: 0.000000
2023-10-25 18:17:57,758 epoch 9 - iter 178/893 - loss 0.00797037 - time (sec): 11.63 - samples/sec: 4156.36 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:18:03,741 epoch 9 - iter 267/893 - loss 0.01074602 - time (sec): 17.61 - samples/sec: 4115.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:18:09,547 epoch 9 - iter 356/893 - loss 0.01102070 - time (sec): 23.42 - samples/sec: 4141.98 - lr: 0.000009 - momentum: 0.000000
2023-10-25 18:18:15,305 epoch 9 - iter 445/893 - loss 0.01067919 - time (sec): 29.18 - samples/sec: 4120.11 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:18:21,197 epoch 9 - iter 534/893 - loss 0.01118844 - time (sec): 35.07 - samples/sec: 4186.96 - lr: 0.000008 - momentum: 0.000000
2023-10-25 18:18:27,184 epoch 9 - iter 623/893 - loss 0.01159658 - time (sec): 41.06 - samples/sec: 4167.30 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:18:33,415 epoch 9 - iter 712/893 - loss 0.01121097 - time (sec): 47.29 - samples/sec: 4157.68 - lr: 0.000007 - momentum: 0.000000
2023-10-25 18:18:39,219 epoch 9 - iter 801/893 - loss 0.01074529 - time (sec): 53.09 - samples/sec: 4172.05 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:18:44,963 epoch 9 - iter 890/893 - loss 0.01044653 - time (sec): 58.83 - samples/sec: 4217.16 - lr: 0.000006 - momentum: 0.000000
2023-10-25 18:18:45,130 ----------------------------------------------------------------------------------------------------
2023-10-25 18:18:45,131 EPOCH 9 done: loss 0.0105 - lr: 0.000006
2023-10-25 18:18:49,498 DEV : loss 0.20618826150894165 - f1-score (micro avg) 0.8
2023-10-25 18:18:49,521 ----------------------------------------------------------------------------------------------------
2023-10-25 18:18:55,313 epoch 10 - iter 89/893 - loss 0.00546683 - time (sec): 5.79 - samples/sec: 4337.89 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:19:00,951 epoch 10 - iter 178/893 - loss 0.00510394 - time (sec): 11.43 - samples/sec: 4262.88 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:19:06,678 epoch 10 - iter 267/893 - loss 0.00543944 - time (sec): 17.16 - samples/sec: 4321.42 - lr: 0.000004 - momentum: 0.000000
2023-10-25 18:19:12,302 epoch 10 - iter 356/893 - loss 0.00534475 - time (sec): 22.78 - samples/sec: 4366.80 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:19:18,206 epoch 10 - iter 445/893 - loss 0.00559368 - time (sec): 28.68 - samples/sec: 4343.29 - lr: 0.000003 - momentum: 0.000000
2023-10-25 18:19:24,055 epoch 10 - iter 534/893 - loss 0.00562371 - time (sec): 34.53 - samples/sec: 4310.63 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:19:29,965 epoch 10 - iter 623/893 - loss 0.00643885 - time (sec): 40.44 - samples/sec: 4282.06 - lr: 0.000002 - momentum: 0.000000
2023-10-25 18:19:35,983 epoch 10 - iter 712/893 - loss 0.00611002 - time (sec): 46.46 - samples/sec: 4284.89 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:19:41,750 epoch 10 - iter 801/893 - loss 0.00600347 - time (sec): 52.23 - samples/sec: 4283.04 - lr: 0.000001 - momentum: 0.000000
2023-10-25 18:19:47,445 epoch 10 - iter 890/893 - loss 0.00666217 - time (sec): 57.92 - samples/sec: 4281.34 - lr: 0.000000 - momentum: 0.000000
2023-10-25 18:19:47,627 ----------------------------------------------------------------------------------------------------
2023-10-25 18:19:47,628 EPOCH 10 done: loss 0.0066 - lr: 0.000000
2023-10-25 18:19:53,110 DEV : loss 0.2108031064271927 - f1-score (micro avg) 0.8091
2023-10-25 18:19:53,131 saving best model
2023-10-25 18:19:54,267 ----------------------------------------------------------------------------------------------------
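Note: the "saving best model" lines above follow the pattern of writing a checkpoint whenever the dev micro F1 exceeds the best value seen so far. The following is a minimal sketch of that selection rule, using the DEV scores copied from this log. The function name `epochs_with_checkpoint` is hypothetical, and this is an illustration rather than Flair's trainer code.

```python
# Dev micro-avg F1 per epoch, copied from the DEV lines of this log.
dev_f1 = [0.7117, 0.765, 0.7652, 0.7862, 0.7836, 0.7798, 0.7799, 0.8035, 0.8, 0.8091]


def epochs_with_checkpoint(scores):
    """Return the 1-based epochs at which a strictly better score appears."""
    best = float("-inf")
    saved = []
    for epoch, score in enumerate(scores, start=1):
        if score > best:
            best = score
            saved.append(epoch)
    return saved


saved = epochs_with_checkpoint(dev_f1)
print("checkpoint written after epochs:", saved)
print("best epoch:", saved[-1])
```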
2023-10-25 18:19:54,269 Loading model from best epoch ...
2023-10-25 18:19:56,139 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 18:20:08,490
Results:
- F-score (micro) 0.6887
- F-score (macro) 0.6019
- Accuracy 0.5414

By class:
              precision    recall  f1-score   support

         LOC     0.6914    0.6795    0.6854      1095
         PER     0.7963    0.7648    0.7802      1012
         ORG     0.4454    0.5938    0.5090       357
   HumanProd     0.3281    0.6364    0.4330        33

   micro avg     0.6766    0.7012    0.6887      2497
   macro avg     0.5653    0.6686    0.6019      2497
weighted avg     0.6940    0.7012    0.6953      2497
2023-10-25 18:20:08,490 ----------------------------------------------------------------------------------------------------
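Note: the averaged rows of the per-class table can be cross-checked from the per-class rows. The following is a minimal sketch, assuming the usual definitions: macro average as the unweighted mean over classes, weighted average as the support-weighted mean, and micro F1 as the harmonic mean of micro precision and micro recall. The helper names are hypothetical. Because the inputs are already rounded to four decimals, the recomputed values can differ from the table in the last digit.

```python
# (precision, recall, f1, support) per class, copied from the table in this log.
per_class = {
    "LOC": (0.6914, 0.6795, 0.6854, 1095),
    "PER": (0.7963, 0.7648, 0.7802, 1012),
    "ORG": (0.4454, 0.5938, 0.5090, 357),
    "HumanProd": (0.3281, 0.6364, 0.4330, 33),
}


def macro_avg(col):
    """Unweighted mean of one metric column over all classes."""
    return sum(row[col] for row in per_class.values()) / len(per_class)


def weighted_avg(col):
    """Mean of one metric column, weighted by class support."""
    total = sum(row[3] for row in per_class.values())
    return sum(row[col] * row[3] for row in per_class.values()) / total


# Micro F1 as the harmonic mean of the logged micro precision and recall.
micro_p, micro_r = 0.6766, 0.7012
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"micro F1:    {micro_f1:.4f}")
print(f"macro F1:    {macro_avg(2):.4f}")
print(f"weighted F1: {weighted_avg(2):.4f}")
```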