|
2023-10-25 18:09:17,563 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:09:17,564 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(64001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-25 18:09:17,564 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:09:17,564 MultiCorpus: 7142 train + 698 dev + 2570 test sentences |
|
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator |
|
2023-10-25 18:09:17,564 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:09:17,564 Train: 7142 sentences |
|
2023-10-25 18:09:17,564 (train_with_dev=False, train_with_test=False) |
|
2023-10-25 18:09:17,564 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:09:17,564 Training Params: |
|
2023-10-25 18:09:17,564 - learning_rate: "5e-05" |
|
2023-10-25 18:09:17,565 - mini_batch_size: "8" |
|
2023-10-25 18:09:17,565 - max_epochs: "10" |
|
2023-10-25 18:09:17,565 - shuffle: "True" |
|
2023-10-25 18:09:17,565 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:09:17,565 Plugins: |
|
2023-10-25 18:09:17,565 - TensorboardLogger |
|
2023-10-25 18:09:17,565 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-25 18:09:17,565 ---------------------------------------------------------------------------------------------------- |
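The `lr` column in the per-iteration lines of this log is consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The sketch below reproduces that shape from the logged parameters (peak learning rate 5e-05, mini-batch size 8, 10 epochs, 7142 training sentences). The function name `lr_at` and the exact formula are assumptions for illustration; they are not taken from the Flair `LinearScheduler` source.

```python
import math

# Values copied from the log header.
TRAIN_SENTENCES = 7142
MINI_BATCH_SIZE = 8
MAX_EPOCHS = 10
PEAK_LR = 5e-05
WARMUP_FRACTION = 0.1

# 7142 / 8 rounds up to 893 iterations per epoch, matching "iter .../893".
STEPS_PER_EPOCH = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
TOTAL_STEPS = STEPS_PER_EPOCH * MAX_EPOCHS
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_FRACTION)


def lr_at(step: int) -> float:
    """Hypothetical helper: learning rate after `step` optimizer steps.

    Assumes linear warmup from 0 to PEAK_LR, then linear decay to 0.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Global step = (epoch - 1) * STEPS_PER_EPOCH + iteration within the epoch.
    for epoch, it in [(1, 89), (1, 890), (2, 890), (10, 890)]:
        step = (epoch - 1) * STEPS_PER_EPOCH + it
        print(f"epoch {epoch} iter {it}: lr {lr_at(step):.6f}")
```

Under these assumptions the warmup ends exactly at the end of epoch 1 (893 of 8930 steps), which would explain why the logged rate peaks at 0.000050 there and falls afterwards.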
|
2023-10-25 18:09:17,565 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-25 18:09:17,565 - metric: "('micro avg', 'f1-score')" |
|
2023-10-25 18:09:17,565 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:09:17,565 Computation: |
|
2023-10-25 18:09:17,565 - compute on device: cuda:0 |
|
2023-10-25 18:09:17,565 - embedding storage: none |
|
2023-10-25 18:09:17,565 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:09:17,565 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-25 18:09:17,565 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:09:17,565 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:09:17,565 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-25 18:09:23,250 epoch 1 - iter 89/893 - loss 1.65132140 - time (sec): 5.68 - samples/sec: 4015.11 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 18:09:28,853 epoch 1 - iter 178/893 - loss 1.06278864 - time (sec): 11.29 - samples/sec: 4140.27 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 18:09:34,607 epoch 1 - iter 267/893 - loss 0.81561733 - time (sec): 17.04 - samples/sec: 4180.48 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 18:09:40,608 epoch 1 - iter 356/893 - loss 0.65844535 - time (sec): 23.04 - samples/sec: 4269.01 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 18:09:46,286 epoch 1 - iter 445/893 - loss 0.56413544 - time (sec): 28.72 - samples/sec: 4274.79 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 18:09:52,088 epoch 1 - iter 534/893 - loss 0.49928189 - time (sec): 34.52 - samples/sec: 4238.74 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 18:09:57,858 epoch 1 - iter 623/893 - loss 0.45029812 - time (sec): 40.29 - samples/sec: 4240.69 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-25 18:10:03,775 epoch 1 - iter 712/893 - loss 0.41037659 - time (sec): 46.21 - samples/sec: 4249.87 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-25 18:10:09,732 epoch 1 - iter 801/893 - loss 0.37745804 - time (sec): 52.17 - samples/sec: 4269.90 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-25 18:10:15,606 epoch 1 - iter 890/893 - loss 0.35448580 - time (sec): 58.04 - samples/sec: 4271.30 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-25 18:10:15,835 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:10:15,835 EPOCH 1 done: loss 0.3537 - lr: 0.000050 |
|
2023-10-25 18:10:20,017 DEV : loss 0.10159404575824738 - f1-score (micro avg) 0.7117 |
|
2023-10-25 18:10:20,039 saving best model |
|
2023-10-25 18:10:20,518 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:10:26,313 epoch 2 - iter 89/893 - loss 0.10501087 - time (sec): 5.79 - samples/sec: 4461.81 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-25 18:10:32,121 epoch 2 - iter 178/893 - loss 0.10733813 - time (sec): 11.60 - samples/sec: 4175.85 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-25 18:10:38,067 epoch 2 - iter 267/893 - loss 0.10809746 - time (sec): 17.55 - samples/sec: 4178.06 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-25 18:10:44,188 epoch 2 - iter 356/893 - loss 0.10864389 - time (sec): 23.67 - samples/sec: 4185.37 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-25 18:10:50,075 epoch 2 - iter 445/893 - loss 0.10989896 - time (sec): 29.56 - samples/sec: 4130.98 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-25 18:10:55,957 epoch 2 - iter 534/893 - loss 0.10852416 - time (sec): 35.44 - samples/sec: 4144.15 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-25 18:11:01,862 epoch 2 - iter 623/893 - loss 0.10744061 - time (sec): 41.34 - samples/sec: 4149.06 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-25 18:11:07,809 epoch 2 - iter 712/893 - loss 0.10583880 - time (sec): 47.29 - samples/sec: 4198.13 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-25 18:11:13,624 epoch 2 - iter 801/893 - loss 0.10472964 - time (sec): 53.10 - samples/sec: 4200.19 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-25 18:11:19,614 epoch 2 - iter 890/893 - loss 0.10478980 - time (sec): 59.09 - samples/sec: 4197.41 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-25 18:11:19,790 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:11:19,790 EPOCH 2 done: loss 0.1048 - lr: 0.000044 |
|
2023-10-25 18:11:25,332 DEV : loss 0.09946658462285995 - f1-score (micro avg) 0.765 |
|
2023-10-25 18:11:25,354 saving best model |
|
2023-10-25 18:11:25,992 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:11:31,956 epoch 3 - iter 89/893 - loss 0.05522842 - time (sec): 5.96 - samples/sec: 4202.31 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-25 18:11:38,008 epoch 3 - iter 178/893 - loss 0.06062471 - time (sec): 12.01 - samples/sec: 4096.34 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-25 18:11:43,870 epoch 3 - iter 267/893 - loss 0.06593504 - time (sec): 17.88 - samples/sec: 4116.95 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-25 18:11:49,717 epoch 3 - iter 356/893 - loss 0.06303396 - time (sec): 23.72 - samples/sec: 4161.60 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-25 18:11:55,572 epoch 3 - iter 445/893 - loss 0.06478063 - time (sec): 29.58 - samples/sec: 4183.60 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-25 18:12:01,310 epoch 3 - iter 534/893 - loss 0.06506442 - time (sec): 35.32 - samples/sec: 4221.56 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-25 18:12:07,098 epoch 3 - iter 623/893 - loss 0.06368569 - time (sec): 41.10 - samples/sec: 4253.09 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-25 18:12:12,565 epoch 3 - iter 712/893 - loss 0.06430331 - time (sec): 46.57 - samples/sec: 4226.93 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-25 18:12:18,426 epoch 3 - iter 801/893 - loss 0.06458432 - time (sec): 52.43 - samples/sec: 4239.18 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-25 18:12:24,096 epoch 3 - iter 890/893 - loss 0.06445304 - time (sec): 58.10 - samples/sec: 4267.55 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-25 18:12:24,293 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:12:24,294 EPOCH 3 done: loss 0.0644 - lr: 0.000039 |
|
2023-10-25 18:12:28,773 DEV : loss 0.12554588913917542 - f1-score (micro avg) 0.7652 |
|
2023-10-25 18:12:28,797 saving best model |
|
2023-10-25 18:12:29,446 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:12:35,361 epoch 4 - iter 89/893 - loss 0.03479416 - time (sec): 5.91 - samples/sec: 4162.26 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-25 18:12:41,460 epoch 4 - iter 178/893 - loss 0.04205946 - time (sec): 12.01 - samples/sec: 4193.39 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-25 18:12:47,435 epoch 4 - iter 267/893 - loss 0.04443524 - time (sec): 17.99 - samples/sec: 4142.35 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-25 18:12:53,146 epoch 4 - iter 356/893 - loss 0.04557293 - time (sec): 23.70 - samples/sec: 4173.08 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-25 18:12:59,134 epoch 4 - iter 445/893 - loss 0.04373872 - time (sec): 29.68 - samples/sec: 4148.50 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-25 18:13:05,160 epoch 4 - iter 534/893 - loss 0.04544751 - time (sec): 35.71 - samples/sec: 4163.70 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-25 18:13:11,006 epoch 4 - iter 623/893 - loss 0.04656032 - time (sec): 41.56 - samples/sec: 4159.60 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-25 18:13:16,899 epoch 4 - iter 712/893 - loss 0.04719361 - time (sec): 47.45 - samples/sec: 4181.65 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-25 18:13:22,625 epoch 4 - iter 801/893 - loss 0.04702508 - time (sec): 53.18 - samples/sec: 4201.77 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-25 18:13:28,404 epoch 4 - iter 890/893 - loss 0.04610922 - time (sec): 58.95 - samples/sec: 4208.06 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-25 18:13:28,580 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:13:28,581 EPOCH 4 done: loss 0.0461 - lr: 0.000033 |
|
2023-10-25 18:13:34,002 DEV : loss 0.1372321993112564 - f1-score (micro avg) 0.7862 |
|
2023-10-25 18:13:34,024 saving best model |
|
2023-10-25 18:13:34,692 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:13:40,502 epoch 5 - iter 89/893 - loss 0.04635040 - time (sec): 5.81 - samples/sec: 4069.25 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-25 18:13:46,432 epoch 5 - iter 178/893 - loss 0.03653266 - time (sec): 11.74 - samples/sec: 4288.50 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-25 18:13:52,285 epoch 5 - iter 267/893 - loss 0.03520206 - time (sec): 17.59 - samples/sec: 4270.09 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-25 18:13:58,092 epoch 5 - iter 356/893 - loss 0.03411594 - time (sec): 23.40 - samples/sec: 4288.79 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-25 18:14:04,128 epoch 5 - iter 445/893 - loss 0.03382321 - time (sec): 29.43 - samples/sec: 4274.27 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-25 18:14:09,938 epoch 5 - iter 534/893 - loss 0.03422126 - time (sec): 35.24 - samples/sec: 4232.30 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 18:14:15,808 epoch 5 - iter 623/893 - loss 0.03418581 - time (sec): 41.11 - samples/sec: 4243.40 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 18:14:21,817 epoch 5 - iter 712/893 - loss 0.03392476 - time (sec): 47.12 - samples/sec: 4211.80 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 18:14:27,540 epoch 5 - iter 801/893 - loss 0.03385011 - time (sec): 52.85 - samples/sec: 4199.33 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 18:14:33,522 epoch 5 - iter 890/893 - loss 0.03490586 - time (sec): 58.83 - samples/sec: 4212.08 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 18:14:33,722 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:14:33,723 EPOCH 5 done: loss 0.0348 - lr: 0.000028 |
|
2023-10-25 18:14:37,895 DEV : loss 0.14049120247364044 - f1-score (micro avg) 0.7836 |
|
2023-10-25 18:14:37,916 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:14:44,341 epoch 6 - iter 89/893 - loss 0.02763906 - time (sec): 6.42 - samples/sec: 3921.78 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 18:14:50,155 epoch 6 - iter 178/893 - loss 0.02912153 - time (sec): 12.24 - samples/sec: 4172.07 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 18:14:55,713 epoch 6 - iter 267/893 - loss 0.02581420 - time (sec): 17.80 - samples/sec: 4180.14 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 18:15:01,227 epoch 6 - iter 356/893 - loss 0.02606856 - time (sec): 23.31 - samples/sec: 4259.32 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 18:15:06,921 epoch 6 - iter 445/893 - loss 0.02492310 - time (sec): 29.00 - samples/sec: 4310.50 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 18:15:12,527 epoch 6 - iter 534/893 - loss 0.02555709 - time (sec): 34.61 - samples/sec: 4331.19 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 18:15:18,003 epoch 6 - iter 623/893 - loss 0.02495625 - time (sec): 40.08 - samples/sec: 4365.30 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 18:15:23,840 epoch 6 - iter 712/893 - loss 0.02450773 - time (sec): 45.92 - samples/sec: 4318.83 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 18:15:29,872 epoch 6 - iter 801/893 - loss 0.02444535 - time (sec): 51.95 - samples/sec: 4336.59 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 18:15:35,272 epoch 6 - iter 890/893 - loss 0.02562554 - time (sec): 57.35 - samples/sec: 4325.96 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 18:15:35,451 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:15:35,452 EPOCH 6 done: loss 0.0257 - lr: 0.000022 |
|
2023-10-25 18:15:39,796 DEV : loss 0.17027460038661957 - f1-score (micro avg) 0.7798 |
|
2023-10-25 18:15:39,819 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:15:45,625 epoch 7 - iter 89/893 - loss 0.02266238 - time (sec): 5.80 - samples/sec: 4467.24 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 18:15:50,967 epoch 7 - iter 178/893 - loss 0.01912464 - time (sec): 11.15 - samples/sec: 4417.31 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 18:15:56,616 epoch 7 - iter 267/893 - loss 0.01869102 - time (sec): 16.80 - samples/sec: 4420.10 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 18:16:02,234 epoch 7 - iter 356/893 - loss 0.02015385 - time (sec): 22.41 - samples/sec: 4407.80 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 18:16:08,267 epoch 7 - iter 445/893 - loss 0.02014003 - time (sec): 28.45 - samples/sec: 4409.43 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 18:16:14,221 epoch 7 - iter 534/893 - loss 0.01999305 - time (sec): 34.40 - samples/sec: 4383.95 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 18:16:19,845 epoch 7 - iter 623/893 - loss 0.01930448 - time (sec): 40.02 - samples/sec: 4416.99 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 18:16:25,243 epoch 7 - iter 712/893 - loss 0.01963156 - time (sec): 45.42 - samples/sec: 4377.22 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 18:16:30,988 epoch 7 - iter 801/893 - loss 0.01985202 - time (sec): 51.17 - samples/sec: 4364.53 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 18:16:36,726 epoch 7 - iter 890/893 - loss 0.01990503 - time (sec): 56.91 - samples/sec: 4354.46 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 18:16:36,915 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:16:36,915 EPOCH 7 done: loss 0.0199 - lr: 0.000017 |
|
2023-10-25 18:16:42,672 DEV : loss 0.19226676225662231 - f1-score (micro avg) 0.7799 |
|
2023-10-25 18:16:42,695 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:16:48,797 epoch 8 - iter 89/893 - loss 0.02127649 - time (sec): 6.10 - samples/sec: 4059.70 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 18:16:54,491 epoch 8 - iter 178/893 - loss 0.01835301 - time (sec): 11.79 - samples/sec: 4108.32 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 18:17:00,306 epoch 8 - iter 267/893 - loss 0.01566630 - time (sec): 17.61 - samples/sec: 4176.93 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 18:17:06,122 epoch 8 - iter 356/893 - loss 0.01498561 - time (sec): 23.42 - samples/sec: 4231.40 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 18:17:11,874 epoch 8 - iter 445/893 - loss 0.01437023 - time (sec): 29.18 - samples/sec: 4237.49 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 18:17:17,393 epoch 8 - iter 534/893 - loss 0.01500300 - time (sec): 34.70 - samples/sec: 4237.39 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 18:17:22,870 epoch 8 - iter 623/893 - loss 0.01579148 - time (sec): 40.17 - samples/sec: 4270.60 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 18:17:28,500 epoch 8 - iter 712/893 - loss 0.01534715 - time (sec): 45.80 - samples/sec: 4292.33 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 18:17:34,330 epoch 8 - iter 801/893 - loss 0.01512829 - time (sec): 51.63 - samples/sec: 4323.70 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 18:17:40,007 epoch 8 - iter 890/893 - loss 0.01492451 - time (sec): 57.31 - samples/sec: 4327.28 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 18:17:40,180 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:17:40,181 EPOCH 8 done: loss 0.0149 - lr: 0.000011 |
|
2023-10-25 18:17:45,426 DEV : loss 0.20437157154083252 - f1-score (micro avg) 0.8035 |
|
2023-10-25 18:17:45,449 saving best model |
|
2023-10-25 18:17:46,125 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:17:52,051 epoch 9 - iter 89/893 - loss 0.00720147 - time (sec): 5.92 - samples/sec: 4304.66 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 18:17:57,758 epoch 9 - iter 178/893 - loss 0.00797037 - time (sec): 11.63 - samples/sec: 4156.36 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 18:18:03,741 epoch 9 - iter 267/893 - loss 0.01074602 - time (sec): 17.61 - samples/sec: 4115.72 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 18:18:09,547 epoch 9 - iter 356/893 - loss 0.01102070 - time (sec): 23.42 - samples/sec: 4141.98 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 18:18:15,305 epoch 9 - iter 445/893 - loss 0.01067919 - time (sec): 29.18 - samples/sec: 4120.11 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 18:18:21,197 epoch 9 - iter 534/893 - loss 0.01118844 - time (sec): 35.07 - samples/sec: 4186.96 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 18:18:27,184 epoch 9 - iter 623/893 - loss 0.01159658 - time (sec): 41.06 - samples/sec: 4167.30 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 18:18:33,415 epoch 9 - iter 712/893 - loss 0.01121097 - time (sec): 47.29 - samples/sec: 4157.68 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 18:18:39,219 epoch 9 - iter 801/893 - loss 0.01074529 - time (sec): 53.09 - samples/sec: 4172.05 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 18:18:44,963 epoch 9 - iter 890/893 - loss 0.01044653 - time (sec): 58.83 - samples/sec: 4217.16 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 18:18:45,130 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:18:45,131 EPOCH 9 done: loss 0.0105 - lr: 0.000006 |
|
2023-10-25 18:18:49,498 DEV : loss 0.20618826150894165 - f1-score (micro avg) 0.8 |
|
2023-10-25 18:18:49,521 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:18:55,313 epoch 10 - iter 89/893 - loss 0.00546683 - time (sec): 5.79 - samples/sec: 4337.89 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 18:19:00,951 epoch 10 - iter 178/893 - loss 0.00510394 - time (sec): 11.43 - samples/sec: 4262.88 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 18:19:06,678 epoch 10 - iter 267/893 - loss 0.00543944 - time (sec): 17.16 - samples/sec: 4321.42 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 18:19:12,302 epoch 10 - iter 356/893 - loss 0.00534475 - time (sec): 22.78 - samples/sec: 4366.80 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 18:19:18,206 epoch 10 - iter 445/893 - loss 0.00559368 - time (sec): 28.68 - samples/sec: 4343.29 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 18:19:24,055 epoch 10 - iter 534/893 - loss 0.00562371 - time (sec): 34.53 - samples/sec: 4310.63 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 18:19:29,965 epoch 10 - iter 623/893 - loss 0.00643885 - time (sec): 40.44 - samples/sec: 4282.06 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 18:19:35,983 epoch 10 - iter 712/893 - loss 0.00611002 - time (sec): 46.46 - samples/sec: 4284.89 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 18:19:41,750 epoch 10 - iter 801/893 - loss 0.00600347 - time (sec): 52.23 - samples/sec: 4283.04 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 18:19:47,445 epoch 10 - iter 890/893 - loss 0.00666217 - time (sec): 57.92 - samples/sec: 4281.34 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-25 18:19:47,627 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:19:47,628 EPOCH 10 done: loss 0.0066 - lr: 0.000000 |
|
2023-10-25 18:19:53,110 DEV : loss 0.2108031064271927 - f1-score (micro avg) 0.8091 |
|
2023-10-25 18:19:53,131 saving best model |
|
2023-10-25 18:19:54,267 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 18:19:54,269 Loading model from best epoch ... |
|
2023-10-25 18:19:56,139 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-25 18:20:08,490 |
|
Results: |
|
- F-score (micro) 0.6887 |
|
- F-score (macro) 0.6019 |
|
- Accuracy 0.5414 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|

|
         LOC     0.6914    0.6795    0.6854      1095 |
|
         PER     0.7963    0.7648    0.7802      1012 |
|
         ORG     0.4454    0.5938    0.5090       357 |
|
   HumanProd     0.3281    0.6364    0.4330        33 |
|

|
   micro avg     0.6766    0.7012    0.6887      2497 |
|
   macro avg     0.5653    0.6686    0.6019      2497 |
|
weighted avg     0.6940    0.7012    0.6953      2497 |
|
|
|
2023-10-25 18:20:08,490 ---------------------------------------------------------------------------------------------------- |
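The three averaged F1 rows in the test report can be cross-checked against the per-class rows. The sketch below is a minimal check that uses only the rounded numbers printed in this log, so it can agree with the report only to about four decimal places. The names `rows` and `f1` are illustrative and are not part of Flair.

```python
# Per-class rows copied from the "By class" table:
# label -> (precision, recall, f1-score, support)
rows = {
    "LOC": (0.6914, 0.6795, 0.6854, 1095),
    "PER": (0.7963, 0.7648, 0.7802, 1012),
    "ORG": (0.4454, 0.5938, 0.5090, 357),
    "HumanProd": (0.3281, 0.6364, 0.4330, 33),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(support for *_, support in rows.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(score for _, _, score, _ in rows.values()) / len(rows)

# Weighted average: per-class F1 weighted by support.
weighted_f1 = sum(score * support for _, _, score, support in rows.values()) / total_support

# Micro average: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = f1(0.6766, 0.7012)

if __name__ == "__main__":
    print(f"support  {total_support}")
    print(f"micro    {micro_f1:.4f}")
    print(f"macro    {macro_f1:.4f}")
    print(f"weighted {weighted_f1:.4f}")
```

The gap between micro F1 (0.6887) and macro F1 (0.6019) comes from the two smaller classes, ORG and HumanProd, which score well below LOC and PER but count equally in the macro average.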
|
|