|
2023-10-16 08:59:27,927 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 08:59:27,928 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-16 08:59:27,928 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 08:59:27,928 MultiCorpus: 7142 train + 698 dev + 2570 test sentences |
|
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator |
|
2023-10-16 08:59:27,928 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 08:59:27,928 Train: 7142 sentences |
|
2023-10-16 08:59:27,928 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 08:59:27,928 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 08:59:27,928 Training Params: |
|
2023-10-16 08:59:27,928 - learning_rate: "5e-05" |
|
2023-10-16 08:59:27,928 - mini_batch_size: "4" |
|
2023-10-16 08:59:27,928 - max_epochs: "10" |
|
2023-10-16 08:59:27,928 - shuffle: "True" |
|
2023-10-16 08:59:27,929 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 08:59:27,929 Plugins: |
|
2023-10-16 08:59:27,929 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 08:59:27,929 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 08:59:27,929 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 08:59:27,929 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 08:59:27,929 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 08:59:27,929 Computation: |
|
2023-10-16 08:59:27,929 - compute on device: cuda:0 |
|
2023-10-16 08:59:27,929 - embedding storage: none |
|
2023-10-16 08:59:27,929 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 08:59:27,929 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-16 08:59:27,929 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 08:59:27,929 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 08:59:36,701 epoch 1 - iter 178/1786 - loss 1.93950853 - time (sec): 8.77 - samples/sec: 2806.48 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 08:59:45,303 epoch 1 - iter 356/1786 - loss 1.19684096 - time (sec): 17.37 - samples/sec: 2873.23 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 08:59:53,918 epoch 1 - iter 534/1786 - loss 0.90350754 - time (sec): 25.99 - samples/sec: 2868.02 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 09:00:02,649 epoch 1 - iter 712/1786 - loss 0.73207711 - time (sec): 34.72 - samples/sec: 2899.46 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 09:00:11,223 epoch 1 - iter 890/1786 - loss 0.63211784 - time (sec): 43.29 - samples/sec: 2877.60 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 09:00:19,866 epoch 1 - iter 1068/1786 - loss 0.55770395 - time (sec): 51.94 - samples/sec: 2860.36 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 09:00:28,458 epoch 1 - iter 1246/1786 - loss 0.50479715 - time (sec): 60.53 - samples/sec: 2864.56 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 09:00:37,259 epoch 1 - iter 1424/1786 - loss 0.45844199 - time (sec): 69.33 - samples/sec: 2856.58 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 09:00:46,035 epoch 1 - iter 1602/1786 - loss 0.42492404 - time (sec): 78.11 - samples/sec: 2872.10 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 09:00:54,534 epoch 1 - iter 1780/1786 - loss 0.40035384 - time (sec): 86.60 - samples/sec: 2862.69 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-16 09:00:54,814 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:00:54,815 EPOCH 1 done: loss 0.3994 - lr: 0.000050 |
|
2023-10-16 09:00:57,828 DEV : loss 0.164632648229599 - f1-score (micro avg) 0.6901 |
|
2023-10-16 09:00:57,844 saving best model |
|
2023-10-16 09:00:58,206 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:01:06,863 epoch 2 - iter 178/1786 - loss 0.11487137 - time (sec): 8.66 - samples/sec: 2797.96 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 09:01:15,536 epoch 2 - iter 356/1786 - loss 0.10865483 - time (sec): 17.33 - samples/sec: 2865.27 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 09:01:24,138 epoch 2 - iter 534/1786 - loss 0.11685553 - time (sec): 25.93 - samples/sec: 2839.41 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 09:01:32,735 epoch 2 - iter 712/1786 - loss 0.11885426 - time (sec): 34.53 - samples/sec: 2838.78 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 09:01:41,455 epoch 2 - iter 890/1786 - loss 0.11920458 - time (sec): 43.25 - samples/sec: 2830.37 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 09:01:50,204 epoch 2 - iter 1068/1786 - loss 0.11979459 - time (sec): 52.00 - samples/sec: 2832.05 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 09:01:58,827 epoch 2 - iter 1246/1786 - loss 0.12470555 - time (sec): 60.62 - samples/sec: 2847.73 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 09:02:07,725 epoch 2 - iter 1424/1786 - loss 0.12324524 - time (sec): 69.52 - samples/sec: 2848.49 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 09:02:16,897 epoch 2 - iter 1602/1786 - loss 0.12369002 - time (sec): 78.69 - samples/sec: 2834.51 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 09:02:25,680 epoch 2 - iter 1780/1786 - loss 0.12238966 - time (sec): 87.47 - samples/sec: 2833.26 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-16 09:02:25,944 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:02:25,944 EPOCH 2 done: loss 0.1228 - lr: 0.000044 |
|
2023-10-16 09:02:30,616 DEV : loss 0.1199851781129837 - f1-score (micro avg) 0.7637 |
|
2023-10-16 09:02:30,633 saving best model |
|
2023-10-16 09:02:31,073 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:02:39,971 epoch 3 - iter 178/1786 - loss 0.08422764 - time (sec): 8.90 - samples/sec: 2650.63 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-16 09:02:48,902 epoch 3 - iter 356/1786 - loss 0.08832799 - time (sec): 17.83 - samples/sec: 2666.78 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 09:02:57,286 epoch 3 - iter 534/1786 - loss 0.08603336 - time (sec): 26.21 - samples/sec: 2720.96 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 09:03:06,353 epoch 3 - iter 712/1786 - loss 0.08439697 - time (sec): 35.28 - samples/sec: 2768.02 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 09:03:15,079 epoch 3 - iter 890/1786 - loss 0.08403027 - time (sec): 44.00 - samples/sec: 2776.45 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 09:03:23,832 epoch 3 - iter 1068/1786 - loss 0.08339811 - time (sec): 52.76 - samples/sec: 2796.66 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 09:03:32,601 epoch 3 - iter 1246/1786 - loss 0.08366626 - time (sec): 61.53 - samples/sec: 2796.00 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 09:03:41,648 epoch 3 - iter 1424/1786 - loss 0.08520626 - time (sec): 70.57 - samples/sec: 2811.10 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 09:03:50,239 epoch 3 - iter 1602/1786 - loss 0.08861563 - time (sec): 79.16 - samples/sec: 2795.22 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-16 09:03:59,241 epoch 3 - iter 1780/1786 - loss 0.08777026 - time (sec): 88.17 - samples/sec: 2814.81 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-16 09:03:59,514 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:03:59,514 EPOCH 3 done: loss 0.0876 - lr: 0.000039 |
|
2023-10-16 09:04:03,562 DEV : loss 0.12482591718435287 - f1-score (micro avg) 0.7656 |
|
2023-10-16 09:04:03,577 saving best model |
|
2023-10-16 09:04:04,030 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:04:12,850 epoch 4 - iter 178/1786 - loss 0.07490226 - time (sec): 8.82 - samples/sec: 2826.41 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 09:04:21,987 epoch 4 - iter 356/1786 - loss 0.07388713 - time (sec): 17.96 - samples/sec: 2819.39 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 09:04:30,657 epoch 4 - iter 534/1786 - loss 0.07199900 - time (sec): 26.63 - samples/sec: 2796.78 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 09:04:39,443 epoch 4 - iter 712/1786 - loss 0.06917859 - time (sec): 35.41 - samples/sec: 2848.90 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 09:04:48,062 epoch 4 - iter 890/1786 - loss 0.06579901 - time (sec): 44.03 - samples/sec: 2843.02 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 09:04:57,041 epoch 4 - iter 1068/1786 - loss 0.06377113 - time (sec): 53.01 - samples/sec: 2843.47 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 09:05:05,388 epoch 4 - iter 1246/1786 - loss 0.06597385 - time (sec): 61.36 - samples/sec: 2819.29 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 09:05:14,250 epoch 4 - iter 1424/1786 - loss 0.06743665 - time (sec): 70.22 - samples/sec: 2821.07 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 09:05:23,540 epoch 4 - iter 1602/1786 - loss 0.06657675 - time (sec): 79.51 - samples/sec: 2802.30 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 09:05:32,491 epoch 4 - iter 1780/1786 - loss 0.06664923 - time (sec): 88.46 - samples/sec: 2801.56 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 09:05:32,817 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:05:32,817 EPOCH 4 done: loss 0.0669 - lr: 0.000033 |
|
2023-10-16 09:05:36,817 DEV : loss 0.17001311480998993 - f1-score (micro avg) 0.7773 |
|
2023-10-16 09:05:36,833 saving best model |
|
2023-10-16 09:05:37,282 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:05:46,244 epoch 5 - iter 178/1786 - loss 0.05321277 - time (sec): 8.96 - samples/sec: 2962.54 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 09:05:55,007 epoch 5 - iter 356/1786 - loss 0.05027400 - time (sec): 17.72 - samples/sec: 2914.67 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 09:06:03,991 epoch 5 - iter 534/1786 - loss 0.05055258 - time (sec): 26.70 - samples/sec: 2883.74 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 09:06:12,573 epoch 5 - iter 712/1786 - loss 0.04836095 - time (sec): 35.29 - samples/sec: 2930.97 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 09:06:21,516 epoch 5 - iter 890/1786 - loss 0.04806037 - time (sec): 44.23 - samples/sec: 2907.81 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 09:06:30,294 epoch 5 - iter 1068/1786 - loss 0.04703404 - time (sec): 53.01 - samples/sec: 2880.93 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 09:06:38,968 epoch 5 - iter 1246/1786 - loss 0.04922091 - time (sec): 61.68 - samples/sec: 2876.27 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 09:06:47,627 epoch 5 - iter 1424/1786 - loss 0.05007639 - time (sec): 70.34 - samples/sec: 2846.92 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 09:06:56,313 epoch 5 - iter 1602/1786 - loss 0.05063967 - time (sec): 79.03 - samples/sec: 2824.96 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 09:07:04,982 epoch 5 - iter 1780/1786 - loss 0.05136262 - time (sec): 87.70 - samples/sec: 2821.84 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 09:07:05,332 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:07:05,332 EPOCH 5 done: loss 0.0514 - lr: 0.000028 |
|
2023-10-16 09:07:09,890 DEV : loss 0.1479008048772812 - f1-score (micro avg) 0.7853 |
|
2023-10-16 09:07:09,906 saving best model |
|
2023-10-16 09:07:10,355 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:07:19,191 epoch 6 - iter 178/1786 - loss 0.04084234 - time (sec): 8.83 - samples/sec: 2787.69 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 09:07:27,857 epoch 6 - iter 356/1786 - loss 0.03988235 - time (sec): 17.50 - samples/sec: 2819.19 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 09:07:36,931 epoch 6 - iter 534/1786 - loss 0.04230182 - time (sec): 26.57 - samples/sec: 2812.14 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 09:07:45,917 epoch 6 - iter 712/1786 - loss 0.04315507 - time (sec): 35.56 - samples/sec: 2844.22 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 09:07:54,708 epoch 6 - iter 890/1786 - loss 0.03981327 - time (sec): 44.35 - samples/sec: 2841.28 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 09:08:03,319 epoch 6 - iter 1068/1786 - loss 0.04083548 - time (sec): 52.96 - samples/sec: 2847.09 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 09:08:12,027 epoch 6 - iter 1246/1786 - loss 0.03919229 - time (sec): 61.67 - samples/sec: 2864.30 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 09:08:20,916 epoch 6 - iter 1424/1786 - loss 0.03877892 - time (sec): 70.56 - samples/sec: 2859.52 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 09:08:29,628 epoch 6 - iter 1602/1786 - loss 0.03867605 - time (sec): 79.27 - samples/sec: 2826.83 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 09:08:38,418 epoch 6 - iter 1780/1786 - loss 0.03727970 - time (sec): 88.06 - samples/sec: 2818.19 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 09:08:38,691 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:08:38,692 EPOCH 6 done: loss 0.0374 - lr: 0.000022 |
|
2023-10-16 09:08:42,752 DEV : loss 0.17757560312747955 - f1-score (micro avg) 0.7965 |
|
2023-10-16 09:08:42,768 saving best model |
|
2023-10-16 09:08:43,226 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:08:52,089 epoch 7 - iter 178/1786 - loss 0.02636464 - time (sec): 8.86 - samples/sec: 2939.18 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 09:09:00,968 epoch 7 - iter 356/1786 - loss 0.02439338 - time (sec): 17.74 - samples/sec: 2863.45 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 09:09:10,107 epoch 7 - iter 534/1786 - loss 0.02868364 - time (sec): 26.88 - samples/sec: 2854.02 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 09:09:19,060 epoch 7 - iter 712/1786 - loss 0.02820433 - time (sec): 35.83 - samples/sec: 2819.85 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 09:09:27,628 epoch 7 - iter 890/1786 - loss 0.02740708 - time (sec): 44.40 - samples/sec: 2785.70 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 09:09:36,262 epoch 7 - iter 1068/1786 - loss 0.02931505 - time (sec): 53.03 - samples/sec: 2806.18 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 09:09:45,216 epoch 7 - iter 1246/1786 - loss 0.03044116 - time (sec): 61.98 - samples/sec: 2800.05 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 09:09:54,031 epoch 7 - iter 1424/1786 - loss 0.02978701 - time (sec): 70.80 - samples/sec: 2799.97 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 09:10:02,873 epoch 7 - iter 1602/1786 - loss 0.02989961 - time (sec): 79.64 - samples/sec: 2806.68 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 09:10:11,596 epoch 7 - iter 1780/1786 - loss 0.02875032 - time (sec): 88.37 - samples/sec: 2808.03 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 09:10:11,893 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:10:11,893 EPOCH 7 done: loss 0.0287 - lr: 0.000017 |
|
2023-10-16 09:10:16,471 DEV : loss 0.18708880245685577 - f1-score (micro avg) 0.7881 |
|
2023-10-16 09:10:16,487 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:10:25,496 epoch 8 - iter 178/1786 - loss 0.02251176 - time (sec): 9.01 - samples/sec: 2740.85 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 09:10:34,741 epoch 8 - iter 356/1786 - loss 0.02431309 - time (sec): 18.25 - samples/sec: 2821.65 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 09:10:43,326 epoch 8 - iter 534/1786 - loss 0.02293182 - time (sec): 26.84 - samples/sec: 2799.53 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 09:10:52,052 epoch 8 - iter 712/1786 - loss 0.02282164 - time (sec): 35.56 - samples/sec: 2771.58 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 09:11:00,770 epoch 8 - iter 890/1786 - loss 0.02376710 - time (sec): 44.28 - samples/sec: 2737.99 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 09:11:10,034 epoch 8 - iter 1068/1786 - loss 0.02298881 - time (sec): 53.55 - samples/sec: 2757.59 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 09:11:18,807 epoch 8 - iter 1246/1786 - loss 0.02269837 - time (sec): 62.32 - samples/sec: 2768.66 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 09:11:27,722 epoch 8 - iter 1424/1786 - loss 0.02263600 - time (sec): 71.23 - samples/sec: 2798.75 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 09:11:36,520 epoch 8 - iter 1602/1786 - loss 0.02277881 - time (sec): 80.03 - samples/sec: 2808.31 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 09:11:44,784 epoch 8 - iter 1780/1786 - loss 0.02229267 - time (sec): 88.30 - samples/sec: 2811.57 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 09:11:45,035 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:11:45,036 EPOCH 8 done: loss 0.0224 - lr: 0.000011 |
|
2023-10-16 09:11:49,077 DEV : loss 0.2031172513961792 - f1-score (micro avg) 0.7976 |
|
2023-10-16 09:11:49,093 saving best model |
|
2023-10-16 09:11:49,555 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:11:58,992 epoch 9 - iter 178/1786 - loss 0.02189545 - time (sec): 9.43 - samples/sec: 2629.64 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 09:12:07,517 epoch 9 - iter 356/1786 - loss 0.01747601 - time (sec): 17.96 - samples/sec: 2762.19 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 09:12:16,272 epoch 9 - iter 534/1786 - loss 0.01786279 - time (sec): 26.71 - samples/sec: 2828.56 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 09:12:25,209 epoch 9 - iter 712/1786 - loss 0.01550724 - time (sec): 35.65 - samples/sec: 2797.98 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 09:12:33,988 epoch 9 - iter 890/1786 - loss 0.01545366 - time (sec): 44.43 - samples/sec: 2794.14 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 09:12:42,895 epoch 9 - iter 1068/1786 - loss 0.01489816 - time (sec): 53.34 - samples/sec: 2812.03 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 09:12:51,587 epoch 9 - iter 1246/1786 - loss 0.01528282 - time (sec): 62.03 - samples/sec: 2832.17 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 09:13:00,247 epoch 9 - iter 1424/1786 - loss 0.01453460 - time (sec): 70.69 - samples/sec: 2840.93 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 09:13:08,659 epoch 9 - iter 1602/1786 - loss 0.01464221 - time (sec): 79.10 - samples/sec: 2832.27 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 09:13:17,327 epoch 9 - iter 1780/1786 - loss 0.01504874 - time (sec): 87.77 - samples/sec: 2825.63 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 09:13:17,625 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:13:17,625 EPOCH 9 done: loss 0.0150 - lr: 0.000006 |
|
2023-10-16 09:13:21,657 DEV : loss 0.18724995851516724 - f1-score (micro avg) 0.8027 |
|
2023-10-16 09:13:21,673 saving best model |
|
2023-10-16 09:13:22,121 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:13:31,015 epoch 10 - iter 178/1786 - loss 0.00571772 - time (sec): 8.89 - samples/sec: 2914.90 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 09:13:39,977 epoch 10 - iter 356/1786 - loss 0.00917164 - time (sec): 17.85 - samples/sec: 2941.77 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 09:13:49,087 epoch 10 - iter 534/1786 - loss 0.00931697 - time (sec): 26.96 - samples/sec: 2873.81 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 09:13:57,666 epoch 10 - iter 712/1786 - loss 0.00909777 - time (sec): 35.54 - samples/sec: 2879.62 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 09:14:06,392 epoch 10 - iter 890/1786 - loss 0.00911553 - time (sec): 44.27 - samples/sec: 2859.28 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 09:14:14,995 epoch 10 - iter 1068/1786 - loss 0.00852015 - time (sec): 52.87 - samples/sec: 2838.41 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 09:14:23,665 epoch 10 - iter 1246/1786 - loss 0.00864843 - time (sec): 61.54 - samples/sec: 2817.84 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 09:14:32,625 epoch 10 - iter 1424/1786 - loss 0.00874912 - time (sec): 70.50 - samples/sec: 2815.35 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 09:14:41,364 epoch 10 - iter 1602/1786 - loss 0.00888492 - time (sec): 79.24 - samples/sec: 2821.53 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 09:14:50,107 epoch 10 - iter 1780/1786 - loss 0.00921080 - time (sec): 87.98 - samples/sec: 2818.57 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 09:14:50,383 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:14:50,383 EPOCH 10 done: loss 0.0092 - lr: 0.000000 |
|
2023-10-16 09:14:55,010 DEV : loss 0.2006855458021164 - f1-score (micro avg) 0.8054 |
|
2023-10-16 09:14:55,026 saving best model |
|
2023-10-16 09:14:55,864 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 09:14:55,865 Loading model from best epoch ... |
|
2023-10-16 09:14:57,658 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-16 09:15:06,928 |
|
Results: |
|
- F-score (micro) 0.6898 |
|
- F-score (macro) 0.6075 |
|
- Accuracy 0.5436 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC    0.7223    0.6721    0.6963      1095 |
|
         PER    0.7563    0.7668    0.7615      1012 |
|
         ORG    0.4680    0.5742    0.5157       357 |
|
   HumanProd    0.3559    0.6364    0.4565        33 |
|
|
|
   micro avg    0.6837    0.6960    0.6898      2497 |
|
   macro avg    0.5756    0.6624    0.6075      2497 |
|
weighted avg    0.6949    0.6960    0.6938      2497 |
|
|
|
2023-10-16 09:15:06,928 ---------------------------------------------------------------------------------------------------- |
|
|