2023-10-14 11:33:32,804 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,805 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
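The layer shapes printed above fix the model's parameter counts. As a quick sanity check (computed here from the printed shapes, not taken from the log; LayerNorm parameters are ignored as negligible), the sizes work out as follows:

```python
# Parameter counts implied by the module shapes in the model printout above.

def linear_params(n_in, n_out, bias=True):
    """Weights plus optional bias of a Linear layer."""
    return n_in * n_out + (n_out if bias else 0)

# Embedding tables: vocab 32001, positions 512, token types 2, hidden size 768
embedding_params = (32001 + 512 + 2) * 768

# One BertLayer: Q/K/V plus attention output (768->768 each),
# then the feed-forward block (768->3072 and 3072->768)
attention = 3 * linear_params(768, 768) + linear_params(768, 768)
ffn = linear_params(768, 3072) + linear_params(3072, 768)
per_layer = attention + ffn

# Tagging head: 768 -> 13 BIOES tags
head_params = linear_params(768, 13)

print(head_params)       # 9997
print(embedding_params)  # ~25M in the embedding tables
print(12 * per_layer)    # ~85M across the 12 encoder layers
```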
2023-10-14 11:33:32,805 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,805 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 11:33:32,805 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,805 Train: 5777 sentences
2023-10-14 11:33:32,805 (train_with_dev=False, train_with_test=False)
2023-10-14 11:33:32,805 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,805 Training Params:
2023-10-14 11:33:32,805 - learning_rate: "5e-05"
2023-10-14 11:33:32,806 - mini_batch_size: "4"
2023-10-14 11:33:32,806 - max_epochs: "10"
2023-10-14 11:33:32,806 - shuffle: "True"
2023-10-14 11:33:32,806 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,806 Plugins:
2023-10-14 11:33:32,806 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 11:33:32,806 ----------------------------------------------------------------------------------------------------
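The LinearScheduler plugin with warmup_fraction '0.1' ramps the learning rate linearly from 0 to the peak 5e-05 over the first 10% of the 14450 total steps (i.e. all of epoch 1), then decays it linearly to 0. A minimal sketch of such a schedule (off-by-one details may differ from Flair's actual implementation) reproduces the lr values logged during training:

```python
def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1445 * 10   # 1445 mini-batches per epoch, 10 epochs
peak = 5e-5

# Warmup finishes at step 1445, i.e. the end of epoch 1 -- matching the
# "lr: 0.000050" logged at iter 1440/1445 of epoch 1.
print(linear_warmup_lr(1445, total, peak))  # 5e-05

# End of epoch 2 (step 2890), matching the logged "lr: 0.000044".
print(round(linear_warmup_lr(2890, total, peak), 6))
```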
2023-10-14 11:33:32,806 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 11:33:32,806 - metric: "('micro avg', 'f1-score')"
2023-10-14 11:33:32,806 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,806 Computation:
2023-10-14 11:33:32,806 - compute on device: cuda:0
2023-10-14 11:33:32,806 - embedding storage: none
2023-10-14 11:33:32,806 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,806 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-14 11:33:32,806 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:32,806 ----------------------------------------------------------------------------------------------------
2023-10-14 11:33:41,067 epoch 1 - iter 144/1445 - loss 1.43369159 - time (sec): 8.26 - samples/sec: 2245.44 - lr: 0.000005 - momentum: 0.000000
2023-10-14 11:33:48,582 epoch 1 - iter 288/1445 - loss 0.85898979 - time (sec): 15.78 - samples/sec: 2276.91 - lr: 0.000010 - momentum: 0.000000
2023-10-14 11:33:55,828 epoch 1 - iter 432/1445 - loss 0.64395719 - time (sec): 23.02 - samples/sec: 2316.75 - lr: 0.000015 - momentum: 0.000000
2023-10-14 11:34:03,175 epoch 1 - iter 576/1445 - loss 0.53253749 - time (sec): 30.37 - samples/sec: 2350.21 - lr: 0.000020 - momentum: 0.000000
2023-10-14 11:34:10,523 epoch 1 - iter 720/1445 - loss 0.46097312 - time (sec): 37.72 - samples/sec: 2351.32 - lr: 0.000025 - momentum: 0.000000
2023-10-14 11:34:17,871 epoch 1 - iter 864/1445 - loss 0.41136182 - time (sec): 45.06 - samples/sec: 2357.69 - lr: 0.000030 - momentum: 0.000000
2023-10-14 11:34:25,309 epoch 1 - iter 1008/1445 - loss 0.37502039 - time (sec): 52.50 - samples/sec: 2361.41 - lr: 0.000035 - momentum: 0.000000
2023-10-14 11:34:32,412 epoch 1 - iter 1152/1445 - loss 0.34642014 - time (sec): 59.61 - samples/sec: 2352.91 - lr: 0.000040 - momentum: 0.000000
2023-10-14 11:34:39,655 epoch 1 - iter 1296/1445 - loss 0.32134796 - time (sec): 66.85 - samples/sec: 2361.73 - lr: 0.000045 - momentum: 0.000000
2023-10-14 11:34:46,952 epoch 1 - iter 1440/1445 - loss 0.30279384 - time (sec): 74.14 - samples/sec: 2368.90 - lr: 0.000050 - momentum: 0.000000
2023-10-14 11:34:47,192 ----------------------------------------------------------------------------------------------------
2023-10-14 11:34:47,193 EPOCH 1 done: loss 0.3024 - lr: 0.000050
2023-10-14 11:34:50,329 DEV : loss 0.14426745474338531 - f1-score (micro avg) 0.631
2023-10-14 11:34:50,345 saving best model
2023-10-14 11:34:50,783 ----------------------------------------------------------------------------------------------------
2023-10-14 11:34:58,203 epoch 2 - iter 144/1445 - loss 0.11871103 - time (sec): 7.42 - samples/sec: 2306.42 - lr: 0.000049 - momentum: 0.000000
2023-10-14 11:35:05,309 epoch 2 - iter 288/1445 - loss 0.12164115 - time (sec): 14.52 - samples/sec: 2350.70 - lr: 0.000049 - momentum: 0.000000
2023-10-14 11:35:12,783 epoch 2 - iter 432/1445 - loss 0.11897072 - time (sec): 22.00 - samples/sec: 2382.87 - lr: 0.000048 - momentum: 0.000000
2023-10-14 11:35:20,278 epoch 2 - iter 576/1445 - loss 0.11592658 - time (sec): 29.49 - samples/sec: 2404.28 - lr: 0.000048 - momentum: 0.000000
2023-10-14 11:35:27,275 epoch 2 - iter 720/1445 - loss 0.11482763 - time (sec): 36.49 - samples/sec: 2410.85 - lr: 0.000047 - momentum: 0.000000
2023-10-14 11:35:34,371 epoch 2 - iter 864/1445 - loss 0.11209204 - time (sec): 43.59 - samples/sec: 2402.63 - lr: 0.000047 - momentum: 0.000000
2023-10-14 11:35:41,763 epoch 2 - iter 1008/1445 - loss 0.11071289 - time (sec): 50.98 - samples/sec: 2399.75 - lr: 0.000046 - momentum: 0.000000
2023-10-14 11:35:49,360 epoch 2 - iter 1152/1445 - loss 0.10882393 - time (sec): 58.58 - samples/sec: 2393.42 - lr: 0.000046 - momentum: 0.000000
2023-10-14 11:35:56,980 epoch 2 - iter 1296/1445 - loss 0.10920607 - time (sec): 66.20 - samples/sec: 2389.87 - lr: 0.000045 - momentum: 0.000000
2023-10-14 11:36:04,656 epoch 2 - iter 1440/1445 - loss 0.10848901 - time (sec): 73.87 - samples/sec: 2379.55 - lr: 0.000044 - momentum: 0.000000
2023-10-14 11:36:04,876 ----------------------------------------------------------------------------------------------------
2023-10-14 11:36:04,876 EPOCH 2 done: loss 0.1084 - lr: 0.000044
2023-10-14 11:36:08,445 DEV : loss 0.08696628361940384 - f1-score (micro avg) 0.8056
2023-10-14 11:36:08,470 saving best model
2023-10-14 11:36:09,027 ----------------------------------------------------------------------------------------------------
2023-10-14 11:36:16,208 epoch 3 - iter 144/1445 - loss 0.08013974 - time (sec): 7.18 - samples/sec: 2363.46 - lr: 0.000044 - momentum: 0.000000
2023-10-14 11:36:23,636 epoch 3 - iter 288/1445 - loss 0.07495514 - time (sec): 14.61 - samples/sec: 2333.06 - lr: 0.000043 - momentum: 0.000000
2023-10-14 11:36:30,995 epoch 3 - iter 432/1445 - loss 0.07427969 - time (sec): 21.97 - samples/sec: 2329.93 - lr: 0.000043 - momentum: 0.000000
2023-10-14 11:36:38,057 epoch 3 - iter 576/1445 - loss 0.07707753 - time (sec): 29.03 - samples/sec: 2340.34 - lr: 0.000042 - momentum: 0.000000
2023-10-14 11:36:45,180 epoch 3 - iter 720/1445 - loss 0.07618608 - time (sec): 36.15 - samples/sec: 2332.09 - lr: 0.000042 - momentum: 0.000000
2023-10-14 11:36:52,728 epoch 3 - iter 864/1445 - loss 0.07322862 - time (sec): 43.70 - samples/sec: 2362.62 - lr: 0.000041 - momentum: 0.000000
2023-10-14 11:37:00,007 epoch 3 - iter 1008/1445 - loss 0.07343540 - time (sec): 50.98 - samples/sec: 2362.17 - lr: 0.000041 - momentum: 0.000000
2023-10-14 11:37:07,468 epoch 3 - iter 1152/1445 - loss 0.07535018 - time (sec): 58.44 - samples/sec: 2380.78 - lr: 0.000040 - momentum: 0.000000
2023-10-14 11:37:15,113 epoch 3 - iter 1296/1445 - loss 0.07471681 - time (sec): 66.08 - samples/sec: 2374.03 - lr: 0.000039 - momentum: 0.000000
2023-10-14 11:37:22,648 epoch 3 - iter 1440/1445 - loss 0.07360689 - time (sec): 73.62 - samples/sec: 2387.80 - lr: 0.000039 - momentum: 0.000000
2023-10-14 11:37:22,869 ----------------------------------------------------------------------------------------------------
2023-10-14 11:37:22,869 EPOCH 3 done: loss 0.0737 - lr: 0.000039
2023-10-14 11:37:26,430 DEV : loss 0.09837064146995544 - f1-score (micro avg) 0.7909
2023-10-14 11:37:26,451 ----------------------------------------------------------------------------------------------------
2023-10-14 11:37:35,513 epoch 4 - iter 144/1445 - loss 0.04038213 - time (sec): 9.06 - samples/sec: 1871.02 - lr: 0.000038 - momentum: 0.000000
2023-10-14 11:37:43,810 epoch 4 - iter 288/1445 - loss 0.04880213 - time (sec): 17.36 - samples/sec: 2065.39 - lr: 0.000038 - momentum: 0.000000
2023-10-14 11:37:51,934 epoch 4 - iter 432/1445 - loss 0.04623390 - time (sec): 25.48 - samples/sec: 2080.27 - lr: 0.000037 - momentum: 0.000000
2023-10-14 11:37:59,626 epoch 4 - iter 576/1445 - loss 0.05140673 - time (sec): 33.17 - samples/sec: 2130.40 - lr: 0.000037 - momentum: 0.000000
2023-10-14 11:38:06,763 epoch 4 - iter 720/1445 - loss 0.05242646 - time (sec): 40.31 - samples/sec: 2168.66 - lr: 0.000036 - momentum: 0.000000
2023-10-14 11:38:14,123 epoch 4 - iter 864/1445 - loss 0.05243404 - time (sec): 47.67 - samples/sec: 2203.65 - lr: 0.000036 - momentum: 0.000000
2023-10-14 11:38:21,439 epoch 4 - iter 1008/1445 - loss 0.05352965 - time (sec): 54.99 - samples/sec: 2238.25 - lr: 0.000035 - momentum: 0.000000
2023-10-14 11:38:28,637 epoch 4 - iter 1152/1445 - loss 0.05228346 - time (sec): 62.18 - samples/sec: 2247.00 - lr: 0.000034 - momentum: 0.000000
2023-10-14 11:38:35,882 epoch 4 - iter 1296/1445 - loss 0.05354625 - time (sec): 69.43 - samples/sec: 2272.85 - lr: 0.000034 - momentum: 0.000000
2023-10-14 11:38:43,190 epoch 4 - iter 1440/1445 - loss 0.05446189 - time (sec): 76.74 - samples/sec: 2290.88 - lr: 0.000033 - momentum: 0.000000
2023-10-14 11:38:43,412 ----------------------------------------------------------------------------------------------------
2023-10-14 11:38:43,413 EPOCH 4 done: loss 0.0544 - lr: 0.000033
2023-10-14 11:38:47,049 DEV : loss 0.13684163987636566 - f1-score (micro avg) 0.7735
2023-10-14 11:38:47,070 ----------------------------------------------------------------------------------------------------
2023-10-14 11:38:54,510 epoch 5 - iter 144/1445 - loss 0.03982192 - time (sec): 7.44 - samples/sec: 2235.21 - lr: 0.000033 - momentum: 0.000000
2023-10-14 11:39:01,852 epoch 5 - iter 288/1445 - loss 0.03969703 - time (sec): 14.78 - samples/sec: 2260.48 - lr: 0.000032 - momentum: 0.000000
2023-10-14 11:39:09,061 epoch 5 - iter 432/1445 - loss 0.04370314 - time (sec): 21.99 - samples/sec: 2299.81 - lr: 0.000032 - momentum: 0.000000
2023-10-14 11:39:16,385 epoch 5 - iter 576/1445 - loss 0.04527434 - time (sec): 29.31 - samples/sec: 2359.50 - lr: 0.000031 - momentum: 0.000000
2023-10-14 11:39:24,018 epoch 5 - iter 720/1445 - loss 0.04759992 - time (sec): 36.95 - samples/sec: 2359.56 - lr: 0.000031 - momentum: 0.000000
2023-10-14 11:39:31,564 epoch 5 - iter 864/1445 - loss 0.04728067 - time (sec): 44.49 - samples/sec: 2367.71 - lr: 0.000030 - momentum: 0.000000
2023-10-14 11:39:39,124 epoch 5 - iter 1008/1445 - loss 0.04654287 - time (sec): 52.05 - samples/sec: 2384.71 - lr: 0.000029 - momentum: 0.000000
2023-10-14 11:39:46,236 epoch 5 - iter 1152/1445 - loss 0.04540814 - time (sec): 59.16 - samples/sec: 2381.08 - lr: 0.000029 - momentum: 0.000000
2023-10-14 11:39:53,236 epoch 5 - iter 1296/1445 - loss 0.04372098 - time (sec): 66.16 - samples/sec: 2388.85 - lr: 0.000028 - momentum: 0.000000
2023-10-14 11:40:00,506 epoch 5 - iter 1440/1445 - loss 0.04412868 - time (sec): 73.44 - samples/sec: 2388.38 - lr: 0.000028 - momentum: 0.000000
2023-10-14 11:40:00,795 ----------------------------------------------------------------------------------------------------
2023-10-14 11:40:00,795 EPOCH 5 done: loss 0.0442 - lr: 0.000028
2023-10-14 11:40:04,785 DEV : loss 0.1434909701347351 - f1-score (micro avg) 0.8048
2023-10-14 11:40:04,802 ----------------------------------------------------------------------------------------------------
2023-10-14 11:40:12,079 epoch 6 - iter 144/1445 - loss 0.03394452 - time (sec): 7.28 - samples/sec: 2323.74 - lr: 0.000027 - momentum: 0.000000
2023-10-14 11:40:19,193 epoch 6 - iter 288/1445 - loss 0.03447812 - time (sec): 14.39 - samples/sec: 2379.30 - lr: 0.000027 - momentum: 0.000000
2023-10-14 11:40:26,689 epoch 6 - iter 432/1445 - loss 0.03348796 - time (sec): 21.89 - samples/sec: 2380.18 - lr: 0.000026 - momentum: 0.000000
2023-10-14 11:40:34,468 epoch 6 - iter 576/1445 - loss 0.03487847 - time (sec): 29.66 - samples/sec: 2371.17 - lr: 0.000026 - momentum: 0.000000
2023-10-14 11:40:42,197 epoch 6 - iter 720/1445 - loss 0.03632731 - time (sec): 37.39 - samples/sec: 2393.53 - lr: 0.000025 - momentum: 0.000000
2023-10-14 11:40:49,599 epoch 6 - iter 864/1445 - loss 0.03568819 - time (sec): 44.80 - samples/sec: 2385.20 - lr: 0.000024 - momentum: 0.000000
2023-10-14 11:40:56,596 epoch 6 - iter 1008/1445 - loss 0.03343040 - time (sec): 51.79 - samples/sec: 2387.09 - lr: 0.000024 - momentum: 0.000000
2023-10-14 11:41:03,864 epoch 6 - iter 1152/1445 - loss 0.03169355 - time (sec): 59.06 - samples/sec: 2384.17 - lr: 0.000023 - momentum: 0.000000
2023-10-14 11:41:11,044 epoch 6 - iter 1296/1445 - loss 0.03126589 - time (sec): 66.24 - samples/sec: 2389.17 - lr: 0.000023 - momentum: 0.000000
2023-10-14 11:41:18,341 epoch 6 - iter 1440/1445 - loss 0.03080380 - time (sec): 73.54 - samples/sec: 2389.29 - lr: 0.000022 - momentum: 0.000000
2023-10-14 11:41:18,575 ----------------------------------------------------------------------------------------------------
2023-10-14 11:41:18,575 EPOCH 6 done: loss 0.0307 - lr: 0.000022
2023-10-14 11:41:22,171 DEV : loss 0.17948994040489197 - f1-score (micro avg) 0.7937
2023-10-14 11:41:22,187 ----------------------------------------------------------------------------------------------------
2023-10-14 11:41:29,386 epoch 7 - iter 144/1445 - loss 0.02148999 - time (sec): 7.20 - samples/sec: 2365.12 - lr: 0.000022 - momentum: 0.000000
2023-10-14 11:41:36,603 epoch 7 - iter 288/1445 - loss 0.02049358 - time (sec): 14.41 - samples/sec: 2374.86 - lr: 0.000021 - momentum: 0.000000
2023-10-14 11:41:44,061 epoch 7 - iter 432/1445 - loss 0.02426862 - time (sec): 21.87 - samples/sec: 2406.65 - lr: 0.000021 - momentum: 0.000000
2023-10-14 11:41:51,364 epoch 7 - iter 576/1445 - loss 0.02353802 - time (sec): 29.18 - samples/sec: 2400.22 - lr: 0.000020 - momentum: 0.000000
2023-10-14 11:41:58,671 epoch 7 - iter 720/1445 - loss 0.02213679 - time (sec): 36.48 - samples/sec: 2414.68 - lr: 0.000019 - momentum: 0.000000
2023-10-14 11:42:06,237 epoch 7 - iter 864/1445 - loss 0.02183153 - time (sec): 44.05 - samples/sec: 2392.78 - lr: 0.000019 - momentum: 0.000000
2023-10-14 11:42:13,794 epoch 7 - iter 1008/1445 - loss 0.02145289 - time (sec): 51.61 - samples/sec: 2383.07 - lr: 0.000018 - momentum: 0.000000
2023-10-14 11:42:21,361 epoch 7 - iter 1152/1445 - loss 0.02240382 - time (sec): 59.17 - samples/sec: 2376.42 - lr: 0.000018 - momentum: 0.000000
2023-10-14 11:42:28,790 epoch 7 - iter 1296/1445 - loss 0.02358361 - time (sec): 66.60 - samples/sec: 2359.36 - lr: 0.000017 - momentum: 0.000000
2023-10-14 11:42:37,066 epoch 7 - iter 1440/1445 - loss 0.02296526 - time (sec): 74.88 - samples/sec: 2346.33 - lr: 0.000017 - momentum: 0.000000
2023-10-14 11:42:37,342 ----------------------------------------------------------------------------------------------------
2023-10-14 11:42:37,342 EPOCH 7 done: loss 0.0229 - lr: 0.000017
2023-10-14 11:42:41,079 DEV : loss 0.17617355287075043 - f1-score (micro avg) 0.807
2023-10-14 11:42:41,102 saving best model
2023-10-14 11:42:41,666 ----------------------------------------------------------------------------------------------------
2023-10-14 11:42:49,496 epoch 8 - iter 144/1445 - loss 0.02117476 - time (sec): 7.83 - samples/sec: 2275.91 - lr: 0.000016 - momentum: 0.000000
2023-10-14 11:42:56,718 epoch 8 - iter 288/1445 - loss 0.01850677 - time (sec): 15.05 - samples/sec: 2336.14 - lr: 0.000016 - momentum: 0.000000
2023-10-14 11:43:04,263 epoch 8 - iter 432/1445 - loss 0.02027973 - time (sec): 22.59 - samples/sec: 2357.45 - lr: 0.000015 - momentum: 0.000000
2023-10-14 11:43:11,468 epoch 8 - iter 576/1445 - loss 0.01932100 - time (sec): 29.80 - samples/sec: 2369.27 - lr: 0.000014 - momentum: 0.000000
2023-10-14 11:43:18,867 epoch 8 - iter 720/1445 - loss 0.01835449 - time (sec): 37.20 - samples/sec: 2380.94 - lr: 0.000014 - momentum: 0.000000
2023-10-14 11:43:26,156 epoch 8 - iter 864/1445 - loss 0.01762142 - time (sec): 44.49 - samples/sec: 2382.22 - lr: 0.000013 - momentum: 0.000000
2023-10-14 11:43:33,446 epoch 8 - iter 1008/1445 - loss 0.01660096 - time (sec): 51.78 - samples/sec: 2363.32 - lr: 0.000013 - momentum: 0.000000
2023-10-14 11:43:40,921 epoch 8 - iter 1152/1445 - loss 0.01616639 - time (sec): 59.25 - samples/sec: 2368.87 - lr: 0.000012 - momentum: 0.000000
2023-10-14 11:43:48,521 epoch 8 - iter 1296/1445 - loss 0.01642043 - time (sec): 66.85 - samples/sec: 2371.35 - lr: 0.000012 - momentum: 0.000000
2023-10-14 11:43:55,716 epoch 8 - iter 1440/1445 - loss 0.01607383 - time (sec): 74.05 - samples/sec: 2369.57 - lr: 0.000011 - momentum: 0.000000
2023-10-14 11:43:56,008 ----------------------------------------------------------------------------------------------------
2023-10-14 11:43:56,008 EPOCH 8 done: loss 0.0160 - lr: 0.000011
2023-10-14 11:43:59,940 DEV : loss 0.20396527647972107 - f1-score (micro avg) 0.7911
2023-10-14 11:43:59,956 ----------------------------------------------------------------------------------------------------
2023-10-14 11:44:07,506 epoch 9 - iter 144/1445 - loss 0.00551300 - time (sec): 7.55 - samples/sec: 2309.36 - lr: 0.000011 - momentum: 0.000000
2023-10-14 11:44:14,806 epoch 9 - iter 288/1445 - loss 0.00752272 - time (sec): 14.85 - samples/sec: 2253.81 - lr: 0.000010 - momentum: 0.000000
2023-10-14 11:44:22,602 epoch 9 - iter 432/1445 - loss 0.01012645 - time (sec): 22.64 - samples/sec: 2368.69 - lr: 0.000009 - momentum: 0.000000
2023-10-14 11:44:29,669 epoch 9 - iter 576/1445 - loss 0.01015045 - time (sec): 29.71 - samples/sec: 2367.09 - lr: 0.000009 - momentum: 0.000000
2023-10-14 11:44:36,980 epoch 9 - iter 720/1445 - loss 0.00999283 - time (sec): 37.02 - samples/sec: 2387.63 - lr: 0.000008 - momentum: 0.000000
2023-10-14 11:44:44,297 epoch 9 - iter 864/1445 - loss 0.01004974 - time (sec): 44.34 - samples/sec: 2387.55 - lr: 0.000008 - momentum: 0.000000
2023-10-14 11:44:51,529 epoch 9 - iter 1008/1445 - loss 0.00938374 - time (sec): 51.57 - samples/sec: 2391.58 - lr: 0.000007 - momentum: 0.000000
2023-10-14 11:44:59,107 epoch 9 - iter 1152/1445 - loss 0.00993299 - time (sec): 59.15 - samples/sec: 2373.67 - lr: 0.000007 - momentum: 0.000000
2023-10-14 11:45:06,864 epoch 9 - iter 1296/1445 - loss 0.01035581 - time (sec): 66.91 - samples/sec: 2357.53 - lr: 0.000006 - momentum: 0.000000
2023-10-14 11:45:14,195 epoch 9 - iter 1440/1445 - loss 0.01097736 - time (sec): 74.24 - samples/sec: 2363.80 - lr: 0.000006 - momentum: 0.000000
2023-10-14 11:45:14,464 ----------------------------------------------------------------------------------------------------
2023-10-14 11:45:14,464 EPOCH 9 done: loss 0.0109 - lr: 0.000006
2023-10-14 11:45:17,986 DEV : loss 0.18760253489017487 - f1-score (micro avg) 0.8144
2023-10-14 11:45:18,004 saving best model
2023-10-14 11:45:18,565 ----------------------------------------------------------------------------------------------------
2023-10-14 11:45:26,028 epoch 10 - iter 144/1445 - loss 0.00921785 - time (sec): 7.46 - samples/sec: 2427.67 - lr: 0.000005 - momentum: 0.000000
2023-10-14 11:45:33,349 epoch 10 - iter 288/1445 - loss 0.00998994 - time (sec): 14.78 - samples/sec: 2411.98 - lr: 0.000004 - momentum: 0.000000
2023-10-14 11:45:40,783 epoch 10 - iter 432/1445 - loss 0.01085145 - time (sec): 22.21 - samples/sec: 2398.82 - lr: 0.000004 - momentum: 0.000000
2023-10-14 11:45:48,868 epoch 10 - iter 576/1445 - loss 0.00896974 - time (sec): 30.30 - samples/sec: 2370.81 - lr: 0.000003 - momentum: 0.000000
2023-10-14 11:45:56,076 epoch 10 - iter 720/1445 - loss 0.00775903 - time (sec): 37.51 - samples/sec: 2379.75 - lr: 0.000003 - momentum: 0.000000
2023-10-14 11:46:03,265 epoch 10 - iter 864/1445 - loss 0.00776058 - time (sec): 44.69 - samples/sec: 2389.41 - lr: 0.000002 - momentum: 0.000000
2023-10-14 11:46:10,672 epoch 10 - iter 1008/1445 - loss 0.00775701 - time (sec): 52.10 - samples/sec: 2385.87 - lr: 0.000002 - momentum: 0.000000
2023-10-14 11:46:17,927 epoch 10 - iter 1152/1445 - loss 0.00812453 - time (sec): 59.36 - samples/sec: 2383.52 - lr: 0.000001 - momentum: 0.000000
2023-10-14 11:46:25,194 epoch 10 - iter 1296/1445 - loss 0.00784918 - time (sec): 66.62 - samples/sec: 2372.11 - lr: 0.000001 - momentum: 0.000000
2023-10-14 11:46:32,424 epoch 10 - iter 1440/1445 - loss 0.00772587 - time (sec): 73.85 - samples/sec: 2377.16 - lr: 0.000000 - momentum: 0.000000
2023-10-14 11:46:32,703 ----------------------------------------------------------------------------------------------------
2023-10-14 11:46:32,704 EPOCH 10 done: loss 0.0077 - lr: 0.000000
2023-10-14 11:46:36,251 DEV : loss 0.1970515102148056 - f1-score (micro avg) 0.806
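Across the ten epochs, the dev micro-F1 peaked at epoch 9 (0.8144), which is why that checkpoint is the last one saved as best model. The selection amounts to a simple argmax over the per-epoch dev scores:

```python
# Dev micro-F1 per epoch, copied from the log above
dev_f1 = [0.631, 0.8056, 0.7909, 0.7735, 0.8048,
          0.7937, 0.807, 0.7911, 0.8144, 0.806]

# 1-based epoch with the highest dev score
best_epoch = max(range(len(dev_f1)), key=lambda i: dev_f1[i]) + 1
print(best_epoch)  # 9 -> the checkpoint kept as best-model.pt
```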
2023-10-14 11:46:36,663 ----------------------------------------------------------------------------------------------------
2023-10-14 11:46:36,665 Loading model from best epoch ...
2023-10-14 11:46:38,496 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
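The 13 tags follow the BIOES scheme (S- for single-token entities, B-/I-/E- for begin/inside/end) over the three entity types LOC, PER, and ORG, plus O for non-entity tokens. A minimal decoder (an illustrative sketch, not Flair's own code) that turns such a tag sequence into entity spans:

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":            # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":          # entity begins here
            start = i
        elif prefix == "E" and start is not None:  # entity ends here
            spans.append((label, start, i + 1))
            start = None
        # I- tags just continue the current span
    return spans

print(bioes_to_spans(["B-PER", "I-PER", "E-PER", "O", "S-LOC"]))
# [('PER', 0, 3), ('LOC', 4, 5)]
```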
2023-10-14 11:46:42,384
Results:
- F-score (micro) 0.8044
- F-score (macro) 0.6972
- Accuracy 0.6823

By class:
              precision    recall  f1-score   support

         PER     0.8552    0.7842    0.8182       482
         LOC     0.8819    0.7991    0.8385       458
         ORG     0.5435    0.3623    0.4348        69

   micro avg     0.8516    0.7621    0.8044      1009
   macro avg     0.7602    0.6486    0.6972      1009
weighted avg     0.8460    0.7621    0.8012      1009
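The micro and macro averages in the table follow directly from per-class counts. Reconstructing approximate true-positive and prediction counts from the precision/recall/support columns (an illustrative check; the exact counts are not in the log) reproduces the reported averages:

```python
# Per-class (true positives, predicted spans, gold support), back-computed
# from the precision/recall/support columns in the table above.
counts = {
    "PER": (378, 442, 482),
    "LOC": (366, 415, 458),
    "ORG": (25, 46, 69),
}

def prf(tp, pred, gold):
    """Precision, recall, F1 from raw counts."""
    p, r = tp / pred, tp / gold
    return p, r, 2 * p * r / (p + r)

tp = sum(c[0] for c in counts.values())
pred = sum(c[1] for c in counts.values())
gold = sum(c[2] for c in counts.values())

# Micro averaging pools the counts; macro averaging means the per-class F1s.
micro_p, micro_r, micro_f1 = prf(tp, pred, gold)
macro_f1 = sum(prf(*c)[2] for c in counts.values()) / len(counts)

print(round(micro_f1, 4))  # 0.8044, matching "F-score (micro)" above
print(round(macro_f1, 4))  # 0.6972, matching "F-score (macro)" above
```

Note that macro-F1 weights the 69-support ORG class equally with PER and LOC, which is why it sits well below the micro score here.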
2023-10-14 11:46:42,384 ----------------------------------------------------------------------------------------------------