|
2023-10-14 08:01:57,233 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:01:57,234 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-14 08:01:57,234 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:01:57,235 MultiCorpus: 5777 train + 722 dev + 723 test sentences |
|
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl |
|
2023-10-14 08:01:57,235 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:01:57,235 Train: 5777 sentences |
|
2023-10-14 08:01:57,235 (train_with_dev=False, train_with_test=False) |
|
2023-10-14 08:01:57,235 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:01:57,235 Training Params: |
|
2023-10-14 08:01:57,235 - learning_rate: "3e-05" |
|
2023-10-14 08:01:57,235 - mini_batch_size: "4" |
|
2023-10-14 08:01:57,235 - max_epochs: "10" |
|
2023-10-14 08:01:57,235 - shuffle: "True" |
|
2023-10-14 08:01:57,235 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:01:57,235 Plugins: |
|
2023-10-14 08:01:57,235 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-14 08:01:57,235 ---------------------------------------------------------------------------------------------------- |
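The `lr` column in the per-iteration lines below can be checked against the parameters logged above (learning_rate 3e-05, mini_batch_size 4, max_epochs 10, warmup_fraction 0.1, 5777 training sentences). The following is a minimal sketch, assuming the scheduler uses the common linear-warmup, linear-decay shape; the function names `linear_schedule_lr` and `lr_at` are hypothetical helpers for illustration and are not part of Flair's API.

```python
import math

# Values taken from the log header above.
BASE_LR = 3e-05
MINI_BATCH_SIZE = 4
MAX_EPOCHS = 10
WARMUP_FRACTION = 0.1
TRAIN_SENTENCES = 5777

# 5777 sentences in batches of 4 gives 1445 iterations per epoch,
# matching the "iter .../1445" counter in the log.
ITERS_PER_EPOCH = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS


def linear_schedule_lr(step, base_lr, total_steps, warmup_fraction):
    """Assumed schedule: linear warmup from 0, then linear decay to 0."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def lr_at(epoch, iteration):
    """Learning rate at a given (1-based) epoch and iteration (hypothetical helper)."""
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    return linear_schedule_lr(step, BASE_LR, TOTAL_STEPS, WARMUP_FRACTION)


if __name__ == "__main__":
    for epoch, iteration in [(1, 144), (1, 1440), (2, 1440), (5, 1440), (10, 1440)]:
        print(f"epoch {epoch} iter {iteration}: lr {lr_at(epoch, iteration):.6f}")
```

Under this assumption the warmup phase spans exactly the first epoch (1445 of 14450 steps), which is consistent with the logged learning rate rising to 0.000030 by the end of epoch 1 and decaying afterwards.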
|
2023-10-14 08:01:57,235 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-14 08:01:57,235 - metric: "('micro avg', 'f1-score')" |
|
2023-10-14 08:01:57,235 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:01:57,235 Computation: |
|
2023-10-14 08:01:57,235 - compute on device: cuda:0 |
|
2023-10-14 08:01:57,235 - embedding storage: none |
|
2023-10-14 08:01:57,235 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:01:57,235 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-14 08:01:57,235 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:01:57,235 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:02:05,605 epoch 1 - iter 144/1445 - loss 1.98365673 - time (sec): 8.37 - samples/sec: 2024.95 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 08:02:12,644 epoch 1 - iter 288/1445 - loss 1.12353678 - time (sec): 15.41 - samples/sec: 2193.16 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 08:02:19,943 epoch 1 - iter 432/1445 - loss 0.80002002 - time (sec): 22.71 - samples/sec: 2300.53 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 08:02:27,153 epoch 1 - iter 576/1445 - loss 0.64820380 - time (sec): 29.92 - samples/sec: 2334.66 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 08:02:34,153 epoch 1 - iter 720/1445 - loss 0.55704517 - time (sec): 36.92 - samples/sec: 2349.72 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 08:02:41,087 epoch 1 - iter 864/1445 - loss 0.50191366 - time (sec): 43.85 - samples/sec: 2331.26 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 08:02:48,174 epoch 1 - iter 1008/1445 - loss 0.45051786 - time (sec): 50.94 - samples/sec: 2371.87 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 08:02:55,702 epoch 1 - iter 1152/1445 - loss 0.41254886 - time (sec): 58.47 - samples/sec: 2374.94 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 08:03:02,970 epoch 1 - iter 1296/1445 - loss 0.38359668 - time (sec): 65.73 - samples/sec: 2386.92 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 08:03:10,330 epoch 1 - iter 1440/1445 - loss 0.35707304 - time (sec): 73.09 - samples/sec: 2401.86 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 08:03:10,587 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:03:10,587 EPOCH 1 done: loss 0.3561 - lr: 0.000030 |
|
2023-10-14 08:03:13,488 DEV : loss 0.15405000746250153 - f1-score (micro avg) 0.5446 |
|
2023-10-14 08:03:13,502 saving best model |
|
2023-10-14 08:03:13,875 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:03:21,212 epoch 2 - iter 144/1445 - loss 0.14057591 - time (sec): 7.34 - samples/sec: 2367.01 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 08:03:28,665 epoch 2 - iter 288/1445 - loss 0.13227084 - time (sec): 14.79 - samples/sec: 2328.11 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 08:03:36,066 epoch 2 - iter 432/1445 - loss 0.13180492 - time (sec): 22.19 - samples/sec: 2347.30 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 08:03:43,233 epoch 2 - iter 576/1445 - loss 0.12555163 - time (sec): 29.36 - samples/sec: 2360.57 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 08:03:50,688 epoch 2 - iter 720/1445 - loss 0.12318631 - time (sec): 36.81 - samples/sec: 2381.45 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-14 08:03:57,836 epoch 2 - iter 864/1445 - loss 0.12027431 - time (sec): 43.96 - samples/sec: 2382.69 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-14 08:04:05,205 epoch 2 - iter 1008/1445 - loss 0.12246549 - time (sec): 51.33 - samples/sec: 2391.19 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-14 08:04:12,194 epoch 2 - iter 1152/1445 - loss 0.11929573 - time (sec): 58.32 - samples/sec: 2377.13 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 08:04:19,819 epoch 2 - iter 1296/1445 - loss 0.11665455 - time (sec): 65.94 - samples/sec: 2393.24 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 08:04:27,010 epoch 2 - iter 1440/1445 - loss 0.11480563 - time (sec): 73.13 - samples/sec: 2401.85 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 08:04:27,254 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:04:27,254 EPOCH 2 done: loss 0.1148 - lr: 0.000027 |
|
2023-10-14 08:04:31,083 DEV : loss 0.1282866895198822 - f1-score (micro avg) 0.7047 |
|
2023-10-14 08:04:31,097 saving best model |
|
2023-10-14 08:04:31,626 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:04:38,949 epoch 3 - iter 144/1445 - loss 0.08251436 - time (sec): 7.32 - samples/sec: 2420.04 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-14 08:04:46,376 epoch 3 - iter 288/1445 - loss 0.07784727 - time (sec): 14.75 - samples/sec: 2411.42 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-14 08:04:53,664 epoch 3 - iter 432/1445 - loss 0.07783536 - time (sec): 22.04 - samples/sec: 2414.24 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-14 08:05:00,965 epoch 3 - iter 576/1445 - loss 0.07514099 - time (sec): 29.34 - samples/sec: 2414.20 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 08:05:08,267 epoch 3 - iter 720/1445 - loss 0.07581881 - time (sec): 36.64 - samples/sec: 2416.46 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 08:05:15,292 epoch 3 - iter 864/1445 - loss 0.07472479 - time (sec): 43.66 - samples/sec: 2430.01 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 08:05:22,640 epoch 3 - iter 1008/1445 - loss 0.07638855 - time (sec): 51.01 - samples/sec: 2411.58 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 08:05:29,996 epoch 3 - iter 1152/1445 - loss 0.07480564 - time (sec): 58.37 - samples/sec: 2419.81 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 08:05:37,136 epoch 3 - iter 1296/1445 - loss 0.07451773 - time (sec): 65.51 - samples/sec: 2425.62 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 08:05:44,547 epoch 3 - iter 1440/1445 - loss 0.07377630 - time (sec): 72.92 - samples/sec: 2410.74 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 08:05:44,765 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:05:44,765 EPOCH 3 done: loss 0.0738 - lr: 0.000023 |
|
2023-10-14 08:05:48,189 DEV : loss 0.10148890316486359 - f1-score (micro avg) 0.79 |
|
2023-10-14 08:05:48,203 saving best model |
|
2023-10-14 08:05:48,718 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:05:56,573 epoch 4 - iter 144/1445 - loss 0.04469757 - time (sec): 7.85 - samples/sec: 2296.48 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 08:06:03,927 epoch 4 - iter 288/1445 - loss 0.04147054 - time (sec): 15.21 - samples/sec: 2407.13 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 08:06:11,043 epoch 4 - iter 432/1445 - loss 0.04588436 - time (sec): 22.32 - samples/sec: 2382.90 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 08:06:18,317 epoch 4 - iter 576/1445 - loss 0.04687950 - time (sec): 29.60 - samples/sec: 2408.53 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 08:06:25,425 epoch 4 - iter 720/1445 - loss 0.05047665 - time (sec): 36.71 - samples/sec: 2414.44 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 08:06:32,454 epoch 4 - iter 864/1445 - loss 0.05060324 - time (sec): 43.73 - samples/sec: 2406.39 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 08:06:39,525 epoch 4 - iter 1008/1445 - loss 0.04884015 - time (sec): 50.81 - samples/sec: 2400.36 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 08:06:46,972 epoch 4 - iter 1152/1445 - loss 0.04928212 - time (sec): 58.25 - samples/sec: 2411.64 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 08:06:54,267 epoch 4 - iter 1296/1445 - loss 0.05041401 - time (sec): 65.55 - samples/sec: 2399.42 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 08:07:01,642 epoch 4 - iter 1440/1445 - loss 0.05090107 - time (sec): 72.92 - samples/sec: 2409.83 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 08:07:01,879 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:07:01,879 EPOCH 4 done: loss 0.0513 - lr: 0.000020 |
|
2023-10-14 08:07:05,346 DEV : loss 0.12521180510520935 - f1-score (micro avg) 0.7713 |
|
2023-10-14 08:07:05,361 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:07:12,830 epoch 5 - iter 144/1445 - loss 0.03689043 - time (sec): 7.47 - samples/sec: 2376.83 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 08:07:20,377 epoch 5 - iter 288/1445 - loss 0.04141324 - time (sec): 15.02 - samples/sec: 2371.46 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 08:07:27,312 epoch 5 - iter 432/1445 - loss 0.03820414 - time (sec): 21.95 - samples/sec: 2390.95 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 08:07:34,526 epoch 5 - iter 576/1445 - loss 0.03805636 - time (sec): 29.16 - samples/sec: 2397.87 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 08:07:41,652 epoch 5 - iter 720/1445 - loss 0.03728520 - time (sec): 36.29 - samples/sec: 2415.63 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 08:07:49,124 epoch 5 - iter 864/1445 - loss 0.03609096 - time (sec): 43.76 - samples/sec: 2419.45 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 08:07:56,226 epoch 5 - iter 1008/1445 - loss 0.03551319 - time (sec): 50.86 - samples/sec: 2412.87 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 08:08:03,611 epoch 5 - iter 1152/1445 - loss 0.03524412 - time (sec): 58.25 - samples/sec: 2413.61 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 08:08:11,110 epoch 5 - iter 1296/1445 - loss 0.03695923 - time (sec): 65.75 - samples/sec: 2410.80 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 08:08:18,082 epoch 5 - iter 1440/1445 - loss 0.03728152 - time (sec): 72.72 - samples/sec: 2413.40 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 08:08:18,376 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:08:18,376 EPOCH 5 done: loss 0.0372 - lr: 0.000017 |
|
2023-10-14 08:08:22,161 DEV : loss 0.17038105428218842 - f1-score (micro avg) 0.7608 |
|
2023-10-14 08:08:22,176 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:08:29,368 epoch 6 - iter 144/1445 - loss 0.02545181 - time (sec): 7.19 - samples/sec: 2500.25 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 08:08:36,614 epoch 6 - iter 288/1445 - loss 0.02562755 - time (sec): 14.44 - samples/sec: 2504.38 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 08:08:43,778 epoch 6 - iter 432/1445 - loss 0.02526942 - time (sec): 21.60 - samples/sec: 2499.22 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 08:08:51,447 epoch 6 - iter 576/1445 - loss 0.02775801 - time (sec): 29.27 - samples/sec: 2455.98 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 08:08:58,506 epoch 6 - iter 720/1445 - loss 0.02706181 - time (sec): 36.33 - samples/sec: 2439.93 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 08:09:05,788 epoch 6 - iter 864/1445 - loss 0.02599246 - time (sec): 43.61 - samples/sec: 2421.51 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 08:09:13,141 epoch 6 - iter 1008/1445 - loss 0.02794849 - time (sec): 50.96 - samples/sec: 2420.40 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 08:09:20,518 epoch 6 - iter 1152/1445 - loss 0.02869252 - time (sec): 58.34 - samples/sec: 2428.10 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 08:09:27,666 epoch 6 - iter 1296/1445 - loss 0.02856538 - time (sec): 65.49 - samples/sec: 2417.95 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 08:09:34,870 epoch 6 - iter 1440/1445 - loss 0.02820532 - time (sec): 72.69 - samples/sec: 2416.69 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 08:09:35,113 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:09:35,113 EPOCH 6 done: loss 0.0282 - lr: 0.000013 |
|
2023-10-14 08:09:38,563 DEV : loss 0.1647178679704666 - f1-score (micro avg) 0.7804 |
|
2023-10-14 08:09:38,579 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:09:45,895 epoch 7 - iter 144/1445 - loss 0.01300211 - time (sec): 7.31 - samples/sec: 2379.65 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 08:09:53,075 epoch 7 - iter 288/1445 - loss 0.01696728 - time (sec): 14.50 - samples/sec: 2354.75 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 08:10:00,712 epoch 7 - iter 432/1445 - loss 0.01849889 - time (sec): 22.13 - samples/sec: 2370.41 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 08:10:07,982 epoch 7 - iter 576/1445 - loss 0.01797361 - time (sec): 29.40 - samples/sec: 2398.07 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 08:10:15,199 epoch 7 - iter 720/1445 - loss 0.01798847 - time (sec): 36.62 - samples/sec: 2403.56 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 08:10:22,580 epoch 7 - iter 864/1445 - loss 0.01952435 - time (sec): 44.00 - samples/sec: 2420.08 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 08:10:29,751 epoch 7 - iter 1008/1445 - loss 0.01984725 - time (sec): 51.17 - samples/sec: 2407.97 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 08:10:37,085 epoch 7 - iter 1152/1445 - loss 0.01930156 - time (sec): 58.51 - samples/sec: 2417.50 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 08:10:44,067 epoch 7 - iter 1296/1445 - loss 0.01973561 - time (sec): 65.49 - samples/sec: 2413.04 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 08:10:51,271 epoch 7 - iter 1440/1445 - loss 0.01980517 - time (sec): 72.69 - samples/sec: 2417.01 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 08:10:51,497 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:10:51,497 EPOCH 7 done: loss 0.0198 - lr: 0.000010 |
|
2023-10-14 08:10:54,970 DEV : loss 0.1839003711938858 - f1-score (micro avg) 0.7965 |
|
2023-10-14 08:10:54,986 saving best model |
|
2023-10-14 08:10:55,568 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:11:02,709 epoch 8 - iter 144/1445 - loss 0.01405551 - time (sec): 7.14 - samples/sec: 2442.76 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 08:11:10,323 epoch 8 - iter 288/1445 - loss 0.01175523 - time (sec): 14.75 - samples/sec: 2391.64 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 08:11:17,600 epoch 8 - iter 432/1445 - loss 0.01129704 - time (sec): 22.03 - samples/sec: 2402.18 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 08:11:25,174 epoch 8 - iter 576/1445 - loss 0.01292683 - time (sec): 29.60 - samples/sec: 2370.48 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 08:11:32,582 epoch 8 - iter 720/1445 - loss 0.01230325 - time (sec): 37.01 - samples/sec: 2406.86 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-14 08:11:39,737 epoch 8 - iter 864/1445 - loss 0.01183301 - time (sec): 44.17 - samples/sec: 2397.89 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-14 08:11:46,776 epoch 8 - iter 1008/1445 - loss 0.01208905 - time (sec): 51.21 - samples/sec: 2412.73 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-14 08:11:53,875 epoch 8 - iter 1152/1445 - loss 0.01318623 - time (sec): 58.31 - samples/sec: 2403.05 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 08:12:01,512 epoch 8 - iter 1296/1445 - loss 0.01358646 - time (sec): 65.94 - samples/sec: 2402.71 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 08:12:08,839 epoch 8 - iter 1440/1445 - loss 0.01334733 - time (sec): 73.27 - samples/sec: 2400.15 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 08:12:09,064 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:12:09,064 EPOCH 8 done: loss 0.0133 - lr: 0.000007 |
|
2023-10-14 08:12:12,861 DEV : loss 0.18413744866847992 - f1-score (micro avg) 0.7987 |
|
2023-10-14 08:12:12,876 saving best model |
|
2023-10-14 08:12:13,455 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:12:20,659 epoch 9 - iter 144/1445 - loss 0.01146554 - time (sec): 7.20 - samples/sec: 2530.75 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 08:12:28,224 epoch 9 - iter 288/1445 - loss 0.01300968 - time (sec): 14.77 - samples/sec: 2489.92 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 08:12:35,547 epoch 9 - iter 432/1445 - loss 0.01004910 - time (sec): 22.09 - samples/sec: 2531.98 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-14 08:12:42,577 epoch 9 - iter 576/1445 - loss 0.00980441 - time (sec): 29.12 - samples/sec: 2477.06 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 08:12:50,040 epoch 9 - iter 720/1445 - loss 0.00958906 - time (sec): 36.58 - samples/sec: 2482.19 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 08:12:56,931 epoch 9 - iter 864/1445 - loss 0.00931059 - time (sec): 43.47 - samples/sec: 2463.93 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 08:13:04,320 epoch 9 - iter 1008/1445 - loss 0.00931590 - time (sec): 50.86 - samples/sec: 2440.57 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 08:13:11,276 epoch 9 - iter 1152/1445 - loss 0.00917441 - time (sec): 57.82 - samples/sec: 2423.84 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 08:13:18,525 epoch 9 - iter 1296/1445 - loss 0.00925497 - time (sec): 65.07 - samples/sec: 2420.64 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 08:13:25,900 epoch 9 - iter 1440/1445 - loss 0.00952078 - time (sec): 72.44 - samples/sec: 2424.96 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 08:13:26,131 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:13:26,131 EPOCH 9 done: loss 0.0095 - lr: 0.000003 |
|
2023-10-14 08:13:29,573 DEV : loss 0.186563640832901 - f1-score (micro avg) 0.8039 |
|
2023-10-14 08:13:29,588 saving best model |
|
2023-10-14 08:13:30,149 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:13:37,534 epoch 10 - iter 144/1445 - loss 0.00663741 - time (sec): 7.38 - samples/sec: 2488.81 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 08:13:45,111 epoch 10 - iter 288/1445 - loss 0.00559908 - time (sec): 14.96 - samples/sec: 2318.32 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-14 08:13:52,608 epoch 10 - iter 432/1445 - loss 0.00730824 - time (sec): 22.46 - samples/sec: 2344.28 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 08:14:00,148 epoch 10 - iter 576/1445 - loss 0.00695631 - time (sec): 30.00 - samples/sec: 2363.90 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 08:14:07,251 epoch 10 - iter 720/1445 - loss 0.00719157 - time (sec): 37.10 - samples/sec: 2367.92 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 08:14:14,804 epoch 10 - iter 864/1445 - loss 0.00696365 - time (sec): 44.65 - samples/sec: 2398.27 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 08:14:21,883 epoch 10 - iter 1008/1445 - loss 0.00688362 - time (sec): 51.73 - samples/sec: 2399.15 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 08:14:29,091 epoch 10 - iter 1152/1445 - loss 0.00665468 - time (sec): 58.94 - samples/sec: 2394.67 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 08:14:36,092 epoch 10 - iter 1296/1445 - loss 0.00650502 - time (sec): 65.94 - samples/sec: 2393.15 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-14 08:14:43,476 epoch 10 - iter 1440/1445 - loss 0.00627816 - time (sec): 73.33 - samples/sec: 2398.42 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-14 08:14:43,689 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:14:43,689 EPOCH 10 done: loss 0.0063 - lr: 0.000000 |
|
2023-10-14 08:14:47,154 DEV : loss 0.20616155862808228 - f1-score (micro avg) 0.8002 |
|
2023-10-14 08:14:47,596 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:14:47,597 Loading model from best epoch ... |
|
2023-10-14 08:14:49,401 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-14 08:14:52,528 |
|
Results: |
|
- F-score (micro) 0.8004 |
|
- F-score (macro) 0.6926 |
|
- Accuracy 0.6828 |
|
|
|
By class: |
|
             precision    recall  f1-score   support |
|
|
|
         PER    0.8282    0.8299    0.8290       482 |
|
         LOC    0.8786    0.7904    0.8322       458 |
|
         ORG    0.4000    0.4348    0.4167        69 |
|
|
|
   micro avg    0.8165    0.7849    0.8004      1009 |
|
   macro avg    0.7023    0.6850    0.6926      1009 |
|
weighted avg    0.8218    0.7849    0.8023      1009 |
|
|
|
2023-10-14 08:14:52,528 ---------------------------------------------------------------------------------------------------- |
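The summary rows of the per-class report can be recomputed from the per-class rows. The sketch below uses only the numbers printed in the report above; the variable names are illustrative. Macro averages are unweighted means over the three classes, the weighted F1 weights each class by its support, and the micro F1 is the harmonic mean of the reported micro precision and recall.

```python
# Per-class values copied from the report: (precision, recall, f1, support).
per_class = {
    "PER": (0.8282, 0.8299, 0.8290, 482),
    "LOC": (0.8786, 0.7904, 0.8322, 458),
    "ORG": (0.4000, 0.4348, 0.4167, 69),
}

n_classes = len(per_class)
total_support = sum(support for _, _, _, support in per_class.values())

# Macro average: plain mean over classes, ignoring class size.
macro_precision = sum(p for p, _, _, _ in per_class.values()) / n_classes
macro_recall = sum(r for _, r, _, _ in per_class.values()) / n_classes
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / n_classes

# Weighted average: mean over classes weighted by support.
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total_support

# Micro F1: harmonic mean of the micro precision and recall from the report.
micro_precision, micro_recall = 0.8165, 0.7849
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

if __name__ == "__main__":
    print(f"total support: {total_support}")
    print(f"macro avg:     {macro_precision:.4f} {macro_recall:.4f} {macro_f1:.4f}")
    print(f"weighted f1:   {weighted_f1:.4f}")
    print(f"micro f1:      {micro_f1:.4f}")
```

The gap between the micro F1 (0.8004) and the macro F1 (0.6926) comes from the ORG class, which has low scores but only 69 of the 1009 gold spans, so it pulls the macro average down far more than the micro average.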
|
|