2023-10-18 19:35:14,228 ----------------------------------------------------------------------------------------------------
2023-10-18 19:35:14,228 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
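As a quick sanity check on how small this backbone is, the parameter count implied by the shapes printed above can be tallied directly. A plain-Python sketch (counting only the modules shown; dropout and activation modules contribute no parameters):

```python
def count_bert_tiny_params():
    """Sum parameters for the architecture printed above:
    hidden=128, intermediate=512, vocab=32001, 2 layers, 21 output tags."""
    h, ffn, vocab, max_pos, layers, tags = 128, 512, 32001, 512, 2, 21

    def linear(n_in, n_out):              # weight matrix + bias vector
        return n_in * n_out + n_out

    layer_norm = 2 * h                    # scale + shift

    embeddings = vocab * h + max_pos * h + 2 * h + layer_norm
    per_layer = (
        3 * linear(h, h)                  # query / key / value
        + linear(h, h) + layer_norm       # self-attention output
        + linear(h, ffn)                  # intermediate (FFN up-projection)
        + linear(ffn, h) + layer_norm     # FFN output (down-projection)
    )
    pooler = linear(h, h)
    tagger_head = linear(h, tags)         # the (linear) classification layer
    return embeddings + layers * per_layer + pooler + tagger_head

print(f"{count_bert_tiny_params():,} parameters")  # 4,577,941 -- ~4.6M, "tiny" indeed
```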
2023-10-18 19:35:14,229 ----------------------------------------------------------------------------------------------------
2023-10-18 19:35:14,229 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-18 19:35:14,229 ----------------------------------------------------------------------------------------------------
2023-10-18 19:35:14,229 Train: 5901 sentences
2023-10-18 19:35:14,229 (train_with_dev=False, train_with_test=False)
2023-10-18 19:35:14,229 ----------------------------------------------------------------------------------------------------
2023-10-18 19:35:14,229 Training Params:
2023-10-18 19:35:14,229 - learning_rate: "5e-05"
2023-10-18 19:35:14,229 - mini_batch_size: "8"
2023-10-18 19:35:14,229 - max_epochs: "10"
2023-10-18 19:35:14,229 - shuffle: "True"
2023-10-18 19:35:14,229 ----------------------------------------------------------------------------------------------------
2023-10-18 19:35:14,229 Plugins:
2023-10-18 19:35:14,229 - TensorboardLogger
2023-10-18 19:35:14,229 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 19:35:14,229 ----------------------------------------------------------------------------------------------------
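The `lr` column in the per-iteration lines of this log follows from these settings: 5901 training sentences at mini_batch_size 8 give 738 iterations per epoch (hence "iter .../738"), so 7380 steps in total, and `warmup_fraction: 0.1` puts the peak of 5e-05 at step 738, after which the rate decays linearly to zero. A minimal sketch of that curve (assuming `LinearScheduler` is a standard linear warmup/decay, which the logged values are consistent with):

```python
import math

peak_lr = 5e-5
steps_per_epoch = math.ceil(5901 / 8)   # 738, matching "iter .../738" below
total_steps = steps_per_epoch * 10      # 7380 over the full run
warmup_steps = int(0.1 * total_steps)   # 738 (warmup_fraction 0.1)

def linear_schedule_lr(step):
    """Linear ramp to peak_lr over warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# First logged iteration of epoch 1 (step 73) sits near 5e-06, as in the log:
print(round(linear_schedule_lr(73), 6))  # 5e-06
```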
2023-10-18 19:35:14,229 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 19:35:14,229 - metric: "('micro avg', 'f1-score')"
2023-10-18 19:35:14,229 ----------------------------------------------------------------------------------------------------
2023-10-18 19:35:14,229 Computation:
2023-10-18 19:35:14,229 - compute on device: cuda:0
2023-10-18 19:35:14,229 - embedding storage: none
2023-10-18 19:35:14,229 ----------------------------------------------------------------------------------------------------
2023-10-18 19:35:14,229 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 19:35:14,229 ----------------------------------------------------------------------------------------------------
2023-10-18 19:35:14,229 ----------------------------------------------------------------------------------------------------
2023-10-18 19:35:14,229 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 19:35:15,905 epoch 1 - iter 73/738 - loss 3.63053485 - time (sec): 1.68 - samples/sec: 9643.96 - lr: 0.000005 - momentum: 0.000000
2023-10-18 19:35:17,651 epoch 1 - iter 146/738 - loss 3.31461352 - time (sec): 3.42 - samples/sec: 9489.77 - lr: 0.000010 - momentum: 0.000000
2023-10-18 19:35:19,344 epoch 1 - iter 219/738 - loss 2.83008921 - time (sec): 5.11 - samples/sec: 10079.98 - lr: 0.000015 - momentum: 0.000000
2023-10-18 19:35:20,933 epoch 1 - iter 292/738 - loss 2.38810325 - time (sec): 6.70 - samples/sec: 10086.77 - lr: 0.000020 - momentum: 0.000000
2023-10-18 19:35:22,580 epoch 1 - iter 365/738 - loss 2.07283111 - time (sec): 8.35 - samples/sec: 9945.62 - lr: 0.000025 - momentum: 0.000000
2023-10-18 19:35:24,282 epoch 1 - iter 438/738 - loss 1.86103924 - time (sec): 10.05 - samples/sec: 9816.05 - lr: 0.000030 - momentum: 0.000000
2023-10-18 19:35:25,969 epoch 1 - iter 511/738 - loss 1.68520947 - time (sec): 11.74 - samples/sec: 9785.13 - lr: 0.000035 - momentum: 0.000000
2023-10-18 19:35:27,670 epoch 1 - iter 584/738 - loss 1.56226971 - time (sec): 13.44 - samples/sec: 9682.86 - lr: 0.000039 - momentum: 0.000000
2023-10-18 19:35:29,371 epoch 1 - iter 657/738 - loss 1.45396249 - time (sec): 15.14 - samples/sec: 9670.98 - lr: 0.000044 - momentum: 0.000000
2023-10-18 19:35:31,169 epoch 1 - iter 730/738 - loss 1.34404621 - time (sec): 16.94 - samples/sec: 9723.05 - lr: 0.000049 - momentum: 0.000000
2023-10-18 19:35:31,364 ----------------------------------------------------------------------------------------------------
2023-10-18 19:35:31,365 EPOCH 1 done: loss 1.3350 - lr: 0.000049
2023-10-18 19:35:34,207 DEV : loss 0.40549758076667786 - f1-score (micro avg) 0.1417
2023-10-18 19:35:34,233 saving best model
2023-10-18 19:35:34,261 ----------------------------------------------------------------------------------------------------
2023-10-18 19:35:35,917 epoch 2 - iter 73/738 - loss 0.48249485 - time (sec): 1.66 - samples/sec: 9133.11 - lr: 0.000049 - momentum: 0.000000
2023-10-18 19:35:37,765 epoch 2 - iter 146/738 - loss 0.48196119 - time (sec): 3.50 - samples/sec: 9363.41 - lr: 0.000049 - momentum: 0.000000
2023-10-18 19:35:39,549 epoch 2 - iter 219/738 - loss 0.49088316 - time (sec): 5.29 - samples/sec: 9368.74 - lr: 0.000048 - momentum: 0.000000
2023-10-18 19:35:41,233 epoch 2 - iter 292/738 - loss 0.48042137 - time (sec): 6.97 - samples/sec: 9322.75 - lr: 0.000048 - momentum: 0.000000
2023-10-18 19:35:42,926 epoch 2 - iter 365/738 - loss 0.47661632 - time (sec): 8.66 - samples/sec: 9289.18 - lr: 0.000047 - momentum: 0.000000
2023-10-18 19:35:44,644 epoch 2 - iter 438/738 - loss 0.46395621 - time (sec): 10.38 - samples/sec: 9235.60 - lr: 0.000047 - momentum: 0.000000
2023-10-18 19:35:46,397 epoch 2 - iter 511/738 - loss 0.45934458 - time (sec): 12.14 - samples/sec: 9229.27 - lr: 0.000046 - momentum: 0.000000
2023-10-18 19:35:48,158 epoch 2 - iter 584/738 - loss 0.44792016 - time (sec): 13.90 - samples/sec: 9274.20 - lr: 0.000046 - momentum: 0.000000
2023-10-18 19:35:49,967 epoch 2 - iter 657/738 - loss 0.43639841 - time (sec): 15.71 - samples/sec: 9384.18 - lr: 0.000045 - momentum: 0.000000
2023-10-18 19:35:51,795 epoch 2 - iter 730/738 - loss 0.42423828 - time (sec): 17.53 - samples/sec: 9405.78 - lr: 0.000045 - momentum: 0.000000
2023-10-18 19:35:51,976 ----------------------------------------------------------------------------------------------------
2023-10-18 19:35:51,976 EPOCH 2 done: loss 0.4245 - lr: 0.000045
2023-10-18 19:35:59,168 DEV : loss 0.30590692162513733 - f1-score (micro avg) 0.3575
2023-10-18 19:35:59,194 saving best model
2023-10-18 19:35:59,225 ----------------------------------------------------------------------------------------------------
2023-10-18 19:36:00,923 epoch 3 - iter 73/738 - loss 0.35205040 - time (sec): 1.70 - samples/sec: 9041.46 - lr: 0.000044 - momentum: 0.000000
2023-10-18 19:36:02,700 epoch 3 - iter 146/738 - loss 0.35036399 - time (sec): 3.47 - samples/sec: 9224.94 - lr: 0.000043 - momentum: 0.000000
2023-10-18 19:36:04,385 epoch 3 - iter 219/738 - loss 0.35036463 - time (sec): 5.16 - samples/sec: 9396.68 - lr: 0.000043 - momentum: 0.000000
2023-10-18 19:36:06,189 epoch 3 - iter 292/738 - loss 0.36990703 - time (sec): 6.96 - samples/sec: 9441.02 - lr: 0.000042 - momentum: 0.000000
2023-10-18 19:36:07,944 epoch 3 - iter 365/738 - loss 0.36280215 - time (sec): 8.72 - samples/sec: 9519.15 - lr: 0.000042 - momentum: 0.000000
2023-10-18 19:36:09,709 epoch 3 - iter 438/738 - loss 0.36446324 - time (sec): 10.48 - samples/sec: 9562.94 - lr: 0.000041 - momentum: 0.000000
2023-10-18 19:36:11,426 epoch 3 - iter 511/738 - loss 0.35898419 - time (sec): 12.20 - samples/sec: 9548.49 - lr: 0.000041 - momentum: 0.000000
2023-10-18 19:36:13,197 epoch 3 - iter 584/738 - loss 0.36163452 - time (sec): 13.97 - samples/sec: 9446.61 - lr: 0.000040 - momentum: 0.000000
2023-10-18 19:36:14,984 epoch 3 - iter 657/738 - loss 0.35837866 - time (sec): 15.76 - samples/sec: 9393.20 - lr: 0.000040 - momentum: 0.000000
2023-10-18 19:36:16,752 epoch 3 - iter 730/738 - loss 0.35598337 - time (sec): 17.53 - samples/sec: 9385.10 - lr: 0.000039 - momentum: 0.000000
2023-10-18 19:36:16,942 ----------------------------------------------------------------------------------------------------
2023-10-18 19:36:16,943 EPOCH 3 done: loss 0.3555 - lr: 0.000039
2023-10-18 19:36:24,213 DEV : loss 0.2731035351753235 - f1-score (micro avg) 0.4159
2023-10-18 19:36:24,239 saving best model
2023-10-18 19:36:24,273 ----------------------------------------------------------------------------------------------------
2023-10-18 19:36:25,980 epoch 4 - iter 73/738 - loss 0.33208577 - time (sec): 1.71 - samples/sec: 10888.66 - lr: 0.000038 - momentum: 0.000000
2023-10-18 19:36:27,605 epoch 4 - iter 146/738 - loss 0.33198456 - time (sec): 3.33 - samples/sec: 10399.76 - lr: 0.000038 - momentum: 0.000000
2023-10-18 19:36:29,773 epoch 4 - iter 219/738 - loss 0.32821079 - time (sec): 5.50 - samples/sec: 9152.13 - lr: 0.000037 - momentum: 0.000000
2023-10-18 19:36:31,500 epoch 4 - iter 292/738 - loss 0.31868276 - time (sec): 7.23 - samples/sec: 9173.61 - lr: 0.000037 - momentum: 0.000000
2023-10-18 19:36:33,246 epoch 4 - iter 365/738 - loss 0.31775783 - time (sec): 8.97 - samples/sec: 9172.47 - lr: 0.000036 - momentum: 0.000000
2023-10-18 19:36:34,985 epoch 4 - iter 438/738 - loss 0.31981964 - time (sec): 10.71 - samples/sec: 9016.36 - lr: 0.000036 - momentum: 0.000000
2023-10-18 19:36:36,920 epoch 4 - iter 511/738 - loss 0.31761936 - time (sec): 12.65 - samples/sec: 9120.41 - lr: 0.000035 - momentum: 0.000000
2023-10-18 19:36:38,704 epoch 4 - iter 584/738 - loss 0.31202084 - time (sec): 14.43 - samples/sec: 9234.98 - lr: 0.000035 - momentum: 0.000000
2023-10-18 19:36:40,430 epoch 4 - iter 657/738 - loss 0.30850835 - time (sec): 16.16 - samples/sec: 9196.80 - lr: 0.000034 - momentum: 0.000000
2023-10-18 19:36:42,105 epoch 4 - iter 730/738 - loss 0.30722893 - time (sec): 17.83 - samples/sec: 9230.97 - lr: 0.000033 - momentum: 0.000000
2023-10-18 19:36:42,288 ----------------------------------------------------------------------------------------------------
2023-10-18 19:36:42,288 EPOCH 4 done: loss 0.3079 - lr: 0.000033
2023-10-18 19:36:49,585 DEV : loss 0.25223150849342346 - f1-score (micro avg) 0.4639
2023-10-18 19:36:49,611 saving best model
2023-10-18 19:36:49,646 ----------------------------------------------------------------------------------------------------
2023-10-18 19:36:51,309 epoch 5 - iter 73/738 - loss 0.27083994 - time (sec): 1.66 - samples/sec: 9809.46 - lr: 0.000033 - momentum: 0.000000
2023-10-18 19:36:53,221 epoch 5 - iter 146/738 - loss 0.28575352 - time (sec): 3.57 - samples/sec: 9894.09 - lr: 0.000032 - momentum: 0.000000
2023-10-18 19:36:55,037 epoch 5 - iter 219/738 - loss 0.29104358 - time (sec): 5.39 - samples/sec: 9435.27 - lr: 0.000032 - momentum: 0.000000
2023-10-18 19:36:56,910 epoch 5 - iter 292/738 - loss 0.29447822 - time (sec): 7.26 - samples/sec: 9395.39 - lr: 0.000031 - momentum: 0.000000
2023-10-18 19:36:58,691 epoch 5 - iter 365/738 - loss 0.28951430 - time (sec): 9.04 - samples/sec: 9381.87 - lr: 0.000031 - momentum: 0.000000
2023-10-18 19:37:00,565 epoch 5 - iter 438/738 - loss 0.28515640 - time (sec): 10.92 - samples/sec: 9393.50 - lr: 0.000030 - momentum: 0.000000
2023-10-18 19:37:02,394 epoch 5 - iter 511/738 - loss 0.28632033 - time (sec): 12.75 - samples/sec: 9302.52 - lr: 0.000030 - momentum: 0.000000
2023-10-18 19:37:04,089 epoch 5 - iter 584/738 - loss 0.28383735 - time (sec): 14.44 - samples/sec: 9244.53 - lr: 0.000029 - momentum: 0.000000
2023-10-18 19:37:05,827 epoch 5 - iter 657/738 - loss 0.28272056 - time (sec): 16.18 - samples/sec: 9287.29 - lr: 0.000028 - momentum: 0.000000
2023-10-18 19:37:07,587 epoch 5 - iter 730/738 - loss 0.28285932 - time (sec): 17.94 - samples/sec: 9191.28 - lr: 0.000028 - momentum: 0.000000
2023-10-18 19:37:07,764 ----------------------------------------------------------------------------------------------------
2023-10-18 19:37:07,764 EPOCH 5 done: loss 0.2826 - lr: 0.000028
2023-10-18 19:37:15,047 DEV : loss 0.2452479600906372 - f1-score (micro avg) 0.4868
2023-10-18 19:37:15,074 saving best model
2023-10-18 19:37:15,108 ----------------------------------------------------------------------------------------------------
2023-10-18 19:37:17,025 epoch 6 - iter 73/738 - loss 0.26800115 - time (sec): 1.92 - samples/sec: 7865.52 - lr: 0.000027 - momentum: 0.000000
2023-10-18 19:37:18,868 epoch 6 - iter 146/738 - loss 0.26661344 - time (sec): 3.76 - samples/sec: 8541.95 - lr: 0.000027 - momentum: 0.000000
2023-10-18 19:37:20,690 epoch 6 - iter 219/738 - loss 0.26725341 - time (sec): 5.58 - samples/sec: 8809.44 - lr: 0.000026 - momentum: 0.000000
2023-10-18 19:37:22,492 epoch 6 - iter 292/738 - loss 0.25507698 - time (sec): 7.38 - samples/sec: 9103.63 - lr: 0.000026 - momentum: 0.000000
2023-10-18 19:37:24,351 epoch 6 - iter 365/738 - loss 0.26162726 - time (sec): 9.24 - samples/sec: 9208.12 - lr: 0.000025 - momentum: 0.000000
2023-10-18 19:37:26,092 epoch 6 - iter 438/738 - loss 0.26717380 - time (sec): 10.98 - samples/sec: 9206.13 - lr: 0.000025 - momentum: 0.000000
2023-10-18 19:37:27,845 epoch 6 - iter 511/738 - loss 0.26684394 - time (sec): 12.74 - samples/sec: 9111.20 - lr: 0.000024 - momentum: 0.000000
2023-10-18 19:37:29,582 epoch 6 - iter 584/738 - loss 0.26557866 - time (sec): 14.47 - samples/sec: 9096.35 - lr: 0.000023 - momentum: 0.000000
2023-10-18 19:37:31,387 epoch 6 - iter 657/738 - loss 0.26549689 - time (sec): 16.28 - samples/sec: 9164.46 - lr: 0.000023 - momentum: 0.000000
2023-10-18 19:37:33,090 epoch 6 - iter 730/738 - loss 0.26221669 - time (sec): 17.98 - samples/sec: 9152.14 - lr: 0.000022 - momentum: 0.000000
2023-10-18 19:37:33,270 ----------------------------------------------------------------------------------------------------
2023-10-18 19:37:33,270 EPOCH 6 done: loss 0.2617 - lr: 0.000022
2023-10-18 19:37:40,546 DEV : loss 0.2362372875213623 - f1-score (micro avg) 0.4943
2023-10-18 19:37:40,572 saving best model
2023-10-18 19:37:40,607 ----------------------------------------------------------------------------------------------------
2023-10-18 19:37:41,825 epoch 7 - iter 73/738 - loss 0.23175227 - time (sec): 1.22 - samples/sec: 12904.32 - lr: 0.000022 - momentum: 0.000000
2023-10-18 19:37:43,384 epoch 7 - iter 146/738 - loss 0.22300055 - time (sec): 2.78 - samples/sec: 11392.54 - lr: 0.000021 - momentum: 0.000000
2023-10-18 19:37:45,124 epoch 7 - iter 219/738 - loss 0.22803988 - time (sec): 4.52 - samples/sec: 10310.56 - lr: 0.000021 - momentum: 0.000000
2023-10-18 19:37:46,877 epoch 7 - iter 292/738 - loss 0.22994480 - time (sec): 6.27 - samples/sec: 10064.78 - lr: 0.000020 - momentum: 0.000000
2023-10-18 19:37:48,594 epoch 7 - iter 365/738 - loss 0.22861585 - time (sec): 7.99 - samples/sec: 9877.93 - lr: 0.000020 - momentum: 0.000000
2023-10-18 19:37:50,341 epoch 7 - iter 438/738 - loss 0.23000216 - time (sec): 9.73 - samples/sec: 9765.22 - lr: 0.000019 - momentum: 0.000000
2023-10-18 19:37:52,098 epoch 7 - iter 511/738 - loss 0.23097716 - time (sec): 11.49 - samples/sec: 9786.22 - lr: 0.000018 - momentum: 0.000000
2023-10-18 19:37:53,823 epoch 7 - iter 584/738 - loss 0.23756133 - time (sec): 13.22 - samples/sec: 9760.59 - lr: 0.000018 - momentum: 0.000000
2023-10-18 19:37:55,721 epoch 7 - iter 657/738 - loss 0.24091796 - time (sec): 15.11 - samples/sec: 9703.60 - lr: 0.000017 - momentum: 0.000000
2023-10-18 19:37:57,573 epoch 7 - iter 730/738 - loss 0.24460650 - time (sec): 16.97 - samples/sec: 9674.26 - lr: 0.000017 - momentum: 0.000000
2023-10-18 19:37:57,801 ----------------------------------------------------------------------------------------------------
2023-10-18 19:37:57,801 EPOCH 7 done: loss 0.2457 - lr: 0.000017
2023-10-18 19:38:05,080 DEV : loss 0.2305789589881897 - f1-score (micro avg) 0.5163
2023-10-18 19:38:05,106 saving best model
2023-10-18 19:38:05,140 ----------------------------------------------------------------------------------------------------
2023-10-18 19:38:07,036 epoch 8 - iter 73/738 - loss 0.24432992 - time (sec): 1.90 - samples/sec: 10013.14 - lr: 0.000016 - momentum: 0.000000
2023-10-18 19:38:08,702 epoch 8 - iter 146/738 - loss 0.24646021 - time (sec): 3.56 - samples/sec: 9552.37 - lr: 0.000016 - momentum: 0.000000
2023-10-18 19:38:10,500 epoch 8 - iter 219/738 - loss 0.23941772 - time (sec): 5.36 - samples/sec: 9484.96 - lr: 0.000015 - momentum: 0.000000
2023-10-18 19:38:12,026 epoch 8 - iter 292/738 - loss 0.23803846 - time (sec): 6.89 - samples/sec: 9673.31 - lr: 0.000015 - momentum: 0.000000
2023-10-18 19:38:13,459 epoch 8 - iter 365/738 - loss 0.23848125 - time (sec): 8.32 - samples/sec: 9866.35 - lr: 0.000014 - momentum: 0.000000
2023-10-18 19:38:15,016 epoch 8 - iter 438/738 - loss 0.23643961 - time (sec): 9.88 - samples/sec: 9904.13 - lr: 0.000013 - momentum: 0.000000
2023-10-18 19:38:16,844 epoch 8 - iter 511/738 - loss 0.23715027 - time (sec): 11.70 - samples/sec: 9762.36 - lr: 0.000013 - momentum: 0.000000
2023-10-18 19:38:18,645 epoch 8 - iter 584/738 - loss 0.23740836 - time (sec): 13.50 - samples/sec: 9658.65 - lr: 0.000012 - momentum: 0.000000
2023-10-18 19:38:20,414 epoch 8 - iter 657/738 - loss 0.23666139 - time (sec): 15.27 - samples/sec: 9647.63 - lr: 0.000012 - momentum: 0.000000
2023-10-18 19:38:22,159 epoch 8 - iter 730/738 - loss 0.23604389 - time (sec): 17.02 - samples/sec: 9637.66 - lr: 0.000011 - momentum: 0.000000
2023-10-18 19:38:22,369 ----------------------------------------------------------------------------------------------------
2023-10-18 19:38:22,370 EPOCH 8 done: loss 0.2368 - lr: 0.000011
2023-10-18 19:38:29,599 DEV : loss 0.22756943106651306 - f1-score (micro avg) 0.5194
2023-10-18 19:38:29,628 saving best model
2023-10-18 19:38:29,662 ----------------------------------------------------------------------------------------------------
2023-10-18 19:38:31,966 epoch 9 - iter 73/738 - loss 0.25562055 - time (sec): 2.30 - samples/sec: 7984.22 - lr: 0.000011 - momentum: 0.000000
2023-10-18 19:38:33,683 epoch 9 - iter 146/738 - loss 0.23148464 - time (sec): 4.02 - samples/sec: 8712.38 - lr: 0.000010 - momentum: 0.000000
2023-10-18 19:38:35,475 epoch 9 - iter 219/738 - loss 0.22642035 - time (sec): 5.81 - samples/sec: 9208.13 - lr: 0.000010 - momentum: 0.000000
2023-10-18 19:38:37,287 epoch 9 - iter 292/738 - loss 0.22311580 - time (sec): 7.62 - samples/sec: 9207.71 - lr: 0.000009 - momentum: 0.000000
2023-10-18 19:38:39,000 epoch 9 - iter 365/738 - loss 0.22831720 - time (sec): 9.34 - samples/sec: 9112.28 - lr: 0.000008 - momentum: 0.000000
2023-10-18 19:38:40,709 epoch 9 - iter 438/738 - loss 0.22852117 - time (sec): 11.05 - samples/sec: 9105.26 - lr: 0.000008 - momentum: 0.000000
2023-10-18 19:38:42,477 epoch 9 - iter 511/738 - loss 0.22838509 - time (sec): 12.81 - samples/sec: 9069.26 - lr: 0.000007 - momentum: 0.000000
2023-10-18 19:38:44,262 epoch 9 - iter 584/738 - loss 0.22970756 - time (sec): 14.60 - samples/sec: 9025.39 - lr: 0.000007 - momentum: 0.000000
2023-10-18 19:38:45,973 epoch 9 - iter 657/738 - loss 0.22786409 - time (sec): 16.31 - samples/sec: 9048.38 - lr: 0.000006 - momentum: 0.000000
2023-10-18 19:38:47,790 epoch 9 - iter 730/738 - loss 0.22652181 - time (sec): 18.13 - samples/sec: 9103.82 - lr: 0.000006 - momentum: 0.000000
2023-10-18 19:38:47,959 ----------------------------------------------------------------------------------------------------
2023-10-18 19:38:47,959 EPOCH 9 done: loss 0.2268 - lr: 0.000006
2023-10-18 19:38:55,217 DEV : loss 0.23068448901176453 - f1-score (micro avg) 0.5154
2023-10-18 19:38:55,244 ----------------------------------------------------------------------------------------------------
2023-10-18 19:38:56,971 epoch 10 - iter 73/738 - loss 0.20547344 - time (sec): 1.73 - samples/sec: 9235.33 - lr: 0.000005 - momentum: 0.000000
2023-10-18 19:38:58,664 epoch 10 - iter 146/738 - loss 0.21960310 - time (sec): 3.42 - samples/sec: 9475.86 - lr: 0.000004 - momentum: 0.000000
2023-10-18 19:39:00,598 epoch 10 - iter 219/738 - loss 0.23832641 - time (sec): 5.35 - samples/sec: 9581.14 - lr: 0.000004 - momentum: 0.000000
2023-10-18 19:39:02,390 epoch 10 - iter 292/738 - loss 0.23700380 - time (sec): 7.14 - samples/sec: 9527.83 - lr: 0.000003 - momentum: 0.000000
2023-10-18 19:39:04,112 epoch 10 - iter 365/738 - loss 0.22774723 - time (sec): 8.87 - samples/sec: 9450.00 - lr: 0.000003 - momentum: 0.000000
2023-10-18 19:39:05,879 epoch 10 - iter 438/738 - loss 0.22541519 - time (sec): 10.63 - samples/sec: 9287.51 - lr: 0.000002 - momentum: 0.000000
2023-10-18 19:39:07,708 epoch 10 - iter 511/738 - loss 0.22318112 - time (sec): 12.46 - samples/sec: 9359.36 - lr: 0.000002 - momentum: 0.000000
2023-10-18 19:39:09,458 epoch 10 - iter 584/738 - loss 0.22569354 - time (sec): 14.21 - samples/sec: 9335.60 - lr: 0.000001 - momentum: 0.000000
2023-10-18 19:39:11,291 epoch 10 - iter 657/738 - loss 0.22324440 - time (sec): 16.05 - samples/sec: 9361.85 - lr: 0.000001 - momentum: 0.000000
2023-10-18 19:39:13,017 epoch 10 - iter 730/738 - loss 0.22492781 - time (sec): 17.77 - samples/sec: 9276.55 - lr: 0.000000 - momentum: 0.000000
2023-10-18 19:39:13,207 ----------------------------------------------------------------------------------------------------
2023-10-18 19:39:13,208 EPOCH 10 done: loss 0.2250 - lr: 0.000000
2023-10-18 19:39:20,487 DEV : loss 0.22922056913375854 - f1-score (micro avg) 0.5225
2023-10-18 19:39:20,513 saving best model
2023-10-18 19:39:20,574 ----------------------------------------------------------------------------------------------------
2023-10-18 19:39:20,574 Loading model from best epoch ...
2023-10-18 19:39:20,651 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
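The 21 tags above form a BIOES scheme over the five entity classes (loc, pers, org, time, prod): S marks a single-token entity, B/I/E mark the begin, inside, and end of a multi-token one, and O is outside any entity. A minimal, illustrative span decoder for that scheme (`bioes_to_spans` is a hypothetical helper for this log, not Flair's internal decoder):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans,
    with `end` exclusive. Malformed runs are dropped rather than guessed."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, cls = tag.partition("-")
        if prefix == "S":                       # single-token entity
            spans.append((cls, i, i + 1))
        elif prefix == "B":                     # open a multi-token entity
            start, label = i, cls
        elif prefix == "E" and label == cls:    # close the open entity
            spans.append((cls, start, i + 1))
            start, label = None, None
        elif prefix != "I":                     # O (or inconsistent tag): reset
            start, label = None, None
    return spans

print(bioes_to_spans(["B-loc", "E-loc", "O", "S-pers"]))
# [('loc', 0, 2), ('pers', 3, 4)]
```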
2023-10-18 19:39:23,289
Results:
- F-score (micro) 0.533
- F-score (macro) 0.3321
- Accuracy 0.385

By class:
              precision    recall  f1-score   support

         loc     0.5560    0.7517    0.6392       858
        pers     0.4071    0.4898    0.4446       537
         org     0.2941    0.0758    0.1205       132
        time     0.4333    0.4815    0.4561        54
        prod     0.0000    0.0000    0.0000        61

   micro avg     0.4968    0.5749    0.5330      1642
   macro avg     0.3381    0.3597    0.3321      1642
weighted avg     0.4616    0.5749    0.5041      1642
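The summary rows can be cross-checked against the per-class numbers above (plain Python, values copied from the table):

```python
# Per-class scores from the table: label -> (precision, recall, f1, support).
per_class = {
    "loc":  (0.5560, 0.7517, 0.6392, 858),
    "pers": (0.4071, 0.4898, 0.4446, 537),
    "org":  (0.2941, 0.0758, 0.1205, 132),
    "time": (0.4333, 0.4815, 0.4561, 54),
    "prod": (0.0000, 0.0000, 0.0000, 61),
}

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Micro F1: harmonic mean of the reported micro precision and recall.
p, r = 0.4968, 0.5749
micro_f1 = 2 * p * r / (p + r)

print(round(macro_f1, 4), round(micro_f1, 4))  # 0.3321 0.533
```

Note the gap between micro (0.533) and macro (0.3321) F1: it is driven almost entirely by the two low-support classes, org and prod.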
2023-10-18 19:39:23,289 ----------------------------------------------------------------------------------------------------