2023-10-17 19:54:58,087 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,087 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
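The two blocks above give the full module tree of the tagger and the corpus it was trained on. A minimal Flair sketch of how such a setup could be assembled is shown below; the dataset arguments and the backbone identifier are assumptions inferred from this log (corpus line and training base path), not a verified copy of the original training script.

from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# HIPE-2020 French subset of the HIPE-2022 data, as referenced in the corpus line above
# (argument names are assumptions for a recent Flair release).
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Backbone identifier inferred from the training base path further down in this log (assumption).
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",               # last transformer layer only ("layers-1" in the base path)
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,              # "crfFalse" in the base path
    use_rnn=False,              # no LSTM appears in the module tree above
    reproject_embeddings=False, # the printed head is a single Linear(768 -> 21)
)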
2023-10-17 19:54:58,088 Train: 5901 sentences
2023-10-17 19:54:58,088 (train_with_dev=False, train_with_test=False)
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 Training Params:
2023-10-17 19:54:58,088 - learning_rate: "5e-05"
2023-10-17 19:54:58,088 - mini_batch_size: "8"
2023-10-17 19:54:58,088 - max_epochs: "10"
2023-10-17 19:54:58,088 - shuffle: "True"
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 Plugins:
2023-10-17 19:54:58,088 - TensorboardLogger
2023-10-17 19:54:58,088 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 19:54:58,088 - metric: "('micro avg', 'f1-score')"
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 Computation:
2023-10-17 19:54:58,088 - compute on device: cuda:0
2023-10-17 19:54:58,088 - embedding storage: none
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
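The training parameters, plugins, evaluation metric and base path listed above correspond roughly to a fine-tuning call like the sketch below; keyword and plugin names are assumptions based on a recent Flair version and may differ from the script that actually produced this log.

from flair.trainers import ModelTrainer
from flair.trainers.plugins import TensorboardLogger  # assumed import path

# tagger and corpus as constructed in the earlier sketch
trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1",
    learning_rate=5e-05,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
    warmup_fraction=0.1,                               # drives the LinearScheduler warmup above
    main_evaluation_metric=("micro avg", "f1-score"),  # metric used to select best-model.pt
    plugins=[TensorboardLogger()],                     # assumed way of attaching the logger plugin
)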
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,089 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 19:55:03,268 epoch 1 - iter 73/738 - loss 2.88693034 - time (sec): 5.18 - samples/sec: 3395.57 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:55:07,787 epoch 1 - iter 146/738 - loss 1.85950582 - time (sec): 9.70 - samples/sec: 3389.92 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:55:13,252 epoch 1 - iter 219/738 - loss 1.35106642 - time (sec): 15.16 - samples/sec: 3378.75 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:55:18,790 epoch 1 - iter 292/738 - loss 1.08357586 - time (sec): 20.70 - samples/sec: 3336.43 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:55:23,677 epoch 1 - iter 365/738 - loss 0.92998728 - time (sec): 25.59 - samples/sec: 3324.51 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:55:28,248 epoch 1 - iter 438/738 - loss 0.82953286 - time (sec): 30.16 - samples/sec: 3312.57 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:55:33,045 epoch 1 - iter 511/738 - loss 0.74962079 - time (sec): 34.96 - samples/sec: 3295.52 - lr: 0.000035 - momentum: 0.000000
2023-10-17 19:55:38,046 epoch 1 - iter 584/738 - loss 0.68197633 - time (sec): 39.96 - samples/sec: 3288.35 - lr: 0.000039 - momentum: 0.000000
2023-10-17 19:55:43,219 epoch 1 - iter 657/738 - loss 0.62571720 - time (sec): 45.13 - samples/sec: 3269.21 - lr: 0.000044 - momentum: 0.000000
2023-10-17 19:55:48,468 epoch 1 - iter 730/738 - loss 0.57764771 - time (sec): 50.38 - samples/sec: 3269.02 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:55:48,961 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:48,961 EPOCH 1 done: loss 0.5727 - lr: 0.000049
2023-10-17 19:55:54,850 DEV : loss 0.10554851591587067 - f1-score (micro avg) 0.7791
2023-10-17 19:55:54,883 saving best model
2023-10-17 19:55:55,250 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:00,200 epoch 2 - iter 73/738 - loss 0.14514615 - time (sec): 4.95 - samples/sec: 3366.36 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:56:05,461 epoch 2 - iter 146/738 - loss 0.14280365 - time (sec): 10.21 - samples/sec: 3405.96 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:56:11,181 epoch 2 - iter 219/738 - loss 0.13414673 - time (sec): 15.93 - samples/sec: 3260.55 - lr: 0.000048 - momentum: 0.000000
2023-10-17 19:56:16,015 epoch 2 - iter 292/738 - loss 0.12870561 - time (sec): 20.76 - samples/sec: 3242.34 - lr: 0.000048 - momentum: 0.000000
2023-10-17 19:56:20,656 epoch 2 - iter 365/738 - loss 0.12649401 - time (sec): 25.40 - samples/sec: 3224.67 - lr: 0.000047 - momentum: 0.000000
2023-10-17 19:56:25,261 epoch 2 - iter 438/738 - loss 0.12411978 - time (sec): 30.01 - samples/sec: 3236.24 - lr: 0.000047 - momentum: 0.000000
2023-10-17 19:56:30,148 epoch 2 - iter 511/738 - loss 0.11950094 - time (sec): 34.90 - samples/sec: 3251.88 - lr: 0.000046 - momentum: 0.000000
2023-10-17 19:56:35,203 epoch 2 - iter 584/738 - loss 0.11924617 - time (sec): 39.95 - samples/sec: 3240.62 - lr: 0.000046 - momentum: 0.000000
2023-10-17 19:56:40,727 epoch 2 - iter 657/738 - loss 0.11885339 - time (sec): 45.48 - samples/sec: 3244.73 - lr: 0.000045 - momentum: 0.000000
2023-10-17 19:56:46,137 epoch 2 - iter 730/738 - loss 0.11757973 - time (sec): 50.89 - samples/sec: 3234.08 - lr: 0.000045 - momentum: 0.000000
2023-10-17 19:56:46,752 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:46,752 EPOCH 2 done: loss 0.1175 - lr: 0.000045
2023-10-17 19:56:58,026 DEV : loss 0.10420098155736923 - f1-score (micro avg) 0.7988
2023-10-17 19:56:58,058 saving best model
2023-10-17 19:56:58,527 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:04,268 epoch 3 - iter 73/738 - loss 0.06478791 - time (sec): 5.74 - samples/sec: 3069.26 - lr: 0.000044 - momentum: 0.000000
2023-10-17 19:57:09,479 epoch 3 - iter 146/738 - loss 0.06871097 - time (sec): 10.95 - samples/sec: 3185.38 - lr: 0.000043 - momentum: 0.000000
2023-10-17 19:57:14,560 epoch 3 - iter 219/738 - loss 0.06720314 - time (sec): 16.03 - samples/sec: 3221.49 - lr: 0.000043 - momentum: 0.000000
2023-10-17 19:57:19,385 epoch 3 - iter 292/738 - loss 0.06920884 - time (sec): 20.85 - samples/sec: 3233.71 - lr: 0.000042 - momentum: 0.000000
2023-10-17 19:57:24,326 epoch 3 - iter 365/738 - loss 0.07062373 - time (sec): 25.80 - samples/sec: 3231.62 - lr: 0.000042 - momentum: 0.000000
2023-10-17 19:57:29,283 epoch 3 - iter 438/738 - loss 0.07250042 - time (sec): 30.75 - samples/sec: 3215.39 - lr: 0.000041 - momentum: 0.000000
2023-10-17 19:57:34,689 epoch 3 - iter 511/738 - loss 0.07252047 - time (sec): 36.16 - samples/sec: 3237.02 - lr: 0.000041 - momentum: 0.000000
2023-10-17 19:57:39,815 epoch 3 - iter 584/738 - loss 0.07374542 - time (sec): 41.28 - samples/sec: 3222.24 - lr: 0.000040 - momentum: 0.000000
2023-10-17 19:57:44,747 epoch 3 - iter 657/738 - loss 0.07287253 - time (sec): 46.22 - samples/sec: 3223.47 - lr: 0.000040 - momentum: 0.000000
2023-10-17 19:57:49,392 epoch 3 - iter 730/738 - loss 0.07335412 - time (sec): 50.86 - samples/sec: 3243.97 - lr: 0.000039 - momentum: 0.000000
2023-10-17 19:57:49,824 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:49,825 EPOCH 3 done: loss 0.0731 - lr: 0.000039
2023-10-17 19:58:01,170 DEV : loss 0.1143854483962059 - f1-score (micro avg) 0.8304
2023-10-17 19:58:01,201 saving best model
2023-10-17 19:58:01,680 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:06,869 epoch 4 - iter 73/738 - loss 0.05125944 - time (sec): 5.18 - samples/sec: 3063.50 - lr: 0.000038 - momentum: 0.000000
2023-10-17 19:58:12,085 epoch 4 - iter 146/738 - loss 0.04770594 - time (sec): 10.40 - samples/sec: 3221.63 - lr: 0.000038 - momentum: 0.000000
2023-10-17 19:58:16,715 epoch 4 - iter 219/738 - loss 0.05111511 - time (sec): 15.03 - samples/sec: 3250.15 - lr: 0.000037 - momentum: 0.000000
2023-10-17 19:58:21,755 epoch 4 - iter 292/738 - loss 0.05211159 - time (sec): 20.07 - samples/sec: 3238.43 - lr: 0.000037 - momentum: 0.000000
2023-10-17 19:58:26,379 epoch 4 - iter 365/738 - loss 0.05176227 - time (sec): 24.69 - samples/sec: 3228.83 - lr: 0.000036 - momentum: 0.000000
2023-10-17 19:58:31,229 epoch 4 - iter 438/738 - loss 0.05010900 - time (sec): 29.54 - samples/sec: 3263.30 - lr: 0.000036 - momentum: 0.000000
2023-10-17 19:58:35,878 epoch 4 - iter 511/738 - loss 0.04871523 - time (sec): 34.19 - samples/sec: 3282.05 - lr: 0.000035 - momentum: 0.000000
2023-10-17 19:58:41,325 epoch 4 - iter 584/738 - loss 0.04755472 - time (sec): 39.64 - samples/sec: 3277.83 - lr: 0.000035 - momentum: 0.000000
2023-10-17 19:58:46,444 epoch 4 - iter 657/738 - loss 0.04734226 - time (sec): 44.76 - samples/sec: 3265.92 - lr: 0.000034 - momentum: 0.000000
2023-10-17 19:58:52,119 epoch 4 - iter 730/738 - loss 0.04826096 - time (sec): 50.43 - samples/sec: 3266.32 - lr: 0.000033 - momentum: 0.000000
2023-10-17 19:58:52,586 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:52,587 EPOCH 4 done: loss 0.0482 - lr: 0.000033
2023-10-17 19:59:03,974 DEV : loss 0.14476759731769562 - f1-score (micro avg) 0.8296
2023-10-17 19:59:04,007 ----------------------------------------------------------------------------------------------------
2023-10-17 19:59:09,112 epoch 5 - iter 73/738 - loss 0.02409841 - time (sec): 5.10 - samples/sec: 3478.57 - lr: 0.000033 - momentum: 0.000000
2023-10-17 19:59:13,903 epoch 5 - iter 146/738 - loss 0.02674600 - time (sec): 9.90 - samples/sec: 3380.35 - lr: 0.000032 - momentum: 0.000000
2023-10-17 19:59:18,707 epoch 5 - iter 219/738 - loss 0.03079058 - time (sec): 14.70 - samples/sec: 3371.53 - lr: 0.000032 - momentum: 0.000000
2023-10-17 19:59:23,875 epoch 5 - iter 292/738 - loss 0.03683637 - time (sec): 19.87 - samples/sec: 3340.91 - lr: 0.000031 - momentum: 0.000000
2023-10-17 19:59:28,827 epoch 5 - iter 365/738 - loss 0.03423444 - time (sec): 24.82 - samples/sec: 3340.14 - lr: 0.000031 - momentum: 0.000000
2023-10-17 19:59:33,840 epoch 5 - iter 438/738 - loss 0.03471070 - time (sec): 29.83 - samples/sec: 3336.10 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:59:38,818 epoch 5 - iter 511/738 - loss 0.03388460 - time (sec): 34.81 - samples/sec: 3313.82 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:59:43,507 epoch 5 - iter 584/738 - loss 0.03409148 - time (sec): 39.50 - samples/sec: 3308.22 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:59:48,472 epoch 5 - iter 657/738 - loss 0.03441889 - time (sec): 44.46 - samples/sec: 3313.79 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:59:53,517 epoch 5 - iter 730/738 - loss 0.03505580 - time (sec): 49.51 - samples/sec: 3315.30 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:59:54,306 ----------------------------------------------------------------------------------------------------
2023-10-17 19:59:54,306 EPOCH 5 done: loss 0.0354 - lr: 0.000028
2023-10-17 20:00:05,845 DEV : loss 0.19043707847595215 - f1-score (micro avg) 0.8318
2023-10-17 20:00:05,880 saving best model
2023-10-17 20:00:06,366 ----------------------------------------------------------------------------------------------------
2023-10-17 20:00:11,320 epoch 6 - iter 73/738 - loss 0.03216532 - time (sec): 4.95 - samples/sec: 3176.87 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:00:16,370 epoch 6 - iter 146/738 - loss 0.02639679 - time (sec): 10.00 - samples/sec: 3286.09 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:00:21,647 epoch 6 - iter 219/738 - loss 0.02284424 - time (sec): 15.28 - samples/sec: 3237.56 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:00:27,112 epoch 6 - iter 292/738 - loss 0.02623873 - time (sec): 20.74 - samples/sec: 3148.97 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:00:32,148 epoch 6 - iter 365/738 - loss 0.02569826 - time (sec): 25.78 - samples/sec: 3159.99 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:00:36,948 epoch 6 - iter 438/738 - loss 0.02511489 - time (sec): 30.58 - samples/sec: 3174.95 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:00:42,160 epoch 6 - iter 511/738 - loss 0.02584371 - time (sec): 35.79 - samples/sec: 3177.93 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:00:47,108 epoch 6 - iter 584/738 - loss 0.02552496 - time (sec): 40.74 - samples/sec: 3212.15 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:00:51,949 epoch 6 - iter 657/738 - loss 0.02629373 - time (sec): 45.58 - samples/sec: 3222.14 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:00:57,061 epoch 6 - iter 730/738 - loss 0.02671452 - time (sec): 50.69 - samples/sec: 3245.73 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:00:57,723 ----------------------------------------------------------------------------------------------------
2023-10-17 20:00:57,723 EPOCH 6 done: loss 0.0269 - lr: 0.000022
2023-10-17 20:01:09,239 DEV : loss 0.1950322538614273 - f1-score (micro avg) 0.8249
2023-10-17 20:01:09,271 ----------------------------------------------------------------------------------------------------
2023-10-17 20:01:14,527 epoch 7 - iter 73/738 - loss 0.01386572 - time (sec): 5.25 - samples/sec: 3200.54 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:01:19,720 epoch 7 - iter 146/738 - loss 0.01634565 - time (sec): 10.45 - samples/sec: 3157.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:01:25,152 epoch 7 - iter 219/738 - loss 0.01790451 - time (sec): 15.88 - samples/sec: 3193.90 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:01:30,559 epoch 7 - iter 292/738 - loss 0.01736663 - time (sec): 21.29 - samples/sec: 3204.39 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:01:35,540 epoch 7 - iter 365/738 - loss 0.01978003 - time (sec): 26.27 - samples/sec: 3192.85 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:01:40,624 epoch 7 - iter 438/738 - loss 0.01962635 - time (sec): 31.35 - samples/sec: 3207.62 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:01:45,292 epoch 7 - iter 511/738 - loss 0.01942126 - time (sec): 36.02 - samples/sec: 3231.16 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:01:50,368 epoch 7 - iter 584/738 - loss 0.01907844 - time (sec): 41.09 - samples/sec: 3246.80 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:01:55,510 epoch 7 - iter 657/738 - loss 0.01851805 - time (sec): 46.24 - samples/sec: 3245.13 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:02:00,008 epoch 7 - iter 730/738 - loss 0.01755251 - time (sec): 50.74 - samples/sec: 3250.75 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:02:00,455 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:00,455 EPOCH 7 done: loss 0.0175 - lr: 0.000017
2023-10-17 20:02:11,907 DEV : loss 0.20099307596683502 - f1-score (micro avg) 0.8472
2023-10-17 20:02:11,942 saving best model
2023-10-17 20:02:12,437 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:17,409 epoch 8 - iter 73/738 - loss 0.00530868 - time (sec): 4.97 - samples/sec: 3271.17 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:02:22,979 epoch 8 - iter 146/738 - loss 0.00860378 - time (sec): 10.54 - samples/sec: 3219.29 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:02:27,693 epoch 8 - iter 219/738 - loss 0.00882523 - time (sec): 15.25 - samples/sec: 3245.99 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:02:32,332 epoch 8 - iter 292/738 - loss 0.00758182 - time (sec): 19.89 - samples/sec: 3282.80 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:02:37,477 epoch 8 - iter 365/738 - loss 0.00936325 - time (sec): 25.04 - samples/sec: 3273.17 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:02:42,143 epoch 8 - iter 438/738 - loss 0.00889819 - time (sec): 29.70 - samples/sec: 3288.61 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:02:47,889 epoch 8 - iter 511/738 - loss 0.01120590 - time (sec): 35.45 - samples/sec: 3292.36 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:02:52,781 epoch 8 - iter 584/738 - loss 0.01070108 - time (sec): 40.34 - samples/sec: 3292.28 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:02:57,905 epoch 8 - iter 657/738 - loss 0.01029857 - time (sec): 45.47 - samples/sec: 3281.21 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:03:02,819 epoch 8 - iter 730/738 - loss 0.01057085 - time (sec): 50.38 - samples/sec: 3266.74 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:03:03,437 ----------------------------------------------------------------------------------------------------
2023-10-17 20:03:03,438 EPOCH 8 done: loss 0.0106 - lr: 0.000011
2023-10-17 20:03:15,127 DEV : loss 0.2029201090335846 - f1-score (micro avg) 0.85
2023-10-17 20:03:15,159 saving best model
2023-10-17 20:03:15,636 ----------------------------------------------------------------------------------------------------
2023-10-17 20:03:21,317 epoch 9 - iter 73/738 - loss 0.00731494 - time (sec): 5.67 - samples/sec: 3159.58 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:03:26,465 epoch 9 - iter 146/738 - loss 0.00660726 - time (sec): 10.82 - samples/sec: 3343.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:03:31,815 epoch 9 - iter 219/738 - loss 0.00688380 - time (sec): 16.17 - samples/sec: 3369.99 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:03:36,707 epoch 9 - iter 292/738 - loss 0.00641742 - time (sec): 21.06 - samples/sec: 3302.92 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:03:41,868 epoch 9 - iter 365/738 - loss 0.00586385 - time (sec): 26.23 - samples/sec: 3257.53 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:03:46,603 epoch 9 - iter 438/738 - loss 0.00547976 - time (sec): 30.96 - samples/sec: 3284.63 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:03:51,092 epoch 9 - iter 511/738 - loss 0.00675333 - time (sec): 35.45 - samples/sec: 3291.12 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:03:55,944 epoch 9 - iter 584/738 - loss 0.00670660 - time (sec): 40.30 - samples/sec: 3279.67 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:04:01,758 epoch 9 - iter 657/738 - loss 0.00683468 - time (sec): 46.12 - samples/sec: 3255.35 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:04:06,152 epoch 9 - iter 730/738 - loss 0.00731875 - time (sec): 50.51 - samples/sec: 3256.15 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:04:06,710 ----------------------------------------------------------------------------------------------------
2023-10-17 20:04:06,711 EPOCH 9 done: loss 0.0072 - lr: 0.000006
2023-10-17 20:04:18,319 DEV : loss 0.20573198795318604 - f1-score (micro avg) 0.8458
2023-10-17 20:04:18,358 ----------------------------------------------------------------------------------------------------
2023-10-17 20:04:24,146 epoch 10 - iter 73/738 - loss 0.00431885 - time (sec): 5.79 - samples/sec: 3384.62 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:04:29,348 epoch 10 - iter 146/738 - loss 0.00511183 - time (sec): 10.99 - samples/sec: 3350.39 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:04:34,219 epoch 10 - iter 219/738 - loss 0.00524316 - time (sec): 15.86 - samples/sec: 3345.15 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:04:39,060 epoch 10 - iter 292/738 - loss 0.00450078 - time (sec): 20.70 - samples/sec: 3268.11 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:04:43,692 epoch 10 - iter 365/738 - loss 0.00430535 - time (sec): 25.33 - samples/sec: 3281.17 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:04:48,199 epoch 10 - iter 438/738 - loss 0.00395909 - time (sec): 29.84 - samples/sec: 3327.98 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:04:53,745 epoch 10 - iter 511/738 - loss 0.00416350 - time (sec): 35.39 - samples/sec: 3296.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:04:58,445 epoch 10 - iter 584/738 - loss 0.00467921 - time (sec): 40.09 - samples/sec: 3287.80 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:05:03,398 epoch 10 - iter 657/738 - loss 0.00470911 - time (sec): 45.04 - samples/sec: 3274.98 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:05:08,928 epoch 10 - iter 730/738 - loss 0.00478977 - time (sec): 50.57 - samples/sec: 3260.86 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:05:09,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:09,382 EPOCH 10 done: loss 0.0050 - lr: 0.000000
2023-10-17 20:05:21,014 DEV : loss 0.2125636637210846 - f1-score (micro avg) 0.8478
2023-10-17 20:05:21,431 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:21,432 Loading model from best epoch ...
2023-10-17 20:05:22,951 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
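After training, the best checkpoint is reloaded for the final evaluation. The short, illustrative example below shows how that checkpoint could be loaded for inference; the sentence is made up, and the checkpoint path is simply the base path from this log plus best-model.pt.

from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1/best-model.pt"
)

sentence = Sentence("Le 14 juillet 1789, la Bastille est prise à Paris.")
tagger.predict(sentence)

# Print predicted spans with their decoded labels (loc, pers, org, time, prod).
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)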
2023-10-17 20:05:29,932
Results:
- F-score (micro) 0.8107
- F-score (macro) 0.7154
- Accuracy 0.7
By class:
              precision    recall  f1-score   support

         loc     0.8549    0.8928    0.8734       858
        pers     0.7792    0.8082    0.7934       537
         org     0.6154    0.6061    0.6107       132
        prod     0.6721    0.6721    0.6721        61
        time     0.5781    0.6852    0.6271        54

   micro avg     0.7951    0.8270    0.8107      1642
   macro avg     0.6999    0.7329    0.7154      1642
weighted avg     0.7950    0.8270    0.8106      1642
2023-10-17 20:05:29,933 ----------------------------------------------------------------------------------------------------
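The classification report above is the test-set evaluation of the reloaded best model. A hedged sketch of an equivalent call, reusing the corpus from the first sketch and the reloaded tagger from the previous one, could look like this (argument names are assumptions for a recent Flair release):

result = tagger.evaluate(
    corpus.test,
    gold_label_type="ner",
    mini_batch_size=8,
)
print(result.detailed_results)  # per-class precision/recall/F1 table like the one above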