2023-10-17 19:54:58,087 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,087 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 Train: 5901 sentences
2023-10-17 19:54:58,088 (train_with_dev=False, train_with_test=False)
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 Training Params:
2023-10-17 19:54:58,088  - learning_rate: "5e-05"
2023-10-17 19:54:58,088  - mini_batch_size: "8"
2023-10-17 19:54:58,088  - max_epochs: "10"
2023-10-17 19:54:58,088  - shuffle: "True"
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 Plugins:
2023-10-17 19:54:58,088  - TensorboardLogger
2023-10-17 19:54:58,088  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 19:54:58,088  - metric: "('micro avg', 'f1-score')"
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 Computation:
2023-10-17 19:54:58,088  - compute on device: cuda:0
2023-10-17 19:54:58,088  - embedding storage: none
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:58,089 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 19:55:03,268 epoch 1 - iter 73/738 - loss 2.88693034 - time (sec): 5.18 - samples/sec: 3395.57 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:55:07,787 epoch 1 - iter 146/738 - loss 1.85950582 - time (sec): 9.70 - samples/sec: 3389.92 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:55:13,252 epoch 1 - iter 219/738 - loss 1.35106642 - time (sec): 15.16 - samples/sec: 3378.75 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:55:18,790 epoch 1 - iter 292/738 - loss 1.08357586 - time (sec): 20.70 - samples/sec: 3336.43 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:55:23,677 epoch 1 - iter 365/738 - loss 0.92998728 - time (sec): 25.59 - samples/sec: 3324.51 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:55:28,248 epoch 1 - iter 438/738 - loss 0.82953286 - time (sec): 30.16 - samples/sec: 3312.57 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:55:33,045 epoch 1 - iter 511/738 - loss 0.74962079 - time (sec): 34.96 - samples/sec: 3295.52 - lr: 0.000035 - momentum: 0.000000
2023-10-17 19:55:38,046 epoch 1 - iter 584/738 - loss 0.68197633 - time (sec): 39.96 - samples/sec: 3288.35 - lr: 0.000039 - momentum: 0.000000
2023-10-17 19:55:43,219 epoch 1 - iter 657/738 - loss 0.62571720 - time (sec): 45.13 - samples/sec: 3269.21 - lr: 0.000044 - momentum: 0.000000
2023-10-17 19:55:48,468 epoch 1 - iter 730/738 - loss 0.57764771 - time (sec): 50.38 - samples/sec: 3269.02 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:55:48,961 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:48,961 EPOCH 1 done: loss 0.5727 - lr: 0.000049
2023-10-17 19:55:54,850 DEV : loss 0.10554851591587067 - f1-score (micro avg)  0.7791
2023-10-17 19:55:54,883 saving best model
2023-10-17 19:55:55,250 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:00,200 epoch 2 - iter 73/738 - loss 0.14514615 - time (sec): 4.95 - samples/sec: 3366.36 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:56:05,461 epoch 2 - iter 146/738 - loss 0.14280365 - time (sec): 10.21 - samples/sec: 3405.96 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:56:11,181 epoch 2 - iter 219/738 - loss 0.13414673 - time (sec): 15.93 - samples/sec: 3260.55 - lr: 0.000048 - momentum: 0.000000
2023-10-17 19:56:16,015 epoch 2 - iter 292/738 - loss 0.12870561 - time (sec): 20.76 - samples/sec: 3242.34 - lr: 0.000048 - momentum: 0.000000
2023-10-17 19:56:20,656 epoch 2 - iter 365/738 - loss 0.12649401 - time (sec): 25.40 - samples/sec: 3224.67 - lr: 0.000047 - momentum: 0.000000
2023-10-17 19:56:25,261 epoch 2 - iter 438/738 - loss 0.12411978 - time (sec): 30.01 - samples/sec: 3236.24 - lr: 0.000047 - momentum: 0.000000
2023-10-17 19:56:30,148 epoch 2 - iter 511/738 - loss 0.11950094 - time (sec): 34.90 - samples/sec: 3251.88 - lr: 0.000046 - momentum: 0.000000
2023-10-17 19:56:35,203 epoch 2 - iter 584/738 - loss 0.11924617 - time (sec): 39.95 - samples/sec: 3240.62 - lr: 0.000046 - momentum: 0.000000
2023-10-17 19:56:40,727 epoch 2 - iter 657/738 - loss 0.11885339 - time (sec): 45.48 - samples/sec: 3244.73 - lr: 0.000045 - momentum: 0.000000
2023-10-17 19:56:46,137 epoch 2 - iter 730/738 - loss 0.11757973 - time (sec): 50.89 - samples/sec: 3234.08 - lr: 0.000045 - momentum: 0.000000
2023-10-17 19:56:46,752 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:46,752 EPOCH 2 done: loss 0.1175 - lr: 0.000045
2023-10-17 19:56:58,026 DEV : loss 0.10420098155736923 - f1-score (micro avg)  0.7988
2023-10-17 19:56:58,058 saving best model
2023-10-17 19:56:58,527 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:04,268 epoch 3 - iter 73/738 - loss 0.06478791 - time (sec): 5.74 - samples/sec: 3069.26 - lr: 0.000044 - momentum: 0.000000
2023-10-17 19:57:09,479 epoch 3 - iter 146/738 - loss 0.06871097 - time (sec): 10.95 - samples/sec: 3185.38 - lr: 0.000043 - momentum: 0.000000
2023-10-17 19:57:14,560 epoch 3 - iter 219/738 - loss 0.06720314 - time (sec): 16.03 - samples/sec: 3221.49 - lr: 0.000043 - momentum: 0.000000
2023-10-17 19:57:19,385 epoch 3 - iter 292/738 - loss 0.06920884 - time (sec): 20.85 - samples/sec: 3233.71 - lr: 0.000042 - momentum: 0.000000
2023-10-17 19:57:24,326 epoch 3 - iter 365/738 - loss 0.07062373 - time (sec): 25.80 - samples/sec: 3231.62 - lr: 0.000042 - momentum: 0.000000
2023-10-17 19:57:29,283 epoch 3 - iter 438/738 - loss 0.07250042 - time (sec): 30.75 - samples/sec: 3215.39 - lr: 0.000041 - momentum: 0.000000
2023-10-17 19:57:34,689 epoch 3 - iter 511/738 - loss 0.07252047 - time (sec): 36.16 - samples/sec: 3237.02 - lr: 0.000041 - momentum: 0.000000
2023-10-17 19:57:39,815 epoch 3 - iter 584/738 - loss 0.07374542 - time (sec): 41.28 - samples/sec: 3222.24 - lr: 0.000040 - momentum: 0.000000
2023-10-17 19:57:44,747 epoch 3 - iter 657/738 - loss 0.07287253 - time (sec): 46.22 - samples/sec: 3223.47 - lr: 0.000040 - momentum: 0.000000
2023-10-17 19:57:49,392 epoch 3 - iter 730/738 - loss 0.07335412 - time (sec): 50.86 - samples/sec: 3243.97 - lr: 0.000039 - momentum: 0.000000
2023-10-17 19:57:49,824 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:49,825 EPOCH 3 done: loss 0.0731 - lr: 0.000039
2023-10-17 19:58:01,170 DEV : loss 0.1143854483962059 - f1-score (micro avg)  0.8304
2023-10-17 19:58:01,201 saving best model
2023-10-17 19:58:01,680 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:06,869 epoch 4 - iter 73/738 - loss 0.05125944 - time (sec): 5.18 - samples/sec: 3063.50 - lr: 0.000038 - momentum: 0.000000
2023-10-17 19:58:12,085 epoch 4 - iter 146/738 - loss 0.04770594 - time (sec): 10.40 - samples/sec: 3221.63 - lr: 0.000038 - momentum: 0.000000
2023-10-17 19:58:16,715 epoch 4 - iter 219/738 - loss 0.05111511 - time (sec): 15.03 - samples/sec: 3250.15 - lr: 0.000037 - momentum: 0.000000
2023-10-17 19:58:21,755 epoch 4 - iter 292/738 - loss 0.05211159 - time (sec): 20.07 - samples/sec: 3238.43 - lr: 0.000037 - momentum: 0.000000
2023-10-17 19:58:26,379 epoch 4 - iter 365/738 - loss 0.05176227 - time (sec): 24.69 - samples/sec: 3228.83 - lr: 0.000036 - momentum: 0.000000
2023-10-17 19:58:31,229 epoch 4 - iter 438/738 - loss 0.05010900 - time (sec): 29.54 - samples/sec: 3263.30 - lr: 0.000036 - momentum: 0.000000
2023-10-17 19:58:35,878 epoch 4 - iter 511/738 - loss 0.04871523 - time (sec): 34.19 - samples/sec: 3282.05 - lr: 0.000035 - momentum: 0.000000
2023-10-17 19:58:41,325 epoch 4 - iter 584/738 - loss 0.04755472 - time (sec): 39.64 - samples/sec: 3277.83 - lr: 0.000035 - momentum: 0.000000
2023-10-17 19:58:46,444 epoch 4 - iter 657/738 - loss 0.04734226 - time (sec): 44.76 - samples/sec: 3265.92 - lr: 0.000034 - momentum: 0.000000
2023-10-17 19:58:52,119 epoch 4 - iter 730/738 - loss 0.04826096 - time (sec): 50.43 - samples/sec: 3266.32 - lr: 0.000033 - momentum: 0.000000
2023-10-17 19:58:52,586 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:52,587 EPOCH 4 done: loss 0.0482 - lr: 0.000033
2023-10-17 19:59:03,974 DEV : loss 0.14476759731769562 - f1-score (micro avg)  0.8296
2023-10-17 19:59:04,007 ----------------------------------------------------------------------------------------------------
2023-10-17 19:59:09,112 epoch 5 - iter 73/738 - loss 0.02409841 - time (sec): 5.10 - samples/sec: 3478.57 - lr: 0.000033 - momentum: 0.000000
2023-10-17 19:59:13,903 epoch 5 - iter 146/738 - loss 0.02674600 - time (sec): 9.90 - samples/sec: 3380.35 - lr: 0.000032 - momentum: 0.000000
2023-10-17 19:59:18,707 epoch 5 - iter 219/738 - loss 0.03079058 - time (sec): 14.70 - samples/sec: 3371.53 - lr: 0.000032 - momentum: 0.000000
2023-10-17 19:59:23,875 epoch 5 - iter 292/738 - loss 0.03683637 - time (sec): 19.87 - samples/sec: 3340.91 - lr: 0.000031 - momentum: 0.000000
2023-10-17 19:59:28,827 epoch 5 - iter 365/738 - loss 0.03423444 - time (sec): 24.82 - samples/sec: 3340.14 - lr: 0.000031 - momentum: 0.000000
2023-10-17 19:59:33,840 epoch 5 - iter 438/738 - loss 0.03471070 - time (sec): 29.83 - samples/sec: 3336.10 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:59:38,818 epoch 5 - iter 511/738 - loss 0.03388460 - time (sec): 34.81 - samples/sec: 3313.82 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:59:43,507 epoch 5 - iter 584/738 - loss 0.03409148 - time (sec): 39.50 - samples/sec: 3308.22 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:59:48,472 epoch 5 - iter 657/738 - loss 0.03441889 - time (sec): 44.46 - samples/sec: 3313.79 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:59:53,517 epoch 5 - iter 730/738 - loss 0.03505580 - time (sec): 49.51 - samples/sec: 3315.30 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:59:54,306 ----------------------------------------------------------------------------------------------------
2023-10-17 19:59:54,306 EPOCH 5 done: loss 0.0354 - lr: 0.000028
2023-10-17 20:00:05,845 DEV : loss 0.19043707847595215 - f1-score (micro avg)  0.8318
2023-10-17 20:00:05,880 saving best model
2023-10-17 20:00:06,366 ----------------------------------------------------------------------------------------------------
2023-10-17 20:00:11,320 epoch 6 - iter 73/738 - loss 0.03216532 - time (sec): 4.95 - samples/sec: 3176.87 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:00:16,370 epoch 6 - iter 146/738 - loss 0.02639679 - time (sec): 10.00 - samples/sec: 3286.09 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:00:21,647 epoch 6 - iter 219/738 - loss 0.02284424 - time (sec): 15.28 - samples/sec: 3237.56 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:00:27,112 epoch 6 - iter 292/738 - loss 0.02623873 - time (sec): 20.74 - samples/sec: 3148.97 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:00:32,148 epoch 6 - iter 365/738 - loss 0.02569826 - time (sec): 25.78 - samples/sec: 3159.99 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:00:36,948 epoch 6 - iter 438/738 - loss 0.02511489 - time (sec): 30.58 - samples/sec: 3174.95 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:00:42,160 epoch 6 - iter 511/738 - loss 0.02584371 - time (sec): 35.79 - samples/sec: 3177.93 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:00:47,108 epoch 6 - iter 584/738 - loss 0.02552496 - time (sec): 40.74 - samples/sec: 3212.15 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:00:51,949 epoch 6 - iter 657/738 - loss 0.02629373 - time (sec): 45.58 - samples/sec: 3222.14 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:00:57,061 epoch 6 - iter 730/738 - loss 0.02671452 - time (sec): 50.69 - samples/sec: 3245.73 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:00:57,723 ----------------------------------------------------------------------------------------------------
2023-10-17 20:00:57,723 EPOCH 6 done: loss 0.0269 - lr: 0.000022
2023-10-17 20:01:09,239 DEV : loss 0.1950322538614273 - f1-score (micro avg)  0.8249
2023-10-17 20:01:09,271 ----------------------------------------------------------------------------------------------------
2023-10-17 20:01:14,527 epoch 7 - iter 73/738 - loss 0.01386572 - time (sec): 5.25 - samples/sec: 3200.54 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:01:19,720 epoch 7 - iter 146/738 - loss 0.01634565 - time (sec): 10.45 - samples/sec: 3157.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:01:25,152 epoch 7 - iter 219/738 - loss 0.01790451 - time (sec): 15.88 - samples/sec: 3193.90 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:01:30,559 epoch 7 - iter 292/738 - loss 0.01736663 - time (sec): 21.29 - samples/sec: 3204.39 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:01:35,540 epoch 7 - iter 365/738 - loss 0.01978003 - time (sec): 26.27 - samples/sec: 3192.85 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:01:40,624 epoch 7 - iter 438/738 - loss 0.01962635 - time (sec): 31.35 - samples/sec: 3207.62 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:01:45,292 epoch 7 - iter 511/738 - loss 0.01942126 - time (sec): 36.02 - samples/sec: 3231.16 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:01:50,368 epoch 7 - iter 584/738 - loss 0.01907844 - time (sec): 41.09 - samples/sec: 3246.80 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:01:55,510 epoch 7 - iter 657/738 - loss 0.01851805 - time (sec): 46.24 - samples/sec: 3245.13 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:02:00,008 epoch 7 - iter 730/738 - loss 0.01755251 - time (sec): 50.74 - samples/sec: 3250.75 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:02:00,455 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:00,455 EPOCH 7 done: loss 0.0175 - lr: 0.000017
2023-10-17 20:02:11,907 DEV : loss 0.20099307596683502 - f1-score (micro avg)  0.8472
2023-10-17 20:02:11,942 saving best model
2023-10-17 20:02:12,437 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:17,409 epoch 8 - iter 73/738 - loss 0.00530868 - time (sec): 4.97 - samples/sec: 3271.17 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:02:22,979 epoch 8 - iter 146/738 - loss 0.00860378 - time (sec): 10.54 - samples/sec: 3219.29 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:02:27,693 epoch 8 - iter 219/738 - loss 0.00882523 - time (sec): 15.25 - samples/sec: 3245.99 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:02:32,332 epoch 8 - iter 292/738 - loss 0.00758182 - time (sec): 19.89 - samples/sec: 3282.80 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:02:37,477 epoch 8 - iter 365/738 - loss 0.00936325 - time (sec): 25.04 - samples/sec: 3273.17 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:02:42,143 epoch 8 - iter 438/738 - loss 0.00889819 - time (sec): 29.70 - samples/sec: 3288.61 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:02:47,889 epoch 8 - iter 511/738 - loss 0.01120590 - time (sec): 35.45 - samples/sec: 3292.36 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:02:52,781 epoch 8 - iter 584/738 - loss 0.01070108 - time (sec): 40.34 - samples/sec: 3292.28 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:02:57,905 epoch 8 - iter 657/738 - loss 0.01029857 - time (sec): 45.47 - samples/sec: 3281.21 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:03:02,819 epoch 8 - iter 730/738 - loss 0.01057085 - time (sec): 50.38 - samples/sec: 3266.74 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:03:03,437 ----------------------------------------------------------------------------------------------------
2023-10-17 20:03:03,438 EPOCH 8 done: loss 0.0106 - lr: 0.000011
2023-10-17 20:03:15,127 DEV : loss 0.2029201090335846 - f1-score (micro avg)  0.85
2023-10-17 20:03:15,159 saving best model
2023-10-17 20:03:15,636 ----------------------------------------------------------------------------------------------------
2023-10-17 20:03:21,317 epoch 9 - iter 73/738 - loss 0.00731494 - time (sec): 5.67 - samples/sec: 3159.58 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:03:26,465 epoch 9 - iter 146/738 - loss 0.00660726 - time (sec): 10.82 - samples/sec: 3343.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:03:31,815 epoch 9 - iter 219/738 - loss 0.00688380 - time (sec): 16.17 - samples/sec: 3369.99 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:03:36,707 epoch 9 - iter 292/738 - loss 0.00641742 - time (sec): 21.06 - samples/sec: 3302.92 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:03:41,868 epoch 9 - iter 365/738 - loss 0.00586385 - time (sec): 26.23 - samples/sec: 3257.53 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:03:46,603 epoch 9 - iter 438/738 - loss 0.00547976 - time (sec): 30.96 - samples/sec: 3284.63 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:03:51,092 epoch 9 - iter 511/738 - loss 0.00675333 - time (sec): 35.45 - samples/sec: 3291.12 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:03:55,944 epoch 9 - iter 584/738 - loss 0.00670660 - time (sec): 40.30 - samples/sec: 3279.67 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:04:01,758 epoch 9 - iter 657/738 - loss 0.00683468 - time (sec): 46.12 - samples/sec: 3255.35 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:04:06,152 epoch 9 - iter 730/738 - loss 0.00731875 - time (sec): 50.51 - samples/sec: 3256.15 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:04:06,710 ----------------------------------------------------------------------------------------------------
2023-10-17 20:04:06,711 EPOCH 9 done: loss 0.0072 - lr: 0.000006
2023-10-17 20:04:18,319 DEV : loss 0.20573198795318604 - f1-score (micro avg)  0.8458
2023-10-17 20:04:18,358 ----------------------------------------------------------------------------------------------------
2023-10-17 20:04:24,146 epoch 10 - iter 73/738 - loss 0.00431885 - time (sec): 5.79 - samples/sec: 3384.62 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:04:29,348 epoch 10 - iter 146/738 - loss 0.00511183 - time (sec): 10.99 - samples/sec: 3350.39 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:04:34,219 epoch 10 - iter 219/738 - loss 0.00524316 - time (sec): 15.86 - samples/sec: 3345.15 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:04:39,060 epoch 10 - iter 292/738 - loss 0.00450078 - time (sec): 20.70 - samples/sec: 3268.11 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:04:43,692 epoch 10 - iter 365/738 - loss 0.00430535 - time (sec): 25.33 - samples/sec: 3281.17 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:04:48,199 epoch 10 - iter 438/738 - loss 0.00395909 - time (sec): 29.84 - samples/sec: 3327.98 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:04:53,745 epoch 10 - iter 511/738 - loss 0.00416350 - time (sec): 35.39 - samples/sec: 3296.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:04:58,445 epoch 10 - iter 584/738 - loss 0.00467921 - time (sec): 40.09 - samples/sec: 3287.80 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:05:03,398 epoch 10 - iter 657/738 - loss 0.00470911 - time (sec): 45.04 - samples/sec: 3274.98 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:05:08,928 epoch 10 - iter 730/738 - loss 0.00478977 - time (sec): 50.57 - samples/sec: 3260.86 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:05:09,381 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:09,382 EPOCH 10 done: loss 0.0050 - lr: 0.000000
2023-10-17 20:05:21,014 DEV : loss 0.2125636637210846 - f1-score (micro avg)  0.8478
2023-10-17 20:05:21,431 ----------------------------------------------------------------------------------------------------
2023-10-17 20:05:21,432 Loading model from best epoch ...
2023-10-17 20:05:22,951 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-17 20:05:29,932 Results:
- F-score (micro) 0.8107
- F-score (macro) 0.7154
- Accuracy 0.7

By class:
              precision    recall  f1-score   support

         loc     0.8549    0.8928    0.8734       858
        pers     0.7792    0.8082    0.7934       537
         org     0.6154    0.6061    0.6107       132
        prod     0.6721    0.6721    0.6721        61
        time     0.5781    0.6852    0.6271        54

   micro avg     0.7951    0.8270    0.8107      1642
   macro avg     0.6999    0.7329    0.7154      1642
weighted avg     0.7950    0.8270    0.8106      1642

2023-10-17 20:05:29,933 ----------------------------------------------------------------------------------------------------
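The lr column above is produced by the LinearScheduler plugin (warmup_fraction: '0.1'): the learning rate ramps linearly from 0 to the 5e-05 peak over the first 10% of the 7380 total steps (738 iterations per epoch x 10 epochs, i.e. roughly epoch 1), then decays linearly to 0. A minimal sketch of that schedule, assuming exact linear interpolation (this is not Flair's implementation; the function name and rounding are illustrative):

```python
def linear_schedule_lr(step: int, peak_lr: float = 5e-05,
                       total_steps: int = 7380,
                       warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)  # 738 steps = epoch 1 here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# e.g. linear_schedule_lr(73) is about 5e-06, matching the first logged lr value
```

Evaluated at the logged iteration counts (step 73 of epoch 1, end of epoch 5, end of training), this reproduces the lr values in the log to the six decimal places printed.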
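The f1-score columns in the final per-class table are the harmonic mean of the precision and recall columns, and the micro-averaged F-score is the same formula applied to the micro-averaged precision/recall. A quick sanity check in plain Python (not part of the training script):

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Micro-averaged precision/recall from the final evaluation table
# reproduce the reported "F-score (micro) 0.8107".
micro_f1 = f1(0.7951, 0.8270)

# Per-class rows check out the same way, e.g. loc: 0.8549 / 0.8928 -> 0.8734.
loc_f1 = f1(0.8549, 0.8928)
```

Note that the macro F-score (0.7154) is the unweighted mean of the per-class f1 values computed from unrounded counts, so it cannot be reproduced exactly from the four-decimal figures printed here.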