|
2023-10-13 10:08:42,076 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:08:42,079 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
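The layer shapes in the dump above are enough to estimate the model size. The following is a minimal sketch that counts parameters from those printed shapes only. It assumes that `shared` and `embed_tokens` are one tied embedding matrix and that each `FusedRMSNorm` holds a single weight vector with no bias; neither is stated in the log. The constant and function names are illustrative.

```python
# Parameter count derived only from the layer shapes printed in the model dump.
D_MODEL, D_INNER, D_FF = 1472, 384, 3584       # hidden size, q/k/v width, FF width
VOCAB, REL_BUCKETS, N_HEADS = 384, 32, 6       # Embedding(384, 1472), Embedding(32, 6)
N_BLOCKS, N_TAGS = 12, 13                      # blocks 0..11, linear head out_features


def block_params() -> int:
    """Parameters of one T5Block, excluding the relative attention bias."""
    attention = 3 * D_MODEL * D_INNER + D_INNER * D_MODEL   # q, k, v, o (bias=False)
    feed_forward = 2 * D_MODEL * D_FF + D_FF * D_MODEL       # wi_0, wi_1, wo (bias=False)
    layer_norms = 2 * D_MODEL                                # assumed: weight only
    return attention + feed_forward + layer_norms


def encoder_params() -> int:
    """Parameters of the whole encoder stack."""
    embedding = VOCAB * D_MODEL        # assumed: shared and embed_tokens are tied
    rel_bias = REL_BUCKETS * N_HEADS   # printed for block 0 only
    final_norm = D_MODEL
    return embedding + N_BLOCKS * block_params() + rel_bias + final_norm


def head_params() -> int:
    """Parameters of the tagging head: Linear(1472 -> 13) with bias."""
    return D_MODEL * N_TAGS + N_TAGS


if __name__ == "__main__":
    print(f"per block  : {block_params():,}")
    print(f"encoder    : {encoder_params():,}")
    print(f"tagger head: {head_params():,}")
    print(f"total      : {encoder_params() + head_params():,}")
```

Under these assumptions the encoder accounts for nearly all of the parameters; the 13-way tagging head is negligible by comparison.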
|
2023-10-13 10:08:42,079 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:08:42,079 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences |
|
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator |
|
2023-10-13 10:08:42,079 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:08:42,079 Train: 14465 sentences |
|
2023-10-13 10:08:42,079 (train_with_dev=False, train_with_test=False) |
|
2023-10-13 10:08:42,079 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:08:42,079 Training Params: |
|
2023-10-13 10:08:42,079 - learning_rate: "0.00015" |
|
2023-10-13 10:08:42,080 - mini_batch_size: "8" |
|
2023-10-13 10:08:42,080 - max_epochs: "10" |
|
2023-10-13 10:08:42,080 - shuffle: "True" |
|
2023-10-13 10:08:42,080 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:08:42,080 Plugins: |
|
2023-10-13 10:08:42,080 - TensorboardLogger |
|
2023-10-13 10:08:42,080 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-13 10:08:42,080 ---------------------------------------------------------------------------------------------------- |
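The `lr` column in the per-iteration lines below is consistent with a linear warmup over the first 10% of all steps followed by a linear decay to zero. The sketch below reproduces that shape from the logged parameters (14465 training sentences, batch size 8, 10 epochs, peak rate 0.00015, warmup fraction 0.1). It is an approximation written for illustration, not the scheduler's actual source; the function names and the use of `int()` to derive the warmup step count are assumptions.

```python
import math


def steps_per_epoch(num_sentences: int, batch_size: int) -> int:
    """Number of mini-batches per epoch, counting the final partial batch."""
    return math.ceil(num_sentences / batch_size)


def linear_schedule_lr(step: int, total_steps: int, peak_lr: float,
                       warmup_fraction: float) -> float:
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    warmup_steps = int(total_steps * warmup_fraction)  # assumed rounding rule
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    per_epoch = steps_per_epoch(14465, 8)
    total = per_epoch * 10
    print(f"iterations per epoch: {per_epoch}")
    # Compare against the logged values at a few (epoch, iter) positions.
    for epoch, it in [(1, 180), (1, 1800), (2, 180), (10, 1800)]:
        step = (epoch - 1) * per_epoch + it
        lr = linear_schedule_lr(step, total, 0.00015, 0.1)
        print(f"epoch {epoch} iter {it}: lr {lr:.6f}")
```

With these inputs the warmup spans exactly the first epoch, which is why the logged rate climbs through epoch 1 and falls from epoch 2 onward.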
|
2023-10-13 10:08:42,080 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-13 10:08:42,080 - metric: "('micro avg', 'f1-score')" |
|
2023-10-13 10:08:42,080 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:08:42,080 Computation: |
|
2023-10-13 10:08:42,080 - compute on device: cuda:0 |
|
2023-10-13 10:08:42,080 - embedding storage: none |
|
2023-10-13 10:08:42,080 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:08:42,081 Model training base path: "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-13 10:08:42,081 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:08:42,081 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:08:42,081 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-13 10:10:16,670 epoch 1 - iter 180/1809 - loss 2.55193452 - time (sec): 94.59 - samples/sec: 393.52 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 10:11:50,746 epoch 1 - iter 360/1809 - loss 2.32664425 - time (sec): 188.66 - samples/sec: 395.41 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 10:13:30,449 epoch 1 - iter 540/1809 - loss 1.97458754 - time (sec): 288.37 - samples/sec: 391.31 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 10:15:09,687 epoch 1 - iter 720/1809 - loss 1.63149756 - time (sec): 387.60 - samples/sec: 388.96 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-13 10:16:47,097 epoch 1 - iter 900/1809 - loss 1.36469605 - time (sec): 485.01 - samples/sec: 389.16 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-13 10:18:23,052 epoch 1 - iter 1080/1809 - loss 1.17974807 - time (sec): 580.97 - samples/sec: 389.57 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-13 10:19:58,384 epoch 1 - iter 1260/1809 - loss 1.03904632 - time (sec): 676.30 - samples/sec: 390.55 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-13 10:21:34,085 epoch 1 - iter 1440/1809 - loss 0.92828376 - time (sec): 772.00 - samples/sec: 391.23 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-13 10:23:11,299 epoch 1 - iter 1620/1809 - loss 0.84195771 - time (sec): 869.22 - samples/sec: 391.84 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-13 10:24:46,248 epoch 1 - iter 1800/1809 - loss 0.77324166 - time (sec): 964.17 - samples/sec: 392.14 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-13 10:24:50,774 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:24:50,774 EPOCH 1 done: loss 0.7705 - lr: 0.000149 |
|
2023-10-13 10:25:30,426 DEV : loss 0.14874930679798126 - f1-score (micro avg) 0.4029 |
|
2023-10-13 10:25:30,486 saving best model |
|
2023-10-13 10:25:31,350 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:27:05,720 epoch 2 - iter 180/1809 - loss 0.13805801 - time (sec): 94.37 - samples/sec: 388.33 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-13 10:28:42,834 epoch 2 - iter 360/1809 - loss 0.12933060 - time (sec): 191.48 - samples/sec: 389.43 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-13 10:30:19,877 epoch 2 - iter 540/1809 - loss 0.12513132 - time (sec): 288.52 - samples/sec: 390.84 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-13 10:31:57,990 epoch 2 - iter 720/1809 - loss 0.12053367 - time (sec): 386.64 - samples/sec: 389.16 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-13 10:33:35,527 epoch 2 - iter 900/1809 - loss 0.11629545 - time (sec): 484.17 - samples/sec: 391.21 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-13 10:35:08,984 epoch 2 - iter 1080/1809 - loss 0.11333380 - time (sec): 577.63 - samples/sec: 391.35 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-13 10:36:43,056 epoch 2 - iter 1260/1809 - loss 0.10988685 - time (sec): 671.70 - samples/sec: 391.95 - lr: 0.000138 - momentum: 0.000000 |
|
2023-10-13 10:38:22,782 epoch 2 - iter 1440/1809 - loss 0.10663465 - time (sec): 771.43 - samples/sec: 392.15 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-13 10:40:03,308 epoch 2 - iter 1620/1809 - loss 0.10350687 - time (sec): 871.95 - samples/sec: 390.60 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-13 10:41:38,900 epoch 2 - iter 1800/1809 - loss 0.10224515 - time (sec): 967.55 - samples/sec: 390.97 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-13 10:41:43,136 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:41:43,137 EPOCH 2 done: loss 0.1022 - lr: 0.000133 |
|
2023-10-13 10:42:24,954 DEV : loss 0.09910175204277039 - f1-score (micro avg) 0.5719 |
|
2023-10-13 10:42:25,015 saving best model |
|
2023-10-13 10:42:27,591 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:44:04,602 epoch 3 - iter 180/1809 - loss 0.06155697 - time (sec): 97.01 - samples/sec: 403.70 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-13 10:45:38,856 epoch 3 - iter 360/1809 - loss 0.06070254 - time (sec): 191.26 - samples/sec: 395.73 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-13 10:47:14,475 epoch 3 - iter 540/1809 - loss 0.06120311 - time (sec): 286.88 - samples/sec: 394.26 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-13 10:48:53,782 epoch 3 - iter 720/1809 - loss 0.06292844 - time (sec): 386.19 - samples/sec: 390.36 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-13 10:50:32,623 epoch 3 - iter 900/1809 - loss 0.06307044 - time (sec): 485.03 - samples/sec: 389.58 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-13 10:52:10,549 epoch 3 - iter 1080/1809 - loss 0.06402904 - time (sec): 582.95 - samples/sec: 386.68 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-13 10:53:49,156 epoch 3 - iter 1260/1809 - loss 0.06377227 - time (sec): 681.56 - samples/sec: 389.15 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-13 10:55:24,344 epoch 3 - iter 1440/1809 - loss 0.06426367 - time (sec): 776.75 - samples/sec: 388.15 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-13 10:57:00,411 epoch 3 - iter 1620/1809 - loss 0.06362259 - time (sec): 872.82 - samples/sec: 389.28 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-13 10:58:38,665 epoch 3 - iter 1800/1809 - loss 0.06327889 - time (sec): 971.07 - samples/sec: 389.10 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-13 10:58:43,368 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 10:58:43,368 EPOCH 3 done: loss 0.0632 - lr: 0.000117 |
|
2023-10-13 10:59:24,367 DEV : loss 0.11729110032320023 - f1-score (micro avg) 0.6357 |
|
2023-10-13 10:59:24,427 saving best model |
|
2023-10-13 10:59:26,988 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:01:01,873 epoch 4 - iter 180/1809 - loss 0.03980255 - time (sec): 94.88 - samples/sec: 390.73 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-13 11:02:41,954 epoch 4 - iter 360/1809 - loss 0.04218853 - time (sec): 194.96 - samples/sec: 389.89 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-13 11:04:22,545 epoch 4 - iter 540/1809 - loss 0.04514616 - time (sec): 295.55 - samples/sec: 382.99 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-13 11:06:00,574 epoch 4 - iter 720/1809 - loss 0.04552944 - time (sec): 393.58 - samples/sec: 382.84 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-13 11:07:37,214 epoch 4 - iter 900/1809 - loss 0.04691891 - time (sec): 490.22 - samples/sec: 384.07 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-13 11:09:17,007 epoch 4 - iter 1080/1809 - loss 0.04597866 - time (sec): 590.01 - samples/sec: 382.76 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-13 11:10:57,169 epoch 4 - iter 1260/1809 - loss 0.04513610 - time (sec): 690.18 - samples/sec: 381.27 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-13 11:12:40,828 epoch 4 - iter 1440/1809 - loss 0.04438682 - time (sec): 793.83 - samples/sec: 379.78 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-13 11:14:18,442 epoch 4 - iter 1620/1809 - loss 0.04429229 - time (sec): 891.45 - samples/sec: 381.80 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-13 11:16:00,362 epoch 4 - iter 1800/1809 - loss 0.04572766 - time (sec): 993.37 - samples/sec: 380.72 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-13 11:16:04,771 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:16:04,772 EPOCH 4 done: loss 0.0457 - lr: 0.000100 |
|
2023-10-13 11:16:44,751 DEV : loss 0.16882555186748505 - f1-score (micro avg) 0.6361 |
|
2023-10-13 11:16:44,823 saving best model |
|
2023-10-13 11:16:47,519 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:18:28,087 epoch 5 - iter 180/1809 - loss 0.02745004 - time (sec): 100.56 - samples/sec: 383.98 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-13 11:20:05,038 epoch 5 - iter 360/1809 - loss 0.02948707 - time (sec): 197.51 - samples/sec: 391.81 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-13 11:21:43,338 epoch 5 - iter 540/1809 - loss 0.02991506 - time (sec): 295.81 - samples/sec: 385.31 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-13 11:23:23,915 epoch 5 - iter 720/1809 - loss 0.03207680 - time (sec): 396.39 - samples/sec: 386.48 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-13 11:25:03,777 epoch 5 - iter 900/1809 - loss 0.03173525 - time (sec): 496.25 - samples/sec: 386.05 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-13 11:26:41,708 epoch 5 - iter 1080/1809 - loss 0.03291552 - time (sec): 594.18 - samples/sec: 383.07 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-13 11:28:20,067 epoch 5 - iter 1260/1809 - loss 0.03293994 - time (sec): 692.54 - samples/sec: 383.92 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-13 11:29:59,571 epoch 5 - iter 1440/1809 - loss 0.03244028 - time (sec): 792.05 - samples/sec: 383.50 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-13 11:31:39,767 epoch 5 - iter 1620/1809 - loss 0.03323533 - time (sec): 892.24 - samples/sec: 381.40 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-13 11:33:19,274 epoch 5 - iter 1800/1809 - loss 0.03363279 - time (sec): 991.75 - samples/sec: 381.47 - lr: 0.000083 - momentum: 0.000000 |
|
2023-10-13 11:33:23,698 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:33:23,698 EPOCH 5 done: loss 0.0337 - lr: 0.000083 |
|
2023-10-13 11:34:04,732 DEV : loss 0.22161424160003662 - f1-score (micro avg) 0.6488 |
|
2023-10-13 11:34:04,800 saving best model |
|
2023-10-13 11:34:07,393 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:35:48,090 epoch 6 - iter 180/1809 - loss 0.01989408 - time (sec): 100.69 - samples/sec: 377.17 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-13 11:37:24,788 epoch 6 - iter 360/1809 - loss 0.02145466 - time (sec): 197.39 - samples/sec: 380.27 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-13 11:39:00,908 epoch 6 - iter 540/1809 - loss 0.02242469 - time (sec): 293.51 - samples/sec: 381.09 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-13 11:40:34,463 epoch 6 - iter 720/1809 - loss 0.02387583 - time (sec): 387.06 - samples/sec: 388.26 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-13 11:42:09,699 epoch 6 - iter 900/1809 - loss 0.02427821 - time (sec): 482.30 - samples/sec: 388.84 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-13 11:43:42,056 epoch 6 - iter 1080/1809 - loss 0.02375360 - time (sec): 574.66 - samples/sec: 391.77 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-13 11:45:16,686 epoch 6 - iter 1260/1809 - loss 0.02379900 - time (sec): 669.29 - samples/sec: 393.07 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-13 11:46:50,964 epoch 6 - iter 1440/1809 - loss 0.02386266 - time (sec): 763.57 - samples/sec: 395.39 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-13 11:48:23,979 epoch 6 - iter 1620/1809 - loss 0.02406347 - time (sec): 856.58 - samples/sec: 396.80 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-13 11:49:57,153 epoch 6 - iter 1800/1809 - loss 0.02485654 - time (sec): 949.75 - samples/sec: 398.12 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-13 11:50:01,456 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:50:01,456 EPOCH 6 done: loss 0.0248 - lr: 0.000067 |
|
2023-10-13 11:50:42,664 DEV : loss 0.26427823305130005 - f1-score (micro avg) 0.6499 |
|
2023-10-13 11:50:42,729 saving best model |
|
2023-10-13 11:50:45,273 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 11:52:24,565 epoch 7 - iter 180/1809 - loss 0.01829401 - time (sec): 99.29 - samples/sec: 388.56 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-13 11:54:00,473 epoch 7 - iter 360/1809 - loss 0.01662527 - time (sec): 195.20 - samples/sec: 390.08 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-13 11:55:35,250 epoch 7 - iter 540/1809 - loss 0.01711330 - time (sec): 289.97 - samples/sec: 396.97 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-13 11:57:12,171 epoch 7 - iter 720/1809 - loss 0.01763868 - time (sec): 386.89 - samples/sec: 391.83 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-13 11:58:52,384 epoch 7 - iter 900/1809 - loss 0.01892285 - time (sec): 487.11 - samples/sec: 388.52 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-13 12:00:33,274 epoch 7 - iter 1080/1809 - loss 0.01907594 - time (sec): 588.00 - samples/sec: 389.53 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-13 12:02:10,151 epoch 7 - iter 1260/1809 - loss 0.01948090 - time (sec): 684.87 - samples/sec: 389.17 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-13 12:03:44,517 epoch 7 - iter 1440/1809 - loss 0.01959196 - time (sec): 779.24 - samples/sec: 389.11 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-13 12:05:18,459 epoch 7 - iter 1620/1809 - loss 0.01974823 - time (sec): 873.18 - samples/sec: 389.17 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-13 12:06:52,523 epoch 7 - iter 1800/1809 - loss 0.01891607 - time (sec): 967.24 - samples/sec: 390.92 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-13 12:06:56,738 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 12:06:56,738 EPOCH 7 done: loss 0.0190 - lr: 0.000050 |
|
2023-10-13 12:07:37,764 DEV : loss 0.3006477653980255 - f1-score (micro avg) 0.6484 |
|
2023-10-13 12:07:37,830 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 12:09:14,953 epoch 8 - iter 180/1809 - loss 0.01107605 - time (sec): 97.12 - samples/sec: 390.74 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 12:10:55,384 epoch 8 - iter 360/1809 - loss 0.01371757 - time (sec): 197.55 - samples/sec: 391.33 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-13 12:12:34,854 epoch 8 - iter 540/1809 - loss 0.01237565 - time (sec): 297.02 - samples/sec: 389.62 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 12:14:13,295 epoch 8 - iter 720/1809 - loss 0.01229570 - time (sec): 395.46 - samples/sec: 389.84 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-13 12:15:49,208 epoch 8 - iter 900/1809 - loss 0.01315215 - time (sec): 491.38 - samples/sec: 387.35 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-13 12:17:23,939 epoch 8 - iter 1080/1809 - loss 0.01311433 - time (sec): 586.11 - samples/sec: 391.13 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-13 12:18:59,994 epoch 8 - iter 1260/1809 - loss 0.01292896 - time (sec): 682.16 - samples/sec: 389.79 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-13 12:20:36,705 epoch 8 - iter 1440/1809 - loss 0.01275007 - time (sec): 778.87 - samples/sec: 389.16 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 12:22:13,261 epoch 8 - iter 1620/1809 - loss 0.01281415 - time (sec): 875.43 - samples/sec: 389.99 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-13 12:23:53,434 epoch 8 - iter 1800/1809 - loss 0.01361457 - time (sec): 975.60 - samples/sec: 387.96 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-13 12:23:57,584 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 12:23:57,584 EPOCH 8 done: loss 0.0136 - lr: 0.000033 |
|
2023-10-13 12:24:39,214 DEV : loss 0.3133712708950043 - f1-score (micro avg) 0.6443 |
|
2023-10-13 12:24:39,281 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 12:26:14,295 epoch 9 - iter 180/1809 - loss 0.00811211 - time (sec): 95.01 - samples/sec: 381.60 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 12:27:51,642 epoch 9 - iter 360/1809 - loss 0.01014665 - time (sec): 192.36 - samples/sec: 386.96 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 12:29:28,757 epoch 9 - iter 540/1809 - loss 0.01204379 - time (sec): 289.47 - samples/sec: 389.86 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 12:31:06,089 epoch 9 - iter 720/1809 - loss 0.01210666 - time (sec): 386.81 - samples/sec: 389.63 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 12:32:48,351 epoch 9 - iter 900/1809 - loss 0.01191859 - time (sec): 489.07 - samples/sec: 387.06 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 12:34:29,374 epoch 9 - iter 1080/1809 - loss 0.01103504 - time (sec): 590.09 - samples/sec: 385.32 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 12:36:06,450 epoch 9 - iter 1260/1809 - loss 0.01185981 - time (sec): 687.17 - samples/sec: 384.86 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 12:37:44,048 epoch 9 - iter 1440/1809 - loss 0.01164861 - time (sec): 784.76 - samples/sec: 383.08 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 12:39:21,038 epoch 9 - iter 1620/1809 - loss 0.01129939 - time (sec): 881.75 - samples/sec: 384.22 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 12:41:04,369 epoch 9 - iter 1800/1809 - loss 0.01127319 - time (sec): 985.09 - samples/sec: 383.82 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 12:41:09,282 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 12:41:09,283 EPOCH 9 done: loss 0.0112 - lr: 0.000017 |
|
2023-10-13 12:41:52,191 DEV : loss 0.33530837297439575 - f1-score (micro avg) 0.6476 |
|
2023-10-13 12:41:52,272 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 12:43:32,493 epoch 10 - iter 180/1809 - loss 0.00503398 - time (sec): 100.22 - samples/sec: 380.94 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 12:45:10,018 epoch 10 - iter 360/1809 - loss 0.00514473 - time (sec): 197.74 - samples/sec: 383.57 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 12:46:46,023 epoch 10 - iter 540/1809 - loss 0.00587847 - time (sec): 293.75 - samples/sec: 385.18 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 12:48:21,614 epoch 10 - iter 720/1809 - loss 0.00651019 - time (sec): 389.34 - samples/sec: 389.69 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 12:49:57,424 epoch 10 - iter 900/1809 - loss 0.00679671 - time (sec): 485.15 - samples/sec: 389.14 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 12:51:33,136 epoch 10 - iter 1080/1809 - loss 0.00712549 - time (sec): 580.86 - samples/sec: 389.66 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 12:53:07,935 epoch 10 - iter 1260/1809 - loss 0.00705147 - time (sec): 675.66 - samples/sec: 391.82 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 12:54:42,621 epoch 10 - iter 1440/1809 - loss 0.00701041 - time (sec): 770.35 - samples/sec: 394.49 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 12:56:16,729 epoch 10 - iter 1620/1809 - loss 0.00703049 - time (sec): 864.45 - samples/sec: 392.77 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 12:57:53,169 epoch 10 - iter 1800/1809 - loss 0.00712609 - time (sec): 960.89 - samples/sec: 393.85 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 12:57:57,308 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 12:57:57,308 EPOCH 10 done: loss 0.0071 - lr: 0.000000 |
|
2023-10-13 12:58:38,220 DEV : loss 0.3420470952987671 - f1-score (micro avg) 0.6541 |
|
2023-10-13 12:58:38,294 saving best model |
|
2023-10-13 12:58:45,503 ---------------------------------------------------------------------------------------------------- |
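The "saving best model" lines above appear after epochs 1 to 6 and again after epoch 10, which matches a rule that writes a checkpoint whenever the dev micro F1 strictly exceeds the best value seen so far. A minimal sketch of that rule, using the dev scores copied from the log, follows. The function name is hypothetical and the strict comparison is an assumption inferred from the log, not taken from the trainer's code.

```python
# Dev micro-average F1 per epoch, copied from the DEV lines of the log.
DEV_F1 = [0.4029, 0.5719, 0.6357, 0.6361, 0.6488,
          0.6499, 0.6484, 0.6443, 0.6476, 0.6541]


def checkpoint_epochs(scores):
    """Return the 1-based epochs at which a new best score is reached."""
    best = float("-inf")
    saved = []
    for epoch, score in enumerate(scores, start=1):
        if score > best:  # assumed: strict improvement triggers a save
            best = score
            saved.append(epoch)
    return saved


if __name__ == "__main__":
    print("checkpoint written after epochs:", checkpoint_epochs(DEV_F1))
```

Note that the selection is driven by dev F1 alone: the dev loss in the log rises from epoch 3 onward while F1 keeps improving slightly, so the final checkpoint comes from epoch 10.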
|
2023-10-13 12:58:45,506 Loading model from best epoch ... |
|
2023-10-13 12:58:51,051 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org |
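The 13 tags listed above follow the BIOES scheme: `S-` marks a single-token entity, `B-`, `I-` and `E-` mark the beginning, inside and end of a multi-token entity, and `O` marks tokens outside any entity. The sketch below shows one strict way to turn such a tag sequence into entity spans. It is an illustration only; the function name is hypothetical, and the library's own decoding may treat malformed sequences differently.

```python
def bioes_to_spans(tags):
    """Convert BIOES tags into (start, end_exclusive, label) spans.

    Strict decoding: a multi-token span is emitted only for a well-formed
    B-x, I-x*, E-x run; anything malformed is dropped.
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        prefix, _, kind = tag.partition("-")
        if prefix == "S":
            spans.append((i, i + 1, kind))
            start, label = None, None
        elif prefix == "B":
            start, label = i, kind
        elif prefix == "I" and label == kind:
            continue  # still inside the open span
        elif prefix == "E" and label == kind:
            spans.append((start, i + 1, kind))
            start, label = None, None
        else:
            # "O", or an I-/E- tag that does not continue the open span
            start, label = None, None
    return spans


if __name__ == "__main__":
    example = ["O", "S-loc", "B-pers", "I-pers", "E-pers", "O", "B-org", "E-org"]
    print(bioes_to_spans(example))
```

Span-level precision, recall and F1, as reported in the results below, are computed over entities of this kind rather than over individual tokens.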
|
2023-10-13 12:59:49,682 |
|
Results: |
|
- F-score (micro) 0.6291 |
|
- F-score (macro) 0.5022 |
|
- Accuracy 0.4688 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         loc     0.6234    0.7479    0.6800       591 |
|
        pers     0.5742    0.6611    0.6146       357 |
|
         org     0.2642    0.1772    0.2121        79 |
|
|
|
   micro avg     0.5899    0.6738    0.6291      1027 |
|
   macro avg     0.4873    0.5287    0.5022      1027 |
|
weighted avg     0.5787    0.6738    0.6213      1027 |
|
|
|
2023-10-13 12:59:49,682 ---------------------------------------------------------------------------------------------------- |
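The averages in the results table can be re-derived from the per-class rows. The sketch below recomputes F1 as the harmonic mean of precision and recall, the macro F1 as the unweighted mean of the per-class F1 values, and the weighted F1 using the support counts. All input numbers are copied from the table; because they are already rounded to four decimals, the recomputed values agree with the table only up to rounding. The names are illustrative.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, support) per class, copied from the results table.
BY_CLASS = {
    "loc": (0.6234, 0.7479, 591),
    "pers": (0.5742, 0.6611, 357),
    "org": (0.2642, 0.1772, 79),
}


def macro_f1(rows) -> float:
    """Unweighted mean of per-class F1."""
    scores = [f1(p, r) for p, r, _ in rows.values()]
    return sum(scores) / len(scores)


def weighted_f1(rows) -> float:
    """Per-class F1 weighted by support."""
    total = sum(n for _, _, n in rows.values())
    return sum(f1(p, r) * n for p, r, n in rows.values()) / total


if __name__ == "__main__":
    for name, (p, r, _) in BY_CLASS.items():
        print(f"{name:>5}: f1 {f1(p, r):.4f}")
    print(f"micro F1 from micro P/R: {f1(0.5899, 0.6738):.4f}")
    print(f"macro F1               : {macro_f1(BY_CLASS):.4f}")
    print(f"weighted F1            : {weighted_f1(BY_CLASS):.4f}")
```

The gap between the micro and macro scores comes mostly from the `org` class, which has low F1 but only 79 of the 1027 test entities.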
|
|