2022-01-16 18:38:17,520 ----------------------------------------------------------------------------------------------------
2022-01-16 18:38:17,523 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): RobertaModel(
      (embeddings): RobertaEmbeddings(
        (word_embeddings): Embedding(32768, 768, padding_idx=1)
        (position_embeddings): Embedding(514, 768, padding_idx=1)
        (token_type_embeddings): Embedding(1, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): RobertaEncoder(
        (layer): ModuleList(
          (0-11): 12 x RobertaLayer(
            (attention): RobertaAttention(
              (self): RobertaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): RobertaSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): RobertaIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
            )
            (output): RobertaOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): RobertaPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (word_dropout): WordDropout(p=0.05)
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=51, bias=True)
  (beta): 1.0
  (weights): None
  (weight_tensor) None
)"
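The printed architecture is a Flair SequenceTagger that projects 768-dimensional word embeddings from a 12-layer RoBERTa encoder (32768-token vocabulary) directly into a 51-tag linear output layer, with no CRF and no RNN. A minimal sketch of how such a model might be assembled with the Flair API of that period (circa 0.10) is below; the checkpoint name, corpus path, and column layout are placeholders, since the log does not name them:

    # Minimal sketch: checkpoint and corpus paths are hypothetical,
    # only the resulting architecture is confirmed by the log above.
    from flair.datasets import ColumnCorpus
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger

    # Hypothetical column-format corpus: token in column 0, POS tag in column 1.
    corpus = ColumnCorpus("data/", {0: "text", 1: "pos"})
    tag_dictionary = corpus.make_tag_dictionary(tag_type="pos")  # 51 tags per the log

    # Placeholder name; the log only shows a 12-layer RoBERTa with vocab 32768.
    embeddings = TransformerWordEmbeddings("roberta-base")

    tagger = SequenceTagger(
        hidden_size=256,                 # required argument; unused with use_rnn=False
        embeddings=embeddings,
        tag_dictionary=tag_dictionary,
        tag_type="pos",
        use_crf=False,                   # the dump shows a plain Linear output head
        use_rnn=False,                   # no LSTM module appears in the dump
    )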
2022-01-16 18:38:17,526 ----------------------------------------------------------------------------------------------------
2022-01-16 18:38:17,526 Corpus: "Corpus: 5642 train + 195 dev + 649 test sentences"
2022-01-16 18:38:17,526 ----------------------------------------------------------------------------------------------------
2022-01-16 18:38:17,527 Parameters:
2022-01-16 18:38:17,527 - learning_rate: "5e-06"
2022-01-16 18:38:17,527 - mini_batch_size: "32"
2022-01-16 18:38:17,527 - patience: "3"
2022-01-16 18:38:17,528 - anneal_factor: "0.5"
2022-01-16 18:38:17,528 - max_epochs: "10"
2022-01-16 18:38:17,528 - shuffle: "True"
2022-01-16 18:38:17,528 - train_with_dev: "False"
2022-01-16 18:38:17,529 - batch_growth_annealing: "False"
2022-01-16 18:38:17,529 ----------------------------------------------------------------------------------------------------
2022-01-16 18:38:17,529 Model training base path: "resources/taggers/pos-transformer"
2022-01-16 18:38:17,530 ----------------------------------------------------------------------------------------------------
2022-01-16 18:38:17,530 Device: cuda:0
2022-01-16 18:38:17,530 ----------------------------------------------------------------------------------------------------
2022-01-16 18:38:17,530 Embeddings storage mode: none
2022-01-16 18:38:17,534 ----------------------------------------------------------------------------------------------------
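The parameter block above maps directly onto a Flair ModelTrainer call. The sketch below would reproduce these settings; note that the learning rate in the per-iteration log ramps up from 0 to 5e-6 during epoch 1 and then decays to 0, which suggests a one-cycle scheduler with warm-up rather than the patience/anneal_factor plateau annealing, so the optimizer and scheduler choices here are assumptions inferred from that curve:

    from torch.optim import AdamW                      # assumption: typical for transformer fine-tuning
    from torch.optim.lr_scheduler import OneCycleLR    # assumption, inferred from the lr curve
    from flair.trainers import ModelTrainer

    # tagger and corpus come from the previous sketch
    trainer = ModelTrainer(tagger, corpus, optimizer=AdamW)
    trainer.train(
        "resources/taggers/pos-transformer",  # base path from the log
        learning_rate=5e-6,
        mini_batch_size=32,
        max_epochs=10,
        patience=3,
        anneal_factor=0.5,
        shuffle=True,
        train_with_dev=False,
        scheduler=OneCycleLR,
        embeddings_storage_mode="none",
    )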
2022-01-16 18:38:34,359 epoch 1 - iter 17/177 - loss 4.21719545 - samples/sec: 32.34 - lr: 0.000000
2022-01-16 18:38:49,400 epoch 1 - iter 34/177 - loss 4.19345430 - samples/sec: 36.17 - lr: 0.000001
2022-01-16 18:39:05,256 epoch 1 - iter 51/177 - loss 4.15633603 - samples/sec: 34.31 - lr: 0.000001
2022-01-16 18:39:19,936 epoch 1 - iter 68/177 - loss 4.11811385 - samples/sec: 37.07 - lr: 0.000002
2022-01-16 18:39:35,631 epoch 1 - iter 85/177 - loss 4.06705216 - samples/sec: 34.68 - lr: 0.000002
2022-01-16 18:39:49,539 epoch 1 - iter 102/177 - loss 4.01162833 - samples/sec: 39.12 - lr: 0.000003
2022-01-16 18:40:04,517 epoch 1 - iter 119/177 - loss 3.95117440 - samples/sec: 36.33 - lr: 0.000003
2022-01-16 18:40:18,637 epoch 1 - iter 136/177 - loss 3.88391044 - samples/sec: 38.53 - lr: 0.000004
2022-01-16 18:40:34,602 epoch 1 - iter 153/177 - loss 3.78662706 - samples/sec: 34.08 - lr: 0.000004
2022-01-16 18:40:50,297 epoch 1 - iter 170/177 - loss 3.66565316 - samples/sec: 34.67 - lr: 0.000005
2022-01-16 18:40:55,405 ----------------------------------------------------------------------------------------------------
2022-01-16 18:40:55,406 EPOCH 1 done: loss 3.6331 - lr 0.0000050
2022-01-16 18:41:01,071 DEV : loss 2.0775277614593506 - f1-score (micro avg) 0.5698
2022-01-16 18:41:01,073 BAD EPOCHS (no improvement): 4
2022-01-16 18:41:01,075 ----------------------------------------------------------------------------------------------------
2022-01-16 18:41:14,873 epoch 2 - iter 17/177 - loss 2.20805337 - samples/sec: 39.44 - lr: 0.000005
2022-01-16 18:41:29,867 epoch 2 - iter 34/177 - loss 1.96658974 - samples/sec: 36.29 - lr: 0.000005
2022-01-16 18:41:45,607 epoch 2 - iter 51/177 - loss 1.75508128 - samples/sec: 34.57 - lr: 0.000005
2022-01-16 18:42:01,386 epoch 2 - iter 68/177 - loss 1.58575541 - samples/sec: 34.48 - lr: 0.000005
2022-01-16 18:42:16,804 epoch 2 - iter 85/177 - loss 1.45429547 - samples/sec: 35.29 - lr: 0.000005
2022-01-16 18:42:32,178 epoch 2 - iter 102/177 - loss 1.34526502 - samples/sec: 35.39 - lr: 0.000005
2022-01-16 18:42:48,735 epoch 2 - iter 119/177 - loss 1.23724431 - samples/sec: 32.86 - lr: 0.000005
2022-01-16 18:43:03,310 epoch 2 - iter 136/177 - loss 1.16223838 - samples/sec: 37.33 - lr: 0.000005
2022-01-16 18:43:18,304 epoch 2 - iter 153/177 - loss 1.09870495 - samples/sec: 36.29 - lr: 0.000005
2022-01-16 18:43:34,956 epoch 2 - iter 170/177 - loss 1.03855466 - samples/sec: 32.67 - lr: 0.000004
2022-01-16 18:43:40,722 ----------------------------------------------------------------------------------------------------
2022-01-16 18:43:40,723 EPOCH 2 done: loss 1.0198 - lr 0.0000044
2022-01-16 18:43:46,405 DEV : loss 0.23464356362819672 - f1-score (micro avg) 0.9443
2022-01-16 18:43:46,407 BAD EPOCHS (no improvement): 4
2022-01-16 18:43:46,408 ----------------------------------------------------------------------------------------------------
2022-01-16 18:44:01,387 epoch 3 - iter 17/177 - loss 0.46476740 - samples/sec: 36.33 - lr: 0.000004
2022-01-16 18:44:17,394 epoch 3 - iter 34/177 - loss 0.46233323 - samples/sec: 33.99 - lr: 0.000004
2022-01-16 18:44:32,304 epoch 3 - iter 51/177 - loss 0.45235428 - samples/sec: 36.49 - lr: 0.000004
2022-01-16 18:44:46,826 epoch 3 - iter 68/177 - loss 0.44547326 - samples/sec: 37.47 - lr: 0.000004
2022-01-16 18:45:03,857 epoch 3 - iter 85/177 - loss 0.43503033 - samples/sec: 31.95 - lr: 0.000004
2022-01-16 18:45:20,043 epoch 3 - iter 102/177 - loss 0.42734805 - samples/sec: 33.63 - lr: 0.000004
2022-01-16 18:45:36,060 epoch 3 - iter 119/177 - loss 0.42237100 - samples/sec: 33.97 - lr: 0.000004
2022-01-16 18:45:51,576 epoch 3 - iter 136/177 - loss 0.41700412 - samples/sec: 35.07 - lr: 0.000004
2022-01-16 18:46:07,252 epoch 3 - iter 153/177 - loss 0.41455352 - samples/sec: 34.71 - lr: 0.000004
2022-01-16 18:46:23,597 epoch 3 - iter 170/177 - loss 0.41134424 - samples/sec: 33.29 - lr: 0.000004
2022-01-16 18:46:29,222 ----------------------------------------------------------------------------------------------------
2022-01-16 18:46:29,223 EPOCH 3 done: loss 0.4103 - lr 0.0000039
2022-01-16 18:46:34,899 DEV : loss 0.140821173787117 - f1-score (micro avg) 0.9632
2022-01-16 18:46:34,901 BAD EPOCHS (no improvement): 4
2022-01-16 18:46:34,902 ----------------------------------------------------------------------------------------------------
2022-01-16 18:46:49,649 epoch 4 - iter 17/177 - loss 0.34770276 - samples/sec: 36.90 - lr: 0.000004
2022-01-16 18:47:05,137 epoch 4 - iter 34/177 - loss 0.34449519 - samples/sec: 35.13 - lr: 0.000004
2022-01-16 18:47:20,666 epoch 4 - iter 51/177 - loss 0.35038471 - samples/sec: 35.04 - lr: 0.000004
2022-01-16 18:47:35,593 epoch 4 - iter 68/177 - loss 0.34965167 - samples/sec: 36.45 - lr: 0.000004
2022-01-16 18:47:51,537 epoch 4 - iter 85/177 - loss 0.35074386 - samples/sec: 34.13 - lr: 0.000004
2022-01-16 18:48:06,575 epoch 4 - iter 102/177 - loss 0.34919573 - samples/sec: 36.18 - lr: 0.000004
2022-01-16 18:48:22,671 epoch 4 - iter 119/177 - loss 0.34906482 - samples/sec: 33.80 - lr: 0.000004
2022-01-16 18:48:38,152 epoch 4 - iter 136/177 - loss 0.34645574 - samples/sec: 35.15 - lr: 0.000003
2022-01-16 18:48:53,425 epoch 4 - iter 153/177 - loss 0.34515747 - samples/sec: 35.63 - lr: 0.000003
2022-01-16 18:49:08,614 epoch 4 - iter 170/177 - loss 0.34411478 - samples/sec: 35.82 - lr: 0.000003
2022-01-16 18:49:14,556 ----------------------------------------------------------------------------------------------------
2022-01-16 18:49:14,557 EPOCH 4 done: loss 0.3430 - lr 0.0000033
2022-01-16 18:49:20,294 DEV : loss 0.11640190333127975 - f1-score (micro avg) 0.9703
2022-01-16 18:49:20,297 BAD EPOCHS (no improvement): 4
2022-01-16 18:49:20,297 ----------------------------------------------------------------------------------------------------
2022-01-16 18:49:36,057 epoch 5 - iter 17/177 - loss 0.31027747 - samples/sec: 34.53 - lr: 0.000003
2022-01-16 18:49:51,823 epoch 5 - iter 34/177 - loss 0.31176440 - samples/sec: 34.51 - lr: 0.000003
2022-01-16 18:50:06,630 epoch 5 - iter 51/177 - loss 0.31452075 - samples/sec: 36.75 - lr: 0.000003
2022-01-16 18:50:22,294 epoch 5 - iter 68/177 - loss 0.31209996 - samples/sec: 34.73 - lr: 0.000003
2022-01-16 18:50:36,301 epoch 5 - iter 85/177 - loss 0.31357991 - samples/sec: 38.85 - lr: 0.000003
2022-01-16 18:50:52,962 epoch 5 - iter 102/177 - loss 0.31496866 - samples/sec: 32.66 - lr: 0.000003
2022-01-16 18:51:08,260 epoch 5 - iter 119/177 - loss 0.31294977 - samples/sec: 35.57 - lr: 0.000003
2022-01-16 18:51:24,158 epoch 5 - iter 136/177 - loss 0.31189665 - samples/sec: 34.22 - lr: 0.000003
2022-01-16 18:51:39,145 epoch 5 - iter 153/177 - loss 0.31138881 - samples/sec: 36.31 - lr: 0.000003
2022-01-16 18:51:54,700 epoch 5 - iter 170/177 - loss 0.30960234 - samples/sec: 34.98 - lr: 0.000003
2022-01-16 18:51:59,742 ----------------------------------------------------------------------------------------------------
2022-01-16 18:51:59,743 EPOCH 5 done: loss 0.3098 - lr 0.0000028
2022-01-16 18:52:05,466 DEV : loss 0.10135460644960403 - f1-score (micro avg) 0.9729
2022-01-16 18:52:05,468 BAD EPOCHS (no improvement): 4
2022-01-16 18:52:05,469 ----------------------------------------------------------------------------------------------------
2022-01-16 18:52:20,458 epoch 6 - iter 17/177 - loss 0.30154787 - samples/sec: 36.30 - lr: 0.000003
2022-01-16 18:52:34,917 epoch 6 - iter 34/177 - loss 0.30197436 - samples/sec: 37.63 - lr: 0.000003
2022-01-16 18:52:49,618 epoch 6 - iter 51/177 - loss 0.30167136 - samples/sec: 37.01 - lr: 0.000003
2022-01-16 18:53:04,988 epoch 6 - iter 68/177 - loss 0.30196611 - samples/sec: 35.40 - lr: 0.000003
2022-01-16 18:53:20,297 epoch 6 - iter 85/177 - loss 0.30182940 - samples/sec: 35.54 - lr: 0.000003
2022-01-16 18:53:35,734 epoch 6 - iter 102/177 - loss 0.30003109 - samples/sec: 35.25 - lr: 0.000002
2022-01-16 18:53:51,701 epoch 6 - iter 119/177 - loss 0.30091205 - samples/sec: 34.08 - lr: 0.000002
2022-01-16 18:54:06,831 epoch 6 - iter 136/177 - loss 0.30099483 - samples/sec: 35.96 - lr: 0.000002
2022-01-16 18:54:22,486 epoch 6 - iter 153/177 - loss 0.29848715 - samples/sec: 34.76 - lr: 0.000002
2022-01-16 18:54:37,203 epoch 6 - iter 170/177 - loss 0.29689481 - samples/sec: 36.97 - lr: 0.000002
2022-01-16 18:54:44,337 ----------------------------------------------------------------------------------------------------
2022-01-16 18:54:44,338 EPOCH 6 done: loss 0.2966 - lr 0.0000022
2022-01-16 18:54:49,620 DEV : loss 0.09480294585227966 - f1-score (micro avg) 0.974
2022-01-16 18:54:49,623 BAD EPOCHS (no improvement): 4
2022-01-16 18:54:49,623 ----------------------------------------------------------------------------------------------------
2022-01-16 18:55:05,515 epoch 7 - iter 17/177 - loss 0.28239213 - samples/sec: 34.24 - lr: 0.000002
2022-01-16 18:55:20,295 epoch 7 - iter 34/177 - loss 0.28557506 - samples/sec: 36.81 - lr: 0.000002
2022-01-16 18:55:35,660 epoch 7 - iter 51/177 - loss 0.28541785 - samples/sec: 35.41 - lr: 0.000002
2022-01-16 18:55:51,758 epoch 7 - iter 68/177 - loss 0.29320767 - samples/sec: 33.80 - lr: 0.000002
2022-01-16 18:56:06,783 epoch 7 - iter 85/177 - loss 0.29339894 - samples/sec: 36.21 - lr: 0.000002
2022-01-16 18:56:22,815 epoch 7 - iter 102/177 - loss 0.29253486 - samples/sec: 33.94 - lr: 0.000002
2022-01-16 18:56:39,028 epoch 7 - iter 119/177 - loss 0.29145637 - samples/sec: 33.56 - lr: 0.000002
2022-01-16 18:56:54,361 epoch 7 - iter 136/177 - loss 0.29111952 - samples/sec: 35.49 - lr: 0.000002
2022-01-16 18:57:09,548 epoch 7 - iter 153/177 - loss 0.29113036 - samples/sec: 35.83 - lr: 0.000002
2022-01-16 18:57:23,584 epoch 7 - iter 170/177 - loss 0.29066532 - samples/sec: 38.76 - lr: 0.000002
2022-01-16 18:57:29,584 ----------------------------------------------------------------------------------------------------
2022-01-16 18:57:29,585 EPOCH 7 done: loss 0.2896 - lr 0.0000017
2022-01-16 18:57:34,894 DEV : loss 0.09033482521772385 - f1-score (micro avg) 0.9743
2022-01-16 18:57:34,896 BAD EPOCHS (no improvement): 4
2022-01-16 18:57:34,898 ----------------------------------------------------------------------------------------------------
2022-01-16 18:57:50,623 epoch 8 - iter 17/177 - loss 0.28329047 - samples/sec: 34.60 - lr: 0.000002
2022-01-16 18:58:06,213 epoch 8 - iter 34/177 - loss 0.28096448 - samples/sec: 34.90 - lr: 0.000002
2022-01-16 18:58:22,737 epoch 8 - iter 51/177 - loss 0.28201738 - samples/sec: 32.93 - lr: 0.000002
2022-01-16 18:58:37,507 epoch 8 - iter 68/177 - loss 0.28137267 - samples/sec: 36.84 - lr: 0.000001
2022-01-16 18:58:52,962 epoch 8 - iter 85/177 - loss 0.28405564 - samples/sec: 35.21 - lr: 0.000001
2022-01-16 18:59:08,711 epoch 8 - iter 102/177 - loss 0.28496531 - samples/sec: 34.55 - lr: 0.000001
2022-01-16 18:59:23,238 epoch 8 - iter 119/177 - loss 0.28466528 - samples/sec: 37.46 - lr: 0.000001
2022-01-16 18:59:38,520 epoch 8 - iter 136/177 - loss 0.28246598 - samples/sec: 35.60 - lr: 0.000001
2022-01-16 18:59:53,789 epoch 8 - iter 153/177 - loss 0.28078088 - samples/sec: 35.63 - lr: 0.000001
2022-01-16 19:00:09,934 epoch 8 - iter 170/177 - loss 0.28075535 - samples/sec: 33.70 - lr: 0.000001
2022-01-16 19:00:15,100 ----------------------------------------------------------------------------------------------------
2022-01-16 19:00:15,101 EPOCH 8 done: loss 0.2814 - lr 0.0000011
2022-01-16 19:00:20,403 DEV : loss 0.08581043034791946 - f1-score (micro avg) 0.9745
2022-01-16 19:00:20,406 BAD EPOCHS (no improvement): 4
2022-01-16 19:00:20,406 ----------------------------------------------------------------------------------------------------
2022-01-16 19:00:36,469 epoch 9 - iter 17/177 - loss 0.27366042 - samples/sec: 33.87 - lr: 0.000001
2022-01-16 19:00:51,042 epoch 9 - iter 34/177 - loss 0.27417563 - samples/sec: 37.34 - lr: 0.000001
2022-01-16 19:01:06,968 epoch 9 - iter 51/177 - loss 0.27908066 - samples/sec: 34.16 - lr: 0.000001
2022-01-16 19:01:21,551 epoch 9 - iter 68/177 - loss 0.27815091 - samples/sec: 37.31 - lr: 0.000001
2022-01-16 19:01:38,409 epoch 9 - iter 85/177 - loss 0.27855783 - samples/sec: 32.28 - lr: 0.000001
2022-01-16 19:01:53,547 epoch 9 - iter 102/177 - loss 0.28336618 - samples/sec: 35.94 - lr: 0.000001
2022-01-16 19:02:09,188 epoch 9 - iter 119/177 - loss 0.28196400 - samples/sec: 34.79 - lr: 0.000001
2022-01-16 19:02:25,112 epoch 9 - iter 136/177 - loss 0.28112997 - samples/sec: 34.17 - lr: 0.000001
2022-01-16 19:02:41,122 epoch 9 - iter 153/177 - loss 0.28271008 - samples/sec: 33.99 - lr: 0.000001
2022-01-16 19:02:57,003 epoch 9 - iter 170/177 - loss 0.28254205 - samples/sec: 34.26 - lr: 0.000001
2022-01-16 19:03:02,602 ----------------------------------------------------------------------------------------------------
2022-01-16 19:03:02,603 EPOCH 9 done: loss 0.2826 - lr 0.0000006
2022-01-16 19:03:08,344 DEV : loss 0.08502506464719772 - f1-score (micro avg) 0.974
2022-01-16 19:03:08,347 BAD EPOCHS (no improvement): 4
2022-01-16 19:03:08,348 ----------------------------------------------------------------------------------------------------
2022-01-16 19:03:22,683 epoch 10 - iter 17/177 - loss 0.29810598 - samples/sec: 37.96 - lr: 0.000001
2022-01-16 19:03:38,044 epoch 10 - iter 34/177 - loss 0.29633129 - samples/sec: 35.42 - lr: 0.000000
2022-01-16 19:03:54,399 epoch 10 - iter 51/177 - loss 0.28500408 - samples/sec: 33.27 - lr: 0.000000
2022-01-16 19:04:09,802 epoch 10 - iter 68/177 - loss 0.28305573 - samples/sec: 35.32 - lr: 0.000000
2022-01-16 19:04:25,641 epoch 10 - iter 85/177 - loss 0.28663575 - samples/sec: 34.35 - lr: 0.000000
2022-01-16 19:04:40,354 epoch 10 - iter 102/177 - loss 0.28653115 - samples/sec: 36.98 - lr: 0.000000
2022-01-16 19:04:56,702 epoch 10 - iter 119/177 - loss 0.28579694 - samples/sec: 33.28 - lr: 0.000000
2022-01-16 19:05:12,070 epoch 10 - iter 136/177 - loss 0.28590446 - samples/sec: 35.40 - lr: 0.000000
2022-01-16 19:05:27,377 epoch 10 - iter 153/177 - loss 0.28533742 - samples/sec: 35.55 - lr: 0.000000
2022-01-16 19:05:42,603 epoch 10 - iter 170/177 - loss 0.28333786 - samples/sec: 35.73 - lr: 0.000000
2022-01-16 19:05:48,443 ----------------------------------------------------------------------------------------------------
2022-01-16 19:05:48,444 EPOCH 10 done: loss 0.2832 - lr 0.0000000
2022-01-16 19:05:54,211 DEV : loss 0.08448906987905502 - f1-score (micro avg) 0.974
2022-01-16 19:05:54,214 BAD EPOCHS (no improvement): 4
2022-01-16 19:05:55,439 ----------------------------------------------------------------------------------------------------
2022-01-16 19:05:55,440 Testing using last state of model ...
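Since testing uses the last state, the trained tagger is the one Flair saved as final-model.pt under the base path. A minimal inference sketch (the example sentence is invented; any text in the training language would do):

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Load the final state written at the end of training.
    tagger = SequenceTagger.load("resources/taggers/pos-transformer/final-model.pt")

    sentence = Sentence("Li rois est venuz a la cort .")  # hypothetical example input
    tagger.predict(sentence)
    print(sentence.to_tagged_string())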
2022-01-16 19:06:15,179 0.9788	0.9788	0.9788	0.9788
2022-01-16 19:06:15,180 Results:
- F-score (micro) 0.9788
- F-score (macro) 0.7527
- Accuracy 0.9788

By class:
              precision    recall  f1-score   support

      NOMcom     0.9850    0.9840    0.9845      2130
      VERcjg     0.9974    0.9954    0.9964      1535
      PROper     0.9912    0.9920    0.9916      1368
      PONfbl     1.0000    0.9993    0.9996      1341
         PRE     0.9881    0.9955    0.9918      1331
      ADVgen     0.9713    0.9263    0.9483       841
      PONfrt     0.9895    1.0000    0.9947       662
      DETdef     0.9983    0.9983    0.9983       606
      ADJqua     0.9259    0.9500    0.9378       500
      VERinf     0.9920    1.0000    0.9960       497
      DETpos     1.0000    0.9957    0.9979       469
      CONcoo     0.9957    0.9935    0.9946       465
      CONsub     0.9337    0.9409    0.9373       389
      VERppe     0.9659    0.9720    0.9689       321
      ADVneg     0.9476    1.0000    0.9731       271
      PROrel     0.9194    0.9296    0.9245       270
      NOMpro     0.9634    0.9925    0.9777       265
      DETndf     0.9958    0.9715    0.9835       246
      PROind     0.9526    0.9628    0.9577       188
  PRE.DETdef     0.9785    0.9945    0.9864       183
      DETdem     1.0000    0.9806    0.9902       155
      PROdem     0.9675    1.0000    0.9835       119
      PROadv     0.9083    0.9820    0.9437       111
      DETind     0.9223    0.9694    0.9453        98
      VERppa     0.9683    0.9104    0.9385        67
      PROimp     0.8333    0.8333    0.8333        54
      DETcar     0.7381    1.0000    0.8493        31
         INJ     1.0000    0.8571    0.9231        35
      ADJind     0.9310    0.9000    0.9153        30
      PROint     0.6957    0.7273    0.7111        22
      ADJcar     0.8333    0.4762    0.6061        21
      PROcar     0.7333    0.6111    0.6667        18
      PONpga     1.0000    1.0000    1.0000        16
      PROpos     0.9231    0.8571    0.8889        14
      DETrel     0.6364    0.4375    0.5185        16
      DETint     0.4706    0.8000    0.5926        10
      PONpdr     1.0000    1.0000    1.0000        13
      ADJord     0.8889    0.5000    0.6400        16
      ADVint     1.0000    0.8000    0.8889         5
      PONpxx     0.0000    0.0000    0.0000         6
  PRE.PROrel     0.0000    0.0000    0.0000         2
       latin     0.0000    0.0000    0.0000         2
      PROord     0.0000    0.0000    0.0000         1
  PRE.PROdem     0.0000    0.0000    0.0000         1
  PRE.NOMcom     0.0000    0.0000    0.0000         1
         ETR     0.0000    0.0000    0.0000         1
      ADVsub     0.0000    0.0000    0.0000         1

   micro avg     0.9788    0.9788    0.9788     14744
   macro avg     0.7647    0.7497    0.7527     14744
weighted avg     0.9781    0.9788    0.9782     14744
 samples avg     0.9788    0.9788    0.9788     14744

2022-01-16 19:06:15,180 ----------------------------------------------------------------------------------------------------
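The gap between the micro F-score (0.9788) and the macro F-score (0.7527) comes from the long tail of rare tags: classes such as PONpxx, PRE.PROrel, latin, and ADVsub have support of 6 or fewer and zero F1, and the macro average weights them equally with NOMcom (support 2130). A small illustrative sketch of the two averaging modes (not the evaluation code used above):

    from sklearn.metrics import f1_score

    # 98 tokens of a frequent tag, 2 of a rare tag the model always misses.
    y_true = ["NOMcom"] * 98 + ["latin"] * 2
    y_pred = ["NOMcom"] * 100

    print(f1_score(y_true, y_pred, average="micro"))  # 0.98: per-token aggregate
    print(f1_score(y_true, y_pred, average="macro"))  # ~0.49: unweighted per-class mean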