stefan-it committed on
Commit 8dbf01a
1 Parent(s): 64bc05b

Upload folder using huggingface_hub

best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dc350c9be504f46e30766a6db37d01f3c513a613173900c8288bbfd0409004d2
+ size 440966725
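The three added lines are a Git LFS pointer, not the checkpoint itself: the repository stores only the spec version, a SHA-256 object ID and the byte size. A minimal sketch of reading such a pointer follows; the helper name `parse_lfs_pointer` is hypothetical and chosen only for illustration.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the diff above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:dc350c9be504f46e30766a6db37d01f3c513a613173900c8288bbfd0409004d2
size 440966725
"""

info = parse_lfs_pointer(pointer)
size_mib = info["size_bytes"] / 2**20
print(f"{info['hash_algo']} digest, {size_mib:.1f} MiB")
```

The size field shows that best-model.pt is roughly 420 MiB once fetched through LFS.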
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 19:55:54 0.0000 0.5727 0.1055 0.7697 0.7887 0.7791 0.6576
+ 2 19:56:58 0.0000 0.1175 0.1042 0.7749 0.8242 0.7988 0.6872
+ 3 19:58:01 0.0000 0.0731 0.1144 0.8237 0.8373 0.8304 0.7328
+ 4 19:59:04 0.0000 0.0482 0.1448 0.8270 0.8322 0.8296 0.7313
+ 5 20:00:05 0.0000 0.0354 0.1904 0.8241 0.8396 0.8318 0.7408
+ 6 20:01:09 0.0000 0.0269 0.1950 0.8280 0.8219 0.8249 0.7295
+ 7 20:02:11 0.0000 0.0175 0.2010 0.8450 0.8494 0.8472 0.7578
+ 8 20:03:15 0.0000 0.0106 0.2029 0.8466 0.8534 0.8500 0.7637
+ 9 20:04:18 0.0000 0.0072 0.2057 0.8344 0.8574 0.8458 0.7587
+ 10 20:05:21 0.0000 0.0050 0.2126 0.8439 0.8517 0.8478 0.7591
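The per-epoch table can be checked programmatically. The sketch below embeds the rows of loss.tsv as shown above (whitespace-separated here; the raw file is presumably tab-separated) and finds the epoch with the highest dev F1, the epochs at which dev F1 improved on all earlier epochs, and the epoch with the lowest dev loss. Treating "strictly better dev F1" as the rule for saving a checkpoint is an assumption about the trainer, not something the file states.

```python
# Rows copied from loss.tsv in this commit.
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 19:55:54 0.0000 0.5727 0.1055 0.7697 0.7887 0.7791 0.6576
2 19:56:58 0.0000 0.1175 0.1042 0.7749 0.8242 0.7988 0.6872
3 19:58:01 0.0000 0.0731 0.1144 0.8237 0.8373 0.8304 0.7328
4 19:59:04 0.0000 0.0482 0.1448 0.8270 0.8322 0.8296 0.7313
5 20:00:05 0.0000 0.0354 0.1904 0.8241 0.8396 0.8318 0.7408
6 20:01:09 0.0000 0.0269 0.1950 0.8280 0.8219 0.8249 0.7295
7 20:02:11 0.0000 0.0175 0.2010 0.8450 0.8494 0.8472 0.7578
8 20:03:15 0.0000 0.0106 0.2029 0.8466 0.8534 0.8500 0.7637
9 20:04:18 0.0000 0.0072 0.2057 0.8344 0.8574 0.8458 0.7587
10 20:05:21 0.0000 0.0050 0.2126 0.8439 0.8517 0.8478 0.7591
"""


def read_rows(text: str) -> list[dict]:
    """Split a whitespace-separated table into one dict per data row."""
    header, *lines = text.strip().splitlines()
    names = header.split()
    return [dict(zip(names, line.split())) for line in lines]


rows = read_rows(LOSS_TSV)

# Epoch with the highest dev F1 (the model-selection metric in the log).
best_epoch = int(max(rows, key=lambda r: float(r["DEV_F1"]))["EPOCH"])

# Epochs where dev F1 strictly improves on every earlier epoch
# (assumed checkpoint rule, see lead-in).
checkpoint_epochs = []
best_so_far = float("-inf")
for row in rows:
    f1 = float(row["DEV_F1"])
    if f1 > best_so_far:
        best_so_far = f1
        checkpoint_epochs.append(int(row["EPOCH"]))

# Epoch with the lowest dev loss, for comparison.
min_dev_loss_epoch = int(min(rows, key=lambda r: float(r["DEV_LOSS"]))["EPOCH"])

print(best_epoch, checkpoint_epochs, min_dev_loss_epoch)
```

Dev loss bottoms out at epoch 2 and rises afterwards while dev F1 keeps improving until epoch 8, so selecting on F1 rather than on loss changes which checkpoint is kept.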
runs/events.out.tfevents.1697572498.bce904bcef33.2482.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1f45eb7604ab86b2b5a17acec87883e4ab391a7184afa69b7ac55fb8ba323188
+ size 415388
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,241 @@
+ 2023-10-17 19:54:58,087 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:54:58,087 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): ElectraModel(
+       (embeddings): ElectraEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): ElectraEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x ElectraLayer(
+             (attention): ElectraAttention(
+               (self): ElectraSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): ElectraSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): ElectraIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): ElectraOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=21, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
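The module printout is enough to estimate the number of trainable parameters. The sketch below adds up only the layers that appear in the printout (embeddings, 12 encoder layers, the 21-way linear head); the helper names are illustrative, and the final comparison assumes 32-bit floats, which the log does not state.

```python
def linear(n_in: int, n_out: int) -> int:
    """Parameters of a Linear layer with bias: weight matrix plus bias vector."""
    return n_in * n_out + n_out


def layer_norm(n: int) -> int:
    """Parameters of an elementwise-affine LayerNorm: scale plus shift."""
    return 2 * n


# Sizes read off the printed SequenceTagger.
HIDDEN, FFN, VOCAB, POSITIONS, TOKEN_TYPES, LAYERS, TAGS = 768, 3072, 32001, 512, 2, 12, 21

# ElectraEmbeddings: three embedding tables and one LayerNorm.
embeddings = (VOCAB + POSITIONS + TOKEN_TYPES) * HIDDEN + layer_norm(HIDDEN)

# ElectraAttention: query, key, value and output projections plus LayerNorm.
attention = 4 * linear(HIDDEN, HIDDEN) + layer_norm(HIDDEN)

# ElectraIntermediate and ElectraOutput: the feed-forward block plus LayerNorm.
feed_forward = linear(HIDDEN, FFN) + linear(FFN, HIDDEN) + layer_norm(HIDDEN)

encoder = LAYERS * (attention + feed_forward)
head = linear(HIDDEN, TAGS)  # the tagger's final Linear(768 -> 21)

total = embeddings + encoder + head
print(f"{total:,} parameters, about {total * 4 / 1e6:.1f} MB at 4 bytes each")
```

Under the float32 assumption this comes to about 440 MB, which is close to the 440,966,725-byte best-model.pt recorded in the LFS pointer; the remainder would be serialization overhead and any buffers not counted here.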
+ 2023-10-17 19:54:58,088 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
+ - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
+ 2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:54:58,088 Train: 5901 sentences
+ 2023-10-17 19:54:58,088 (train_with_dev=False, train_with_test=False)
+ 2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:54:58,088 Training Params:
+ 2023-10-17 19:54:58,088 - learning_rate: "5e-05"
+ 2023-10-17 19:54:58,088 - mini_batch_size: "8"
+ 2023-10-17 19:54:58,088 - max_epochs: "10"
+ 2023-10-17 19:54:58,088 - shuffle: "True"
+ 2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:54:58,088 Plugins:
+ 2023-10-17 19:54:58,088 - TensorboardLogger
+ 2023-10-17 19:54:58,088 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
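The training parameters and the LinearScheduler plugin together determine the learning rate at every step. The sketch below is a plain re-implementation of a linear warmup followed by a linear decay to zero, assuming that is what the plugin does; it is not Flair's own code, and the function name is hypothetical. The step counts follow from the log: 5901 training sentences in mini-batches of 8 give 738 iterations per epoch.

```python
import math


def linear_warmup_lr(step: int, peak_lr: float, total_steps: int, warmup_fraction: float) -> float:
    """Learning rate after `step` optimizer steps (assumed schedule, see lead-in)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup: rise linearly from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay: fall linearly from peak_lr to 0 at total_steps.
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


steps_per_epoch = math.ceil(5901 / 8)  # 5901 train sentences, mini_batch_size 8
total_steps = steps_per_epoch * 10     # max_epochs 10


def lr_at(epoch: int, iteration: int) -> str:
    """Format the rate the way the log does (six decimals)."""
    step = (epoch - 1) * steps_per_epoch + iteration
    return f"{linear_warmup_lr(step, 5e-05, total_steps, 0.1):.6f}"


for epoch, iteration in [(1, 73), (1, 730), (2, 730), (10, 730)]:
    print(epoch, iteration, lr_at(epoch, iteration))
```

With a warmup fraction of 0.1 and 10 epochs, the warmup spans exactly the first epoch, which matches the rate climbing through epoch 1 in the log and declining from epoch 2 onward.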
+ 2023-10-17 19:54:58,088 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-17 19:54:58,088 - metric: "('micro avg', 'f1-score')"
+ 2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:54:58,088 Computation:
+ 2023-10-17 19:54:58,088 - compute on device: cuda:0
+ 2023-10-17 19:54:58,088 - embedding storage: none
+ 2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:54:58,088 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
+ 2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:54:58,089 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-17 19:55:03,268 epoch 1 - iter 73/738 - loss 2.88693034 - time (sec): 5.18 - samples/sec: 3395.57 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 19:55:07,787 epoch 1 - iter 146/738 - loss 1.85950582 - time (sec): 9.70 - samples/sec: 3389.92 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 19:55:13,252 epoch 1 - iter 219/738 - loss 1.35106642 - time (sec): 15.16 - samples/sec: 3378.75 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 19:55:18,790 epoch 1 - iter 292/738 - loss 1.08357586 - time (sec): 20.70 - samples/sec: 3336.43 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 19:55:23,677 epoch 1 - iter 365/738 - loss 0.92998728 - time (sec): 25.59 - samples/sec: 3324.51 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 19:55:28,248 epoch 1 - iter 438/738 - loss 0.82953286 - time (sec): 30.16 - samples/sec: 3312.57 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 19:55:33,045 epoch 1 - iter 511/738 - loss 0.74962079 - time (sec): 34.96 - samples/sec: 3295.52 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-17 19:55:38,046 epoch 1 - iter 584/738 - loss 0.68197633 - time (sec): 39.96 - samples/sec: 3288.35 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-17 19:55:43,219 epoch 1 - iter 657/738 - loss 0.62571720 - time (sec): 45.13 - samples/sec: 3269.21 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-17 19:55:48,468 epoch 1 - iter 730/738 - loss 0.57764771 - time (sec): 50.38 - samples/sec: 3269.02 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-17 19:55:48,961 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:55:48,961 EPOCH 1 done: loss 0.5727 - lr: 0.000049
+ 2023-10-17 19:55:54,850 DEV : loss 0.10554851591587067 - f1-score (micro avg) 0.7791
+ 2023-10-17 19:55:54,883 saving best model
+ 2023-10-17 19:55:55,250 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:56:00,200 epoch 2 - iter 73/738 - loss 0.14514615 - time (sec): 4.95 - samples/sec: 3366.36 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-17 19:56:05,461 epoch 2 - iter 146/738 - loss 0.14280365 - time (sec): 10.21 - samples/sec: 3405.96 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-17 19:56:11,181 epoch 2 - iter 219/738 - loss 0.13414673 - time (sec): 15.93 - samples/sec: 3260.55 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-17 19:56:16,015 epoch 2 - iter 292/738 - loss 0.12870561 - time (sec): 20.76 - samples/sec: 3242.34 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-17 19:56:20,656 epoch 2 - iter 365/738 - loss 0.12649401 - time (sec): 25.40 - samples/sec: 3224.67 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-17 19:56:25,261 epoch 2 - iter 438/738 - loss 0.12411978 - time (sec): 30.01 - samples/sec: 3236.24 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-17 19:56:30,148 epoch 2 - iter 511/738 - loss 0.11950094 - time (sec): 34.90 - samples/sec: 3251.88 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-17 19:56:35,203 epoch 2 - iter 584/738 - loss 0.11924617 - time (sec): 39.95 - samples/sec: 3240.62 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-17 19:56:40,727 epoch 2 - iter 657/738 - loss 0.11885339 - time (sec): 45.48 - samples/sec: 3244.73 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-17 19:56:46,137 epoch 2 - iter 730/738 - loss 0.11757973 - time (sec): 50.89 - samples/sec: 3234.08 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-17 19:56:46,752 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:56:46,752 EPOCH 2 done: loss 0.1175 - lr: 0.000045
+ 2023-10-17 19:56:58,026 DEV : loss 0.10420098155736923 - f1-score (micro avg) 0.7988
+ 2023-10-17 19:56:58,058 saving best model
+ 2023-10-17 19:56:58,527 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:57:04,268 epoch 3 - iter 73/738 - loss 0.06478791 - time (sec): 5.74 - samples/sec: 3069.26 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-17 19:57:09,479 epoch 3 - iter 146/738 - loss 0.06871097 - time (sec): 10.95 - samples/sec: 3185.38 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-17 19:57:14,560 epoch 3 - iter 219/738 - loss 0.06720314 - time (sec): 16.03 - samples/sec: 3221.49 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-17 19:57:19,385 epoch 3 - iter 292/738 - loss 0.06920884 - time (sec): 20.85 - samples/sec: 3233.71 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-17 19:57:24,326 epoch 3 - iter 365/738 - loss 0.07062373 - time (sec): 25.80 - samples/sec: 3231.62 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-17 19:57:29,283 epoch 3 - iter 438/738 - loss 0.07250042 - time (sec): 30.75 - samples/sec: 3215.39 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-17 19:57:34,689 epoch 3 - iter 511/738 - loss 0.07252047 - time (sec): 36.16 - samples/sec: 3237.02 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-17 19:57:39,815 epoch 3 - iter 584/738 - loss 0.07374542 - time (sec): 41.28 - samples/sec: 3222.24 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-17 19:57:44,747 epoch 3 - iter 657/738 - loss 0.07287253 - time (sec): 46.22 - samples/sec: 3223.47 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-17 19:57:49,392 epoch 3 - iter 730/738 - loss 0.07335412 - time (sec): 50.86 - samples/sec: 3243.97 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-17 19:57:49,824 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:57:49,825 EPOCH 3 done: loss 0.0731 - lr: 0.000039
+ 2023-10-17 19:58:01,170 DEV : loss 0.1143854483962059 - f1-score (micro avg) 0.8304
+ 2023-10-17 19:58:01,201 saving best model
+ 2023-10-17 19:58:01,680 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:58:06,869 epoch 4 - iter 73/738 - loss 0.05125944 - time (sec): 5.18 - samples/sec: 3063.50 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-17 19:58:12,085 epoch 4 - iter 146/738 - loss 0.04770594 - time (sec): 10.40 - samples/sec: 3221.63 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-17 19:58:16,715 epoch 4 - iter 219/738 - loss 0.05111511 - time (sec): 15.03 - samples/sec: 3250.15 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-17 19:58:21,755 epoch 4 - iter 292/738 - loss 0.05211159 - time (sec): 20.07 - samples/sec: 3238.43 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-17 19:58:26,379 epoch 4 - iter 365/738 - loss 0.05176227 - time (sec): 24.69 - samples/sec: 3228.83 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-17 19:58:31,229 epoch 4 - iter 438/738 - loss 0.05010900 - time (sec): 29.54 - samples/sec: 3263.30 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-17 19:58:35,878 epoch 4 - iter 511/738 - loss 0.04871523 - time (sec): 34.19 - samples/sec: 3282.05 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-17 19:58:41,325 epoch 4 - iter 584/738 - loss 0.04755472 - time (sec): 39.64 - samples/sec: 3277.83 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-17 19:58:46,444 epoch 4 - iter 657/738 - loss 0.04734226 - time (sec): 44.76 - samples/sec: 3265.92 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-17 19:58:52,119 epoch 4 - iter 730/738 - loss 0.04826096 - time (sec): 50.43 - samples/sec: 3266.32 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-17 19:58:52,586 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:58:52,587 EPOCH 4 done: loss 0.0482 - lr: 0.000033
+ 2023-10-17 19:59:03,974 DEV : loss 0.14476759731769562 - f1-score (micro avg) 0.8296
+ 2023-10-17 19:59:04,007 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:59:09,112 epoch 5 - iter 73/738 - loss 0.02409841 - time (sec): 5.10 - samples/sec: 3478.57 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-17 19:59:13,903 epoch 5 - iter 146/738 - loss 0.02674600 - time (sec): 9.90 - samples/sec: 3380.35 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-17 19:59:18,707 epoch 5 - iter 219/738 - loss 0.03079058 - time (sec): 14.70 - samples/sec: 3371.53 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-17 19:59:23,875 epoch 5 - iter 292/738 - loss 0.03683637 - time (sec): 19.87 - samples/sec: 3340.91 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-17 19:59:28,827 epoch 5 - iter 365/738 - loss 0.03423444 - time (sec): 24.82 - samples/sec: 3340.14 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-17 19:59:33,840 epoch 5 - iter 438/738 - loss 0.03471070 - time (sec): 29.83 - samples/sec: 3336.10 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 19:59:38,818 epoch 5 - iter 511/738 - loss 0.03388460 - time (sec): 34.81 - samples/sec: 3313.82 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 19:59:43,507 epoch 5 - iter 584/738 - loss 0.03409148 - time (sec): 39.50 - samples/sec: 3308.22 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 19:59:48,472 epoch 5 - iter 657/738 - loss 0.03441889 - time (sec): 44.46 - samples/sec: 3313.79 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 19:59:53,517 epoch 5 - iter 730/738 - loss 0.03505580 - time (sec): 49.51 - samples/sec: 3315.30 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 19:59:54,306 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:59:54,306 EPOCH 5 done: loss 0.0354 - lr: 0.000028
+ 2023-10-17 20:00:05,845 DEV : loss 0.19043707847595215 - f1-score (micro avg) 0.8318
+ 2023-10-17 20:00:05,880 saving best model
+ 2023-10-17 20:00:06,366 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:00:11,320 epoch 6 - iter 73/738 - loss 0.03216532 - time (sec): 4.95 - samples/sec: 3176.87 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 20:00:16,370 epoch 6 - iter 146/738 - loss 0.02639679 - time (sec): 10.00 - samples/sec: 3286.09 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 20:00:21,647 epoch 6 - iter 219/738 - loss 0.02284424 - time (sec): 15.28 - samples/sec: 3237.56 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 20:00:27,112 epoch 6 - iter 292/738 - loss 0.02623873 - time (sec): 20.74 - samples/sec: 3148.97 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 20:00:32,148 epoch 6 - iter 365/738 - loss 0.02569826 - time (sec): 25.78 - samples/sec: 3159.99 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 20:00:36,948 epoch 6 - iter 438/738 - loss 0.02511489 - time (sec): 30.58 - samples/sec: 3174.95 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 20:00:42,160 epoch 6 - iter 511/738 - loss 0.02584371 - time (sec): 35.79 - samples/sec: 3177.93 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 20:00:47,108 epoch 6 - iter 584/738 - loss 0.02552496 - time (sec): 40.74 - samples/sec: 3212.15 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 20:00:51,949 epoch 6 - iter 657/738 - loss 0.02629373 - time (sec): 45.58 - samples/sec: 3222.14 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 20:00:57,061 epoch 6 - iter 730/738 - loss 0.02671452 - time (sec): 50.69 - samples/sec: 3245.73 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 20:00:57,723 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:00:57,723 EPOCH 6 done: loss 0.0269 - lr: 0.000022
+ 2023-10-17 20:01:09,239 DEV : loss 0.1950322538614273 - f1-score (micro avg) 0.8249
+ 2023-10-17 20:01:09,271 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:01:14,527 epoch 7 - iter 73/738 - loss 0.01386572 - time (sec): 5.25 - samples/sec: 3200.54 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 20:01:19,720 epoch 7 - iter 146/738 - loss 0.01634565 - time (sec): 10.45 - samples/sec: 3157.16 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 20:01:25,152 epoch 7 - iter 219/738 - loss 0.01790451 - time (sec): 15.88 - samples/sec: 3193.90 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 20:01:30,559 epoch 7 - iter 292/738 - loss 0.01736663 - time (sec): 21.29 - samples/sec: 3204.39 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 20:01:35,540 epoch 7 - iter 365/738 - loss 0.01978003 - time (sec): 26.27 - samples/sec: 3192.85 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 20:01:40,624 epoch 7 - iter 438/738 - loss 0.01962635 - time (sec): 31.35 - samples/sec: 3207.62 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 20:01:45,292 epoch 7 - iter 511/738 - loss 0.01942126 - time (sec): 36.02 - samples/sec: 3231.16 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 20:01:50,368 epoch 7 - iter 584/738 - loss 0.01907844 - time (sec): 41.09 - samples/sec: 3246.80 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 20:01:55,510 epoch 7 - iter 657/738 - loss 0.01851805 - time (sec): 46.24 - samples/sec: 3245.13 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 20:02:00,008 epoch 7 - iter 730/738 - loss 0.01755251 - time (sec): 50.74 - samples/sec: 3250.75 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 20:02:00,455 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:02:00,455 EPOCH 7 done: loss 0.0175 - lr: 0.000017
+ 2023-10-17 20:02:11,907 DEV : loss 0.20099307596683502 - f1-score (micro avg) 0.8472
+ 2023-10-17 20:02:11,942 saving best model
+ 2023-10-17 20:02:12,437 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:02:17,409 epoch 8 - iter 73/738 - loss 0.00530868 - time (sec): 4.97 - samples/sec: 3271.17 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 20:02:22,979 epoch 8 - iter 146/738 - loss 0.00860378 - time (sec): 10.54 - samples/sec: 3219.29 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 20:02:27,693 epoch 8 - iter 219/738 - loss 0.00882523 - time (sec): 15.25 - samples/sec: 3245.99 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 20:02:32,332 epoch 8 - iter 292/738 - loss 0.00758182 - time (sec): 19.89 - samples/sec: 3282.80 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 20:02:37,477 epoch 8 - iter 365/738 - loss 0.00936325 - time (sec): 25.04 - samples/sec: 3273.17 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 20:02:42,143 epoch 8 - iter 438/738 - loss 0.00889819 - time (sec): 29.70 - samples/sec: 3288.61 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 20:02:47,889 epoch 8 - iter 511/738 - loss 0.01120590 - time (sec): 35.45 - samples/sec: 3292.36 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 20:02:52,781 epoch 8 - iter 584/738 - loss 0.01070108 - time (sec): 40.34 - samples/sec: 3292.28 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 20:02:57,905 epoch 8 - iter 657/738 - loss 0.01029857 - time (sec): 45.47 - samples/sec: 3281.21 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 20:03:02,819 epoch 8 - iter 730/738 - loss 0.01057085 - time (sec): 50.38 - samples/sec: 3266.74 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 20:03:03,437 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:03:03,438 EPOCH 8 done: loss 0.0106 - lr: 0.000011
+ 2023-10-17 20:03:15,127 DEV : loss 0.2029201090335846 - f1-score (micro avg) 0.85
+ 2023-10-17 20:03:15,159 saving best model
+ 2023-10-17 20:03:15,636 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:03:21,317 epoch 9 - iter 73/738 - loss 0.00731494 - time (sec): 5.67 - samples/sec: 3159.58 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 20:03:26,465 epoch 9 - iter 146/738 - loss 0.00660726 - time (sec): 10.82 - samples/sec: 3343.70 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 20:03:31,815 epoch 9 - iter 219/738 - loss 0.00688380 - time (sec): 16.17 - samples/sec: 3369.99 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 20:03:36,707 epoch 9 - iter 292/738 - loss 0.00641742 - time (sec): 21.06 - samples/sec: 3302.92 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 20:03:41,868 epoch 9 - iter 365/738 - loss 0.00586385 - time (sec): 26.23 - samples/sec: 3257.53 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 20:03:46,603 epoch 9 - iter 438/738 - loss 0.00547976 - time (sec): 30.96 - samples/sec: 3284.63 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 20:03:51,092 epoch 9 - iter 511/738 - loss 0.00675333 - time (sec): 35.45 - samples/sec: 3291.12 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 20:03:55,944 epoch 9 - iter 584/738 - loss 0.00670660 - time (sec): 40.30 - samples/sec: 3279.67 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 20:04:01,758 epoch 9 - iter 657/738 - loss 0.00683468 - time (sec): 46.12 - samples/sec: 3255.35 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 20:04:06,152 epoch 9 - iter 730/738 - loss 0.00731875 - time (sec): 50.51 - samples/sec: 3256.15 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 20:04:06,710 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:04:06,711 EPOCH 9 done: loss 0.0072 - lr: 0.000006
+ 2023-10-17 20:04:18,319 DEV : loss 0.20573198795318604 - f1-score (micro avg) 0.8458
+ 2023-10-17 20:04:18,358 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:04:24,146 epoch 10 - iter 73/738 - loss 0.00431885 - time (sec): 5.79 - samples/sec: 3384.62 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 20:04:29,348 epoch 10 - iter 146/738 - loss 0.00511183 - time (sec): 10.99 - samples/sec: 3350.39 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 20:04:34,219 epoch 10 - iter 219/738 - loss 0.00524316 - time (sec): 15.86 - samples/sec: 3345.15 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 20:04:39,060 epoch 10 - iter 292/738 - loss 0.00450078 - time (sec): 20.70 - samples/sec: 3268.11 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 20:04:43,692 epoch 10 - iter 365/738 - loss 0.00430535 - time (sec): 25.33 - samples/sec: 3281.17 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 20:04:48,199 epoch 10 - iter 438/738 - loss 0.00395909 - time (sec): 29.84 - samples/sec: 3327.98 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 20:04:53,745 epoch 10 - iter 511/738 - loss 0.00416350 - time (sec): 35.39 - samples/sec: 3296.62 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 20:04:58,445 epoch 10 - iter 584/738 - loss 0.00467921 - time (sec): 40.09 - samples/sec: 3287.80 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 20:05:03,398 epoch 10 - iter 657/738 - loss 0.00470911 - time (sec): 45.04 - samples/sec: 3274.98 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 20:05:08,928 epoch 10 - iter 730/738 - loss 0.00478977 - time (sec): 50.57 - samples/sec: 3260.86 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 20:05:09,381 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:09,382 EPOCH 10 done: loss 0.0050 - lr: 0.000000
+ 2023-10-17 20:05:21,014 DEV : loss 0.2125636637210846 - f1-score (micro avg) 0.8478
+ 2023-10-17 20:05:21,431 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 20:05:21,432 Loading model from best epoch ...
+ 2023-10-17 20:05:22,951 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
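The 21 tags follow a BIOES layout: one `O` tag plus four positional prefixes (`S`, `B`, `I`, `E`) for each of the five entity types. A minimal sketch of turning such a tag sequence into entity spans is shown below; the function name and the example sentence tags are made up for illustration, and this is not the decoding code used by the tagger.

```python
def bioes_to_spans(tags: list[str]) -> list[tuple[int, int, str]]:
    """Convert BIOES tags to (start, end, label) spans with inclusive token indices."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        prefix, _, entity = tag.partition("-")
        if prefix == "S":
            # Single-token entity.
            spans.append((i, i, entity))
            start = None
        elif prefix == "B":
            # Open a multi-token entity.
            start, label = i, entity
        elif prefix == "I" and start is not None and entity == label:
            # Continue the open entity.
            continue
        elif prefix == "E" and start is not None and entity == label:
            # Close the open entity.
            spans.append((start, i, entity))
            start = None
        else:
            # "O" or an inconsistent tag: drop any open entity.
            start = None
    return spans


entity_types = ["loc", "pers", "org", "time", "prod"]
tag_count = 1 + 4 * len(entity_types)  # "O" plus S/B/I/E per type

# Hypothetical tag sequence for an eight-token sentence.
example = ["O", "B-pers", "I-pers", "E-pers", "O", "S-loc", "B-org", "E-org"]
spans = bioes_to_spans(example)
print(tag_count, spans)
```

Span-level precision, recall and F1, as reported in the results, are computed over entities like these rather than over individual tokens.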
+ 2023-10-17 20:05:29,932
+ Results:
+ - F-score (micro) 0.8107
+ - F-score (macro) 0.7154
+ - Accuracy 0.7
+
+ By class:
+               precision    recall  f1-score   support
+
+          loc     0.8549    0.8928    0.8734       858
+         pers     0.7792    0.8082    0.7934       537
+          org     0.6154    0.6061    0.6107       132
+         prod     0.6721    0.6721    0.6721        61
+         time     0.5781    0.6852    0.6271        54
+
+    micro avg     0.7951    0.8270    0.8107      1642
+    macro avg     0.6999    0.7329    0.7154      1642
+ weighted avg     0.7950    0.8270    0.8106      1642
+
+ 2023-10-17 20:05:29,933 ----------------------------------------------------------------------------------------------------
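The three averaged rows can be re-derived from the per-class rows, which is a quick sanity check on the report. The sketch below uses the rounded four-decimal values printed above, so the recomputed figures can differ from the log in the last digit.

```python
# (precision, recall, f1, support) per class, copied from the report above.
report = {
    "loc": (0.8549, 0.8928, 0.8734, 858),
    "pers": (0.7792, 0.8082, 0.7934, 537),
    "org": (0.6154, 0.6061, 0.6107, 132),
    "prod": (0.6721, 0.6721, 0.6721, 61),
    "time": (0.5781, 0.6852, 0.6271, 54),
}

total_support = sum(support for _, _, _, support in report.values())

# Macro average: unweighted mean over classes, so rare classes count as much as loc.
macro_f1 = sum(f1 for _, _, f1, _ in report.values()) / len(report)

# Weighted average: mean over classes weighted by support.
weighted_f1 = sum(f1 * support for _, _, f1, support in report.values()) / total_support

# Micro F1: harmonic mean of the pooled (micro) precision and recall.
micro_p, micro_r = 0.7951, 0.8270
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"micro {micro_f1:.4f}  macro {macro_f1:.4f}  weighted {weighted_f1:.4f}")
```

The gap between micro F1 (0.8107) and macro F1 (0.7154) comes from the low-support classes: org, prod and time together account for 247 of the 1642 test entities and all score below 0.68 F1.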