Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697572498.bce904bcef33.2482.1 +3 -0
- test.tsv +0 -0
- training.log +241 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dc350c9be504f46e30766a6db37d01f3c513a613173900c8288bbfd0409004d2
+size 440966725
dev.tsv
ADDED
The diff for this file is too large to render.
See raw diff
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      19:55:54   0.0000         0.5727      0.1055    0.7697         0.7887      0.7791  0.6576
+2      19:56:58   0.0000         0.1175      0.1042    0.7749         0.8242      0.7988  0.6872
+3      19:58:01   0.0000         0.0731      0.1144    0.8237         0.8373      0.8304  0.7328
+4      19:59:04   0.0000         0.0482      0.1448    0.8270         0.8322      0.8296  0.7313
+5      20:00:05   0.0000         0.0354      0.1904    0.8241         0.8396      0.8318  0.7408
+6      20:01:09   0.0000         0.0269      0.1950    0.8280         0.8219      0.8249  0.7295
+7      20:02:11   0.0000         0.0175      0.2010    0.8450         0.8494      0.8472  0.7578
+8      20:03:15   0.0000         0.0106      0.2029    0.8466         0.8534      0.8500  0.7637
+9      20:04:18   0.0000         0.0072      0.2057    0.8344         0.8574      0.8458  0.7587
+10     20:05:21   0.0000         0.0050      0.2126    0.8439         0.8517      0.8478  0.7591
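A minimal sketch, not part of the uploaded files, of how the rows of loss.tsv above could be read to find the epoch with the highest DEV_F1. The table is embedded as a string copied from the diff, and the helper names `parse_loss_table` and `best_epoch` are hypothetical.

```python
# Rows of loss.tsv as shown in the diff. The real file is tab-separated;
# str.split() handles tabs and spaces alike.
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 19:55:54 0.0000 0.5727 0.1055 0.7697 0.7887 0.7791 0.6576
2 19:56:58 0.0000 0.1175 0.1042 0.7749 0.8242 0.7988 0.6872
3 19:58:01 0.0000 0.0731 0.1144 0.8237 0.8373 0.8304 0.7328
4 19:59:04 0.0000 0.0482 0.1448 0.8270 0.8322 0.8296 0.7313
5 20:00:05 0.0000 0.0354 0.1904 0.8241 0.8396 0.8318 0.7408
6 20:01:09 0.0000 0.0269 0.1950 0.8280 0.8219 0.8249 0.7295
7 20:02:11 0.0000 0.0175 0.2010 0.8450 0.8494 0.8472 0.7578
8 20:03:15 0.0000 0.0106 0.2029 0.8466 0.8534 0.8500 0.7637
9 20:04:18 0.0000 0.0072 0.2057 0.8344 0.8574 0.8458 0.7587
10 20:05:21 0.0000 0.0050 0.2126 0.8439 0.8517 0.8478 0.7591
"""


def parse_loss_table(text):
    """Turn the header plus data rows into a list of dicts (values stay strings)."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, data = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in data]


def best_epoch(rows, metric="DEV_F1"):
    """Return the row with the highest value of `metric` (earliest row on ties)."""
    return max(rows, key=lambda row: float(row[metric]))


rows = parse_loss_table(LOSS_TSV)
best = best_epoch(rows)
print(best["EPOCH"], best["DEV_F1"])  # prints: 8 0.8500
```

On these rows the selection lands on epoch 8 (DEV_F1 0.8500), even though DEV_LOSS is lowest at epoch 2 (0.1042).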
runs/events.out.tfevents.1697572498.bce904bcef33.2482.1
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1f45eb7604ab86b2b5a17acec87883e4ab391a7184afa69b7ac55fb8ba323188
+size 415388
test.tsv
ADDED
The diff for this file is too large to render.
See raw diff
training.log
ADDED
@@ -0,0 +1,241 @@
+2023-10-17 19:54:58,087 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:54:58,087 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): ElectraModel(
+      (embeddings): ElectraEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): ElectraEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x ElectraLayer(
+            (attention): ElectraAttention(
+              (self): ElectraSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): ElectraSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): ElectraIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): ElectraOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=21, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:54:58,088 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
+ - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
+2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:54:58,088 Train: 5901 sentences
+2023-10-17 19:54:58,088 (train_with_dev=False, train_with_test=False)
+2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:54:58,088 Training Params:
+2023-10-17 19:54:58,088 - learning_rate: "5e-05"
+2023-10-17 19:54:58,088 - mini_batch_size: "8"
+2023-10-17 19:54:58,088 - max_epochs: "10"
+2023-10-17 19:54:58,088 - shuffle: "True"
+2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:54:58,088 Plugins:
+2023-10-17 19:54:58,088 - TensorboardLogger
+2023-10-17 19:54:58,088 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:54:58,088 Final evaluation on model from best epoch (best-model.pt)
+2023-10-17 19:54:58,088 - metric: "('micro avg', 'f1-score')"
+2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:54:58,088 Computation:
+2023-10-17 19:54:58,088 - compute on device: cuda:0
+2023-10-17 19:54:58,088 - embedding storage: none
+2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:54:58,088 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
+2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:54:58,088 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:54:58,089 Logging anything other than scalars to TensorBoard is currently not supported.
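The parameters logged above (learning_rate 5e-05, max_epochs 10, LinearScheduler with warmup_fraction 0.1) together with the 738 iterations per epoch reported in this log suggest a schedule that ramps up linearly during epoch 1 and then decays linearly to zero. The sketch below is an assumption about that shape, not Flair's implementation; `linear_warmup_lr` is a hypothetical helper. It is meant only for checking the `lr:` values printed in the per-iteration lines of this log.

```python
def linear_warmup_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero.

    `step` is the number of optimizer updates done so far. This is an assumed
    schedule shape, not code taken from Flair.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


STEPS_PER_EPOCH = 738                 # from "iter 73/738" in the log
TOTAL_STEPS = STEPS_PER_EPOCH * 10    # max_epochs: "10"

# (epoch, iteration) pairs that appear in the log
for epoch, iteration in [(1, 73), (1, 730), (2, 73), (3, 73), (10, 730)]:
    step = (epoch - 1) * STEPS_PER_EPOCH + iteration
    print(epoch, iteration, f"{linear_warmup_lr(step, TOTAL_STEPS):.6f}")
```

Under these assumptions the sketch gives 0.000005 for epoch 1 iter 73 and 0.000049 for epoch 2 iter 73, matching the logged values at six decimals. This also explains why the LEARNING_RATE column of loss.tsv, which keeps only four decimals, shows 0.0000 throughout.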
+2023-10-17 19:55:03,268 epoch 1 - iter 73/738 - loss 2.88693034 - time (sec): 5.18 - samples/sec: 3395.57 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 19:55:07,787 epoch 1 - iter 146/738 - loss 1.85950582 - time (sec): 9.70 - samples/sec: 3389.92 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 19:55:13,252 epoch 1 - iter 219/738 - loss 1.35106642 - time (sec): 15.16 - samples/sec: 3378.75 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 19:55:18,790 epoch 1 - iter 292/738 - loss 1.08357586 - time (sec): 20.70 - samples/sec: 3336.43 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 19:55:23,677 epoch 1 - iter 365/738 - loss 0.92998728 - time (sec): 25.59 - samples/sec: 3324.51 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 19:55:28,248 epoch 1 - iter 438/738 - loss 0.82953286 - time (sec): 30.16 - samples/sec: 3312.57 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 19:55:33,045 epoch 1 - iter 511/738 - loss 0.74962079 - time (sec): 34.96 - samples/sec: 3295.52 - lr: 0.000035 - momentum: 0.000000
+2023-10-17 19:55:38,046 epoch 1 - iter 584/738 - loss 0.68197633 - time (sec): 39.96 - samples/sec: 3288.35 - lr: 0.000039 - momentum: 0.000000
+2023-10-17 19:55:43,219 epoch 1 - iter 657/738 - loss 0.62571720 - time (sec): 45.13 - samples/sec: 3269.21 - lr: 0.000044 - momentum: 0.000000
+2023-10-17 19:55:48,468 epoch 1 - iter 730/738 - loss 0.57764771 - time (sec): 50.38 - samples/sec: 3269.02 - lr: 0.000049 - momentum: 0.000000
+2023-10-17 19:55:48,961 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:55:48,961 EPOCH 1 done: loss 0.5727 - lr: 0.000049
+2023-10-17 19:55:54,850 DEV : loss 0.10554851591587067 - f1-score (micro avg) 0.7791
+2023-10-17 19:55:54,883 saving best model
+2023-10-17 19:55:55,250 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:56:00,200 epoch 2 - iter 73/738 - loss 0.14514615 - time (sec): 4.95 - samples/sec: 3366.36 - lr: 0.000049 - momentum: 0.000000
+2023-10-17 19:56:05,461 epoch 2 - iter 146/738 - loss 0.14280365 - time (sec): 10.21 - samples/sec: 3405.96 - lr: 0.000049 - momentum: 0.000000
+2023-10-17 19:56:11,181 epoch 2 - iter 219/738 - loss 0.13414673 - time (sec): 15.93 - samples/sec: 3260.55 - lr: 0.000048 - momentum: 0.000000
+2023-10-17 19:56:16,015 epoch 2 - iter 292/738 - loss 0.12870561 - time (sec): 20.76 - samples/sec: 3242.34 - lr: 0.000048 - momentum: 0.000000
+2023-10-17 19:56:20,656 epoch 2 - iter 365/738 - loss 0.12649401 - time (sec): 25.40 - samples/sec: 3224.67 - lr: 0.000047 - momentum: 0.000000
+2023-10-17 19:56:25,261 epoch 2 - iter 438/738 - loss 0.12411978 - time (sec): 30.01 - samples/sec: 3236.24 - lr: 0.000047 - momentum: 0.000000
+2023-10-17 19:56:30,148 epoch 2 - iter 511/738 - loss 0.11950094 - time (sec): 34.90 - samples/sec: 3251.88 - lr: 0.000046 - momentum: 0.000000
+2023-10-17 19:56:35,203 epoch 2 - iter 584/738 - loss 0.11924617 - time (sec): 39.95 - samples/sec: 3240.62 - lr: 0.000046 - momentum: 0.000000
+2023-10-17 19:56:40,727 epoch 2 - iter 657/738 - loss 0.11885339 - time (sec): 45.48 - samples/sec: 3244.73 - lr: 0.000045 - momentum: 0.000000
+2023-10-17 19:56:46,137 epoch 2 - iter 730/738 - loss 0.11757973 - time (sec): 50.89 - samples/sec: 3234.08 - lr: 0.000045 - momentum: 0.000000
+2023-10-17 19:56:46,752 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:56:46,752 EPOCH 2 done: loss 0.1175 - lr: 0.000045
+2023-10-17 19:56:58,026 DEV : loss 0.10420098155736923 - f1-score (micro avg) 0.7988
+2023-10-17 19:56:58,058 saving best model
+2023-10-17 19:56:58,527 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:57:04,268 epoch 3 - iter 73/738 - loss 0.06478791 - time (sec): 5.74 - samples/sec: 3069.26 - lr: 0.000044 - momentum: 0.000000
+2023-10-17 19:57:09,479 epoch 3 - iter 146/738 - loss 0.06871097 - time (sec): 10.95 - samples/sec: 3185.38 - lr: 0.000043 - momentum: 0.000000
+2023-10-17 19:57:14,560 epoch 3 - iter 219/738 - loss 0.06720314 - time (sec): 16.03 - samples/sec: 3221.49 - lr: 0.000043 - momentum: 0.000000
+2023-10-17 19:57:19,385 epoch 3 - iter 292/738 - loss 0.06920884 - time (sec): 20.85 - samples/sec: 3233.71 - lr: 0.000042 - momentum: 0.000000
+2023-10-17 19:57:24,326 epoch 3 - iter 365/738 - loss 0.07062373 - time (sec): 25.80 - samples/sec: 3231.62 - lr: 0.000042 - momentum: 0.000000
+2023-10-17 19:57:29,283 epoch 3 - iter 438/738 - loss 0.07250042 - time (sec): 30.75 - samples/sec: 3215.39 - lr: 0.000041 - momentum: 0.000000
+2023-10-17 19:57:34,689 epoch 3 - iter 511/738 - loss 0.07252047 - time (sec): 36.16 - samples/sec: 3237.02 - lr: 0.000041 - momentum: 0.000000
+2023-10-17 19:57:39,815 epoch 3 - iter 584/738 - loss 0.07374542 - time (sec): 41.28 - samples/sec: 3222.24 - lr: 0.000040 - momentum: 0.000000
+2023-10-17 19:57:44,747 epoch 3 - iter 657/738 - loss 0.07287253 - time (sec): 46.22 - samples/sec: 3223.47 - lr: 0.000040 - momentum: 0.000000
+2023-10-17 19:57:49,392 epoch 3 - iter 730/738 - loss 0.07335412 - time (sec): 50.86 - samples/sec: 3243.97 - lr: 0.000039 - momentum: 0.000000
+2023-10-17 19:57:49,824 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:57:49,825 EPOCH 3 done: loss 0.0731 - lr: 0.000039
+2023-10-17 19:58:01,170 DEV : loss 0.1143854483962059 - f1-score (micro avg) 0.8304
+2023-10-17 19:58:01,201 saving best model
+2023-10-17 19:58:01,680 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:58:06,869 epoch 4 - iter 73/738 - loss 0.05125944 - time (sec): 5.18 - samples/sec: 3063.50 - lr: 0.000038 - momentum: 0.000000
+2023-10-17 19:58:12,085 epoch 4 - iter 146/738 - loss 0.04770594 - time (sec): 10.40 - samples/sec: 3221.63 - lr: 0.000038 - momentum: 0.000000
+2023-10-17 19:58:16,715 epoch 4 - iter 219/738 - loss 0.05111511 - time (sec): 15.03 - samples/sec: 3250.15 - lr: 0.000037 - momentum: 0.000000
+2023-10-17 19:58:21,755 epoch 4 - iter 292/738 - loss 0.05211159 - time (sec): 20.07 - samples/sec: 3238.43 - lr: 0.000037 - momentum: 0.000000
+2023-10-17 19:58:26,379 epoch 4 - iter 365/738 - loss 0.05176227 - time (sec): 24.69 - samples/sec: 3228.83 - lr: 0.000036 - momentum: 0.000000
+2023-10-17 19:58:31,229 epoch 4 - iter 438/738 - loss 0.05010900 - time (sec): 29.54 - samples/sec: 3263.30 - lr: 0.000036 - momentum: 0.000000
+2023-10-17 19:58:35,878 epoch 4 - iter 511/738 - loss 0.04871523 - time (sec): 34.19 - samples/sec: 3282.05 - lr: 0.000035 - momentum: 0.000000
+2023-10-17 19:58:41,325 epoch 4 - iter 584/738 - loss 0.04755472 - time (sec): 39.64 - samples/sec: 3277.83 - lr: 0.000035 - momentum: 0.000000
+2023-10-17 19:58:46,444 epoch 4 - iter 657/738 - loss 0.04734226 - time (sec): 44.76 - samples/sec: 3265.92 - lr: 0.000034 - momentum: 0.000000
+2023-10-17 19:58:52,119 epoch 4 - iter 730/738 - loss 0.04826096 - time (sec): 50.43 - samples/sec: 3266.32 - lr: 0.000033 - momentum: 0.000000
+2023-10-17 19:58:52,586 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:58:52,587 EPOCH 4 done: loss 0.0482 - lr: 0.000033
+2023-10-17 19:59:03,974 DEV : loss 0.14476759731769562 - f1-score (micro avg) 0.8296
+2023-10-17 19:59:04,007 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:59:09,112 epoch 5 - iter 73/738 - loss 0.02409841 - time (sec): 5.10 - samples/sec: 3478.57 - lr: 0.000033 - momentum: 0.000000
+2023-10-17 19:59:13,903 epoch 5 - iter 146/738 - loss 0.02674600 - time (sec): 9.90 - samples/sec: 3380.35 - lr: 0.000032 - momentum: 0.000000
+2023-10-17 19:59:18,707 epoch 5 - iter 219/738 - loss 0.03079058 - time (sec): 14.70 - samples/sec: 3371.53 - lr: 0.000032 - momentum: 0.000000
+2023-10-17 19:59:23,875 epoch 5 - iter 292/738 - loss 0.03683637 - time (sec): 19.87 - samples/sec: 3340.91 - lr: 0.000031 - momentum: 0.000000
+2023-10-17 19:59:28,827 epoch 5 - iter 365/738 - loss 0.03423444 - time (sec): 24.82 - samples/sec: 3340.14 - lr: 0.000031 - momentum: 0.000000
+2023-10-17 19:59:33,840 epoch 5 - iter 438/738 - loss 0.03471070 - time (sec): 29.83 - samples/sec: 3336.10 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 19:59:38,818 epoch 5 - iter 511/738 - loss 0.03388460 - time (sec): 34.81 - samples/sec: 3313.82 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 19:59:43,507 epoch 5 - iter 584/738 - loss 0.03409148 - time (sec): 39.50 - samples/sec: 3308.22 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 19:59:48,472 epoch 5 - iter 657/738 - loss 0.03441889 - time (sec): 44.46 - samples/sec: 3313.79 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 19:59:53,517 epoch 5 - iter 730/738 - loss 0.03505580 - time (sec): 49.51 - samples/sec: 3315.30 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 19:59:54,306 ----------------------------------------------------------------------------------------------------
+2023-10-17 19:59:54,306 EPOCH 5 done: loss 0.0354 - lr: 0.000028
+2023-10-17 20:00:05,845 DEV : loss 0.19043707847595215 - f1-score (micro avg) 0.8318
+2023-10-17 20:00:05,880 saving best model
+2023-10-17 20:00:06,366 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:00:11,320 epoch 6 - iter 73/738 - loss 0.03216532 - time (sec): 4.95 - samples/sec: 3176.87 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 20:00:16,370 epoch 6 - iter 146/738 - loss 0.02639679 - time (sec): 10.00 - samples/sec: 3286.09 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 20:00:21,647 epoch 6 - iter 219/738 - loss 0.02284424 - time (sec): 15.28 - samples/sec: 3237.56 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 20:00:27,112 epoch 6 - iter 292/738 - loss 0.02623873 - time (sec): 20.74 - samples/sec: 3148.97 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 20:00:32,148 epoch 6 - iter 365/738 - loss 0.02569826 - time (sec): 25.78 - samples/sec: 3159.99 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 20:00:36,948 epoch 6 - iter 438/738 - loss 0.02511489 - time (sec): 30.58 - samples/sec: 3174.95 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 20:00:42,160 epoch 6 - iter 511/738 - loss 0.02584371 - time (sec): 35.79 - samples/sec: 3177.93 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 20:00:47,108 epoch 6 - iter 584/738 - loss 0.02552496 - time (sec): 40.74 - samples/sec: 3212.15 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 20:00:51,949 epoch 6 - iter 657/738 - loss 0.02629373 - time (sec): 45.58 - samples/sec: 3222.14 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 20:00:57,061 epoch 6 - iter 730/738 - loss 0.02671452 - time (sec): 50.69 - samples/sec: 3245.73 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 20:00:57,723 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:00:57,723 EPOCH 6 done: loss 0.0269 - lr: 0.000022
+2023-10-17 20:01:09,239 DEV : loss 0.1950322538614273 - f1-score (micro avg) 0.8249
+2023-10-17 20:01:09,271 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:01:14,527 epoch 7 - iter 73/738 - loss 0.01386572 - time (sec): 5.25 - samples/sec: 3200.54 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 20:01:19,720 epoch 7 - iter 146/738 - loss 0.01634565 - time (sec): 10.45 - samples/sec: 3157.16 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 20:01:25,152 epoch 7 - iter 219/738 - loss 0.01790451 - time (sec): 15.88 - samples/sec: 3193.90 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 20:01:30,559 epoch 7 - iter 292/738 - loss 0.01736663 - time (sec): 21.29 - samples/sec: 3204.39 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 20:01:35,540 epoch 7 - iter 365/738 - loss 0.01978003 - time (sec): 26.27 - samples/sec: 3192.85 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 20:01:40,624 epoch 7 - iter 438/738 - loss 0.01962635 - time (sec): 31.35 - samples/sec: 3207.62 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 20:01:45,292 epoch 7 - iter 511/738 - loss 0.01942126 - time (sec): 36.02 - samples/sec: 3231.16 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 20:01:50,368 epoch 7 - iter 584/738 - loss 0.01907844 - time (sec): 41.09 - samples/sec: 3246.80 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 20:01:55,510 epoch 7 - iter 657/738 - loss 0.01851805 - time (sec): 46.24 - samples/sec: 3245.13 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 20:02:00,008 epoch 7 - iter 730/738 - loss 0.01755251 - time (sec): 50.74 - samples/sec: 3250.75 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 20:02:00,455 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:02:00,455 EPOCH 7 done: loss 0.0175 - lr: 0.000017
+2023-10-17 20:02:11,907 DEV : loss 0.20099307596683502 - f1-score (micro avg) 0.8472
+2023-10-17 20:02:11,942 saving best model
+2023-10-17 20:02:12,437 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:02:17,409 epoch 8 - iter 73/738 - loss 0.00530868 - time (sec): 4.97 - samples/sec: 3271.17 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 20:02:22,979 epoch 8 - iter 146/738 - loss 0.00860378 - time (sec): 10.54 - samples/sec: 3219.29 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 20:02:27,693 epoch 8 - iter 219/738 - loss 0.00882523 - time (sec): 15.25 - samples/sec: 3245.99 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 20:02:32,332 epoch 8 - iter 292/738 - loss 0.00758182 - time (sec): 19.89 - samples/sec: 3282.80 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 20:02:37,477 epoch 8 - iter 365/738 - loss 0.00936325 - time (sec): 25.04 - samples/sec: 3273.17 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 20:02:42,143 epoch 8 - iter 438/738 - loss 0.00889819 - time (sec): 29.70 - samples/sec: 3288.61 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 20:02:47,889 epoch 8 - iter 511/738 - loss 0.01120590 - time (sec): 35.45 - samples/sec: 3292.36 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 20:02:52,781 epoch 8 - iter 584/738 - loss 0.01070108 - time (sec): 40.34 - samples/sec: 3292.28 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:02:57,905 epoch 8 - iter 657/738 - loss 0.01029857 - time (sec): 45.47 - samples/sec: 3281.21 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 20:03:02,819 epoch 8 - iter 730/738 - loss 0.01057085 - time (sec): 50.38 - samples/sec: 3266.74 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 20:03:03,437 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:03:03,438 EPOCH 8 done: loss 0.0106 - lr: 0.000011
+2023-10-17 20:03:15,127 DEV : loss 0.2029201090335846 - f1-score (micro avg) 0.85
+2023-10-17 20:03:15,159 saving best model
+2023-10-17 20:03:15,636 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:03:21,317 epoch 9 - iter 73/738 - loss 0.00731494 - time (sec): 5.67 - samples/sec: 3159.58 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 20:03:26,465 epoch 9 - iter 146/738 - loss 0.00660726 - time (sec): 10.82 - samples/sec: 3343.70 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 20:03:31,815 epoch 9 - iter 219/738 - loss 0.00688380 - time (sec): 16.17 - samples/sec: 3369.99 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 20:03:36,707 epoch 9 - iter 292/738 - loss 0.00641742 - time (sec): 21.06 - samples/sec: 3302.92 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 20:03:41,868 epoch 9 - iter 365/738 - loss 0.00586385 - time (sec): 26.23 - samples/sec: 3257.53 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 20:03:46,603 epoch 9 - iter 438/738 - loss 0.00547976 - time (sec): 30.96 - samples/sec: 3284.63 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 20:03:51,092 epoch 9 - iter 511/738 - loss 0.00675333 - time (sec): 35.45 - samples/sec: 3291.12 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 20:03:55,944 epoch 9 - iter 584/738 - loss 0.00670660 - time (sec): 40.30 - samples/sec: 3279.67 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 20:04:01,758 epoch 9 - iter 657/738 - loss 0.00683468 - time (sec): 46.12 - samples/sec: 3255.35 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:04:06,152 epoch 9 - iter 730/738 - loss 0.00731875 - time (sec): 50.51 - samples/sec: 3256.15 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 20:04:06,710 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:04:06,711 EPOCH 9 done: loss 0.0072 - lr: 0.000006
+2023-10-17 20:04:18,319 DEV : loss 0.20573198795318604 - f1-score (micro avg) 0.8458
+2023-10-17 20:04:18,358 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:04:24,146 epoch 10 - iter 73/738 - loss 0.00431885 - time (sec): 5.79 - samples/sec: 3384.62 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 20:04:29,348 epoch 10 - iter 146/738 - loss 0.00511183 - time (sec): 10.99 - samples/sec: 3350.39 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 20:04:34,219 epoch 10 - iter 219/738 - loss 0.00524316 - time (sec): 15.86 - samples/sec: 3345.15 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 20:04:39,060 epoch 10 - iter 292/738 - loss 0.00450078 - time (sec): 20.70 - samples/sec: 3268.11 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:04:43,692 epoch 10 - iter 365/738 - loss 0.00430535 - time (sec): 25.33 - samples/sec: 3281.17 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 20:04:48,199 epoch 10 - iter 438/738 - loss 0.00395909 - time (sec): 29.84 - samples/sec: 3327.98 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 20:04:53,745 epoch 10 - iter 511/738 - loss 0.00416350 - time (sec): 35.39 - samples/sec: 3296.62 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 20:04:58,445 epoch 10 - iter 584/738 - loss 0.00467921 - time (sec): 40.09 - samples/sec: 3287.80 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 20:05:03,398 epoch 10 - iter 657/738 - loss 0.00470911 - time (sec): 45.04 - samples/sec: 3274.98 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 20:05:08,928 epoch 10 - iter 730/738 - loss 0.00478977 - time (sec): 50.57 - samples/sec: 3260.86 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 20:05:09,381 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:09,382 EPOCH 10 done: loss 0.0050 - lr: 0.000000
+2023-10-17 20:05:21,014 DEV : loss 0.2125636637210846 - f1-score (micro avg) 0.8478
+2023-10-17 20:05:21,431 ----------------------------------------------------------------------------------------------------
+2023-10-17 20:05:21,432 Loading model from best epoch ...
+2023-10-17 20:05:22,951 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
+2023-10-17 20:05:29,932
+Results:
+- F-score (micro) 0.8107
+- F-score (macro) 0.7154
+- Accuracy 0.7
+
+By class:
+              precision    recall  f1-score   support
+
+         loc     0.8549    0.8928    0.8734       858
+        pers     0.7792    0.8082    0.7934       537
+         org     0.6154    0.6061    0.6107       132
+        prod     0.6721    0.6721    0.6721        61
+        time     0.5781    0.6852    0.6271        54
+
+   micro avg     0.7951    0.8270    0.8107      1642
+   macro avg     0.6999    0.7329    0.7154      1642
+weighted avg     0.7950    0.8270    0.8106      1642
+
+2023-10-17 20:05:29,933 ----------------------------------------------------------------------------------------------------
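The micro and macro averages in the final report can be cross-checked against the per-class rows. The sketch below is not part of the upload and makes one assumption: that the integer counts of correct and predicted entities can be recovered by rounding `recall * support` and `true_positives / precision`. The per-class figures are copied from the report; the variable names are hypothetical.

```python
# Per-class rows from the "By class" report: (precision, recall, f1, support)
REPORT = {
    "loc":  (0.8549, 0.8928, 0.8734, 858),
    "pers": (0.7792, 0.8082, 0.7934, 537),
    "org":  (0.6154, 0.6061, 0.6107, 132),
    "prod": (0.6721, 0.6721, 0.6721, 61),
    "time": (0.5781, 0.6852, 0.6271, 54),
}

true_pos = 0   # correctly predicted entities, summed over classes
predicted = 0  # all predicted entities, summed over classes
gold = 0       # all gold entities (the "support" column)
for precision, recall, _f1, support in REPORT.values():
    tp = round(recall * support)      # assumption: recover the integer count
    true_pos += tp
    predicted += round(tp / precision)
    gold += support

# Micro average: pool the counts first, then compute the scores once.
micro_precision = true_pos / predicted
micro_recall = true_pos / gold
micro_f1 = 2 * true_pos / (predicted + gold)

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in REPORT.values()) / len(REPORT)

print(f"micro P={micro_precision:.4f} R={micro_recall:.4f} F1={micro_f1:.4f}")
print(f"macro F1={macro_f1:.4f}")
```

With the recovered counts (1358 correct out of 1708 predicted and 1642 gold entities), the micro scores match the logged 0.7951 / 0.8270 / 0.8107. The macro F1 computed from the four-decimal inputs agrees with the logged 0.7154 only to within rounding. The gap between the micro and macro scores reflects the weaker results on the small org, prod and time classes.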