stefan-it committed
Commit 6e98dcd
1 Parent(s): 5f572bb

Upload folder using huggingface_hub

best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f96bba563dd898d95c0a840aa135b0a9d850c122308f5179b7bbf7758568fd31
+ size 440941957
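The three lines added for best-model.pt are a Git LFS pointer, not the checkpoint itself: they record the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for illustration and is not part of git-lfs or huggingface_hub.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the "key value" lines of a Git LFS pointer (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is a key, one space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the best-model.pt diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f96bba563dd898d95c0a840aa135b0a9d850c122308f5179b7bbf7758568fd31
size 440941957
"""

pointer = parse_lfs_pointer(POINTER)
size_mib = pointer["size_bytes"] / 2**20  # bytes to MiB
print(pointer["hash_algorithm"], f"{size_mib:.1f} MiB")
```

The size field shows that the actual checkpoint is a bit over 420 MiB, which is why only the pointer appears in the diff.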
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+ 1      09:06:03   0.0000         0.2873      0.1274    0.4853         0.6991      0.5729  0.4112
+ 2      09:10:00   0.0000         0.0977      0.1344    0.5390         0.8387      0.6562  0.4963
+ 3      09:13:59   0.0000         0.0762      0.1836    0.5495         0.7368      0.6295  0.4657
+ 4      09:17:48   0.0000         0.0548      0.2305    0.5647         0.7243      0.6346  0.4692
+ 5      09:21:52   0.0000         0.0386      0.3018    0.5591         0.7906      0.6550  0.4950
+ 6      09:25:39   0.0000         0.0277      0.3515    0.5356         0.7998      0.6416  0.4794
+ 7      09:29:41   0.0000         0.0194      0.3488    0.5560         0.7780      0.6485  0.4868
+ 8      09:33:31   0.0000         0.0129      0.3892    0.5512         0.8066      0.6549  0.4958
+ 9      09:37:34   0.0000         0.0082      0.3819    0.5724         0.7780      0.6596  0.4985
+ 10     09:41:33   0.0000         0.0047      0.4088    0.5653         0.7929      0.6600  0.5000
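The loss.tsv rows can be used to see at which epochs the dev F1 reached a new maximum. The sketch below embeds the rows from the diff and assumes that a checkpoint is written whenever DEV_F1 strictly improves; that assumption matches the "saving best model" lines in training.log (epochs 1, 2, 9 and 10) but is not confirmed by the source. `read_rows` and `new_best_epochs` are hypothetical helper names. The LEARNING_RATE column reads 0.0000 throughout, presumably because values of at most 3e-05 are rounded to four decimals.

```python
# Rows copied from loss.tsv above; the real file is tab-separated,
# but str.split() handles tabs and spaces alike.
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 09:06:03 0.0000 0.2873 0.1274 0.4853 0.6991 0.5729 0.4112
2 09:10:00 0.0000 0.0977 0.1344 0.5390 0.8387 0.6562 0.4963
3 09:13:59 0.0000 0.0762 0.1836 0.5495 0.7368 0.6295 0.4657
4 09:17:48 0.0000 0.0548 0.2305 0.5647 0.7243 0.6346 0.4692
5 09:21:52 0.0000 0.0386 0.3018 0.5591 0.7906 0.6550 0.4950
6 09:25:39 0.0000 0.0277 0.3515 0.5356 0.7998 0.6416 0.4794
7 09:29:41 0.0000 0.0194 0.3488 0.5560 0.7780 0.6485 0.4868
8 09:33:31 0.0000 0.0129 0.3892 0.5512 0.8066 0.6549 0.4958
9 09:37:34 0.0000 0.0082 0.3819 0.5724 0.7780 0.6596 0.4985
10 09:41:33 0.0000 0.0047 0.4088 0.5653 0.7929 0.6600 0.5000
"""


def read_rows(text: str) -> list[dict]:
    """Turn the header line plus data lines into a list of dicts."""
    header, *lines = text.strip().splitlines()
    names = header.split()
    return [dict(zip(names, line.split())) for line in lines]


def new_best_epochs(rows: list[dict], metric: str = "DEV_F1") -> list[int]:
    """Return the epochs at which `metric` strictly exceeded every earlier value."""
    best = float("-inf")
    epochs = []
    for row in rows:
        score = float(row[metric])
        if score > best:
            best = score
            epochs.append(int(row["EPOCH"]))
    return epochs


rows = read_rows(LOSS_TSV)
checkpoint_epochs = new_best_epochs(rows)
lowest_dev_loss_epoch = int(min(rows, key=lambda r: float(r["DEV_LOSS"]))["EPOCH"])
print(checkpoint_epochs, lowest_dev_loss_epoch)
```

Dev loss is lowest after epoch 1 and rises afterwards while train loss keeps falling, yet dev F1 peaks at epoch 10, so the selection metric (micro F1) and the dev loss disagree about the best epoch.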
runs/events.out.tfevents.1697533330.4aef72135bc5.1113.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1ec5cb44ed8f7f0f9708ed2e17e800da044bd5ac530f4041cb73a58206733d79
+ size 2030580
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,237 @@
+ 2023-10-17 09:02:10,577 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:02:10,579 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): ElectraModel(
+       (embeddings): ElectraEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): ElectraEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x ElectraLayer(
+             (attention): ElectraAttention(
+               (self): ElectraSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): ElectraSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): ElectraIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): ElectraOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-17 09:02:10,579 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:02:10,579 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
+ - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
+ 2023-10-17 09:02:10,579 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:02:10,579 Train: 14465 sentences
+ 2023-10-17 09:02:10,579 (train_with_dev=False, train_with_test=False)
+ 2023-10-17 09:02:10,579 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:02:10,579 Training Params:
+ 2023-10-17 09:02:10,579 - learning_rate: "3e-05"
+ 2023-10-17 09:02:10,579 - mini_batch_size: "4"
+ 2023-10-17 09:02:10,579 - max_epochs: "10"
+ 2023-10-17 09:02:10,579 - shuffle: "True"
+ 2023-10-17 09:02:10,580 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:02:10,580 Plugins:
+ 2023-10-17 09:02:10,580 - TensorboardLogger
+ 2023-10-17 09:02:10,580 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-17 09:02:10,580 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:02:10,580 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-17 09:02:10,580 - metric: "('micro avg', 'f1-score')"
+ 2023-10-17 09:02:10,580 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:02:10,580 Computation:
+ 2023-10-17 09:02:10,580 - compute on device: cuda:0
+ 2023-10-17 09:02:10,580 - embedding storage: none
+ 2023-10-17 09:02:10,580 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:02:10,580 Model training base path: "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
+ 2023-10-17 09:02:10,580 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:02:10,580 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:02:10,580 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-17 09:02:34,601 epoch 1 - iter 361/3617 - loss 1.57318286 - time (sec): 24.02 - samples/sec: 1618.04 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 09:02:57,319 epoch 1 - iter 722/3617 - loss 0.91675384 - time (sec): 46.74 - samples/sec: 1634.35 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 09:03:20,631 epoch 1 - iter 1083/3617 - loss 0.66327805 - time (sec): 70.05 - samples/sec: 1635.60 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 09:03:43,792 epoch 1 - iter 1444/3617 - loss 0.53397507 - time (sec): 93.21 - samples/sec: 1641.10 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 09:04:07,330 epoch 1 - iter 1805/3617 - loss 0.45416962 - time (sec): 116.75 - samples/sec: 1629.84 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 09:04:28,950 epoch 1 - iter 2166/3617 - loss 0.40035588 - time (sec): 138.37 - samples/sec: 1642.28 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 09:04:50,503 epoch 1 - iter 2527/3617 - loss 0.36212873 - time (sec): 159.92 - samples/sec: 1653.79 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 09:05:13,205 epoch 1 - iter 2888/3617 - loss 0.33159814 - time (sec): 182.62 - samples/sec: 1663.85 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 09:05:35,839 epoch 1 - iter 3249/3617 - loss 0.30611244 - time (sec): 205.26 - samples/sec: 1667.31 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 09:05:56,932 epoch 1 - iter 3610/3617 - loss 0.28747179 - time (sec): 226.35 - samples/sec: 1674.81 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 09:05:57,348 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:05:57,349 EPOCH 1 done: loss 0.2873 - lr: 0.000030
+ 2023-10-17 09:06:03,627 DEV : loss 0.12736186385154724 - f1-score (micro avg) 0.5729
+ 2023-10-17 09:06:03,669 saving best model
+ 2023-10-17 09:06:04,170 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:06:27,927 epoch 2 - iter 361/3617 - loss 0.10706765 - time (sec): 23.76 - samples/sec: 1541.66 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 09:06:52,938 epoch 2 - iter 722/3617 - loss 0.10073542 - time (sec): 48.77 - samples/sec: 1557.77 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 09:07:16,610 epoch 2 - iter 1083/3617 - loss 0.10019335 - time (sec): 72.44 - samples/sec: 1589.80 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 09:07:39,159 epoch 2 - iter 1444/3617 - loss 0.10373908 - time (sec): 94.99 - samples/sec: 1596.28 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 09:08:02,807 epoch 2 - iter 1805/3617 - loss 0.10168351 - time (sec): 118.64 - samples/sec: 1585.32 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 09:08:26,939 epoch 2 - iter 2166/3617 - loss 0.10048563 - time (sec): 142.77 - samples/sec: 1582.24 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 09:08:48,749 epoch 2 - iter 2527/3617 - loss 0.09886610 - time (sec): 164.58 - samples/sec: 1601.65 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 09:09:10,359 epoch 2 - iter 2888/3617 - loss 0.10012403 - time (sec): 186.19 - samples/sec: 1621.82 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 09:09:31,958 epoch 2 - iter 3249/3617 - loss 0.09783095 - time (sec): 207.79 - samples/sec: 1650.26 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 09:09:53,444 epoch 2 - iter 3610/3617 - loss 0.09771129 - time (sec): 229.27 - samples/sec: 1654.59 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 09:09:53,849 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:09:53,850 EPOCH 2 done: loss 0.0977 - lr: 0.000027
+ 2023-10-17 09:10:00,951 DEV : loss 0.1343551129102707 - f1-score (micro avg) 0.6562
+ 2023-10-17 09:10:00,995 saving best model
+ 2023-10-17 09:10:01,589 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:10:23,891 epoch 3 - iter 361/3617 - loss 0.07791442 - time (sec): 22.30 - samples/sec: 1689.74 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 09:10:48,004 epoch 3 - iter 722/3617 - loss 0.07169122 - time (sec): 46.41 - samples/sec: 1664.02 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 09:11:12,224 epoch 3 - iter 1083/3617 - loss 0.07458371 - time (sec): 70.63 - samples/sec: 1628.39 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 09:11:34,685 epoch 3 - iter 1444/3617 - loss 0.07550290 - time (sec): 93.09 - samples/sec: 1658.33 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 09:11:57,759 epoch 3 - iter 1805/3617 - loss 0.07532046 - time (sec): 116.17 - samples/sec: 1655.01 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 09:12:20,983 epoch 3 - iter 2166/3617 - loss 0.07566162 - time (sec): 139.39 - samples/sec: 1655.63 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 09:12:44,462 epoch 3 - iter 2527/3617 - loss 0.07561865 - time (sec): 162.87 - samples/sec: 1649.55 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 09:13:08,282 epoch 3 - iter 2888/3617 - loss 0.07691066 - time (sec): 186.69 - samples/sec: 1634.52 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 09:13:30,472 epoch 3 - iter 3249/3617 - loss 0.07626978 - time (sec): 208.88 - samples/sec: 1640.52 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 09:13:52,448 epoch 3 - iter 3610/3617 - loss 0.07614469 - time (sec): 230.86 - samples/sec: 1642.45 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 09:13:52,876 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:13:52,876 EPOCH 3 done: loss 0.0762 - lr: 0.000023
+ 2023-10-17 09:13:59,262 DEV : loss 0.183589369058609 - f1-score (micro avg) 0.6295
+ 2023-10-17 09:13:59,306 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:14:22,434 epoch 4 - iter 361/3617 - loss 0.05314260 - time (sec): 23.13 - samples/sec: 1683.44 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 09:14:44,723 epoch 4 - iter 722/3617 - loss 0.04744612 - time (sec): 45.42 - samples/sec: 1691.58 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 09:15:07,522 epoch 4 - iter 1083/3617 - loss 0.05024746 - time (sec): 68.21 - samples/sec: 1705.54 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 09:15:30,144 epoch 4 - iter 1444/3617 - loss 0.05060285 - time (sec): 90.84 - samples/sec: 1687.44 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 09:15:51,988 epoch 4 - iter 1805/3617 - loss 0.05209485 - time (sec): 112.68 - samples/sec: 1703.02 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 09:16:13,057 epoch 4 - iter 2166/3617 - loss 0.05270590 - time (sec): 133.75 - samples/sec: 1703.34 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 09:16:35,238 epoch 4 - iter 2527/3617 - loss 0.05313400 - time (sec): 155.93 - samples/sec: 1704.57 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 09:16:57,109 epoch 4 - iter 2888/3617 - loss 0.05411486 - time (sec): 177.80 - samples/sec: 1712.35 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 09:17:19,686 epoch 4 - iter 3249/3617 - loss 0.05477012 - time (sec): 200.38 - samples/sec: 1703.16 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 09:17:41,730 epoch 4 - iter 3610/3617 - loss 0.05488383 - time (sec): 222.42 - samples/sec: 1704.66 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 09:17:42,130 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:17:42,130 EPOCH 4 done: loss 0.0548 - lr: 0.000020
+ 2023-10-17 09:17:48,498 DEV : loss 0.23047274351119995 - f1-score (micro avg) 0.6346
+ 2023-10-17 09:17:48,542 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:18:11,645 epoch 5 - iter 361/3617 - loss 0.03877276 - time (sec): 23.10 - samples/sec: 1648.99 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 09:18:35,842 epoch 5 - iter 722/3617 - loss 0.03893797 - time (sec): 47.30 - samples/sec: 1573.53 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 09:19:00,608 epoch 5 - iter 1083/3617 - loss 0.03738715 - time (sec): 72.06 - samples/sec: 1566.11 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 09:19:24,105 epoch 5 - iter 1444/3617 - loss 0.03521727 - time (sec): 95.56 - samples/sec: 1570.21 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 09:19:49,600 epoch 5 - iter 1805/3617 - loss 0.03658251 - time (sec): 121.06 - samples/sec: 1563.80 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 09:20:12,773 epoch 5 - iter 2166/3617 - loss 0.03666109 - time (sec): 144.23 - samples/sec: 1576.19 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 09:20:35,723 epoch 5 - iter 2527/3617 - loss 0.03560606 - time (sec): 167.18 - samples/sec: 1592.90 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 09:20:59,453 epoch 5 - iter 2888/3617 - loss 0.03567818 - time (sec): 190.91 - samples/sec: 1591.22 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 09:21:21,835 epoch 5 - iter 3249/3617 - loss 0.03843292 - time (sec): 213.29 - samples/sec: 1598.18 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 09:21:45,332 epoch 5 - iter 3610/3617 - loss 0.03827749 - time (sec): 236.79 - samples/sec: 1601.48 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 09:21:45,766 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:21:45,766 EPOCH 5 done: loss 0.0386 - lr: 0.000017
+ 2023-10-17 09:21:52,102 DEV : loss 0.30182531476020813 - f1-score (micro avg) 0.655
+ 2023-10-17 09:21:52,151 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:22:13,654 epoch 6 - iter 361/3617 - loss 0.02971622 - time (sec): 21.50 - samples/sec: 1768.48 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 09:22:35,210 epoch 6 - iter 722/3617 - loss 0.02596294 - time (sec): 43.06 - samples/sec: 1771.12 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 09:22:56,691 epoch 6 - iter 1083/3617 - loss 0.02699636 - time (sec): 64.54 - samples/sec: 1753.12 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 09:23:18,245 epoch 6 - iter 1444/3617 - loss 0.02615175 - time (sec): 86.09 - samples/sec: 1759.27 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 09:23:39,965 epoch 6 - iter 1805/3617 - loss 0.02525762 - time (sec): 107.81 - samples/sec: 1757.56 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 09:24:03,115 epoch 6 - iter 2166/3617 - loss 0.02578467 - time (sec): 130.96 - samples/sec: 1734.42 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 09:24:25,467 epoch 6 - iter 2527/3617 - loss 0.02698358 - time (sec): 153.31 - samples/sec: 1728.18 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 09:24:47,331 epoch 6 - iter 2888/3617 - loss 0.02700357 - time (sec): 175.18 - samples/sec: 1728.11 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 09:25:09,745 epoch 6 - iter 3249/3617 - loss 0.02824365 - time (sec): 197.59 - samples/sec: 1727.15 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 09:25:31,716 epoch 6 - iter 3610/3617 - loss 0.02780229 - time (sec): 219.56 - samples/sec: 1726.51 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 09:25:32,140 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:25:32,140 EPOCH 6 done: loss 0.0277 - lr: 0.000013
+ 2023-10-17 09:25:39,197 DEV : loss 0.35149648785591125 - f1-score (micro avg) 0.6416
+ 2023-10-17 09:25:39,238 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:26:03,014 epoch 7 - iter 361/3617 - loss 0.01810299 - time (sec): 23.77 - samples/sec: 1687.02 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 09:26:26,361 epoch 7 - iter 722/3617 - loss 0.01752033 - time (sec): 47.12 - samples/sec: 1674.89 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 09:26:49,545 epoch 7 - iter 1083/3617 - loss 0.01868235 - time (sec): 70.31 - samples/sec: 1650.25 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 09:27:14,249 epoch 7 - iter 1444/3617 - loss 0.01923882 - time (sec): 95.01 - samples/sec: 1609.58 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 09:27:37,499 epoch 7 - iter 1805/3617 - loss 0.01853291 - time (sec): 118.26 - samples/sec: 1604.87 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 09:28:00,694 epoch 7 - iter 2166/3617 - loss 0.01937851 - time (sec): 141.45 - samples/sec: 1599.75 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 09:28:23,889 epoch 7 - iter 2527/3617 - loss 0.01913223 - time (sec): 164.65 - samples/sec: 1608.56 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 09:28:48,438 epoch 7 - iter 2888/3617 - loss 0.01935787 - time (sec): 189.20 - samples/sec: 1603.18 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 09:29:12,391 epoch 7 - iter 3249/3617 - loss 0.01962716 - time (sec): 213.15 - samples/sec: 1600.38 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 09:29:34,243 epoch 7 - iter 3610/3617 - loss 0.01941165 - time (sec): 235.00 - samples/sec: 1614.35 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 09:29:34,669 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:29:34,670 EPOCH 7 done: loss 0.0194 - lr: 0.000010
+ 2023-10-17 09:29:41,014 DEV : loss 0.34877660870552063 - f1-score (micro avg) 0.6485
+ 2023-10-17 09:29:41,056 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:30:02,968 epoch 8 - iter 361/3617 - loss 0.01268390 - time (sec): 21.91 - samples/sec: 1703.57 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 09:30:25,299 epoch 8 - iter 722/3617 - loss 0.01099667 - time (sec): 44.24 - samples/sec: 1711.21 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 09:30:48,107 epoch 8 - iter 1083/3617 - loss 0.01320618 - time (sec): 67.05 - samples/sec: 1694.10 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 09:31:11,789 epoch 8 - iter 1444/3617 - loss 0.01438812 - time (sec): 90.73 - samples/sec: 1666.77 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 09:31:33,254 epoch 8 - iter 1805/3617 - loss 0.01526598 - time (sec): 112.20 - samples/sec: 1689.19 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 09:31:54,672 epoch 8 - iter 2166/3617 - loss 0.01461418 - time (sec): 133.61 - samples/sec: 1697.90 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 09:32:16,127 epoch 8 - iter 2527/3617 - loss 0.01418461 - time (sec): 155.07 - samples/sec: 1709.65 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 09:32:37,708 epoch 8 - iter 2888/3617 - loss 0.01390302 - time (sec): 176.65 - samples/sec: 1717.72 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 09:33:01,690 epoch 8 - iter 3249/3617 - loss 0.01320267 - time (sec): 200.63 - samples/sec: 1699.52 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 09:33:24,874 epoch 8 - iter 3610/3617 - loss 0.01285080 - time (sec): 223.82 - samples/sec: 1693.91 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 09:33:25,290 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:33:25,290 EPOCH 8 done: loss 0.0129 - lr: 0.000007
+ 2023-10-17 09:33:31,720 DEV : loss 0.38916581869125366 - f1-score (micro avg) 0.6549
+ 2023-10-17 09:33:31,761 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:33:56,673 epoch 9 - iter 361/3617 - loss 0.01001985 - time (sec): 24.91 - samples/sec: 1557.19 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 09:34:20,402 epoch 9 - iter 722/3617 - loss 0.00865293 - time (sec): 48.64 - samples/sec: 1545.49 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 09:34:44,683 epoch 9 - iter 1083/3617 - loss 0.00726475 - time (sec): 72.92 - samples/sec: 1550.75 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 09:35:07,954 epoch 9 - iter 1444/3617 - loss 0.00785104 - time (sec): 96.19 - samples/sec: 1579.21 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 09:35:32,051 epoch 9 - iter 1805/3617 - loss 0.00836511 - time (sec): 120.29 - samples/sec: 1579.08 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 09:35:54,176 epoch 9 - iter 2166/3617 - loss 0.00795807 - time (sec): 142.41 - samples/sec: 1601.95 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 09:36:16,583 epoch 9 - iter 2527/3617 - loss 0.00807517 - time (sec): 164.82 - samples/sec: 1611.25 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 09:36:39,198 epoch 9 - iter 2888/3617 - loss 0.00797982 - time (sec): 187.44 - samples/sec: 1620.56 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 09:37:02,573 epoch 9 - iter 3249/3617 - loss 0.00787163 - time (sec): 210.81 - samples/sec: 1624.12 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 09:37:26,353 epoch 9 - iter 3610/3617 - loss 0.00821338 - time (sec): 234.59 - samples/sec: 1617.36 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 09:37:26,818 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:37:26,818 EPOCH 9 done: loss 0.0082 - lr: 0.000003
+ 2023-10-17 09:37:34,515 DEV : loss 0.3819396495819092 - f1-score (micro avg) 0.6596
+ 2023-10-17 09:37:34,561 saving best model
+ 2023-10-17 09:37:35,193 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:37:59,413 epoch 10 - iter 361/3617 - loss 0.00781618 - time (sec): 24.22 - samples/sec: 1593.10 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 09:38:22,354 epoch 10 - iter 722/3617 - loss 0.00571085 - time (sec): 47.16 - samples/sec: 1608.77 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 09:38:45,290 epoch 10 - iter 1083/3617 - loss 0.00491489 - time (sec): 70.10 - samples/sec: 1617.50 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 09:39:08,269 epoch 10 - iter 1444/3617 - loss 0.00460967 - time (sec): 93.07 - samples/sec: 1618.29 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 09:39:30,478 epoch 10 - iter 1805/3617 - loss 0.00449936 - time (sec): 115.28 - samples/sec: 1622.89 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 09:39:54,643 epoch 10 - iter 2166/3617 - loss 0.00457182 - time (sec): 139.45 - samples/sec: 1617.06 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 09:40:17,739 epoch 10 - iter 2527/3617 - loss 0.00425884 - time (sec): 162.54 - samples/sec: 1632.20 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 09:40:40,411 epoch 10 - iter 2888/3617 - loss 0.00465737 - time (sec): 185.22 - samples/sec: 1632.18 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 09:41:03,801 epoch 10 - iter 3249/3617 - loss 0.00458577 - time (sec): 208.61 - samples/sec: 1633.83 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 09:41:26,934 epoch 10 - iter 3610/3617 - loss 0.00474640 - time (sec): 231.74 - samples/sec: 1636.82 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 09:41:27,381 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:41:27,381 EPOCH 10 done: loss 0.0047 - lr: 0.000000
+ 2023-10-17 09:41:33,760 DEV : loss 0.40882349014282227 - f1-score (micro avg) 0.66
+ 2023-10-17 09:41:33,801 saving best model
+ 2023-10-17 09:41:34,823 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 09:41:34,824 Loading model from best epoch ...
+ 2023-10-17 09:41:36,920 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
+ 2023-10-17 09:41:46,180
+ Results:
+ - F-score (micro) 0.6469
+ - F-score (macro) 0.5083
+ - Accuracy 0.4891
+
+ By class:
+               precision    recall  f1-score   support
+
+          loc     0.6288    0.7766    0.6949       591
+         pers     0.5766    0.7591    0.6554       357
+          org     0.1857    0.1646    0.1745        79
+
+    micro avg     0.5850    0.7235    0.6469      1027
+    macro avg     0.4637    0.5668    0.5083      1027
+ weighted avg     0.5766    0.7235    0.6411      1027
+
+ 2023-10-17 09:41:46,180 ----------------------------------------------------------------------------------------------------
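The averaged rows of the final report can be rechecked from the per-class rows. The sketch below assumes the usual definitions: macro average is the unweighted mean over classes, weighted average uses support as the weight, and F1 is the harmonic mean of precision and recall. `ClassScore`, `f1`, `macro_avg` and `weighted_avg` are hypothetical names for illustration, and the inputs are the rounded values printed in the log, so agreement is only expected to about four decimals.

```python
from dataclasses import dataclass


@dataclass
class ClassScore:
    precision: float
    recall: float
    f1: float
    support: int


# Per-class rows copied from the "By class" table in training.log.
REPORT = {
    "loc": ClassScore(0.6288, 0.7766, 0.6949, 591),
    "pers": ClassScore(0.5766, 0.7591, 0.6554, 357),
    "org": ClassScore(0.1857, 0.1646, 0.1745, 79),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def macro_avg(report: dict, field: str) -> float:
    """Unweighted mean of one field over all classes."""
    return sum(getattr(s, field) for s in report.values()) / len(report)


def weighted_avg(report: dict, field: str) -> float:
    """Mean of one field over all classes, weighted by support."""
    total_support = sum(s.support for s in report.values())
    weighted_sum = sum(getattr(s, field) * s.support for s in report.values())
    return weighted_sum / total_support


macro_f1 = macro_avg(REPORT, "f1")
weighted_f1 = weighted_avg(REPORT, "f1")
# Micro F1 recomputed from the logged micro-average precision and recall.
micro_f1 = f1(0.5850, 0.7235)
print(f"{micro_f1:.4f} {macro_f1:.4f} {weighted_f1:.4f}")
```

The gap between micro F1 (0.6469) and macro F1 (0.5083) comes mostly from the org class, which has only 79 test entities and an F1 of 0.1745.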