stefan-it committed on
Commit
d308d3a
1 Parent(s): 77c0c92

Upload folder using huggingface_hub

best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2b9555b8c0cf24519cb6d747623fd3e0973bf9380a62f59caeb86cc5060c83f5
+ size 440941957
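The three added lines are a Git LFS pointer, not the model weights themselves. A minimal sketch of reading such a pointer follows; the helper name `parse_lfs_pointer` and the returned field names are illustrative assumptions, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the "key value" lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the best-model.pt diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2b9555b8c0cf24519cb6d747623fd3e0973bf9380a62f59caeb86cc5060c83f5
size 440941957
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], f'{info["size_bytes"] / 1e6:.1f} MB')
```

The size field shows that the actual checkpoint is roughly 441 MB and is stored outside the Git history.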
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 18:51:55 0.0000 0.4518 0.0857 0.7548 0.7696 0.7621 0.6224
+ 2 18:52:52 0.0000 0.0871 0.0563 0.8789 0.8543 0.8664 0.7729
+ 3 18:53:49 0.0000 0.0606 0.0610 0.9110 0.8564 0.8829 0.7994
+ 4 18:54:45 0.0000 0.0432 0.0706 0.8862 0.8450 0.8652 0.7695
+ 5 18:55:41 0.0000 0.0324 0.0791 0.9126 0.8306 0.8697 0.7746
+ 6 18:56:37 0.0000 0.0263 0.0874 0.8989 0.8636 0.8809 0.7977
+ 7 18:57:33 0.0000 0.0184 0.1058 0.9079 0.8554 0.8809 0.7954
+ 8 18:58:29 0.0000 0.0139 0.1137 0.9060 0.8564 0.8805 0.7956
+ 9 18:59:24 0.0000 0.0117 0.1161 0.9008 0.8626 0.8813 0.7960
+ 10 19:00:21 0.0000 0.0079 0.1215 0.8986 0.8605 0.8792 0.7933
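The per-epoch table can be inspected programmatically. The sketch below embeds the rows shown above and splits on whitespace, which also handles tab-separated input; the helper `read_loss_table` is a hypothetical name, not part of Flair.

```python
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 18:51:55 0.0000 0.4518 0.0857 0.7548 0.7696 0.7621 0.6224
2 18:52:52 0.0000 0.0871 0.0563 0.8789 0.8543 0.8664 0.7729
3 18:53:49 0.0000 0.0606 0.0610 0.9110 0.8564 0.8829 0.7994
4 18:54:45 0.0000 0.0432 0.0706 0.8862 0.8450 0.8652 0.7695
5 18:55:41 0.0000 0.0324 0.0791 0.9126 0.8306 0.8697 0.7746
6 18:56:37 0.0000 0.0263 0.0874 0.8989 0.8636 0.8809 0.7977
7 18:57:33 0.0000 0.0184 0.1058 0.9079 0.8554 0.8809 0.7954
8 18:58:29 0.0000 0.0139 0.1137 0.9060 0.8564 0.8805 0.7956
9 18:59:24 0.0000 0.0117 0.1161 0.9008 0.8626 0.8813 0.7960
10 19:00:21 0.0000 0.0079 0.1215 0.8986 0.8605 0.8792 0.7933
"""


def read_loss_table(text: str) -> list:
    """Return one dict per epoch, keyed by the header names."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = read_loss_table(LOSS_TSV)
best_f1 = max(rows, key=lambda r: float(r["DEV_F1"]))
lowest_dev_loss = min(rows, key=lambda r: float(r["DEV_LOSS"]))
print("best DEV_F1:", best_f1["EPOCH"], best_f1["DEV_F1"])
print("lowest DEV_LOSS:", lowest_dev_loss["EPOCH"], lowest_dev_loss["DEV_LOSS"])
```

Dev F1 peaks at epoch 3 while dev loss is lowest at epoch 2 and rises afterwards as train loss keeps falling, which is why selecting the checkpoint by dev F1 rather than by the last epoch matters here.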
runs/events.out.tfevents.1697568660.bce904bcef33.2251.18 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:eb3bda6f0b3131c93d3901dcfdcd5a52291996e2ce499cf6f02c67719fcd8d50
+ size 407048
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,236 @@
+ 2023-10-17 18:51:00,787 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:51:00,788 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): ElectraModel(
+       (embeddings): ElectraEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): ElectraEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x ElectraLayer(
+             (attention): ElectraAttention(
+               (self): ElectraSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): ElectraSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): ElectraIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): ElectraOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:51:00,788 MultiCorpus: 5777 train + 722 dev + 723 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
+ 2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:51:00,788 Train: 5777 sentences
+ 2023-10-17 18:51:00,788 (train_with_dev=False, train_with_test=False)
+ 2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:51:00,788 Training Params:
+ 2023-10-17 18:51:00,788 - learning_rate: "3e-05"
+ 2023-10-17 18:51:00,788 - mini_batch_size: "8"
+ 2023-10-17 18:51:00,788 - max_epochs: "10"
+ 2023-10-17 18:51:00,788 - shuffle: "True"
+ 2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:51:00,788 Plugins:
+ 2023-10-17 18:51:00,788 - TensorboardLogger
+ 2023-10-17 18:51:00,788 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
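The `LinearScheduler` plugin with `warmup_fraction: '0.1'` explains the `lr` column in the iteration lines of this log: the rate climbs linearly to the configured 3e-05 and then decays linearly to zero. A minimal sketch of that shape follows. It assumes one scheduler step per mini-batch and 723 batches per epoch (taken from the `iter n/723` counters); the function names are illustrative and this is not Flair's implementation.

```python
def linear_schedule_lr(step: int, total_steps: int, peak_lr: float,
                       warmup_fraction: float) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


ITERS_PER_EPOCH = 723  # assumption: read off the "iter 72/723" counters
MAX_EPOCHS = 10        # max_epochs from the training parameters
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS


def lr_at(epoch: int, iteration: int) -> float:
    """Learning rate after `iteration` batches of the given 1-based epoch."""
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    return linear_schedule_lr(step, TOTAL_STEPS, 3e-05, 0.1)


for epoch, iteration in [(1, 72), (1, 720), (2, 720), (10, 720)]:
    print(f"epoch {epoch} iter {iteration}: lr {lr_at(epoch, iteration):.6f}")
```

Under these assumptions the sketch gives 0.000003, 0.000030, 0.000027 and 0.000000 for the four sampled points, which agrees with the rounded `lr` values logged at the same iterations. With a warmup fraction of 0.1 and 10 epochs, warmup covers exactly the first epoch.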
+ 2023-10-17 18:51:00,788 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-17 18:51:00,788 - metric: "('micro avg', 'f1-score')"
+ 2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:51:00,788 Computation:
+ 2023-10-17 18:51:00,788 - compute on device: cuda:0
+ 2023-10-17 18:51:00,788 - embedding storage: none
+ 2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:51:00,788 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
+ 2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:51:00,789 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:51:00,789 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-17 18:51:05,999 epoch 1 - iter 72/723 - loss 2.73572129 - time (sec): 5.21 - samples/sec: 3228.40 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 18:51:11,090 epoch 1 - iter 144/723 - loss 1.71636677 - time (sec): 10.30 - samples/sec: 3297.56 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 18:51:16,224 epoch 1 - iter 216/723 - loss 1.21015689 - time (sec): 15.43 - samples/sec: 3319.86 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 18:51:21,390 epoch 1 - iter 288/723 - loss 0.94420268 - time (sec): 20.60 - samples/sec: 3347.59 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 18:51:26,193 epoch 1 - iter 360/723 - loss 0.78590918 - time (sec): 25.40 - samples/sec: 3395.27 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 18:51:31,477 epoch 1 - iter 432/723 - loss 0.67292077 - time (sec): 30.69 - samples/sec: 3400.44 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 18:51:36,674 epoch 1 - iter 504/723 - loss 0.59756926 - time (sec): 35.88 - samples/sec: 3400.07 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 18:51:42,043 epoch 1 - iter 576/723 - loss 0.53717802 - time (sec): 41.25 - samples/sec: 3387.98 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 18:51:47,518 epoch 1 - iter 648/723 - loss 0.48924915 - time (sec): 46.73 - samples/sec: 3371.37 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 18:51:52,896 epoch 1 - iter 720/723 - loss 0.45323298 - time (sec): 52.11 - samples/sec: 3367.96 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 18:51:53,104 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:51:53,105 EPOCH 1 done: loss 0.4518 - lr: 0.000030
+ 2023-10-17 18:51:55,813 DEV : loss 0.08571955561637878 - f1-score (micro avg) 0.7621
+ 2023-10-17 18:51:55,830 saving best model
+ 2023-10-17 18:51:56,351 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:52:01,211 epoch 2 - iter 72/723 - loss 0.09693799 - time (sec): 4.86 - samples/sec: 3414.32 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 18:52:06,705 epoch 2 - iter 144/723 - loss 0.09826077 - time (sec): 10.35 - samples/sec: 3307.89 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 18:52:11,708 epoch 2 - iter 216/723 - loss 0.09503249 - time (sec): 15.36 - samples/sec: 3367.88 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 18:52:16,853 epoch 2 - iter 288/723 - loss 0.09891978 - time (sec): 20.50 - samples/sec: 3362.88 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 18:52:22,304 epoch 2 - iter 360/723 - loss 0.09274178 - time (sec): 25.95 - samples/sec: 3380.81 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 18:52:27,873 epoch 2 - iter 432/723 - loss 0.08927963 - time (sec): 31.52 - samples/sec: 3401.91 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 18:52:32,724 epoch 2 - iter 504/723 - loss 0.09084334 - time (sec): 36.37 - samples/sec: 3381.60 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 18:52:38,353 epoch 2 - iter 576/723 - loss 0.09010738 - time (sec): 42.00 - samples/sec: 3375.35 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 18:52:43,617 epoch 2 - iter 648/723 - loss 0.08974176 - time (sec): 47.26 - samples/sec: 3348.81 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 18:52:48,857 epoch 2 - iter 720/723 - loss 0.08723098 - time (sec): 52.50 - samples/sec: 3343.67 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 18:52:49,018 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:52:49,019 EPOCH 2 done: loss 0.0871 - lr: 0.000027
+ 2023-10-17 18:52:52,238 DEV : loss 0.05628642439842224 - f1-score (micro avg) 0.8664
+ 2023-10-17 18:52:52,255 saving best model
+ 2023-10-17 18:52:52,649 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:52:58,275 epoch 3 - iter 72/723 - loss 0.06108376 - time (sec): 5.62 - samples/sec: 3077.41 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 18:53:03,076 epoch 3 - iter 144/723 - loss 0.06229454 - time (sec): 10.43 - samples/sec: 3255.95 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 18:53:08,621 epoch 3 - iter 216/723 - loss 0.06269040 - time (sec): 15.97 - samples/sec: 3261.71 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 18:53:14,059 epoch 3 - iter 288/723 - loss 0.05782452 - time (sec): 21.41 - samples/sec: 3273.63 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 18:53:19,376 epoch 3 - iter 360/723 - loss 0.05776341 - time (sec): 26.73 - samples/sec: 3305.07 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 18:53:24,787 epoch 3 - iter 432/723 - loss 0.06112370 - time (sec): 32.14 - samples/sec: 3288.95 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 18:53:29,871 epoch 3 - iter 504/723 - loss 0.06276585 - time (sec): 37.22 - samples/sec: 3302.02 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 18:53:35,301 epoch 3 - iter 576/723 - loss 0.06100180 - time (sec): 42.65 - samples/sec: 3318.70 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 18:53:40,357 epoch 3 - iter 648/723 - loss 0.06087244 - time (sec): 47.71 - samples/sec: 3323.88 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 18:53:45,595 epoch 3 - iter 720/723 - loss 0.06067803 - time (sec): 52.94 - samples/sec: 3316.47 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 18:53:45,776 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:53:45,777 EPOCH 3 done: loss 0.0606 - lr: 0.000023
+ 2023-10-17 18:53:48,994 DEV : loss 0.061015695333480835 - f1-score (micro avg) 0.8829
+ 2023-10-17 18:53:49,010 saving best model
+ 2023-10-17 18:53:49,436 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:53:54,717 epoch 4 - iter 72/723 - loss 0.04387038 - time (sec): 5.28 - samples/sec: 3489.64 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 18:54:00,075 epoch 4 - iter 144/723 - loss 0.03977768 - time (sec): 10.64 - samples/sec: 3434.68 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 18:54:05,046 epoch 4 - iter 216/723 - loss 0.04174055 - time (sec): 15.61 - samples/sec: 3394.76 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 18:54:10,454 epoch 4 - iter 288/723 - loss 0.04295155 - time (sec): 21.02 - samples/sec: 3371.28 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 18:54:15,352 epoch 4 - iter 360/723 - loss 0.04184529 - time (sec): 25.92 - samples/sec: 3370.25 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 18:54:20,627 epoch 4 - iter 432/723 - loss 0.04216528 - time (sec): 31.19 - samples/sec: 3358.32 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 18:54:25,677 epoch 4 - iter 504/723 - loss 0.04174425 - time (sec): 36.24 - samples/sec: 3385.08 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 18:54:31,093 epoch 4 - iter 576/723 - loss 0.04256262 - time (sec): 41.66 - samples/sec: 3366.49 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 18:54:36,301 epoch 4 - iter 648/723 - loss 0.04255295 - time (sec): 46.86 - samples/sec: 3360.15 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 18:54:41,551 epoch 4 - iter 720/723 - loss 0.04326055 - time (sec): 52.11 - samples/sec: 3372.14 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 18:54:41,713 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:54:41,713 EPOCH 4 done: loss 0.0432 - lr: 0.000020
+ 2023-10-17 18:54:45,293 DEV : loss 0.07059507817029953 - f1-score (micro avg) 0.8652
+ 2023-10-17 18:54:45,309 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:54:50,582 epoch 5 - iter 72/723 - loss 0.03765946 - time (sec): 5.27 - samples/sec: 3200.52 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 18:54:55,435 epoch 5 - iter 144/723 - loss 0.03376893 - time (sec): 10.12 - samples/sec: 3269.32 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 18:55:01,381 epoch 5 - iter 216/723 - loss 0.03444213 - time (sec): 16.07 - samples/sec: 3259.98 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 18:55:06,439 epoch 5 - iter 288/723 - loss 0.03201450 - time (sec): 21.13 - samples/sec: 3278.94 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 18:55:11,837 epoch 5 - iter 360/723 - loss 0.03021248 - time (sec): 26.53 - samples/sec: 3268.82 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 18:55:17,061 epoch 5 - iter 432/723 - loss 0.03095066 - time (sec): 31.75 - samples/sec: 3295.73 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 18:55:22,311 epoch 5 - iter 504/723 - loss 0.03205508 - time (sec): 37.00 - samples/sec: 3318.65 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 18:55:27,501 epoch 5 - iter 576/723 - loss 0.03251336 - time (sec): 42.19 - samples/sec: 3325.42 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 18:55:32,561 epoch 5 - iter 648/723 - loss 0.03276588 - time (sec): 47.25 - samples/sec: 3329.33 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 18:55:37,985 epoch 5 - iter 720/723 - loss 0.03238438 - time (sec): 52.67 - samples/sec: 3338.39 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 18:55:38,138 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:55:38,139 EPOCH 5 done: loss 0.0324 - lr: 0.000017
+ 2023-10-17 18:55:41,451 DEV : loss 0.07911184430122375 - f1-score (micro avg) 0.8697
+ 2023-10-17 18:55:41,469 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:55:46,855 epoch 6 - iter 72/723 - loss 0.01939646 - time (sec): 5.38 - samples/sec: 3381.89 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 18:55:52,097 epoch 6 - iter 144/723 - loss 0.02202295 - time (sec): 10.63 - samples/sec: 3376.79 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 18:55:57,312 epoch 6 - iter 216/723 - loss 0.02329081 - time (sec): 15.84 - samples/sec: 3385.10 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 18:56:03,153 epoch 6 - iter 288/723 - loss 0.02709437 - time (sec): 21.68 - samples/sec: 3284.61 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 18:56:08,569 epoch 6 - iter 360/723 - loss 0.02852147 - time (sec): 27.10 - samples/sec: 3309.49 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 18:56:13,824 epoch 6 - iter 432/723 - loss 0.02704669 - time (sec): 32.35 - samples/sec: 3322.19 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 18:56:18,846 epoch 6 - iter 504/723 - loss 0.02698341 - time (sec): 37.38 - samples/sec: 3337.29 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 18:56:23,707 epoch 6 - iter 576/723 - loss 0.02713054 - time (sec): 42.24 - samples/sec: 3344.91 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 18:56:28,856 epoch 6 - iter 648/723 - loss 0.02638017 - time (sec): 47.38 - samples/sec: 3348.51 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 18:56:33,896 epoch 6 - iter 720/723 - loss 0.02633353 - time (sec): 52.43 - samples/sec: 3352.29 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 18:56:34,069 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:56:34,069 EPOCH 6 done: loss 0.0263 - lr: 0.000013
+ 2023-10-17 18:56:37,242 DEV : loss 0.08742444217205048 - f1-score (micro avg) 0.8809
+ 2023-10-17 18:56:37,259 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:56:42,537 epoch 7 - iter 72/723 - loss 0.01010414 - time (sec): 5.28 - samples/sec: 3350.03 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 18:56:47,589 epoch 7 - iter 144/723 - loss 0.01933338 - time (sec): 10.33 - samples/sec: 3321.61 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 18:56:53,263 epoch 7 - iter 216/723 - loss 0.01860946 - time (sec): 16.00 - samples/sec: 3316.41 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 18:56:58,767 epoch 7 - iter 288/723 - loss 0.01987477 - time (sec): 21.51 - samples/sec: 3329.15 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 18:57:04,192 epoch 7 - iter 360/723 - loss 0.01978143 - time (sec): 26.93 - samples/sec: 3325.96 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 18:57:09,642 epoch 7 - iter 432/723 - loss 0.02023814 - time (sec): 32.38 - samples/sec: 3310.10 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 18:57:14,799 epoch 7 - iter 504/723 - loss 0.01949429 - time (sec): 37.54 - samples/sec: 3315.96 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 18:57:19,826 epoch 7 - iter 576/723 - loss 0.01865648 - time (sec): 42.57 - samples/sec: 3327.50 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 18:57:24,812 epoch 7 - iter 648/723 - loss 0.01853140 - time (sec): 47.55 - samples/sec: 3333.24 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 18:57:30,046 epoch 7 - iter 720/723 - loss 0.01839669 - time (sec): 52.79 - samples/sec: 3328.86 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 18:57:30,201 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:57:30,201 EPOCH 7 done: loss 0.0184 - lr: 0.000010
+ 2023-10-17 18:57:33,757 DEV : loss 0.10578546673059464 - f1-score (micro avg) 0.8809
+ 2023-10-17 18:57:33,774 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:57:38,932 epoch 8 - iter 72/723 - loss 0.00841995 - time (sec): 5.16 - samples/sec: 3443.43 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 18:57:44,074 epoch 8 - iter 144/723 - loss 0.01229420 - time (sec): 10.30 - samples/sec: 3426.89 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 18:57:49,096 epoch 8 - iter 216/723 - loss 0.01355410 - time (sec): 15.32 - samples/sec: 3395.51 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 18:57:54,305 epoch 8 - iter 288/723 - loss 0.01394882 - time (sec): 20.53 - samples/sec: 3385.90 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 18:57:59,323 epoch 8 - iter 360/723 - loss 0.01323447 - time (sec): 25.55 - samples/sec: 3374.12 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 18:58:04,498 epoch 8 - iter 432/723 - loss 0.01270974 - time (sec): 30.72 - samples/sec: 3377.16 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 18:58:09,812 epoch 8 - iter 504/723 - loss 0.01255571 - time (sec): 36.04 - samples/sec: 3355.95 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 18:58:15,497 epoch 8 - iter 576/723 - loss 0.01341256 - time (sec): 41.72 - samples/sec: 3360.99 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 18:58:20,693 epoch 8 - iter 648/723 - loss 0.01371160 - time (sec): 46.92 - samples/sec: 3357.01 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 18:58:26,278 epoch 8 - iter 720/723 - loss 0.01394702 - time (sec): 52.50 - samples/sec: 3344.06 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 18:58:26,474 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:58:26,474 EPOCH 8 done: loss 0.0139 - lr: 0.000007
+ 2023-10-17 18:58:29,679 DEV : loss 0.11371435225009918 - f1-score (micro avg) 0.8805
+ 2023-10-17 18:58:29,695 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:58:35,055 epoch 9 - iter 72/723 - loss 0.01003235 - time (sec): 5.36 - samples/sec: 3286.54 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 18:58:40,336 epoch 9 - iter 144/723 - loss 0.00981857 - time (sec): 10.64 - samples/sec: 3403.11 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 18:58:45,118 epoch 9 - iter 216/723 - loss 0.01076096 - time (sec): 15.42 - samples/sec: 3432.99 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 18:58:50,044 epoch 9 - iter 288/723 - loss 0.01027158 - time (sec): 20.35 - samples/sec: 3464.18 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 18:58:55,521 epoch 9 - iter 360/723 - loss 0.00989332 - time (sec): 25.82 - samples/sec: 3429.30 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 18:59:00,544 epoch 9 - iter 432/723 - loss 0.00992623 - time (sec): 30.85 - samples/sec: 3432.33 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 18:59:06,392 epoch 9 - iter 504/723 - loss 0.01080183 - time (sec): 36.70 - samples/sec: 3398.02 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 18:59:11,482 epoch 9 - iter 576/723 - loss 0.01069467 - time (sec): 41.79 - samples/sec: 3391.20 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 18:59:16,665 epoch 9 - iter 648/723 - loss 0.01087289 - time (sec): 46.97 - samples/sec: 3403.56 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 18:59:21,382 epoch 9 - iter 720/723 - loss 0.01170845 - time (sec): 51.69 - samples/sec: 3401.34 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 18:59:21,537 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:59:21,537 EPOCH 9 done: loss 0.0117 - lr: 0.000003
+ 2023-10-17 18:59:24,756 DEV : loss 0.11608566343784332 - f1-score (micro avg) 0.8813
+ 2023-10-17 18:59:24,773 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 18:59:30,213 epoch 10 - iter 72/723 - loss 0.01498393 - time (sec): 5.44 - samples/sec: 3305.97 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 18:59:35,092 epoch 10 - iter 144/723 - loss 0.00950962 - time (sec): 10.32 - samples/sec: 3395.52 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 18:59:40,454 epoch 10 - iter 216/723 - loss 0.00882272 - time (sec): 15.68 - samples/sec: 3373.59 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 18:59:45,853 epoch 10 - iter 288/723 - loss 0.00879424 - time (sec): 21.08 - samples/sec: 3350.52 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 18:59:50,927 epoch 10 - iter 360/723 - loss 0.00903110 - time (sec): 26.15 - samples/sec: 3359.20 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 18:59:56,386 epoch 10 - iter 432/723 - loss 0.00841109 - time (sec): 31.61 - samples/sec: 3355.32 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 19:00:01,775 epoch 10 - iter 504/723 - loss 0.00806376 - time (sec): 37.00 - samples/sec: 3325.43 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 19:00:06,842 epoch 10 - iter 576/723 - loss 0.00790310 - time (sec): 42.07 - samples/sec: 3324.90 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 19:00:12,025 epoch 10 - iter 648/723 - loss 0.00804525 - time (sec): 47.25 - samples/sec: 3339.23 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 19:00:17,383 epoch 10 - iter 720/723 - loss 0.00796014 - time (sec): 52.61 - samples/sec: 3342.17 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 19:00:17,532 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:00:17,533 EPOCH 10 done: loss 0.0079 - lr: 0.000000
+ 2023-10-17 19:00:21,687 DEV : loss 0.12145841866731644 - f1-score (micro avg) 0.8792
+ 2023-10-17 19:00:22,119 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 19:00:22,121 Loading model from best epoch ...
+ 2023-10-17 19:00:23,863 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
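The 13-entry tag dictionary is what a BIOES-style scheme yields for three entity types, and it matches the `out_features=13` of the tagger's final linear layer. A small sketch, assuming the prefix order S, B, E, I seen in the log line above; the helper `bioes_tags` is a hypothetical name.

```python
def bioes_tags(entity_types):
    """Build the tag list: "O" plus S-, B-, E- and I- tags per entity type."""
    tags = ["O"]
    for entity in entity_types:
        tags.extend(f"{prefix}-{entity}" for prefix in ("S", "B", "E", "I"))
    return tags


tags = bioes_tags(["LOC", "PER", "ORG"])
print(len(tags), ", ".join(tags))
```

One outside tag plus four position tags for each of LOC, PER and ORG gives 1 + 3 * 4 = 13 labels.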
+ 2023-10-17 19:00:27,596
+ Results:
+ - F-score (micro) 0.8643
+ - F-score (macro) 0.7231
+ - Accuracy 0.7673
+
+ By class:
+               precision    recall  f1-score   support
+
+          PER     0.8669    0.8651    0.8660       482
+          LOC     0.9509    0.8886    0.9187       458
+          ORG     0.5714    0.2899    0.3846        69
+
+    micro avg     0.8941    0.8365    0.8643      1009
+    macro avg     0.7964    0.6812    0.7231      1009
+ weighted avg     0.8849    0.8365    0.8570      1009
+
+ 2023-10-17 19:00:27,596 ----------------------------------------------------------------------------------------------------
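The gap between the micro (0.8643) and macro (0.7231) F-scores in the final test report comes from the small, poorly recognized ORG class. The averages can be recomputed from the reported per-class rows, as sketched below. The inputs are the rounded values from the report, so the results agree only to about four decimals, and the micro F1 is derived from the reported micro precision and recall rather than from raw counts, which the log does not contain.

```python
# (precision, recall, f1, support) per class, copied from the report.
per_class = {
    "PER": (0.8669, 0.8651, 0.8660, 482),
    "LOC": (0.9509, 0.8886, 0.9187, 458),
    "ORG": (0.5714, 0.2899, 0.3846, 69),
}

total_support = sum(support for *_, support in per_class.values())

# Macro average: unweighted mean over classes, so ORG counts as much as PER.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: each class weighted by its support.
weighted_f1 = sum(f1 * support for _, _, f1, support in per_class.values()) / total_support

# Micro F1 as the harmonic mean of the reported micro precision and recall.
micro_p, micro_r = 0.8941, 0.8365
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"micro F1    {micro_f1:.4f}")
```

ORG accounts for only 69 of the 1009 test entities, so it barely moves the micro and weighted scores but pulls the macro score down by a third of its shortfall.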