supreethrao committed on
Commit
3224233
1 Parent(s): bf77c15

Model save

README.md ADDED
@@ -0,0 +1,333 @@
1
+ ---
2
+ library_name: span-marker
3
+ tags:
4
+ - span-marker
5
+ - token-classification
6
+ - ner
7
+ - named-entity-recognition
8
+ - generated_from_span_marker_trainer
9
+ datasets:
10
+ - DFKI-SLT/few-nerd
11
+ metrics:
12
+ - precision
13
+ - recall
14
+ - f1
15
+ widget:
16
+ - text: In response, in May or June 1125, a 3,000-strong Crusader coalition commanded
17
+ by King Baldwin II of Jerusalem confronted and defeated the 15,000-strong Muslim
18
+ coalition at the Battle of Azaz, raising the siege of the town.
19
+ - text: Cardenal made several visits to Jesuit universities in the United States,
20
+ including the University of Detroit Mercy in 2013, and the John Carroll University
21
+ in 2014.
22
+ - text: Other super-spreaders, defined as those that transmit SARS to at least eight
23
+ other people, included the incidents at the Hotel Metropole in Hong Kong, the
24
+ Amoy Gardens apartment complex in Hong Kong and one in an acute care hospital
25
+ in Toronto, Ontario, Canada.
26
+ - text: The District Court for the Northern District of California rejected 321 Studios'
27
+ claims for declaratory relief, holding that both DVD Copy Plus and DVD-X Copy
28
+ violated the DMCA and that the DMCA was not unconstitutional.
29
+ - text: The Sunday Edition is a television programme broadcast on the ITV Network
30
+ in the United Kingdom focusing on political interview and discussion, produced
31
+ by ITV Productions.
32
+ pipeline_tag: token-classification
33
+ model-index:
34
+ - name: SpanMarker
35
+ results:
36
+ - task:
37
+ type: token-classification
38
+ name: Named Entity Recognition
39
+ dataset:
40
+ name: Unknown
41
+ type: DFKI-SLT/few-nerd
42
+ split: test
43
+ metrics:
44
+ - type: f1
45
+ value: 0.703084859534267
46
+ name: F1
47
+ - type: precision
48
+ value: 0.7034273336857051
49
+ name: Precision
50
+ - type: recall
51
+ value: 0.7027427186979075
52
+ name: Recall
53
+ ---
54
+
55
+ # SpanMarker
56
+
57
+ This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model trained on the [DFKI-SLT/few-nerd](https://huggingface.co/datasets/DFKI-SLT/few-nerd) dataset that can be used for Named Entity Recognition.
58
+
59
+ ## Model Details
60
+
61
+ ### Model Description
62
+ - **Model Type:** SpanMarker
63
+ - **Encoder:** `numind/generic-entity_recognition_NER-v1` (RoBERTa-base)
64
+ - **Maximum Sequence Length:** 256 tokens
65
+ - **Maximum Entity Length:** 8 words
66
+ - **Training Dataset:** [DFKI-SLT/few-nerd](https://huggingface.co/datasets/DFKI-SLT/few-nerd)
67
+ <!-- - **Language:** Unknown -->
68
+ <!-- - **License:** Unknown -->
69
+
70
+ ### Model Sources
71
+
72
+ - **Repository:** [SpanMarker on GitHub](https://github.com/tomaarsen/SpanMarkerNER)
73
+ - **Thesis:** [SpanMarker For Named Entity Recognition](https://raw.githubusercontent.com/tomaarsen/SpanMarkerNER/main/thesis.pdf)
74
+
75
+ ### Model Labels
76
+ | Label | Examples |
77
+ |:-----------------------------------------|:---------------------------------------------------------------------------------------------------------|
78
+ | art-broadcastprogram | "Street Cents", "Corazones", "The Gale Storm Show : Oh , Susanna" |
79
+ | art-film | "L'Atlantide", "Shawshank Redemption", "Bosch" |
80
+ | art-music | "Champion Lover", "Atkinson , Danko and Ford ( with Brockie and Hilton )", "Hollywood Studio Symphony" |
81
+ | art-other | "Aphrodite of Milos", "The Today Show", "Venus de Milo" |
82
+ | art-painting | "Production/Reproduction", "Cofiwch Dryweryn", "Touit" |
83
+ | art-writtenart | "Time", "Imelda de ' Lambertazzi", "The Seven Year Itch" |
84
+ | building-airport | "Sheremetyevo International Airport", "Luton Airport", "Newark Liberty International Airport" |
85
+ | building-hospital | "Yeungnam University Hospital", "Memorial Sloan-Kettering Cancer Center", "Hokkaido University Hospital" |
86
+ | building-hotel | "Radisson Blu Sea Plaza Hotel", "Flamingo Hotel", "The Standard Hotel" |
87
+ | building-library | "British Library", "Berlin State Library", "Bayerische Staatsbibliothek" |
88
+ | building-other | "Communiplex", "Henry Ford Museum", "Alpha Recording Studios" |
89
+ | building-restaurant | "Carnegie Deli", "Trumbull", "Fatburger" |
90
+ | building-sportsfacility | "Sports Center", "Boston Garden", "Glenn Warner Soccer Facility" |
91
+ | building-theater | "Sanders Theatre", "Pittsburgh Civic Light Opera", "National Paris Opera" |
92
+ | event-attack/battle/war/militaryconflict | "Vietnam War", "Jurist", "Easter Offensive" |
93
+ | event-disaster | "1990s North Korean famine", "the 1912 North Mount Lyell Disaster", "1693 Sicily earthquake" |
94
+ | event-election | "1982 Mitcham and Morden by-election", "Elections to the European Parliament", "March 1898 elections" |
95
+ | event-other | "Eastwood Scoring Stage", "Union for a Popular Movement", "Masaryk Democratic Movement" |
96
+ | event-protest | "French Revolution", "Iranian Constitutional Revolution", "Russian Revolution" |
97
+ | event-sportsevent | "World Cup", "National Champions", "Stanley Cup" |
98
+ | location-GPE | "Mediterranean Basin", "the Republic of Croatia", "Croatian" |
99
+ | location-bodiesofwater | "Arthur Kill", "Atatürk Dam Lake", "Norfolk coast" |
100
+ | location-island | "Staten Island", "new Samsat district", "Laccadives" |
101
+ | location-mountain | "Miteirya Ridge", "Ruweisat Ridge", "Salamander Glacier" |
102
+ | location-other | "Northern City Line", "Victoria line", "Cartuther" |
103
+ | location-park | "Painted Desert Community Complex Historic District", "Gramercy Park", "Shenandoah National Park" |
104
+ | location-road/railway/highway/transit | "NJT", "Newark-Elizabeth Rail Link", "Friern Barnet Road" |
105
+ | organization-company | "Church 's Chicken", "Texas Chicken", "Dixy Chicken" |
106
+ | organization-education | "Barnard College", "MIT", "Belfast Royal Academy and the Ulster College of Physical Education" |
107
+ | organization-government/governmentagency | "Diet", "Supreme Court", "Congregazione dei Nobili" |
108
+ | organization-media/newspaper | "Al Jazeera", "Clash", "TimeOut Melbourne" |
109
+ | organization-other | "Defence Sector C", "4th Army", "IAEA" |
110
+ | organization-politicalparty | "Al Wafa ' Islamic", "Shimpotō", "Kenseitō" |
111
+ | organization-religion | "Jewish", "UPCUSA", "Christian" |
112
+ | organization-showorganization | "Mr. Mister", "Lizzy", "Bochumer Symphoniker" |
113
+ | organization-sportsleague | "NHL", "First Division", "China League One" |
114
+ | organization-sportsteam | "Arsenal", "Luc Alphand Aventures", "Tottenham" |
115
+ | other-astronomything | "Algol", "Zodiac", "`` Caput Larvae ''" |
116
+ | other-award | "Order of the Republic of Guinea and Nigeria", "GCON", "Grand Commander of the Order of the Niger" |
117
+ | other-biologything | "Amphiphysin", "BAR", "N-terminal lipid" |
118
+ | other-chemicalthing | "sulfur", "uranium", "carbon dioxide" |
119
+ | other-currency | "$", "Travancore Rupee", "lac crore" |
120
+ | other-disease | "hypothyroidism", "bladder cancer", "French Dysentery Epidemic of 1779" |
121
+ | other-educationaldegree | "BSc ( Hons ) in physics", "Master", "Bachelor" |
122
+ | other-god | "El", "Raijin", "Fujin" |
123
+ | other-language | "Latin", "English", "Breton-speaking" |
124
+ | other-law | "United States Freedom Support Act", "Thirty Years ' Peace", "Leahy–Smith America Invents Act ( AIA" |
125
+ | other-livingthing | "insects", "monkeys", "patchouli" |
126
+ | other-medical | "pediatrician", "Pediatrics", "amitriptyline" |
127
+ | person-actor | "Edmund Payne", "Tchéky Karyo", "Ellaline Terriss" |
128
+ | person-artist/author | "Gaetano Donizett", "George Axelrod", "Hicks" |
129
+ | person-athlete | "Tozawa", "Jaguar", "Neville" |
130
+ | person-director | "Bob Swaim", "Frank Darabont", "Richard Quine" |
131
+ | person-other | "Holden", "Richard Benson", "Campbell" |
132
+ | person-politician | "Rivière", "Emeric", "William" |
133
+ | person-scholar | "Stalmine", "Wurdack", "Stedman" |
134
+ | person-soldier | "Krukenberg", "Joachim Ziegler", "Helmuth Weidling" |
135
+ | product-airplane | "EC135T2 CPDS", "Spey-equipped FGR.2s", "Luton" |
136
+ | product-car | "100EX", "Corvettes - GT1 C6R", "Phantom" |
137
+ | product-food | "yakiniku", "V. labrusca", "red grape" |
138
+ | product-game | "Airforce Delta", "Splinter Cell", "Hardcore RPG" |
139
+ | product-other | "X11", "Fairbottom Bobs", "PDP-1" |
140
+ | product-ship | "Essex", "HMS `` Chinkara ''", "Congress" |
141
+ | product-software | "Wikipedia", "Apdf", "AmiPDF" |
142
+ | product-train | "High Speed Trains", "Royal Scots Grey", "55022" |
143
+ | product-weapon | "ZU-23-2M Wróbel", "AR-15 's", "ZU-23-2MR Wróbel II" |
144
+
145
+ ## Evaluation
146
+
147
+ ### Metrics
148
+ | Label | Precision | Recall | F1 |
149
+ |:-----------------------------------------|:----------|:-------|:-------|
150
+ | **all** | 0.7034 | 0.7027 | 0.7031 |
151
+ | art-broadcastprogram | 0.6024 | 0.5904 | 0.5963 |
152
+ | art-film | 0.7761 | 0.7533 | 0.7645 |
153
+ | art-music | 0.7825 | 0.7551 | 0.7685 |
154
+ | art-other | 0.4193 | 0.3327 | 0.3710 |
155
+ | art-painting | 0.5882 | 0.5263 | 0.5556 |
156
+ | art-writtenart | 0.6819 | 0.6488 | 0.6649 |
157
+ | building-airport | 0.8064 | 0.8352 | 0.8205 |
158
+ | building-hospital | 0.7282 | 0.8022 | 0.7634 |
159
+ | building-hotel | 0.7033 | 0.7245 | 0.7138 |
160
+ | building-library | 0.7550 | 0.7380 | 0.7464 |
161
+ | building-other | 0.5867 | 0.5840 | 0.5853 |
162
+ | building-restaurant | 0.6205 | 0.5216 | 0.5667 |
163
+ | building-sportsfacility | 0.6113 | 0.7976 | 0.6921 |
164
+ | building-theater | 0.7060 | 0.7495 | 0.7271 |
165
+ | event-attack/battle/war/militaryconflict | 0.7945 | 0.7395 | 0.7660 |
166
+ | event-disaster | 0.5604 | 0.5604 | 0.5604 |
167
+ | event-election | 0.4286 | 0.1484 | 0.2204 |
168
+ | event-other | 0.4885 | 0.4400 | 0.4629 |
169
+ | event-protest | 0.3798 | 0.4759 | 0.4225 |
170
+ | event-sportsevent | 0.6198 | 0.6162 | 0.6180 |
171
+ | location-GPE | 0.8157 | 0.8552 | 0.8350 |
172
+ | location-bodiesofwater | 0.7268 | 0.7690 | 0.7473 |
173
+ | location-island | 0.7504 | 0.6842 | 0.7158 |
174
+ | location-mountain | 0.7352 | 0.7298 | 0.7325 |
175
+ | location-other | 0.4427 | 0.3104 | 0.3649 |
176
+ | location-park | 0.7153 | 0.6856 | 0.7001 |
177
+ | location-road/railway/highway/transit | 0.7090 | 0.7324 | 0.7205 |
178
+ | organization-company | 0.6963 | 0.7061 | 0.7012 |
179
+ | organization-education | 0.7994 | 0.7986 | 0.7990 |
180
+ | organization-government/governmentagency | 0.5524 | 0.4533 | 0.4980 |
181
+ | organization-media/newspaper | 0.6513 | 0.6656 | 0.6584 |
182
+ | organization-other | 0.5978 | 0.5375 | 0.5661 |
183
+ | organization-politicalparty | 0.6793 | 0.7315 | 0.7044 |
184
+ | organization-religion | 0.5575 | 0.6131 | 0.5840 |
185
+ | organization-showorganization | 0.6035 | 0.5839 | 0.5935 |
186
+ | organization-sportsleague | 0.6393 | 0.6610 | 0.6499 |
187
+ | organization-sportsteam | 0.7259 | 0.7796 | 0.7518 |
188
+ | other-astronomything | 0.7794 | 0.8024 | 0.7907 |
189
+ | other-award | 0.7180 | 0.6649 | 0.6904 |
190
+ | other-biologything | 0.6864 | 0.6238 | 0.6536 |
191
+ | other-chemicalthing | 0.5688 | 0.6036 | 0.5856 |
192
+ | other-currency | 0.6996 | 0.8423 | 0.7643 |
193
+ | other-disease | 0.6591 | 0.7410 | 0.6977 |
194
+ | other-educationaldegree | 0.6114 | 0.6198 | 0.6156 |
195
+ | other-god | 0.6486 | 0.7181 | 0.6816 |
196
+ | other-language | 0.6507 | 0.8313 | 0.7300 |
197
+ | other-law | 0.6934 | 0.7331 | 0.7127 |
198
+ | other-livingthing | 0.6019 | 0.6605 | 0.6298 |
199
+ | other-medical | 0.5124 | 0.5214 | 0.5169 |
200
+ | person-actor | 0.8384 | 0.8051 | 0.8214 |
201
+ | person-artist/author | 0.7122 | 0.7531 | 0.7321 |
202
+ | person-athlete | 0.8318 | 0.8422 | 0.8370 |
203
+ | person-director | 0.7083 | 0.7365 | 0.7221 |
204
+ | person-other | 0.6833 | 0.6737 | 0.6785 |
205
+ | person-politician | 0.6807 | 0.6836 | 0.6822 |
206
+ | person-scholar | 0.5397 | 0.5209 | 0.5301 |
207
+ | person-soldier | 0.5053 | 0.5920 | 0.5452 |
208
+ | product-airplane | 0.6617 | 0.6692 | 0.6654 |
209
+ | product-car | 0.7313 | 0.7132 | 0.7222 |
210
+ | product-food | 0.5787 | 0.5787 | 0.5787 |
211
+ | product-game | 0.7364 | 0.7140 | 0.7250 |
212
+ | product-other | 0.5567 | 0.4210 | 0.4795 |
213
+ | product-ship | 0.6842 | 0.6842 | 0.6842 |
214
+ | product-software | 0.6495 | 0.6648 | 0.6570 |
215
+ | product-train | 0.5942 | 0.5924 | 0.5933 |
216
+ | product-weapon | 0.6435 | 0.5353 | 0.5844 |
217
+
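+ These figures are computed on the held-out test split. Below is a minimal sketch of how the overall numbers could be re-checked; it assumes the `supervised` configuration of the dataset and that the fine-grained tag column (`fine_ner_tags`) is exposed to the trainer as `ner_tags`, since the exact preprocessing used for this run is not recorded in the card.
+
+ ```python
+ from datasets import load_dataset
+ from span_marker import SpanMarkerModel, Trainer
+
+ # Assumed: the "supervised" config of few-nerd, with the fine-grained labels renamed to "ner_tags"
+ test_dataset = load_dataset("DFKI-SLT/few-nerd", "supervised", split="test")
+ test_dataset = test_dataset.remove_columns("ner_tags").rename_column("fine_ner_tags", "ner_tags")
+
+ model = SpanMarkerModel.from_pretrained("supreethrao/instructNER_fewnerd_xl")
+ trainer = Trainer(model=model, eval_dataset=test_dataset)
+ metrics = trainer.evaluate(metric_key_prefix="test")
+ print(metrics["test_overall_precision"], metrics["test_overall_recall"], metrics["test_overall_f1"])
+ ```
+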
218
+ ## Uses
219
+
220
+ ### Direct Use for Inference
221
+
222
+ ```python
223
+ from span_marker import SpanMarkerModel
224
+
225
+ # Download from the 🤗 Hub
226
+ model = SpanMarkerModel.from_pretrained("supreethrao/instructNER_fewnerd_xl")
227
+ # Run inference
228
+ entities = model.predict("The Sunday Edition is a television programme broadcast on the ITV Network in the United Kingdom focusing on political interview and discussion, produced by ITV Productions.")
229
+ ```
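+
+ `predict` returns one dictionary per detected entity. A quick way to inspect the spans, labels and confidence scores (field names as used by recent `span_marker` releases; treat them as indicative):
+
+ ```python
+ for entity in entities:
+     # Each entry carries the matched text, its predicted label and a confidence score
+     print(entity["span"], entity["label"], round(entity["score"], 3))
+ ```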
230
+
231
+ ### Downstream Use
232
+ You can finetune this model on your own dataset.
233
+
234
+ <details><summary>Click to expand</summary>
235
+
236
+ ```python
237
+ from datasets import load_dataset
+ from span_marker import SpanMarkerModel, Trainer
238
+
239
+ # Download from the 🤗 Hub
240
+ model = SpanMarkerModel.from_pretrained("supreethrao/instructNER_fewnerd_xl")
241
+
242
+ # Specify a Dataset with "tokens" and "ner_tags" columns
243
+ dataset = load_dataset("conll2003") # For example CoNLL2003
244
+
245
+ # Initialize a Trainer using the pretrained model & dataset
246
+ trainer = Trainer(
247
+     model=model,
248
+     train_dataset=dataset["train"],
249
+     eval_dataset=dataset["validation"],
250
+ )
251
+ trainer.train()
252
+ trainer.save_model("supreethrao/instructNER_fewnerd_xl-finetuned")
253
+ ```
254
+ </details>
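+
+ After training, the saved checkpoint can be loaded back exactly like the original model, for example:
+
+ ```python
+ finetuned_model = SpanMarkerModel.from_pretrained("supreethrao/instructNER_fewnerd_xl-finetuned")
+ ```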
255
+
256
+ <!--
257
+ ### Out-of-Scope Use
258
+
259
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
260
+ -->
261
+
262
+ <!--
263
+ ## Bias, Risks and Limitations
264
+
265
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
266
+ -->
267
+
268
+ <!--
269
+ ### Recommendations
270
+
271
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
272
+ -->
273
+
274
+ ## Training Details
275
+
276
+ ### Training Set Metrics
277
+ | Training set | Min | Median | Max |
278
+ |:----------------------|:----|:--------|:----|
279
+ | Sentence length | 1 | 24.4945 | 267 |
280
+ | Entities per sentence | 0 | 2.5832 | 88 |
281
+
282
+ ### Training Hyperparameters
283
+ - learning_rate: 5e-05
284
+ - train_batch_size: 16
285
+ - eval_batch_size: 16
286
+ - seed: 42
287
+ - distributed_type: multi-GPU
288
+ - num_devices: 2
289
+ - total_train_batch_size: 32
290
+ - total_eval_batch_size: 32
291
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
292
+ - lr_scheduler_type: linear
293
+ - lr_scheduler_warmup_ratio: 0.1
294
+ - num_epochs: 3
295
+ - mixed_precision_training: Native AMP
296
+
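+ The listed values are standard 🤗 `TrainingArguments` fields; a per-device batch size of 16 on 2 GPUs gives the total batch size of 32, and the Adam betas/epsilon are the defaults shown above. A rough sketch of the equivalent configuration (the output path is illustrative):
+
+ ```python
+ from transformers import TrainingArguments
+
+ args = TrainingArguments(
+     output_dir="instructNER_fewnerd_xl",  # illustrative output path
+     learning_rate=5e-5,
+     per_device_train_batch_size=16,
+     per_device_eval_batch_size=16,
+     num_train_epochs=3,
+     lr_scheduler_type="linear",
+     warmup_ratio=0.1,
+     seed=42,
+     fp16=True,  # "Native AMP" mixed precision
+ )
+ # Pass as `args=args` to the span_marker Trainer shown in the Downstream Use section.
+ ```
+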
297
+ ### Framework Versions
298
+ - Python: 3.10.13
299
+ - SpanMarker: 1.5.0
300
+ - Transformers: 4.35.2
301
+ - PyTorch: 2.1.1
302
+ - Datasets: 2.15.0
303
+ - Tokenizers: 0.15.0
304
+
305
+ ## Citation
306
+
307
+ ### BibTeX
308
+ ```
309
+ @software{Aarsen_SpanMarker,
310
+ author = {Aarsen, Tom},
311
+ license = {Apache-2.0},
312
+ title = {{SpanMarker for Named Entity Recognition}},
313
+ url = {https://github.com/tomaarsen/SpanMarkerNER}
314
+ }
315
+ ```
316
+
317
+ <!--
318
+ ## Glossary
319
+
320
+ *Clearly define terms in order to be accessible across audiences.*
321
+ -->
322
+
323
+ <!--
324
+ ## Model Card Authors
325
+
326
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
327
+ -->
328
+
329
+ <!--
330
+ ## Model Card Contact
331
+
332
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
333
+ -->
all_results.json ADDED
@@ -0,0 +1,407 @@
1
+ {
2
+ "epoch": 3.0,
3
+ "test_art-broadcastprogram": {
4
+ "f1": 0.5963149078726968,
5
+ "number": 603,
6
+ "precision": 0.6023688663282571,
7
+ "recall": 0.5903814262023217
8
+ },
9
+ "test_art-film": {
10
+ "f1": 0.7645466847090664,
11
+ "number": 750,
12
+ "precision": 0.7760989010989011,
13
+ "recall": 0.7533333333333333
14
+ },
15
+ "test_art-music": {
16
+ "f1": 0.7685459940652819,
17
+ "number": 1029,
18
+ "precision": 0.7824773413897281,
19
+ "recall": 0.7551020408163265
20
+ },
21
+ "test_art-other": {
22
+ "f1": 0.37103174603174605,
23
+ "number": 562,
24
+ "precision": 0.4192825112107623,
25
+ "recall": 0.33274021352313166
26
+ },
27
+ "test_art-painting": {
28
+ "f1": 0.5555555555555555,
29
+ "number": 57,
30
+ "precision": 0.5882352941176471,
31
+ "recall": 0.5263157894736842
32
+ },
33
+ "test_art-writtenart": {
34
+ "f1": 0.6649020645844362,
35
+ "number": 968,
36
+ "precision": 0.6818675352877307,
37
+ "recall": 0.6487603305785123
38
+ },
39
+ "test_building-airport": {
40
+ "f1": 0.8205128205128206,
41
+ "number": 364,
42
+ "precision": 0.8063660477453581,
43
+ "recall": 0.8351648351648352
44
+ },
45
+ "test_building-hospital": {
46
+ "f1": 0.7633986928104576,
47
+ "number": 364,
48
+ "precision": 0.7281795511221946,
49
+ "recall": 0.8021978021978022
50
+ },
51
+ "test_building-hotel": {
52
+ "f1": 0.7137546468401488,
53
+ "number": 265,
54
+ "precision": 0.7032967032967034,
55
+ "recall": 0.7245283018867924
56
+ },
57
+ "test_building-library": {
58
+ "f1": 0.7464387464387464,
59
+ "number": 355,
60
+ "precision": 0.7550432276657061,
61
+ "recall": 0.7380281690140845
62
+ },
63
+ "test_building-other": {
64
+ "f1": 0.5853370122191565,
65
+ "number": 2543,
66
+ "precision": 0.5867246147767681,
67
+ "recall": 0.5839559575304758
68
+ },
69
+ "test_building-restaurant": {
70
+ "f1": 0.5667447306791569,
71
+ "number": 232,
72
+ "precision": 0.6205128205128205,
73
+ "recall": 0.521551724137931
74
+ },
75
+ "test_building-sportsfacility": {
76
+ "f1": 0.6921487603305786,
77
+ "number": 420,
78
+ "precision": 0.6113138686131386,
79
+ "recall": 0.7976190476190477
80
+ },
81
+ "test_building-theater": {
82
+ "f1": 0.7270788912579957,
83
+ "number": 455,
84
+ "precision": 0.7060041407867494,
85
+ "recall": 0.7494505494505495
86
+ },
87
+ "test_event-attack/battle/war/militaryconflict": {
88
+ "f1": 0.7660377358490565,
89
+ "number": 1098,
90
+ "precision": 0.7945205479452054,
91
+ "recall": 0.7395264116575592
92
+ },
93
+ "test_event-disaster": {
94
+ "f1": 0.5603864734299517,
95
+ "number": 207,
96
+ "precision": 0.5603864734299517,
97
+ "recall": 0.5603864734299517
98
+ },
99
+ "test_event-election": {
100
+ "f1": 0.22040816326530616,
101
+ "number": 182,
102
+ "precision": 0.42857142857142855,
103
+ "recall": 0.14835164835164835
104
+ },
105
+ "test_event-other": {
106
+ "f1": 0.4629404617253949,
107
+ "number": 866,
108
+ "precision": 0.48846153846153845,
109
+ "recall": 0.4399538106235566
110
+ },
111
+ "test_event-protest": {
112
+ "f1": 0.42245989304812837,
113
+ "number": 166,
114
+ "precision": 0.3798076923076923,
115
+ "recall": 0.4759036144578313
116
+ },
117
+ "test_event-sportsevent": {
118
+ "f1": 0.6179955171309639,
119
+ "number": 1566,
120
+ "precision": 0.619781631342325,
121
+ "recall": 0.6162196679438059
122
+ },
123
+ "test_location-GPE": {
124
+ "f1": 0.8349881570447639,
125
+ "number": 20405,
126
+ "precision": 0.8157255048616305,
127
+ "recall": 0.8551825532957609
128
+ },
129
+ "test_location-bodiesofwater": {
130
+ "f1": 0.7472984206151289,
131
+ "number": 1169,
132
+ "precision": 0.7267582861762328,
133
+ "recall": 0.7690333618477331
134
+ },
135
+ "test_location-island": {
136
+ "f1": 0.7157894736842105,
137
+ "number": 646,
138
+ "precision": 0.7504244482173175,
139
+ "recall": 0.6842105263157895
140
+ },
141
+ "test_location-mountain": {
142
+ "f1": 0.7324981577008107,
143
+ "number": 681,
144
+ "precision": 0.735207100591716,
145
+ "recall": 0.7298091042584435
146
+ },
147
+ "test_location-other": {
148
+ "f1": 0.36490474912798493,
149
+ "number": 2191,
150
+ "precision": 0.4427083333333333,
151
+ "recall": 0.31036056595162026
152
+ },
153
+ "test_location-park": {
154
+ "f1": 0.7001114827201783,
155
+ "number": 458,
156
+ "precision": 0.715261958997722,
157
+ "recall": 0.6855895196506551
158
+ },
159
+ "test_location-road/railway/highway/transit": {
160
+ "f1": 0.7204861111111112,
161
+ "number": 1700,
162
+ "precision": 0.708997722095672,
163
+ "recall": 0.7323529411764705
164
+ },
165
+ "test_loss": 0.022335968911647797,
166
+ "test_organization-company": {
167
+ "f1": 0.7011596788581624,
168
+ "number": 3896,
169
+ "precision": 0.6962794229309036,
170
+ "recall": 0.7061088295687885
171
+ },
172
+ "test_organization-education": {
173
+ "f1": 0.7990314769975787,
174
+ "number": 2066,
175
+ "precision": 0.7994186046511628,
176
+ "recall": 0.7986447241045499
177
+ },
178
+ "test_organization-government/governmentagency": {
179
+ "f1": 0.49800072700836057,
180
+ "number": 1511,
181
+ "precision": 0.5524193548387096,
182
+ "recall": 0.45334215751158174
183
+ },
184
+ "test_organization-media/newspaper": {
185
+ "f1": 0.6583701324769169,
186
+ "number": 1232,
187
+ "precision": 0.6513105639396346,
188
+ "recall": 0.6655844155844156
189
+ },
190
+ "test_organization-other": {
191
+ "f1": 0.5660735468564649,
192
+ "number": 4439,
193
+ "precision": 0.59784515159108,
194
+ "recall": 0.5375084478486145
195
+ },
196
+ "test_organization-politicalparty": {
197
+ "f1": 0.704431247144815,
198
+ "number": 1054,
199
+ "precision": 0.6792951541850221,
200
+ "recall": 0.7314990512333965
201
+ },
202
+ "test_organization-religion": {
203
+ "f1": 0.583982990786676,
204
+ "number": 672,
205
+ "precision": 0.557510148849797,
206
+ "recall": 0.6130952380952381
207
+ },
208
+ "test_organization-showorganization": {
209
+ "f1": 0.5935228023793787,
210
+ "number": 769,
211
+ "precision": 0.603494623655914,
212
+ "recall": 0.5838751625487646
213
+ },
214
+ "test_organization-sportsleague": {
215
+ "f1": 0.6499442586399109,
216
+ "number": 882,
217
+ "precision": 0.6392543859649122,
218
+ "recall": 0.6609977324263039
219
+ },
220
+ "test_organization-sportsteam": {
221
+ "f1": 0.7518034704620783,
222
+ "number": 2473,
223
+ "precision": 0.7259036144578314,
224
+ "recall": 0.779619894864537
225
+ },
226
+ "test_other-astronomything": {
227
+ "f1": 0.7906976744186047,
228
+ "number": 678,
229
+ "precision": 0.7793696275071633,
230
+ "recall": 0.8023598820058997
231
+ },
232
+ "test_other-award": {
233
+ "f1": 0.6903954802259886,
234
+ "number": 919,
235
+ "precision": 0.717978848413631,
236
+ "recall": 0.6648531011969532
237
+ },
238
+ "test_other-biologything": {
239
+ "f1": 0.6536203522504893,
240
+ "number": 1874,
241
+ "precision": 0.6864357017028773,
242
+ "recall": 0.6237993596584845
243
+ },
244
+ "test_other-chemicalthing": {
245
+ "f1": 0.5856459330143541,
246
+ "number": 1014,
247
+ "precision": 0.5687732342007435,
248
+ "recall": 0.6035502958579881
249
+ },
250
+ "test_other-currency": {
251
+ "f1": 0.7643384440658716,
252
+ "number": 799,
253
+ "precision": 0.6995841995841996,
254
+ "recall": 0.8423028785982478
255
+ },
256
+ "test_other-disease": {
257
+ "f1": 0.6976744186046512,
258
+ "number": 749,
259
+ "precision": 0.6591448931116389,
260
+ "recall": 0.7409879839786382
261
+ },
262
+ "test_other-educationaldegree": {
263
+ "f1": 0.615595075239398,
264
+ "number": 363,
265
+ "precision": 0.6114130434782609,
266
+ "recall": 0.6198347107438017
267
+ },
268
+ "test_other-god": {
269
+ "f1": 0.6816143497757848,
270
+ "number": 635,
271
+ "precision": 0.6486486486486487,
272
+ "recall": 0.7181102362204724
273
+ },
274
+ "test_other-language": {
275
+ "f1": 0.7300291545189505,
276
+ "number": 753,
277
+ "precision": 0.6507276507276507,
278
+ "recall": 0.8313413014608234
279
+ },
280
+ "test_other-law": {
281
+ "f1": 0.7126673532440783,
282
+ "number": 472,
283
+ "precision": 0.6933867735470942,
284
+ "recall": 0.7330508474576272
285
+ },
286
+ "test_other-livingthing": {
287
+ "f1": 0.6298342541436465,
288
+ "number": 863,
289
+ "precision": 0.6019007391763463,
290
+ "recall": 0.660486674391657
291
+ },
292
+ "test_other-medical": {
293
+ "f1": 0.5168539325842697,
294
+ "number": 397,
295
+ "precision": 0.5123762376237624,
296
+ "recall": 0.5214105793450882
297
+ },
298
+ "test_overall_accuracy": 0.9256893595441806,
299
+ "test_overall_f1": 0.703084859534267,
300
+ "test_overall_precision": 0.7034273336857051,
301
+ "test_overall_recall": 0.7027427186979075,
302
+ "test_person-actor": {
303
+ "f1": 0.8214397008413836,
304
+ "number": 1637,
305
+ "precision": 0.8384223918575063,
306
+ "recall": 0.8051313378130727
307
+ },
308
+ "test_person-artist/author": {
309
+ "f1": 0.7320701754385964,
310
+ "number": 3463,
311
+ "precision": 0.7121791370835608,
312
+ "recall": 0.7531042448743863
313
+ },
314
+ "test_person-athlete": {
315
+ "f1": 0.8370089593383873,
316
+ "number": 2884,
317
+ "precision": 0.8318493150684931,
318
+ "recall": 0.8422330097087378
319
+ },
320
+ "test_person-director": {
321
+ "f1": 0.7221238938053098,
322
+ "number": 554,
323
+ "precision": 0.7083333333333334,
324
+ "recall": 0.7364620938628159
325
+ },
326
+ "test_person-other": {
327
+ "f1": 0.6784606547960942,
328
+ "number": 8767,
329
+ "precision": 0.6833275483049867,
330
+ "recall": 0.6736625983802897
331
+ },
332
+ "test_person-politician": {
333
+ "f1": 0.6821515892420537,
334
+ "number": 2857,
335
+ "precision": 0.6807249912861624,
336
+ "recall": 0.6835841792089604
337
+ },
338
+ "test_person-scholar": {
339
+ "f1": 0.5301369863013699,
340
+ "number": 743,
341
+ "precision": 0.5397489539748954,
342
+ "recall": 0.5208613728129206
343
+ },
344
+ "test_person-soldier": {
345
+ "f1": 0.5451957295373664,
346
+ "number": 647,
347
+ "precision": 0.5052770448548812,
348
+ "recall": 0.5919629057187017
349
+ },
350
+ "test_product-airplane": {
351
+ "f1": 0.6654111738857501,
352
+ "number": 792,
353
+ "precision": 0.66167290886392,
354
+ "recall": 0.6691919191919192
355
+ },
356
+ "test_product-car": {
357
+ "f1": 0.7221812822402359,
358
+ "number": 687,
359
+ "precision": 0.7313432835820896,
360
+ "recall": 0.7132459970887919
361
+ },
362
+ "test_product-food": {
363
+ "f1": 0.5787037037037037,
364
+ "number": 432,
365
+ "precision": 0.5787037037037037,
366
+ "recall": 0.5787037037037037
367
+ },
368
+ "test_product-game": {
369
+ "f1": 0.7250257466529352,
370
+ "number": 493,
371
+ "precision": 0.7364016736401674,
372
+ "recall": 0.7139959432048681
373
+ },
374
+ "test_product-other": {
375
+ "f1": 0.4794617563739376,
376
+ "number": 1608,
377
+ "precision": 0.5567434210526315,
378
+ "recall": 0.4210199004975124
379
+ },
380
+ "test_product-ship": {
381
+ "f1": 0.6842105263157895,
382
+ "number": 380,
383
+ "precision": 0.6842105263157895,
384
+ "recall": 0.6842105263157895
385
+ },
386
+ "test_product-software": {
387
+ "f1": 0.6570316842690384,
388
+ "number": 889,
389
+ "precision": 0.6494505494505495,
390
+ "recall": 0.6647919010123734
391
+ },
392
+ "test_product-train": {
393
+ "f1": 0.5933014354066984,
394
+ "number": 314,
395
+ "precision": 0.5942492012779552,
396
+ "recall": 0.5923566878980892
397
+ },
398
+ "test_product-weapon": {
399
+ "f1": 0.584426946631671,
400
+ "number": 624,
401
+ "precision": 0.6435452793834296,
402
+ "recall": 0.5352564102564102
403
+ },
404
+ "test_runtime": 242.3784,
405
+ "test_samples_per_second": 189.732,
406
+ "test_steps_per_second": 5.933
407
+ }
final_checkpoint/README.md ADDED
@@ -0,0 +1,333 @@
1
+ ---
2
+ library_name: span-marker
3
+ tags:
4
+ - span-marker
5
+ - token-classification
6
+ - ner
7
+ - named-entity-recognition
8
+ - generated_from_span_marker_trainer
9
+ datasets:
10
+ - DFKI-SLT/few-nerd
11
+ metrics:
12
+ - precision
13
+ - recall
14
+ - f1
15
+ widget:
16
+ - text: In response, in May or June 1125, a 3,000-strong Crusader coalition commanded
17
+ by King Baldwin II of Jerusalem confronted and defeated the 15,000-strong Muslim
18
+ coalition at the Battle of Azaz, raising the siege of the town.
19
+ - text: Cardenal made several visits to Jesuit universities in the United States,
20
+ including the University of Detroit Mercy in 2013, and the John Carroll University
21
+ in 2014.
22
+ - text: Other super-spreaders, defined as those that transmit SARS to at least eight
23
+ other people, included the incidents at the Hotel Metropole in Hong Kong, the
24
+ Amoy Gardens apartment complex in Hong Kong and one in an acute care hospital
25
+ in Toronto, Ontario, Canada.
26
+ - text: The District Court for the Northern District of California rejected 321 Studios'
27
+ claims for declaratory relief, holding that both DVD Copy Plus and DVD-X Copy
28
+ violated the DMCA and that the DMCA was not unconstitutional.
29
+ - text: The Sunday Edition is a television programme broadcast on the ITV Network
30
+ in the United Kingdom focusing on political interview and discussion, produced
31
+ by ITV Productions.
32
+ pipeline_tag: token-classification
33
+ model-index:
34
+ - name: SpanMarker
35
+ results:
36
+ - task:
37
+ type: token-classification
38
+ name: Named Entity Recognition
39
+ dataset:
40
+ name: Unknown
41
+ type: DFKI-SLT/few-nerd
42
+ split: test
43
+ metrics:
44
+ - type: f1
45
+ value: 0.703084859534267
46
+ name: F1
47
+ - type: precision
48
+ value: 0.7034273336857051
49
+ name: Precision
50
+ - type: recall
51
+ value: 0.7027427186979075
52
+ name: Recall
53
+ ---
54
+
55
+ # SpanMarker
56
+
57
+ This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model trained on the [DFKI-SLT/few-nerd](https://huggingface.co/datasets/DFKI-SLT/few-nerd) dataset that can be used for Named Entity Recognition.
58
+
59
+ ## Model Details
60
+
61
+ ### Model Description
62
+ - **Model Type:** SpanMarker
63
+ - **Encoder:** `numind/generic-entity_recognition_NER-v1` (RoBERTa-base)
64
+ - **Maximum Sequence Length:** 256 tokens
65
+ - **Maximum Entity Length:** 8 words
66
+ - **Training Dataset:** [DFKI-SLT/few-nerd](https://huggingface.co/datasets/DFKI-SLT/few-nerd)
67
+ <!-- - **Language:** Unknown -->
68
+ <!-- - **License:** Unknown -->
69
+
70
+ ### Model Sources
71
+
72
+ - **Repository:** [SpanMarker on GitHub](https://github.com/tomaarsen/SpanMarkerNER)
73
+ - **Thesis:** [SpanMarker For Named Entity Recognition](https://raw.githubusercontent.com/tomaarsen/SpanMarkerNER/main/thesis.pdf)
74
+
75
+ ### Model Labels
76
+ | Label | Examples |
77
+ |:-----------------------------------------|:---------------------------------------------------------------------------------------------------------|
78
+ | art-broadcastprogram | "Street Cents", "Corazones", "The Gale Storm Show : Oh , Susanna" |
79
+ | art-film | "L'Atlantide", "Shawshank Redemption", "Bosch" |
80
+ | art-music | "Champion Lover", "Atkinson , Danko and Ford ( with Brockie and Hilton )", "Hollywood Studio Symphony" |
81
+ | art-other | "Aphrodite of Milos", "The Today Show", "Venus de Milo" |
82
+ | art-painting | "Production/Reproduction", "Cofiwch Dryweryn", "Touit" |
83
+ | art-writtenart | "Time", "Imelda de ' Lambertazzi", "The Seven Year Itch" |
84
+ | building-airport | "Sheremetyevo International Airport", "Luton Airport", "Newark Liberty International Airport" |
85
+ | building-hospital | "Yeungnam University Hospital", "Memorial Sloan-Kettering Cancer Center", "Hokkaido University Hospital" |
86
+ | building-hotel | "Radisson Blu Sea Plaza Hotel", "Flamingo Hotel", "The Standard Hotel" |
87
+ | building-library | "British Library", "Berlin State Library", "Bayerische Staatsbibliothek" |
88
+ | building-other | "Communiplex", "Henry Ford Museum", "Alpha Recording Studios" |
89
+ | building-restaurant | "Carnegie Deli", "Trumbull", "Fatburger" |
90
+ | building-sportsfacility | "Sports Center", "Boston Garden", "Glenn Warner Soccer Facility" |
91
+ | building-theater | "Sanders Theatre", "Pittsburgh Civic Light Opera", "National Paris Opera" |
92
+ | event-attack/battle/war/militaryconflict | "Vietnam War", "Jurist", "Easter Offensive" |
93
+ | event-disaster | "1990s North Korean famine", "the 1912 North Mount Lyell Disaster", "1693 Sicily earthquake" |
94
+ | event-election | "1982 Mitcham and Morden by-election", "Elections to the European Parliament", "March 1898 elections" |
95
+ | event-other | "Eastwood Scoring Stage", "Union for a Popular Movement", "Masaryk Democratic Movement" |
96
+ | event-protest | "French Revolution", "Iranian Constitutional Revolution", "Russian Revolution" |
97
+ | event-sportsevent | "World Cup", "National Champions", "Stanley Cup" |
98
+ | location-GPE | "Mediterranean Basin", "the Republic of Croatia", "Croatian" |
99
+ | location-bodiesofwater | "Arthur Kill", "Atatürk Dam Lake", "Norfolk coast" |
100
+ | location-island | "Staten Island", "new Samsat district", "Laccadives" |
101
+ | location-mountain | "Miteirya Ridge", "Ruweisat Ridge", "Salamander Glacier" |
102
+ | location-other | "Northern City Line", "Victoria line", "Cartuther" |
103
+ | location-park | "Painted Desert Community Complex Historic District", "Gramercy Park", "Shenandoah National Park" |
104
+ | location-road/railway/highway/transit | "NJT", "Newark-Elizabeth Rail Link", "Friern Barnet Road" |
105
+ | organization-company | "Church 's Chicken", "Texas Chicken", "Dixy Chicken" |
106
+ | organization-education | "Barnard College", "MIT", "Belfast Royal Academy and the Ulster College of Physical Education" |
107
+ | organization-government/governmentagency | "Diet", "Supreme Court", "Congregazione dei Nobili" |
108
+ | organization-media/newspaper | "Al Jazeera", "Clash", "TimeOut Melbourne" |
109
+ | organization-other | "Defence Sector C", "4th Army", "IAEA" |
110
+ | organization-politicalparty | "Al Wafa ' Islamic", "Shimpotō", "Kenseitō" |
111
+ | organization-religion | "Jewish", "UPCUSA", "Christian" |
112
+ | organization-showorganization | "Mr. Mister", "Lizzy", "Bochumer Symphoniker" |
113
+ | organization-sportsleague | "NHL", "First Division", "China League One" |
114
+ | organization-sportsteam | "Arsenal", "Luc Alphand Aventures", "Tottenham" |
115
+ | other-astronomything | "Algol", "Zodiac", "`` Caput Larvae ''" |
116
+ | other-award | "Order of the Republic of Guinea and Nigeria", "GCON", "Grand Commander of the Order of the Niger" |
117
+ | other-biologything | "Amphiphysin", "BAR", "N-terminal lipid" |
118
+ | other-chemicalthing | "sulfur", "uranium", "carbon dioxide" |
119
+ | other-currency | "$", "Travancore Rupee", "lac crore" |
120
+ | other-disease | "hypothyroidism", "bladder cancer", "French Dysentery Epidemic of 1779" |
121
+ | other-educationaldegree | "BSc ( Hons ) in physics", "Master", "Bachelor" |
122
+ | other-god | "El", "Raijin", "Fujin" |
123
+ | other-language | "Latin", "English", "Breton-speaking" |
124
+ | other-law | "United States Freedom Support Act", "Thirty Years ' Peace", "Leahy–Smith America Invents Act ( AIA" |
125
+ | other-livingthing | "insects", "monkeys", "patchouli" |
126
+ | other-medical | "pediatrician", "Pediatrics", "amitriptyline" |
127
+ | person-actor | "Edmund Payne", "Tchéky Karyo", "Ellaline Terriss" |
128
+ | person-artist/author | "Gaetano Donizett", "George Axelrod", "Hicks" |
129
+ | person-athlete | "Tozawa", "Jaguar", "Neville" |
130
+ | person-director | "Bob Swaim", "Frank Darabont", "Richard Quine" |
131
+ | person-other | "Holden", "Richard Benson", "Campbell" |
132
+ | person-politician | "Rivière", "Emeric", "William" |
133
+ | person-scholar | "Stalmine", "Wurdack", "Stedman" |
134
+ | person-soldier | "Krukenberg", "Joachim Ziegler", "Helmuth Weidling" |
135
+ | product-airplane | "EC135T2 CPDS", "Spey-equipped FGR.2s", "Luton" |
136
+ | product-car | "100EX", "Corvettes - GT1 C6R", "Phantom" |
137
+ | product-food | "yakiniku", "V. labrusca", "red grape" |
138
+ | product-game | "Airforce Delta", "Splinter Cell", "Hardcore RPG" |
139
+ | product-other | "X11", "Fairbottom Bobs", "PDP-1" |
140
+ | product-ship | "Essex", "HMS `` Chinkara ''", "Congress" |
141
+ | product-software | "Wikipedia", "Apdf", "AmiPDF" |
142
+ | product-train | "High Speed Trains", "Royal Scots Grey", "55022" |
143
+ | product-weapon | "ZU-23-2M Wróbel", "AR-15 's", "ZU-23-2MR Wróbel II" |
144
+
145
+ ## Evaluation
146
+
147
+ ### Metrics
148
+ | Label | Precision | Recall | F1 |
149
+ |:-----------------------------------------|:----------|:-------|:-------|
150
+ | **all** | 0.7034 | 0.7027 | 0.7031 |
151
+ | art-broadcastprogram | 0.6024 | 0.5904 | 0.5963 |
152
+ | art-film | 0.7761 | 0.7533 | 0.7645 |
153
+ | art-music | 0.7825 | 0.7551 | 0.7685 |
154
+ | art-other | 0.4193 | 0.3327 | 0.3710 |
155
+ | art-painting | 0.5882 | 0.5263 | 0.5556 |
156
+ | art-writtenart | 0.6819 | 0.6488 | 0.6649 |
157
+ | building-airport | 0.8064 | 0.8352 | 0.8205 |
158
+ | building-hospital | 0.7282 | 0.8022 | 0.7634 |
159
+ | building-hotel | 0.7033 | 0.7245 | 0.7138 |
160
+ | building-library | 0.7550 | 0.7380 | 0.7464 |
161
+ | building-other | 0.5867 | 0.5840 | 0.5853 |
162
+ | building-restaurant | 0.6205 | 0.5216 | 0.5667 |
163
+ | building-sportsfacility | 0.6113 | 0.7976 | 0.6921 |
164
+ | building-theater | 0.7060 | 0.7495 | 0.7271 |
165
+ | event-attack/battle/war/militaryconflict | 0.7945 | 0.7395 | 0.7660 |
166
+ | event-disaster | 0.5604 | 0.5604 | 0.5604 |
167
+ | event-election | 0.4286 | 0.1484 | 0.2204 |
168
+ | event-other | 0.4885 | 0.4400 | 0.4629 |
169
+ | event-protest | 0.3798 | 0.4759 | 0.4225 |
170
+ | event-sportsevent | 0.6198 | 0.6162 | 0.6180 |
171
+ | location-GPE | 0.8157 | 0.8552 | 0.8350 |
172
+ | location-bodiesofwater | 0.7268 | 0.7690 | 0.7473 |
173
+ | location-island | 0.7504 | 0.6842 | 0.7158 |
174
+ | location-mountain | 0.7352 | 0.7298 | 0.7325 |
175
+ | location-other | 0.4427 | 0.3104 | 0.3649 |
176
+ | location-park | 0.7153 | 0.6856 | 0.7001 |
177
+ | location-road/railway/highway/transit | 0.7090 | 0.7324 | 0.7205 |
178
+ | organization-company | 0.6963 | 0.7061 | 0.7012 |
179
+ | organization-education | 0.7994 | 0.7986 | 0.7990 |
180
+ | organization-government/governmentagency | 0.5524 | 0.4533 | 0.4980 |
181
+ | organization-media/newspaper | 0.6513 | 0.6656 | 0.6584 |
182
+ | organization-other | 0.5978 | 0.5375 | 0.5661 |
183
+ | organization-politicalparty | 0.6793 | 0.7315 | 0.7044 |
184
+ | organization-religion | 0.5575 | 0.6131 | 0.5840 |
185
+ | organization-showorganization | 0.6035 | 0.5839 | 0.5935 |
186
+ | organization-sportsleague | 0.6393 | 0.6610 | 0.6499 |
187
+ | organization-sportsteam | 0.7259 | 0.7796 | 0.7518 |
188
+ | other-astronomything | 0.7794 | 0.8024 | 0.7907 |
189
+ | other-award | 0.7180 | 0.6649 | 0.6904 |
190
+ | other-biologything | 0.6864 | 0.6238 | 0.6536 |
191
+ | other-chemicalthing | 0.5688 | 0.6036 | 0.5856 |
192
+ | other-currency | 0.6996 | 0.8423 | 0.7643 |
193
+ | other-disease | 0.6591 | 0.7410 | 0.6977 |
194
+ | other-educationaldegree | 0.6114 | 0.6198 | 0.6156 |
195
+ | other-god | 0.6486 | 0.7181 | 0.6816 |
196
+ | other-language | 0.6507 | 0.8313 | 0.7300 |
197
+ | other-law | 0.6934 | 0.7331 | 0.7127 |
198
+ | other-livingthing | 0.6019 | 0.6605 | 0.6298 |
199
+ | other-medical | 0.5124 | 0.5214 | 0.5169 |
200
+ | person-actor | 0.8384 | 0.8051 | 0.8214 |
201
+ | person-artist/author | 0.7122 | 0.7531 | 0.7321 |
202
+ | person-athlete | 0.8318 | 0.8422 | 0.8370 |
203
+ | person-director | 0.7083 | 0.7365 | 0.7221 |
204
+ | person-other | 0.6833 | 0.6737 | 0.6785 |
205
+ | person-politician | 0.6807 | 0.6836 | 0.6822 |
206
+ | person-scholar | 0.5397 | 0.5209 | 0.5301 |
207
+ | person-soldier | 0.5053 | 0.5920 | 0.5452 |
208
+ | product-airplane | 0.6617 | 0.6692 | 0.6654 |
209
+ | product-car | 0.7313 | 0.7132 | 0.7222 |
210
+ | product-food | 0.5787 | 0.5787 | 0.5787 |
211
+ | product-game | 0.7364 | 0.7140 | 0.7250 |
212
+ | product-other | 0.5567 | 0.4210 | 0.4795 |
213
+ | product-ship | 0.6842 | 0.6842 | 0.6842 |
214
+ | product-software | 0.6495 | 0.6648 | 0.6570 |
215
+ | product-train | 0.5942 | 0.5924 | 0.5933 |
216
+ | product-weapon | 0.6435 | 0.5353 | 0.5844 |
217
+
218
+ ## Uses
219
+
220
+ ### Direct Use for Inference
221
+
222
+ ```python
223
+ from span_marker import SpanMarkerModel
224
+
225
+ # Download from the 🤗 Hub
226
+ model = SpanMarkerModel.from_pretrained("supreethrao/instructNER_fewnerd_xl")
227
+ # Run inference
228
+ entities = model.predict("The Sunday Edition is a television programme broadcast on the ITV Network in the United Kingdom focusing on political interview and discussion, produced by ITV Productions.")
229
+ ```
230
+
231
+ ### Downstream Use
232
+ You can finetune this model on your own dataset.
233
+
234
+ <details><summary>Click to expand</summary>
235
+
236
+ ```python
237
+ from datasets import load_dataset
+ from span_marker import SpanMarkerModel, Trainer
238
+
239
+ # Download from the 🤗 Hub
240
+ model = SpanMarkerModel.from_pretrained("supreethrao/instructNER_fewnerd_xl")
241
+
242
+ # Specify a Dataset with "tokens" and "ner_tags" columns
243
+ dataset = load_dataset("conll2003") # For example CoNLL2003
244
+
245
+ # Initialize a Trainer using the pretrained model & dataset
246
+ trainer = Trainer(
247
+     model=model,
248
+     train_dataset=dataset["train"],
249
+     eval_dataset=dataset["validation"],
250
+ )
251
+ trainer.train()
252
+ trainer.save_model("supreethrao/instructNER_fewnerd_xl-finetuned")
253
+ ```
254
+ </details>
255
+
256
+ <!--
257
+ ### Out-of-Scope Use
258
+
259
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
260
+ -->
261
+
262
+ <!--
263
+ ## Bias, Risks and Limitations
264
+
265
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
266
+ -->
267
+
268
+ <!--
269
+ ### Recommendations
270
+
271
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
272
+ -->
273
+
274
+ ## Training Details
275
+
276
+ ### Training Set Metrics
277
+ | Training set | Min | Median | Max |
278
+ |:----------------------|:----|:--------|:----|
279
+ | Sentence length | 1 | 24.4945 | 267 |
280
+ | Entities per sentence | 0 | 2.5832 | 88 |
281
+
282
+ ### Training Hyperparameters
283
+ - learning_rate: 5e-05
284
+ - train_batch_size: 16
285
+ - eval_batch_size: 16
286
+ - seed: 42
287
+ - distributed_type: multi-GPU
288
+ - num_devices: 2
289
+ - total_train_batch_size: 32
290
+ - total_eval_batch_size: 32
291
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
292
+ - lr_scheduler_type: linear
293
+ - lr_scheduler_warmup_ratio: 0.1
294
+ - num_epochs: 3
295
+ - mixed_precision_training: Native AMP
296
+
297
+ ### Framework Versions
298
+ - Python: 3.10.13
299
+ - SpanMarker: 1.5.0
300
+ - Transformers: 4.35.2
301
+ - PyTorch: 2.1.1
302
+ - Datasets: 2.15.0
303
+ - Tokenizers: 0.15.0
304
+
305
+ ## Citation
306
+
307
+ ### BibTeX
308
+ ```
309
+ @software{Aarsen_SpanMarker,
310
+ author = {Aarsen, Tom},
311
+ license = {Apache-2.0},
312
+ title = {{SpanMarker for Named Entity Recognition}},
313
+ url = {https://github.com/tomaarsen/SpanMarkerNER}
314
+ }
315
+ ```
316
+
317
+ <!--
318
+ ## Glossary
319
+
320
+ *Clearly define terms in order to be accessible across audiences.*
321
+ -->
322
+
323
+ <!--
324
+ ## Model Card Authors
325
+
326
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
327
+ -->
328
+
329
+ <!--
330
+ ## Model Card Contact
331
+
332
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
333
+ -->
final_checkpoint/added_tokens.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "<end>": 50266,
3
+ "<start>": 50265
4
+ }
final_checkpoint/config.json ADDED
@@ -0,0 +1,228 @@
1
+ {
2
+ "architectures": [
3
+ "SpanMarkerModel"
4
+ ],
5
+ "encoder": {
6
+ "_name_or_path": "numind/generic-entity_recognition_NER-v1",
7
+ "add_cross_attention": false,
8
+ "architectures": [
9
+ "RobertaModel"
10
+ ],
11
+ "attention_probs_dropout_prob": 0.1,
12
+ "bad_words_ids": null,
13
+ "begin_suppress_tokens": null,
14
+ "bos_token_id": 0,
15
+ "chunk_size_feed_forward": 0,
16
+ "classifier_dropout": null,
17
+ "cross_attention_hidden_size": null,
18
+ "decoder_start_token_id": null,
19
+ "diversity_penalty": 0.0,
20
+ "do_sample": false,
21
+ "early_stopping": false,
22
+ "encoder_no_repeat_ngram_size": 0,
23
+ "eos_token_id": 2,
24
+ "exponential_decay_length_penalty": null,
25
+ "finetuning_task": null,
26
+ "forced_bos_token_id": null,
27
+ "forced_eos_token_id": null,
28
+ "hidden_act": "gelu",
29
+ "hidden_dropout_prob": 0.1,
30
+ "hidden_size": 768,
31
+ "id2label": {
32
+ "0": "O",
33
+ "1": "art-broadcastprogram",
34
+ "2": "art-film",
35
+ "3": "art-music",
36
+ "4": "art-other",
37
+ "5": "art-painting",
38
+ "6": "art-writtenart",
39
+ "7": "building-airport",
40
+ "8": "building-hospital",
41
+ "9": "building-hotel",
42
+ "10": "building-library",
43
+ "11": "building-other",
44
+ "12": "building-restaurant",
45
+ "13": "building-sportsfacility",
46
+ "14": "building-theater",
47
+ "15": "event-attack/battle/war/militaryconflict",
48
+ "16": "event-disaster",
49
+ "17": "event-election",
50
+ "18": "event-other",
51
+ "19": "event-protest",
52
+ "20": "event-sportsevent",
53
+ "21": "location-GPE",
54
+ "22": "location-bodiesofwater",
55
+ "23": "location-island",
56
+ "24": "location-mountain",
57
+ "25": "location-other",
58
+ "26": "location-park",
59
+ "27": "location-road/railway/highway/transit",
60
+ "28": "organization-company",
61
+ "29": "organization-education",
62
+ "30": "organization-government/governmentagency",
63
+ "31": "organization-media/newspaper",
64
+ "32": "organization-other",
65
+ "33": "organization-politicalparty",
66
+ "34": "organization-religion",
67
+ "35": "organization-showorganization",
68
+ "36": "organization-sportsleague",
69
+ "37": "organization-sportsteam",
70
+ "38": "other-astronomything",
71
+ "39": "other-award",
72
+ "40": "other-biologything",
73
+ "41": "other-chemicalthing",
74
+ "42": "other-currency",
75
+ "43": "other-disease",
76
+ "44": "other-educationaldegree",
77
+ "45": "other-god",
78
+ "46": "other-language",
79
+ "47": "other-law",
80
+ "48": "other-livingthing",
81
+ "49": "other-medical",
82
+ "50": "person-actor",
83
+ "51": "person-artist/author",
84
+ "52": "person-athlete",
85
+ "53": "person-director",
86
+ "54": "person-other",
87
+ "55": "person-politician",
88
+ "56": "person-scholar",
89
+ "57": "person-soldier",
90
+ "58": "product-airplane",
91
+ "59": "product-car",
92
+ "60": "product-food",
93
+ "61": "product-game",
94
+ "62": "product-other",
95
+ "63": "product-ship",
96
+ "64": "product-software",
97
+ "65": "product-train",
98
+ "66": "product-weapon"
99
+ },
100
+ "initializer_range": 0.02,
101
+ "intermediate_size": 3072,
102
+ "is_decoder": false,
103
+ "is_encoder_decoder": false,
104
+ "label2id": {
105
+ "O": 0,
106
+ "art-broadcastprogram": 1,
107
+ "art-film": 2,
108
+ "art-music": 3,
109
+ "art-other": 4,
110
+ "art-painting": 5,
111
+ "art-writtenart": 6,
112
+ "building-airport": 7,
113
+ "building-hospital": 8,
114
+ "building-hotel": 9,
115
+ "building-library": 10,
116
+ "building-other": 11,
117
+ "building-restaurant": 12,
118
+ "building-sportsfacility": 13,
119
+ "building-theater": 14,
120
+ "event-attack/battle/war/militaryconflict": 15,
121
+ "event-disaster": 16,
122
+ "event-election": 17,
123
+ "event-other": 18,
124
+ "event-protest": 19,
125
+ "event-sportsevent": 20,
126
+ "location-GPE": 21,
127
+ "location-bodiesofwater": 22,
128
+ "location-island": 23,
129
+ "location-mountain": 24,
130
+ "location-other": 25,
131
+ "location-park": 26,
132
+ "location-road/railway/highway/transit": 27,
133
+ "organization-company": 28,
134
+ "organization-education": 29,
135
+ "organization-government/governmentagency": 30,
136
+ "organization-media/newspaper": 31,
137
+ "organization-other": 32,
138
+ "organization-politicalparty": 33,
139
+ "organization-religion": 34,
140
+ "organization-showorganization": 35,
141
+ "organization-sportsleague": 36,
142
+ "organization-sportsteam": 37,
143
+ "other-astronomything": 38,
144
+ "other-award": 39,
145
+ "other-biologything": 40,
146
+ "other-chemicalthing": 41,
147
+ "other-currency": 42,
148
+ "other-disease": 43,
149
+ "other-educationaldegree": 44,
150
+ "other-god": 45,
151
+ "other-language": 46,
152
+ "other-law": 47,
153
+ "other-livingthing": 48,
154
+ "other-medical": 49,
155
+ "person-actor": 50,
156
+ "person-artist/author": 51,
157
+ "person-athlete": 52,
158
+ "person-director": 53,
159
+ "person-other": 54,
160
+ "person-politician": 55,
161
+ "person-scholar": 56,
162
+ "person-soldier": 57,
163
+ "product-airplane": 58,
164
+ "product-car": 59,
165
+ "product-food": 60,
166
+ "product-game": 61,
167
+ "product-other": 62,
168
+ "product-ship": 63,
169
+ "product-software": 64,
170
+ "product-train": 65,
171
+ "product-weapon": 66
172
+ },
173
+ "layer_norm_eps": 1e-05,
174
+ "length_penalty": 1.0,
175
+ "max_length": 20,
176
+ "max_position_embeddings": 514,
177
+ "min_length": 0,
178
+ "model_type": "roberta",
179
+ "no_repeat_ngram_size": 0,
180
+ "num_attention_heads": 12,
181
+ "num_beam_groups": 1,
182
+ "num_beams": 1,
183
+ "num_hidden_layers": 12,
184
+ "num_return_sequences": 1,
185
+ "output_attentions": false,
186
+ "output_hidden_states": false,
187
+ "output_scores": false,
188
+ "pad_token_id": 1,
189
+ "position_embedding_type": "absolute",
190
+ "prefix": null,
191
+ "problem_type": null,
192
+ "pruned_heads": {},
193
+ "remove_invalid_values": false,
194
+ "repetition_penalty": 1.0,
195
+ "return_dict": true,
196
+ "return_dict_in_generate": false,
197
+ "sep_token_id": null,
198
+ "suppress_tokens": null,
199
+ "task_specific_params": null,
200
+ "temperature": 1.0,
201
+ "tf_legacy_loss": false,
202
+ "tie_encoder_decoder": false,
203
+ "tie_word_embeddings": true,
204
+ "tokenizer_class": null,
205
+ "top_k": 50,
206
+ "top_p": 1.0,
207
+ "torch_dtype": "float32",
208
+ "torchscript": false,
209
+ "transformers_version": "4.35.2",
210
+ "type_vocab_size": 1,
211
+ "typical_p": 1.0,
212
+ "use_bfloat16": false,
213
+ "use_cache": true,
214
+ "vocab_size": 50272
215
+ },
216
+ "entity_max_length": 8,
217
+ "marker_max_length": 128,
218
+ "max_next_context": null,
219
+ "max_prev_context": null,
220
+ "model_max_length": 256,
221
+ "model_max_length_default": 512,
222
+ "model_type": "span-marker",
223
+ "span_marker_version": "1.5.0",
224
+ "torch_dtype": "float32",
225
+ "trained_with_document_context": false,
226
+ "transformers_version": "4.35.2",
227
+ "vocab_size": 50272
228
+ }
final_checkpoint/merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
final_checkpoint/model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b1f2a83a5a877bf7c79b2dd581c8ee4ded91ae0fa501cadf1d3f12deb1be8fde
3
+ size 499040084
final_checkpoint/special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
1
+ {
2
+ "bos_token": {
3
+ "content": "<s>",
4
+ "lstrip": false,
5
+ "normalized": true,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "cls_token": {
10
+ "content": "<s>",
11
+ "lstrip": false,
12
+ "normalized": true,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "eos_token": {
17
+ "content": "</s>",
18
+ "lstrip": false,
19
+ "normalized": true,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "mask_token": {
24
+ "content": "<mask>",
25
+ "lstrip": true,
26
+ "normalized": true,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "pad_token": {
31
+ "content": "<pad>",
32
+ "lstrip": false,
33
+ "normalized": true,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ },
37
+ "sep_token": {
38
+ "content": "</s>",
39
+ "lstrip": false,
40
+ "normalized": true,
41
+ "rstrip": false,
42
+ "single_word": false
43
+ },
44
+ "unk_token": {
45
+ "content": "<unk>",
46
+ "lstrip": false,
47
+ "normalized": true,
48
+ "rstrip": false,
49
+ "single_word": false
50
+ }
51
+ }
final_checkpoint/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
final_checkpoint/tokenizer_config.json ADDED
@@ -0,0 +1,75 @@
1
+ {
2
+ "add_prefix_space": true,
3
+ "added_tokens_decoder": {
4
+ "0": {
5
+ "content": "<s>",
6
+ "lstrip": false,
7
+ "normalized": true,
8
+ "rstrip": false,
9
+ "single_word": false,
10
+ "special": true
11
+ },
12
+ "1": {
13
+ "content": "<pad>",
14
+ "lstrip": false,
15
+ "normalized": true,
16
+ "rstrip": false,
17
+ "single_word": false,
18
+ "special": true
19
+ },
20
+ "2": {
21
+ "content": "</s>",
22
+ "lstrip": false,
23
+ "normalized": true,
24
+ "rstrip": false,
25
+ "single_word": false,
26
+ "special": true
27
+ },
28
+ "3": {
29
+ "content": "<unk>",
30
+ "lstrip": false,
31
+ "normalized": true,
32
+ "rstrip": false,
33
+ "single_word": false,
34
+ "special": true
35
+ },
36
+ "50264": {
37
+ "content": "<mask>",
38
+ "lstrip": true,
39
+ "normalized": true,
40
+ "rstrip": false,
41
+ "single_word": false,
42
+ "special": true
43
+ },
44
+ "50265": {
45
+ "content": "<start>",
46
+ "lstrip": false,
47
+ "normalized": false,
48
+ "rstrip": false,
49
+ "single_word": false,
50
+ "special": true
51
+ },
52
+ "50266": {
53
+ "content": "<end>",
54
+ "lstrip": false,
55
+ "normalized": false,
56
+ "rstrip": false,
57
+ "single_word": false,
58
+ "special": true
59
+ }
60
+ },
61
+ "bos_token": "<s>",
62
+ "clean_up_tokenization_spaces": true,
63
+ "cls_token": "<s>",
64
+ "entity_max_length": 8,
65
+ "eos_token": "</s>",
66
+ "errors": "replace",
67
+ "marker_max_length": 128,
68
+ "mask_token": "<mask>",
69
+ "model_max_length": 256,
70
+ "pad_token": "<pad>",
71
+ "sep_token": "</s>",
72
+ "tokenizer_class": "RobertaTokenizer",
73
+ "trim_offsets": true,
74
+ "unk_token": "<unk>"
75
+ }
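The two non-standard entries in `added_tokens_decoder`, `<start>` (id 50265) and `<end>` (id 50266), are the span markers that SpanMarker wraps around each candidate span, added on top of the base RoBERTa vocabulary. A quick sanity check, with the repo id once more a placeholder:

```python
from transformers import AutoTokenizer

# Placeholder repo id -- substitute the actual Hub id of this checkpoint.
tokenizer = AutoTokenizer.from_pretrained("your-username/your-span-marker-model")

# The candidate-span markers sit directly after the original RoBERTa vocabulary.
print(tokenizer.convert_tokens_to_ids("<start>"))  # expected: 50265
print(tokenizer.convert_tokens_to_ids("<end>"))    # expected: 50266
print(tokenizer.model_max_length)                  # expected: 256
```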
final_checkpoint/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cdc21b4c9d1f74ac0be95d21943b06d50dbdfc012b15ba465d456f1e90484c21
+ size 4600
final_checkpoint/vocab.json ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3e3e3c83e4420ebd986fe000725351e0ac21193c9814e34278d52241f56a0d04
+ oid sha256:b1f2a83a5a877bf7c79b2dd581c8ee4ded91ae0fa501cadf1d3f12deb1be8fde
  size 499040084
runs/Nov27_07-50-08_trinity/events.out.tfevents.1701071419.trinity.224163.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b28df34a29ca4d98aeb2a077d6de9d0168090471d9b9de504b235b78cce552e6
- size 54439
+ oid sha256:dcdd56364d742c98d9ed208bcfd60f99eb58c37a29078af4258221972021cc34
+ size 58090
runs/Nov27_07-50-08_trinity/events.out.tfevents.1701076911.trinity.224163.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7aa30977e15166f604b04b4ef242ee6471c33e917341e69b9ce509dd8db09e3f
+ size 592
test_results.json ADDED
@@ -0,0 +1,407 @@
+ {
+   "epoch": 3.0,
+   "test_art-broadcastprogram": {
+     "f1": 0.5963149078726968,
+     "number": 603,
+     "precision": 0.6023688663282571,
+     "recall": 0.5903814262023217
+   },
+   "test_art-film": {
+     "f1": 0.7645466847090664,
+     "number": 750,
+     "precision": 0.7760989010989011,
+     "recall": 0.7533333333333333
+   },
+   "test_art-music": {
+     "f1": 0.7685459940652819,
+     "number": 1029,
+     "precision": 0.7824773413897281,
+     "recall": 0.7551020408163265
+   },
+   "test_art-other": {
+     "f1": 0.37103174603174605,
+     "number": 562,
+     "precision": 0.4192825112107623,
+     "recall": 0.33274021352313166
+   },
+   "test_art-painting": {
+     "f1": 0.5555555555555555,
+     "number": 57,
+     "precision": 0.5882352941176471,
+     "recall": 0.5263157894736842
+   },
+   "test_art-writtenart": {
+     "f1": 0.6649020645844362,
+     "number": 968,
+     "precision": 0.6818675352877307,
+     "recall": 0.6487603305785123
+   },
+   "test_building-airport": {
+     "f1": 0.8205128205128206,
+     "number": 364,
+     "precision": 0.8063660477453581,
+     "recall": 0.8351648351648352
+   },
+   "test_building-hospital": {
+     "f1": 0.7633986928104576,
+     "number": 364,
+     "precision": 0.7281795511221946,
+     "recall": 0.8021978021978022
+   },
+   "test_building-hotel": {
+     "f1": 0.7137546468401488,
+     "number": 265,
+     "precision": 0.7032967032967034,
+     "recall": 0.7245283018867924
+   },
+   "test_building-library": {
+     "f1": 0.7464387464387464,
+     "number": 355,
+     "precision": 0.7550432276657061,
+     "recall": 0.7380281690140845
+   },
+   "test_building-other": {
+     "f1": 0.5853370122191565,
+     "number": 2543,
+     "precision": 0.5867246147767681,
+     "recall": 0.5839559575304758
+   },
+   "test_building-restaurant": {
+     "f1": 0.5667447306791569,
+     "number": 232,
+     "precision": 0.6205128205128205,
+     "recall": 0.521551724137931
+   },
+   "test_building-sportsfacility": {
+     "f1": 0.6921487603305786,
+     "number": 420,
+     "precision": 0.6113138686131386,
+     "recall": 0.7976190476190477
+   },
+   "test_building-theater": {
+     "f1": 0.7270788912579957,
+     "number": 455,
+     "precision": 0.7060041407867494,
+     "recall": 0.7494505494505495
+   },
+   "test_event-attack/battle/war/militaryconflict": {
+     "f1": 0.7660377358490565,
+     "number": 1098,
+     "precision": 0.7945205479452054,
+     "recall": 0.7395264116575592
+   },
+   "test_event-disaster": {
+     "f1": 0.5603864734299517,
+     "number": 207,
+     "precision": 0.5603864734299517,
+     "recall": 0.5603864734299517
+   },
+   "test_event-election": {
+     "f1": 0.22040816326530616,
+     "number": 182,
+     "precision": 0.42857142857142855,
+     "recall": 0.14835164835164835
+   },
+   "test_event-other": {
+     "f1": 0.4629404617253949,
+     "number": 866,
+     "precision": 0.48846153846153845,
+     "recall": 0.4399538106235566
+   },
+   "test_event-protest": {
+     "f1": 0.42245989304812837,
+     "number": 166,
+     "precision": 0.3798076923076923,
+     "recall": 0.4759036144578313
+   },
+   "test_event-sportsevent": {
+     "f1": 0.6179955171309639,
+     "number": 1566,
+     "precision": 0.619781631342325,
+     "recall": 0.6162196679438059
+   },
+   "test_location-GPE": {
+     "f1": 0.8349881570447639,
+     "number": 20405,
+     "precision": 0.8157255048616305,
+     "recall": 0.8551825532957609
+   },
+   "test_location-bodiesofwater": {
+     "f1": 0.7472984206151289,
+     "number": 1169,
+     "precision": 0.7267582861762328,
+     "recall": 0.7690333618477331
+   },
+   "test_location-island": {
+     "f1": 0.7157894736842105,
+     "number": 646,
+     "precision": 0.7504244482173175,
+     "recall": 0.6842105263157895
+   },
+   "test_location-mountain": {
+     "f1": 0.7324981577008107,
+     "number": 681,
+     "precision": 0.735207100591716,
+     "recall": 0.7298091042584435
+   },
+   "test_location-other": {
+     "f1": 0.36490474912798493,
+     "number": 2191,
+     "precision": 0.4427083333333333,
+     "recall": 0.31036056595162026
+   },
+   "test_location-park": {
+     "f1": 0.7001114827201783,
+     "number": 458,
+     "precision": 0.715261958997722,
+     "recall": 0.6855895196506551
+   },
+   "test_location-road/railway/highway/transit": {
+     "f1": 0.7204861111111112,
+     "number": 1700,
+     "precision": 0.708997722095672,
+     "recall": 0.7323529411764705
+   },
+   "test_loss": 0.022335968911647797,
+   "test_organization-company": {
+     "f1": 0.7011596788581624,
+     "number": 3896,
+     "precision": 0.6962794229309036,
+     "recall": 0.7061088295687885
+   },
+   "test_organization-education": {
+     "f1": 0.7990314769975787,
+     "number": 2066,
+     "precision": 0.7994186046511628,
+     "recall": 0.7986447241045499
+   },
+   "test_organization-government/governmentagency": {
+     "f1": 0.49800072700836057,
+     "number": 1511,
+     "precision": 0.5524193548387096,
+     "recall": 0.45334215751158174
+   },
+   "test_organization-media/newspaper": {
+     "f1": 0.6583701324769169,
+     "number": 1232,
+     "precision": 0.6513105639396346,
+     "recall": 0.6655844155844156
+   },
+   "test_organization-other": {
+     "f1": 0.5660735468564649,
+     "number": 4439,
+     "precision": 0.59784515159108,
+     "recall": 0.5375084478486145
+   },
+   "test_organization-politicalparty": {
+     "f1": 0.704431247144815,
+     "number": 1054,
+     "precision": 0.6792951541850221,
+     "recall": 0.7314990512333965
+   },
+   "test_organization-religion": {
+     "f1": 0.583982990786676,
+     "number": 672,
+     "precision": 0.557510148849797,
+     "recall": 0.6130952380952381
+   },
+   "test_organization-showorganization": {
+     "f1": 0.5935228023793787,
+     "number": 769,
+     "precision": 0.603494623655914,
+     "recall": 0.5838751625487646
+   },
+   "test_organization-sportsleague": {
+     "f1": 0.6499442586399109,
+     "number": 882,
+     "precision": 0.6392543859649122,
+     "recall": 0.6609977324263039
+   },
+   "test_organization-sportsteam": {
+     "f1": 0.7518034704620783,
+     "number": 2473,
+     "precision": 0.7259036144578314,
+     "recall": 0.779619894864537
+   },
+   "test_other-astronomything": {
+     "f1": 0.7906976744186047,
+     "number": 678,
+     "precision": 0.7793696275071633,
+     "recall": 0.8023598820058997
+   },
+   "test_other-award": {
+     "f1": 0.6903954802259886,
+     "number": 919,
+     "precision": 0.717978848413631,
+     "recall": 0.6648531011969532
+   },
+   "test_other-biologything": {
+     "f1": 0.6536203522504893,
+     "number": 1874,
+     "precision": 0.6864357017028773,
+     "recall": 0.6237993596584845
+   },
+   "test_other-chemicalthing": {
+     "f1": 0.5856459330143541,
+     "number": 1014,
+     "precision": 0.5687732342007435,
+     "recall": 0.6035502958579881
+   },
+   "test_other-currency": {
+     "f1": 0.7643384440658716,
+     "number": 799,
+     "precision": 0.6995841995841996,
+     "recall": 0.8423028785982478
+   },
+   "test_other-disease": {
+     "f1": 0.6976744186046512,
+     "number": 749,
+     "precision": 0.6591448931116389,
+     "recall": 0.7409879839786382
+   },
+   "test_other-educationaldegree": {
+     "f1": 0.615595075239398,
+     "number": 363,
+     "precision": 0.6114130434782609,
+     "recall": 0.6198347107438017
+   },
+   "test_other-god": {
+     "f1": 0.6816143497757848,
+     "number": 635,
+     "precision": 0.6486486486486487,
+     "recall": 0.7181102362204724
+   },
+   "test_other-language": {
+     "f1": 0.7300291545189505,
+     "number": 753,
+     "precision": 0.6507276507276507,
+     "recall": 0.8313413014608234
+   },
+   "test_other-law": {
+     "f1": 0.7126673532440783,
+     "number": 472,
+     "precision": 0.6933867735470942,
+     "recall": 0.7330508474576272
+   },
+   "test_other-livingthing": {
+     "f1": 0.6298342541436465,
+     "number": 863,
+     "precision": 0.6019007391763463,
+     "recall": 0.660486674391657
+   },
+   "test_other-medical": {
+     "f1": 0.5168539325842697,
+     "number": 397,
+     "precision": 0.5123762376237624,
+     "recall": 0.5214105793450882
+   },
+   "test_overall_accuracy": 0.9256893595441806,
+   "test_overall_f1": 0.703084859534267,
+   "test_overall_precision": 0.7034273336857051,
+   "test_overall_recall": 0.7027427186979075,
+   "test_person-actor": {
+     "f1": 0.8214397008413836,
+     "number": 1637,
+     "precision": 0.8384223918575063,
+     "recall": 0.8051313378130727
+   },
+   "test_person-artist/author": {
+     "f1": 0.7320701754385964,
+     "number": 3463,
+     "precision": 0.7121791370835608,
+     "recall": 0.7531042448743863
+   },
+   "test_person-athlete": {
+     "f1": 0.8370089593383873,
+     "number": 2884,
+     "precision": 0.8318493150684931,
+     "recall": 0.8422330097087378
+   },
+   "test_person-director": {
+     "f1": 0.7221238938053098,
+     "number": 554,
+     "precision": 0.7083333333333334,
+     "recall": 0.7364620938628159
+   },
+   "test_person-other": {
+     "f1": 0.6784606547960942,
+     "number": 8767,
+     "precision": 0.6833275483049867,
+     "recall": 0.6736625983802897
+   },
+   "test_person-politician": {
+     "f1": 0.6821515892420537,
+     "number": 2857,
+     "precision": 0.6807249912861624,
+     "recall": 0.6835841792089604
+   },
+   "test_person-scholar": {
+     "f1": 0.5301369863013699,
+     "number": 743,
+     "precision": 0.5397489539748954,
+     "recall": 0.5208613728129206
+   },
+   "test_person-soldier": {
+     "f1": 0.5451957295373664,
+     "number": 647,
+     "precision": 0.5052770448548812,
+     "recall": 0.5919629057187017
+   },
+   "test_product-airplane": {
+     "f1": 0.6654111738857501,
+     "number": 792,
+     "precision": 0.66167290886392,
+     "recall": 0.6691919191919192
+   },
+   "test_product-car": {
+     "f1": 0.7221812822402359,
+     "number": 687,
+     "precision": 0.7313432835820896,
+     "recall": 0.7132459970887919
+   },
+   "test_product-food": {
+     "f1": 0.5787037037037037,
+     "number": 432,
+     "precision": 0.5787037037037037,
+     "recall": 0.5787037037037037
+   },
+   "test_product-game": {
+     "f1": 0.7250257466529352,
+     "number": 493,
+     "precision": 0.7364016736401674,
+     "recall": 0.7139959432048681
+   },
+   "test_product-other": {
+     "f1": 0.4794617563739376,
+     "number": 1608,
+     "precision": 0.5567434210526315,
+     "recall": 0.4210199004975124
+   },
+   "test_product-ship": {
+     "f1": 0.6842105263157895,
+     "number": 380,
+     "precision": 0.6842105263157895,
+     "recall": 0.6842105263157895
+   },
+   "test_product-software": {
+     "f1": 0.6570316842690384,
+     "number": 889,
+     "precision": 0.6494505494505495,
+     "recall": 0.6647919010123734
+   },
+   "test_product-train": {
+     "f1": 0.5933014354066984,
+     "number": 314,
+     "precision": 0.5942492012779552,
+     "recall": 0.5923566878980892
+   },
+   "test_product-weapon": {
+     "f1": 0.584426946631671,
+     "number": 624,
+     "precision": 0.6435452793834296,
+     "recall": 0.5352564102564102
+   },
+   "test_runtime": 242.3784,
+   "test_samples_per_second": 189.732,
+   "test_steps_per_second": 5.933
+ }
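test_results.json bundles one precision/recall/F1/support block per fine-grained few-nerd class alongside the overall metrics. A small sketch for eyeballing the strongest and weakest classes, assuming the file has been downloaded locally as `test_results.json`:

```python
import json

with open("test_results.json") as f:
    results = json.load(f)

# Per-class entries are dicts with f1/precision/recall/number;
# overall metrics (loss, accuracy, overall F1, runtime) are plain floats.
per_class = {
    key.removeprefix("test_"): value
    for key, value in results.items()
    if isinstance(value, dict)
}

ranked = sorted(per_class.items(), key=lambda item: item[1]["f1"], reverse=True)
print("best :", ranked[0])
print("worst:", ranked[-1])
print("overall F1:", results["test_overall_f1"])
```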