andrewzamai committed on
Commit
12a6fed
1 Parent(s): d3736f4

Update README.md

Files changed (1)
  1. README.md +213 -0
README.md CHANGED
@@ -79,6 +79,219 @@ SLIMER performs comparably to these state-of-the-art models on OOD input domains
  We extend the standard zero-shot evaluations (CrossNER and MIT) with BUSTER, which is characterized by financial entities that are rather far from the more traditional tags observed by all models during training.
  An inverse trend can be observed, with SLIMER emerging as the most effective in dealing with these unseen labels, thanks to its lighter instruction-tuning methodology and its use of definitions and guidelines.
82
  <table>
83
  <thead>
84
  <tr>
 
+ <!DOCTYPE html>
+ <html>
+ <head>
+ <style>
+ table {
+ width: 100%;
+ border-collapse: collapse;
+ font-size: 12px;
+ }
+ th, td {
+ border: 1px solid black;
+ padding: 4px;
+ text-align: center;
+ }
+ th {
+ background-color: #f2f2f2;
+ }
+ .col-model { width: 10%; }
+ .col-backbone { width: 15%; }
+ .col-params { width: 10%; }
+ .col-mit, .col-crossner, .col-buster, .col-avg { width: 7%; }
+ </style>
+ </head>
+ <body>
+
+ <table>
+ <thead>
+ <tr>
+ <th class="col-model">Model</th>
+ <th class="col-backbone">Backbone</th>
+ <th class="col-params">#Params</th>
+ <th class="col-mit" colspan="2">MIT</th>
+ <th class="col-crossner" colspan="5">CrossNER</th>
+ <th class="col-buster">BUSTER</th>
+ <th class="col-avg">AVG</th>
+ </tr>
+ <tr>
+ <th></th>
+ <th></th>
+ <th></th>
+ <th class="col-mit">Movie</th>
+ <th class="col-mit">Restaurant</th>
+ <th class="col-crossner">AI</th>
+ <th class="col-crossner">Literature</th>
+ <th class="col-crossner">Music</th>
+ <th class="col-crossner">Politics</th>
+ <th class="col-crossner">Science</th>
+ <th class="col-buster"></th>
+ <th class="col-avg"></th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td class="col-model">ChatGPT</td>
+ <td class="col-backbone">gpt-3.5-turbo</td>
+ <td class="col-params">-</td>
+ <td class="col-mit">5.3</td>
+ <td class="col-mit">32.8</td>
+ <td class="col-crossner">52.4</td>
+ <td class="col-crossner">39.8</td>
+ <td class="col-crossner">66.6</td>
+ <td class="col-crossner">68.5</td>
+ <td class="col-crossner">67.0</td>
+ <td class="col-buster">-</td>
+ <td class="col-avg">-</td>
+ </tr>
+ <tr>
+ <td class="col-model">InstructUIE</td>
+ <td class="col-backbone">Flan-T5-xxl</td>
+ <td class="col-params">11B</td>
+ <td class="col-mit">63.0</td>
+ <td class="col-mit">21.0</td>
+ <td class="col-crossner">49.0</td>
+ <td class="col-crossner">47.2</td>
+ <td class="col-crossner">53.2</td>
+ <td class="col-crossner">48.2</td>
+ <td class="col-crossner">49.3</td>
+ <td class="col-buster">-</td>
+ <td class="col-avg">-</td>
+ </tr>
+ <tr>
+ <td class="col-model">UniNER-type</td>
+ <td class="col-backbone">LLaMA-1</td>
+ <td class="col-params">7B</td>
+ <td class="col-mit">42.4</td>
+ <td class="col-mit">31.7</td>
+ <td class="col-crossner">53.5</td>
+ <td class="col-crossner">59.4</td>
+ <td class="col-crossner">65.0</td>
+ <td class="col-crossner">60.8</td>
+ <td class="col-crossner">61.1</td>
+ <td class="col-buster">34.8</td>
+ <td class="col-avg">51.1</td>
+ </tr>
+ <tr>
+ <td class="col-model">UniNER-def</td>
+ <td class="col-backbone">LLaMA-1</td>
+ <td class="col-params">7B</td>
+ <td class="col-mit">27.1</td>
+ <td class="col-mit">27.9</td>
+ <td class="col-crossner">44.5</td>
+ <td class="col-crossner">49.2</td>
+ <td class="col-crossner">55.8</td>
+ <td class="col-crossner">57.5</td>
+ <td class="col-crossner">52.9</td>
+ <td class="col-buster">33.6</td>
+ <td class="col-avg">43.6</td>
+ </tr>
+ <tr>
+ <td class="col-model">UniNER-type+sup.</td>
+ <td class="col-backbone">LLaMA-1</td>
+ <td class="col-params">7B</td>
+ <td class="col-mit">61.2</td>
+ <td class="col-mit">35.2</td>
+ <td class="col-crossner">62.9</td>
+ <td class="col-crossner">64.9</td>
+ <td class="col-crossner">70.6</td>
+ <td class="col-crossner">66.9</td>
+ <td class="col-crossner">70.8</td>
+ <td class="col-buster">37.8</td>
+ <td class="col-avg">58.8</td>
+ </tr>
+ <tr>
+ <td class="col-model">GoLLIE</td>
+ <td class="col-backbone">Code-LLaMA</td>
+ <td class="col-params">7B</td>
+ <td class="col-mit">63.0</td>
+ <td class="col-mit">43.4</td>
+ <td class="col-crossner">59.1</td>
+ <td class="col-crossner">62.7</td>
+ <td class="col-crossner">67.8</td>
+ <td class="col-crossner">57.2</td>
+ <td class="col-crossner">55.5</td>
+ <td class="col-buster">27.7</td>
+ <td class="col-avg">54.6</td>
+ </tr>
+ <tr>
+ <td class="col-model">GLiNER-L</td>
+ <td class="col-backbone">DeBERTa-v3</td>
+ <td class="col-params">0.3B</td>
+ <td class="col-mit">57.2</td>
+ <td class="col-mit">42.9</td>
+ <td class="col-crossner">57.2</td>
+ <td class="col-crossner">64.4</td>
+ <td class="col-crossner">69.6</td>
+ <td class="col-crossner">72.6</td>
+ <td class="col-crossner">62.6</td>
+ <td class="col-buster">26.6</td>
+ <td class="col-avg">56.6</td>
+ </tr>
+ <tr>
+ <td class="col-model">GNER-T5</td>
+ <td class="col-backbone">Flan-T5-xxl</td>
+ <td class="col-params">11B</td>
+ <td class="col-mit">62.5</td>
+ <td class="col-mit">51.0</td>
+ <td class="col-crossner">68.2</td>
+ <td class="col-crossner">68.7</td>
+ <td class="col-crossner">81.2</td>
+ <td class="col-crossner">75.1</td>
+ <td class="col-crossner">76.7</td>
+ <td class="col-buster" style="color: red;">27.9</td>
+ <td class="col-avg">63.9</td>
+ </tr>
+ <tr>
+ <td class="col-model">GNER-LLaMA</td>
+ <td class="col-backbone">LLaMA-1</td>
+ <td class="col-params">7B</td>
+ <td class="col-mit">68.6</td>
+ <td class="col-mit">47.5</td>
+ <td class="col-crossner">63.1</td>
+ <td class="col-crossner">68.2</td>
+ <td class="col-crossner">75.7</td>
+ <td class="col-crossner">69.4</td>
+ <td class="col-crossner">69.9</td>
+ <td class="col-buster" style="color: red;">23.6</td>
+ <td class="col-avg">60.8</td>
+ </tr>
+ <tr>
+ <td class="col-model">SLIMER w/o D&amp;G</td>
+ <td class="col-backbone">LLaMA-2-chat</td>
+ <td class="col-params">7B</td>
+ <td class="col-mit">46.4</td>
+ <td class="col-mit">36.3</td>
+ <td class="col-crossner">49.6</td>
+ <td class="col-crossner">58.4</td>
+ <td class="col-crossner">56.8</td>
+ <td class="col-crossner">57.9</td>
+ <td class="col-crossner">53.8</td>
+ <td class="col-buster">40.4</td>
+ <td class="col-avg">49.9</td>
+ </tr>
+ <tr>
+ <td class="col-model"><b>SLIMER</b></td>
+ <td class="col-backbone"><b>LLaMA-2-chat</b></td>
+ <td class="col-params"><b>7B</b></td>
+ <td class="col-mit"><b>50.9</b></td>
+ <td class="col-mit"><b>38.2</b></td>
+ <td class="col-crossner"><b>50.1</b></td>
+ <td class="col-crossner"><b>58.7</b></td>
+ <td class="col-crossner"><b>60.0</b></td>
+ <td class="col-crossner"><b>63.9</b></td>
+ <td class="col-crossner"><b>56.3</b></td>
+ <td class="col-buster"><b>45.3</b></td>
+ <td class="col-avg"><b>52.9</b></td>
+ </tr>
+ </tbody>
+ </table>
+
+ </body>
+ </html>
+
+
  <table>
  <thead>
  <tr>