Ihor committed on
Commit
5fd1a07
1 Parent(s): 60bc0c0

Add benchmarking results

  1. README.md +8 -0
README.md CHANGED
@@ -186,6 +186,14 @@ results = process(text, prompt)
 
  print(results)
  ```
+ ### Benchmarking
+ The table below reports the performance of UTC models on the [CrossNER](https://huggingface.co/datasets/DFKI-SLT/cross_ner) dataset. Values are micro-averaged F1 scores computed at the word level.
+
+ | Model             | AI     | Literature | Music  | Politics | Science |
+ |-------------------|--------|------------|--------|----------|---------|
+ | UTC-DeBERTa-small | 0.8492 | 0.8792     | 0.864  | 0.9008   | 0.85    |
+ | UTC-DeBERTa-base  | 0.8452 | 0.8587     | 0.8711 | 0.9147   | 0.8631  |
+ | UTC-DeBERTa-large | 0.8971 | 0.8978     | 0.9204 | 0.9247   | 0.8779  |
 
  ### Future reading
  Check our blogpost, ["As GPT4 but for token classification"](https://medium.com/p/9b5a081fbf27), where we highlighted possible use cases of the model and why next-token prediction is not the only way to achieve amazing zero-shot capabilities.
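
Note on the metric: the added Benchmarking section reports micro F1 at the word level, but the commit does not include the evaluation script. The sketch below shows one plausible way such a score could be computed. It assumes each word carries exactly one gold label and one predicted label, with `"O"` marking words outside any entity. The function name `word_level_micro_f1` and the label strings are hypothetical and do not come from the repository.

```python
from typing import Sequence


def word_level_micro_f1(
    gold: Sequence[Sequence[str]],
    pred: Sequence[Sequence[str]],
    outside: str = "O",
) -> float:
    """Micro-averaged F1 over per-word labels, pooled across all sentences.

    Assumption: one label per word; `outside` marks non-entity words.
    """
    tp = fp = fn = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences must align word by word")
        for g, p in zip(gold_sent, pred_sent):
            if p != outside and p == g:
                tp += 1  # correct entity label on this word
            else:
                if p != outside:
                    fp += 1  # predicted an entity label that is wrong here
                if g != outside:
                    fn += 1  # gold entity label was missed or mislabeled
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only (not CrossNER output).
gold = [["person", "O", "location"], ["O", "organisation"]]
pred = [["person", "O", "O"], ["location", "organisation"]]
score = word_level_micro_f1(gold, pred)
print(round(score, 4))  # 0.6667
```

Because counts are pooled over all words before precision and recall are computed, frequent entity types weigh more than rare ones. A macro average would instead compute F1 per type and then take the mean.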