---
license: mit
datasets:
- numind/NuNER
library_name: gliner
language:
- en
pipeline_tag: token-classification
tags:
- entity recognition
- NER
- named entity recognition
- zero shot
- zero-shot
---
NuZero is a family of Zero-Shot Entity Recognition models inspired by [GLiNER](https://huggingface.co/papers/2311.08526) and built with insights we gathered throughout our work on [NuNER](https://huggingface.co/collections/numind/nuner-token-classification-and-ner-backbones-65e1f6e14639e2a465af823b).

NuZero span is a more powerful version of GLiNER-large-v2.1, surpassing it by 4% on average, and is trained on a diverse internal dataset tailored for real-life use cases.
<p align="center">
<img src="zero_shot_performance_span.png">
</p>
## Installation & Usage
```bash
pip install gliner
```
**NuZero requires labels to be lower-cased**
```python
from gliner import GLiNER

model = GLiNER.from_pretrained("numind/NuZero_span")

# NuZero requires labels to be lower-cased!
labels = ["person", "award", "date", "competitions", "teams"]

# Replace with the text you want to annotate.
text = """
"""

entities = model.predict_entities(text, labels)

for entity in entities:
    print(entity["text"], "=>", entity["label"])
```
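`predict_entities` also accepts an optional confidence threshold (parameter name and default taken from the upstream GLiNER API, so check your installed gliner version), which can be raised to trade recall for precision. A minimal sketch:

```python
# Keep only higher-confidence predictions; the threshold value is illustrative.
entities = model.predict_entities(text, labels, threshold=0.7)

for entity in entities:
    # In current gliner versions each prediction also carries a "score" field.
    print(entity["text"], "=>", entity["label"], f"({entity['score']:.2f})")
```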
## Fine-tuning
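NuZero can be fine-tuned with the training scripts shipped with the gliner library. As a rough orientation, those scripts expect training data as a list of dicts with pre-tokenized text and token-level span annotations. The snippet below is a minimal sketch of that format only (field names, inclusive end indices, and the example sentence are assumptions based on the upstream GLiNER training examples, not part of this model card), so verify it against the library version you install:

```python
# Minimal sketch of a GLiNER-style training example (format assumed from the
# upstream gliner training examples; verify against your installed version).
train_sample = {
    "tokenized_text": ["Jane", "Doe", "won", "the", "award", "in", "2021", "."],
    # Each span is [start_token_index, end_token_index, label], indices inclusive.
    "ner": [
        [0, 1, "person"],
        [6, 6, "date"],
    ],
}
```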
## Citation
```bibtex
@misc{bogdanov2024nuner,
    title={NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data},
    author={Sergei Bogdanov and Alexandre Constantin and Timothée Bernard and Benoit Crabbé and Etienne Bernard},
    year={2024},
    eprint={2402.15343},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```