---
language: 
  - en

tags:
- pytorch
- ner
- qa

inference: false

license: mit

datasets:
- conll2003

metrics:
- f1
---
# t5-base-qa-ner-conll

Unofficial implementation of [InstructionNER](https://arxiv.org/pdf/2203.03903v1.pdf): a `t5-base` model fine-tuned on the CoNLL-2003 (`conll2003`) dataset for named entity recognition.

Source code: https://github.com/ovbystrova/InstructionNER

## Inference 
```shell
git clone https://github.com/ovbystrova/InstructionNER 
cd InstructionNER
```

```python
from instruction_ner.model import Model

model = Model(
    model_path_or_name="olgaduchovny/t5-base-qa-ner-conll",
    tokenizer_path_or_name="olgaduchovny/t5-base-qa-ner-conll"
)

options = ["LOC", "PER", "ORG", "MISC"]

instruction = "please extract entities and their types from the input sentence, " \
              "all entity types are in options"

text = "The protest , which attracted several thousand supporters , coincided with the 18th anniversary of Spain 's constitution ."

generation_kwargs = {
    "num_beams": 2,
    "max_length": 128
}

pred_spans = model.predict(
    text=text,
    generation_kwargs=generation_kwargs,
    instruction=instruction,
    options=options
)

print(pred_spans)
# [(99, 104, 'LOC')]
```
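
Judging by the example output, each predicted span is a `(start, end, label)` tuple of character offsets into `text`, with an exclusive end index. A minimal sketch of turning such spans back into entity strings follows; `pred_spans` is hard-coded here from the output above so the snippet runs without loading the model, and the `entities` name is only illustrative.

```python
text = (
    "The protest , which attracted several thousand supporters , "
    "coincided with the 18th anniversary of Spain 's constitution ."
)

# Copied from the example output above (assumption: offsets are
# character indices into `text`, end-exclusive).
pred_spans = [(99, 104, "LOC")]

# Slice the original text with each span to recover the entity surface form.
entities = [(text[start:end], label) for start, end, label in pred_spans]

print(entities)
# [('Spain', 'LOC')]
```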

## Prediction Sample
```
Sentence: The protest , which attracted several thousand supporters , coincided with the 18th anniversary of Spain 's constitution .
Instruction: please extract entities and their types from the input sentence, all entity types are in options
Options: ORG, PER, LOC

Prediction (raw text): Spain is a LOC.
Prediction (span): [(99, 104, 'LOC')]
```
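
The sample shows two views of the same prediction: the raw generated text and the character span. The sketch below illustrates one way the raw text could be mapped to spans; it is not the repository's actual code. The helper name `raw_to_spans`, the `"<entity> is a <TYPE>"` pattern, and the first-occurrence lookup are all assumptions made for illustration.

```python
import re


def raw_to_spans(text, raw_prediction, options):
    """Hypothetical helper: map 'X is a TYPE' statements to character spans."""
    spans = []
    # Assumed template: "<entity> is a/an <TYPE>", optionally followed by "," or ".".
    for match in re.finditer(r"(.+?) is an? (\w+)[,.]?\s*", raw_prediction):
        entity, label = match.group(1).strip(), match.group(2)
        if label not in options:
            continue  # ignore types outside the allowed options
        start = text.find(entity)  # simplification: first occurrence only
        if start != -1:
            spans.append((start, start + len(entity), label))
    return spans


text = (
    "The protest , which attracted several thousand supporters , "
    "coincided with the 18th anniversary of Spain 's constitution ."
)

spans = raw_to_spans(text, "Spain is a LOC.", ["ORG", "PER", "LOC"])
print(spans)
# [(99, 104, 'LOC')]
```

Because the lookup uses the first occurrence of the entity string, a sentence that mentions the same entity twice would need a more careful alignment.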