---
language: "en"
tags:
- dstc9
widget:
- text: "i want to book the hilton hotel near china town."
- text: "can you reserve A & B restaurant for me?"
---
The model tags only restaurant, hotel, and attraction names, based on the following data and knowledge base.
Data link: https://github.com/alexa/alexa-with-dstc9-track1-dataset
Label map:
"O": 0
"B-hotel": 1
"I-hotel": 2
"B-restaurant": 3
"I-restaurant": 4
"B-attraction": 5
"I-attraction": 6
```python
from transformers import AutoConfig, AutoModelForTokenClassification, BertTokenizer
from transformers import TokenClassificationPipeline
import json

model_path = "wilsontam/dstc9_ner"

# Maps the generic LABEL_n names to the tags in the label map above.
label_map = {
    "LABEL_0": "O",
    "LABEL_1": "B-hotel",
    "LABEL_2": "I-hotel",
    "LABEL_3": "B-restaurant",
    "LABEL_4": "I-restaurant",
    "LABEL_5": "B-attraction",
    "LABEL_6": "I-attraction",
}

# Load the config with one output class per entry in label_map.
config = AutoConfig.from_pretrained(
    model_path,
    num_labels=len(label_map),
)

model = AutoModelForTokenClassification.from_pretrained(
    model_path,
    from_tf=False,
    config=config,
)

tokenizer = BertTokenizer.from_pretrained(
    model_path,
)

# device=-1: cpu, device=0: gpu
pipeline = TokenClassificationPipeline(model, tokenizer, device=-1)

# Tag the two widget example sentences in one batch.
tokens = pipeline(["i want to book the hilton hotel near china town.", "can you reserve A & B restaurant for me?"])
```
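The pipeline call above yields per-token predictions whose `entity` field carries a generic `LABEL_n` name. The following is a minimal sketch of how those predictions might be turned into named spans using the label map. The helper `group_entities` is hypothetical (it is not part of `transformers`), it assumes each prediction is a dict with `entity` and `word` keys and that WordPiece continuations start with `##`, and the `sample` list is hand-written illustrative input, not actual model output.

```python
# Same mapping as the label map above: generic pipeline labels -> BIO tags.
LABEL_MAP = {
    "LABEL_0": "O",
    "LABEL_1": "B-hotel",
    "LABEL_2": "I-hotel",
    "LABEL_3": "B-restaurant",
    "LABEL_4": "I-restaurant",
    "LABEL_5": "B-attraction",
    "LABEL_6": "I-attraction",
}


def group_entities(predictions):
    """Merge per-token BIO predictions into (entity_type, text) spans.

    Hypothetical helper: `predictions` is assumed to be a list of dicts
    with at least the keys "entity" (e.g. "LABEL_1") and "word".
    """
    spans = []
    current_type, current_words = None, []

    def flush():
        # Close the span being built, if any, and reset the state.
        nonlocal current_type, current_words
        if current_type is not None:
            spans.append((current_type, " ".join(current_words)))
        current_type, current_words = None, []

    for pred in predictions:
        tag = LABEL_MAP.get(pred["entity"], "O")
        word = pred["word"]
        if tag == "O":
            flush()
            continue
        prefix, entity_type = tag.split("-", 1)
        if word.startswith("##") and current_words and entity_type == current_type:
            # WordPiece continuation: glue onto the previous word.
            current_words[-1] += word[2:]
        elif prefix == "I" and entity_type == current_type:
            # Same entity continues with a new word.
            current_words.append(word)
        else:
            # A B- tag, or an I- tag with no matching open span, starts a new span.
            flush()
            current_type, current_words = entity_type, [word]
    flush()
    return spans


# Hand-written sample in the assumed shape (NOT real model output).
sample = [
    {"entity": "LABEL_0", "word": "book"},
    {"entity": "LABEL_0", "word": "the"},
    {"entity": "LABEL_1", "word": "hilton"},
    {"entity": "LABEL_2", "word": "hotel"},
    {"entity": "LABEL_0", "word": "near"},
    {"entity": "LABEL_5", "word": "china"},
    {"entity": "LABEL_6", "word": "town"},
]
print(group_entities(sample))
# [('hotel', 'hilton hotel'), ('attraction', 'china town')]
```

When the pipeline is given a list of sentences, it returns one list of predictions per sentence, so under these assumptions the helper would be applied to each element of `tokens` in turn.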
Credit: Jia-Chen Jason Gu, Wilson Tam