---
language: en
datasets:
- wnut_17
license: mit
metrics:
- f1
widget:
- text: "My name is Sylvain and I live in Paris"
  example_title: "Parisian"
- text: "My name is Sarah and I live in London"
  example_title: "Londoner"
---

# Reddit NER for place names

A `bert-base-uncased` model fine-tuned for named entity recognition on `wnut_17` plus 498 additional comments from Reddit. The model is intended solely for extracting place names from social media text, so all other entity types have been removed.

This model was created with two key goals:

1. Improve NER results on social media text
2. Target only place names

## Use in `transformers`

```python
from transformers import pipeline

# Load the model and its tokenizer into a token-classification (NER) pipeline.
# aggregation_strategy="first" merges word pieces into whole-word entities.
generator = pipeline(
    task="ner",
    model="cjber/reddit-ner-place_names",
    tokenizer="cjber/reddit-ner-place_names",
    aggregation_strategy="first",
)

out = generator("I live north of liverpool in Waterloo")
```

`out` contains:

```python
[{'entity_group': 'location',
  'score': 0.94054973,
  'word': 'liverpool',
  'start': 16,
  'end': 25},
 {'entity_group': 'location',
  'score': 0.99520856,
  'word': 'waterloo',
  'start': 29,
  'end': 37}]
```
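
Because the base model is uncased, the `word` field is lowercased (`'waterloo'` rather than `'Waterloo'`). If the original casing matters, one option is to slice the input text with the `start` and `end` offsets instead. The following is a minimal sketch of that idea, not part of the model's API: the helper name `extract_places` and the `min_score` threshold are hypothetical, and the `out` list is copied from the example output above so the snippet runs without downloading the model.

```python
text = "I live north of liverpool in Waterloo"

# Pipeline output copied from the example above.
out = [
    {"entity_group": "location", "score": 0.94054973,
     "word": "liverpool", "start": 16, "end": 25},
    {"entity_group": "location", "score": 0.99520856,
     "word": "waterloo", "start": 29, "end": 37},
]


def extract_places(text, entities, min_score=0.5):
    """Return place names as they appear in `text` (hypothetical helper).

    Uses the character offsets rather than the lowercased `word` field,
    and drops entities scoring below `min_score` (an assumed threshold).
    """
    return [
        text[ent["start"]:ent["end"]]
        for ent in entities
        if ent["entity_group"] == "location" and ent["score"] >= min_score
    ]


places = extract_places(text, out)
print(places)  # ['liverpool', 'Waterloo']
```

Raising `min_score` trades recall for precision; a suitable value depends on the data and is not specified by this model card.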