metadata
language:
  - en
tags:
  - token-classification
  - address-NER
  - NER
  - bert-base-uncased
datasets:
  - Ultra Fine Entity Typing
metrics:
  - Precision
  - Recall
  - F1 Score
widget:
  - text: Hi, I am Kermit and I live in Berlin
  - text: It is very difficult to find a house in Berlin, Germany.
  - text: ML6 is a very cool company from Belgium
  - text: Samuel ppops in a happy plce called Berlin which happens to be Kazakhstan
  - text: >-
      My family and I visited Montreal, Canada last week and the flight from
      Amsterdam took 9 hours

City-Country-NER

A bert-base-uncased model finetuned on a custom dataset to detect country and city names in a given sentence.

Custom Dataset

We used weak supervision to add city and country labels to the Ultra-Fine Entity Typing dataset, then applied extra preprocessing to remove false labels.

The model predicts three tags: OTHER, CITY, and COUNTRY.
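To give a rough idea of what weak supervision with these three tags can look like, here is a minimal sketch that labels tokens by dictionary lookup. This is an assumption for illustration only, not the procedure used to build the dataset: the `CITIES` and `COUNTRIES` gazetteers and the `weak_label` helper are hypothetical, and a real setup would need much larger lists plus handling for multi-word names.

```python
# Hypothetical gazetteers (illustrative only; not the lists used for this model).
CITIES = {"berlin", "london", "montreal", "amsterdam"}
COUNTRIES = {"germany", "belgium", "canada", "kazakhstan"}


def weak_label(tokens):
    """Assign CITY, COUNTRY, or OTHER to each token by dictionary lookup."""
    labels = []
    for tok in tokens:
        # Normalise case and strip trailing punctuation before the lookup.
        key = tok.lower().strip(".,!?")
        if key in CITIES:
            labels.append("CITY")
        elif key in COUNTRIES:
            labels.append("COUNTRY")
        else:
            labels.append("OTHER")
    return labels


tokens = "It is very difficult to find a house in Berlin, Germany.".split()
labels = weak_label(tokens)
print(list(zip(tokens, labels)))
```

Lookup-based labels like these are noisy (a place name can also be a person or company name), which is why a cleanup pass to remove false labels matters.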

How to use the finetuned model?

from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the finetuned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("ml6team/bert-base-uncased-city-country-ner")
model = AutoModelForTokenClassification.from_pretrained("ml6team/bert-base-uncased-city-country-ner")

# aggregation_strategy="simple" merges word pieces into whole entities
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
nlp("My name is Kermit and I live in London.")
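With `aggregation_strategy="simple"`, the pipeline returns a list of dicts with the keys `entity_group`, `score`, `word`, `start`, and `end`. The sketch below shows one way to turn such a list into a per-tag lookup. The `extract_places` helper and its `min_score` threshold are hypothetical, the label names are assumed to match the tags listed above, and the `sample` list is a hand-written stand-in rather than real model output, so the scores are illustrative only.

```python
from collections import defaultdict


def extract_places(text, entities, min_score=0.5, keep=("CITY", "COUNTRY")):
    """Group aggregated NER results by tag, dropping unwanted tags and low scores."""
    places = defaultdict(list)
    for ent in entities:
        if ent["entity_group"] in keep and ent["score"] >= min_score:
            # Slice the original text so the casing is preserved
            # (the uncased model lowercases the "word" field).
            places[ent["entity_group"]].append(text[ent["start"]:ent["end"]])
    return dict(places)


text = "My family and I visited Montreal, Canada last week"

# Hand-written stand-in for pipeline output; scores are made up for illustration.
sample = [
    {"entity_group": "CITY", "score": 0.99, "word": "montreal", "start": 24, "end": 32},
    {"entity_group": "COUNTRY", "score": 0.98, "word": "canada", "start": 34, "end": 40},
    {"entity_group": "CITY", "score": 0.30, "word": "week", "start": 46, "end": 50},
]

places = extract_places(text, sample)
print(places)
```

In real use, `sample` would be replaced by the result of `nlp(text)`.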