shubhamkrishna committed on
Commit
f5afbf0
1 Parent(s): dc64808

Create README.md

Files changed (1)
  1. README.md +31 -0
README.md ADDED
@@ -0,0 +1,31 @@
+ ## City-Country-NER
+
+ A `bert-base-uncased` model fine-tuned on a custom dataset to detect `Country` and `City` names in a given sentence.
+
+ ### Custom Dataset
+ We used weak supervision to annotate the [Ultra-Fine Entity Typing](https://www.cs.utexas.edu/~eunsol/html_pages/open_entity.html) dataset with `City` and `Country` labels, then applied extra preprocessing to remove false labels.
+
+ The model predicts 3 different tags:
+
+ | **Predicted Tag** | **Meaning** |
+ |-------------------|-------------|
+ | LABEL_0           | Others      |
+ | LABEL_2           | Country     |
+ | LABEL_3           | City        |
+
+
+
+ ### How to use the fine-tuned model
+
+ ```
+ from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline
+
+ # use_auth_token=True sends the access token saved by `huggingface-cli login`.
+ tokenizer = AutoTokenizer.from_pretrained("ml6team/bert-base-uncased-city-country-ner", use_auth_token=True)
+ model = AutoModelForTokenClassification.from_pretrained("ml6team/bert-base-uncased-city-country-ner", use_auth_token=True)
+
+ # aggregation_strategy="simple" groups adjacent tokens with the same tag into one entity.
+ nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
+ # Returns a list of dicts, one per aggregated entity.
+ nlp("My name is Kermit and I live in London.")
+ ```
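
Because the pipeline reports the raw `LABEL_*` tags, a small post-processing step can make the output easier to read. The sketch below is not part of the README. It assumes the aggregated `ner` pipeline returns a list of dicts with `entity_group`, `score`, `word`, `start` and `end` keys, and it uses the tag meanings from the table above. The helper name `extract_places` and the sample data are hypothetical: the sample is hand-written for illustration and is not real model output.

```python
# Tag meanings copied from the table in the README.
TAG_MEANINGS = {"LABEL_0": "Others", "LABEL_2": "Country", "LABEL_3": "City"}


def extract_places(entities):
    """Keep only the Country/City spans from aggregated `ner` pipeline output."""
    places = []
    for entity in entities:
        # Tags missing from the table are reported as "Unknown" and skipped.
        meaning = TAG_MEANINGS.get(entity["entity_group"], "Unknown")
        if meaning in ("Country", "City"):
            places.append(
                {
                    "text": entity["word"],
                    "type": meaning,
                    "score": float(entity["score"]),
                }
            )
    return places


# Hypothetical, hand-written stand-in for the pipeline output (illustrative scores).
sample = [
    {"entity_group": "LABEL_0", "score": 0.99, "word": "my name is kermit and i live in", "start": 0, "end": 31},
    {"entity_group": "LABEL_3", "score": 0.98, "word": "london", "start": 32, "end": 38},
    {"entity_group": "LABEL_0", "score": 0.99, "word": ".", "start": 38, "end": 39},
]

print(extract_places(sample))
```

With a loaded pipeline, the same helper could be called as `extract_places(nlp("My name is Kermit and I live in London."))`.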