lxyuan/span-marker-bert-base-multilingual-cased-multinerd



lxyuan committed on Aug 11, 2023

Commit fc40971 · 1 Parent(s): 5e21c98

Add chinese inference example

Files changed (1)

README.md +27 -1


@@ -24,7 +24,7 @@ model-index:
       name: Precision
     - type: recall
       value: 0.9281
-      name: Recal
+      name: Recall
 license: apache-2.0
 datasets:
 - Babelscape/multinerd
@@ -117,6 +117,32 @@ entities
 # :(
 ```
+#### Quick test on Chinese
+```python
+from span_marker import SpanMarkerModel
+model = SpanMarkerModel.from_pretrained("lxyuan/span-marker-bert-base-multilingual-cased-multinerd")
+# translate to chinese
+description = "Singapore is renowned for its hawker centers offering dishes \
+like Hainanese chicken rice and laksa, while Malaysia boasts dishes such as \
+nasi lemak and rendang, reflecting its rich culinary heritage."
+zh_description = "新加坡因其小贩中心提供海南鸡饭和叻沙等菜肴而闻名, 而马来西亚则拥有椰浆饭和仁当等菜肴，反映了其丰富的烹饪传统."
+entities = model.predict(zh_description)
+entities
+>>>
+[
+  {'span': '新加坡', 'label': 'LOC', 'score': 0.9282007813453674, 'char_start_index': 0, 'char_end_index': 3},
+  {'span': '马来西亚', 'label': 'LOC', 'score': 0.7439665794372559, 'char_start_index': 27, 'char_end_index': 31}]
+# It only managed to capture two countries: Singapore and Malaysia.
+# All other entities were missed out.
+```
 ## Training procedure
 One can reproduce the result running this [script](https://huggingface.co/tomaarsen/span-marker-mbert-base-multinerd/blob/main/train.py)
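
The prediction format shown in the added Chinese example can be sanity-checked without loading the model. The sketch below copies the sentence and the two predictions from that example, confirms that each `char_start_index`/`char_end_index` pair slices back to its `span`, and then filters by confidence. The helpers `offsets_match` and `filter_by_score` and the 0.8 threshold are hypothetical and chosen only for illustration; they are not part of SpanMarker.

```python
# Sentence and predictions copied from the example output in the diff above.
zh_description = "新加坡因其小贩中心提供海南鸡饭和叻沙等菜肴而闻名, 而马来西亚则拥有椰浆饭和仁当等菜肴，反映了其丰富的烹饪传统."
entities = [
    {"span": "新加坡", "label": "LOC", "score": 0.9282007813453674,
     "char_start_index": 0, "char_end_index": 3},
    {"span": "马来西亚", "label": "LOC", "score": 0.7439665794372559,
     "char_start_index": 27, "char_end_index": 31},
]


def offsets_match(text, entity):
    """Return True if the entity's character offsets slice back to its span."""
    start, end = entity["char_start_index"], entity["char_end_index"]
    return text[start:end] == entity["span"]


def filter_by_score(predictions, min_score):
    """Hypothetical helper: keep predictions scoring at least min_score."""
    return [p for p in predictions if p["score"] >= min_score]


# Every predicted span should be recoverable from the input text.
all_aligned = all(offsets_match(zh_description, e) for e in entities)

# 0.8 is an arbitrary illustrative threshold, not a recommended value.
confident = filter_by_score(entities, min_score=0.8)

print(all_aligned)
print([e["span"] for e in confident])
```

With the scores reported above, both offsets line up with the text, and only the Singapore prediction (about 0.93) clears the 0.8 threshold, while the Malaysia prediction (about 0.74) is dropped.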