tomaarsen HF staff committed on
Commit
1d7cba5
1 Parent(s): 95f254e

Add limitation due to RoBERTa

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -82,4 +82,18 @@ model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-xlm-roberta-base-
  entities = model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris .")
  ```
 
+ ### Limitations
+
+ **Warning**: This model works best when punctuation is separated from the prior words, so
+ ```python
+ # ✅
+ model.predict("He plays J. Robert Oppenheimer , an American theoretical physicist .")
+ # ❌
+ model.predict("He plays J. Robert Oppenheimer, an American theoretical physicist.")
+
+ # You can also supply a list of words directly: ✅
+ model.predict(["He", "plays", "J.", "Robert", "Oppenheimer", ",", "an", "American", "theoretical", "physicist", "."])
+ ```
+ The same may be beneficial for some languages, such as splitting `"l'ocean Atlantique"` into `"l' ocean Atlantique"`.
+
  See the [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) repository for documentation and additional information on this library.
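
Since the added README text says `model.predict` also accepts a list of words, one way to work around the limitation is to split punctuation off before calling the model. The following is only a rough sketch: `split_punctuation` is a hypothetical helper, not part of SpanMarker, and its regex is a simple heuristic that keeps single-letter initials such as `J.` intact while separating all other punctuation.

```python
import re
from typing import List

# Hypothetical helper for illustration; not part of the SpanMarker library.
# Alternatives are tried in order:
#   1. a single capital letter followed by a period (an initial such as "J.")
#   2. a word, optionally containing internal hyphens or apostrophes
#   3. any other single non-space, non-word character (punctuation)
_TOKEN_PATTERN = re.compile(r"[A-Z]\.|\w+(?:[-']\w+)*|[^\w\s]")


def split_punctuation(text: str) -> List[str]:
    """Split text into words, with punctuation as separate tokens."""
    return _TOKEN_PATTERN.findall(text)


words = split_punctuation(
    "He plays J. Robert Oppenheimer, an American theoretical physicist."
)
print(words)
# The resulting list could then be passed on, e.g. model.predict(words),
# with `model` loaded as shown in the README.
```

This heuristic does not perform the `l'` style split mentioned for other languages, and it will also treat a sentence-final single capital letter (such as "plan B.") as an initial, so a proper tokenizer may be a better fit for real data.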