asahi417 committed on
Commit
5bf4a5c
1 Parent(s): 8f8e522

model update

Files changed (1)
  1. README.md +24 -4
README.md CHANGED
@@ -73,7 +73,7 @@ model-index:
73
 
74
  pipeline_tag: token-classification
75
  widget:
76
- - text: "Get the all-analog Classic Vinyl Edition of `Takin' Off` Album from {{@Herbie Hancock@}} via {{USERNAME}} link below: {{URL}}"
77
  example_title: "NER Example 1"
78
  ---
79
  # tner/bert-base-tweetner7-2021
@@ -112,15 +112,34 @@ Full evaluation can be found at [metric file of NER](https://huggingface.co/tner
112
  and [metric file of entity span](https://huggingface.co/tner/bert-base-tweetner7-2021/raw/main/eval/metric_span.json).
113
 
114
  ### Usage
115
- This model can be used through the [tner library](https://github.com/asahi417/tner). Install the library via pip
116
  ```shell
117
  pip install tner
118
  ```
119
- and activate model as below.
 
 
120
  ```python
 
 
121
  from tner import TransformersNER
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
122
  model = TransformersNER("tner/bert-base-tweetner7-2021")
123
- model.predict(["Jacob Collier is a Grammy awarded English artist from London"])
124
  ```
125
  It can be used via the transformers library, but this is not recommended because the CRF layer is not supported at the moment.
126
 
@@ -166,3 +185,4 @@ If you use any resource from T-NER, please consider to cite our [paper](https://
166
  }
167
 
168
  ```
 
 
73
 
74
  pipeline_tag: token-classification
75
  widget:
76
+ - text: "Get the all-analog Classic Vinyl Edition of `Takin' Off` Album from {@Herbie Hancock@} via {@bluenoterecords@} link below: {{URL}}"
77
  example_title: "NER Example 1"
78
  ---
79
  # tner/bert-base-tweetner7-2021
 
112
  and [metric file of entity span](https://huggingface.co/tner/bert-base-tweetner7-2021/raw/main/eval/metric_span.json).
113
 
114
  ### Usage
115
+ This model can be used through the [tner library](https://github.com/asahi417/tner). Install the library via pip.
116
  ```shell
117
  pip install tner
118
  ```
119
+ The [TweetNER7](https://huggingface.co/datasets/tner/tweetner7) dataset contains pre-processed tweets in which account names and URLs are
120
+ converted into special formats (see the dataset page for more detail), so we format input tweets the same way before running the model prediction, as below.
121
+
122
  ```python
123
+ import re
124
+ from urlextract import URLExtract
125
  from tner import TransformersNER
126
+
127
+ extractor = URLExtract()
128
+
129
+ def format_tweet(tweet):
130
+     # mask web urls
131
+     urls = extractor.find_urls(tweet)
132
+     for url in urls:
133
+         tweet = tweet.replace(url, "{{URL}}")
134
+     # format twitter account
135
+     tweet = re.sub(r"\b(\s*)(@[\S]+)\b", r'\1{\2@}', tweet)
136
+     return tweet
137
+
138
+
139
+ text = "Get the all-analog Classic Vinyl Edition of `Takin' Off` Album from @herbiehancock via @bluenoterecords link below: http://bluenote.lnk.to/AlbumOfTheWeek"
140
+ text_format = format_tweet(text)
141
  model = TransformersNER("tner/bert-base-tweetner7-2021")
142
+ model.predict([text_format])
143
  ```
144
  It can be used via the transformers library, but this is not recommended because the CRF layer is not supported at the moment.
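
To see what the tweet formatting step produces without installing anything, here is a minimal standard-library sketch. It is an illustration under assumptions, not the README's method: the name `format_tweet_stdlib` and the simplified `URL_PATTERN` are hypothetical stand-ins for `urlextract`, which recognizes far more URL forms. The account-name regex is the one from the README.

```python
import re

# Hypothetical, simplified URL pattern (an assumption): it only catches
# http:// and https:// links, unlike urlextract.
URL_PATTERN = re.compile(r"https?://\S+")


def format_tweet_stdlib(tweet):
    """Mask URLs and wrap account names in the {@...@} format."""
    # mask web urls
    tweet = URL_PATTERN.sub("{{URL}}", tweet)
    # format twitter account (same regex as in the README)
    tweet = re.sub(r"\b(\s*)(@[\S]+)\b", r"\1{\2@}", tweet)
    return tweet


text = (
    "Get the all-analog Classic Vinyl Edition of `Takin' Off` Album from "
    "@herbiehancock via @bluenoterecords link below: "
    "http://bluenote.lnk.to/AlbumOfTheWeek"
)
formatted = format_tweet_stdlib(text)
print(formatted)
```

The formatted string wraps each account name as `{@name@}` and replaces the link with `{{URL}}`. Because the account regex begins with `\b`, a mention at the very start of a tweet is left unchanged, so inputs of that shape may need extra handling.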
145
 
 
185
  }
186
 
187
  ```
188
+