cardiffnlp/twitter-roberta-base

luisespinosa committed on Nov 13, 2020

Commit 7456b1b • 1 Parent(s): bdba8b7

Update README.md

Files changed (1)

README.md +48 -0

@@ -60,6 +60,54 @@ I am so <mask> 😢
 5)  hungry 0.0232
 ```
+## Example Tweet Embeddings
+```python
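+from collections import defaultdict
+import numpy as np
+from scipy.spatial.distance import cosine
+from transformers import AutoTokenizer, AutoModel
+# MODEL must be defined before the tokenizer and model are loaded.
+MODEL = "cardiffnlp/twitter-roberta-base"
+tokenizer = AutoTokenizer.from_pretrained(MODEL)
+model = AutoModel.from_pretrained(MODEL)
+# NOTE: preprocess() is not defined in this snippet; it is assumed to be the
+# helper defined earlier in the README.
+def get_embedding(text):
+  text = preprocess(text)
+  encoded_input = tokenizer(text, return_tensors='pt')
+  features = model(**encoded_input)
+  # features[0] is the last hidden state, shape (1, num_tokens, hidden_size).
+  features = features[0].detach().cpu().numpy()
+  # Average over the token axis to get one vector per tweet.
+  features_mean = np.mean(features[0], axis=0)
+  return features_mean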
+query = "The book was awesome"
+tweets = ["I just ordered fried chicken 🐣",
+          "The movie was great",
+          "What time is the next game?",
+          "Just finished reading 'Embeddings in NLP'"]
+d = defaultdict(int)
+for tweet in tweets:
+  sim = 1-cosine(get_embedding(query),get_embedding(tweet))
+  d[tweet] = sim
+print('Most similar to: ',query)
+print('----------------------------------------')
+for idx,x in enumerate(sorted(d.items(), key=lambda x:x[1], reverse=True)):
+  print(idx+1,x[0])
+```
+Output:
+```
+Most similar to:  The book was awesome
+----------------------------------------
+1 The movie was great
+2 Just finished reading 'Embeddings in NLP'
+3 I just ordered fried chicken 🐣
+4 What time is the next game?
+```
 ## Example Feature Extraction
 ```python