danlou committed on
Commit fed478f • 1 Parent(s): 05a58c8

Update README.md

  1. README.md +22 -22
README.md CHANGED
@@ -38,7 +38,7 @@ def preprocess(text):
  ```python
  from transformers import pipeline, AutoTokenizer

- MODEL = "cardiffnlp/twitter-roberta-base-2022-154m"
  fill_mask = pipeline("fill-mask", model=MODEL, tokenizer=MODEL)
  tokenizer = AutoTokenizer.from_pretrained(MODEL)
 
@@ -65,25 +65,25 @@ Output:
  ```
  ------------------------------
  So glad I'm <mask> vaccinated.
- 1) 0.26251 not
- 2) 0.25460 a
- 3) 0.12611 in
- 4) 0.11036 the
- 5) 0.04210 getting
  ------------------------------
  I keep forgetting to bring a <mask>.
- 1) 0.09274 charger
- 2) 0.04727 lighter
- 3) 0.04469 mask
- 4) 0.04395 drink
- 5) 0.03644 camera
  ------------------------------
  Looking forward to watching <mask> Game tonight!
- 1) 0.57683 Squid
- 2) 0.17419 The
- 3) 0.04198 the
- 4) 0.00970 Spring
- 5) 0.00921 Big
  ```

  ## Example Tweet Embeddings
@@ -101,7 +101,7 @@ def get_embedding(text): # naive approach for demonstration
      return np.mean(features[0], axis=0)


- MODEL = "cardiffnlp/twitter-roberta-base-2022-154m"
  tokenizer = AutoTokenizer.from_pretrained(MODEL)
  model = AutoModel.from_pretrained(MODEL)
 
@@ -126,10 +126,10 @@ Output:
  ```
  Most similar to: The book was awesome
  ------------------------------
- 1) 0.99403 The movie was great
- 2) 0.98006 Just finished reading 'Embeddings in NLP'
- 3) 0.97314 What time is the next game?
- 4) 0.92448 I just ordered fried chicken 🐣
  ```

  ## Example Feature Extraction
@@ -138,7 +138,7 @@ Most similar to: The book was awesome
  from transformers import AutoTokenizer, AutoModel, TFAutoModel
  import numpy as np

- MODEL = "cardiffnlp/twitter-roberta-base-2022-154m"
  tokenizer = AutoTokenizer.from_pretrained(MODEL)

  text = "Good night 😊"
 
  ```python
  from transformers import pipeline, AutoTokenizer

+ MODEL = "cardiffnlp/twitter-roberta-large-2022-154m"
  fill_mask = pipeline("fill-mask", model=MODEL, tokenizer=MODEL)
  tokenizer = AutoTokenizer.from_pretrained(MODEL)
 
 
  ```
  ------------------------------
  So glad I'm <mask> vaccinated.
+ 1) 0.37136 fully
+ 2) 0.20631 a
+ 3) 0.09422 the
+ 4) 0.07649 not
+ 5) 0.04505 already
  ------------------------------
  I keep forgetting to bring a <mask>.
+ 1) 0.10507 mask
+ 2) 0.05810 pen
+ 3) 0.05142 charger
+ 4) 0.04082 tissue
+ 5) 0.03955 lighter
  ------------------------------
  Looking forward to watching <mask> Game tonight!
+ 1) 0.45783 The
+ 2) 0.32842 the
+ 3) 0.02705 Squid
+ 4) 0.01157 Big
+ 5) 0.00538 Match
  ```

  ## Example Tweet Embeddings
 
      return np.mean(features[0], axis=0)


+ MODEL = "cardiffnlp/twitter-roberta-large-2022-154m"
  tokenizer = AutoTokenizer.from_pretrained(MODEL)
  model = AutoModel.from_pretrained(MODEL)
 
 
  ```
  Most similar to: The book was awesome
  ------------------------------
+ 1) 0.99820 The movie was great
+ 2) 0.99306 Just finished reading 'Embeddings in NLP'
+ 3) 0.99257 What time is the next game?
+ 4) 0.98561 I just ordered fried chicken 🐣
  ```

  ## Example Feature Extraction
 
  from transformers import AutoTokenizer, AutoModel, TFAutoModel
  import numpy as np

+ MODEL = "cardiffnlp/twitter-roberta-large-2022-154m"
  tokenizer = AutoTokenizer.from_pretrained(MODEL)

  text = "Good night 😊"
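
Note on the "Example Tweet Embeddings" hunk: the similarity scores listed there sit very close together in both the old and the new output, so the ordering is easier to read than the raw values. As a rough illustration of what the `np.mean(features[0], axis=0)` line and the ranking amount to, here is a minimal sketch in plain Python. It assumes the scores are cosine similarities over mean-pooled token vectors, which the diff itself does not show. The helper names (`mean_pool`, `cosine`, `rank_by_similarity`) and the toy vectors are made up for this note and are not taken from the README.

```python
import math


def mean_pool(token_vectors):
    """Average per-token vectors into one vector (the naive pooling step)."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_vec, candidates):
    """Return (score, label) pairs sorted from most to least similar."""
    scored = [(cosine(query_vec, mean_pool(tokens)), label)
              for label, tokens in candidates.items()]
    return sorted(scored, reverse=True)


# Toy 2-dimensional "token features"; real model outputs are much wider.
query = mean_pool([[1.0, 0.0], [0.0, 1.0]])
candidates = {
    "near": [[1.0, 1.0], [3.0, 3.0]],  # pooled vector points the same way as the query
    "far": [[1.0, 0.0], [1.0, 0.0]],   # pooled vector lies along a single axis
}
ranked = rank_by_similarity(query, candidates)

for position, (score, label) in enumerate(ranked, start=1):
    print(f"{position}) {score:.5f} {label}")
```

With a real model, the token vectors would come from the model output for each tweet rather than from hand-written lists.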