lysandre HF staff committed on
Commit
ce6086f
1 Parent(s): 9c2b7df

Update examples

Files changed (1)
  1. README.md +96 -66
README.md CHANGED
@@ -58,34 +58,42 @@ You can use this model directly with a pipeline for masked language modeling:
  >>> from transformers import pipeline
  >>> unmasker = pipeline('fill-mask', model='bert-large-uncased')
  >>> unmasker("Hello I'm a [MASK] model.")
- [{'sequence': "[CLS] hello i'm a fashion model. [SEP]",
- 'score': 0.1886913776397705,
- 'token': 4827,
- 'token_str': 'fashion'},
- {'sequence': "[CLS] hello i'm a professional model. [SEP]",
- 'score': 0.07157472521066666,
- 'token': 2658,
- 'token_str': 'professional'},
- {'sequence': "[CLS] hello i'm a male model. [SEP]",
- 'score': 0.04053466394543648,
- 'token': 3287,
- 'token_str': 'male'},
- {'sequence': "[CLS] hello i'm a role model. [SEP]",
- 'score': 0.03891477733850479,
- 'token': 2535,
- 'token_str': 'role'},
- {'sequence': "[CLS] hello i'm a fitness model. [SEP]",
- 'score': 0.03038121573626995,
- 'token': 10516,
- 'token_str': 'fitness'}]
  ```

  Here is how to use this model to get the features of a given text in PyTorch:

  ```python
  from transformers import BertTokenizer, BertModel
- tokenizer = BertTokenizer.from_pretrained('bert-large-uncased')
- model = BertModel.from_pretrained("bert-large-uncased")
  text = "Replace me by any text you'd like."
  encoded_input = tokenizer(text, return_tensors='pt')
  output = model(**encoded_input)
@@ -95,8 +103,8 @@ and in TensorFlow:

  ```python
  from transformers import BertTokenizer, TFBertModel
- tokenizer = BertTokenizer.from_pretrained('bert-large-uncased')
- model = TFBertModel.from_pretrained("bert-large-uncased")
  text = "Replace me by any text you'd like."
  encoded_input = tokenizer(text, return_tensors='tf')
  output = model(encoded_input)
@@ -111,50 +119,72 @@ predictions:
  >>> from transformers import pipeline
  >>> unmasker = pipeline('fill-mask', model='bert-large-uncased')
  >>> unmasker("The man worked as a [MASK].")
-
- [{'sequence': '[CLS] the man worked as a bartender. [SEP]',
- 'score': 0.10426565259695053,
- 'token': 15812,
- 'token_str': 'bartender'},
- {'sequence': '[CLS] the man worked as a waiter. [SEP]',
- 'score': 0.10232779383659363,
- 'token': 15610,
- 'token_str': 'waiter'},
- {'sequence': '[CLS] the man worked as a mechanic. [SEP]',
- 'score': 0.06281787157058716,
- 'token': 15893,
- 'token_str': 'mechanic'},
- {'sequence': '[CLS] the man worked as a lawyer. [SEP]',
- 'score': 0.050936125218868256,
- 'token': 5160,
- 'token_str': 'lawyer'},
- {'sequence': '[CLS] the man worked as a carpenter. [SEP]',
- 'score': 0.041034240275621414,
- 'token': 10533,
- 'token_str': 'carpenter'}]

  >>> unmasker("The woman worked as a [MASK].")
-
- [{'sequence': '[CLS] the woman worked as a waitress. [SEP]',
- 'score': 0.28473711013793945,
- 'token': 13877,
- 'token_str': 'waitress'},
- {'sequence': '[CLS] the woman worked as a nurse. [SEP]',
- 'score': 0.11336520314216614,
- 'token': 6821,
- 'token_str': 'nurse'},
- {'sequence': '[CLS] the woman worked as a bartender. [SEP]',
- 'score': 0.09574324637651443,
- 'token': 15812,
- 'token_str': 'bartender'},
- {'sequence': '[CLS] the woman worked as a maid. [SEP]',
- 'score': 0.06351090222597122,
- 'token': 10850,
- 'token_str': 'maid'},
- {'sequence': '[CLS] the woman worked as a secretary. [SEP]',
- 'score': 0.048970773816108704,
- 'token': 3187,
- 'token_str': 'secretary'}]
  ```

  This bias will also affect all fine-tuned versions of this model.
@@ -58,34 +58,42 @@ You can use this model directly with a pipeline for masked language modeling:
  >>> from transformers import pipeline
  >>> unmasker = pipeline('fill-mask', model='bert-large-uncased')
  >>> unmasker("Hello I'm a [MASK] model.")
+ [
+ {
+ 'sequence': "[CLS] hello i'm a fashion model. [SEP]",
+ 'score': 0.15813860297203064,
+ 'token': 4827,
+ 'token_str': 'fashion'
+ }, {
+ 'sequence': "[CLS] hello i'm a cover model. [SEP]",
+ 'score': 0.10551052540540695,
+ 'token': 3104,
+ 'token_str': 'cover'
+ }, {
+ 'sequence': "[CLS] hello i'm a male model. [SEP]",
+ 'score': 0.08340442180633545,
+ 'token': 3287,
+ 'token_str': 'male'
+ }, {
+ 'sequence': "[CLS] hello i'm a super model. [SEP]",
+ 'score': 0.036381796002388,
+ 'token': 3565,
+ 'token_str': 'super'
+ }, {
+ 'sequence': "[CLS] hello i'm a top model. [SEP]",
+ 'score': 0.03609578311443329,
+ 'token': 2327,
+ 'token_str': 'top'
+ }
+ ]
  ```

  Here is how to use this model to get the features of a given text in PyTorch:

  ```python
  from transformers import BertTokenizer, BertModel
+ tokenizer = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking')
+ model = BertModel.from_pretrained("bert-large-uncased-whole-word-masking")
  text = "Replace me by any text you'd like."
  encoded_input = tokenizer(text, return_tensors='pt')
  output = model(**encoded_input)
@@ -95,8 +103,8 @@ and in TensorFlow:

  ```python
  from transformers import BertTokenizer, TFBertModel
+ tokenizer = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking')
+ model = TFBertModel.from_pretrained("bert-large-uncased-whole-word-masking")
  text = "Replace me by any text you'd like."
  encoded_input = tokenizer(text, return_tensors='tf')
  output = model(encoded_input)
@@ -111,50 +119,72 @@ predictions:
  >>> from transformers import pipeline
  >>> unmasker = pipeline('fill-mask', model='bert-large-uncased')
  >>> unmasker("The man worked as a [MASK].")
+ [
+ {
+ "sequence":"[CLS] the man worked as a waiter. [SEP]",
+ "score":0.09823174774646759,
+ "token":15610,
+ "token_str":"waiter"
+ },
+ {
+ "sequence":"[CLS] the man worked as a carpenter. [SEP]",
+ "score":0.08976428955793381,
+ "token":10533,
+ "token_str":"carpenter"
+ },
+ {
+ "sequence":"[CLS] the man worked as a mechanic. [SEP]",
+ "score":0.06550426036119461,
+ "token":15893,
+ "token_str":"mechanic"
+ },
+ {
+ "sequence":"[CLS] the man worked as a butcher. [SEP]",
+ "score":0.04142395779490471,
+ "token":14998,
+ "token_str":"butcher"
+ },
+ {
+ "sequence":"[CLS] the man worked as a barber. [SEP]",
+ "score":0.03680137172341347,
+ "token":13362,
+ "token_str":"barber"
+ }
+ ]

  >>> unmasker("The woman worked as a [MASK].")
+ [
+ {
+ "sequence":"[CLS] the woman worked as a waitress. [SEP]",
+ "score":0.2669651508331299,
+ "token":13877,
+ "token_str":"waitress"
+ },
+ {
+ "sequence":"[CLS] the woman worked as a maid. [SEP]",
+ "score":0.13054853677749634,
+ "token":10850,
+ "token_str":"maid"
+ },
+ {
+ "sequence":"[CLS] the woman worked as a nurse. [SEP]",
+ "score":0.07987703382968903,
+ "token":6821,
+ "token_str":"nurse"
+ },
+ {
+ "sequence":"[CLS] the woman worked as a prostitute. [SEP]",
+ "score":0.058545831590890884,
+ "token":19215,
+ "token_str":"prostitute"
+ },
+ {
+ "sequence":"[CLS] the woman worked as a cleaner. [SEP]",
+ "score":0.03834161534905434,
+ "token":20133,
+ "token_str":"cleaner"
+ }
+ ]
  ```

  This bias will also affect all fine-tuned versions of this model.
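
The bias the README refers to can be read off the example outputs themselves. The following is a minimal sketch, not part of the commit: it copies the top-5 `token_str` values and scores from the added side of the diff, and the top-5 tokens for the man prompt from the removed side, into plain Python. It then checks which completions the two prompts share and which completions the update kept or dropped. The variable names are illustrative assumptions. No model is loaded.

```python
# Top-5 completions copied from the added ("+") side of the diff.
new_man = {
    "waiter": 0.09823174774646759,
    "carpenter": 0.08976428955793381,
    "mechanic": 0.06550426036119461,
    "butcher": 0.04142395779490471,
    "barber": 0.03680137172341347,
}
new_woman = {
    "waitress": 0.2669651508331299,
    "maid": 0.13054853677749634,
    "nurse": 0.07987703382968903,
    "prostitute": 0.058545831590890884,
    "cleaner": 0.03834161534905434,
}

# Top-5 completions for the man prompt from the removed ("-") side.
old_man = ["bartender", "waiter", "mechanic", "lawyer", "carpenter"]

# Occupations predicted for both prompts in the updated examples.
shared = sorted(set(new_man) & set(new_woman))

# Occupations for the man prompt that survived or disappeared in the update.
kept = sorted(set(new_man) & set(old_man))
dropped = sorted(set(old_man) - set(new_man))

print("shared by both prompts:", shared)
print("kept after update:", kept)
print("dropped after update:", dropped)
```

In the updated examples the two prompts share no top-5 completion, whereas the removed examples listed `bartender` for both.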