JonasGeiping committed · Commit 440a8a9 · Parent(s): 53976ce

Update README.md

README.md CHANGED
````diff
@@ -42,8 +42,16 @@ model = AutoModelForMaskedLM.from_pretrained("JonasGeiping/test-crammedBERT-c5"
 
 text = "Replace me by any text you'd like."
 encoded_input = tokenizer(text, return_tensors='pt')
-
+
+# The c5 variant can only run on CUDA with AMP autocasting.
+model.cuda()
+cuda_input = {k: i.cuda() for k, i in encoded_input.items()}
+
+with torch.autocast("cuda"):
+    output = model(**cuda_input)
+
 ```
+If you want to use the `c5` model (which includes flash-attention) on `cpu`, load the config with `config.arch["attention"]["type"] = "pytorch"` instead and convert all missing weights.
 
 
 ### Limitations and bias
````
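The CPU fallback mentioned in the added README line could be sketched as follows. This is an assumption-laden sketch, not the repository's documented procedure: it assumes the crammedBERT remote code exposes its architecture as a nested `arch` dict on the config (as the line `config.arch["attention"]["type"] = "pytorch"` implies) and that `trust_remote_code=True` is required to load this custom model class.

```python
from transformers import AutoConfig, AutoModelForMaskedLM

# Load the config first, then swap the flash-attention implementation
# for the plain PyTorch attention path so the model can run on CPU.
# (Assumes the remote crammedBERT config stores this under config.arch.)
config = AutoConfig.from_pretrained(
    "JonasGeiping/test-crammedBERT-c5", trust_remote_code=True
)
config.arch["attention"]["type"] = "pytorch"

# Re-load the weights against the modified architecture; per the README,
# any weights missing under the new attention type must still be converted.
model = AutoModelForMaskedLM.from_pretrained(
    "JonasGeiping/test-crammedBERT-c5", config=config, trust_remote_code=True
)
```

With this config, inference no longer needs `model.cuda()` or the `torch.autocast("cuda")` context shown in the diff above.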