madhavatreplit committed
Commit 1e1a20a
1 Parent(s): b6d9ff2

Update README.md for flash attn

Files changed (1):
  1. README.md +8 -2
README.md CHANGED
@@ -105,10 +105,16 @@ triton==2.0.0.dev20221202
 
 Then, move the model to `bfloat16` and use it as follows:
 ```python
-from transformers import AutoModelForCausalLM
+from transformers import AutoModelForCausalLM, AutoConfig
+
+config = AutoConfig.from_pretrained(
+    "replit/replit-code-v1-3b",
+    trust_remote_code=True
+)
+config.attn_config['attn_impl'] = 'triton'
 
 # load model
-model = AutoModelForCausalLM.from_pretrained('replit/replit-code-v1-3b', trust_remote_code=True, attn_impl='triton')
+model = AutoModelForCausalLM.from_pretrained('replit/replit-code-v1-3b', config=config, trust_remote_code=True)
 model.to(device='cuda:0', dtype=torch.bfloat16)
 
 # forward pass