rockerBOO committed
Commit 61e0623 (1 parent: 56bbd74)

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -16,7 +16,7 @@ datasets:
 
 # StableLM-Tuned-Alpha 3B 8Bit
 
-3B model converted to int8 by rockerBOO. May require `bitsandbytes` dependency and using `load_in_8bit=True`.
+3B model converted to int8 by rockerBOO. May require `bitsandbytes` dependency. Tested on a 2080 8GB.
 
 ## Model Description
 
@@ -29,9 +29,9 @@ Get started chatting with `StableLM-Tuned-Alpha` by using the following code sni
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria, StoppingCriteriaList
 
-tokenizer = AutoTokenizer.from_pretrained("StabilityAI/stablelm-tuned-alpha-7b")
-model = AutoModelForCausalLM.from_pretrained("StabilityAI/stablelm-tuned-alpha-7b")
-model.half().cuda()
+tokenizer = AutoTokenizer.from_pretrained("StabilityAI/stablelm-tuned-alpha-3b")
+model = AutoModelForCausalLM.from_pretrained("rockerBOO/stablelm-tuned-alpha-3b-8bit")
+model.cuda()
 
 class StopOnTokens(StoppingCriteria):
     def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
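The diff above drops `model.half().cuda()` and the `load_in_8bit=True` note from the card text. A minimal sketch of how an int8 checkpoint like this is typically loaded with `transformers` and `bitsandbytes` follows; the `device_map="auto"` placement and the `load_in_8bit=True` flag are assumptions based on the card's `bitsandbytes` note, not something this commit specifies:

```python
def load_stablelm_8bit():
    # Sketch only: assumes `transformers`, `accelerate`, and `bitsandbytes`
    # are installed and a CUDA GPU is available (the card mentions a 2080 8GB).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Tokenizer from the original StabilityAI repo; weights from the int8 conversion.
    tokenizer = AutoTokenizer.from_pretrained("StabilityAI/stablelm-tuned-alpha-3b")
    model = AutoModelForCausalLM.from_pretrained(
        "rockerBOO/stablelm-tuned-alpha-3b-8bit",
        device_map="auto",   # let accelerate place layers on the available GPU(s)
        load_in_8bit=True,   # int8 weights via bitsandbytes
    )
    return tokenizer, model
```

With int8 weights the explicit `.half()` cast is unnecessary, which is consistent with the commit replacing `model.half().cuda()` with `model.cuda()`.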