Update usage
README.md
CHANGED
@@ -60,12 +60,14 @@ Quantization parameters are controlled from the `BitsandbytesConfig`
 quantization datatypes `fp4` (four bit float) and `nf4` (normal four bit float). The latter is theoretically optimal
 for normally distributed weights and we recommend using `nf4`.

+
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

+pretrained_model_name_or_path = "bavest/fin-llama-33b-merge"
 model = AutoModelForCausalLM.from_pretrained(
-    pretrained_model_name_or_path=
+    pretrained_model_name_or_path=pretrained_model_name_or_path,
     load_in_4bit=True,
     device_map='auto',
     torch_dtype=torch.bfloat16,
@@ -77,7 +79,7 @@ model = AutoModelForCausalLM.from_pretrained(
     ),
 )

-tokenizer = AutoTokenizer.from_pretrained(
+tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path)

 question = "What is the market cap of apple?"
 input = "" # context if needed
@@ -95,6 +97,7 @@ with torch.no_grad():
         do_sample=True,
         top_p=0.9,
         temperature=0.8,
+        max_length=128
     )

 generated_text = tokenizer.decode(
@@ -102,6 +105,7 @@ generated_text = tokenizer.decode(
 )
 ```

+
 ## Dataset for FIN-LLAMA

 The dataset is released under bigscience-openrail-m.
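The body of the `BitsandbytesConfig` is elided between the first two hunks above, so the exact quantization parameters of this README are not shown here. As a point of reference only, a minimal sketch of a 4-bit NF4 setup with the Hugging Face `transformers` API (assumed, commonly used values, not the repository's exact configuration) could look like this:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Sketch only: the README's actual quantization parameters are elided in the
# diff above. These are commonly used settings for 4-bit NF4 loading.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # load weights in 4-bit precision
    bnb_4bit_quant_type="nf4",              # normal four-bit float, as recommended above
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for compute at runtime
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
)

model = AutoModelForCausalLM.from_pretrained(
    "bavest/fin-llama-33b-merge",
    device_map="auto",
    quantization_config=bnb_config,
)
```

The `nf4` type is preferred over `fp4` here because, as the README text notes, it is designed for normally distributed weights.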