akdeniz27 committed
Commit fda753b
1 Parent(s): d01fc17

Update README.md

Files changed (1)
  1. README.md +30 -2
README.md CHANGED
@@ -8,7 +8,6 @@ pipeline_tag: text-generation
  ---
  ## Training procedure
  
-
  The following `bitsandbytes` quantization config was used during training:
  - load_in_8bit: False
  - load_in_4bit: True
@@ -21,5 +20,34 @@ The following `bitsandbytes` quantization config was used during training:
  - bnb_4bit_compute_dtype: bfloat16
  ### Framework versions
  
  
- - PEFT 0.4.0
+ - PEFT 0.4.0
+
+ ## How to use
+ ```python
+ !pip install transformers peft accelerate bitsandbytes trl safetensors
+
+ from huggingface_hub import notebook_login
+ notebook_login()
+
+ import torch
+ from peft import AutoPeftModelForCausalLM, PeftConfig
+ from transformers import AutoTokenizer
+
+ peft_model_id = "akdeniz27/llama-2-7b-hf-qlora-dolly15k-turkish"
+ config = PeftConfig.from_pretrained(peft_model_id)
+ # load the 4-bit quantized base model together with the QLoRA adapter
+ model = AutoPeftModelForCausalLM.from_pretrained(
+     peft_model_id,
+     low_cpu_mem_usage=True,
+     torch_dtype=torch.float16,
+     load_in_4bit=True,
+ )
+ tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
+
+ prompt = "..."
+
+ input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.cuda()
+
+ outputs = model.generate(input_ids=input_ids, max_new_tokens=100, do_sample=True, top_p=0.9, temperature=0.9)
+ ```
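The committed example loads the adapter through `AutoPeftModelForCausalLM` and stops at `generate` without decoding the output. As a complement, here is a minimal sketch of the equivalent explicit loading path, assuming the `BitsAndBytesConfig` and `PeftModel` APIs from `transformers` and `peft`; the quantization fields not visible in the diff are left at their library defaults, and `device_map="auto"` plus the final decode step are assumptions, not part of the commit:

```python
import torch
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

peft_model_id = "akdeniz27/llama-2-7b-hf-qlora-dolly15k-turkish"
config = PeftConfig.from_pretrained(peft_model_id)

# the quantization settings listed in the card, expressed as a BitsAndBytesConfig
# (fields not shown in the diff keep their library defaults)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# load the quantized base model, then attach the QLoRA adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=bnb_config,
    device_map="auto",  # assumption: dispatch layers to the available GPU(s)
)
model = PeftModel.from_pretrained(base_model, peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

prompt = "..."
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.cuda()
outputs = model.generate(input_ids=input_ids, max_new_tokens=100, do_sample=True, top_p=0.9, temperature=0.9)

# decode the generated ids back into text (the committed example stops at generate)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Building the `BitsAndBytesConfig` by hand makes the 4-bit settings from the card explicit instead of relying on `load_in_4bit=True` alone.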