francislabounty committed on
Commit b33e2c4 · 1 Parent(s): 51704fc

Update README.md

  1. README.md +54 -0
README.md CHANGED
@@ -1,3 +1,57 @@
  ---
  license: mit
  ---
+ LoRA weights only, trained for research purposes. Nothing from the foundation model is included.
+ Trained on Anthropic's HH dataset, available at https://huggingface.co/datasets/Anthropic/hh-rlhf
+
+ Sample usage
+ ```
+ import torch
+ import os
+ import transformers
+ from peft import PeftModel
+ from transformers import LlamaTokenizer, LlamaForCausalLM
+
+ model_path = "decapoda-research/llama-7b-hf"
+ peft_path = 'serpdotai/llama-hh-lora-7B'
+ tokenizer_path = 'decapoda-research/llama-7b-hf'
+
+ # Load the base model in 8-bit, then attach the LoRA adapter weights on top of it.
+ model = LlamaForCausalLM.from_pretrained(model_path, load_in_8bit=True, device_map="auto") # or something like {"": 0}
+ model = PeftModel.from_pretrained(model, peft_path, torch_dtype=torch.float16, device_map="auto") # or something like {"": 0}
+ tokenizer = LlamaTokenizer.from_pretrained(tokenizer_path)
+
+ # The prompt follows the "User: ... Assistant:" turn format.
+ batch = tokenizer("\n\nUser: Are you sentient?\n\nAssistant:", return_tensors="pt")
+
+ # Sample a continuation without tracking gradients.
+ with torch.no_grad():
+     out = model.generate(
+         input_ids=batch["input_ids"].cuda(),
+         attention_mask=batch["attention_mask"].cuda(),
+         max_length=100,
+         do_sample=True,
+         top_k=50,
+         top_p=1.0,
+         temperature=1.0,
+         use_cache=False
+     )
+ # The decoded text includes the prompt followed by the sampled continuation.
+ print(tokenizer.decode(out[0]))
+ ```
+
+ The model will continue the conversation between the user and itself. To use it as a chatbot, either alter the generate call to include stop sequences for 'User:' and 'Assistant:', or strip everything after the assistant's first response before returning.
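The second option above (stripping everything after the assistant's first response) can be sketched in plain Python. This is a minimal sketch, not part of the commit: the function name `extract_first_reply` is hypothetical, and it assumes the decoded text contains the prompt followed by the continuation, as `tokenizer.decode(out[0])` returns in the sample usage.

```python
def extract_first_reply(decoded, stop_sequences=("User:", "Assistant:")):
    """Return only the assistant's first reply from a decoded generation.

    Hypothetical helper: assumes the text uses the "User:" / "Assistant:"
    turn markers from the sample prompt.
    """
    marker = "Assistant:"
    start = decoded.find(marker)
    if start == -1:
        # No assistant turn found; return the text unchanged apart from whitespace.
        return decoded.strip()

    # Everything after the first "Assistant:" marker.
    reply = decoded[start + len(marker):]

    # Cut at the earliest stop sequence, i.e. where the model starts a new turn.
    cut = len(reply)
    for stop in stop_sequences:
        idx = reply.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return reply[:cut].strip()


example = (
    "\n\nUser: Are you sentient?"
    "\n\nAssistant: I am a language model."
    "\n\nUser: Really?"
    "\n\nAssistant: Yes."
)
print(extract_first_reply(example))  # I am a language model.
```

Post-processing like this still pays for generating the extra turns. Stopping generation early would avoid that cost, but it is not shown here.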
+
+
+ Trained for 2 epochs with a sequence length of 1024, a mini-batch size of 3 per GPU, and 5 gradient accumulation steps on 8 A6000s, for an effective batch size of 120.
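The effective batch size follows directly from the three numbers in that sentence. A short check of the arithmetic, with variable names chosen here for illustration only:

```python
# Values taken from the training description; names are illustrative.
per_device_batch = 3   # mini-batch size on each GPU
grad_accum_steps = 5   # micro-batches accumulated before each optimizer step
num_gpus = 8           # A6000s
seq_len = 1024         # maximum sequence length

effective_batch = per_device_batch * grad_accum_steps * num_gpus
# Upper bound only: sequences shorter than seq_len contribute fewer tokens.
max_tokens_per_update = effective_batch * seq_len

print(effective_batch)        # 120
print(max_tokens_per_update)  # 122880
```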
+
+ Training settings:
+ - lr: 2.0e-04
+ - lr_scheduler_type: linear
+ - warmup_ratio: 0.06
+ - weight_decay: 0.1
+ - optimizer: adamw_torch_fused
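To illustrate what `lr_scheduler_type: linear` with `warmup_ratio: 0.06` implies for the learning rate, here is a minimal sketch of a linear warmup followed by linear decay to zero. It is an assumption that the training run used exactly this shape. The function name `linear_lr` and the step count `total_steps=1000` are hypothetical, since the commit does not state the number of optimizer steps.

```python
import math


def linear_lr(step, total_steps, base_lr=2.0e-4, warmup_ratio=0.06):
    """Learning rate at a given optimizer step (illustrative sketch).

    Ramps linearly from 0 to base_lr over the warmup steps, then decays
    linearly back to 0 at total_steps.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = 1000  # hypothetical; not stated in the commit
for step in (0, 30, 500, 1000):
    print(step, linear_lr(step, total_steps))
```

The weight decay and optimizer settings are not modeled here. The sketch covers only the schedule.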
+
+ LoRA config:
+ - target_modules: ['q_proj', 'k_proj', 'v_proj', 'o_proj']
+ - r: 64
+ - lora_alpha: 32
+ - lora_dropout: 0.05
+ - bias: "none"
+ - task_type: "CAUSAL_LM"
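As a rough guide to what this configuration means in practice, the sketch below computes the standard LoRA scaling factor (`lora_alpha / r`) and the number of trainable adapter parameters. The hidden size of 4096 and the 32 decoder layers are assumptions about the LLaMA-7B architecture, not values from this commit. The `LoraSettings` class and the helper functions are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LoraSettings:
    # Values from the LoRA config in the README.
    r: int = 64
    lora_alpha: int = 32
    target_modules: tuple = ("q_proj", "k_proj", "v_proj", "o_proj")


# Assumed LLaMA-7B dimensions (not stated in the commit).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32


def lora_scaling(cfg):
    """Factor applied to the low-rank update in standard LoRA."""
    return cfg.lora_alpha / cfg.r


def lora_param_count(cfg, hidden_size, num_layers):
    """Trainable parameters added by the adapters.

    Each targeted projection is hidden_size x hidden_size, so its adapter
    holds an r x hidden_size matrix and a hidden_size x r matrix.
    """
    per_module = cfg.r * (hidden_size + hidden_size)
    return per_module * len(cfg.target_modules) * num_layers


cfg = LoraSettings()
print(lora_scaling(cfg))                                # 0.5
print(lora_param_count(cfg, HIDDEN_SIZE, NUM_LAYERS))   # 67108864
```

With `bias: "none"`, no bias terms are trained, so the count above covers all adapter parameters under these assumptions.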