---
license: mit
---

LoRA weights only, for research use - nothing from the foundation model is included.
Trained using Anthropic's HH dataset, which can be found here: https://huggingface.co/datasets/Anthropic/hh-rlhf

Sample usage:
```
import torch
from peft import PeftModel
from transformers import LlamaTokenizer, LlamaForCausalLM

model_path = "decapoda-research/llama-13b-hf"
peft_path = 'serpdotai/llama-hh-lora-13B'
tokenizer_path = 'decapoda-research/llama-13b-hf'

model = LlamaForCausalLM.from_pretrained(model_path, load_in_8bit=True, device_map="auto")  # or something like {"": 0}
model = PeftModel.from_pretrained(model, peft_path, torch_dtype=torch.float16, device_map="auto")  # or something like {"": 0}
tokenizer = LlamaTokenizer.from_pretrained(tokenizer_path)

batch = tokenizer("\n\nUser: Are you sentient?\n\nAssistant:", return_tensors="pt")

with torch.no_grad():
    out = model.generate(
        input_ids=batch["input_ids"].cuda(),
        attention_mask=batch["attention_mask"].cuda(),
        max_length=100,
        do_sample=True,
        top_k=50,
        top_p=1.0,
        temperature=1.0,
        use_cache=False,
    )
print(tokenizer.decode(out[0]))
```

The model will continue the conversation between the user and itself. If you want to use it as a chatbot, you can alter the generate call to include stop sequences for 'User:' and 'Assistant:', or strip off anything past the assistant's initial response before returning.

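The stripping approach can be done with plain string handling on the decoded output. A minimal sketch, assuming the prompt format used above; `first_assistant_reply` and the sample `decoded` string are illustrative, not part of this repo:

```python
def first_assistant_reply(decoded: str) -> str:
    """Keep only the assistant's first response from a decoded generation.

    Assumes turns are delimited by "\\n\\nUser:" and "\\n\\nAssistant:",
    matching the prompt format in the sample usage above.
    """
    # Take everything after the first "Assistant:" marker...
    reply = decoded.split("Assistant:", 1)[1]
    # ...and cut it off at the next turn marker, if any.
    for marker in ("\nUser:", "\nAssistant:"):
        reply = reply.split(marker, 1)[0]
    return reply.strip()

# Example: a generation that kept talking past the first reply.
decoded = "\n\nUser: Are you sentient?\n\nAssistant: I'm a language model.\n\nUser: Oh."
print(first_assistant_reply(decoded))  # -> I'm a language model.
```

The same effect can be achieved during generation with `transformers` stopping criteria, but post-hoc stripping is simpler and does not depend on tokenizer-specific stop-sequence handling.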
Trained for 2 epochs with a sequence length of 640, a mini-batch size of 3, and gradient accumulation of 5 on 8 A6000s, for an effective batch size of 120.

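The effective batch size above is just mini-batch size × gradient-accumulation steps × number of GPUs:

```python
# Effective batch size from the training setup described above.
mini_batch_size = 3   # per-GPU mini-batch
grad_accum_steps = 5  # gradient accumulation
num_gpus = 8          # A6000s

effective_batch = mini_batch_size * grad_accum_steps * num_gpus
print(effective_batch)  # -> 120
```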
Training settings:
- lr: 2.0e-04
- lr_scheduler_type: linear
- warmup_ratio: 0.06
- weight_decay: 0.1
- optimizer: adamw_torch_fused