jordiclive committed 3ae2899 (1 parent: eadbb15)

Update README.md

Files changed (1): README.md (+131 -1)
README.md CHANGED
@@ -4,6 +4,15 @@ datasets:
  - Nebulous/gpt4all_pruned
  - sahil2801/CodeAlpaca-20k
  - yahma/alpaca-cleaned
+ language:
+ - en
+ tags:
+ - sft
+ pipeline_tag: text-generation
+ widget:
+ - text: <|prompter|>What is a meme, and what's the history behind this word?</s><|assistant|>
+ - text: <|prompter|>What's the Earth's total population?</s><|assistant|>
+ - text: <|prompter|>Write a story about the future of AI development</s><|assistant|>
  ---
 
  This repo contains a low-rank adapter for **LLaMA-7b** fit on
 
@@ -23,4 +32,125 @@ This version of the weights was trained with the following hyperparameters:
  - Lora Alpha: 32
  - Lora target modules: q_proj, k_proj, v_proj, o_proj
 
+ The model was trained with flash attention and gradient checkpointing.
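For reference, here is a minimal sketch of how an adapter with the hyperparameters listed above could be configured using the `peft` library. Only `lora_alpha` and `target_modules` come from this card; the rank `r`, dropout, and the other settings are illustrative assumptions, not the values used for this checkpoint.

```python
# Hypothetical LoRA configuration sketch; only lora_alpha and target_modules
# reflect values stated in this model card.
import torch
import transformers
from peft import LoraConfig, get_peft_model

base = transformers.AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf", torch_dtype=torch.float16
)

lora_config = LoraConfig(
    r=16,                       # assumption: the rank is not stated in the card
    lora_alpha=32,              # "Lora Alpha: 32" from the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # from the card
    lora_dropout=0.05,          # assumption
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA matrices are trainable
```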
+
+ ---
+ license: apache-2.0
+
+ # Open-Assistant LLaMA-7b LoRA Model
+
+ This is an English supervised-fine-tuning (SFT) adapter trained as part of
+ the [Open-Assistant](https://github.com/LAION-AI/Open-Assistant) project.
+ It is a low-rank adapter for LLaMA-7b fine-tuned on the instruction datasets
+ listed above, including ~22k human demonstrations of assistant conversations
+ collected through the [https://open-assistant.io/](https://open-assistant.io/)
+ human feedback web app before March 7, 2023.
+
+ ## Model Details
+
+ - **Developed** as part of the OpenAssistant Project
+ - **Model type:** Transformer-based Language Model (low-rank adapter for LLaMA-7b)
+ - **Language:** English
+
+ ## Prompting
+
+ Two special tokens are used to mark the beginning of user and assistant turns:
+ `<|prompter|>` and `<|assistant|>`. Each turn ends with an end-of-sequence token
+ (`</s>` for this LLaMA-based model).
+
+ Input prompt example:
+ ```
+ <|prompter|>What is a meme, and what's the history behind this word?</s><|assistant|>
+ ```
+ The input ends with the `<|assistant|>` token to signal that the model should
+ start generating the assistant reply.
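To illustrate how the same convention extends to multi-turn conversations, here is a minimal sketch; the helper function and the example turns are made up and are not part of the model card:

```python
# Hypothetical helper: assemble a multi-turn prompt in the format described above.
def build_prompt(turns, eos_token="</s>"):
    """turns: list of (role, text) pairs, role being "prompter" or "assistant"."""
    prompt = "".join(f"<|{role}|>{text}{eos_token}" for role, text in turns)
    # End with the assistant marker so the model generates the next reply.
    return prompt + "<|assistant|>"

prompt = build_prompt([
    ("prompter", "What is a meme, and what's the history behind this word?"),
    ("assistant", "A meme is an idea or piece of media that spreads from person to person."),
    ("prompter", "Give me a famous early example."),
])
```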
+
+ **Example Code** (note that several extra token embeddings need to be loaded along with the LoRA weights):
+
+ ```python
+ from typing import List, NamedTuple
+
+ import torch
+ import transformers
+ from huggingface_hub import hf_hub_download
+ from peft import PeftModel
+ from transformers import GenerationConfig
+
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ tokenizer = transformers.AutoTokenizer.from_pretrained("jordiclive/gpt4all-alpaca-oa-codealpaca-lora-7b")
+
+ # Load the base model
+ model = transformers.AutoModelForCausalLM.from_pretrained(
+     "decapoda-research/llama-7b-hf", torch_dtype=torch.float16
+ )
+ # This repo also contains embeddings for several special tokens, so the
+ # embedding matrix is resized beyond the base 32,000-token vocabulary.
+ model.resize_token_embeddings(32016)
+
+ model.config.eos_token_id = tokenizer.eos_token_id
+ model.config.bos_token_id = tokenizer.bos_token_id
+ model.config.pad_token_id = tokenizer.pad_token_id
+
+ # Load the LoRA adapter on top of the base model
+ lora_weights = "jordiclive/gpt4all-alpaca-oa-codealpaca-lora-7b"
+ model = PeftModel.from_pretrained(
+     model,
+     lora_weights,
+     torch_dtype=torch.float16,
+ )
+
+ model.eos_token_id = tokenizer.eos_token_id
+
+ # Load the embeddings for the special tokens and copy them into rows 32000+
+ # of the (frozen) embedding matrix.
+ filename = hf_hub_download("jordiclive/gpt4all-alpaca-oa-codealpaca-lora-7b", "extra_embeddings.pt")
+ embed_weights = torch.load(
+     filename, map_location=torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ )
+ model.base_model.model.model.embed_tokens.weight[32000:, :] = embed_weights.to(
+     model.base_model.model.model.embed_tokens.weight.dtype
+ ).to(device)
+
+ model = model.half().to(device)
+ generation_config = GenerationConfig(
+     temperature=0.1,
+     top_p=0.75,
+     top_k=40,
+     num_beams=4,
+ )
+
+
+ def format_system_prompt(prompt, eos_token="</s>"):
+     # Close the user turn with the EOS token and append <|assistant|> so the
+     # model starts generating the assistant reply.
+     return "{}{}{}{}".format("<|prompter|>", prompt, eos_token, "<|assistant|>")
+
+
+ def generate(prompt, generation_config=generation_config, max_new_tokens=2048, device=device):
+     prompt = format_system_prompt(prompt)  # OpenAssistant prompt format expected
+     input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
+     with torch.no_grad():
+         generation_output = model.generate(
+             input_ids=input_ids,
+             generation_config=generation_config,
+             return_dict_in_generate=True,
+             output_scores=True,
+             max_new_tokens=max_new_tokens,
+             eos_token_id=2,
+         )
+     s = generation_output.sequences[0]
+     output = tokenizer.decode(s)
+     print("Text generated:")
+     print(output)
+     return output
+
+
+ generate("What is a meme, and what's the history behind this word?")
+ generate("What's the Earth's total population?")
+ generate("Write a story about the future of AI development")
+ ```
+
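As a possible follow-up to the example above, the adapter can typically be merged into the base weights so inference no longer needs the `peft` wrapper. The sketch below assumes `model` and `tokenizer` were built as in the example and that the installed `peft` version provides `merge_and_unload()`; the save path is illustrative.

```python
# Hypothetical continuation: fold the LoRA weights into the base model and save
# the result (the manually loaded special-token embeddings are saved with it).
merged = model.merge_and_unload()

save_dir = "llama-7b-gpt4all-alpaca-oa-codealpaca-merged"  # illustrative path
merged.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)
```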