Commit 5c6c6e1 by dfurman (1 parent: 895f1dd)

Create README.md

---
datasets:
- OpenAssistant/oasst1
pipeline_tag: text-generation
---

# Falcon-7b-chat-oasst1

Falcon-7b-chat-oasst1 is a chatbot-like model for dialogue generation. It was built by fine-tuning [Falcon-7B](https://huggingface.co/tiiuae/falcon-7b) on the [OpenAssistant/oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1) dataset.
This model was fine-tuned in 8-bit using 🤗 [peft](https://github.com/huggingface/peft) adapters, [transformers](https://github.com/huggingface/transformers), and [bitsandbytes](https://github.com/TimDettmers/bitsandbytes).
- The training relied on a recent method called "Low-Rank Adaptation" ([LoRA](https://arxiv.org/pdf/2106.09685.pdf)): instead of fine-tuning the entire model, you fine-tune only a small set of adapter weights and load them into the model.
- Training took approximately 6 hours on a single NVIDIA A100-SXM 40GB GPU (via Google Colab).
- See the attached [Notebook](https://huggingface.co/intellio-NLP/falcon-7b-chat-oasst1/blob/main/finetune_falcon7b_oasst1_with_bnb_peft.ipynb) for the code (and hyperparameters) used to train the model.
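
To give a rough sense of why training adapters is cheaper than full fine-tuning, the sketch below compares the trainable parameter count of one full weight matrix with that of a rank-`r` LoRA update. The layer dimensions and rank are illustrative assumptions, not the values used for this model, and the helper names are hypothetical; see the notebook for the actual hyperparameters.

```python
def full_param_count(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update B @ A, with B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 8
full = full_param_count(d_out, d_in)
lora = lora_param_count(d_out, d_in, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full / lora:.0f}x")
```

The base weights stay frozen, so only the two small matrices need gradients and optimizer state.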

## Model Summary

- **Model Type:** Causal decoder-only
- **Language(s) (NLP):** English (primarily)
- **Base Model:** [Falcon-7B](https://huggingface.co/tiiuae/falcon-7b) (License: [TII Falcon LLM License](https://huggingface.co/tiiuae/falcon-7b#license), commercial use permitted)
- **Dataset:** [OpenAssistant/oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1) (License: [Apache 2.0](https://huggingface.co/datasets/OpenAssistant/oasst1/blob/main/LICENSE), commercial use permitted)

### Model Date

May 30, 2023

## Quick Start

To prompt the chat model, use the following format:

```
<human>: [Instruction]
<bot>:
```
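
The format above can be produced with a small helper. This is a minimal sketch for single-turn prompts; the function name `build_prompt` is hypothetical and not part of the model's tooling.

```python
def build_prompt(instruction: str) -> str:
    """Wrap a single user instruction in the <human>/<bot> format shown above."""
    return f"<human>: {instruction.strip()}\n<bot>:"


prompt = build_prompt("Create a list of things to do in San Francisco.")
print(prompt)
```

Ending the prompt at `<bot>:` leaves the model to generate the reply as the continuation.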

### Example Dialogue

**Prompter**:

```
<human>: My name is Daniel. Write a short email to my closest friends inviting them to come to my home on Friday for a dinner party, I will make the food but tell them to BYOB.
<bot>:
```

**Falcon-7b-chat-oasst1**:

```
Dear friends,

I am so excited to host a dinner party at my home this Friday! I will be making a delicious meal, but I would love for you to bring your favorite bottle of wine to share with everyone.

Please let me know if you can make it and if you have any dietary restrictions I should be aware of. I look forward to seeing you soon!

Best,
Daniel
```

**Prompter**:

```
<human>: Create a list of things to do in San Francisco.
<bot>:
```

**Falcon-7b-chat-oasst1**:
> Coming soon.

### Direct Use

This model has been fine-tuned on conversation trees from [OpenAssistant/oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1) and should only be used on data of a similar nature.

### Out-of-Scope Use

Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful.

## Bias, Risks, and Limitations

This model is mostly trained on English data and will not generalize appropriately to other languages. Furthermore, as it is trained on large-scale corpora representative of the web, it will carry the stereotypes and biases commonly encountered online.

### Recommendations

We recommend that users of this model develop guardrails and take appropriate precautions for any production use.

## How to Get Started with the Model

### Setup

```python
# Install packages (notebook syntax; drop the leading "!" in a regular shell)
!pip install -q -U bitsandbytes loralib einops
!pip install -q -U git+https://github.com/huggingface/transformers.git
!pip install -q -U git+https://github.com/huggingface/peft.git
!pip install -q -U git+https://github.com/huggingface/accelerate.git

import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Log in to the Hugging Face Hub
from huggingface_hub import notebook_login

notebook_login()  # use a personal HF token for access to intellio-NLP
```

### GPU Inference in 8-bit

This requires a GPU with at least 12GB memory.
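
As a rough back-of-the-envelope check on that figure, the sketch below estimates the memory taken by the weights alone at different precisions. The parameter count is an assumed round number for Falcon-7B, and the helper name is hypothetical. Activations, the generation cache, the adapter, and CUDA overhead come on top of this, which is presumably why the requirement above is higher than the weight size.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


NUM_PARAMS = 7e9  # assumed round figure for Falcon-7B, for illustration only

for bits in (16, 8):
    print(f"{bits}-bit weights: ~{weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```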

```python
# Load the base model in 8-bit, then attach the fine-tuned LoRA adapter
peft_model_id = "intellio-NLP/falcon-7b-chat-oasst1"
config = PeftConfig.from_pretrained(peft_model_id)

model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    load_in_8bit=True,
    device_map="auto",
    use_auth_token=True,
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as the padding token

model = PeftModel.from_pretrained(model, peft_model_id)
```

```python
# Run the model
prompt = """<human>: My name is Daniel. Write a long email to my closest friends inviting them to come to my home on Friday for a dinner party, I will make the food but tell them to BYOB.
<bot>:"""

batch = tokenizer(
    prompt,
    padding=True,
    truncation=True,
    return_tensors='pt'
)
batch = batch.to('cuda:0')

with torch.cuda.amp.autocast():
    output_tokens = model.generate(
        input_ids=batch.input_ids,
        attention_mask=batch.attention_mask,
        max_new_tokens=200,
        do_sample=True,  # temperature and top_p only take effect when sampling
        temperature=0.7,
        top_p=0.7,
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

# Inspect outputs (the decoded text includes the prompt)
print('\n\n', tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```
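
Because the decoded output contains the prompt followed by the continuation, and the model may go on to write further `<human>:` turns of its own, it can be convenient to keep only the first reply. The helper below is a minimal sketch for single-turn prompts; `extract_reply` is a hypothetical name, not part of `transformers` or `peft`.

```python
def extract_reply(decoded: str, bot_tag: str = "<bot>:", human_tag: str = "<human>:") -> str:
    """Return the text after the first bot tag, cut off at any follow-up human turn."""
    _, found, after = decoded.partition(bot_tag)
    reply = after if found else decoded  # fall back to the full text if no tag is present
    reply = reply.split(human_tag, 1)[0]
    return reply.strip()


example = "<human>: Say hello.\n<bot>: Hello there!\n<human>: Thanks."
print(extract_reply(example))
```

In the snippet above it would be applied to the string returned by `tokenizer.decode(...)`.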

## Reproducibility

- See the attached [Notebook](https://huggingface.co/intellio-NLP/falcon-7b-chat-oasst1/blob/main/finetune_falcon7b_oasst1_with_bnb_peft.ipynb) for the code (and hyperparameters) used to train the model.

### CUDA Info

- CUDA Version: 12.0
- GPU Name: NVIDIA A100-SXM
- Max Memory: `{0: "37GB"}`
- Device Map: `{"": 0}`

### Package Versions Employed

- `torch`==2.0.1+cu118
- `transformers`==4.30.0.dev0
- `peft`==0.4.0.dev0
- `accelerate`==0.19.0
- `bitsandbytes`==0.39.0
- `einops`==0.6.1