---
library_name: peft
base_model: mistralai/Mistral-7B-v0.1
language:
- en
pipeline_tag: text-generation
widget:
  - text: "How many helicopters can a human eat in one sitting?"
tags:
- Δ
- LoRA
---

<!--
# Model Card for Model ID
-->

## Model Details

<!--![image/png](https://cdn-uploads.huggingface.co/production/uploads/648b0f4fd8fe693f51de98d2/aerBANxBtCya732NdBiw0.png)-->
$$
W_{mistral} + LoRA_{zephyr} = W_{zephyr} \\
W_{zephyr} - LoRA_{zephyr} = W_{mistral}
$$
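
The two identities above can be checked on a toy example. The sketch below is illustrative only: tiny integer matrices stand in for the real weight tensors, the names `w_base`, `lora_b`, and `lora_a` are hypothetical, and the `lora_alpha / r` scaling factor that PEFT applies to the low-rank product is omitted for simplicity.

```python
# Toy illustration only: 2x2 integer matrices stand in for real weight tensors.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    """Element-wise difference of two equally shaped matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

w_base = [[1, 2], [3, 4]]   # stands in for W_mistral
lora_b = [[1], [2]]         # 2x1 low-rank factor (rank r = 1)
lora_a = [[3, -1]]          # 1x2 low-rank factor

delta = matmul(lora_b, lora_a)    # low-rank update B @ A
w_tuned = add(w_base, delta)      # W_mistral + LoRA = W_zephyr
w_restored = sub(w_tuned, delta)  # W_zephyr - LoRA = W_mistral

print(w_tuned)
print(w_restored == w_base)
```

Because the update is purely additive, subtracting it recovers the base weights exactly in this integer example; with floating-point weights the round trip is only approximate.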

<!--
$$ W_{mistral} + LoRA_{zephyr} = W_{zephyr} $$
```
typeof/zephyr-7b-beta-lora + mistralai/Mistral-7B-v0.1
= HuggingFaceH4/zephyr-7b-beta
````

### Model Description

- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]


### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-->

### Model Sources
[HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)

## How to Get Started with the Model

Use the code below to get started with the model.

```python
# pip install transformers peft

import torch
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"
peft_model_id = "typeof/zephyr-7b-beta-lora"

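# Load the base Mistral weights, then attach the LoRA adapter on top
# (`load_adapter` relies on the `peft` package being installed).
model = AutoModelForCausalLM.from_pretrained(model_id)
model.load_adapter(peft_model_id)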

tokenizer_id = "HuggingFaceH4/zephyr-7b-beta" # for chat template etc...
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
```
<|system|>
You are a friendly chatbot who always responds in the style of a pirate</s> 
<|user|>
How many helicopters can a human eat in one sitting?</s> 
<|assistant|> 
Well, me matey, that’s a good question indeed! I’ve never seen 
a human eat a helicopter, and I don’t think many others have 
either. However, I’ve heard rumors that some people have 
eaten entire airplanes, so I suppose it’s not entirely unheard 
of.

As for the number of helicopters one could eat, that depends 
on the size and weight of the helicopter. A small, lightweight 
helicopter would be easier to eat than a large, heavy one. 
In fact, I’ve heard that some people have eaten entire helicopters 
as part of a dare or a challenge.

So, my advice to you, me hearty, is to steer clear of helicopters 
and stick to more traditional fare. Yarr!</s>
```
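
To obtain standalone weights instead of a base model with an attached adapter, the LoRA update can be folded into the base weights. The following is a minimal sketch, assuming the adapter loads as a standard PEFT LoRA checkpoint. The helper name `merge_zephyr_lora` is hypothetical, and whether the merged weights match `HuggingFaceH4/zephyr-7b-beta` exactly has not been verified here.

```python
def merge_zephyr_lora(
    base_id="mistralai/Mistral-7B-v0.1",
    adapter_id="typeof/zephyr-7b-beta-lora",
):
    """Return the base model with the LoRA update folded into its weights."""
    # Imported lazily so that defining this helper does not require the libraries.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the adapter, then merge the update into the weights.
    peft_model = PeftModel.from_pretrained(base, adapter_id)
    return peft_model.merge_and_unload()
```

Calling `merge_zephyr_lora()` downloads both repositories and holds the full 7B model in memory, so it is best run on a machine with enough RAM or GPU memory.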
<!--

## Training Details

### Training Data


[More Information Needed]

### Training Procedure


#### Preprocessing [optional]

[More Information Needed]


#### Training Hyperparameters

#### Speeds, Sizes, Times [optional]


[More Information Needed]

## Evaluation


### Testing Data, Factors & Metrics

#### Testing Data


[More Information Needed]

#### Factors


[More Information Needed]

#### Metrics


[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]

## Training procedure

The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_4bit: True
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True

### Framework versions

- PEFT 0.6.3.dev0

-->
#### Summary

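[Zephyr-7B-β](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1), trained with [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290). See the [Zephyr-7B technical report](https://arxiv.org/abs/2310.16944) for details.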

- [LoRA](https://arxiv.org/abs/2106.09685)
- [QLoRA](https://arxiv.org/abs/2305.14314)