---
library_name: peft
base_model: mistralai/Mistral-7B-v0.1
language:
- en
pipeline_tag: text-generation
widget:
- text: "How many helicopters can a human eat in one sitting?"
tags:
- Δ
- LoRA
---
<!--
# Model Card for Model ID
-->
## Model Details
<!--![image/png](https://cdn-uploads.huggingface.co/production/uploads/648b0f4fd8fe693f51de98d2/aerBANxBtCya732NdBiw0.png)-->
$$
W_{mistral} + LoRA_{zephyr} = W_{zephyr} \\
W_{zephyr} - LoRA_{zephyr} = W_{mistral}
$$
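
The two identities above can be checked on a toy example. The sketch below is dependency-free and uses made-up 3x3 numbers, not real model weights; the names `w_base`, `lora_a`, `lora_b`, and `scale` are hypothetical stand-ins for one base layer, the two low-rank LoRA factors, and the usual `lora_alpha / r` scaling. With a real adapter the equality may hold only approximately, since a low-rank update cannot in general represent an arbitrary weight difference exactly.

```python
# Toy illustration only: tiny hand-picked matrices, not actual Mistral/Zephyr weights.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b, sign=1):
    """Return a + sign * b element-wise."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical 3x3 "base" weight (stands in for one layer of the base model).
w_base = [[1.0, 0.0, 2.0],
          [0.0, 1.0, 0.0],
          [3.0, 0.0, 1.0]]

# Rank-1 LoRA factors: B is 3x1 and A is 1x3, so B @ A is a 3x3 update.
lora_b = [[1.0], [2.0], [0.0]]
lora_a = [[0.5, 0.0, 1.0]]
scale = 2.0  # assumed stand-in for lora_alpha / r

delta = [[scale * v for v in row] for row in matmul(lora_b, lora_a)]

w_tuned = add(w_base, delta)       # W_mistral + LoRA_zephyr = W_zephyr
w_back = add(w_tuned, delta, -1)   # W_zephyr - LoRA_zephyr = W_mistral

print(w_tuned)            # [[2.0, 0.0, 4.0], [2.0, 1.0, 4.0], [3.0, 0.0, 1.0]]
print(w_back == w_base)   # True
```

The adapter stores only the small factors (here 3 + 3 numbers instead of 9), which is why it is much smaller than a full copy of the fine-tuned weights.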
<!--
$$ W_{mistral} + LoRA_{zephyr} = W_{zephyr} $$
```
typeof/zephyr-7b-beta-lora + mistralai/Mistral-7B-v0.1
= HuggingFaceH4/zephyr-7b-beta
````
### Model Description
- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]
### Model Sources [optional]
- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
## Uses
### Direct Use
[More Information Needed]
### Downstream Use [optional]
[More Information Needed]
### Out-of-Scope Use
[More Information Needed]
## Bias, Risks, and Limitations
[More Information Needed]
### Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-->
### Model Sources
[HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)
## How to Get Started with the Model
Use the code below to get started with the model.
```python
# pip install transformers peft
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"
peft_model_id = "typeof/zephyr-7b-beta-lora"

# Load the Mistral base weights, then attach the Zephyr LoRA adapter on top.
model = AutoModelForCausalLM.from_pretrained(model_id)
model.load_adapter(peft_model_id)

tokenizer_id = "HuggingFaceH4/zephyr-7b-beta"  # for chat template etc...
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]

# Render the messages with Zephyr's chat template, ending with the assistant turn marker.
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
```
<|system|>
You are a friendly chatbot who always responds in the style of a pirate</s>
<|user|>
How many helicopters can a human eat in one sitting?</s>
<|assistant|>
Well, me matey, that’s a good question indeed! I’ve never seen
a human eat a helicopter, and I don’t think many others have
either. However, I’ve heard rumors that some people have
eaten entire airplanes, so I suppose it’s not entirely unheard
of.
As for the number of helicopters one could eat, that depends
on the size and weight of the helicopter. A small, lightweight
helicopter would be easier to eat than a large, heavy one.
In fact, I’ve heard that some people have eaten entire helicopters
as part of a dare or a challenge.
So, my advice to you, me hearty, is to steer clear of helicopters
and stick to more traditional fare. Yarr!</s>
```
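
The prompt layout visible at the top of the output above can be approximated by hand, which helps when debugging what the model actually sees. The helper `format_zephyr_prompt` below is a hypothetical illustration inferred from that output, not the tokenizer's own template; the authoritative version is the chat template shipped with the `HuggingFaceH4/zephyr-7b-beta` tokenizer, and its exact whitespace may differ.

```python
def format_zephyr_prompt(messages, eos_token="</s>", add_generation_prompt=True):
    """Lay out chat messages as <|role|> blocks, each closed by the EOS token.

    Hypothetical helper: mirrors the layout seen in the sample output,
    not a replacement for tokenizer.apply_chat_template.
    """
    parts = [f"<|{m['role']}|>\n{m['content']}{eos_token}\n" for m in messages]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]

prompt = format_zephyr_prompt(messages)
print(prompt)
```

In practice `apply_chat_template` should be preferred, since it stays in sync with whatever template the tokenizer repository defines.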
<!--
## Training Details
### Training Data
[More Information Needed]
### Training Procedure
#### Preprocessing [optional]
[More Information Needed]
#### Training Hyperparameters
#### Speeds, Sizes, Times [optional]
[More Information Needed]
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
[More Information Needed]
#### Factors
[More Information Needed]
#### Metrics
[More Information Needed]
### Results
[More Information Needed]
#### Summary
## Model Examination [optional]
[More Information Needed]
## Technical Specifications [optional]
### Model Architecture and Objective
[More Information Needed]
### Compute Infrastructure
[More Information Needed]
#### Hardware
[More Information Needed]
#### Software
[More Information Needed]
## Citation [optional]
**BibTeX:**
[More Information Needed]
**APA:**
[More Information Needed]
## Glossary [optional]
[More Information Needed]
## More Information
[More Information Needed]
## Model Card Authors [optional]
[More Information Needed]
## Model Card Contact
[More Information Needed]
## Training procedure
The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_4bit: True
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
### Framework versions
- PEFT 0.6.3.dev0
-->
#### Summary
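[Zephyr-7B-β](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1), trained with [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290). This repository provides a LoRA adapter that, applied to the Mistral base weights, is meant to reproduce Zephyr.

- [Zephyr-7B technical report](https://arxiv.org/abs/2310.16944)
- [LoRA](https://arxiv.org/abs/2106.09685)
- [QLoRA](https://arxiv.org/abs/2305.14314)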