File size: 31,820 Bytes
9b36202 2f66a10 9b36202 2f66a10 9b36202 2f66a10 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 |
---
base_model: mistralai/Pixtral-12B-Base-2409
language:
- en
library_name: transformers
pipeline_tag: image-text-to-text
license: apache-2.0
tags:
- multimodal
- mistral
- pixtral
- unsloth
---
# Finetune Llama 3.2, Qwen 2.5, Gemma 2, Mistral 2-5x faster with 70% less memory via Unsloth!
We have a free Google Colab Tesla T4 notebook for Pixtral (12B) 2409 here: https://colab.research.google.com/drive/1Ys44kVvmeZtnICzWz0xgpRnrIOjZAuxp?usp=sharing
And a free notebook for [Llama 3.2 Vision (11B) here](https://colab.research.google.com/drive/1Ys44kVvmeZtnICzWz0xgpRnrIOjZAuxp?usp=sharing)
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/Discord%20button.png" width="200"/>](https://discord.gg/unsloth)
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
# unsloth/Pixtral-12B-Base-2409-bnb-4bit
For more details on the model, please go to Mistral's original [model card](https://huggingface.co/mistralai/Pixtral-12B-Base-2409)
## β¨ Finetune for Free
All notebooks are **beginner friendly**! Add your dataset, click "Run All", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face.
| Unsloth supports | Free Notebooks | Performance | Memory use |
|-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------|
| **Llama-3.2 (3B)** | [βΆοΈ Start on Colab](https://colab.research.google.com/drive/1Ys44kVvmeZtnICzWz0xgpRnrIOjZAuxp?usp=sharing) | 2.4x faster | 58% less |
| **Llama-3.2 (11B vision)** | [βΆοΈ Start on Colab](https://colab.research.google.com/drive/1Ys44kVvmeZtnICzWz0xgpRnrIOjZAuxp?usp=sharing) | 2x faster | 40% less |
| **Qwen2 VL (7B)** | [βΆοΈ Start on Colab](https://colab.research.google.com/drive/1Ys44kVvmeZtnICzWz0xgpRnrIOjZAuxp?usp=sharing) | 1.8x faster | 40% less |
| **Qwen2.5 (7B)** | [βΆοΈ Start on Colab](https://colab.research.google.com/drive/1Kose-ucXO1IBaZq5BvbwWieuubP7hxvQ?usp=sharing) | 2x faster | 60% less |
| **Llama-3.1 (8B)** | [βΆοΈ Start on Colab](https://colab.research.google.com/drive/1Ys44kVvmeZtnICzWz0xgpRnrIOjZAuxp?usp=sharing) | 2.4x faster | 58% less |
| **Phi-3.5 (mini)** | [βΆοΈ Start on Colab](https://colab.research.google.com/drive/1lN6hPQveB_mHSnTOYifygFcrO8C1bxq4?usp=sharing) | 2x faster | 50% less |
| **Gemma 2 (9B)** | [βΆοΈ Start on Colab](https://colab.research.google.com/drive/1vIrqH5uYDQwsJ4-OO3DErvuv4pBgVwk4?usp=sharing) | 2.4x faster | 58% less |
| **Mistral (7B)** | [βΆοΈ Start on Colab](https://colab.research.google.com/drive/1Dyauq4kTZoLewQ1cApceUQVNcnnNTzg_?usp=sharing) | 2.2x faster | 62% less |
| **DPO - Zephyr** | [βΆοΈ Start on Colab](https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing) | 1.9x faster | 19% less |
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/refs/heads/main/images/documentation%20green%20button.png" width="200"/>](https://docs.unsloth.ai)
- This [conversational notebook](https://colab.research.google.com/drive/1Aau3lgPzeZKQ-98h69CCu1UJcvIBLmy2?usp=sharing) is useful for ShareGPT ChatML / Vicuna templates.
- This [text completion notebook](https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing) is for raw text. This [DPO notebook](https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing) replicates Zephyr.
- \* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster.
## Special Thanks
A huge thank you to the Mistral team for creating and releasing these models.
### Pixtral 2409 Details
Mistral common has image support! You can now pass images and URLs alongside text into the user message.
```
pip install --upgrade mistral_common
```
To use the model checkpoint:
```
# pip install huggingface-hub
from huggingface_hub import snapshot_download
snapshot_download(repo_id="mistral-community/pixtral-12b-240910", local_dir="...")
```
βββββ
ββββββββββββββββββ
βββββββββββββββββββββββββββββββ
ββββββββββββββββββββββββββββββββββ
ββββββββββββββ ββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββ βββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ βββββββ
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ βββ
ββββββββββββββββββββββββββββββββ ββββββββββββββββββ
ββββββββββββββββββββββββββββ
βββββββββββββββββ
βββββ
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β PIXTRAL - 12B - v0.1 10/09/24 β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β
β Β·Β· md5sum Β·Β· β
β β
β b8e9126ef0c15a1130c14b15e8432a67 consolidated.safetensors β
β 68b39355a7b14a7d653292dab340a0be params.json β
β 10229adc84036ff8fe44a2a8e2ad9ba9 tekken.json β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β
β Β·Β· Released by the Mistral AI team Β·Β· β
β β
β - Use GELU for the vision adapter β
β - Use 2D ROPE for the vision encoder β
β β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
## Images
You can encode images as follows
```python
from mistral_common.protocol.instruct.messages import (
UserMessage,
TextChunk,
ImageURLChunk,
ImageChunk,
)
from PIL import Image
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
tokenizer = MistralTokenizer.from_model("pixtral")
image = Image.new('RGB', (64, 64))
# tokenize images and text
tokenized = tokenizer.encode_chat_completion(
ChatCompletionRequest(
messages=[
UserMessage(
content=[
TextChunk(text="Describe this image"),
ImageChunk(image=image),
]
)
],
model="pixtral",
)
)
tokens, text, images = tokenized.tokens, tokenized.text, tokenized.images
# Count the number of tokens
print("# tokens", len(tokens))
print("# images", len(images))
```
## Image URLs
You can pass image url which will be automatically downloaded
```python
url_dog = "https://picsum.photos/id/237/200/300"
url_mountain = "https://picsum.photos/seed/picsum/200/300"
# tokenize image urls and text
tokenized = tokenizer.encode_chat_completion(
ChatCompletionRequest(
messages=[
UserMessage(
content=[
TextChunk(text="Can this animal"),
ImageURLChunk(image_url=url_dog),
TextChunk(text="live here?"),
ImageURLChunk(image_url=url_mountain),
]
)
],
model="pixtral",
)
)
tokens, text, images = tokenized.tokens, tokenized.text, tokenized.images
# Count the number of tokens
print("# tokens", len(tokens))
print("# images", len(images))
```
# ImageData
You can also pass image encoded as base64
```python
tokenized = tokenizer.encode_chat_completion(
ChatCompletionRequest(
messages=[
UserMessage(
content=[
TextChunk(text="What is this?"),
ImageURLChunk(image_url=""),
]
)
],
model="pixtral",
)
)
tokens, text, images = tokenized.tokens, tokenized.text, tokenized.images
# Count the number of tokens
print("# tokens", len(tokens))
print("# images", len(images))
```
|