	---
	license: apache-2.0
	library_name: vllm
	---

	# Pixtral-12B-2409

	> [!WARNING]
	> The usage examples below have not yet been validated against the official evaluations.


	...TODO

	## Usage (vLLM)

	We recommend using Pixtral with the [vLLM library](https://github.com/vllm-project/vllm).

	### Installation

	Important: Make sure you install `vLLM >= v0.6.1.post1`:

	```
	pip install --upgrade vllm
	```

	Also make sure you have `mistral_common >= 1.4.0` installed:

	```
	pip install --upgrade mistral_common
	```
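To confirm that the installed packages meet these minimums, a small check along the following lines can help. This is a minimal sketch, not part of vLLM or mistral_common: `MINIMUMS`, `release_key`, and `meets_minimum` are hypothetical names, and the parser only understands plain numeric releases with an optional `.postN` suffix (pre-release, dev, and local tags are ignored).

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions quoted in the installation notes above.
MINIMUMS = {"vllm": "0.6.1.post1", "mistral_common": "1.4.0"}


def release_key(version_string):
    """Convert a version such as '0.6.1.post1' into a comparable tuple.

    Simplified on purpose: only the numeric release and an optional
    '.postN' suffix are read; anything else (rc, dev, local tags) is ignored.
    """
    match = re.match(r"v?(\d+(?:\.\d+)*)(?:\.post(\d+))?", version_string.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version_string!r}")
    release = [int(part) for part in match.group(1).split(".")]
    # Drop trailing zeros so that '1.4' and '1.4.0' compare equal.
    while release and release[-1] == 0:
        release.pop()
    # A missing post-release sorts before '.post0'.
    post = int(match.group(2)) if match.group(2) is not None else -1
    return tuple(release), post


def meets_minimum(installed, minimum):
    """Return True when `installed` is at least `minimum`."""
    return release_key(installed) >= release_key(minimum)


# Report the status of each required package in the current environment.
for package, minimum in MINIMUMS.items():
    try:
        installed = version(package)
    except PackageNotFoundError:
        print(f"{package}: not installed (need >= {minimum})")
        continue
    status = "ok" if meets_minimum(installed, minimum) else f"too old, need >= {minimum}"
    print(f"{package}: {installed} ({status})")
```

If either package is reported as missing or too old, rerun the corresponding `pip install --upgrade` command.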

	You can also make use of a ready-to-go [docker image](https://hub.docker.com/layers/vllm/vllm-openai/latest/images/sha256-de9032a92ffea7b5c007dad80b38fd44aac11eddc31c435f8e52f3b7404bbf39?context=explore).

	_Simple Example_

	```py
	from vllm import LLM
	from vllm.sampling_params import SamplingParams

	model_name = "mistralai/Pixtral-12B-2409"

	sampling_params = SamplingParams(max_tokens=8192)

	llm = LLM(model=model_name, tokenizer_mode="mistral")

	prompt = "Describe this image in one sentence."
	image_url = "https://picsum.photos/id/237/200/300"

	messages = [
	    {
	        "role": "user",
	        "content": [
	            {"type": "text", "text": prompt},
	            {"type": "image_url", "image_url": {"url": image_url}},
	        ],
	    },
	]

	outputs = llm.chat(messages, sampling_params=sampling_params)

	print(outputs[0].outputs[0].text)
	```

	_Advanced Example_

	You can also pass multiple images per message and/or multi-turn conversations:

	```py
	from vllm import LLM
	from vllm.sampling_params import SamplingParams

	model_name = "mistralai/Pixtral-12B-2409"
	max_img_per_msg = 5

	sampling_params = SamplingParams(max_tokens=8192, temperature=0.7)

	# Lower max_num_seqs or max_model_len on low-VRAM GPUs.
	llm = LLM(
	    model=model_name,
	    tokenizer_mode="mistral",
	    limit_mm_per_prompt={"image": max_img_per_msg},
	    max_model_len=32768,
	)

	prompt = "Describe the following image."

	url_1 = "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"
	url_2 = "https://picsum.photos/seed/picsum/200/300"
	url_3 = "https://picsum.photos/id/32/512/512"

	messages = [
	    {
	        "role": "user",
	        "content": [
	            {"type": "text", "text": prompt},
	            {"type": "image_url", "image_url": {"url": url_1}},
	            {"type": "image_url", "image_url": {"url": url_2}},
	        ],
	    },
	    {
	        "role": "assistant",
	        "content": "The images show nature.",
	    },
	    {
	        "role": "user",
	        "content": "More details please, and answer only in French!",
	    },
	    {
	        "role": "user",
	        "content": [{"type": "image_url", "image_url": {"url": url_3}}],
	    },
	]

	outputs = llm.chat(messages=messages, sampling_params=sampling_params)
	print(outputs[0].outputs[0].text)
	```

	You can find more examples and tests directly in the vLLM repository:
	- [Examples](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_pixtral.py)
	- [Tests](https://github.com/vllm-project/vllm/blob/main/tests/models/test_pixtral.py)
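Because each image is just another `image_url` entry in the `content` list, a user message with several images can be assembled programmatically. The following is a minimal sketch: `build_user_message` is a hypothetical helper, not a vLLM or mistral_common function, and its `max_images` default simply mirrors the `limit_mm_per_prompt` value from the advanced example.

```python
def build_user_message(prompt, image_urls, max_images=5):
    """Build one user message in the chat format used in the examples above.

    `max_images` is an illustrative guard; keep it in line with the
    `limit_mm_per_prompt` value passed to vLLM.
    """
    image_urls = list(image_urls)
    if len(image_urls) > max_images:
        raise ValueError(f"got {len(image_urls)} images, limit is {max_images}")
    # The text chunk comes first, followed by one chunk per image URL.
    content = [{"type": "text", "text": prompt}]
    content += [{"type": "image_url", "image_url": {"url": url}} for url in image_urls]
    return {"role": "user", "content": content}


message = build_user_message(
    "Describe the following images.",
    [
        "https://picsum.photos/seed/picsum/200/300",
        "https://picsum.photos/id/32/512/512",
    ],
)
print(len(message["content"]))  # one text chunk plus two image chunks
```

The returned dictionary can be placed directly in the `messages` list passed to `llm.chat`. As the option name suggests, `limit_mm_per_prompt` most likely applies to the prompt as a whole, so in a multi-turn conversation the total image count across messages should stay within it.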

	_Server_

	You can also use Pixtral in a server/client setting.

	1. Spin up a server:

	```
	vllm serve mistralai/Pixtral-12B-2409 --tokenizer_mode mistral --limit_mm_per_prompt 'image=4'
	```

	2. Query the server from a client:

	```
	curl --location 'http://<your-node-url>:8000/v1/chat/completions' \
	    --header 'Content-Type: application/json' \
	    --header 'Authorization: Bearer token' \
	    --data '{
	        "model": "mistralai/Pixtral-12B-2409",
	        "messages": [
	            {
	                "role": "user",
	                "content": [
	                    {"type": "text", "text": "Describe this image in detail please."},
	                    {"type": "image_url", "image_url": {"url": "https://s3.amazonaws.com/cms.ipressroom.com/338/files/201808/5b894ee1a138352221103195_A680%7Ejogging-edit/A680%7Ejogging-edit_hero.jpg"}},
	                    {"type": "text", "text": "and this one as well. Answer in French."},
	                    {"type": "image_url", "image_url": {"url": "https://www.wolframcloud.com/obj/resourcesystem/images/a0e/a0ee3983-46c6-4c92-b85d-059044639928/6af8cfb971db031b.png"}}
	                ]
	            }
	        ]
	    }'
	```
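The same request can be sent from Python. The sketch below uses only the standard library and assumes that the server exposes the `/v1/chat/completions` route shown in the curl call and returns the reply under `choices[0].message.content`, as OpenAI-compatible servers do. `build_payload`, `chat`, and the `base_url` argument are hypothetical names chosen for illustration.

```python
import json
import urllib.request

MODEL = "mistralai/Pixtral-12B-2409"


def build_payload(chunks, model=MODEL):
    """Build the JSON body from (kind, value) pairs.

    kind is "text" or "image_url"; value is the text or the image URL.
    """
    content = []
    for kind, value in chunks:
        if kind == "text":
            content.append({"type": "text", "text": value})
        elif kind == "image_url":
            content.append({"type": "image_url", "image_url": {"url": value}})
        else:
            raise ValueError(f"unknown chunk kind: {kind!r}")
    return {"model": model, "messages": [{"role": "user", "content": content}]}


def chat(base_url, payload, token="token", timeout=120):
    """POST the payload to the server and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


payload = build_payload([
    ("text", "Describe this image in one sentence."),
    ("image_url", "https://picsum.photos/id/237/200/300"),
])
# With a server running, for example:
# print(chat("http://<your-node-url>:8000", payload))
```

Keeping payload construction separate from the network call makes it possible to inspect or log the request body before anything is sent.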