Instructions to use lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA with libraries and local apps.

Libraries

PEFT

How to use lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA with PEFT:

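from peft import PeftModel
from transformers import AutoModelForImageTextToText

# Qwen3-VL is a vision-language model, so load the base model from the Hub
# through the image-text-to-text auto class, then attach the LoRA adapter.
base_model = AutoModelForImageTextToText.from_pretrained("Qwen/Qwen3-VL-8B-Instruct")
model = PeftModel.from_pretrained(base_model, "lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA")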

Transformers

How to use lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA", dtype="auto")

Local Apps

vLLM

How to use lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
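
The same request can be sent from Python. The sketch below uses only the standard library and assumes the server started above is listening on `http://localhost:8000`; the helper names `build_payload` and `chat` are hypothetical and chosen for illustration.

```python
import json
import urllib.request

MODEL_ID = "lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA"


def build_payload(image_url: str, prompt: str, model: str = MODEL_ID) -> dict:
    """Build the same chat-completions body as the curl request."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(payload: dict, base_url: str = "http://localhost:8000") -> str:
    """POST the payload to the OpenAI-compatible endpoint and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style responses put the generated text in the first choice.
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat(build_payload(image_url, "Describe this image in one sentence."))` should return the model's reply as a string. Passing `base_url="http://localhost:30000"` would target a server on the SGLang port used later on this page.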

Use Docker

docker model run hf.co/lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA

SGLang

How to use lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request shown in the pip-based SGLang steps above.

Docker Model Runner
How to use lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA with Docker Model Runner:
```
docker model run hf.co/lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA
```

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Qwen3-VL-8B-Instruct-Vision-R1-LoRA

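This is a multimodal vision-language model obtained by fine-tuning Qwen/Qwen3-VL-8B-Instruct with QLoRA using LLaMA Factory.

Model description

Base model: Qwen3-VL-8B-Instruct
Fine-tuning method: QLoRA (4-bit quantization + LoRA)
Training data: vision_r1_mulberry_sft_full
LoRA rank: 8
LoRA target modules: all
Parameters: ~8.7B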
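
To give a sense of what a LoRA rank of 8 means in practice, the sketch below counts the trainable parameters LoRA adds to a single linear layer. The 4096 x 4096 layer size is a hypothetical value chosen for illustration, not a figure taken from this card, and `lora_param_count` is an illustrative helper.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_in x d_out linear layer.

    LoRA learns two small matrices, A (rank x d_in) and B (d_out x rank),
    while the original weight matrix stays frozen.
    """
    return rank * (d_in + d_out)


RANK = 8             # LoRA rank listed in the model description
D_IN = D_OUT = 4096  # hypothetical layer size, for illustration only

full = D_IN * D_OUT
lora = lora_param_count(D_IN, D_OUT, RANK)
print(f"full weight: {full:,}  LoRA update: {lora:,}  ratio: {lora / full:.2%}")
```

For this hypothetical layer the adapter trains 65,536 parameters against 16,777,216 frozen ones, roughly 0.39%, which is why an adapter repository is much smaller than the base model it modifies.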

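Training settings

Parameter	Value
Learning rate	1.0e-4
Batch size	1 (gradient accumulation: 8)
Optimizer	AdamW
Learning rate schedule	Cosine
Epochs	3
Training steps	375
Final loss	0.638
Training time	~45 minutes (single 24 GB 3090 GPU)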
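
The settings above can be cross-checked with a little arithmetic. The sketch below assumes that the 375 steps are optimizer updates (one per 8 accumulated micro-batches) and that training ran on one GPU, as the training-time row suggests; the derived sample counts are estimates under those assumptions, not figures reported by the author.

```python
per_device_batch_size = 1
gradient_accumulation_steps = 8
num_gpus = 1            # single 3090, per the training-time row
total_steps = 375
epochs = 3
training_minutes = 45   # approximate, as reported

# Samples contributing to each optimizer update.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus

# Total samples processed, and the implied number of samples per epoch.
samples_seen = effective_batch_size * total_steps
samples_per_epoch = samples_seen // epochs

# Rough wall-clock cost of one optimizer step.
seconds_per_step = training_minutes * 60 / total_steps

print(effective_batch_size, samples_seen, samples_per_epoch, seconds_per_step)
```

Under these assumptions the effective batch size is 8, about 3,000 samples are processed in total (about 1,000 per epoch), and each optimizer step takes roughly 7.2 seconds.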

Usage

Using LLaMA Factory

CUDA_VISIBLE_DEVICES=0 llamafactory-cli chat \
    --model_name_or_path Qwen/Qwen3-VL-8B-Instruct \
    --adapter_name_or_path lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA \
    --template qwen3_vl_nothink \
    --finetuning_type lora

Using Transformers

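from peft import PeftModel
from transformers import AutoModelForImageTextToText, AutoProcessor

# Qwen3-VL is a vision-language model, so load the base through the
# image-text-to-text auto class rather than AutoModelForCausalLM.
base_model = AutoModelForImageTextToText.from_pretrained(
    "Qwen/Qwen3-VL-8B-Instruct",
    device_map="auto",
)

# Attach the LoRA adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(
    base_model,
    "lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA",
)

# The processor bundles the tokenizer and the image preprocessor.
processor = AutoProcessor.from_pretrained("Qwen/Qwen3-VL-8B-Instruct")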
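
Once the adapter is attached, a single image question could be run roughly as sketched below. This is not taken from the original card: it assumes `model` is the PEFT-wrapped model from the snippet above and that `processor` is an `AutoProcessor` loaded from `Qwen/Qwen3-VL-8B-Instruct` (a plain tokenizer cannot prepare image inputs). The helper names `build_messages` and `generate_answer` are hypothetical.

```python
def build_messages(image_url: str, question: str) -> list[dict]:
    """Build a single-turn chat with one image and one text question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


def generate_answer(model, processor, messages, max_new_tokens: int = 256) -> str:
    """Run one generation pass and return only the newly generated text."""
    # The processor applies the chat template, fetches the image, and tokenizes.
    inputs = processor.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so that only the model's reply is decoded.
    new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
    return processor.batch_decode(new_tokens, skip_special_tokens=True)[0]
```

A call such as `generate_answer(model, processor, build_messages(image_url, "What animal is on the candy?"))` would then return the reply as a string. The `max_new_tokens` default is an arbitrary illustrative value.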

Framework versions

PEFT 0.18.1
Transformers 5.2.0
PyTorch 2.6.0+cu124
LLaMA Factory 0.13.0

Model tree for lumos2548/Qwen3-VL-8B-Instruct-Vision-R1-LoRA

Base model

Qwen/Qwen3-VL-8B-Instruct