Qwen3-VL-8B-Thinking-Mobile-SFT

This is the Mobile SFT model from the UI-MOPD project — a supervised fine-tuned 8B model trained on mobile GUI interaction data, serving as the initialization for the student policy before on-policy distillation.

Model Description

Qwen3-VL-8B-Thinking-Mobile-SFT is fine-tuned from Qwen3-VL-8B-Thinking on mobile interaction trajectories from the Uni-GUI dataset. It provides a warm-start checkpoint for the 8B student model in the UI-MOPD multi-teacher on-policy distillation framework.

Key Highlights

  • Base Model: Qwen3-VL-8B-Thinking
  • Training Data: Mobile subset of Uni-GUI (~160K interaction steps across ~11.5K trajectories)
  • Training: Supervised fine-tuning (SFT) on mobile GUI interaction trajectories
  • Role: Warm-start initialization for the student policy in Stage 2 (RL distillation)

Training Details

This model is obtained in Stage 1 of the UI-MOPD training pipeline:

  1. Stage 1 (This Model): Supervised fine-tuning of Qwen3-VL-8B-Thinking on mobile GUI interaction trajectories from Uni-GUI.
  2. Stage 2: The SFT checkpoint is further trained via DAPO (reinforcement learning) with multi-teacher on-policy distillation using platform-conditioned routing from 32B teacher models.

Intended Use

This model is designed to:

  • Serve as a warm-start checkpoint for the student policy in the UI-MOPD distillation framework
  • Be used as a standalone mobile GUI agent for executing mobile tasks (e.g., app navigation, settings control, messaging)
  • Provide a baseline for comparing SFT-only vs. distillation-enhanced performance

How to Use

from transformers import Qwen3VLForConditionalGeneration, AutoProcessor

model = Qwen3VLForConditionalGeneration.from_pretrained(
    "UI-MOPD/Qwen3-VL-8B-Thinking-Mobile-SFT",
    torch_dtype="auto",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("UI-MOPD/Qwen3-VL-8B-Thinking-Mobile-SFT")

Citation

@article{lian2025uimopd,
  title={UI-MOPD: Multi-platform On-Policy Distillation for Continual GUI Agent Learning},
  author={Lian, Niu and Chen, Alan and Yu, Zhehao and Duan, Chengzhen and Liu, Fazhan and Liu, Hui and Fu, Pei and Luan, Jian and Wang, Yaowei and Xia, Shu-Tao and Wang, Jinpeng},
  year={2025}
}

Related Resources

Downloads last month
-
Safetensors
Model size
9B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for UI-MOPD/Qwen3-VL-8B-Thinking-Mobile-SFT

Finetuned
(69)
this model