UI-MOPD

UI-MOPD: Multi-platform On-Policy Distillation for Continual GUI Agent Learning

We build cross-platform GUI agents that can operate both desktop and mobile interfaces through a unified training framework.

UI-MOPD introduces a two-stage training pipeline:

Stage 1: Supervised Fine-Tuning (SFT) on platform-specific teacher models
Stage 2: Reinforcement Learning distillation (DAPO) with multi-teacher on-policy guidance
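As a rough illustration of what "multi-teacher on-policy guidance" in Stage 2 could look like, the sketch below routes each student-generated sample to the teacher for its platform and scores it token by token. The card does not state the actual objective, so the per-token reverse KL used here is an assumption (a common choice for on-policy distillation), and it is not the DAPO update itself. The names `reverse_kl`, `distillation_loss`, the batch layout, and the toy teachers are all hypothetical.

```python
import math
from typing import Callable, Dict, List

Logits = List[float]


def log_softmax(logits: Logits) -> List[float]:
    """Numerically stable log-softmax over one token's vocabulary logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def reverse_kl(student_logits: Logits, teacher_logits: Logits) -> float:
    """KL(student || teacher) at a single token position (assumed objective)."""
    s = log_softmax(student_logits)
    t = log_softmax(teacher_logits)
    return sum(math.exp(ls) * (ls - lt) for ls, lt in zip(s, t))


def distillation_loss(
    batch: List[dict],
    teachers: Dict[str, Callable[[List[int]], List[Logits]]],
) -> float:
    """Mean per-token reverse KL over student-sampled sequences.

    Each sample is scored only by the teacher that matches its platform.
    """
    total, count = 0.0, 0
    for sample in batch:
        teacher = teachers[sample["platform"]]      # pick desktop or mobile teacher
        teacher_logits = teacher(sample["tokens"])  # teacher scores the student's own tokens
        for s_row, t_row in zip(sample["student_logits"], teacher_logits):
            total += reverse_kl(s_row, t_row)
            count += 1
    return total / count if count else 0.0


# Hypothetical toy teachers over a 3-token vocabulary, standing in for the 32B models.
def desktop_teacher(tokens: List[int]) -> List[Logits]:
    return [[2.0, 0.0, 0.0] for _ in tokens]


def mobile_teacher(tokens: List[int]) -> List[Logits]:
    return [[0.0, 2.0, 0.0] for _ in tokens]


teachers = {"desktop": desktop_teacher, "mobile": mobile_teacher}

# Student rollouts: the desktop sample agrees with its teacher, the mobile one does not.
batch = [
    {"platform": "desktop", "tokens": [1, 2],
     "student_logits": [[2.0, 0.0, 0.0], [2.0, 0.0, 0.0]]},
    {"platform": "mobile", "tokens": [3],
     "student_logits": [[2.0, 0.0, 0.0]]},
]

loss = distillation_loss(batch, teachers)
```

In this toy batch only the mobile sample contributes to the loss, since the student already matches the desktop teacher. A real setup would replace the toy teachers with forward passes of the platform-specific teacher models and backpropagate through the student logits.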

Our student model (8B) learns from multiple 32B teacher models to achieve strong cross-platform GUI interaction capabilities.

Model	Size	Description
Qwen3-VL-32B-Thinking-Desktop-Teacher	33B	Desktop platform teacher
Qwen3-VL-32B-Thinking-Mobile-Teacher	33B	Mobile platform teacher
Qwen3-VL-8B-Thinking-Desktop-SFT	9B	Desktop SFT checkpoint
Qwen3-VL-8B-Thinking-Mobile-SFT	9B	Mobile SFT checkpoint
Qwen3-VL-8B-Thinking-UI-MOPD-Student	9B	Final cross-platform student

Dataset	Description
Uni-GUI-OpenCUA	Post-processed desktop trajectories from OpenCUA (~832 episodes, ~14K steps)
Uni-GUI-Desktop-1	Large-scale desktop GUI trajectories (~2.7K episodes, ~36K steps)