gui360-cooperative-lora-svd-sft

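Cooperative LoRA adapter (rank r=128, T=2 communication rounds), SVD-initialized from the full-parameter SFT checkpoint. On the GUI-360 balanced 1K test set it reaches TSR 18.6% and StepSR 65.3% with ~67M parameters. Intended as the starting checkpoint for RL.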
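To make "SVD initialization" concrete, here is a minimal NumPy sketch. It assumes the LoRA factors are taken from a truncated SVD of the weight difference between the full SFT checkpoint and the base model; this card does not spell out the exact recipe, so treat that as an assumption. The function name `svd_lora_init` and the toy shapes are hypothetical.

```python
import numpy as np

def svd_lora_init(w_base, w_sft, rank):
    """Return LoRA factors (A, B) so that B @ A is the best rank-`rank`
    approximation of the fine-tuning update w_sft - w_base.

    Assumption: this is one common way to do SVD-based LoRA initialization,
    not necessarily the exact procedure used for this checkpoint.
    """
    delta = w_sft - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    sqrt_s = np.sqrt(s[:rank])
    b = u[:, :rank] * sqrt_s           # shape (d_out, rank)
    a = sqrt_s[:, None] * vt[:rank]    # shape (rank, d_in)
    return a, b

# Toy example: the "fine-tuned" weight is the base plus an exactly rank-4 update.
rng = np.random.default_rng(0)
w_base = rng.standard_normal((64, 48))
w_sft = w_base + rng.standard_normal((64, 4)) @ rng.standard_normal((4, 48))

a, b = svd_lora_init(w_base, w_sft, rank=4)
recon_error = np.linalg.norm(w_base + b @ a - w_sft)
```

In this toy case the update has rank 4, so a rank-4 adapter recovers it up to floating-point error. A real SFT update is generally not low-rank, so a rank-128 adapter would only approximate it.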

Base Model

  • Qwen2.5-VL-7B-Instruct

Training Data

  • GUI-360 balanced 2K episodes (17,264 steps)
  • Action types: click, type, swipe (balanced sampling)
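As a rough illustration of balanced sampling over action types, the sketch below draws an equal number of items per type. It assumes each item carries a single `action_type` label; the card does not say whether balancing was done per step or per episode, and the helper `balanced_sample` and its field names are hypothetical.

```python
import random
from collections import Counter, defaultdict

def balanced_sample(items, n_total, seed=0):
    """Draw roughly n_total items with an equal share per action type."""
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for item in items:
        by_type[item["action_type"]].append(item)
    per_type = n_total // len(by_type)
    sample = []
    for action_type in sorted(by_type):
        pool = by_type[action_type]
        # Never request more items than the pool holds.
        sample.extend(rng.sample(pool, min(per_type, len(pool))))
    rng.shuffle(sample)
    return sample

# Toy, deliberately skewed pool: 100 clicks, 30 types, 20 swipes.
items = (
    [{"action_type": "click"}] * 100
    + [{"action_type": "type"}] * 30
    + [{"action_type": "swipe"}] * 20
)
sample = balanced_sample(items, n_total=30)
counts = Counter(item["action_type"] for item in sample)
```

Here `counts` holds the per-type totals of the drawn sample, which come out equal even though the pool is dominated by clicks.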

Evaluation (GUI-360 test 1K balanced)

Metric                      Value
TSR (Task Success Rate)     18.6%
StepSR (Step Success Rate)  65.3%
Progress                    31.0%
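The gap between StepSR and TSR is easier to read with a small sketch of how the two metrics relate. It assumes a task counts as successful only when every one of its steps is correct, which is the usual convention but is not stated in this card. Progress is left out because its definition is not given here. The function names and toy data are hypothetical.

```python
def step_success_rate(episodes):
    """Fraction of all steps, pooled across episodes, that are correct."""
    steps = [ok for episode in episodes for ok in episode]
    return sum(steps) / len(steps)

def task_success_rate(episodes):
    """Fraction of episodes in which every step is correct (assumed definition)."""
    return sum(all(episode) for episode in episodes) / len(episodes)

# Toy data: each inner list holds per-step correctness for one episode.
episodes = [
    [True, True, True],
    [True, False, True],
    [False, True],
    [True, True],
]

step_sr = step_success_rate(episodes)  # 8 correct steps out of 10
tsr = task_success_rate(episodes)      # 2 fully correct episodes out of 4
```

Under this definition a single wrong step fails the whole task, so TSR falls well below StepSR as episodes get longer.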

Full Ranking

#  Method                         TSR    Params
1  Full-param SFT step-250        22.2%  7.6B
2  V15 Cooperative RL step-25     20.8%  ~142M
3  PEFT Cooperative r=128 (SVD)   18.6%  ~67M
4  PEFT Standard r=128 (SVD)      18.1%  ~67M
5  Base model (zero-shot)         2.4%   n/a

Citation

Part of the Cooperative LoRA research for GUI agents.
