Qwen3.5-9B AgentNet Ubuntu (OpenCUA L2 CoT, ckpt-100)

Full fine-tuning of Qwen/Qwen3.5-9B on the AgentNet Ubuntu split using the OpenCUA L2 chain-of-thought template (Thought + Action + Code with ## markdown headers).

Partial training: checkpoint at step 100/300 (~33% of one epoch). Run was preempted by AWS Capacity Block expiration before reaching epoch end.

Training format (OpenCUA L2)

## Thought:
<reasoning>

## Action:
<one-sentence>

## Code:
pyautogui.click(x=0.5, y=0.5)

Coordinates normalized to [0, 1]. The ## markdown headers help the base model emit the schema reliably (vs. the legacy bare Thought: form). See insagur/qwen3.5-9b-agentnet-ubuntu-1epoch for the legacy-format variant.

Training config

  • Hardware: 1 × 8 A100 80GB SXM4
  • Distributed: DeepSpeed ZeRO-2 + bf16
  • Optimizer: AdamW, LR 1e-5 cosine, warmup 200 steps
  • Batch: per_device_bs=1 × grad_accum=16 × 8 GPU = global batch 128
  • Steps: 100 (preempted; 1 epoch = 300 steps)
  • EMA teacher: target=block, decay=0.9995, α=0.5
  • Sequence length: 3072
  • Image tokens: 2048 (≈1.6M pixel cap)
  • Save frequency: every 50 steps

Metrics @ step 100

Metric Value
Train loss 0.4601
Train token_acc 0.8416
Eval loss 0.4718
Eval token_acc 0.8387

Already approaches the fully-trained legacy-format model's eval loss (0.4622) at only 33% of training, suggesting the ## format converges faster.

Data

scripts/convert_agentnet_cot.py --cot_level l2 produces this format from AgentNet 5K trajectories with the same quality filter as the legacy converter (alignment≥7, efficiency≥5).

Split Samples
Train 38,317
Val 1,866

Inference

from transformers import AutoModelForImageTextToText, AutoProcessor

model = AutoModelForImageTextToText.from_pretrained(
    "insagur/qwen3.5-9b-agentnet-cot-l2-step100",
    torch_dtype="bfloat16",
).to("cuda")
processor = AutoProcessor.from_pretrained("insagur/qwen3.5-9b-agentnet-cot-l2-step100")

system = (
    "You are a computer-use agent operating a Linux desktop. "
    "Respond using the OpenCUA L2 format:\n"
    "## Thought:\n<reasoning>\n\n## Action:\n<one-sentence>\n\n## Code:\n<pyautogui code with normalized [0,1] coords>"
)
# ... see scripts/eval.py in the training repo for full inference loop ...

Recipe

Training code: https://github.com/2bhapby/gui_internal_worldmodel

python scripts/convert_agentnet_cot.py --src ... --images_dir ... --out_dir ./agentnet_l2 --cot_level l2

CONFIG=configs/qwen35_9b_agentnet.yaml RUN_NAME=a100-9b-1ep-cot-l2 \
  sbatch --gpus=8 scripts/slurm_train_qwen.sbatch \
    data.train_jsonl=./agentnet_l2/train.jsonl \
    data.val_jsonl=./agentnet_l2/val.jsonl
Downloads last month
20
Safetensors
Model size
9B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for insagur/qwen3.5-9b-agentnet-cot-l2-step100

Finetuned
Qwen/Qwen3.5-9B
Finetuned
(373)
this model

Dataset used to train insagur/qwen3.5-9b-agentnet-cot-l2-step100