
Libraries

How to use insagur/qwen3.5-9b-agentnet-cot-l2-step100 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="insagur/qwen3.5-9b-agentnet-cot-l2-step100")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("insagur/qwen3.5-9b-agentnet-cot-l2-step100")
model = AutoModelForMultimodalLM.from_pretrained("insagur/qwen3.5-9b-agentnet-cot-l2-step100")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use insagur/qwen3.5-9b-agentnet-cot-l2-step100 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "insagur/qwen3.5-9b-agentnet-cot-l2-step100"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "insagur/qwen3.5-9b-agentnet-cot-l2-step100",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker

docker model run hf.co/insagur/qwen3.5-9b-agentnet-cot-l2-step100

SGLang

How to use insagur/qwen3.5-9b-agentnet-cot-l2-step100 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "insagur/qwen3.5-9b-agentnet-cot-l2-step100" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "insagur/qwen3.5-9b-agentnet-cot-l2-step100",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "insagur/qwen3.5-9b-agentnet-cot-l2-step100" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request shown above (OpenAI-compatible API on port 30000).

Docker Model Runner
How to use insagur/qwen3.5-9b-agentnet-cot-l2-step100 with Docker Model Runner:
```
docker model run hf.co/insagur/qwen3.5-9b-agentnet-cot-l2-step100
```

Qwen3.5-9B AgentNet Ubuntu (OpenCUA L2 CoT, ckpt-100)

Full fine-tuning of Qwen/Qwen3.5-9B on the AgentNet Ubuntu split using the OpenCUA L2 chain-of-thought template (Thought + Action + Code with ## markdown headers).

Partial training: checkpoint at step 100/300 (~33% of one epoch). The run was preempted by an AWS Capacity Block expiration before reaching the end of the epoch.

Training format (OpenCUA L2)

## Thought:
<reasoning>

## Action:
<one-sentence>

## Code:
pyautogui.click(x=0.5, y=0.5)

Coordinates are normalized to [0, 1]. The ## markdown headers help the base model emit the schema reliably (compared with the legacy bare Thought: form). See insagur/qwen3.5-9b-agentnet-ubuntu-1epoch for the legacy-format variant.
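
A response in this format can be split back into its three sections and the normalized click mapped to screen pixels. The following is a minimal sketch, not code from the training repo: `parse_l2` and `click_to_pixels` are hypothetical helper names, the example response text is made up, and only the simple `pyautogui.click(x=..., y=...)` form shown above is handled.

```python
import re

# Section names follow the "## Thought:", "## Action:", "## Code:" headers of the L2 format.
SECTION_RE = re.compile(r"^## (Thought|Action|Code):[ \t]*\n", re.MULTILINE)
CLICK_RE = re.compile(r"pyautogui\.click\(x=([0-9.]+),\s*y=([0-9.]+)\)")


def parse_l2(response: str) -> dict:
    """Split a response into thought / action / code sections (hypothetical helper)."""
    matches = list(SECTION_RE.finditer(response))
    sections = {}
    for i, m in enumerate(matches):
        # A section runs from the end of its header to the start of the next header.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(response)
        sections[m.group(1).lower()] = response[m.end():end].strip()
    return sections


def click_to_pixels(code: str, width: int, height: int):
    """Map a normalized pyautogui.click(x=..., y=...) call to pixel coordinates."""
    m = CLICK_RE.search(code)
    if m is None:
        return None
    x, y = float(m.group(1)), float(m.group(2))
    return round(x * width), round(y * height)


# Made-up response for illustration only.
example = (
    "## Thought:\nThe Files icon is in the dock.\n\n"
    "## Action:\nClick the Files icon.\n\n"
    "## Code:\npyautogui.click(x=0.5, y=0.5)\n"
)
parsed = parse_l2(example)
print(parsed["action"])
print(click_to_pixels(parsed["code"], 1920, 1080))
```

Other pyautogui calls (typing, scrolling, hotkeys) would need their own handling; this sketch returns `None` for them.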

Training config

Hardware: 1 × 8 A100 80GB SXM4
Distributed: DeepSpeed ZeRO-2 + bf16
Optimizer: AdamW, LR 1e-5 cosine, warmup 200 steps
Batch: per_device_bs=1 × grad_accum=16 × 8 GPU = global batch 128
Steps: 100 (preempted; 1 epoch = 300 steps)
EMA teacher: target=block, decay=0.9995, α=0.5
Sequence length: 3072
Image tokens: 2048 (≈1.6M pixel cap)
Save frequency: every 50 steps
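
The numbers in this config can be cross-checked with a little arithmetic. The sketch below uses only values stated in this card (including the 38,317 training samples from the Data section), plus two labeled assumptions: that warmup is linear from zero, and that each image token covers a 28 × 28 pixel area. Neither assumption is confirmed by the card.

```python
import math

# Values copied from the training config and data table in this card.
per_device_bs = 1
grad_accum = 16
num_gpus = 8
train_samples = 38_317
steps_done = 100
warmup_steps = 200
peak_lr = 1e-5
ema_decay = 0.9995
image_tokens = 2048

global_batch = per_device_bs * grad_accum * num_gpus
steps_per_epoch = math.ceil(train_samples / global_batch)
samples_seen = steps_done * global_batch

# ASSUMPTION: linear warmup from 0 to the peak LR; the card only says "warmup 200 steps".
lr_at_checkpoint = peak_lr * min(1.0, steps_done / warmup_steps)

# Number of steps n after which an EMA weight of decay**n has fallen to 0.5.
ema_half_life = math.log(0.5) / math.log(ema_decay)

# ASSUMPTION: each image token covers 28 x 28 pixels, which is one reading
# that reproduces the "~1.6M pixel cap" quoted in the config.
pixel_cap = image_tokens * 28 * 28

print(global_batch, steps_per_epoch, samples_seen)
print(lr_at_checkpoint, round(ema_half_life), pixel_cap)
```

Under the linear-warmup assumption, the step-100 checkpoint was saved while the learning rate was still ramping up, which may be worth keeping in mind when comparing it with fully trained runs.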

Metrics @ step 100

Metric	Value
Train loss	0.4601
Train token_acc	0.8416
Eval loss	0.4718
Eval token_acc	0.8387

At only 33% of one epoch, eval loss (0.4718) is already within 0.01 of the fully trained legacy-format model's eval loss (0.4622), which suggests the ## format converges faster.

Data

scripts/convert_agentnet_cot.py --cot_level l2 produces this format from AgentNet 5K trajectories with the same quality filter as the legacy converter (alignment≥7, efficiency≥5).

Split	Samples
Train	38,317
Val	1,866
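
The quality filter quoted above (alignment≥7, efficiency≥5) amounts to a simple threshold check per trajectory. This is a minimal sketch, not the converter's actual code: the field names `alignment` and `efficiency`, the dict layout, and the sample trajectories are all assumptions made for illustration.

```python
# ASSUMPTION: each trajectory is a dict carrying integer "alignment" and "efficiency"
# scores; the real field names in scripts/convert_agentnet_cot.py are not shown in this card.
MIN_ALIGNMENT = 7
MIN_EFFICIENCY = 5


def passes_quality_filter(traj: dict) -> bool:
    """Keep a trajectory only if both scores meet the thresholds quoted in the card."""
    return (
        traj.get("alignment", 0) >= MIN_ALIGNMENT
        and traj.get("efficiency", 0) >= MIN_EFFICIENCY
    )


# Made-up trajectories for illustration only.
trajectories = [
    {"id": "a", "alignment": 9, "efficiency": 6},
    {"id": "b", "alignment": 6, "efficiency": 9},
    {"id": "c", "alignment": 7, "efficiency": 5},
    {"id": "d", "alignment": 8, "efficiency": 4},
]
kept = [t["id"] for t in trajectories if passes_quality_filter(t)]
print(kept)
```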

Inference

from transformers import AutoModelForImageTextToText, AutoProcessor

model = AutoModelForImageTextToText.from_pretrained(
    "insagur/qwen3.5-9b-agentnet-cot-l2-step100",
    torch_dtype="bfloat16",
).to("cuda")
processor = AutoProcessor.from_pretrained("insagur/qwen3.5-9b-agentnet-cot-l2-step100")

system = (
    "You are a computer-use agent operating a Linux desktop. "
    "Respond using the OpenCUA L2 format:\n"
    "## Thought:\n<reasoning>\n\n## Action:\n<one-sentence>\n\n## Code:\n<pyautogui code with normalized [0,1] coords>"
)
# ... see scripts/eval.py in the training repo for full inference loop ...
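
Until you consult scripts/eval.py, a single inference turn might be assembled as in the sketch below. It is an assumption-laden illustration, not the repo's loop: `build_messages` is a hypothetical helper, the instruction and screenshot URL are placeholders, and the message layout simply mirrors the Transformers snippet earlier in this card. The block itself only builds the messages, so it runs without loading the model.

```python
# System prompt copied from the inference snippet in this card.
SYSTEM = (
    "You are a computer-use agent operating a Linux desktop. "
    "Respond using the OpenCUA L2 format:\n"
    "## Thought:\n<reasoning>\n\n## Action:\n<one-sentence>\n\n## Code:\n"
    "<pyautogui code with normalized [0,1] coords>"
)


def build_messages(system: str, instruction: str, image_url: str) -> list:
    """Assemble one chat turn: system prompt, then a screenshot plus the task text."""
    return [
        {"role": "system", "content": [{"type": "text", "text": system}]},
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": instruction},
            ],
        },
    ]


# Placeholder instruction and screenshot location (both hypothetical).
messages = build_messages(
    SYSTEM,
    "Open the Files application.",
    "https://example.com/screenshot.png",
)
print([m["role"] for m in messages])

# With `model` and `processor` loaded as in the card, generation would follow the
# same pattern as the Transformers snippet above (max_new_tokens is an assumption):
# inputs = processor.apply_chat_template(
#     messages, add_generation_prompt=True, tokenize=True,
#     return_dict=True, return_tensors="pt",
# ).to(model.device)
# outputs = model.generate(**inputs, max_new_tokens=256)
# print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

Whether training used a system turn in exactly this form is not stated in the card, so treat the layout as a starting point and check it against the training repo.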

Recipe

Training code: https://github.com/2bhapby/gui_internal_worldmodel

python scripts/convert_agentnet_cot.py --src ... --images_dir ... --out_dir ./agentnet_l2 --cot_level l2

CONFIG=configs/qwen35_9b_agentnet.yaml RUN_NAME=a100-9b-1ep-cot-l2 \
  sbatch --gpus=8 scripts/slurm_train_qwen.sbatch \
    data.train_jsonl=./agentnet_l2/train.jsonl \
    data.val_jsonl=./agentnet_l2/val.jsonl

Downloads last month: 20

Safetensors

Model size: 9B params
Tensor type: BF16

Model tree for insagur/qwen3.5-9b-agentnet-cot-l2-step100

Base model: Qwen/Qwen3.5-9B-Base
Finetuned: Qwen/Qwen3.5-9B
Finetuned (373): this model