
Libraries

How to use AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B")
model = PeftModel.from_pretrained(base_model, "AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora")

Transformers

How to use AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly (with PEFT installed, Transformers resolves the adapter and its base model)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora")
model = AutoModelForCausalLM.from_pretrained("AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
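# Start the vLLM server. This repository holds only a LoRA adapter,
# so serve the base model and register the adapter under the repository name:
vllm serve "Qwen/Qwen3-8B" \
	--enable-lora \
	--max-lora-rank 64 \
	--lora-modules "AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora=AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora"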
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
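
The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and the response layout (`choices[0].message.content`) assumes the server follows the OpenAI chat-completions format that the curl call above targets.

```python
import json
import urllib.request

ADAPTER_ID = "AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora"


def build_chat_request(prompt, model=ADAPTER_ID, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request equivalent to the curl call."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant text.

    Assumes an OpenAI-style response body; requires a running server.
    """
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example call (uncomment once a server is listening on port 8000):
# print(chat("What is the capital of France?"))
```

For the SGLang server described later, pass `base_url="http://localhost:30000"` instead.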

Use Docker

docker model run hf.co/AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora

SGLang

How to use AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora with Docker Model Runner:
```
docker model run hf.co/AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora
```

Axolotl config

axolotl version: 0.13.2

adapter: lora
base_model: Qwen/Qwen3-8B
bf16: true
bnb_4bit_compute_dtype: bfloat16
bnb_4bit_quant_type: nf4
bnb_4bit_use_double_quant: true
dataset_prepared_path: out/prepared_dataset_stateless
datasets:
- message_field_content: content
  message_field_role: role
  path: /e/project1/reformo/salgarkar1/agents_learn/pythonformer-workshop/paired/train/out/paired_data/stateless/rule_diagnosis/traces.jsonl
  roles_to_train:
  - assistant
  type: chat_template
eval_steps: 5
flash_attention: true
gradient_accumulation_steps: 16
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
learning_rate: 0.0001
load_in_4bit: true
load_in_8bit: false
logging_steps: 1
lora_alpha: 128
lora_dropout: 0.05
lora_r: 64
lora_target_linear: false
lora_target_modules:
- q_proj
- k_proj
- v_proj
- o_proj
- gate_proj
- up_proj
- down_proj
lr_scheduler: cosine
micro_batch_size: 1
model_type: AutoModelForCausalLM
num_epochs: 3.0
optimizer: adamw_torch
output_dir: out/qwen3-8b-stateless-rule_diagnosis-20260525_123626
pad_to_sequence_len: true
sample_packing: false
save_strategy: epoch
save_total_limit: 3
seed: 3407
sequence_len: 16384
strict: false
tf32: true
tokenizer_type: AutoTokenizer
trust_remote_code: true
val_set_size: 0.04
wandb_log_model: null
wandb_project: pythonformer
wandb_watch: null
warmup_ratio: 0.03
weight_decay: 0.01
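
To make the LoRA settings concrete, here is a rough sketch of what `lora_r: 64` and `lora_alpha: 128` imply. It assumes standard LoRA scaling (`alpha / r`, not rank-stabilized) and that each adapted linear layer of shape `(d_in, d_out)` adds `r * (d_in + d_out)` trainable parameters. The layer shapes and layer count below are assumptions about Qwen3-8B used for illustration, not values stated in this card; check them against the base model's `config.json`.

```python
# Values copied from the axolotl config above.
LORA_R = 64
LORA_ALPHA = 128


def lora_scaling(alpha, r):
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / r


def lora_param_count(shapes, r, num_layers):
    """Trainable parameters: r * (d_in + d_out) per adapted layer, per block."""
    per_block = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_block * num_layers


# ASSUMPTION: (d_in, d_out) of each target module in one Qwen3-8B block.
ASSUMED_SHAPES = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
    "gate_proj": (4096, 12288),
    "up_proj": (4096, 12288),
    "down_proj": (12288, 4096),
}
ASSUMED_NUM_LAYERS = 36  # ASSUMPTION: number of transformer blocks

print("scaling:", lora_scaling(LORA_ALPHA, LORA_R))
print("trainable params:", lora_param_count(ASSUMED_SHAPES, LORA_R, ASSUMED_NUM_LAYERS))
```

Under these assumed shapes the adapter holds roughly 175M trainable parameters, a small fraction of the 8B frozen base weights.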

out/qwen3-8b-stateless-rule_diagnosis-20260525_123626

This model is a LoRA adapter fine-tuned from Qwen/Qwen3-8B on the /e/project1/reformo/salgarkar1/agents_learn/pythonformer-workshop/paired/train/out/paired_data/stateless/rule_diagnosis/traces.jsonl dataset. It achieves the following results on the evaluation set:

Loss: 0.1784
Perplexity (Ppl): 1.1952
Max active memory (GiB): 54.54
Max allocated memory (GiB): 54.54
Device reserved memory (GiB): 66.97

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 1
eval_batch_size: 1
seed: 3407
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 16
total_train_batch_size: 64
total_eval_batch_size: 4
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 2
training_steps: 45
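
The derived values in this list follow from the config by simple arithmetic. The sketch below reproduces them under two assumptions: that the warmup step count is the ceiling of `warmup_ratio * training_steps` (which matches the reported value of 2), and that the cosine schedule is linear warmup followed by a half-cosine decay to zero. The function names are illustrative, not part of any library.

```python
import math


def effective_batch_size(micro_batch, grad_accum, num_devices):
    """Sequences consumed per optimizer step across all devices."""
    return micro_batch * grad_accum * num_devices


def warmup_steps(total_steps, warmup_ratio):
    """ASSUMPTION: warmup steps are rounded up from ratio * total steps."""
    return math.ceil(total_steps * warmup_ratio)


def cosine_lr(step, base_lr, warmup, total):
    """Linear warmup to base_lr, then half-cosine decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


TOTAL_STEPS = 45
WARMUP = warmup_steps(TOTAL_STEPS, 0.03)

print("total_train_batch_size:", effective_batch_size(1, 16, 4))
print("lr_scheduler_warmup_steps:", WARMUP)
for step in (0, 1, 2, 15, 30, 45):
    print(step, cosine_lr(step, 1e-4, WARMUP, TOTAL_STEPS))
```

With 45 steps over 3 epochs, each epoch spans 15 optimizer steps, which is why the evaluation points in the results table fall at multiples of one third of an epoch (`eval_steps: 5`).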

Training results

Training Loss	Epoch	Step	Validation Loss	Ppl	Active (GiB)	Allocated (GiB)	Reserved (GiB)
No log	0	0	0.4882	1.6293	53.19	53.19	56.52
0.3402	0.3333	5	0.3146	1.3697	54.54	54.54	66.97
0.2775	0.6667	10	0.2673	1.3064	54.54	54.54	66.97
0.2385	1.0	15	0.2318	1.2609	54.54	54.54	66.97
0.2271	1.3333	20	0.2099	1.2335	54.54	54.54	66.97
0.2024	1.6667	25	0.1946	1.2148	54.54	54.54	66.97
0.1813	2.0	30	0.1855	1.2039	54.54	54.54	66.97
0.1706	2.3333	35	0.1808	1.1981	54.54	54.54	66.97
0.185	2.6667	40	0.1787	1.1957	54.54	54.54	66.97
0.1725	3.0	45	0.1784	1.1952	54.54	54.54	66.97
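
The Ppl column appears to be the exponential of the validation loss. A quick check, assuming the loss is a mean per-token cross-entropy in nats, with small differences attributed to the four-decimal rounding of the reported losses:

```python
import math

# (validation loss, reported Ppl) pairs copied from the table above.
ROWS = [
    (0.4882, 1.6293),
    (0.3146, 1.3697),
    (0.2673, 1.3064),
    (0.2318, 1.2609),
    (0.2099, 1.2335),
    (0.1946, 1.2148),
    (0.1855, 1.2039),
    (0.1808, 1.1981),
    (0.1787, 1.1957),
    (0.1784, 1.1952),
]


def perplexity(loss):
    """Perplexity for a mean cross-entropy loss measured in nats."""
    return math.exp(loss)


for loss, reported in ROWS:
    computed = perplexity(loss)
    print(f"loss={loss:.4f} reported={reported:.4f} exp(loss)={computed:.4f}")
```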

Framework versions

PEFT 0.18.1
Transformers 4.57.6
Pytorch 2.10.0+cu128
Datasets 4.5.0
Tokenizers 0.22.2


Model tree for AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora

Base model: Qwen/Qwen3-8B-Base
Finetuned: Qwen/Qwen3-8B
Adapter: this model

Collection including AutomatedScientist/qwen3-8b-stateless-rule_diagnosis-lora

Agents Learn Their Runtime: datasets for "Agents Learn Their Runtime: Interpreter Persistence as Training-Time Semantics" (arXiv:2603.01209).