machine5_continue_stable_expand13000_20260520_lr2e-7_e1

This is a full-parameter fine-tune of Qwen3-VL-8B-Thinking for visual symbolic regression: given a rendered function plot, the model immediately calls submit_expression with a compact, executable NumPy expression.
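To make "compact executable NumPy expression" concrete, here is a minimal sketch of how such a string could be evaluated on a grid of inputs. The helper name `evaluate_expression`, the input variable name `x`, and the `np` prefix are assumptions for illustration; the card does not specify the exact expression grammar.

```python
import numpy as np


def evaluate_expression(expression: str, x: np.ndarray) -> np.ndarray:
    """Evaluate a NumPy expression string on the sample points `x`.

    Assumption: the expression refers to the input as `x` and to NumPy as `np`.
    """
    # Emptying __builtins__ limits accidental name use, but eval is NOT a
    # security boundary; only run this on strings you trust.
    namespace = {"__builtins__": {}, "np": np, "x": x}
    result = eval(expression, namespace)
    # Constant expressions such as "3.0" are broadcast to the shape of x.
    return np.broadcast_to(np.asarray(result, dtype=float), x.shape)


x = np.linspace(-2.0, 2.0, 5)
y = evaluate_expression("np.sin(x) + 0.5 * x**2", x)
```

Evaluating the predicted string this way yields values that can be compared point by point against the ground-truth function.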

Training Recipe

  • Parent checkpoint: machine5_rendered_stable_medium6000_20260520_lr5e-7_e1
  • Training data: 13000 train-only rendered image/function pairs
  • Source mix: official 7700, poly 3900, closure 1400
  • Difficulty mix: easy 3073, medium 8129, hard 1198, expert 300, extreme 300
  • Teacher trace: none
  • Target format: compact submit_expression tool call containing the ground-truth expression
  • Prompt/reasoning: image-only direct tool-call prompt, no reasoning
  • Key protocol fix: stripped empty Qwen3 <think> template during SFT
  • LR: 2e-7
  • Global batch: 8
  • Steps: 1625 optimizer steps, 1 epoch
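The figures in the recipe above are mutually consistent, which a short check confirms. The variable names below are illustrative only; the numbers are copied from the list.

```python
# Numbers taken from the training recipe; names are illustrative.
train_pairs = 13000
source_mix = {"official": 7700, "poly": 3900, "closure": 1400}
difficulty_mix = {
    "easy": 3073,
    "medium": 8129,
    "hard": 1198,
    "expert": 300,
    "extreme": 300,
}
global_batch = 8
epochs = 1

# Both mixes should account for every training pair.
assert sum(source_mix.values()) == train_pairs
assert sum(difficulty_mix.values()) == train_pairs

# One optimizer step consumes one global batch.
optimizer_steps = train_pairs * epochs // global_batch
print(optimizer_steps)  # 1625
```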

No dev/test answer trajectories were used as training data.

Evaluation

Evaluated with 8 vLLM services, 60 workers, max_tokens=16000, thinking disabled, and direct tool-call extraction.
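A minimal sketch of direct tool-call extraction, assuming the model emits Hermes-style `<tool_call>{...}</tool_call>` JSON blocks as Qwen chat templates commonly do. The argument key `expression` and the helper name `extract_expression` are hypothetical; the evaluation harness used for this checkpoint may differ.

```python
import json
import re
from typing import Optional

# Assumption: tool calls are wrapped in <tool_call>...</tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def extract_expression(text: str) -> Optional[str]:
    """Return the expression from the first valid submit_expression call."""
    for match in TOOL_CALL_RE.finditer(text):
        try:
            call = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip malformed JSON and try the next block
        if not isinstance(call, dict) or call.get("name") != "submit_expression":
            continue
        arguments = call.get("arguments")
        if not isinstance(arguments, dict):
            continue
        # "expression" is a hypothetical argument name.
        expression = arguments.get("expression")
        if isinstance(expression, str) and expression.strip():
            return expression.strip()
    return None


sample = (
    '<tool_call>\n{"name": "submit_expression", '
    '"arguments": {"expression": "np.sin(x) + 1"}}\n</tool_call>'
)
extracted = extract_expression(sample)
```

Under this sketch, an output with no parsable call returns `None`, which is presumably what a harness would count toward the null figure reported below.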

Balanced60:

  • acc@0.99=0.35
  • acc@0.95=0.3833
  • acc@0.9=0.4
  • acc@0.8=0.45
  • null=0/60
  • finish_reason={"stop": 60}
  • mean latency: 6.276s

Official dev 300:

  • acc@0.99=0.28
  • acc@0.95=0.2967
  • acc@0.9=0.31
  • acc@0.8=0.34
  • null=0/300
  • finish_reason={"stop": 300}
  • mean latency: 4.411s
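The card does not define the score behind acc@t. The sketch below assumes each sample receives a fit score in [0, 1] (for example an R² between the predicted and true curves) and that acc@t is the fraction of samples scoring at least t, with null predictions counted as misses. The function name and the sample scores are made up for illustration.

```python
from typing import Optional, Sequence


def accuracy_at(scores: Sequence[Optional[float]], threshold: float) -> float:
    """Fraction of samples whose score is at least `threshold`.

    Assumption: a score of None marks a null prediction and counts as a miss.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    hits = sum(1 for s in scores if s is not None and s >= threshold)
    return hits / len(scores)


# Made-up scores, not results from this checkpoint.
scores = [0.995, 0.97, 0.92, 0.85, 0.40, None]
for t in (0.99, 0.95, 0.9, 0.8):
    print(f"acc@{t}={accuracy_at(scores, t):.4f}")
```

Read this way, the reported fractions map to whole sample counts: 0.3833 on Balanced60 is 23 of 60, and 0.2967 on the 300-sample dev set is 89 of 300.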

Intended Use

This checkpoint is an experimental model for visual symbolic regression research. It is tuned for concise tool-call outputs and may be over-specialized to rendered single-variable function plots.

Model size: 9B params (Safetensors, tensor type F32)