olmo3-7b_solver_iter1

SCOPE solver checkpoint trained with DR-Zero long-form GRPO.

  • Model alias: olmo3-7b
  • Base model: OLMo-3-7B-Instruct
  • Version: v1.9.21
  • Solver iteration: 1
  • Global step: 20
  • Source experiment: long_solver_v1.9.21_iter1_grpo_group8_olmo-3-7b-instruct
  • Source local HF checkpoint: checkpoints/dr-zero/long_solver_v1.9.21_iter1_grpo_group8_olmo-3-7b-instruct/merged_hf_actor_gs20
  • Collection: https://huggingface.co/collections/wckwan/scope

This repository contains the Hugging Face-format checkpoint files for loading with Transformers.

Downloads last month
26
Safetensors
Model size
7B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Collection including wckwan/olmo3-7b_solver_iter1