qwen3-8b spear-cdeact iter 0000375
PyTorch Distributed Checkpoint (DCP) snapshot from an internal RL training run.
Source
- Original path: terminal-rl_qwen3-8b_8gpu_mixed_dapo_harness-camel-agent_explore_spear_lite_int_cosine120_a57_life0.005_postnorm_cdeact0.02_2026-06-05_155052_0609copy/iter_0000375
- Run started: 2026-06-05 15:50:52
- Iteration: 375
Key training config (parsed from run name)
| field | value |
|---|---|
| base model | qwen3-8b |
| world size | 8 GPU |
| RL algo | mixed-DAPO |
| harness | camel-agent |
| exploration | spear_lite_int, cosine120, a57, life=0.005 |
| extras | post-norm, cdeact=0.02 |
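The fields in the table above can be recovered from the run name mechanically. The following is a minimal sketch, not the tooling used for the run: `parse_run_name` is a hypothetical helper, and the regular expressions assume the underscore-delimited naming seen in this one run name.

```python
import re
from datetime import datetime

# Run directory name as listed under "Original path" (without iter_0000375).
RUN_NAME = (
    "terminal-rl_qwen3-8b_8gpu_mixed_dapo_harness-camel-agent_explore_"
    "spear_lite_int_cosine120_a57_life0.005_postnorm_cdeact0.02_"
    "2026-06-05_155052_0609copy"
)


def parse_run_name(name: str) -> dict:
    """Extract a few config fields from a run name (hypothetical helper)."""

    def grab(pattern, cast=str):
        # Return the first capture group, converted with `cast`, or None.
        match = re.search(pattern, name)
        return cast(match.group(1)) if match else None

    # Timestamp is assumed to look like YYYY-MM-DD_HHMMSS.
    started = re.search(r"(\d{4}-\d{2}-\d{2})_(\d{6})", name)
    return {
        "base_model": grab(r"_(qwen[^_]+)_"),
        "world_size": grab(r"_(\d+)gpu_", int),
        "harness": grab(r"_harness-([^_]+)_"),
        "life": grab(r"_life([\d.]+)_", float),
        "cdeact": grab(r"_cdeact([\d.]+)_", float),
        "post_norm": "_postnorm_" in name,
        "run_started": (
            datetime.strptime(" ".join(started.groups()), "%Y-%m-%d %H%M%S")
            if started
            else None
        ),
    }


config = parse_run_name(RUN_NAME)
```

Fields that do not match come back as `None`, so the helper degrades quietly on run names that follow a different convention.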
Files
8 sharded .distcp files (~13.35 GiB each) + .metadata + common.pt + metadata.json.
Backend: torch_dist (sharded), torch (common). Total ~107 GiB.
```
__{rank}_{shard}.distcp    # rank in 0..3, shard in 0..1
.metadata
common.pt
metadata.json
```
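Before loading, it can help to confirm that a download is complete. The sketch below enumerates the file names described above and reports any that are absent. `expected_files` and `missing_files` are hypothetical helpers, and the size figure is only the arithmetic implied by the listing (8 shards at ~13.35 GiB each).

```python
from pathlib import Path


def expected_files(ranks: int = 4, shards: int = 2) -> list:
    """File names the listing above describes: 4 x 2 shards plus 3 metadata files."""
    names = [f"__{r}_{s}.distcp" for r in range(ranks) for s in range(shards)]
    return names + [".metadata", "common.pt", "metadata.json"]


def missing_files(checkpoint_dir) -> list:
    """Return the expected file names that are not present in checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in expected_files() if not (root / name).is_file()]


# Rough size check: 8 shards x ~13.35 GiB is ~106.8 GiB, matching the ~107 GiB total.
approx_total_gib = 8 * 13.35
```

An empty list from `missing_files("path/to/iter_0000375")` means every listed file is present; it does not verify file contents or sizes.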
Loading
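```python
import torch.distributed.checkpoint as dcp

# dcp.load fills tensors in place, so state_dict must already hold
# pre-allocated tensors with the expected keys, shapes, and dtypes.
state_dict = {...}  # placeholder: template with empty tensors of expected shapes
dcp.load(state_dict, checkpoint_id="path/to/iter_0000375")
```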
Or convert to a single torch.save file:
```
python -m torch.distributed.checkpoint.format_utils \
    dcp_to_torch path/to/iter_0000375 consolidated.pt
```
Companion run: HansBug/qwen3-8b-mt10-i343.