Cube-Double FQL Offline Checkpoint

Offline-trained Flow Q-Learning (FQL) agent for the OGBench cube-double-play-singletask-v0 environment.

Files

  • params_1000000.pkl: checkpoint after 1M offline steps. This is the standard starting point for online fine-tuning experiments (success rate ~30-40%).
  • params_2000000.pkl: checkpoint after 2M offline steps (fully converged offline policy).
  • flags.json: full training config (alpha, hidden dims, batch size, ...).
  • train.csv, eval.csv: full training and evaluation metrics from the offline run.
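
The config and metrics files can be read with the standard library alone. The following is a minimal sketch, assuming flags.json is a plain JSON object and eval.csv has a header row; the helper name load_run_metadata is hypothetical, and no column names are assumed.

```python
import csv
import json
from pathlib import Path


def load_run_metadata(run_dir):
    """Return (flags, eval_rows) read from flags.json and eval.csv in run_dir."""
    run_dir = Path(run_dir)
    with open(run_dir / "flags.json") as f:
        flags = json.load(f)
    # DictReader keys each row by the CSV header, so no column names are assumed.
    # Values come back as strings and need converting before any arithmetic.
    with open(run_dir / "eval.csv", newline="") as f:
        eval_rows = list(csv.DictReader(f))
    return flags, eval_rows
```

Calling load_run_metadata(".") from the directory holding these files returns the config dict and one dict per evaluation row; eval_rows[-1] is the last logged evaluation.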

Loading

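import pickle
import flax.serialization

# `agent` must already exist: build an FQL agent with the same config as in
# flags.json, then overwrite its state with the saved one.
with open("params_1000000.pkl", "rb") as f:
    load_dict = pickle.load(f)
agent = flax.serialization.from_state_dict(agent, load_dict["agent"])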

In the Robo_Continual_Learning codebase you can also pass --restore_path=<dir> --restore_epoch=1000000.
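
Before constructing an agent, it can help to check what a checkpoint actually contains. The sketch below assumes only what the loading snippet shows, namely that the pickle holds a dict with an "agent" entry; the layout underneath that entry is not documented here, so the code walks whatever nested dicts it finds. The helper names summarize_tree and summarize_checkpoint are hypothetical.

```python
import pickle


def summarize_tree(tree, prefix=""):
    """Map each leaf's slash-joined path to its shape (None if it has no shape)."""
    summary = {}
    if isinstance(tree, dict):
        for key, value in tree.items():
            path = f"{prefix}/{key}" if prefix else str(key)
            summary.update(summarize_tree(value, path))
    else:
        # Array leaves expose .shape; scalars and other objects report None.
        summary[prefix] = getattr(tree, "shape", None)
    return summary


def summarize_checkpoint(path):
    """Unpickle a checkpoint file and summarize the tree stored under 'agent'."""
    # Only unpickle files you trust: pickle can execute code while loading.
    with open(path, "rb") as f:
        load_dict = pickle.load(f)
    return summarize_tree(load_dict["agent"])
```

For example, summarize_checkpoint("params_1000000.pkl") returns a flat dict that can be printed to compare parameter shapes against the agent you intend to restore into. Unpickling needs the array libraries used at save time to be installed.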

Training config (excerpt)

  • Agent: FQL (alpha=300, flow_steps=10, hidden_dims=512x4)
  • Env: cube-double-play-singletask-v0 (obs_dim=37, action_dim=5)
  • Offline steps: 2M, seed=0
  • Offline dataset: OGBench cube-double-play-singletask-v0
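
To confirm that a downloaded flags.json matches the excerpt above, the values can be compared programmatically. This is a rough sketch: the key names in EXPECTED (seed, env_name, agent.alpha, agent.flow_steps) are assumptions about how flags.json is laid out and may need adjusting, and the helpers get_nested and find_mismatches are hypothetical.

```python
def get_nested(config, dotted_key):
    """Look up 'a.b.c' in a nested dict; return None if any level is missing."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def find_mismatches(flags, expected):
    """Return {key: (expected, actual)} for every expected value that differs."""
    mismatches = {}
    for key, want in expected.items():
        got = get_nested(flags, key)
        if got != want:
            mismatches[key] = (want, got)
    return mismatches


# Values from the excerpt above; the key names are assumptions about flags.json.
EXPECTED = {
    "seed": 0,
    "env_name": "cube-double-play-singletask-v0",
    "agent.alpha": 300,
    "agent.flow_steps": 10,
}
```

With flags loaded from flags.json via json.load, find_mismatches(flags, EXPECTED) returns an empty dict when everything agrees. A key that is absent shows up with None as its actual value, which usually means the assumed key name is wrong rather than that the config differs.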