Geometry-Preserving Orthonormal Initialization RLVR Artifacts

This repository hosts LoRA adapter checkpoints for the DAPO 1.5B cosine learning-rate runs.

Structure

  • checkpoints/LoRA/seed{41,42,43}/global_step_{50..500}
  • checkpoints/RLMO/seed{41,42,43}/global_step_{50..500}
  • checkpoints/RLPO/seed{41,42,43}/global_step_{50..500}
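The layout above can be turned into a repo-relative path with a small helper. This is a minimal sketch: the function and constant names are hypothetical and not part of the repository. It only checks that a step falls within the 50 to 500 range listed above, since the README does not state which steps in that range were actually saved.

```python
# Illustrative helper; names here are assumptions, not repository code.
METHODS = ("LoRA", "RLMO", "RLPO")
SEEDS = (41, 42, 43)
MIN_STEP, MAX_STEP = 50, 500


def checkpoint_subfolder(method: str, seed: int, step: int) -> str:
    """Return the repo-relative directory for one adapter checkpoint."""
    if method not in METHODS:
        raise ValueError(f"unknown method {method!r}; expected one of {METHODS}")
    if seed not in SEEDS:
        raise ValueError(f"unknown seed {seed}; expected one of {SEEDS}")
    # Only the bounds are checked: the saving interval is not documented.
    if not MIN_STEP <= step <= MAX_STEP:
        raise ValueError(f"step {step} is outside {MIN_STEP}..{MAX_STEP}")
    return f"checkpoints/{method}/seed{seed}/global_step_{step}"


print(checkpoint_subfolder("RLMO", 42, 500))
# checkpoints/RLMO/seed42/global_step_500
```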

Each checkpoint directory contains PEFT LoRA adapter files:

  • adapter_config.json
  • adapter_model.safetensors

Base model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Code and reproduction scripts: https://github.com/Richard-ZZZ/geometry-preserving-orthonormal-init-rlvr
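One way to attach a single checkpoint to the base model is sketched below, assuming the `huggingface_hub`, `transformers`, and `peft` packages are installed. The helper names and the `repo_id` argument are hypothetical; the README does not state this repository's Hub id, so it must be supplied by the caller. Only the two adapter files for the chosen checkpoint are downloaded.

```python
import os

# File names and base model id are taken from this README.
ADAPTER_FILES = ("adapter_config.json", "adapter_model.safetensors")
BASE_MODEL_ID = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"


def adapter_patterns(subfolder: str) -> list[str]:
    """Build download patterns matching only one checkpoint's adapter files."""
    return [f"{subfolder}/{name}" for name in ADAPTER_FILES]


def load_adapter(repo_id: str, subfolder: str):
    """Download one checkpoint directory and wrap the base model with it.

    `repo_id` is this repository's Hub id (an assumption: supply your own);
    `subfolder` is a path such as "checkpoints/LoRA/seed41/global_step_50".
    """
    # Heavy imports are kept local so the module can be imported without them.
    from huggingface_hub import snapshot_download
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    # Fetch only the files for the requested checkpoint.
    local_root = snapshot_download(
        repo_id=repo_id, allow_patterns=adapter_patterns(subfolder)
    )
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)
    # Load the LoRA weights from the downloaded checkpoint directory.
    return PeftModel.from_pretrained(base, os.path.join(local_root, subfolder))
```

Calling `load_adapter("<this-repo-id>", "checkpoints/LoRA/seed41/global_step_50")` would return a `PeftModel` wrapping the base model, with `<this-repo-id>` replaced by the actual Hub id. Downloading to a local directory first keeps the loading step independent of how a given `peft` version resolves subfolders on the Hub.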
