Geometry-Preserving Orthonormal Initialization RLVR Artifacts
This repository hosts LoRA adapter checkpoints for the DAPO 1.5B cosine learning-rate runs.
Structure
checkpoints/LoRA/seed{41,42,43}/global_step_{50..500}
checkpoints/RLMO/seed{41,42,43}/global_step_{50..500}
checkpoints/RLPO/seed{41,42,43}/global_step_{50..500}
Each checkpoint directory contains PEFT LoRA adapter files:
adapter_config.json
adapter_model.safetensors
Base model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
Code and reproduction scripts: https://github.com/Richard-ZZZ/geometry-preserving-orthonormal-init-rlvr
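A minimal sketch of loading one adapter on top of the base model, assuming the `transformers` and `peft` packages are installed. The helper names `checkpoint_subfolder` and `load_adapter` and the `repo_id` argument are hypothetical and used only for illustration; pass the Hub ID of this repository as `repo_id`. The spacing between saved steps is not stated here, so the helper only checks that the step lies in the 50 to 500 range and does not guarantee that a given step exists.

```python
BASE_MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
METHODS = ("LoRA", "RLMO", "RLPO")  # top-level directories under checkpoints/
SEEDS = (41, 42, 43)


def checkpoint_subfolder(method: str, seed: int, step: int) -> str:
    """Build the repository-relative path of one checkpoint directory."""
    if method not in METHODS:
        raise ValueError(f"unknown method {method!r}, expected one of {METHODS}")
    if seed not in SEEDS:
        raise ValueError(f"unknown seed {seed}, expected one of {SEEDS}")
    if not 50 <= step <= 500:
        raise ValueError(f"step {step} is outside the range 50..500")
    return f"checkpoints/{method}/seed{seed}/global_step_{step}"


def load_adapter(repo_id: str, method: str, seed: int, step: int):
    """Load the base model and attach one LoRA adapter (hypothetical helper).

    Imports are local so the path helper above works without these packages.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
    # `subfolder` points PEFT at the directory holding adapter_config.json
    # and adapter_model.safetensors inside the repository.
    model = PeftModel.from_pretrained(
        base, repo_id, subfolder=checkpoint_subfolder(method, seed, step)
    )
    return model, tokenizer
```

Calling `load_adapter` downloads the base model weights on first use; `checkpoint_subfolder` on its own has no dependencies and can be used to enumerate or validate paths.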