Model Card for LinalgZero-GSPO

The code and supporting information used to train this model are available on GitHub.

This model is a fine-tuned version of atomwalk12/LinalgZero-SFT on the atomwalk12/linalgzero-grpo dataset using the GSPO algorithm. It has been trained using ART.
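
For readers unfamiliar with GSPO (Group Sequence Policy Optimization), the following is a minimal sketch of its clipped objective as it is commonly described: the importance ratio is computed once per sequence and length-normalized, rather than per token as in GRPO. This is an illustration only, not the training code used for this model and not ART's API. The function names, the use of the population standard deviation, and the clipping range `eps=0.2` are all assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def group_advantages(rewards):
    """Normalize rewards within one group of sampled responses.

    Assumption: population standard deviation; a zero-variance group
    yields all-zero advantages instead of dividing by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:
        return [0.0] * len(rewards)
    return [(r - mu) / sigma for r in rewards]


def sequence_ratio(new_logps, old_logps):
    """Length-normalized sequence-level importance ratio.

    Equals exp(mean_t(log pi_new(y_t) - log pi_old(y_t))), i.e. the
    sequence likelihood ratio raised to the power 1/|y|.
    """
    diffs = [n - o for n, o in zip(new_logps, old_logps)]
    return math.exp(sum(diffs) / len(diffs))


def gspo_objective(new_logps, old_logps, rewards, eps=0.2):
    """Clipped surrogate objective for one group (to be maximized).

    new_logps / old_logps: one list of per-token log-probabilities per
    response. eps is a hypothetical clipping range for illustration.
    """
    advantages = group_advantages(rewards)
    terms = []
    for new, old, adv in zip(new_logps, old_logps, advantages):
        s = sequence_ratio(new, old)
        clipped = min(max(s, 1.0 - eps), 1.0 + eps)
        # Clipping is applied to the whole sequence, not to single tokens.
        terms.append(min(s * adv, clipped * adv))
    return mean(terms)
```

Because the ratio is a single scalar per response, an entire response is either clipped or not, which is the main design difference from token-level clipping.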

Format: Safetensors
Model size: 3B params
Tensor type: BF16