Uploaded model

  • Developed by: nomadicsynth
  • License: apache-2.0
  • Finetuned from model: unsloth/Qwen2.5-3B-Instruct-unsloth-bnb-4bit
  • Training Notebook: Qwen2.5_(3B)-GRPO.ipynb

This qwen2 model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month
31
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.

Model tree for nomadicsynth/Qwen2.5-3B-Instruct-Reasoning-gsm8k-lora-v1

Base model

Qwen/Qwen2.5-3B
Adapter
(2)
this model

Dataset used to train nomadicsynth/Qwen2.5-3B-Instruct-Reasoning-gsm8k-lora-v1