# My Reasoning Model

This is my first reasoning model. It is fairly small, and yes, it still gets the wrong answer when asked how many r's are in the word "strawberry."

You are welcome to use the model as you wish.

## System Prompt Format

Respond in the following format:

    <reasoning>
    ...
    </reasoning>
    <answer>
    ...
    </answer>
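For example, here is a minimal sketch of prompting one of the GGUF files with this format via llama-cpp-python. The file name is a placeholder for whichever quantization you download, and parsing the `<answer>` tag is just one way to read the output:

```python
# Minimal usage sketch, assuming a downloaded GGUF file and llama-cpp-python.
# The file name below is a placeholder; substitute the quantization you use.
import re
from llama_cpp import Llama

llm = Llama(model_path="Qwen2.5-3B-Instruct-reason.Q4_K_M.gguf", n_ctx=4096)

SYSTEM_PROMPT = (
    "Respond in the following format:\n"
    "<reasoning>\n...\n</reasoning>\n"
    "<answer>\n...\n</answer>"
)

result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "How many r's are in the word \"strawberry\"?"},
    ],
    max_tokens=512,
)
text = result["choices"][0]["message"]["content"]

# Pull the final answer out of the <answer> tags, falling back to the raw text.
match = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
print(match.group(1).strip() if match else text)
```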

I fine-tuned the model on openai/gsm8k and, to keep costs from getting out of hand, trained on a single A100.
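For reference, here is a quick sketch of what the openai/gsm8k data looks like. This only illustrates the dataset format, not the full training pipeline; the reference answers end with a `#### <number>` marker that the model's `<answer>` section can be checked against:

```python
# Sketch: inspect the openai/gsm8k data format (not the training pipeline itself).
from datasets import load_dataset

dataset = load_dataset("openai/gsm8k", "main", split="train")

def extract_final_answer(answer_text: str) -> str:
    # gsm8k reference answers end with "#### <final number>".
    return answer_text.split("####")[-1].strip()

example = dataset[0]
print(example["question"])
print(extract_final_answer(example["answer"]))
```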


Enjoy, but please note that this model is experimental; I built it mainly to define my fine-tuning pipeline.

Next I will be testing fine-tuning on larger, more capable models, which I suspect will add more value in the short term.


---
base_model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- gguf
license: apache-2.0
language:
- en
---

# Uploaded model

- **Developed by:** dbands
- **License:** apache-2.0
- **Finetuned from model:** unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit

This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
GGUF quantizations are provided in 4-bit, 5-bit, and 8-bit variants (3.09B parameters, qwen2 architecture).