SmolLM2-135M-Instruct-Reasoning

A reasoning-focused fine-tune of SmolLM2-135M-Instruct trained using Unsloth on the combined-reasoning dataset.

Model Overview

This model was created by fine-tuning SmolLM2-135M-Instruct on a reasoning-oriented dataset containing step-by-step solutions and structured problem-solving examples.

The model is intended to:

  • Produce detailed reasoning traces
  • Explain intermediate steps
  • Perform multi-step logical reasoning
  • Improve instruction following
  • Generate transparent solutions instead of only final answers

Base Model

HuggingFaceTB/SmolLM2-135M-Instruct

Dataset

Avtrkrb/combined-reasoning

Training Method

  • Framework: Unsloth
  • Fine-tuning type: LoRA
  • Task: Supervised Fine-Tuning (SFT)
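For readers unfamiliar with LoRA, the idea can be sketched in plain Python. This is a toy illustration, not the training code used for this model: the helper names, the tiny matrices, and the rank and alpha values are all hypothetical. The frozen weight W is left untouched, and only two small matrices A and B are trained, so the layer output becomes W x + (alpha / rank) · B (A x).

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, rank):
    """Output of a linear layer with a LoRA update applied.

    W: frozen base weight, shape (d_out, d_in)
    A: trainable down-projection, shape (rank, d_in)
    B: trainable up-projection, shape (d_out, rank)
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


def lora_trainable_params(d_out, d_in, rank):
    """Number of trainable parameters LoRA adds to one (d_out, d_in) weight."""
    return rank * (d_in + d_out)


# Toy example with hypothetical values (not this model's real shapes).
W = [[1, 0], [0, 1]]
A = [[1, 1]]          # rank 1
B_zero = [[0], [0]]   # B is conventionally initialised to zero
x = [2, 3]

# With B at zero, the adapted layer behaves exactly like the base layer.
print(lora_forward(W, A, B_zero, x, alpha=2, rank=1))

# Illustrative count: a 576 x 576 weight adapted at rank 16.
print(lora_trainable_params(576, 576, 16), "trainable vs", 576 * 576, "frozen")
```

Because B starts at zero, fine-tuning begins from the base model's behaviour and moves away from it only as A and B are updated.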

Prompt Format

This model uses the standard SmolLM2 chat template, a ChatML-style layout in which each turn is wrapped in <|im_start|> and <|im_end|> markers:

<|im_start|>system
You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>
<|im_start|>user
What is 17 × 23?<|im_end|>
<|im_start|>assistant
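To make the layout concrete, here is a minimal sketch that assembles such a prompt by hand. The format_chatml helper is hypothetical and written only for illustration. In practice the tokenizer's own apply_chat_template method is the safer choice, since it uses the template shipped with the model.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML-style prompt.

    Hypothetical helper: it mirrors the layout shown in this section and is
    not taken from the model repository.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {
        "role": "system",
        "content": "You are a helpful AI assistant named SmolLM, trained by Hugging Face",
    },
    {"role": "user", "content": "What is 17 × 23?"},
]

prompt = format_chatml(messages)
print(prompt)
```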

Example Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Avtrkrb/SmolLM2-135M-Instruct-bnb-4bit-reasoning"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
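The snippet above only loads the weights. One way to go from there to a generated answer is sketched below, assuming the repository loads directly through transformers and that PyTorch is installed. The build_messages and generate_answer helpers, the max_new_tokens value, and the choice of greedy decoding are illustrative assumptions, not settings recommended by the model author.

```python
MODEL_ID = "Avtrkrb/SmolLM2-135M-Instruct-bnb-4bit-reasoning"
SYSTEM_PROMPT = "You are a helpful AI assistant named SmolLM, trained by Hugging Face"


def build_messages(question, system_prompt=SYSTEM_PROMPT):
    """Wrap a question in the system/user message structure (hypothetical helper)."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


def generate_answer(question, model_id=MODEL_ID, max_new_tokens=256):
    """Load the model, apply its chat template, and return the decoded reply."""
    # Imported here so build_messages can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # The tokenizer's chat template inserts the <|im_start|>/<|im_end|> markers.
    inputs = tokenizer.apply_chat_template(
        build_messages(question),
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    )

    # Greedy decoding; max_new_tokens is an arbitrary illustrative limit.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Drop the prompt tokens and decode only the newly generated part.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling generate_answer("What is 17 × 23?") downloads the weights on first use and returns the model's reasoning trace as a string. If the repository turns out to contain only LoRA adapter weights, the peft package would likely need to be installed as well.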

Intended Use

  • Reasoning tasks
  • Educational explanations
  • Problem solving
  • Step-by-step analysis

Limitations

  • Small model (135M parameters)
  • Reasoning quality is limited compared to larger models
  • May hallucinate facts
  • Should not be used for high-stakes decisions

License

Apache 2.0 (inherited from the base model)
