SmolSocrates-360M: Fine-Tuning an SLM for Socratic Tutoring

This project fine-tunes a Small Language Model (SLM) to adopt a specific pedagogical persona. SmolSocrates-360M is a 360-million-parameter model trained to act as a Socratic coding tutor: it guides students toward solutions through targeted questions instead of outputting code answers directly.

This model was trained 2x faster using Unsloth.

Developed by: kushalicious
License: apache-2.0
Finetuned from model: HuggingFaceTB/SmolLM2-360M-Instruct
GitHub Repository: kushalicious/smolsocrates-360m

Key Technical Achievements

Overriding Base Model Behavior: Fine-tuned a generalist "helpful" model to stop generating code blocks, reaching a 100% No-Code Rate during evaluation (up from 40% for the baseline).
Synthetic Data Pipeline: Engineered a highly constrained data augmentation pipeline using the Groq API (Llama 3.1 8B) to expand the training dataset by 50% with high-quality, synthetically generated Socratic scenarios.
Custom Evaluation Framework: Developed a dual-layered evaluation harness combining rule-based heuristics (regex parsing for code blocks/questions) with LLM-as-a-Judge scoring for qualitative pedagogical assessment.
Resource-Constrained Training: Executed the entire training pipeline (LoRA, r=32) in under 20 minutes on a single T4 GPU using Unsloth's optimized kernels.
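
The rule-based layer of the evaluation harness described above could look roughly like the following sketch. The patterns and the names `contains_code_block`, `count_questions`, and `heuristic_check` are illustrative assumptions, not the project's actual code, and the LLM-as-a-Judge layer is not covered here.

```python
import re

# Assumption: a "code block" means a Markdown fence (three backticks).
# The project's real patterns are not shown in this card.
FENCE = re.compile(r"`{3}")

# Assumption: a run of one or more question marks counts as one question.
QUESTION = re.compile(r"\?+")


def contains_code_block(text: str) -> bool:
    """Return True if the response opens a fenced code block anywhere."""
    return FENCE.search(text) is not None


def count_questions(text: str) -> int:
    """Count question-mark runs as a crude proxy for questions asked."""
    return len(QUESTION.findall(text))


def heuristic_check(text: str) -> dict:
    """Combine both checks into one rule-based verdict for a response."""
    questions = count_questions(text)
    return {
        "no_code": not contains_code_block(text),
        "asks_question": questions > 0,
        "question_count": questions,
    }
```

Checks like these only look at surface form. A response can pass both and still be a poor Socratic turn, which is presumably why the harness pairs them with a judge model.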

Evaluation Metrics

The custom evaluation harness scores the model across multiple dimensions.

Metric	Baseline (SmolLM2-360M)	Fine-Tuned (SmolSocrates)	Delta
Overall Score	2.2 / 9	5.4 / 9	+3.2
No-Code Rate	40%	100%	+60 pp
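
As a rough illustration of how a No-Code Rate like the one in the table might be computed, the sketch below takes the share of responses without a fenced code block and reports the difference in percentage points. The function names and the sample responses are made up for this example; they are not the project's evaluation set or implementation.

```python
import re

FENCE_MARK = "`" * 3          # three backticks, built here to keep the example readable
FENCE = re.compile(r"`{3}")   # assumption: code is detected by Markdown fences only


def no_code_rate(responses: list[str]) -> float:
    """Percentage of responses that contain no fenced code block."""
    if not responses:
        raise ValueError("need at least one response")
    clean = sum(1 for r in responses if FENCE.search(r) is None)
    return 100.0 * clean / len(responses)


def delta_points(baseline: float, tuned: float) -> float:
    """Difference between two rates, in percentage points."""
    return tuned - baseline


# Hypothetical responses, chosen only so the arithmetic is easy to follow.
code_answer = "Use this:\n" + FENCE_MARK + "python\ni += 1\n" + FENCE_MARK
question_answer = "What value does i have after each pass?"

baseline = [code_answer, code_answer, code_answer, question_answer, question_answer]
tuned = [question_answer] * 5

print(no_code_rate(baseline))                                   # 40.0
print(no_code_rate(tuned))                                      # 100.0
print(delta_points(no_code_rate(baseline), no_code_rate(tuned)))  # 60.0
```

Reporting the change in percentage points avoids confusion with the relative change, which for 40% to 100% would be 150%.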

Qualitative Comparison

Student Prompt: "My for loop never stops running. Here's my code: ... What's wrong?"

Base Model Response:

"The issue is with the condition of your while loop. You're iterating over the range of numbers from 0 to 9... You should change your while loop to: while i < 10:..."

SmolSocrates Response:

"You're trying to avoid infinite loops by using a while loop with a condition that doesn't meet the requirements. Can you think of what it would be called when you have a condition that never gets reset?"

Quick Start

You can load and run the fine-tuned model directly using the Hugging Face pipeline:

from transformers import pipeline

# Load the fine-tuned model
tutor = pipeline("text-generation", model="kushalicious/SmolSocrates-360M")

# Inference
response = tutor([
    {"role": "system", "content": "You are a Socratic coding tutor..."},
    {"role": "user", "content": "I'm getting an IndexError in my Python list. How do I fix it?"},
], max_new_tokens=256, return_full_text=False)

print(response[0]["generated_text"])
