Phi-3.5-mini Socratic Tutor
A fine-tuned version of Phi-3.5-mini-instruct trained with DPO (Direct Preference Optimization) to teach using the Socratic method — guiding students to answers through questions rather than giving direct answers.
Training Details
| Setting | Value |
|---|---|
| Base Model | unsloth/Phi-3.5-mini-instruct (3.8B) |
| Dataset | mudit23/class7-socratic-dpo |
| Method | DPO via Unsloth + TRL |
| LoRA rank | 16 |
| Beta | 0.5 |
| Learning rate | 1e-6 |
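To make the role of the Beta setting concrete, here is a minimal sketch of the standard sigmoid DPO objective for a single preference pair. It is an illustration only: the function name `dpo_loss` and the log-probability values are hypothetical, and the exact loss variant used for this model's training run is not stated in this card.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.5):
    """Sigmoid DPO loss for one (chosen, rejected) pair of summed log-probs."""
    # How much more likely each response is under the policy than the reference.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # beta scales the margin between the two log-ratios.
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Hypothetical log-probabilities: the policy favors the chosen (Socratic) reply.
print(dpo_loss(-10.0, -12.0, -11.0, -11.0, beta=0.5))
```

The loss shrinks as the policy raises the chosen response relative to the rejected one, measured against the frozen reference model.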
Usage
1. Install dependencies
!pip install --no-deps "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps trl peft accelerate bitsandbytes
!pip install unsloth_zoo
!pip install mergekit
!pip install --upgrade trl
from unsloth import FastLanguageModel
import torch
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset
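Before loading the model, it can help to confirm that the packages installed above are importable in the current environment. The sketch below uses only the standard library; the helper name `missing_packages` and the `REQUIRED` list are assumptions made for illustration, not part of Unsloth or TRL.

```python
import importlib.util

# Top-level import names of the packages installed in step 1 (assumed list).
REQUIRED = ["unsloth", "trl", "peft", "accelerate", "bitsandbytes",
            "transformers", "datasets", "torch"]

def missing_packages(names=REQUIRED):
    """Return the names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```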
2. Load the model & CLI Chat
from unsloth import FastLanguageModel
model1, tokenizer1 = FastLanguageModel.from_pretrained(
    model_name = "Bialy17/Phi3.5_socratic_dpo_unsloth",
    max_seq_length = 2048,
    dtype = None,          # auto-detect (float16 on Colab T4)
    load_in_4bit = True,   # saves VRAM; works fine on free T4
)
FastLanguageModel.for_inference(model1)
import torch
from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer1, skip_prompt=True)
chat_history = [
{"role": "system", "content": "You are a helpful Tutor that uses the Socratic teaching method."}
]
print("--- Phi-3.5-mini Socratic Tutor Initialized ---")
print("(Type 'quit' to exit, 'clear' to reset history)\n")
while True:
    user_input = input("Student: ").strip()
    if not user_input:
        continue
    if user_input.lower() in ("exit", "quit", "q"):
        print("Goodbye!")
        break
    if user_input.lower() == "clear":
        chat_history = [chat_history[0]]  # keep system message
        print("\n--- History Cleared ---\n")
        continue
    chat_history.append({"role": "user", "content": user_input})
    inputs = tokenizer1.apply_chat_template(
        chat_history,
        tokenize = True,
        add_generation_prompt = True,
        return_tensors = "pt",
    ).to("cuda")
    print("\nTutor: ", end="", flush=True)
    # Get the correct stop tokens for Phi-3.5
    terminators = tokenizer1.convert_tokens_to_ids(["<|end|>", "<|endoftext|>"])
    outputs = model1.generate(
        input_ids = inputs,
        streamer = text_streamer,
        max_new_tokens = 1024,
        temperature = 0.2,
        do_sample = True,
        eos_token_id = terminators,  # stops at <|end|> or <|endoftext|>
        pad_token_id = tokenizer1.eos_token_id,
    )
    # Decode only the newly generated tokens, skipping the prompt
    raw_output = tokenizer1.decode(
        outputs[0][len(inputs[0]):],
        skip_special_tokens=True
    )
    chat_history.append({"role": "assistant", "content": raw_output})
    print("\n")
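Because the loop above appends every turn to `chat_history`, a long session can eventually exceed the 2048-token `max_seq_length`. One possible mitigation, sketched here under the assumption that the first entry is always the system message, is to keep only the most recent messages. The helper `trim_history` is hypothetical, and it counts messages rather than tokens, so it is only a rough guard.

```python
def trim_history(history, max_messages=8):
    """Keep the system message plus at most `max_messages` recent messages."""
    system, rest = history[:1], history[1:]
    tail = rest[-max_messages:] if max_messages > 0 else []
    # Drop leading non-user messages so the kept window starts with a user turn.
    while tail and tail[0]["role"] != "user":
        tail = tail[1:]
    return system + tail

# Example with a short, made-up conversation.
history = [
    {"role": "system", "content": "You are a helpful Tutor that uses the Socratic teaching method."},
    {"role": "user", "content": "What is gravity?"},
    {"role": "assistant", "content": "What happens when you drop a ball?"},
    {"role": "user", "content": "It falls."},
]
print(len(trim_history(history, max_messages=2)))
```

In the chat loop, this could be called on `chat_history` just before `apply_chat_template`.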
Example
Student: What is gravity?
Tutor: Gravity is described by the general theory of relativity, which is a theory of gravitation that was first proposed by Albert Einstein in 1915. According to Einstein's theory, massive objects cause a distortion in the fabric of spacetime, which we perceive as gravity.
The most familiar example of gravity is the force that keeps us on the ground, and it's the force that keeps planets in orbit around stars.
Gravity is also responsible for the formation of stars, galaxies, and the large-scale structure of the universe.
In everyday life, gravity is the force that keeps us on the ground, and it's the force that keeps planets in orbit around stars.
To summarize, gravity is a fundamental force that attracts two bodies with mass towards each other, and it's one of the four fundamental forces in the universe.
Intended Use
- Tutoring assistants that guide rather than just answer
- Exploring Socratic dialogue in AI systems (the system prompt must include a Socratic-method instruction)
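Since the Socratic behavior depends on the system prompt, a small helper can make sure every conversation starts with one. This is a sketch only: `build_messages` is a hypothetical name, and the default prompt is the one used in the Usage section of this card.

```python
SOCRATIC_SYSTEM_PROMPT = (
    "You are a helpful Tutor that uses the Socratic teaching method."
)

def build_messages(question, system_prompt=SOCRATIC_SYSTEM_PROMPT):
    """Return a chat-format message list that always begins with the system prompt."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question.strip()},
    ]

messages = build_messages("  What is gravity? ")
print(messages[0]["role"], "->", messages[1]["content"])
```

The resulting list has the same shape as `chat_history` and can be passed to the tokenizer's `apply_chat_template`.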
Limitations
- Not fully tested, though it appears to perform well
Model tree for Bialy17/Phi3.5_socratic_dpo_unsloth
Base model
microsoft/Phi-3.5-mini-instruct