FitCoach 3B (Merged, fp16)
A fully merged, 16-bit (fp16) fine-tune of Llama 3.2 3B Instruct that acts as FitCoach, a conversational fitness and nutrition intake coach. This is the lightweight option in the FitCoach model family, alongside the 8B LoRA adapter.
Try it live: FitCoach Space
Model Details
- Base model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit (loaded in 4-bit for training via Unsloth, then merged to 16-bit for deployment)
- Format: merged weights, fp16; no adapter required, load directly
- Adapter (during training): LoRA, rank 16, alpha 16, dropout 0, targeting all attention and MLP projection layers (q/k/v/o_proj, gate/up/down_proj)
- Training framework: Unsloth FastLanguageModel + TRL SFTTrainer
- Training data: Harsh-k-007/fitcoach-conversations, 1,407 synthetic coaching conversations (95/5 train/eval split for this run)
- Sequence length: 2048 tokens, with sequence packing (bfd strategy)
- Precision: bf16 training on a single T4 GPU (Google Colab free tier)
- Epochs: 2, effective batch size 8 (2 × 4 grad accumulation), cosine LR schedule, peak LR 2e-4
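The "bfd" packing strategy listed above stands for best-fit decreasing: sequences are sorted from longest to shortest, and each one goes into the packed row that has the least free space but can still hold it. The following is a minimal sketch of that idea in plain Python, not the TRL implementation. The function name `pack_bfd` and the example lengths are hypothetical, and over-length sequences raise an error here rather than being truncated or split.

```python
from typing import List


def pack_bfd(lengths: List[int], max_len: int = 2048) -> List[List[int]]:
    """Group sequence lengths into rows of at most max_len tokens (best-fit decreasing)."""
    rows: List[List[int]] = []
    free: List[int] = []  # remaining capacity of each row

    for n in sorted(lengths, reverse=True):
        if n > max_len:
            raise ValueError(f"sequence of {n} tokens exceeds max_len={max_len}")

        # Pick the row with the least remaining space that still fits this sequence.
        best = -1
        for i, space in enumerate(free):
            if n <= space and (best == -1 or space < free[best]):
                best = i

        if best == -1:
            # No existing row fits: open a new one.
            rows.append([n])
            free.append(max_len - n)
        else:
            rows[best].append(n)
            free[best] -= n

    return rows


# Hypothetical token counts for six conversations.
packed = pack_bfd([1500, 900, 700, 500, 400, 100])
print(packed)  # [[1500, 500], [900, 700, 400], [100]]
```

Packing this way turns six short sequences into three rows, so less of each 2048-token row is spent on padding.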
Intended Use
FitCoach is a conversational intake coach for fitness and nutrition. Given a user's goal, it asks one question at a time to gather the relevant context, then generates a structured plan.
Scope is intentionally narrow:
- Meal plans (~60% of training data): collects goal, age/height/weight, dietary restrictions, activity level
- Workout plans (~40% of training data): collects goal, experience level, days per week, equipment access
The 3B model is intended as a lighter, faster alternative to the 8B adapter — useful where latency or memory matters more than maximum response quality.
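An application wrapping the model may want to track which intake details have been gathered so far. The sketch below is a hypothetical app-side checklist built from the fields listed above. It is not part of the model, and the names `REQUIRED_FIELDS` and `missing_fields` are made up for illustration.

```python
from typing import Dict, List

# Intake fields per plan type, taken from the scope description above.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "meal_plan": [
        "goal", "age", "height", "weight",
        "dietary_restrictions", "activity_level",
    ],
    "workout_plan": [
        "goal", "experience_level", "days_per_week", "equipment_access",
    ],
}


def missing_fields(plan_type: str, collected: Dict[str, str]) -> List[str]:
    """Return the intake fields that have not been answered yet, in order."""
    return [f for f in REQUIRED_FIELDS[plan_type] if not collected.get(f)]


remaining = missing_fields(
    "workout_plan", {"goal": "build muscle", "days_per_week": "4"}
)
print(remaining)  # ['experience_level', 'equipment_access']
```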
Out of scope
- Injuries, medical conditions, or any medical advice
- Macro/calorie arithmetic — the model can describe macro targets conceptually but is not reliable at computing them; treat any numeric macro breakdown as approximate, not verified
- Unprompted macro generation — the model does not currently generate macros unless explicitly asked (known dataset gap, planned for v2)
How to Use
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Harsh-k-007/fitcoach-3b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Pad with Llama 3's reserved right-padding token (id 128004).
tokenizer.pad_token = "<|finetune_right_pad_id|>"

# The merged fp16 weights load directly; no adapter is needed.
# On older transformers releases this argument is named torch_dtype.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    dtype=torch.float16,
    device_map="auto",
)
model.eval()

messages = [
    {"role": "system", "content": "You are FitCoach, a friendly fitness and nutrition coach."},
    {"role": "user", "content": "Create a simple fat-loss meal plan with Indian food options."},
]

# Apply the chat template and move input_ids and attention_mask to the model's device.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.pad_token_id,
)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True))
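Because FitCoach asks one question at a time, a real session needs the full message history passed back on every turn. The sketch below shows one way to keep that history. The helper `chat_turn` and the stand-in `fake_reply` are hypothetical names, and the stand-in replaces the real model call so the example runs without downloading weights.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def chat_turn(
    history: List[Message],
    user_text: str,
    generate_reply: Callable[[List[Message]], str],
) -> List[Message]:
    """Return a new history with the user turn and the model's reply appended."""
    with_user = history + [{"role": "user", "content": user_text}]
    reply = generate_reply(with_user)
    return with_user + [{"role": "assistant", "content": reply}]


def fake_reply(messages: List[Message]) -> str:
    """Stand-in for the model: reports how many user turns it has seen."""
    turns = sum(m["role"] == "user" for m in messages)
    return f"(reply to user turn {turns})"


history: List[Message] = [
    {"role": "system", "content": "You are FitCoach, a friendly fitness and nutrition coach."},
]
history = chat_turn(history, "I want a workout plan.", fake_reply)
history = chat_turn(history, "I'm a beginner.", fake_reply)

roles = [m["role"] for m in history]
print(roles)  # ['system', 'user', 'assistant', 'user', 'assistant']
```

In practice, `generate_reply` would wrap the `apply_chat_template` and `model.generate` calls from the snippet above and return the decoded text.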
Training Procedure
- Method: Supervised fine-tuning (SFT) with TRL SFTTrainer, using Unsloth's FastLanguageModel for memory-efficient LoRA training (gradient checkpointing via use_gradient_checkpointing="unsloth"), then merged to 16-bit weights via save_pretrained_merged(..., save_method="merged_16bit")
- Loss: full-conversation loss (train_on_responses_only / assistant-only masking was not applied in this run; it is a documented future optimization once reliably supported for the Llama 3 chat template)
- Chat template: Llama 3.2 (unsloth.chat_templates.get_chat_template)
- Optimizer: adamw_8bit, weight decay 0.01, cosine schedule, 17 warmup steps
- Hardware: Google Colab T4 (free tier), with Drive checkpointing for resumability across the 90-minute idle / 12-hour session limits
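To give a sense of how small the trained adapter is, the sketch below estimates the LoRA parameter count for rank 16 across the attention and MLP projections listed under Model Details. The layer dimensions are assumptions taken from the public Llama 3.2 3B configuration (hidden size 3072, intermediate size 8192, 8 KV heads of dimension 128, 28 layers). They are not stated in this card.

```python
RANK = 16

# Assumed Llama 3.2 3B dimensions (from the public config, not from this card).
HIDDEN = 3072
INTERMEDIATE = 8192
KV_DIM = 8 * 128  # 8 key/value heads, head dimension 128
LAYERS = 28

# (input features, output features) of each adapted projection.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds A (rank x d_in) and B (d_out x rank) to each adapted matrix."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in PROJECTIONS.values())
total = per_layer * LAYERS
print(per_layer, total)  # 868352 24313856
```

Under these assumptions the adapter holds about 24.3 million trainable parameters, well under 1% of a roughly 3-billion-parameter base model.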
Known Limitations
- Macro arithmetic is hallucinated. The model isn't reliable at computing calorie/macro numbers. A v2 release plans to add a calculator/tool layer for this.
- Macros aren't generated unprompted. The dataset under-represents this, so the model needs to be asked explicitly. Planned fix for v2 via dataset augmentation.
- No assistant-only loss in this training run (see above).
- As the smaller model in the family, expect slightly less consistent intake behavior and plan structure compared to the 8B adapter.
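Until a calculator layer exists, a caller can at least sanity-check any macro breakdown the model produces. The sketch below is a hypothetical check, not the planned v2 tool. It uses the common approximation of 4 kcal per gram of protein and carbohydrate and 9 kcal per gram of fat; the function names, the 5% tolerance, and the example numbers are assumptions for illustration.

```python
# Approximate energy per gram of each macronutrient (general Atwater factors).
KCAL_PER_GRAM = {"protein": 4, "carbs": 4, "fat": 9}


def macro_calories(protein_g: float, carbs_g: float, fat_g: float) -> float:
    """Total calories implied by a macro breakdown given in grams."""
    return (
        protein_g * KCAL_PER_GRAM["protein"]
        + carbs_g * KCAL_PER_GRAM["carbs"]
        + fat_g * KCAL_PER_GRAM["fat"]
    )


def is_consistent(
    stated_kcal: float,
    protein_g: float,
    carbs_g: float,
    fat_g: float,
    tolerance: float = 0.05,
) -> bool:
    """True if the stated calories are within `tolerance` of the macro-implied total."""
    computed = macro_calories(protein_g, carbs_g, fat_g)
    return abs(computed - stated_kcal) <= tolerance * stated_kcal


# Hypothetical model output: 150 g protein, 200 g carbs, 60 g fat.
implied = macro_calories(150, 200, 60)
print(implied)                            # 1940
print(is_consistent(1900, 150, 200, 60))  # True
print(is_consistent(1500, 150, 200, 60))  # False
```

A failed check only signals that the numbers do not add up; it does not say which figure is wrong or whether the targets suit the user.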
Citation
If you use this model, please link back to this repo and the training dataset.
Model tree for Harsh-k-007/fitcoach-3b
Base model
meta-llama/Llama-3.2-3B-Instruct