tinytune-smollm2-xlam
A LoRA fine-tune of HuggingFaceTB/SmolLM2-135M-Instruct that emits structured tool / function calls, trained on Salesforce/xlam-function-calling-60k.
The repo ships two artifacts: a standalone merged model at the repo root (plain transformers,
no PEFT required) and the raw LoRA adapter under adapter/. The merge was produced with
Unsloth.
⚠️ Checkpoint snapshot at step 5000 of 9000 (56%, ~1.67 epochs); this may not be a fully trained final run.
Results
Greedy generation over 200 held-out test examples:
| Metric | Score |
|---|---|
| Tool-name match — function name(s) match gold | 96.0% |
| Exact match — name and all arguments match | 72.5% |
exact_match is the metric that matters: one wrong argument fails it while barely denting
token accuracy, so a low SFT loss can still mean imperfect calls.
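To make the difference between the two metrics concrete, here is a minimal sketch of how they could be computed over already-parsed calls. The evaluation script behind the table is not shown in this card, so the function names and the order-insensitive comparison are assumptions, not the author's implementation.

```python
import json

def canonical(call):
    """Serialize one call with sorted keys so argument order does not matter."""
    return json.dumps(
        {"name": call.get("name", ""), "arguments": call.get("arguments", {})},
        sort_keys=True,
    )

def tool_name_match(pred, gold):
    """True when the predicted function names equal the gold names (order-insensitive)."""
    return sorted(c.get("name", "") for c in pred) == sorted(c.get("name", "") for c in gold)

def exact_match(pred, gold):
    """True only when every name and every argument value equals the gold call."""
    return sorted(canonical(c) for c in pred) == sorted(canonical(c) for c in gold)

gold = [{"name": "get_weather", "arguments": {"city": "Paris"}}]
pred = [{"name": "get_weather", "arguments": {"city": "Lyon"}}]

print(tool_name_match(pred, gold))  # True: the right function was chosen
print(exact_match(pred, gold))      # False: one wrong argument fails the call
```

The prediction differs from the gold call in a single argument value, which is enough to fail exact match while the tool name still matches.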
Output format
The assistant replies with one or more <tool_call>{"name": ..., "arguments": {...}}</tool_call> blocks.
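A reply in this format can be turned into Python dictionaries with a short parser. The sketch below is one possible approach; `parse_tool_calls` is a hypothetical helper, not part of this repo, and it assumes each block contains a single JSON object.

```python
import json
import re

# Non-greedy match so several <tool_call> blocks in one reply are split correctly.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

def parse_tool_calls(text):
    """Return a list of {"name", "arguments"} dicts found in a model reply."""
    calls = []
    for body in TOOL_CALL_RE.findall(text):
        try:
            obj = json.loads(body)
        except json.JSONDecodeError:
            continue  # skip blocks the model failed to format as valid JSON
        if isinstance(obj, dict) and "name" in obj:
            calls.append({"name": obj["name"], "arguments": obj.get("arguments", {})})
    return calls

reply = '<tool_call>{"name": "get_weather", "arguments": {"city": "Paris"}}</tool_call>'
print(parse_tool_calls(reply))
# [{'name': 'get_weather', 'arguments': {'city': 'Paris'}}]
```

Malformed blocks are skipped rather than raising, which suits a small model that may occasionally emit invalid JSON; a stricter caller could raise instead.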
Usage — merged model
from transformers import AutoModelForCausalLM, AutoTokenizer
tok = AutoTokenizer.from_pretrained("anisiraj/tinytune-smollm2-xlam")
model = AutoModelForCausalLM.from_pretrained("anisiraj/tinytune-smollm2-xlam").eval()
# Tools are listed in the system turn, as in the training template.
messages = [
    {"role": "system", "content": "You are a function-calling assistant. Available tools:\n"
                                  '{"name": "get_weather", "arguments": {"city": "string"}}'},
    {"role": "user", "content": "What's the weather in Paris?"},
]
# return_dict=True returns input_ids together with the attention mask for generate().
inp = tok.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors="pt")
out = model.generate(**inp, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
print(tok.decode(out[0][inp["input_ids"].shape[1]:], skip_special_tokens=True))
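When more than one tool is available, the system turn can be assembled programmatically. The sketch below is a hedged illustration: `build_messages` is a hypothetical helper, and listing one JSON object per line is an assumption extrapolated from the single-tool example above, so the exact multi-tool layout used in training may differ.

```python
import json

SYSTEM_PREFIX = "You are a function-calling assistant. Available tools:\n"

def build_messages(tools, user_query):
    """Build a chat with the tool definitions placed in the system turn."""
    # Assumption: one JSON tool definition per line after the prefix.
    tool_lines = "\n".join(json.dumps(tool) for tool in tools)
    return [
        {"role": "system", "content": SYSTEM_PREFIX + tool_lines},
        {"role": "user", "content": user_query},
    ]

tools = [{"name": "get_weather", "arguments": {"city": "string"}}]
messages = build_messages(tools, "What's the weather in Paris?")
print(messages[0]["content"])
```

For a single tool this reproduces the system prompt used in the example above, so the resulting list can be passed to `apply_chat_template` in the same way.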
Usage — LoRA adapter
from transformers import AutoModelForCausalLM
from peft import PeftModel
# Load the base model first, then attach the adapter stored under adapter/ in this repo.
base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")
model = PeftModel.from_pretrained(base, "anisiraj/tinytune-smollm2-xlam", subfolder="adapter")
Training
LoRA (full precision) trained with the TRL SFTTrainer: r=16, alpha=32, dropout=0.05, targeting
the attention and MLP modules; completion-only (generation) loss; learning rate 2e-4 with a cosine
schedule, warmup 0.03; batch size 4 × gradient accumulation 4. The chat template is a custom
ChatML+tools template that places the tool definitions in the system turn.
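The quantities implied by these hyperparameters can be worked out with a few lines of arithmetic. The sketch below makes several assumptions that the card does not confirm: that "warmup 0.03" is a warmup ratio, that the schedule is linear warmup followed by half-cosine decay to zero, that the run is planned for the 9000 steps mentioned in the checkpoint note, and that training used a single device.

```python
import math

R, ALPHA = 16, 32            # LoRA rank and alpha from the card
BATCH, GRAD_ACCUM = 4, 4     # per-step batch and gradient accumulation
BASE_LR = 2e-4
WARMUP_RATIO = 0.03          # assumption: 0.03 is a fraction of total steps
TOTAL_STEPS = 9000           # assumption: taken from the checkpoint note

lora_scale = ALPHA / R                  # standard LoRA scaling applied to the low-rank update
effective_batch = BATCH * GRAD_ACCUM    # sequences per optimizer step on one device
warmup_steps = round(WARMUP_RATIO * TOTAL_STEPS)

def lr_at(step):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, TOTAL_STEPS - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lora_scale, effective_batch, warmup_steps)  # 2.0 16 270
print(lr_at(5000))  # learning rate at the step of this checkpoint
```

Under these assumptions the snapshot at step 5000 was taken while the learning rate was still well above zero, which fits the note that this may not be the final run.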
License & data
Weights derive from HuggingFaceTB/SmolLM2-135M-Instruct (Apache-2.0). Trained on Salesforce/xlam-function-calling-60k — review that dataset's license before downstream use.