LFM2.5-230M Fable-5 GGUF
GGUF release of LiquidAI/LFM2.5-230M, fine-tuned on Glint-Research/Fable-5-traces.
Files
- lfm2.5-230m-fable-5-f16.gguf: highest quality, largest file
- lfm2.5-230m-fable-5-q8_0.gguf: high quality, smaller
- lfm2.5-230m-fable-5-q4_k_m.gguf: best default for local inference
Training
- Base model: LiquidAI/LFM2.5-230M
- Dataset: Glint-Research/Fable-5-traces
- File used: fable5_cot_merged.jsonl
- Method: PEFT LoRA SFT
- Max sequence length: 4096
- Epochs: 1
- LoRA rank: 32
- LoRA alpha: 64
- LoRA dropout: 0.05
- Precision: FP16 base model, FP32 LoRA trainable weights
- Hardware: Google Colab T4
- Format: Chat template system/user/assistant, preserving Fable context -> completion
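The format and LoRA settings above can be illustrated with a minimal sketch. It is not the training script: the function names, the `context`/`completion` argument names, and the system prompt text are assumptions made for illustration. The scale factor uses the standard LoRA convention of alpha divided by rank.

```python
from typing import Dict, List

# Hypothetical system prompt; the actual prompt used in training is not documented here.
DEFAULT_SYSTEM_PROMPT = "You are a coding agent."


def to_chat_messages(
    context: str,
    completion: str,
    system_prompt: str = DEFAULT_SYSTEM_PROMPT,
) -> List[Dict[str, str]]:
    """Map one Fable-style context -> completion pair onto chat roles."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": context},          # the trace context
        {"role": "assistant", "content": completion},  # the target continuation
    ]


def lora_scale(rank: int, alpha: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / rank."""
    return alpha / rank


messages = to_chat_messages("repo state and task", "next agent action")
print([m["role"] for m in messages])
print(lora_scale(rank=32, alpha=64))
```

With rank 32 and alpha 64, the low-rank update is scaled by 2.0.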
Final training loss samples
- step 555: 1.7037
- step 560: 1.5968
- step 565: 1.6435
- step 570: 1.6109
- step 575: 1.6589
- step 580: 1.6439
Evaluation
We evaluated AKMESSI/lfm2.5-230m-fable-5:F16 against the original base model, LiquidAI/LFM2.5-230M-GGUF:BF16, using local llama.cpp server inference.
These are not official leaderboard submissions. They are lightweight local evaluations intended to compare the fine-tuned model against the base model under the same prompts, decoding settings, and hardware setup.
Summary
The Fable-5 fine-tune improves repository-context code continuation on RepoBench-C-lite Python, while mostly preserving the base model's generic function-calling behavior on BFCL-lite Simple.
| Benchmark | Result |
|---|---|
| RepoBench-C-lite Python | Fine-tuned model outperforms base model |
| BFCL-lite Simple | Fine-tuned model mostly preserves base function-calling ability |
| CodeXGLUE Line Completion Python | Neutral / unchanged |
| CRUXEval-lite | Not a good fit for this trace-style model |
RepoBench-C-lite Python
RepoBench-C-style next-line code completion was used to evaluate repository-context code continuation. We sampled 100 examples each from python_if, python_cff, and python_cfr, for 300 total examples.
| Model | Examples | Exact Match | Prefix Match | Edit Similarity |
|---|---|---|---|---|
| LiquidAI/LFM2.5-230M-GGUF:BF16 | 300 | 10.33% | 10.67% | 46.85% |
| AKMESSI/lfm2.5-230m-fable-5:F16 | 300 | 14.67% | 15.33% | 50.17% |
Compared with the base model, the Fable-5 fine-tune improved:
- Exact match by +4.33 percentage points
- Prefix match by +4.67 percentage points
- Edit similarity by +3.32 percentage points
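The deltas above follow from raw counts rather than from subtracting the rounded percentages (14.67 - 10.33 would give 4.34). The sketch below reproduces them; the counts 31, 44, 32, and 46 are inferred from the reported percentages of 300 examples, not taken from the evaluation logs.

```python
TOTAL = 300

# Counts inferred from the reported percentages (assumption, not logged values).
BASE_EXACT, FABLE_EXACT = 31, 44      # 10.33% and 14.67% of 300
BASE_PREFIX, FABLE_PREFIX = 32, 46    # 10.67% and 15.33% of 300


def pct(count: int, total: int) -> float:
    """Convert a count to a percentage of the total."""
    return 100.0 * count / total


exact_delta = round(pct(FABLE_EXACT - BASE_EXACT, TOTAL), 2)
prefix_delta = round(pct(FABLE_PREFIX - BASE_PREFIX, TOTAL), 2)
# Edit similarity is already an average, so the reported values are subtracted directly.
edit_sim_delta = round(50.17 - 46.85, 2)

print(exact_delta, prefix_delta, edit_sim_delta)  # 4.33 4.67 3.32
```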
Breakdown by config:
| Config | Base Exact | Fable Exact | Base Edit Sim | Fable Edit Sim |
|---|---|---|---|---|
| python_if | 21.00% | 27.00% | 55.14% | 57.31% |
| python_cff | 3.00% | 5.00% | 37.45% | 38.10% |
| python_cfr | 7.00% | 12.00% | 47.96% | 55.10% |
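For readers who want to reproduce this kind of next-line scoring, here is a minimal sketch of the three metrics. The definitions are assumptions: prefix match is taken to mean that the prediction starts with the reference line, and edit similarity uses `difflib.SequenceMatcher` from the standard library. The local scoring script may define either differently.

```python
from difflib import SequenceMatcher
from typing import Dict, Iterable, Tuple


def exact_match(prediction: str, reference: str) -> bool:
    """True when the predicted line equals the reference after trimming whitespace."""
    return prediction.strip() == reference.strip()


def prefix_match(prediction: str, reference: str) -> bool:
    """Assumed definition: the prediction begins with the reference line."""
    return prediction.strip().startswith(reference.strip())


def edit_similarity(prediction: str, reference: str) -> float:
    """Similarity in [0, 100] from difflib; a stand-in for the harness's edit metric."""
    return 100.0 * SequenceMatcher(None, prediction.strip(), reference.strip()).ratio()


def score(pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Average the three metrics over (prediction, reference) pairs, as percentages."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (prediction, reference) pair")
    n = len(pairs)
    return {
        "exact_match": 100.0 * sum(exact_match(p, r) for p, r in pairs) / n,
        "prefix_match": 100.0 * sum(prefix_match(p, r) for p, r in pairs) / n,
        "edit_similarity": sum(edit_similarity(p, r) for p, r in pairs) / n,
    }


results = score([
    ("x = 1", "x = 1"),            # exact and prefix match
    ("x = 1  # note", "x = 1"),    # prefix match only
    ("y", "x = 1"),                # neither
])
print(results)
```

Under these definitions every exact match is also a prefix match, which is consistent with the prefix-match column being slightly higher than the exact-match column in the table above.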
BFCL-lite Simple
We also ran a local BFCL-lite Simple function-calling evaluation over 400 examples as a generic tool-calling control.
| Model | Examples | Parse-valid JSON | Function-name Match | Argument Recall | Rough Score |
|---|---|---|---|---|---|
| LiquidAI/LFM2.5-230M-GGUF:BF16 | 400 | 97.75% | 97.50% | 71.60% | 88.44% |
| AKMESSI/lfm2.5-230m-fable-5:F16 | 400 | 98.25% | 95.00% | 67.70% | 85.44% |
The fine-tuned model preserves most of the base model's generic function-calling behavior, but does not improve BFCL-style API-schema-to-JSON calling. This is expected because the training data consists of coding-agent traces rather than clean function-calling examples.
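To make the first three columns concrete, here is a minimal sketch of how one response could be scored. The `{"name": ..., "arguments": {...}}` call shape and the function names are assumptions for illustration; the local script may use a different schema, and the Rough Score formula is not reproduced here.

```python
import json
from typing import Any, Dict, Optional


def parse_call(text: str) -> Optional[Dict[str, Any]]:
    """Return the decoded JSON object, or None if the text is not a JSON object."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return None
    return obj if isinstance(obj, dict) else None


def score_call(text: str, expected: Dict[str, Any]) -> Dict[str, Any]:
    """Score one model response against one expected call (hypothetical schema)."""
    call = parse_call(text)
    if call is None:
        return {"parse_valid": False, "name_match": False, "arg_recall": 0.0}

    expected_args = expected.get("arguments", {})
    got_args = call.get("arguments", {})
    if not isinstance(got_args, dict):
        got_args = {}

    # Recall: fraction of expected arguments present with the expected value.
    hits = sum(1 for k, v in expected_args.items() if k in got_args and got_args[k] == v)
    recall = hits / len(expected_args) if expected_args else 1.0

    return {
        "parse_valid": True,
        "name_match": call.get("name") == expected.get("name"),
        "arg_recall": recall,
    }


expected_call = {"name": "get_weather", "arguments": {"city": "Paris", "unit": "c"}}
print(score_call('{"name": "get_weather", "arguments": {"city": "Paris"}}', expected_call))
print(score_call("I will call get_weather for Paris.", expected_call))
```

A response can therefore be parse-valid and name-correct while still losing argument recall, which is the pattern seen in the fine-tuned model's row.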
CodeXGLUE Line Completion Python
We ran a 1,000-example local CodeXGLUE line-completion evaluation as a general code-completion control.
| Model | Examples | Exact Match | Prefix Match | Edit Similarity |
|---|---|---|---|---|
| LiquidAI/LFM2.5-230M-GGUF:BF16 | 1000 | 23.60% | 0.00% | 23.60% |
| AKMESSI/lfm2.5-230m-fable-5:F16 | 1000 | 23.50% | 0.00% | 23.50% |
This result is effectively neutral. The Fable-5 fine-tune does not materially change general line-completion performance on this setup.
CRUXEval-lite
We also tried a 200-example CRUXEval-lite run for Python execution reasoning.
| Model | Task O Accuracy | Task I Accuracy | Overall Accuracy |
|---|---|---|---|
| LiquidAI/LFM2.5-230M-GGUF:BF16 | 8.50% | 4.00% | 6.25% |
| AKMESSI/lfm2.5-230m-fable-5:F16 | 0.00% | 0.00% | 0.00% |
This benchmark was not a good fit for the fine-tuned model. The Fable-5 model often entered explanation or trace-style response mode instead of returning only the exact literal Python value expected by CRUXEval.
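The failure mode can be illustrated with a strict literal scorer. This is a sketch under the assumption that a response only counts when the whole response parses as a Python literal equal to the expected value; the function name is hypothetical and the actual harness may be stricter or more lenient.

```python
import ast


def strict_literal_match(response: str, expected_literal: str) -> bool:
    """True only if the whole response is a Python literal equal to the expected one."""
    try:
        got = ast.literal_eval(response.strip())
        want = ast.literal_eval(expected_literal)
    except (ValueError, SyntaxError, TypeError):
        # Any surrounding explanation makes the response unparseable as a literal.
        return False
    return got == want


print(strict_literal_match("[1, 2, 3]", "[1,2,3]"))
print(strict_literal_match("Let me trace this step by step. The result is [1, 2, 3].", "[1, 2, 3]"))
```

The first call matches; the second does not, even though the correct value appears inside the explanation.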
Interpretation
The Fable-5 fine-tune appears to shift the base model toward coding-agent and repository-context continuation behavior.
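It improves RepoBench-C-lite Python next-line completion, while mostly preserving generic function-calling ability on BFCL-lite Simple. On BFCL-lite, the largest drop is in argument recall (71.60% to 67.70%), which is not the main target of the Fable-5 trace dataset. On CRUXEval-lite, accuracy falls from 6.25% to 0.00% because the model tends to answer in trace style rather than with a bare literal.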
The model is best understood as a tiny coding-agent trace model, not a general-purpose reasoning model or a benchmark-specialized function-calling model.
Evaluation Caveats
- These are local lightweight evaluations, not official leaderboard submissions.
- Results were produced with llama.cpp server inference.
- Scores may vary with prompting, decoding settings, quantization level, and benchmark harness details.
- BFCL-lite and RepoBench-C-lite use simplified local scoring scripts rather than official leaderboard infrastructure.
- Only the F16 model was benchmarked here; quantized GGUF variants may differ slightly.
Usage
Recommended local file:
lfm2.5-230m-fable-5-q4_k_m.gguf
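One way to query the model is through a local llama.cpp server, for example one started with `llama-server -hf AKMESSI/lfm2.5-230m-fable-5:Q4_K_M`. The sketch below uses only the Python standard library and assumes the server listens on its default port 8080 and exposes the OpenAI-compatible `/v1/chat/completions` route. The helper names and the `max_tokens` default are illustrative.

```python
import json
import urllib.request

# Assumed local endpoint; adjust host and port to match your server.
SERVER_URL = "http://localhost:8080/v1/chat/completions"
MODEL_ID = "AKMESSI/lfm2.5-230m-fable-5:Q4_K_M"


def build_chat_request(prompt: str, max_tokens: int = 256) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for a single user prompt."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str) -> str:
    """Send the prompt to the running server and return the assistant's text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=120) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Complete the next line: def add(a, b):"))` returns the model's continuation. Capping `max_tokens` is a pragmatic choice here, since the model may produce long trace-style output.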
Caveats
This model is trained on coding-agent trace telemetry. It may emit tool-call-like actions, shell commands, file paths, or long reasoning-style continuations. Review outputs before executing commands.
The dataset contains coding-agent traces and should not be treated as a clean benchmark or a safety-filtered assistant dataset.
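As a rough aid for that review step, the sketch below flags output lines that look like shell commands so a person can inspect them first. The pattern is a hypothetical heuristic, not a safety filter: it will miss commands and may flag harmless text, and nothing here executes anything.

```python
import re
from typing import List

# Hypothetical heuristic: a "$ " prompt or a line starting with a common command name.
_COMMAND_LINE = re.compile(r"^\s*(?:\$\s+|(?:rm|sudo|curl|wget|git|pip|chmod|mv|cp)\b)")


def flag_command_lines(model_output: str) -> List[str]:
    """Return lines that resemble shell commands, for manual review only."""
    return [
        line.strip()
        for line in model_output.splitlines()
        if _COMMAND_LINE.match(line)
    ]


sample = "I will clean the build.\n$ rm -rf build\nsudo make install\nDone."
for line in flag_command_lines(sample):
    print("REVIEW BEFORE RUNNING:", line)
```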
License notes
- Base model: LiquidAI LFM Open License v1.0
- Dataset: AGPL-3.0
- This repo preserves upstream license notices. Check compatibility before commercial or closed-source use.