Kili-small-1.0
A powerful, lightweight Small Language Model built for general chat and code — fine-tunable on consumer hardware.
🤗 Model on Hugging Face · 📄 Apache 2.0 License · 💬 Discussions
Overview
Kili-small-1.0 is a 500M-parameter Small Language Model (SLM) developed by Kili Labs, designed to deliver strong general-purpose chat and code generation capabilities in an extremely efficient footprint.
Kili-small-1.0 features a custom architecture designed from the ground up by Kili Labs, while leveraging the battle-tested Qwen2 tokenizer, chat template, and vocabulary for broad ecosystem compatibility. This means you get the benefits of a purpose-built model without sacrificing interoperability — standard Qwen2-compatible tooling, tokenizers, and pipelines work out of the box.
The model is purpose-engineered for developers and researchers who need a capable, adaptable model that runs — and fine-tunes — on consumer-grade hardware.
Key Features
- 500M Parameters — Compact by design. Maximum capability per parameter.
- General Chat & Code — Strong performance on natural language conversation and code generation tasks.
- Consumer Hardware Compatible — Fine-tunable with as little as 15 GB of RAM on low-end ("potato") GPUs with standard optimisation techniques.
- Apache 2.0 Licensed — Fully open. Use, modify, and distribute freely for commercial and research purposes.
- Safetensors Format — Efficient, safe model serialisation out of the box.
Model Details
| Property | Value |
|---|---|
| Architecture | Custom (Kili Labs) |
| Tokenizer | Qwen2 (vocab, chat template & tools) |
| Parameter Count | 500M (0.5B) |
| Tensor Type | F16 |
| Language | English |
| License | Apache 2.0 |
| Task | Text Generation |
| Tags | SLM, Chat, Coding, Qwen2-compatible |
Note on Architecture: While Kili-small-1.0 uses a custom model architecture designed by Kili Labs, it adopts the Qwen2 tokenizer, vocabulary, and chat template. This ensures full compatibility with Qwen2-compatible inference pipelines, tokenization utilities, and chat formatting tools.
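To make the chat-format compatibility concrete, here is a minimal sketch of the ChatML-style layout that Qwen2 chat templates produce. It assumes Kili-small-1.0 ships the same layout; the helper name `format_chatml` is hypothetical, and the real template may also inject a default system prompt. In practice, prefer `tokenizer.apply_chat_template`.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render messages in a ChatML-style layout (illustrative approximation only)."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # An open assistant turn tells the model to start answering
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = format_chatml([{"role": "user", "content": "Hi"}])
print(prompt)
```

Comparing this string with the output of `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` is a quick way to check whether the assumption holds for the released tokenizer.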
Training Datasets
Kili-small-1.0 was trained on a curated combination of high-quality instruction and planning datasets:
| Dataset | Description |
|---|---|
| vicgalle/alpaca-gpt4 | GPT-4 generated Alpaca-format instruction-following data (~52k samples) |
| Qwen/DeepPlanning | Deep planning and reasoning tasks (~2.1k samples) |
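As an illustration of how Alpaca-format data can be prepared for a chat model, the sketch below maps one record (with the standard `instruction`, `input`, and `output` fields) to a list of chat messages. This is an assumption about one plausible preprocessing step, not the pipeline Kili Labs actually used; the function name `alpaca_to_messages` is hypothetical.

```python
def alpaca_to_messages(record):
    """Convert one Alpaca-format record into user/assistant chat messages."""
    instruction = record["instruction"].strip()
    # The optional "input" field carries extra context for the instruction
    context = (record.get("input") or "").strip()
    user_content = f"{instruction}\n\n{context}" if context else instruction
    return [
        {"role": "user", "content": user_content},
        {"role": "assistant", "content": record["output"].strip()},
    ]


example = {
    "instruction": "Translate to French.",
    "input": "Good morning",
    "output": "Bonjour",
}
print(alpaca_to_messages(example))
```

The resulting message list can be passed to `tokenizer.apply_chat_template` to produce training text in the model's chat format.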
Quickstart
Installation
pip install transformers torch accelerate
Inference
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "kililabs/Kili-small-1.0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the accelerate package
)

messages = [
    {"role": "user", "content": "Write a Python function to reverse a linked list."}
]

# Apply the chat template; return_dict=True also returns the attention mask
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True))
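The `temperature` and `top_p` arguments above control how the next token is sampled. The following is a conceptual sketch of temperature scaling followed by nucleus (top-p) filtering in plain Python. It is not the transformers implementation, and the function name `top_p_filter` is hypothetical.

```python
import math


def top_p_filter(logits, temperature=0.7, top_p=0.9):
    """Return {token_index: renormalised probability} for the nucleus set."""
    # Temperature below 1 sharpens the distribution, above 1 flattens it
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative mass reaches top_p
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


nucleus = top_p_filter([2.0, 1.0, 0.0, -5.0], temperature=1.0, top_p=0.9)
print(sorted(nucleus))
```

The sampler then draws the next token from the surviving entries only, which trims the unlikely tail while keeping some variety in the output.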
Fine-Tuning on Consumer Hardware
Kili-small-1.0 is designed to be accessible for fine-tuning. With 4-bit QLoRA, it can be fine-tuned on a single consumer GPU with roughly 6–8 GB of VRAM and about 15 GB of system RAM.
Recommended Setup
pip install transformers peft bitsandbytes accelerate datasets trl
Example: QLoRA Fine-Tuning
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
import torch

model_id = "kililabs/Kili-small-1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# 4-bit quantisation for low VRAM usage
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Prepare the quantised model for training (a common QLoRA step)
model = prepare_model_for_kbit_training(model)

# LoRA configuration
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# trainable params: ~4.7M || all params: ~504M || trainable%: ~0.93%
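The trainable-parameter count follows from the LoRA configuration above: each adapted projection of shape (d_in, d_out) gains two low-rank matrices with r × (d_in + d_out) parameters in total. The sketch below shows that arithmetic. The layer count and projection shapes are hypothetical placeholders, not Kili-small-1.0's actual dimensions, so the result is not expected to match the figure printed above.

```python
def lora_param_count(shapes, r, num_layers):
    """Count LoRA parameters: r * (d_in + d_out) per adapted projection per layer."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


# Hypothetical dimensions for illustration only (not taken from the model)
hypothetical_shapes = {
    "q_proj": (1024, 1024),
    "v_proj": (1024, 256),
}
count = lora_param_count(hypothetical_shapes, r=16, num_layers=24)
print(f"{count:,} trainable LoRA parameters")
```

Substituting the real shapes from the model's configuration gives the number that `print_trainable_parameters()` reports, since `bias="none"` adds no further trainable weights.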
Hardware Requirements
| Setup | VRAM | RAM | Notes |
|---|---|---|---|
| Full F16 Inference | ~2 GB | 4 GB | Very fast |
| QLoRA Fine-Tuning (4-bit) | ~6–8 GB | 15 GB | Consumer GPUs (GTX 1080, RTX 3060, etc.) |
| Full Fine-Tuning | ~4 GB | 12 GB | F16, gradient checkpointing recommended |
Tip: Enable `gradient_checkpointing=True` and use `bf16` or `fp16` mixed precision in your `TrainingArguments` to further reduce memory usage during fine-tuning.
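As a rough sanity check on the table above, the memory taken by the weights alone can be estimated from the parameter count and the bits stored per parameter. This is a back-of-the-envelope sketch: it ignores activations, gradients, optimizer state, the KV cache, and quantisation constants, which is why real VRAM usage is higher than these figures. The function name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for model weights only, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


params = 500_000_000  # 500M parameters, as stated in the model details

f16_gib = weight_memory_gib(params, 16)
nf4_gib = weight_memory_gib(params, 4)

print(f"F16 weights:   {f16_gib:.2f} GiB")  # 0.93 GiB
print(f"4-bit weights: {nf4_gib:.2f} GiB")  # 0.23 GiB
```

The gap between these weight-only numbers and the VRAM column reflects runtime overhead, and during fine-tuning it is dominated by activations and optimizer state rather than by the weights themselves.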
Intended Use
Kili-small-1.0 is designed for:
- Conversational AI — Instruction-following, Q&A, and general assistant tasks.
- Code Generation — Writing, explaining, and debugging code across common languages.
- Fine-Tuning Base — A lightweight starting point for domain-specific SLM development.
- Edge & Resource-Constrained Deployments — Applications where model size and memory are critical constraints.
Limitations
- As a 500M-parameter model, Kili-small-1.0 may underperform larger models on complex multi-step reasoning or highly specialised domain tasks.
- Output quality is strongly influenced by prompt quality. Clear, well-structured prompts yield the best results.
- The model has not been independently evaluated on standard safety benchmarks. Users are responsible for applying appropriate safety measures in production deployments.
License
This model is released under the Apache License 2.0. You are free to use, reproduce, modify, and distribute this model for commercial and non-commercial purposes. See the full license text: Apache 2.0
Citation
If you use Kili-small-1.0 in your research or projects, please cite it as:
@misc{kililabs2025kilismall,
title = {Kili-small-1.0: A 500M Parameter Small Language Model for Chat and Code},
author = {Kili Labs},
year = {2025},
howpublished = {\url{https://huggingface.co/kililabs/Kili-small-1.0}},
note = {Apache 2.0 License}
}
About Kili Labs
Kili Labs is committed to building powerful, accessible AI tools that work for everyone — not just those with access to enterprise infrastructure. Kili-small-1.0 is a step toward democratising capable language models for developers, researchers, and builders worldwide.
Made with ❤️ by Kili Labs