Til Core 1B (base)
A 1.25B-parameter Kazakh base language model, pre-trained from scratch on a
deduplicated Kazakh web/text corpus with a 256k morpheme-aware BPE tokenizer
(stukenov/sozkz-morphbpe-256k-kk-v1).
This is a base (non-instruct) model — it completes text, it does not follow chat instructions. An instruct version is planned (see Roadmap).
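Because the model only continues text, a task is usually phrased as the start of a document rather than as a request. The sketch below shows one common pattern, a few-shot question/answer layout that ends where the model should start writing. The helper name `build_completion_prompt`, the `Сұрақ:` / `Жауап:` ("Question:" / "Answer:") labels, and the example pairs are illustrative assumptions, not something shipped with the model, and how well this model follows such a pattern has not been measured here.

```python
def build_completion_prompt(examples, question):
    """Lay out solved Q/A pairs followed by an open answer slot (hypothetical helper)."""
    # Each solved pair shows the pattern the model is expected to continue.
    parts = [f"Сұрақ: {q}\nЖауап: {a}" for q, a in examples]
    # The final item stops right after the answer label, so the continuation is the answer.
    parts.append(f"Сұрақ: {question}\nЖауап:")
    return "\n\n".join(parts)


examples = [
    ("Қазақстанның астанасы қай қала?", "Астана."),  # "Which city is the capital of Kazakhstan?"
    ("Бір аптада неше күн бар?", "Жеті."),            # "How many days are in a week?"
]
prompt = build_completion_prompt(examples, "Бір жылда неше ай бар?")  # "How many months are in a year?"
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the bare prefix used in the Usage section; generation should then be cut at the first blank line, since a base model tends to keep producing further question/answer pairs.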
Model details
| Field | Value |
|---|---|
| Architecture | Llama-style decoder (RoPE, RMSNorm, SwiGLU, GQA) |
| Parameters | 1.246 B (tied input/output embeddings) |
| Hidden / layers | 2048 / 16 |
| Attention heads | 32 query / 8 KV (GQA) |
| Intermediate | 5632 |
| Context length | 2048 |
| Vocab | 256 000 (morpheme-BPE) |
| Precision | bf16 |
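The parameter count can be cross-checked against the other rows of the table. The sketch below is a back-of-the-envelope estimate, assuming the usual Llama layout: no bias terms, head dimension equal to hidden size divided by query heads (64 here), two RMSNorm weight vectors per layer plus one final norm, and an output projection that shares the embedding matrix. The function name is hypothetical.

```python
def llama_style_param_count(vocab, hidden, layers, q_heads, kv_heads,
                            intermediate, tied=True):
    """Estimate parameters of a bias-free Llama-style decoder (hypothetical helper)."""
    head_dim = hidden // q_heads            # assumed: 2048 / 32 = 64
    kv_dim = kv_heads * head_dim            # GQA shrinks the K and V projections
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim   # Q and O, then K and V
    mlp = 3 * hidden * intermediate         # SwiGLU: gate, up and down projections
    norms = 2 * hidden                      # two RMSNorm weight vectors per layer
    embed = vocab * hidden
    lm_head = 0 if tied else vocab * hidden  # tied embeddings add nothing extra
    return embed + layers * (attn + mlp + norms) + hidden + lm_head


total = llama_style_param_count(256_000, 2048, 16, 32, 8, 5632)
print(f"{total:,} parameters (~{total / 1e9:.3f} B)")
# 1,245,775,872 parameters (~1.246 B)
```

Under these assumptions the embedding matrix alone accounts for about 524 M of the 1.246 B parameters, which is a direct consequence of the 256 000-entry vocabulary.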
Training
| Field | Value |
|---|---|
| Tokens | 6.26 B (1 epoch) |
| Train blocks | 3 057 865 × 2048 |
| Corpus | cleaned → MinHash-deduped (11.29 M / 13.19 M docs kept, 85.6 %) |
| Hardware | 8 × NVIDIA H200, FSDP full-shard, bf16 |
| Optimizer | AdamW (β 0.9/0.95, wd 0.1), cosine LR 3e-4, warmup 200 |
| Effective batch | 512 blocks (8 × 16 × grad-accum 4) ≈ 1.05 M tok/step |
| Throughput | ~313 K tok/s |
| Wall-clock | ~5 h 40 m |
| Final loss | ~2.90 (train) |
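The figures in this table are mutually consistent, which the following sketch checks with plain arithmetic. It reads "8 × 16 × grad-accum 4" as 8 GPUs, a per-GPU micro-batch of 16 blocks, and 4 accumulation steps; that reading, the variable names, and the choice to round the step count up are assumptions.

```python
import math

BLOCKS = 3_057_865          # training blocks, from the table
BLOCK_LEN = 2048            # tokens per block (the context length)
GPUS, MICRO_BATCH, GRAD_ACCUM = 8, 16, 4   # assumed reading of "8 x 16 x grad-accum 4"
THROUGHPUT = 313_000        # tokens per second, approximate

tokens = BLOCKS * BLOCK_LEN
blocks_per_step = GPUS * MICRO_BATCH * GRAD_ACCUM
tokens_per_step = blocks_per_step * BLOCK_LEN
steps = math.ceil(BLOCKS / blocks_per_step)   # assumes the last partial batch is kept
hours = tokens / THROUGHPUT / 3600            # pure compute time at the quoted throughput
kept_pct = 100 * 11.29 / 13.19                # share of documents surviving deduplication

print(f"tokens          = {tokens / 1e9:.2f} B")
print(f"tokens per step = {tokens_per_step / 1e6:.2f} M")
print(f"optimizer steps = {steps}")
print(f"compute time    = {hours:.2f} h")
print(f"docs kept       = {kept_pct:.1f} %")
```

The compute-time estimate comes out slightly below the reported wall-clock time; the difference is plausibly start-up, checkpointing, and logging overhead, though the source does not break it down.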
Usage
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Repository id of this model card
name = "TilQazyna/Til-Core-1B"
tok = AutoTokenizer.from_pretrained(name)
# Older transformers releases take torch_dtype= instead of dtype=
m = AutoModelForCausalLM.from_pretrained(name, dtype=torch.bfloat16).cuda().eval()

# Base model: pass the beginning of a text to continue, not an instruction
ids = tok("Қазақстан Республикасы — ", return_tensors="pt").input_ids.cuda()
out = m.generate(ids, max_new_tokens=50, do_sample=True,
                 temperature=0.8, top_p=0.95, repetition_penalty=1.2)
print(tok.decode(out[0], skip_special_tokens=True))
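The snippet above uses a short prompt, so the 2048-token context length is not a concern there. For longer inputs, the prompt plus the requested new tokens should fit in that window. A minimal sketch of one way to enforce this on a plain list of token ids follows; the helper name `fit_prompt` and the keep-the-most-recent-tokens policy are assumptions, not part of the model's tooling.

```python
MAX_CONTEXT = 2048  # context length from the Model details table


def fit_prompt(prompt_ids, max_new_tokens, max_context=MAX_CONTEXT):
    """Keep the most recent prompt tokens so prompt + new tokens fit the window."""
    budget = max_context - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for a prompt")
    # Slicing from the end keeps the text closest to where generation starts.
    return list(prompt_ids)[-budget:]


long_prompt = list(range(3000))            # stand-in for real token ids
trimmed = fit_prompt(long_prompt, max_new_tokens=50)
print(len(trimmed))                        # 1998
```

Trimming from the left also drops any beginning-of-sequence token the tokenizer may have added; whether that matters for this model is not stated in the card, so it may be worth re-adding such a token after trimming.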
Sample generations
Қазақстан Республикасы — мемлекеттік рәміздері. Жалпы білім беретін мектептің 6-сыныбына арналған оқулық…
Жасанды интеллект дегеніміз бұл адам миының эволюциясы, ойлау жүйесі мен мінез-құлқының ерекшеліктерін…
Менің Отаным — «Отан» туралы өлеңді мәнерлеп оқу… Біздің Отанымыз қалай аталады?…
Limitations
- Base model — no instruction following, no safety alignment.
- Single epoch on a 6.26 B-token corpus; factual reliability is limited.
- Corpus skews toward educational / encyclopedic Kazakh text; occasional rare-token artifacts in generation.
- Kazakh-centric; not optimized for other languages.
Roadmap
- Til Core 1B Instruct — SFT on Kazakh instruction data (see plan in repo).
- A smaller instruct sibling for on-device use.
Citation
@misc{tilcore1b2026,
title = {Til Core 1B: a Kazakh base language model with a morpheme-BPE tokenizer},
author = {Tukenov, Saken},
year = {2026}
}