Libraries

How to use AhiskaAI/AhiskaAI-65m-IT-v0.1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="AhiskaAI/AhiskaAI-65m-IT-v0.1")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("AhiskaAI/AhiskaAI-65m-IT-v0.1")
model = AutoModelForCausalLM.from_pretrained("AhiskaAI/AhiskaAI-65m-IT-v0.1")

Local Apps

vLLM

How to use AhiskaAI/AhiskaAI-65m-IT-v0.1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "AhiskaAI/AhiskaAI-65m-IT-v0.1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AhiskaAI/AhiskaAI-65m-IT-v0.1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
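
The same request can be issued from Python. The sketch below uses only the standard library and assumes the server started above is listening on `http://localhost:8000` and returns the usual OpenAI-style completions JSON. `build_completion_request` and `complete` are hypothetical helper names, not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "AhiskaAI/AhiskaAI-65m-IT-v0.1"


def build_completion_request(prompt, model=MODEL_ID,
                             base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text of the first choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumes the OpenAI-style response shape: {"choices": [{"text": ...}]}
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` returns the generated continuation. Passing `base_url="http://localhost:30000"` targets a server on port 30000 instead.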

Use Docker

docker model run hf.co/AhiskaAI/AhiskaAI-65m-IT-v0.1

SGLang

How to use AhiskaAI/AhiskaAI-65m-IT-v0.1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "AhiskaAI/AhiskaAI-65m-IT-v0.1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AhiskaAI/AhiskaAI-65m-IT-v0.1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "AhiskaAI/AhiskaAI-65m-IT-v0.1" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use AhiskaAI/AhiskaAI-65m-IT-v0.1 with Docker Model Runner:
```
docker model run hf.co/AhiskaAI/AhiskaAI-65m-IT-v0.1
```

AhiskaAI 65m IT v0.1 (Instruction Tuned)

AhiskaAI 65m IT v0.1 is a compact, instruction-tuned Small Language Model (SLM) for Turkish.

This model was not fine-tuned on top of existing open-source weights. It was instruction-tuned directly on our own foundation model, AhıskaAI 65m Base v0.1, which was pre-trained from scratch for one full epoch on a 5.3 GB Turkish corpus. For the alignment phase (SFT), we used a filtered and curated Turkish Alpaca dataset, aiming to improve step-by-step instruction following, formatting accuracy, and fluency while removing noisy examples.

🧬 The Pipeline: From Scratch to Instruction

Our research lab follows a strict vertical integration philosophy:

Phase 1 (Base Model): Initialized LlamaForCausalLM from scratch, with no pretrained weights. Pre-trained on 5.3 GB of clean Turkish text to learn grammar, token patterns, and core semantics (AhıskaAI 65m Base v0.1).
Phase 2 (Instruction Tuning): Supervised Fine-Tuning (SFT) of the base checkpoint on our custom-filtered Alpaca instructions. This phase taught formatting discipline, numbered-list output (1. 2. 3.), and multi-turn response compliance.

📊 Technical Architecture & Hyperparameters

These values are taken directly from the model's config.json. The model uses a standard LLaMA architecture sized for fast local inference:

Architecture: LlamaForCausalLM
Parameters: ~65 Million
Context Length (max_position_embeddings): 1024 tokens (the same window as the original GPT-2)
Vocabulary Size: 32,000 tokens (Custom BPE trained for Turkish root-suffix morphology)
Hidden Dimension (hidden_size): 512
Intermediate Layer Dimension (intermediate_size): 1376
Hidden Layers (num_hidden_layers): 12
Attention Heads: 8 query heads and 8 key/value heads (num_attention_heads = num_key_value_heads = 8, i.e. standard multi-head attention)
Activation Function: SiLU (silu)
Normalization: RMSNorm (rms_norm_eps: 1e-06)
Positional Embeddings: RoPE (rope_type: default, theta: 10000.0)
Data Precision: float32
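
As a rough cross-check of the figures above, the parameter count can be derived from the config values alone. This is a minimal sketch that assumes the standard Llama layout with bias-free linear layers and a gated MLP (gate, up, and down projections). Whether the input and output embeddings are tied is not stated here, so both cases are computed. `llama_param_count` is a hypothetical helper name.

```python
def llama_param_count(vocab_size, hidden_size, intermediate_size,
                      num_layers, num_heads, num_kv_heads,
                      tie_embeddings=False):
    """Count weights in a Llama-style decoder, assuming no bias terms."""
    head_dim = hidden_size // num_heads
    embed = vocab_size * hidden_size
    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are
    # hidden x (num_kv_heads * head_dim)
    attn = 2 * hidden_size * hidden_size + 2 * hidden_size * num_kv_heads * head_dim
    # Gated MLP: gate_proj, up_proj, down_proj
    mlp = 3 * hidden_size * intermediate_size
    # Two RMSNorm weight vectors per layer
    norms = 2 * hidden_size
    total = embed + num_layers * (attn + mlp + norms) + hidden_size  # + final norm
    if not tie_embeddings:
        total += vocab_size * hidden_size  # separate lm_head matrix
    return total


config = dict(vocab_size=32000, hidden_size=512, intermediate_size=1376,
              num_layers=12, num_heads=8, num_kv_heads=8)

untied = llama_param_count(**config)
tied = llama_param_count(**config, tie_embeddings=True)
print(f"untied embeddings: {untied:,}")  # 70,726,144
print(f"tied embeddings:   {tied:,}")    # 54,342,144
```

Under these assumptions the untied total is about 70.7M, which lines up with the 70.7M parameter figure reported in the Safetensors metadata on the model page.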

💻 Hardware Efficiency & "Build in Public"

Training & Alignment Hardware: NVIDIA GeForce RTX 4050 Laptop GPU (6GB VRAM)
Inference Footprint: About 202 MB on disk. The model is small enough to run on CPU-only hardware such as free Hugging Face CPU Spaces, so no cloud GPU hosting is required.

🛠️ Quickstart Usage (Alpaca Format)

To get well-formed responses from the instruction-tuned model, prompt it with the exact token structure it was aligned with:

from transformers import LlamaForCausalLM, AutoTokenizer
import torch

model_name = "AhiskaAI/AhiskaAI-65m-IT-v0.1"

# Load the custom-built architecture and vocabulary
model = LlamaForCausalLM.from_pretrained(model_name).to("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained(model_name)

def ask_ahiska_it(instruction):
    # Prompt template as published with this snippet (ChatML-style role markers)
    prompt = f"<|im_start|>user\n{instruction}<|im_end|>\n<|im_start|>assistant\n"

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=250,  # total length in tokens, prompt included
            do_sample=True,
            top_k=40,
            top_p=0.92,
            temperature=0.55,  # low temperature keeps the small model focused
            repetition_penalty=1.18,
        )

    # Decode only the newly generated tokens so the echoed prompt is dropped
    prompt_len = inputs["input_ids"].shape[1]
    response = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
    return response.strip()

# Run a test inference
print(ask_ahiska_it("Sağlıklı yaşamak için 3 ipucu ver"))
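
The snippet above wraps the instruction in ChatML-style markers, while the section title refers to the Alpaca format. As a sketch for trying both, the helpers below build either prompt as a plain string. The Alpaca wording is the standard no-input template from the Stanford Alpaca project and is an assumption here, since the exact string used during SFT is not stated. `build_prompt` and `extract_response` are hypothetical helper names.

```python
# Standard Stanford Alpaca no-input template (assumed, not confirmed for this model)
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

# ChatML-style template, copied from the quickstart snippet
CHATML_TEMPLATE = "<|im_start|>user\n{instruction}<|im_end|>\n<|im_start|>assistant\n"

TEMPLATES = {"alpaca": ALPACA_TEMPLATE, "chatml": CHATML_TEMPLATE}


def build_prompt(instruction: str, style: str = "chatml") -> str:
    """Fill the chosen template with a single-turn instruction."""
    if style not in TEMPLATES:
        raise ValueError(f"unknown prompt style: {style!r}")
    return TEMPLATES[style].format(instruction=instruction)


def extract_response(full_text: str, prompt: str) -> str:
    """Drop the echoed prompt from decoded output when it is present."""
    if full_text.startswith(prompt):
        full_text = full_text[len(prompt):]
    return full_text.strip()
```

Generating with each style on a few instructions and comparing the outputs is a quick way to see which template the checkpoint responds to best.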

Safetensors model size: 70.7M params · Tensor type: F32