AhiskaAI 65m IT v0.1 (Instruction Tuned)

AhiskaAI 65m IT v0.1 is a highly efficient, custom-aligned Small Language Model (SLM) for the Turkish language ecosystem.

This model was not fine-tuned on top of generic open-source weights. Instead, it was instruction-tuned directly on our proprietary foundation model, AhıskaAI 65m Base v0.1, which was pre-trained from scratch for one full epoch on a 5.3 GB Turkish corpus. For the alignment phase (SFT), we used a strictly filtered and curated Turkish Alpaca dataset to improve step-by-step instruction following, formatting accuracy, and fluency, while removing noisy examples.


🧬 The Pipeline: From Scratch to Instruction

Our research lab follows a strict vertical integration philosophy:

  1. Phase 1 (Base Model): Initialized LlamaForCausalLM with random weights (no pre-trained checkpoint). Pre-trained on 5.3 GB of clean Turkish text to learn grammar, token-level patterns, and core semantics (AhıskaAI 65m Base v0.1).
  2. Phase 2 (Instruction Tuning): Supervised Fine-Tuning (SFT) over the base checkpoint using our custom-filtered Alpaca instructions. This phase taught formatting discipline, numbered-list output (1. 2. 3.), and multi-turn response compliance.
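
This card does not state the actual filtering rules used for the Alpaca data. As a rough illustration only, the sketch below shows what a basic filter over Alpaca-style records (dicts with "instruction", "input", "output" keys) could look like. The function name, the minimum-length threshold, and the deduplication rule are all hypothetical assumptions, not the authors' pipeline.

```python
def filter_alpaca(records, min_output_chars=10):
    """Keep well-formed, non-duplicate Alpaca-style records.

    Every criterion here is an illustrative assumption, not the rule set
    actually used to build the AhiskaAI SFT dataset.
    """
    seen = set()
    kept = []
    for rec in records:
        instruction = (rec.get("instruction") or "").strip()
        output = (rec.get("output") or "").strip()

        # Drop records with no instruction or a near-empty answer
        if not instruction or len(output) < min_output_chars:
            continue

        # Drop exact duplicates, ignoring case and surrounding whitespace
        key = instruction.casefold()
        if key in seen:
            continue
        seen.add(key)

        kept.append({
            "instruction": instruction,
            "input": (rec.get("input") or "").strip(),
            "output": output,
        })
    return kept


sample = [
    {"instruction": "Sağlıklı yaşamak için 3 ipucu ver", "input": "",
     "output": "1. Dengeli beslen. 2. Spor yap. 3. Yeterince uyu."},
    {"instruction": "  sağlıklı yaşamak için 3 ipucu ver ",
     "output": "1. Su iç. 2. Yürü. 3. Dinlen."},          # duplicate
    {"instruction": "", "output": "Yeterince uzun bir cevap."},  # no instruction
    {"instruction": "Bir şiir yaz", "output": "ok"},            # answer too short
]
cleaned = filter_alpaca(sample)
print(len(cleaned))
```

Real curation would likely add language checks and manual review on top of mechanical rules like these.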

📊 Technical Architecture & Hyperparameters

The values below are taken directly from the model's config.json. The model uses a standard LLaMA layout sized for fast local compute:

  • Architecture: LlamaForCausalLM
  • Parameters: ~65 Million
  • Context Length (max_position_embeddings): 1024 tokens (the same window as the original GPT-2)
  • Vocabulary Size: 32,000 tokens (Custom BPE trained for Turkish root-suffix morphology)
  • Hidden Dimension (hidden_size): 512
  • Intermediate Layer Dimension (intermediate_size): 1376
  • Hidden Layers (num_hidden_layers): 12
  • Attention Heads: 8 (num_attention_heads and num_key_value_heads are both 8, i.e. standard multi-head attention with no key/value sharing)
  • Activation Function: SiLU (silu)
  • Normalization: RMSNorm (rms_norm_eps: 1e-06)
  • Positional Embeddings: RoPE (rope_type: default, theta: 10000.0)
  • Data Precision: float32
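
The hyperparameters above are enough to estimate the parameter count by hand. The sketch below assumes the stock LLaMA layout with no bias terms (the default in Hugging Face's LlamaConfig). Whether the input embedding and the output head share weights is not stated in this card, so both cases are computed. The function name `llama_param_count` is a hypothetical helper.

```python
def llama_param_count(vocab_size, hidden_size, intermediate_size,
                      num_layers, num_heads, num_kv_heads, tie_embeddings):
    """Count weights in a bias-free LLaMA-style decoder (assumed layout)."""
    head_dim = hidden_size // num_heads

    embed = vocab_size * hidden_size

    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are
    # hidden x (num_kv_heads * head_dim)
    attn = 2 * hidden_size * hidden_size \
         + 2 * hidden_size * (num_kv_heads * head_dim)

    # gate_proj, up_proj, down_proj
    mlp = 3 * hidden_size * intermediate_size

    # two RMSNorm weight vectors per layer
    norms = 2 * hidden_size

    total = embed + num_layers * (attn + mlp + norms) + hidden_size  # + final norm
    if not tie_embeddings:
        total += vocab_size * hidden_size  # separate lm_head matrix
    return total


config = dict(vocab_size=32000, hidden_size=512, intermediate_size=1376,
              num_layers=12, num_heads=8, num_kv_heads=8)

tied = llama_param_count(**config, tie_embeddings=True)
untied = llama_param_count(**config, tie_embeddings=False)
print(f"tied embeddings:   {tied:,}")    # 54,342,144
print(f"untied embeddings: {untied:,}")  # 70,726,144
```

The nominal ~65 million figure sits between these two totals; the 32,000 x 512 embedding matrix alone accounts for about 16.4 million weights, so the tying choice shifts the count noticeably at this scale.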

💻 Hardware Efficiency & "Build in Public"

  • Training & Alignment Hardware: NVIDIA GeForce RTX 4050 Laptop GPU (6GB VRAM)
  • Inference Footprint: About 202 MB on disk. The model runs quickly even on free Hugging Face CPU Spaces, so no paid cloud GPU hosting is required.

🛠️ Quickstart Usage (Alpaca Format)

To get well-formed answers from the instruction-tuned model, prompt it with the same template it was aligned with:

from transformers import LlamaForCausalLM, AutoTokenizer
import torch

model_name = "AhiskaAI/AhiskaAI-65m-IT-v0.1"

# Load the custom-built architecture and vocabulary
model = LlamaForCausalLM.from_pretrained(model_name).to("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained(model_name)

def ask_ahiska_it(instruction):
    # Prompt template (ChatML-style turn markers around the instruction)
    prompt = f"<|im_start|>user\n{instruction}<|im_end|>\n<|im_start|>assistant\n"

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=250,  # counts prompt tokens plus generated tokens
            do_sample=True,
            top_k=40,
            top_p=0.92,
            temperature=0.55,  # low temperature keeps the small model's output focused
            repetition_penalty=1.18
        )

    # Decode only the newly generated tokens, dropping the echoed prompt
    prompt_length = inputs["input_ids"].shape[1]
    response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
    return response.strip()

# Run a test inference
print(ask_ahiska_it("Sağlıklı yaşamak için 3 ipucu ver"))
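
One detail of the generation call worth spelling out: in transformers, `max_length` bounds the prompt and the generated tokens together, so a longer prompt leaves less room for the answer (`max_new_tokens` is the alternative that bounds only the generated part). The sketch below works through that arithmetic for the 1024-token context; `new_token_budget` is a hypothetical helper, not part of any library.

```python
def new_token_budget(prompt_tokens, max_length=250, context_length=1024):
    """Tokens left for the answer when generation is capped by max_length.

    max_length covers prompt + generated tokens, and the model cannot
    attend beyond its context window, so the tighter limit applies.
    """
    limit = min(max_length, context_length)
    return max(0, limit - prompt_tokens)


print(new_token_budget(30))                    # 220
print(new_token_budget(300))                   # 0: prompt already exceeds max_length
print(new_token_budget(30, max_length=2000))   # 994: capped by the context window
```

For long instructions, raising `max_length` or switching to `max_new_tokens` avoids truncated answers.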