Kubernetes AI - Gemma 3 12B LoRA Adapters

Fine-tuned Gemma 3 12B model specialized for answering Kubernetes questions in Turkish.

Model Description

This model consists of LoRA adapters fine-tuned on unsloth/gemma-3-12b-it-qat-bnb-4bit using a comprehensive dataset of Kubernetes documentation, Stack Overflow questions, and DevOps scenarios.

Primary Purpose: Answer Kubernetes-related questions in Turkish.

Use Cases

Kubernetes cluster management and troubleshooting
YAML configuration generation and validation
kubectl command assistance
Debugging pod, service, and deployment issues
Kubernetes best practices and concepts
DevOps workflow optimization
Turkish language Kubernetes Q&A

Quick Start

Loading the Model

from unsloth import FastLanguageModel
from peft import PeftModel
import torch

# Load base Gemma 3 12B model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-12b-it-qat-bnb-4bit",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,  # Use 4-bit quantization to fit in GPU memory
)

# Load Kubernetes AI LoRA adapters
model = PeftModel.from_pretrained(
    model,
    "aciklab/kubernetes-ai"
)

# Enable inference mode
FastLanguageModel.for_inference(model)

# Example usage (Turkish question)
messages = [
    {"role": "user", "content": "Kubernetes'te 3 replikaya sahip bir deployment nasıl oluştururum?"}
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to("cuda")

outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

# Decode only the newly generated tokens so the prompt is not echoed back
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
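
To ask several questions with the same setup, the one-turn `messages` structure used above can be produced by a small helper. This is a minimal sketch; `build_messages` is a hypothetical name, not an Unsloth or PEFT function, and the two sample questions are taken from this card.

```python
def build_messages(question: str) -> list[dict[str, str]]:
    """Wrap a single question as a one-turn chat, as in the Quick Start."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    return [{"role": "user", "content": question}]


examples = [
    "Kubernetes'te 3 replikaya sahip bir deployment nasıl oluştururum?",
    "What are the common causes of CrashLoopBackOff?",
]

# One message list per question; each can be passed to apply_chat_template.
batches = [build_messages(q) for q in examples]
```

Each entry in `batches` has the same shape as the `messages` list in the Quick Start, so it can be passed to `tokenizer.apply_chat_template` unchanged.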

Example Questions

Turkish Examples

# Deployment creation
"Node.js uygulaması için 3 replika, sağlık kontrolleri ve kaynak limitleri olan bir Kubernetes deployment oluştur."

# Troubleshooting
"Pod'um CrashLoopBackOff durumunda. Yaygın nedenleri nelerdir ve nasıl debug ederim?"

# kubectl commands
"Production namespace'indeki çalışmayan tüm pod'ları gösteren kubectl komutunu yaz."

# Best practices
"Kubernetes'te container güvenliği için en iyi uygulamalar nelerdir?"

# Service creation
"LoadBalancer tipinde bir Kubernetes servisi nasıl yapılandırılır?"

English Examples

"How do I create a Kubernetes deployment with 3 replicas?"
"What are the common causes of CrashLoopBackOff?"
"Show me kubectl command to get all pods in production namespace."

Training Dataset

The model was trained on ~157,000 examples from multiple high-quality Kubernetes and DevOps datasets:

Dataset	Count	Description
Kubernetes Official Documentation
- Concepts	2,700	Core Kubernetes concepts
- Kubectl Reference	600	kubectl command documentation
- Setup Guides	430	Installation and setup
- Tasks	4,300	Practical task guides
- Tutorials	880	Step-by-step tutorials
Stack Overflow
mcipriano/stackoverflow-kubernetes-questions	30,000	Kubernetes Q&A
peterpanpan/stackoverflow-kubernetes-questions	22,000	Additional Kubernetes Q&A
DevOps Datasets
Szaid3680/Devops	42,000	General DevOps content
ahmedgongi/Devops_LLM	20,500	Kubernetes-filtered DevOps (from 140k)
Configuration & Operations
HelloBoieeee/kubernetes_config	10,000	Kubernetes configurations
sidddddddddddd/kubernetes-with-ood	6,000	Kubernetes scenarios (incl. Turkish translations)
dereklck/kubernetes_cli_dataset_20k	19,000	kubectl CLI examples
dereklck/kubernetes_operator_3b_1.5k	1,800	Kubernetes operator patterns

Total Training Examples: ~157,210
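
The relative weight of each source group can be checked by summing the counts in the table. The sketch below uses the per-dataset figures exactly as listed; those figures appear to be rounded, so the computed sum is approximate and is not expected to match the stated total exactly.

```python
# Counts copied from the table above, grouped as in the table.
counts = {
    "Kubernetes Official Documentation": {
        "Concepts": 2_700,
        "Kubectl Reference": 600,
        "Setup Guides": 430,
        "Tasks": 4_300,
        "Tutorials": 880,
    },
    "Stack Overflow": {
        "mcipriano/stackoverflow-kubernetes-questions": 30_000,
        "peterpanpan/stackoverflow-kubernetes-questions": 22_000,
    },
    "DevOps Datasets": {
        "Szaid3680/Devops": 42_000,
        "ahmedgongi/Devops_LLM": 20_500,
    },
    "Configuration & Operations": {
        "HelloBoieeee/kubernetes_config": 10_000,
        "sidddddddddddd/kubernetes-with-ood": 6_000,
        "dereklck/kubernetes_cli_dataset_20k": 19_000,
        "dereklck/kubernetes_operator_3b_1.5k": 1_800,
    },
}

group_totals = {group: sum(items.values()) for group, items in counts.items()}
listed_total = sum(group_totals.values())

for group, total in group_totals.items():
    print(f"{group}: {total:,} ({total / listed_total:.1%})")
print(f"Sum of listed counts: {listed_total:,}")
```

By these figures, the Stack Overflow and DevOps groups together account for roughly 70 percent of the listed examples.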

Training Details

Base Model: unsloth/gemma-3-12b-it-qat-bnb-4bit
Method: LoRA (Low-Rank Adaptation)
Framework: Unsloth
LoRA Rank: 8
Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Training Checkpoint: checkpoint-8175
Max Sequence Length: 1024 tokens
Training Time: 28 hours
Hardware: NVIDIA GeForce RTX 5070 12GB
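
The settings above correspond to fields that PEFT stores in an adapter's `adapter_config.json`. The sketch below writes out only the values stated in this card. The `task_type` value is an assumption (`CAUSAL_LM` is PEFT's usual setting for text generation), and `lora_alpha`, dropout, and other hyperparameters are left out because the card does not state them.

```python
import json

# Values taken from the Training Details list, using PEFT's field names.
adapter_settings = {
    "base_model_name_or_path": "unsloth/gemma-3-12b-it-qat-bnb-4bit",
    "peft_type": "LORA",
    "r": 8,  # LoRA rank
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in this card
}

print(json.dumps(adapter_settings, indent=2))
```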

Hardware Requirements

Minimum VRAM: 12GB (with 4-bit quantization)
Recommended VRAM: 24GB (for faster inference)
CPU RAM: 32GB+
Training Hardware: RTX 5070 12GB
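
A rough back-of-the-envelope estimate shows why 4-bit quantization matters for a 12GB card. The sketch below counts weight storage only and assumes a nominal 12 billion parameters; it ignores the KV cache, activations, quantization metadata, layers kept in higher precision, and the adapter weights, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


params = 12e9  # nominal "12B"; the exact parameter count differs slightly

four_bit = weight_memory_gib(params, 4)
sixteen_bit = weight_memory_gib(params, 16)

print(f"4-bit weights:  ~{four_bit:.1f} GiB")
print(f"16-bit weights: ~{sixteen_bit:.1f} GiB")
```

Under these assumptions the 4-bit weights take well under 12GB, while 16-bit weights alone would not fit.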

Limitations

May lack information on Kubernetes features released after training
Primarily trained for Turkish-language responses, though it can handle English queries
Best suited for technical Kubernetes questions; general conversation capabilities may be limited

Performance Notes

Trained on RTX 5070 12GB in 28 hours
Works with 12GB VRAM using 4-bit quantization
Fast startup: only the adapter weights are loaded on top of the base model, with no full model reload

License

This model is released under the MIT License. Free to use in commercial and open-source projects.

Acknowledgments

Google and the Unsloth team for the Gemma 3 base model
The Unsloth team for the efficient training framework
All dataset contributors
The Kubernetes community for comprehensive documentation
NVIDIA for the RTX 5070, which made the 28-hour training run possible

Contact

For questions or feedback, please open an issue on the model repository.

Note: This is a LoRA adapter, not a full model. You must load it on top of unsloth/gemma-3-12b-it-qat-bnb-4bit to use it.

Citations

Datasets

@misc{stackoverflow-kubernetes-mcipriano,
  author = {mcipriano},
  title = {Stack Overflow Kubernetes Questions},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datasets/mcipriano/stackoverflow-kubernetes-questions}
}

@misc{devops-szaid,
  author = {Szaid3680},
  title = {DevOps Dataset},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datasets/Szaid3680/Devops}
}

@misc{devops-llm-ahmed,
  author = {ahmedgongi},
  title = {DevOps LLM Dataset},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datasets/ahmedgongi/Devops_LLM}
}

@misc{kubernetes-config-hello,
  author = {HelloBoieeee},
  title = {Kubernetes Config Dataset},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datasets/HelloBoieeee/kubernetes_config}
}

@misc{kubernetes-ood-sidddddddddddd,
  author = {sidddddddddddd},
  title = {Kubernetes with OOD Dataset},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datasets/sidddddddddddd/kubernetes-with-ood}
}

@misc{stackoverflow-kubernetes-peter,
  author = {peterpanpan},
  title = {Stack Overflow Kubernetes Questions},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datasets/peterpanpan/stackoverflow-kubernetes-questions}
}

@misc{kubernetes-operator-derek,
  author = {dereklck},
  title = {Kubernetes Operator Dataset},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datasets/dereklck/kubernetes_operator_3b_1.5k}
}

@misc{kubernetes-cli-derek,
  author = {dereklck},
  title = {Kubernetes CLI Dataset},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/datasets/dereklck/kubernetes_cli_dataset_20k}
}

Model

@misc{kubernetes-ai,
  author = {aciklab},
  title = {Kubernetes AI Turkish - Gemma 3 12B LoRA Adapters},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/aciklab/kubernetes-ai},
  note = {Trained on RTX 5070 12GB in 28 hours}
}
