
King-Kode-128K

An AI-powered coding assistant based on DeepSeek-Coder-33B-Instruct, extended to a 128K-token context and augmented with specialized tokens for debugging, refactoring, and performance optimization.

**Model Summary**

  • Base Model: DeepSeek-Coder-33B-Instruct
  • Model Type: Causal language model (autoregressive Transformer)
  • Architecture: Transformer with Flash Attention 2
  • Context Length: 128K tokens (extended from 16K)
  • Training Paradigm: Pretrained on diverse programming languages, optimized for debugging, refactoring, and performance analysis.
  • Inference: Optimized for CUDA 12.2 with FlashAttention for faster computations.

**Key Features**

  • **Extended RoPE (128K Context Length)**: Context window extended from 16K to 128K via dynamic scaling of the rotary position embeddings (RoPE).
  • **AI Debugging & Code Optimization**: Includes specialized tokens for debugging, refactoring, and AI-assisted improvements.
  • **Token Expansion**: Integrates 21+ custom instruction tokens for better AI-guided development.
  • **Efficient Execution**: Supports FlashAttention-2, 4-bit quantization, and optimized CUDA inference (a 4-bit loading sketch follows below).
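
As a quick illustration of the 4-bit path, here is a minimal loading sketch using `BitsAndBytesConfig` from 🤗 Transformers. The quantization settings (`nf4`, FP16 compute) are illustrative assumptions, not configurations shipped with this repository.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 4-bit settings (assumed, not shipped with this repo);
# requires the bitsandbytes package.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "KingyJr/King-Kode-128K",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("KingyJr/King-Kode-128K")
```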

🛠️ Model Details

| Parameter | Value |
| --- | --- |
| Base Model | DeepSeek-Coder-33B-Instruct |
| Context Length | 128,000 tokens (128K) |
| RoPE Scaling | 8x dynamic scaling |
| Precision | FP16 & 4-bit |
| Flash Attention | FlashAttention-2 enabled |
| Tokenizer | DeepSeek 33B tokenizer with custom tokens |
| Architecture | Transformer |
| Layers | 80 |
| Hidden Size | 7168 |
| Attention Heads | 56 |
| Training Data | Python, JavaScript, C++, Java, Rust, SQL, Bash, HTML, JSON, YAML, etc. |
| Batch Size | Optimized for large-scale inference |
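
To see how the 8x dynamic RoPE scaling and the extended context are recorded in the shipped configuration, you can inspect it at load time. The field values shown in the comments are assumptions based on the usual Transformers `rope_scaling` convention; verify them against the actual `config.json`.

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("KingyJr/King-Kode-128K")

# Expected values per the table above (assumed field layout).
print(config.max_position_embeddings)  # e.g. 128000
print(config.rope_scaling)             # e.g. {"type": "dynamic", "factor": 8.0}
```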

Custom Instruction Tokens

King-Kode-128K introduces 21+ custom tokens for AI-powered debugging, code refactoring, and performance optimization.

| Token | Purpose |
| --- | --- |
| `<DEBUG>` | Debug the code and detect issues. |
| `<FIX_CODE>` | Automatically fix errors in the script. |
| `<AUTOCOMPLETE>` | Provide AI-powered code completion. |
| `<AUTO_DEBUG>` | Perform automatic debugging with AI. |
| `<DEBUG_SCRIPT>` | Debug the entire script at once. |
| `<GENERATE>` | Generate code snippets based on user instructions. |
| `<EXPLAIN_ERROR>` | Explain errors in simple terms. |
| `<SUGGEST_FIX>` | Provide AI-powered fixes for detected issues. |
| `<CODE_REFACTOR>` | Suggest and apply better coding practices. |
| `<OPTIMIZE_PERFORMANCE>` | Identify bottlenecks and optimize code. |
| `<FORMAT_CODE>` | Format the code to follow best practices. |
| `<REWRITE_CODE>` | Rewrite existing code for clarity or efficiency. |
| `<CODE_SUMMARY>` | Summarize what the code does. |
| `<AUTO_COMMENT>` | Add meaningful comments to the code. |
| `<RUN_TESTS>` | AI-assisted test execution. |
| `<ANALYZE_COMPLEXITY>` | Analyze algorithm complexity (Big O analysis). |
| `<FIX_BUG>` | Automatically identify and fix bugs. |
| `<CHECK_SYNTAX>` | Check syntax errors in real time. |
| `<AI_ASSIST>` | General AI assistant command for debugging & coding. |
| `<INTELLIGENT_COMPLETION>` | AI-driven code completion for large-scale scripts. |
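
Before building prompts around these tokens, it is worth confirming that they are registered in the tokenizer and map to single token IDs. A minimal check, assuming the tokens ship as added special tokens in `tokenizer.json`:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("KingyJr/King-Kode-128K")

for token in ["<DEBUG>", "<FIX_CODE>", "<OPTIMIZE_PERFORMANCE>"]:
    token_id = tokenizer.convert_tokens_to_ids(token)
    # An ID other than the unknown-token ID means the token is registered.
    print(token, "->", token_id)
```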

How to Use

Load Model in Python

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "KingyJr/King-Kode-128K"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load model in FP16 with FlashAttention-2 (requires the flash-attn package)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)

print("Model and tokenizer loaded successfully!")
```

Example Usage (AI Debugging)

```python
prompt = "<DEBUG> Fix the following Python code:\n\nprint(Hello World)"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(response)
```
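
For repeated use, the pattern above can be wrapped in a small helper. `ask` below is a hypothetical convenience function written for this card, not part of the model's API:

```python
def ask(instruction_token: str, code: str, max_new_tokens: int = 512) -> str:
    """Build a prompt from an instruction token plus code and return the model's reply."""
    prompt = f"{instruction_token} {code}"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example: ask for a complexity analysis of a quadratic-time snippet.
print(ask("<ANALYZE_COMPLEXITY>", "def dedupe(xs):\n    return [x for x in xs if xs.count(x) == 1]"))
```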

Model Files

| File | Description |
| --- | --- |
| `config.json` | Model configuration. |
| `generation_config.json` | Generation settings. |
| `model.safetensors.index.json` | Index of model shards. |
| `model-00001-of-00007.safetensors` through `model-00007-of-00007.safetensors` | Model weights. |
| `special_tokens_map.json` | Special token mapping. |
| `tokenizer.json` | Tokenizer settings. |
| `tokenizer_config.json` | Tokenizer configuration. |

Storage & Deployment

  • Inference Optimizations: FlashAttention, quantization (4-bit & 8-bit), tensor parallelism.
  • Fine-tuning & RLHF: Extendable for custom fine-tuning (see the LoRA sketch below).
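
As a starting point for the fine-tuning path mentioned above, the sketch below attaches a LoRA adapter with the `peft` library. The rank, alpha, and target module names are illustrative assumptions, not recommended hyperparameters for this model.

```python
from peft import LoraConfig, get_peft_model

# Illustrative LoRA settings (assumed); adjust for your hardware and task.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projection names
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)  # reuses the model loaded above
model.print_trainable_parameters()
```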

License & Usage

  • License: Apache 2.0
  • Intended Use: AI-assisted debugging, software development, and optimization.
  • Restrictions: Not for commercial resale without modification.

Citation

If you use King-Kode-128K, please cite:

```bibtex
@article{KingKode128K,
  author  = {KingyJr},
  title   = {King-Kode-128K: AI Debugging and Code Assistant},
  journal = {Hugging Face Models},
  year    = {2025},
  url     = {https://huggingface.co/KingyJr/King-Kode-128K}
}
```

Contributors

  • KingyJr - Lead Developer
  • DeepSeek-AI - Base Model Contributors

Contact & Support

For issues, suggestions, or contributions, please open a GitHub issue or contact KingyJr.


This model is designed to enhance software development with AI-driven debugging, optimization, and code generation. Enjoy coding!
