---
license: apache-2.0
language:
- ru
- en
base_model:
- Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
library_name: transformers
---
# FractalGPT/RuQwen2.5-3b-instruct

## Model Overview
RuQwen2.5-3b-instruct by FractalGPT is a language model tailored to deliver high-quality Russian language output. Building upon the Qwen2.5 series, it is optimized for Russian-language tasks while retaining broad multilingual support.
**Improved Russian Language Quality:** Adaptations have significantly enhanced the fluency, accuracy, and coherence of Russian text generation, making it an excellent choice for Russian-language applications.
## Model Specifications
- Type: Instruction-tuned Causal Language Model
- Training Stages: Pretraining & Instruction Tuning
- Architecture: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- Layers: 36
- Attention Heads (GQA): 24 for Q, 4 for KV
- Context Length: 131,072 tokens, with generation of up to 8,192 tokens
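
## Usage

The model can be loaded as a standard 🤗 Transformers causal LM, following the usual Qwen2.5-Instruct quickstart pattern. A minimal sketch (the prompt text and generation settings below are illustrative, not prescribed by this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FractalGPT/RuQwen2.5-3b-instruct"

# Load the tokenizer and model (weights are downloaded on first use)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Example Russian-language chat; content is illustrative
messages = [
    {"role": "system", "content": "Ты полезный ассистент."},
    {"role": "user", "content": "Расскажи кратко о Москве."},
]

# Build the prompt using the model's built-in chat template
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate up to 512 new tokens (the model supports up to 8,192)
output_ids = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, dropping the prompt
response = tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
)
print(response)
```

Note that loading in full precision requires roughly 6–7 GB of memory for a 3B-parameter model; `device_map="auto"` lets accelerate place the weights on the available GPU or CPU.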