Ponimash's picture
Update README.md
24882be verified
|
raw
history blame
1.05 kB
metadata
license: apache-2.0
language:
  - ru
  - en
base_model:
  - Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
library_name: transformers

FractalGPT/RuQwen2.5-3b-instruct


Model Overview

  • RuQwen2.5-3b-instruct by FractalGPT is a language model tailored to deliver high-quality Russian language output. Building upon the Qwen2.5 series, it is optimized for Russian-language tasks while retaining broad multilingual support.

  • Improved Russian Language Quality: Adaptations have significantly enhanced the fluency, accuracy, and coherence of Russian text generation, making it an excellent choice for Russian-language applications.

Model Specifications

  • Type: Instruction-tuned Causal Language Model
  • Training Stages: Pretraining & Instruction Tuning
  • Architecture: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
  • Layers: 36
  • Attention Heads (GQA): 24 for Q, 4 for KV
  • Context Length: Supports a full context of 131,072 tokens and generation of up to 8,192 tokens