---
license: mit
library_name: transformers
tags:
- mergekit
- merge
base_model:
- Qwen/Qwen2.5-7B-Instruct-1M
- Sakalti/SJT-7B-1M
- Triangle104/Q2.5-Instruct-1M_Harmony
- bunnycore/Qwen2.5-7B-RRP-1M
- huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated
model-index:
- name: Qwen2.5-7B-CelestialHarmony-1M
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 59.44
      name: strict accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 34.51
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 33.01
      name: exact match
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 9.17
      name: acc_norm
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 16.74
      name: acc_norm
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 37.63
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
      name: Open LLM Leaderboard
---
# ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M

ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M is a custom merged language model based on Qwen2.5-7B with enhanced reasoning, roleplaying, and long-context capabilities. This model supports context lengths of up to 1 million tokens, making it well suited to ultra-long text processing, deep reasoning tasks, and immersive roleplay interactions.
Quantized weights are available in GGUF format, provided by mradermacher:

1. GGUF
2. imatrix GGUF
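If you prefer to run the GGUF quants locally, a minimal sketch using `llama-cpp-python` is shown below. The quant filename is illustrative (download one of the GGUF files from the repos above), and the context window is deliberately reduced, since a full 1M-token cache requires far more memory than a typical workstation has.

```python
# Minimal llama-cpp-python sketch; the GGUF filename is illustrative.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen2.5-7B-CelestialHarmony-1M.Q4_K_M.gguf",  # hypothetical local file
    n_ctx=32768,       # raise as far as your RAM/VRAM allows
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a wise celestial storyteller."},
        {"role": "user", "content": "Tell me a short story about an ancient celestial warrior."},
    ],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```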
## Model Details

- **Base Model:** Qwen/Qwen2.5-7B-Instruct-1M
- **Models Used in Merge:**
  - Qwen/Qwen2.5-7B-Instruct-1M
  - bunnycore/Qwen2.5-7B-RRP-1M
  - Triangle104/Q2.5-Instruct-1M_Harmony
  - Sakalti/SJT-7B-1M
  - huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated
- **Merge Method:** `model_stock` (optimized layer-wise weight averaging)
## Overview

Qwen2.5-7B-CelestialHarmony-1M enhances the Qwen2.5-7B series with a fine-tuned balance of roleplaying dynamics, structured reasoning, and long-context memory. The model is particularly well suited to:

- **Roleplaying**: Immersive character-based storytelling with deep contextual awareness.
- **Reasoning & Thought Processing**: Structured logical thinking, especially when prompted with `<think>` tags (a prompting sketch follows the Quickstart below).
- **Ultra-Long Context Handling**: Efficient processing of sequences up to 1,010,000 tokens using optimized sparse attention.
## Technical Specifications

| Specification | Value |
|---|---|
| Model Type | Causal Language Model |
| Parameters | 7.61B |
| Non-Embedding Parameters | 6.53B |
| Layers | 28 |
| Attention Heads (GQA) | 28 (Q), 4 (KV) |
| Max Context Length | 1,010,000 tokens |
| Max Generation Length | 8,192 tokens |
| Merge Method | Model Stock |
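For intuition on what the 1M-token window costs in memory, here is a rough back-of-envelope estimate of the bf16 KV-cache size. The head dimension of 128 is an assumption taken from the Qwen2.5-7B base configuration rather than from the table above.

```python
# Rough KV-cache estimate for the full ~1M-token context in bf16 (2 bytes/element).
# head_dim = 128 is assumed from the Qwen2.5-7B base model config.
layers, kv_heads, head_dim, dtype_bytes = 28, 4, 128, 2
context_len = 1_010_000

bytes_per_token = 2 * layers * kv_heads * head_dim * dtype_bytes  # K and V
total_gib = bytes_per_token * context_len / 1024**3
print(f"{bytes_per_token / 1024:.0f} KiB per token, ~{total_gib:.1f} GiB at full context")
# -> 56 KiB per token, ~53.9 GiB at full context
```

This is the main reason the vLLM invocation further down uses tensor parallelism and chunked prefill.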
## Merging Details

This model was merged using the Model Stock method, which averages weights from multiple fine-tuned models to create a more efficient, balanced, and performant model.

### Merge YAML Configuration

```yaml
base_model: Qwen/Qwen2.5-7B-Instruct-1M
dtype: bfloat16
merge_method: model_stock
models:
  - model: Qwen/Qwen2.5-7B-Instruct-1M
  - model: Triangle104/Q2.5-Instruct-1M_Harmony
  - model: Sakalti/SJT-7B-1M
  - model: bunnycore/Qwen2.5-7B-RRP-1M
  - model: huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated
tokenizer_source: Qwen/Qwen2.5-7B-Instruct-1M
```
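To reproduce the merge, the configuration above can be fed to mergekit. The sketch below follows mergekit's documented library usage; `pip install mergekit` is assumed and the config/output paths are illustrative.

```python
# Hedged reproduction sketch using mergekit's library API (pip install mergekit).
# The config path and output directory are illustrative.
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("celestial_harmony_merge.yaml") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    out_path="./Qwen2.5-7B-CelestialHarmony-1M",
    options=MergeOptions(
        cuda=True,             # set False to merge on CPU
        copy_tokenizer=True,   # matches tokenizer_source in the config
        lazy_unpickle=True,    # lower peak RAM while loading shards
    ),
)
```

The same merge can also be run from the command line with the `mergekit-yaml` CLI.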
## Quickstart

### Install Required Packages

Ensure you have the latest `transformers` library installed:

```bash
pip install transformers torch accelerate
```
### Load and Use the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Tell me a short story about an ancient celestial warrior."
messages = [
    {"role": "system", "content": "You are a wise celestial storyteller."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(**model_inputs, max_new_tokens=512)
# Strip the prompt tokens so only the model's reply is decoded.
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
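The merge inherits `<think>`-style reasoning from its RRP and Harmony components. The sketch below pre-fills the assistant turn with an opening `<think>` tag so the model continues with its reasoning before answering; how strongly the model uses the tag depends on the prompt, so treat this as illustrative rather than guaranteed behavior. It reuses the `model` and `tokenizer` objects loaded above.

```python
# Illustrative sketch: elicit visible reasoning by pre-filling a <think> tag.
messages = [
    {"role": "system", "content": "Reason step by step inside <think>...</think> before answering."},
    {"role": "user", "content": "A probe travels at 0.1c to a star 4 light-years away. How long does the trip take?"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
text += "<think>"  # generation starts inside the reasoning block

inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```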
## Optimized Deployment with vLLM

For long-context inference, use Qwen's dual-chunk-attention branch of vLLM:

```bash
git clone -b dev/dual-chunk-attn git@github.com:QwenLM/vllm.git
cd vllm
pip install -e . -v
```

Run the model:

```bash
vllm serve ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M \
  --tensor-parallel-size 4 \
  --max-model-len 1010000 \
  --enable-chunked-prefill --max-num-batched-tokens 131072 \
  --enforce-eager \
  --max-num-seqs 1
```
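Once the server is running, it exposes an OpenAI-compatible API. A minimal client sketch using the `openai` Python package is shown below; the host, port, and placeholder API key reflect vLLM's defaults and may need adjusting for your deployment.

```python
# Query the vLLM server through its OpenAI-compatible endpoint.
# Base URL and dummy API key follow vLLM defaults; adjust as needed.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M",
    messages=[
        {"role": "system", "content": "You are a wise celestial storyteller."},
        {"role": "user", "content": "Tell me a short story about an ancient celestial warrior."},
    ],
    max_tokens=512,
)
print(resp.choices[0].message.content)
```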
## Model Capabilities

- **Roleplay & Storytelling**: Designed for engaging, character-driven interactions.
- **Long-Context Awareness**: Handles texts up to 1M tokens.
- **Logical Thinking & Reasoning**: Supports the `<think>` tag to enhance thought structuring.
- **Optimized Merge Strategy**: Uses Model Stock for superior generalization.
## Acknowledgments

This model is built on top of Qwen2.5-7B, with contributions from bunnycore, Triangle104, Sakalti, and huihui-ai, leveraging the Model Stock merging methodology.

## Open LLM Leaderboard Evaluation Results

Detailed results can be found on the [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M).
| Metric | Value |
|---|---|
| Avg. | 31.75 |
| IFEval (0-Shot) | 59.44 |
| BBH (3-Shot) | 34.51 |
| MATH Lvl 5 (4-Shot) | 33.01 |
| GPQA (0-shot) | 9.17 |
| MuSR (0-shot) | 16.74 |
| MMLU-PRO (5-shot) | 37.63 |