# Qwen2.5-14B-MetaMergev2
This is a merge of pre-trained language models created using mergekit.
## Merge Details

### Merge Method

This model was merged using the DARE TIES merge method, with CultriX/Qwen2.5-14B-Brocav7 as the base model.
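At a high level, DARE TIES treats each donor model as a delta from the base: DARE randomly drops a fraction of each delta's entries (keeping roughly the configured `density`) and rescales the survivors, and TIES then resolves sign conflicts so only contributions that agree with the elected majority sign are added back, scaled by the per-model `weight`. The sketch below is a simplified, per-tensor illustration of that idea; it is not mergekit's implementation, and the function name and arguments are hypothetical.

```python
# Conceptual sketch of a DARE-TIES merge for a single weight tensor.
# NOT mergekit's code: the function name, arguments, and exact drop/rescale
# details are illustrative only.
import torch


def dare_ties_merge(base: torch.Tensor,
                    finetuned: list[torch.Tensor],
                    weights: list[float],
                    densities: list[float]) -> torch.Tensor:
    """Merge several fine-tuned tensors into `base` via DARE + TIES."""
    deltas = []
    for ft, w, d in zip(finetuned, weights, densities):
        delta = ft - base                              # task vector relative to the base
        keep = torch.rand_like(delta) < d              # DARE: keep ~`density` of the entries
        delta = torch.where(keep, delta / d, torch.zeros_like(delta))  # rescale survivors
        deltas.append(w * delta)                       # apply the per-model merge weight

    stacked = torch.stack(deltas)
    elected_sign = torch.sign(stacked.sum(dim=0))      # TIES: majority sign per parameter
    agreeing = torch.where(torch.sign(stacked) == elected_sign,
                           stacked, torch.zeros_like(stacked))
    return base + agreeing.sum(dim=0)                  # add only sign-consistent contributions
```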
### Models Merged

The following models were included in the merge:
- sometimesanotion/Qwenvergence-14B-v3-Prose
- djuna/Q2.5-Veltha-14B-0.5
- CultriX/Qwen2.5-14B-Wernickev3
- CultriX/Qwen2.5-14B-Brocav9
- CultriX/Qwen2.5-14B-partialmergept1
- CultriX/Qwen2.5-14B-Broca
- CultriX/Qwen2.5-14B-Unity
- CultriX/Qwenfinity-2.5-14B
- allknowingroger/QwenSlerp6-14B
### Configuration
The following YAML configuration was used to produce this model:

```yaml
models:
  - model: CultriX/Qwen2.5-14B-Brocav7
    parameters:
      weight: 0.18    # Backbone for logical reasoning and multitask performance.
      density: 0.55   # Balances precision and versatility for critical tasks.
  - model: djuna/Q2.5-Veltha-14B-0.5
    parameters:
      weight: 0.15    # Advanced reasoning contributor with balanced impact.
      density: 0.45   # Retains essential parameters for contextual tasks.
  - model: allknowingroger/QwenSlerp6-14B
    parameters:
      weight: 0.09    # Specialized contributions to MMLU-PRO and multitask performance.
      density: 0.35   # Focused on key parameters for contextual reasoning.
  - model: sometimesanotion/Qwenvergence-14B-v3-Prose
    parameters:
      weight: 0.09    # Supports MATH, GPQA, and MUSR benchmarks without redundancy.
      density: 0.40   # Balanced retention for logical reasoning.
  - model: CultriX/Qwen2.5-14B-Broca
    parameters:
      weight: 0.07    # Logical reasoning and tiny benchmarks contributor.
      density: 0.40   # Ensures critical reasoning parameters are preserved.
  - model: CultriX/Qwenfinity-2.5-14B
    parameters:
      weight: 0.05    # Generalist multitask performer with broad contributions.
      density: 0.40   # Balances multitask performance and precision.
  - model: CultriX/Qwen2.5-14B-Unity
    parameters:
      weight: 0.04    # Enhances MUSR and BBH tasks with unique capabilities.
      density: 0.40   # Retains enough parameters for balanced task support.
  - model: CultriX/Qwen2.5-14B-Wernickev3
    parameters:
      weight: 0.03    # Focused on language understanding and MUSR tasks.
      density: 0.35   # Preserves high-quality parameters without overlap.
  - model: CultriX/Qwen2.5-14B-partialmergept1
    parameters:
      weight: 0.13    # Balanced contributions to multitask benchmarks like MMLU-PRO.
      density: 0.45   # Retains essential parameters without over-representation.
  - model: CultriX/Qwen2.5-14B-Brocav9
    parameters:
      weight: 0.19    # Strong logical reasoning and multitask contributor.
      density: 0.50   # Retains more parameters to maximize impact.

base_model: CultriX/Qwen2.5-14B-Brocav7   # Chosen for its logical reasoning and task versatility.
merge_method: dare_ties                   # Ensures smooth integration of diverse model strengths.

parameters:
  normalize: true    # Ensures consistency in parameter scaling.
  int8_mask: true    # Optimizes memory and computation.

dtype: bfloat16      # Provides high precision with efficient memory usage, ideal for large-scale models.
tokenizer_source: CultriX/Qwen2.5-14B-Brocav7   # Matches the tokenizer to the base model for compatibility.

adaptive_merge_parameters:
  task_weights:
    tinyArc: 1.85            # Logical reasoning priority from Brocav7 and Brocav9.
    tinyHellaswag: 1.7       # Balanced contextual understanding.
    tinyMMLU: 1.9            # Enhanced domain knowledge from multitask models.
    tinyTruthfulQA: 2.2      # Prioritized factual reasoning with Veltha's strength.
    tinyTruthfulQA_mc1: 2.0  # Balanced focus on multiple-choice reasoning.
    tinyWinogrande: 2.0      # Advanced contextual predictions.
    IFEval: 2.5              # Instruction-following maximized with Brocav9.
    BBH: 2.2                 # Strengthened for complex reasoning tasks.
    MATH: 2.4                # High focus on mathematical problem-solving.
    GPQA: 2.15               # Enhanced QA capabilities leveraging Brocav7 and Brocav9.
    MUSR: 2.2                # Balanced for multi-step reasoning improvements.
    MMLU-PRO: 2.35           # High domain multitask performance weight.
  smoothing_factor: 0.03     # Further reduced for sharper task-specific blending, preserving distinct strengths.

gradient_clipping:
  CultriX/Qwen2.5-14B-Brocav7: 0.77                  # Ensures stable contributions while leveraging strong logical reasoning.
  djuna/Q2.5-Veltha-14B-0.5: 0.83                    # Optimized for advanced reasoning and MUSR tasks.
  allknowingroger/QwenSlerp6-14B: 0.80               # Supports contributions to MMLU-PRO while maintaining stability.
  sometimesanotion/Qwenvergence-14B-v3-Prose: 0.79   # Calibrated for precision in MATH, GPQA, and MUSR.
  CultriX/Qwen2.5-14B-Broca: 0.81                    # Fine-tuned for logical reasoning enhancements.
  CultriX/Qwenfinity-2.5-14B: 0.79                   # Balanced for multitask contributions.
  CultriX/Qwen2.5-14B-Unity: 0.81                    # Calibrated to support unique task contributions.
  CultriX/Qwen2.5-14B-Wernickev3: 0.83               # Optimized for high-quality language understanding and GPQA.
  CultriX/Qwen2.5-14B-partialmergept1: 0.82          # Supports balanced multitask performance.
  CultriX/Qwen2.5-14B-Brocav9: 0.83                  # Further optimized for logical reasoning and multitask improvements.
```
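To reproduce the merge, this YAML can be passed to mergekit (for example via its `mergekit-yaml` CLI). Once the merged weights are available, a minimal usage sketch with transformers might look like the following; the repository id is assumed from the model name above and the prompt is only illustrative.

```python
# Hedged usage sketch: loading the merged model with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CultriX/Qwen2.5-14B-MetaMergev2"  # assumed repo id; adjust if published elsewhere

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # matches the dtype used for the merge
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain the TIES part of DARE TIES in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```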