---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-7B
pipeline_tag: text-generation
tags:
- not-for-all-audiences
language:
- en
library_name: transformers
---

## Model Description

This model was created by analyzing layers from a pool of other Qwen2.5-7B models and selecting, for each position, the layer with the best dimensional utilization efficiency, as measured by the Normalized Effective Rank (NER). NER is computed as follows (a short code sketch follows the steps):

Singular Value Decomposition:
   - Input: weight matrix A ∈ R^(m×n) # m = number of output features, n = number of input features
   - Compute singular values σᵢ, where σᵢ ≥ 0 # σᵢ represents the importance of each dimension
   - Keep only values above a numerical threshold (> 1e-12) # removes numerical noise from the computation

Distribution Normalization:
   - Sum all singular values: S = Σσᵢ # S acts as the normalization factor
   - Create a probability distribution: pᵢ = σᵢ/S # converts singular values to probabilities summing to 1

Entropy Calculation:
   - Compute the Shannon entropy: H = -Σ(pᵢ · log₂(pᵢ)) # measures the information content of the distribution
   - Compute the maximum possible entropy: H_max = log₂(n), where n is the number of retained singular values # maximum entropy occurs when all dimensions contribute equally

Normalization:
   - Final NER score = H / H_max # normalizes the score to the [0,1] range
   - The result is a value between 0 and 1 # 0 = single-dimension dominance, 1 = perfectly uniform dimensional utilization
   - Higher scores indicate more uniform dimensional utilization
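
A minimal, self-contained sketch of this computation (not the exact code from the linked ner_merge.py, and assuming PyTorch is available) could look like this:

```python
import torch

def normalized_effective_rank(weight: torch.Tensor, eps: float = 1e-12) -> float:
    """Compute the Normalized Effective Rank of a 2-D weight matrix."""
    # Singular Value Decomposition: only the singular values are needed
    sigma = torch.linalg.svdvals(weight.float())
    # Keep values above the numerical threshold to drop numerical noise
    sigma = sigma[sigma > eps]
    if sigma.numel() < 2:
        return 0.0  # degenerate case: H_max would be zero
    # Probability distribution over singular values: p_i = sigma_i / S
    p = sigma / sigma.sum()
    # Shannon entropy H = -sum(p_i * log2(p_i)) and maximum entropy H_max = log2(n)
    h = -(p * torch.log2(p)).sum()
    h_max = torch.log2(torch.tensor(float(p.numel())))
    return (h / h_max).item()  # NER in [0, 1]
```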

## Creating Composite Model

Code here: https://huggingface.co/jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0/blob/main/ner_merge.py

Layer Analysis:
   - Download the base and fine-tuned models from the Hugging Face Hub
   - Calculate the Normalized Effective Rank (NER) of each layer in each model

Layer Selection:
   - Identify the layer structure common to all models
   - For each layer, record the (model, layer) pair with the highest NER score

Model Composition:
   - Incrementally build a composite model, taking each layer from the model in the pool with the highest NER for that layer (see the sketch after this list)

Output Generation:
   - Save merge reports documenting the source of each layer
   - Copy the config and tokenizer files from the base model
   - Save the composite model with complete weights # model ready to use
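
A simplified sketch of the per-layer selection step described above (the full implementation, including weight loading and saving, is in the linked ner_merge.py):

```python
def select_best_layers(ner_scores: dict[str, dict[str, float]]) -> dict[str, str]:
    """Map each layer name to the model whose copy of that layer has the highest NER.

    ner_scores: model_name -> {layer_name: NER score}  # hypothetical structure
    """
    # Only layers present in every model are considered (common layer structure)
    common_layers = set.intersection(*(set(scores) for scores in ner_scores.values()))
    return {
        layer: max(ner_scores, key=lambda model: ner_scores[model][layer])
        for layer in sorted(common_layers)
    }
```

The composite state dict is then assembled by copying each selected layer's weights from its source model.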

Config file:

```yaml
base_model: "Qwen/Qwen2.5-7B"
fine_tuned_models: # uncomment the models you want to merge
#- "Qwen/Qwen2.5-7B"
#- "Qwen/Qwen2.5-7B-Instruct"
#- "FourOhFour/Vapor_v2_7B"
#- "Goekdeniz-Guelmez/Josiefied-Qwen2.5-7B-Instruct-abliterated-v2"
#- "happzy2633/qwen2.5-7b-ins-v3"
#- "huihui-ai/Qwen2.5-7B-Instruct-abliterated-v2"
#- "HumanLLMs/Humanish-Qwen2.5-7B-Instruct"
#- "Orion-zhen/Qwen2.5-7B-Instruct-Uncensored"
#- "Orion-zhen/Meissa-Qwen2.5-7B-Instruct"
#- "jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0"
#- "rombodawg/Rombos-LLM-V2.5-Qwen-7b"
#- "Cran-May/T.E-8.1"
#- "thomas-yanxin/XinYuan-Qwen2.5-7B-0917"
#- "beomi/Qwen2.5-7B-Instruct-kowiki-qa"
#- "Orion-zhen/Qwen2.5-7B-Gutenberg-KTO"
#- 'fblgit/cybertron-v4-qw7B-MGS'
#- 'nguyentd/FinancialAdvice-Qwen2.5-7B'
#- "Qwen/Qwen2.5-Coder-7B-Instruct"
#- "Qwen/Qwen2.5-Math-7B-Instruct"
#- "Qwen/Qwen2.5-Coder-7B"
#- "Qwen/Qwen2.5-Math-7B"
#- "WhiteRabbitNeo/WhiteRabbitNeo-2.5-Qwen-2.5-Coder-7B"
#- "edgerunner-ai/EdgeRunner-Command-Nested"
#- "katanemo/Arch-Function-7B"
models_dir: "./input_models/"
output_dir: "./merged_model/"
metric_dir: "./metrics/"
```
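
For reference, one hypothetical way to load a config like this with PyYAML (the filename is an assumption, not taken from ner_merge.py):

```python
import yaml

with open("merge_config.yaml") as f:  # assumed filename
    cfg = yaml.safe_load(f)

base_model = cfg["base_model"]
# When every entry under fine_tuned_models is commented out, YAML parses it as None
fine_tuned_models = cfg.get("fine_tuned_models") or []
models_dir = cfg.get("models_dir", "./input_models/")
```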