---
base_model:
- LeroyDyer/Mixtral_AI_Multi_TEST
- LeroyDyer/Mixtral_AI_Cyber_Dolphin_2.0
- LeroyDyer/Mixtral_AI_CyberLAW
- LeroyDyer/Mixtral_AI_CyberBrain_3_0
- LeroyDyer/Mixtral_AI_Cyber_5.0
- LeroyDyer/Mixtral_AI_CyberBrain_2.0
- ezelikman/quietstar-8-ahead
- KoboldAI/Mistral-7B-Erebus-v3
library_name: transformers
tags:
- mergekit
- megamerge
- code
- Cyber-Series
license: mit
language:
- en
datasets:
- Open-Orca/OpenOrca
- cognitivecomputations/dolphin
- WhiteRabbitNeo/WRN-Chapter-2
- WhiteRabbitNeo/WRN-Chapter-1
- gate369/Alpaca-Star
- gate369/alpaca-star-ascii
---

<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>
https://github.com/spydaz

Currently undergoing fine-tuning, as this model contains all previous models!


This model contains many hidden tensors, as it was merged with many LoRA adapters for various tasks such as vision and sound.

The problem was that, for some reason, I could not get the extra heads to show up as they do in other models such as the LLaVA model. I suppose the config.json of this model can be changed to make it a LLaVA model, and yes, it works! That is, it can think and has hidden "think" heads, but you need to configure it. It also has vision heads, but I could not set up the config for those either.

So, hidden talents:

It was also merged with the mothers of these models for Quiet (thoughts) and LLaVA (vision etc.), so the tensors are there. I just did not understand how to fine-tune the additional functionalities, as they need a single training example to populate the hidden tensors, hence the merges. Yet when the model is put in train mode after loading (e.g. by calling `model.train()`), the tensors appear, waiting for training, so just add a PEFT adapter and start the training!
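The "train mode plus PEFT" step described above might look roughly like the sketch below. It is only an illustration, not the author's training script: the repo ID passed to `prepare_for_training` is a placeholder you must supply, the LoRA hyperparameters in `lora_settings` are assumed values, the target module names are the usual Mistral attention projections, and whether the extra tensors actually load through `AutoModelForCausalLM` is untested here.

```python
def lora_settings(rank: int = 16) -> dict:
    """Illustrative LoRA hyperparameters (assumed values, not from the model card)."""
    return {
        "r": rank,
        "lora_alpha": 2 * rank,
        "lora_dropout": 0.05,
        # Standard attention projection names in Mistral-style models.
        "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
        "task_type": "CAUSAL_LM",
    }


def prepare_for_training(repo_id: str):
    """Load a checkpoint, switch it to train mode, and wrap it with a LoRA adapter.

    `repo_id` is a placeholder for the Hugging Face repo of this model.
    """
    # Imported here so lora_settings() works without these packages installed.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained(repo_id)
    model.train()  # PyTorch training mode

    peft_model = get_peft_model(model, LoraConfig(**lora_settings()))
    peft_model.print_trainable_parameters()
    return peft_model
```

Calling `prepare_for_training("<namespace>/<this-model>")` returns a PEFT-wrapped model that can be handed to a trainer of your choice.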


THIS VERSION HAS BEEN UPDATED TO INCLUDE CYBERBRAIN ! (Hidden Tensors)

## Extended capabilities:
  * mistralai/Mistral-7B-Instruct-v0.1 - Prime-Base

  * ChaoticNeutrals/Eris-LelantaclesV2-7b - role play
 
  * ChaoticNeutrals/Eris_PrimeV3-Vision-7B - vision

  * rvv-karma/BASH-Coder-Mistral-7B - coding

  * Locutusque/Hercules-3.1-Mistral-7B - Unhinging

  * KoboldAI/Mistral-7B-Erebus-v3 - NSFW

  * Locutusque/Hyperion-2.1-Mistral-7B - CHAT

  * Severian/Nexus-IKM-Mistral-7B-Pytorch - Thinking

  * NousResearch/Hermes-2-Pro-Mistral-7B - Generalizing
 
  * mistralai/Mistral-7B-Instruct-v0.2 - BASE

  * Nitral-AI/ProdigyXBioMistral_7B - medical

  * Nitral-AI/Infinite-Mika-7b - 128k - Context Expansion enforcement

  * Nous-Yarn-Mistral-7b-128k - 128k - Context Expansion
 
  * yanismiraoui/Yarn-Mistral-7b-128k-sharded

  * ChaoticNeutrals/Eris_Prime-V2-7B - Roleplay


This expert is a companion to the MEGA_MIND 24b CyberSeries. The CyberSeries represents a groundbreaking leap in the realm of language models, integrating a diverse array of expert models into a unified framework. At its core lies Mistral-7B-Instruct-v0.2, a refined instruction-tuned model designed for versatility and efficiency.

Enhanced with an expanded context window and advanced routing mechanisms, Mistral-7B-Instruct-v0.2 exemplifies the power of Mixture of Experts, allowing seamless integration of specialized sub-models. This architecture supports strong performance and scalability, enabling the CyberSeries to tackle a wide range of tasks with speed and accuracy.

Among its illustrious sub-models, the OpenOrca - Mistral-7B-8k shines as a testament to fine-tuning excellence, boasting top-ranking performance in its class. Meanwhile, the Hermes 2 Pro introduces cutting-edge capabilities such as Function Calling and JSON Mode, catering to diverse application needs.

Driven by Reinforcement Learning from AI Feedback, the Starling-LM-7B-beta demonstrates remarkable adaptability and optimization, while the Phi-1.5 Transformer model stands as a beacon of excellence across various domains, from common sense reasoning to medical inference.

With models like BioMistral tailored specifically for medical applications and Nous-Yarn-Mistral-7b-128k excelling in handling long-context data, the MEGA_MIND 24b CyberSeries emerges as a transformative force in the landscape of language understanding and artificial intelligence.

Experience the future of language models with the MEGA_MIND 24b CyberSeries, where innovation meets performance, and possibilities are limitless.

### Models Merged

The following models were included in the merge:
* [LeroyDyer/Mixtral_AI_Multi_TEST](https://huggingface.co/LeroyDyer/Mixtral_AI_Multi_TEST)
* [LeroyDyer/Mixtral_AI_CyberLAW](https://huggingface.co/LeroyDyer/Mixtral_AI_CyberLAW)
* [LeroyDyer/Mixtral_AI_CyberBrain_3_0](https://huggingface.co/LeroyDyer/Mixtral_AI_CyberBrain_3_0)
* [LeroyDyer/Mixtral_AI_Cyber_5.0](https://huggingface.co/LeroyDyer/Mixtral_AI_Cyber_5.0)

### Configuration

The following YAML configuration was used to produce this model:

```yaml

models:
  - model: LeroyDyer/Mixtral_AI_Cyber_Dolphin_2.0
    parameters:
      density: [0.256, 0.512, 0.128] # density gradient
      weight: 0.382
  - model: LeroyDyer/Mixtral_AI_CyberLAW
    parameters:
      density: 0.382
      weight: [0.256, 0.128, 0.256, 0.128] # weight gradient
  - model: LeroyDyer/Mixtral_AI_CyberBrain_3_0
    parameters:
      density: 0.382
      weight: [0.128, 0.512, 0.128, 0.128] # weight gradient
  - model: LeroyDyer/Mixtral_AI_Multi_TEST
    parameters:
      density: 0.382
      weight: [0.128, 0.512, 0.128, 0.128] # weight gradient
  - model: LeroyDyer/Mixtral_AI_Cyber_5.0
    parameters:
      density: 0.382
      weight:
        - filter: mlp
          value: 0.5
        - value: 0
merge_method: ties
base_model: LeroyDyer/Mixtral_AI_Cyber_Dolphin_2.0
parameters:
  normalize: true
  int8_mask: true
dtype: float16

```
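To make the `density`, `weight`, and `merge_method: ties` settings in the configuration above a little more concrete, here is a toy sketch in plain Python. It is a simplified illustration on flat lists, not mergekit's implementation. `expand_gradient` assumes that a list value such as `[0.256, 0.512, 0.128]` is linearly interpolated across the layers, and `ties_merge` follows the general TIES recipe: trim each model's difference from the base to the largest-magnitude `density` fraction, elect a sign per parameter, then average only the contributions that agree with that sign. All function names here are hypothetical.

```python
def expand_gradient(anchors, num_layers):
    """Linearly interpolate a short list of anchor values across num_layers layers."""
    if len(anchors) == 1 or num_layers == 1:
        return [float(anchors[0])] * num_layers
    values = []
    for layer in range(num_layers):
        pos = layer / (num_layers - 1) * (len(anchors) - 1)
        lo = min(int(pos), len(anchors) - 2)
        frac = pos - lo
        values.append(anchors[lo] * (1 - frac) + anchors[lo + 1] * frac)
    return values


def trim(delta, density):
    """Keep the top `density` fraction of entries by magnitude; zero the rest."""
    keep = max(1, round(len(delta) * density))
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    kept = set(order[:keep])
    return [d if i in kept else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, models, densities, weights, normalize=True):
    """Toy TIES merge of flat parameter lists onto a base model."""
    # Task vectors: each model's difference from the base, sparsified by density.
    deltas = [
        trim([m - b for m, b in zip(model, base)], density)
        for model, density in zip(models, densities)
    ]
    merged = []
    for i, b in enumerate(base):
        contribs = [(w * d[i], w) for d, w in zip(deltas, weights)]
        # Elect the sign with the larger total weighted magnitude.
        sign = 1.0 if sum(c for c, _ in contribs) >= 0 else -1.0
        agreeing = [(c, w) for c, w in contribs if c * sign > 0]
        num = sum(c for c, _ in agreeing)
        den = sum(w for _, w in agreeing) if normalize else 1.0
        merged.append(b + (num / den if den else 0.0))
    return merged


# Example: spread the three-point density gradient over five layers,
# then merge two toy "models" onto a zero base.
layer_densities = expand_gradient([0.256, 0.512, 0.128], 5)
toy = ties_merge(
    base=[0.0, 0.0, 0.0, 0.0],
    models=[[1.0, -2.0, 0.1, 0.0], [-0.5, -1.0, 3.0, 0.0]],
    densities=[0.5, 0.5],
    weights=[1.0, 1.0],
)
```

In the real merge, mergekit applies these steps per tensor rather than to one flat list.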