---
base_model:
- LeroyDyer/Mixtral_AI_Multi_TEST
- LeroyDyer/Mixtral_AI_Cyber_Dolphin_2.0
- LeroyDyer/Mixtral_AI_CyberLAW
- LeroyDyer/Mixtral_AI_CyberBrain_3_0
- LeroyDyer/Mixtral_AI_Cyber_5.0
- LeroyDyer/Mixtral_AI_CyberBrain_2.0
- ezelikman/quietstar-8-ahead
- KoboldAI/Mistral-7B-Erebus-v3
library_name: transformers
tags:
- mergekit
- megamerge
- code
- Cyber-Series
license: mit
language:
- en
datasets:
- Open-Orca/OpenOrca
- cognitivecomputations/dolphin
- WhiteRabbitNeo/WRN-Chapter-2
- WhiteRabbitNeo/WRN-Chapter-1
- gate369/Alpaca-Star
- gate369/alpaca-star-ascii
---

<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>

https://github.com/spydaz

Currently undergoing fine-tuning, as this model contains all previous models!

This model contains many hidden tensors, as it was merged with many LoRA adapters for various tasks such as vision and sound.

The problem was that, for some reason, I could not get the extra heads to show up as they do in other models, such as the LLaVA model. I suppose the config.json of this model can be changed to make it a LLaVA model, and yes, it works! That is, it can think and appears to have hidden think heads, but you need to configure it yourself. It also has vision heads, but I could not set up the config for them.

So, hidden talents:

It was also merged with the parent models of these models for Quiet (thoughts) and LLaVA (vision, etc.), so the tensors are there. I just did not understand how to fine-tune the additional functionality, as each needs a single training example to populate the hidden tensor, hence the merges. Yet when the model is put in train mode after loading (i.e. with `model.train()`), the tensors appear, waiting for training, so just add a PEFT adapter and start training!
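
The workflow described above (load the model, switch it to train mode, look at which tensors are present, then attach a PEFT adapter) could be sketched roughly as follows. This is only an illustration under assumptions: the model ID passed in is a placeholder, `summarize_parameter_names` and `load_for_lora_training` are hypothetical helper names, and the LoRA settings (`r`, `lora_alpha`, and the `q_proj`/`v_proj` target modules) are common defaults for Mistral-style models rather than values taken from this card.

```python
from collections import Counter


def summarize_parameter_names(names):
    """Count parameter names after replacing numeric path parts (layer indices) with '*'.

    Hypothetical helper: it only makes a long parameter list readable so that
    unexpected tensor groups stand out.
    """
    patterns = (
        ".".join("*" if part.isdigit() else part for part in name.split("."))
        for name in names
    )
    return Counter(patterns)


def load_for_lora_training(model_id):
    """Load a causal LM, enter train mode, print its tensor groups, and wrap it with LoRA.

    Requires the `transformers` and `peft` packages; `model_id` is whatever
    repository or local path you are working with (an assumption, not given here).
    """
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(model_id)
    model.train()  # switch from eval mode to train mode

    # Inspect which groups of tensors the loaded model actually exposes.
    groups = summarize_parameter_names(name for name, _ in model.named_parameters())
    for pattern, count in sorted(groups.items()):
        print(f"{count:4d}  {pattern}")

    # Assumed LoRA settings; adjust target_modules to the tensors you want to train.
    lora_config = LoraConfig(
        r=8,
        lora_alpha=16,
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],
        task_type="CAUSAL_LM",
    )
    peft_model = get_peft_model(model, lora_config)
    peft_model.print_trainable_parameters()
    return peft_model
```

Calling `load_for_lora_training("your-namespace/your-model")` (a placeholder ID) would download the weights, so it is best run on a machine with enough memory for a 7B model. The returned model can then be passed to a trainer of your choice.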

THIS VERSION HAS BEEN UPDATED TO INCLUDE CYBERBRAIN! (Hidden Tensors)

## Extended capabilities:

* mistralai/Mistral-7B-Instruct-v0.1 - Prime-Base
* ChaoticNeutrals/Eris-LelantaclesV2-7b - role play
* ChaoticNeutrals/Eris_PrimeV3-Vision-7B - vision
* rvv-karma/BASH-Coder-Mistral-7B - coding
* Locutusque/Hercules-3.1-Mistral-7B - Unhinging
* KoboldAI/Mistral-7B-Erebus-v3 - NSFW
* Locutusque/Hyperion-2.1-Mistral-7B - CHAT
* Severian/Nexus-IKM-Mistral-7B-Pytorch - Thinking
* NousResearch/Hermes-2-Pro-Mistral-7B - Generalizing
* mistralai/Mistral-7B-Instruct-v0.2 - BASE
* Nitral-AI/ProdigyXBioMistral_7B - medical
* Nitral-AI/Infinite-Mika-7b - 128k - Context Expansion enforcement
* Nous-Yarn-Mistral-7b-128k - 128k - Context Expansion
* yanismiraoui/Yarn-Mistral-7b-128k-sharded
* ChaoticNeutrals/Eris_Prime-V2-7B - Roleplay

This expert is a companion to the MEGA_MIND 24b CyberSeries. The CyberSeries represents a groundbreaking leap in the realm of language models, integrating a diverse array of expert models into a unified framework. At its core lies Mistral-7B-Instruct-v0.2, a refined instruct model designed for versatility and efficiency.

Enhanced with an expanded context window and advanced routing mechanisms, Mistral-7B-Instruct-v0.2 exemplifies the power of Mixture of Experts, allowing seamless integration of specialized sub-models. This architecture is designed for performance and scalability, enabling the CyberSeries to tackle a wide range of tasks with speed and accuracy.

Among its sub-models, the OpenOrca - Mistral-7B-8k stands out as a testament to fine-tuning excellence, with top-ranking performance in its class. Meanwhile, the Hermes 2 Pro introduces capabilities such as Function Calling and JSON Mode, catering to diverse application needs.

Driven by Reinforcement Learning from AI Feedback, the Starling-LM-7B-beta demonstrates remarkable adaptability and optimization, while the Phi-1.5 Transformer model performs well across various domains, from common sense reasoning to medical inference.

With models like BioMistral tailored specifically for medical applications and Nous-Yarn-Mistral-7b-128k excelling at handling long-context data, the MEGA_MIND 24b CyberSeries emerges as a transformative force in the landscape of language understanding and artificial intelligence.

Experience the future of language models with the MEGA_MIND 24b CyberSeries, where innovation meets performance, and possibilities are limitless.

### Models Merged

The following models were included in the merge:

* [LeroyDyer/Mixtral_AI_Multi_TEST](https://huggingface.co/LeroyDyer/Mixtral_AI_Multi_TEST)
* [LeroyDyer/Mixtral_AI_CyberLAW](https://huggingface.co/LeroyDyer/Mixtral_AI_CyberLAW)
* [LeroyDyer/Mixtral_AI_CyberBrain_3_0](https://huggingface.co/LeroyDyer/Mixtral_AI_CyberBrain_3_0)
* [LeroyDyer/Mixtral_AI_Cyber_5.0](https://huggingface.co/LeroyDyer/Mixtral_AI_Cyber_5.0)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: LeroyDyer/Mixtral_AI_Cyber_Dolphin_2.0
    parameters:
      density: [0.256, 0.512, 0.128] # density gradient
      weight: 0.382
  - model: LeroyDyer/Mixtral_AI_CyberLAW
    parameters:
      density: 0.382
      weight: [0.256, 0.128, 0.256, 0.128] # weight gradient
  - model: LeroyDyer/Mixtral_AI_CyberBrain_3_0
    parameters:
      density: 0.382
      weight: [0.128, 0.512, 0.128, 0.128] # weight gradient
  - model: LeroyDyer/Mixtral_AI_Multi_TEST
    parameters:
      density: 0.382
      weight: [0.128, 0.512, 0.128, 0.128] # weight gradient
  - model: LeroyDyer/Mixtral_AI_Cyber_5.0
    parameters:
      density: 0.382
      weight:
        - filter: mlp
          value: 0.5
        - value: 0
merge_method: ties
base_model: LeroyDyer/Mixtral_AI_Cyber_Dolphin_2.0
parameters:
  normalize: true
  int8_mask: true
dtype: float16
```
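
The list-valued `density` and `weight` entries in the configuration above are gradients: a short list of anchor values that is spread across the layers of the model. The sketch below shows one way to picture that, assuming plain linear interpolation between evenly spaced anchors. The function name `interpolate_gradient` and the layer count used in the example are illustrative assumptions, and mergekit's exact mapping from layers to positions may differ.

```python
def interpolate_gradient(values, num_layers):
    """Linearly interpolate a list of anchor values across `num_layers` layers.

    The first layer gets values[0], the last layer gets values[-1], and layers
    in between get a blend of the two nearest anchors.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if len(values) == 1 or num_layers == 1:
        return [float(values[0])] * num_layers

    scale = len(values) - 1  # number of segments between anchors
    result = []
    for layer in range(num_layers):
        t = layer / (num_layers - 1)       # relative depth in [0, 1]
        pos = t * scale                    # position along the anchor list
        lower = min(int(pos), scale - 1)   # index of the anchor to the left
        frac = pos - lower                 # how far toward the next anchor
        result.append((1 - frac) * values[lower] + frac * values[lower + 1])
    return result


# Example: the density gradient from the config, spread over 32 layers
# (32 is an assumed layer count for a Mistral-7B-style model).
per_layer_density = interpolate_gradient([0.256, 0.512, 0.128], 32)
print([round(d, 3) for d in per_layer_density])
```

Read this way, a density gradient like `[0.256, 0.512, 0.128]` keeps more of that model's parameter changes in the middle layers than at either end.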