|
---
license: cc-by-nc-2.0
library_name: transformers
tags:
- mixtral
pipeline_tag: text-generation
---
|
## ExLlamaV2 Quantizations of laserxtral
|
|
|
Using <a href="https://github.com/turboderp/exllamav2/releases/tag/v0.0.11">turboderp's ExLlamaV2 v0.0.11</a> for quantization. |
|
|
|
# The "main" branch only contains the measurement.json, download one of the other branches for the model (see below) |
|
|
|
Join Our Discord! https://discord.gg/cognitivecomputations |
|
|
|
Each branch contains a different bits-per-weight quantization; the main branch contains only the measurement.json for further conversions.
|
|
|
Conversion was done using the default calibration dataset. |
|
|
|
Default conversion arguments were used.
|
|
|
Original model: https://huggingface.co/cognitivecomputations/laserxtral |
|
|
|
<a href="https://huggingface.co/cognitivecomputations/laserxtral-exl2/tree/6.5">6.5 bits per weight</a> |
|
|
|
<a href="https://huggingface.co/cognitivecomputations/laserxtral-exl2/tree/4">4 bits per weight</a> |
|
|
|
<a href="https://huggingface.co/cognitivecomputations/laserxtral-exl2/tree/3">3 bits per weight</a> |
|
|
|
<a href="https://huggingface.co/cognitivecomputations/laserxtral-exl2/tree/2">2 bits per weight</a> |
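
The branches above can be pulled as revisions with `huggingface_hub`. A minimal sketch (the `local_dir` path is just an example):

```python
from huggingface_hub import snapshot_download

# Download the 6.5 bpw quantization; pass the branch name as `revision`.
snapshot_download(
    repo_id="cognitivecomputations/laserxtral-exl2",
    revision="6.5",
    local_dir="laserxtral-exl2-6.5bpw",  # example destination path
)
```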
|
|
|
Credit to Bartowski for help and model card formatting.
|
|
|
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/655dc641accde1bbc8b41aec/iToMZFTp1DuXnpw9oJ61y.jpeg) |
|
|
|
## Original Model Card Below |
|
|
|
![image/webp](https://cdn-uploads.huggingface.co/production/uploads/646e57a5cb6ea6e6b6df1ad4/BtnWsqZnaG1I6aa-Ldkfz.webp) |
|
|
|
by David, Fernando and Eric |
|
|
|
Sponsored by: [VAGO Solutions](https://vago-solutions.de) |
|
|
|
Join our Discord! https://discord.gg/vT3sktQ3zb |
|
|
|
An experiment in 'lasering' each expert to denoise it and enhance model capabilities.
|
|
|
This model is half the size of Mixtral 8x7B Instruct and delivers roughly the same level of performance (we are working on improving its MMLU score).
|
|
|
|
|
# Laserxtral - 4x7b (all experts except the base lasered using laserRMT)
|
|
|
This model is a Mixture of Experts (MoE) made with [mergekit](https://github.com/cg123/mergekit) (mixtral branch). It uses the following base models: |
|
* [cognitivecomputations/dolphin-2.6-mistral-7b-dpo](https://huggingface.co/cognitivecomputations/dolphin-2.6-mistral-7b-dpo) |
|
* [mlabonne/Marcoro14-7B-slerp (base)](https://huggingface.co/mlabonne/Marcoro14-7B-slerp) |
|
* [beowolx/CodeNinja-1.0-OpenChat-7B](https://huggingface.co/beowolx/CodeNinja-1.0-OpenChat-7B) |
|
* [Q-bert/MetaMath-Cybertron-Starling](https://huggingface.co/Q-bert/MetaMath-Cybertron-Starling) |
|
* [WizardLM/WizardMath-7B-V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1) |
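
As a rough usage sketch (assuming a standard `transformers` setup; the prompt and generation settings below are only illustrative), the merged model loads like any Mixtral-style model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cognitivecomputations/laserxtral"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "Explain what 'lasering' a layer means."  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```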
|
|
|
|
|
It follows the laserRMT implementation at https://github.com/cognitivecomputations/laserRMT
|
|
|
Here, we scan the layers to find those with lower signal-to-noise ratios (the ones most affected by noise) and apply LASER interventions to them, still using the Marchenko-Pastur law to calculate this ratio.
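
As a rough illustration of the idea (not the laserRMT code itself; the noise-scale estimate and the rank cutoff below are simplifying assumptions), a per-matrix signal-to-noise ratio can be read off the singular values, treating everything under the Marchenko-Pastur upper edge as noise:

```python
import torch

def mp_snr(weight: torch.Tensor) -> float:
    """Crude SNR estimate for a weight matrix via the Marchenko-Pastur law.

    Singular values above the estimated MP upper edge count as signal,
    the rest as noise. Illustrative only; laserRMT's estimator may differ.
    """
    m, n = weight.shape
    s = torch.linalg.svdvals(weight.float())
    # Assumed noise-scale estimate taken from the bulk of the spectrum.
    sigma = s.median() / (max(m, n) ** 0.5)
    # MP upper edge for the singular values of an m x n pure-noise matrix.
    edge = sigma * (m ** 0.5 + n ** 0.5)
    signal = s[s > edge].pow(2).sum()
    noise = s[s <= edge].pow(2).sum().clamp_min(1e-12)
    return (signal / noise).item()

def laser_truncate(weight: torch.Tensor, keep: int) -> torch.Tensor:
    """LASER-style intervention: replace a weight matrix with a rank-`keep` approximation."""
    U, S, Vh = torch.linalg.svd(weight.float(), full_matrices=False)
    return U[:, :keep] @ torch.diag(S[:keep]) @ Vh[:keep, :]
```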
|
|
|
This is intended to be the first in a family of experiments being carried out at Cognitive Computations.
|
|
|
In this experiment, we observed very high truthfulness and strong reasoning capabilities.