4.5bpw/h6 exl2 quantization of NeverSleep/NoromaidxOpenGPT4-1, made with the default exllamav2 calibration dataset.
Fits in 32GB of VRAM with 32k+ context (Q4 cache).
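As a rough sanity check on the VRAM figure, the weight and cache sizes can be estimated with simple arithmetic. This is only a back-of-the-envelope sketch: the parameter count and attention shape are assumptions taken from the public Mixtral-8x7B configuration (not from this card), the h6 head layer is ignored, and real usage adds activation buffers and cache scale overhead.

```python
# Back-of-the-envelope VRAM estimate.
# ASSUMPTIONS (not stated in this card): Mixtral-8x7B has about 46.7B total
# parameters, 32 layers, 8 key/value heads, and a head dimension of 128.
TOTAL_PARAMS = 46.7e9
BITS_PER_WEIGHT = 4.5   # the "4.5bpw" in this quant's name
NUM_LAYERS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128
CACHE_BITS = 4          # Q4 cache, ignoring per-block scale overhead


def weight_bytes(params: float, bpw: float) -> float:
    """Bytes needed to store `params` weights at `bpw` bits each."""
    return params * bpw / 8


def kv_cache_bytes(context_tokens: int, bits: int = CACHE_BITS) -> float:
    """Bytes for the key/value cache at the given context length."""
    # Factor 2 covers keys and values.
    elements = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * context_tokens
    return elements * bits / 8


def estimate_gb(context_tokens: int) -> float:
    """Weights plus KV cache, in decimal gigabytes."""
    total = weight_bytes(TOTAL_PARAMS, BITS_PER_WEIGHT) + kv_cache_bytes(context_tokens)
    return total / 1e9


if __name__ == "__main__":
    print(f"Estimated footprint at 32k context: {estimate_gb(32768):.1f} GB")
```

Under these assumptions the estimate lands a few gigabytes below 32GB, which is consistent with the claim above but leaves the remaining headroom for runtime overhead.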
ORIGINAL CARD:
Description
This repo contains fp16 files of NoromaidxOpenGPT4-1.
The model was created by merging Noromaid-8x7b-Instruct with Open_Gpt4_8x7B_v0.2, the exact same way Rombodawg did his merge.
The only difference between NoromaidxOpenGPT4-1 and NoromaidxOpenGPT4-2 is that the first iteration uses Mixtral-8x7B as the base for the merge (f16), whereas the second uses Open_Gpt4_8x7B_v0.2 as the base (bf16).
After further testing and usage, both models were released, because each has its own qualities.
You can download the imatrix file to make other quants HERE.
Prompt template:
Alpaca
### Instruction:
{system prompt}
### Input:
{prompt}
### Response:
{output}
Mistral
[INST] {prompt} [/INST]
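The two templates above can be filled in programmatically. The following is a minimal sketch; the function names are hypothetical, and the exact newline placement is an assumption, since the card only shows the section markers and placeholders.

```python
def build_alpaca_prompt(system_prompt: str, prompt: str) -> str:
    """Fill the Alpaca template, ending at the marker where the model writes."""
    # ASSUMPTION: one newline after each marker and each field.
    return (
        "### Instruction:\n"
        f"{system_prompt}\n"
        "### Input:\n"
        f"{prompt}\n"
        "### Response:\n"
    )


def build_mistral_prompt(prompt: str) -> str:
    """Wrap a user prompt in the Mistral instruction tags."""
    return f"[INST] {prompt} [/INST]"


if __name__ == "__main__":
    print(build_alpaca_prompt("You are a helpful assistant.", "Say hello."))
    print(build_mistral_prompt("Say hello."))
```

The `{output}` placeholder is left out on purpose: it marks where the model's generated text goes, so the prompt string stops right after `### Response:`.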
Merge Details
Merge Method
This model was merged with the TIES merge method, using mistralai/Mixtral-8x7B-Instruct-v0.1 as the base.
Models Merged
The following models were included in the merge:
- rombodawg/Open_Gpt4_8x7B_v0.2
- NeverSleep/Noromaid-v0.1-mixtral-8x7b-Instruct-v3
Configuration
The following YAML configuration was used to produce this model:
models:
  - model: rombodawg/Open_Gpt4_8x7B_v0.2
    parameters:
      density: .5
      weight: 1
  - model: NeverSleep/Noromaid-v0.1-mixtral-8x7b-Instruct-v3
    parameters:
      density: .5
      weight: .7
merge_method: ties
base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
parameters:
  normalize: true
  int8_mask: true
dtype: float16
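To illustrate what `density`, `weight`, and `normalize` do in a TIES merge, here is a toy sketch on flat lists of floats. It is a simplified illustration of the general TIES idea (trim, elect sign, merge agreeing entries), not the code of the merge tool used for this model; the function names and the tiny example vectors are hypothetical.

```python
from typing import List, Sequence


def trim(delta: Sequence[float], density: float) -> List[float]:
    """Keep the `density` fraction of entries with the largest magnitude."""
    keep = int(round(density * len(delta)))
    if keep <= 0:
        return [0.0] * len(delta)
    # Indices sorted by descending magnitude; the first `keep` survive.
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    kept = set(order[:keep])
    return [d if i in kept else 0.0 for i, d in enumerate(delta)]


def ties_merge(
    base: Sequence[float],
    models: Sequence[Sequence[float]],
    densities: Sequence[float],
    weights: Sequence[float],
    normalize: bool = True,
) -> List[float]:
    """Toy TIES merge of several fine-tuned parameter vectors onto `base`."""
    # 1. Task vectors (model minus base), trimmed and scaled by weight.
    deltas = []
    for params, density, weight in zip(models, densities, weights):
        task_vector = [p - b for p, b in zip(params, base)]
        deltas.append([weight * d for d in trim(task_vector, density)])

    merged = []
    for j, b in enumerate(base):
        column = [d[j] for d in deltas]
        # 2. Elect the sign with the larger total weighted magnitude.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # 3. Merge only the entries that agree with the elected sign.
        agree = [(c, w) for c, w in zip(column, weights) if c * sign > 0]
        update = sum(c for c, _ in agree)
        if normalize and agree:
            update /= sum(w for _, w in agree)
        merged.append(b + update)
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0, 0.0]
    model_a = [1.0, -2.0, 0.1, 0.5]   # stands in for the weight-1 model
    model_b = [-1.0, -4.0, 3.0, 0.2]  # stands in for the weight-.7 model
    print(ties_merge(base, [model_a, model_b], [0.5, 0.5], [1.0, 0.7]))
```

With `density: .5`, each model contributes only the half of its changes with the largest magnitude, and `normalize: true` divides each merged entry by the total weight of the models that contributed to it.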
Support
If you want to support us, you can here.