---
license: apache-2.0
tags:
- mixtral
- conversational
- finetune
---

# Model Card for Cerebrum-1.0-8x7b-imatrix-GGUF

Quantized from https://huggingface.co/AetherResearch/Cerebrum-1.0-8x7b
using llama.cpp commit 46acb3676718b983157058aecf729a2064fc7d34, with an importance matrix (imatrix).

Quants will be uploaded over a slow German internet connection, so they will appear one by one; stay tuned.

imatrix generated with:

```
./imatrix -ofreq 4 -b 512 -c 512 -t 14 --chunks 24 -m ../models/Cerebrum-1.0-8x7b-GGUF/cerebrum-1.0-8x7b-Q8_0.gguf -f ./groups_merged.txt
```

with the dataset from here: 
https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384

Unfortunately, this means the imatrix was generated from the Q8_0 quant instead of the unquantized f16, as it ideally should be; I currently can't get the f16 to run on my machine. It should still improve the quality of the quants, though.
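
For reference, the generated imatrix file can then be passed to llama.cpp's `quantize` tool when producing the quants. A minimal sketch, assuming the imatrix was saved as `imatrix.dat` and using illustrative file names and quant type:

```shell
# Quantize the model with the importance matrix applied.
# Paths, output name, and quant type (IQ4_XS) are examples; adjust to your setup.
./quantize --imatrix imatrix.dat \
    ../models/Cerebrum-1.0-8x7b-GGUF/cerebrum-1.0-8x7b-f16.gguf \
    ./cerebrum-1.0-8x7b-IQ4_XS.gguf IQ4_XS
```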