InferenceIllusionist committed
Commit cf48ee0 · Parent(s): e468ef4
Added t/s benchmark, updated sizes, fixed tables

README.md CHANGED
@@ -39,9 +39,15 @@ Please note importance matrix quantizations are a work in progress, IQ3 and abov
 | Quant | Size (GB) | Comments |
 |:-----|--------:|:------|
-| [
-| [
-| [
+| [IQ2_S](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ2_S.gguf?download=true) | 14.1 | Can be fully offloaded on 16GB VRAM for up to 38.94 t/s [(source)](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/discussions/1#660897c68695a785ed7363b3) |
+| [IQ2_M](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ2_M.gguf?download=true) | 15.5 | |
+| [IQ3_XXS](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ3_XXS.gguf?download=true) | 18.2 | |
+| [IQ3_XS](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ3_XS.gguf?download=true) | 19.3 | |
+| [IQ3_S](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ3_S.gguf?download=true) | 20.4 | |
+| [IQ3_M](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/resolve/main/Cerebrum-1.0-8x7b-iMat-IQ3_M.gguf?download=true) | 21.4 | |
+| [IQ4_XS](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ4_XS.gguf?download=true) | 25.1 | Better quality than Q3_K_L and below |
+| [Q4_K_M](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-Q4_K_M.gguf?download=true) | 28.4 | |
+| [Q5_K_M](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-Q5_K_M.gguf?download=true) | 33.2 | |
 
 Original model card can be found [here](https://huggingface.co/AetherResearch/Cerebrum-1.0-8x7b) and below.

@@ -63,8 +69,8 @@ Cerebrum 8x7b offers competitive performance to Gemini 1.0 Pro and GPT-3.5 Turbo
 ## Benchmarking
 An overview of Cerebrum 8x7b performance compared to Gemini 1.0 Pro, GPT-3.5 and Mixtral 8x7b on selected benchmarks:
-<img src="benchmarking.png" alt="benchmarking_chart" width="750"/>
-<img src="benchmarking_table.png" alt="benchmarking_table" width="750"/>
+<img src="https://huggingface.co/AetherResearch/Cerebrum-1.0-8x7b/resolve/main/benchmarking.png?download=true" alt="benchmarking_chart" width="750"/>
+<img src="https://huggingface.co/AetherResearch/Cerebrum-1.0-8x7b/resolve/main/benchmarking_table.png?download=true" alt="benchmarking_table" width="750"/>
 
 Evaluation details:
 1) ARC-C: all models evaluated zero-shot. Gemini 1.0 Pro and GPT-3.5 (gpt-3.5-turbo-0125) evaluated via API, reported numbers taken for Mixtral 8x7b.
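As a rough sanity check on the Size column in the quant table, file size can be converted into an effective bits-per-weight figure. The sketch below is an approximation under stated assumptions: a Mixtral-8x7b-class total of roughly 46.7B parameters and decimal gigabytes in the table; neither figure comes from this repo, so treat the results as ballpark only.

```python
# Rough bits-per-weight estimate from GGUF file size.
# Assumptions (not from the repo): ~46.7e9 total parameters for a
# Mixtral-8x7b-class model, and decimal GB in the size column.
TOTAL_PARAMS = 46.7e9

def bits_per_weight(size_gb: float, params: float = TOTAL_PARAMS) -> float:
    """Convert a file size in GB to effective bits per weight."""
    return size_gb * 1e9 * 8 / params

for quant, gb in [("IQ2_S", 14.1), ("IQ4_XS", 25.1), ("Q5_K_M", 33.2)]:
    print(f"{quant}: {gb} GB ~ {bits_per_weight(gb):.2f} bits/weight")
```

Under these assumptions IQ2_S works out to roughly 2.4 bits/weight and Q5_K_M to roughly 5.7, which lines up with the naming of the quant types.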
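Every download link in the quant table follows the same Hugging Face direct-download pattern (`/resolve/main/<file>?download=true`). For illustration, a tiny helper can build those URLs from a quant name; `gguf_url` is a hypothetical function written for this sketch, not something shipped with the repo.

```python
# Build a direct-download URL for a quant in this repo, following the
# https://huggingface.co/<repo>/resolve/main/<file>?download=true pattern
# used by the links in the table above. gguf_url is an illustrative
# helper, not part of the repo.
REPO_ID = "Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF"

def gguf_url(quant: str, repo_id: str = REPO_ID) -> str:
    filename = f"Cerebrum-1.0-8x7b-iMat-{quant}.gguf"
    return f"https://huggingface.co/{repo_id}/resolve/main/{filename}?download=true"

print(gguf_url("IQ3_M"))
# → https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/resolve/main/Cerebrum-1.0-8x7b-iMat-IQ3_M.gguf?download=true
```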