InferenceIllusionist committed cf48ee0 (1 parent: e468ef4)

Added t/s benchmark, updated sizes, fixed tables

Files changed (1)
  1. README.md +11 -5
README.md CHANGED
@@ -39,9 +39,15 @@ Please note importance matrix quantizations are a work in progress, IQ3 and abov
 
 | Quant | Size (GB) | Comments |
 |:-----|--------:|:------|
-| [IQ4_XS](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ4_XS.gguf) | 25.1 | Better quality than Q3_K_L and below |
-| [Q4_K_M ](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-Q4_K_M.gguf) | 28.4 | |
-| [Q5_K_M ](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-Q5_K_M.gguf) | 33.2 | |
+| [IQ2_S](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ2_S.gguf?download=true) | 14.1 | Can be fully offloaded on 16GB VRAM for up to 38.94 t/s [(source)](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/discussions/1#660897c68695a785ed7363b3) |
+| [IQ2_M](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ2_M.gguf?download=true) | 15.5 | |
+| [IQ3_XXS](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ3_XXS.gguf?download=true) | 18.2 | |
+| [IQ3_XS](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ3_XS.gguf?download=true) | 19.3 | |
+| [IQ3_S](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ3_S.gguf?download=true) | 20.4 | |
+| [IQ3_M](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/resolve/main/Cerebrum-1.0-8x7b-iMat-IQ3_M.gguf?download=true) | 21.4 | |
+| [IQ4_XS](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-IQ4_XS.gguf?download=true) | 25.1 | Better quality than Q3_K_L and below |
+| [Q4_K_M](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-Q4_K_M.gguf?download=true) | 28.4 | |
+| [Q5_K_M](https://huggingface.co/Quant-Cartel/Cerebrum-1.0-8x7b-iMat-GGUF/blob/main/Cerebrum-1.0-8x7b-iMat-Q5_K_M.gguf?download=true) | 33.2 | |
 
 
 Original model card can be found [here](https://huggingface.co/AetherResearch/Cerebrum-1.0-8x7b) and below.
@@ -63,8 +69,8 @@ Cerebrum 8x7b offers competitive performance to Gemini 1.0 Pro and GPT-3.5 Turbo
 
 ## Benchmarking
 An overview of Cerebrum 8x7b performance compared to Gemini 1.0 Pro, GPT-3.5 and Mixtral 8x7b on selected benchmarks:
-<img src="benchmarking.png" alt="benchmarking_chart" width="750"/>
-<img src="benchmarking_table.png" alt="benchmarking_table" width="750"/>
+<img src="https://huggingface.co/AetherResearch/Cerebrum-1.0-8x7b/resolve/main/benchmarking.png?download=true" alt="benchmarking_chart" width="750"/>
+<img src="https://huggingface.co/AetherResearch/Cerebrum-1.0-8x7b/resolve/main/benchmarking_table.png?download=true" alt="benchmarking_table" width="750"/>
 
 Evaluation details:
 1) ARC-C: all models evaluated zero-shot. Gemini 1.0 Pro and GPT-3.5 (gpt-3.5-turbo-0125) evaluated via API, reported numbers taken for Mixtral 8x7b.
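
The IQ2_S row in the new table notes that it can be fully offloaded on 16GB of VRAM. As a rough sketch of how the table's file sizes map to VRAM budgets — considering weight-file size only, and ignoring KV cache and runtime overhead, which cost several additional GB — the sizes from the table can be checked against a budget like this:

```python
# Sizes in GB, taken from the quant table above. This is an illustration
# only: it compares the weight file alone against a VRAM budget and
# ignores KV cache and runtime overhead, so real headroom is smaller.
QUANT_SIZES_GB = {
    "IQ2_S": 14.1, "IQ2_M": 15.5, "IQ3_XXS": 18.2, "IQ3_XS": 19.3,
    "IQ3_S": 20.4, "IQ3_M": 21.4, "IQ4_XS": 25.1, "Q4_K_M": 28.4,
    "Q5_K_M": 33.2,
}

def fits_in_vram(vram_gb: float) -> list[str]:
    """Return the quants whose weight file alone fits in the given budget."""
    return [q for q, size in QUANT_SIZES_GB.items() if size < vram_gb]

print(fits_in_vram(16.0))  # only the two IQ2 quants fit a 16GB card
```

Consistent with the table's comment, only IQ2_S and IQ2_M come in under 16GB; everything IQ3_XXS and up would need partial offload on such a card.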