Exllamav2 quant (exl2 / 5.0 bpw) made with ExLlamaV2 v0.0.21

Other EXL2 quants:

Quant Model Size lm_head
2.2
7777 MB
6
2.5
8519 MB
6
3.0
9944 MB
6
3.5
11365 MB
6
3.75
12080 MB
6
4.0
12789 MB
6
4.25
13503 MB
6
5.0
15632 MB
6
6.0
18594 MB
8
6.5
19969 MB
8
8.0
24115 MB
8

(Maybe i'll change the waifu picture later)

GGUF/Exl2 quants

Experimental RP-oriented MoE, the idea was to get a model that would be equal to or better than Mixtral 8x7B and it's finetunes in RP/ERP tasks.

Llama 3 SnowStorm 4x8B

base_model: NeverSleep_Llama-3-Lumimaid-8B-v0.1-OAS
gate_mode: random
dtype: bfloat16
experts_per_token: 2
experts:
  - source_model: ChaoticNeutrals_Poppy_Porpoise-v0.7-L3-8B
  - source_model: NeverSleep_Llama-3-Lumimaid-8B-v0.1-OAS
  - source_model: openlynn_Llama-3-Soliloquy-8B-v2
  - source_model: Sao10K_L3-8B-Stheno-v3.1

Models used

Difference(from ChaoticSoliloquy v1.5)

Vision

llama3_mmproj

image/png

Prompt format: Llama 3

Downloads last month
4
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.