---
license: apache-2.0
---

# Model mera-mix-4x7B

This is a mixture of experts (MoE) model that is half as large (4 experts instead of 8) as Mixtral-8x7B while remaining comparable to it across different benchmarks. You can use it as a drop-in replacement for Mixtral-8x7B and get much faster inference.
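Below is a minimal inference sketch using the `transformers` library. The Hub repo id `meraGPT/mera-mix-4x7B` is an assumption for illustration; substitute the path you actually load this model from.

```python
# Minimal usage sketch with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meraGPT/mera-mix-4x7B"  # assumed Hub repo id; replace with the real path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit the 4-expert MoE on fewer GPUs
    device_map="auto",           # requires `accelerate` for automatic device placement
)

prompt = "Explain mixture-of-experts models in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```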

mera-mix-4x7B achieves 76.37 on the OpenLLM Eval, versus 72.7 for Mixtral-8x7B (as shown here).

## OpenLLM Eval

| Model | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Average |
|---|---|---|---|---|---|---|---|
| mera-mix-4x7B | 72.01 | 88.82 | 63.67 | 77.45 | 84.61 | 71.65 | 76.37 |

Raw eval results are available in this gist.