---
tags:
  - fp8
  - vllm
---

# Mixtral-8x22B-Instruct-v0.1-FP8

## Model Overview

Mixtral-8x22B-Instruct-v0.1 quantized to FP8 weights and activations using per-tensor quantization, ready for inference with vLLM >= 0.5.0.
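The checkpoint can be loaded with vLLM's offline inference API. A minimal sketch, assuming the Hugging Face repo id matches this card's title and that you have enough GPUs for an 8x22B model (adjust the repo id and `tensor_parallel_size` for your setup):

```python
from vllm import LLM, SamplingParams

# Hypothetical repo id, assumed from this card's title; replace with the
# actual Hugging Face path for this model.
model_id = "Mixtral-8x22B-Instruct-v0.1-FP8"

# tensor_parallel_size is an assumption; size it to your GPU count.
llm = LLM(model=model_id, tensor_parallel_size=8)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain FP8 quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```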

## Usage and Creation

Produced using AutoFP8 with calibration samples from the UltraChat dataset.
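Per-tensor quantization means a single FP8 scale is shared by an entire weight or activation tensor. A minimal NumPy sketch of the scaling step (hypothetical helper names; real FP8 kernels also round each value to an 8-bit E4M3 float, which this sketch does not model):

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def per_tensor_scale(x: np.ndarray) -> float:
    """One scale for the whole tensor: map its max magnitude to the FP8 range."""
    return FP8_E4M3_MAX / np.abs(x).max()

def fake_quantize(x: np.ndarray) -> np.ndarray:
    """Scale into FP8 range, clip, then rescale back (dequantize)."""
    s = per_tensor_scale(x)
    q = np.clip(x * s, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q / s

w = np.random.randn(4, 8).astype(np.float32)
w_hat = fake_quantize(w)
```

AutoFP8 computes activation scales the same way, but from statistics gathered over the calibration samples rather than from a single tensor.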

## Evaluation

### Open LLM Leaderboard evaluation scores

| Benchmark | Mixtral-8x22B-Instruct-v0.1 | Mixtral-8x22B-Instruct-v0.1-FP8 (this model) |
| --- | --- | --- |
| arc-c (25-shot) | 72.70 | 69.19 |
| hellaswag (10-shot) | 89.08 | 82.49 |
| mmlu (5-shot) | 77.77 | 70.61 |
| truthfulqa (0-shot) | 68.14 | 65.73 |
| winogrande (5-shot) | 85.16 | 82.63 |
| gsm8k (5-shot) | 82.03 | 76.57 |
| **Average accuracy** | 79.15 | 74.53 |
| **Recovery** | 100% | 94.17% |
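Recovery is the FP8 model's average accuracy as a fraction of the baseline's. A quick check from the per-task scores above (matching the card's figures up to rounding):

```python
# Per-task scores copied from the table above.
baseline = [72.70, 89.08, 77.77, 68.14, 85.16, 82.03]
fp8 = [69.19, 82.49, 70.61, 65.73, 82.63, 76.57]

avg_baseline = sum(baseline) / len(baseline)  # ~79.15
avg_fp8 = sum(fp8) / len(fp8)                 # ~74.54
recovery = 100 * avg_fp8 / avg_baseline       # ~94.2%

print(f"{avg_baseline:.2f} {avg_fp8:.2f} {recovery:.2f}%")
```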