
GLEAM-Mixtral-8x7B-Instruct

Overview

GLEAM-Mixtral-8x7B-Instruct is an experimental preference-aligned model built on top of mistralai/Mixtral-8x7B-Instruct-v0.1.

Model Description

The model was fine-tuned with ORPO (Odds Ratio Preference Optimization) on a self-generated synthetic preference dataset built from 300 examples of argilla/ultrafeedback-binarized-preferences and prompts from Open-Orca/SlimOrca.
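
For reference, a run of this kind can be set up with the ORPOTrainer from the trl library (assuming a recent trl version). The sketch below is illustrative only: the tiny inline dataset, hyperparameters, and output directory are assumptions, not the actual training recipe; in practice the 300 self-generated preference pairs described above (and typically a parameter-efficient or quantized setup, given the model size) would be used instead.

from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_model = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# ORPOTrainer expects "prompt", "chosen", and "rejected" columns.
# Placeholder rows for illustration; the real run used the self-generated
# synthetic preference pairs described above.
train_dataset = Dataset.from_dict({
    "prompt": ["What is the capital of France?"],
    "chosen": ["The capital of France is Paris."],
    "rejected": ["The capital of France is Berlin."],
})

# Assumed hyperparameters, not the values used for GLEAM-Mixtral.
config = ORPOConfig(
    output_dir="gleam-mixtral-orpo",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-6,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()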

Prompt Format

The model utilizes the standard Mixtral prompt format:

Prompt Example:

<s> [INST] {prompt-0} [/INST] {response}</s> [INST] {prompt-1} [/INST]
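
Rather than assembling this string by hand, the tokenizer's built-in chat template can produce it. Below is a minimal sketch, assuming the tokenizer carries over the base Mixtral chat template; the messages are placeholders:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Txoka/GLEAM-Mixtral-8x7B-Instruct-v2.0")

# Multi-turn conversation; the last message must come from the user.
messages = [
    {"role": "user", "content": "prompt-0"},
    {"role": "assistant", "content": "response"},
    {"role": "user", "content": "prompt-1"},
]

# Renders the [INST] ... [/INST] string shown above, ready for generation.
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)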

Benchmarks

Performance metrics are provided below, comparing GLEAM-Mixtral with the original Mixtral model on the tinyBenchmarks evaluation suite:

| Benchmark  | Mixtral (5-shot) | GLEAM-Mixtral (5-shot) |
|------------|------------------|------------------------|
| MMLU       | 66.8             | 65.5                   |
| Hellaswag  | 87.4             | 77.8                   |
| ARC        | 69.0             | 50.6                   |
| WinoGrande | 80.7             | 79.5                   |

There is a clear regression on Hellaswag and ARC, while MMLU and WinoGrande stay close to the base model.

Model Alignment

Preference-alignment tests show that GLEAM-Mixtral outperforms the base Mixtral as judged by a preference model trained on argilla/ultrafeedback-binarized-preferences, evaluated on prompts from the validation split of the prompt dataset (a sketch of the win-rate computation follows the table):

| Model         | Win Rate (95% CI) |
|---------------|-------------------|
| Mixtral       | 40.89% ± 6.13%    |
| GLEAM-Mixtral | 59.11% ± 6.13%    |
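
A 95% CI on a win rate of this kind is typically a binomial (normal-approximation) interval over per-prompt comparisons. Below is a minimal sketch of that computation, using placeholder counts rather than the actual evaluation data:

import math

def win_rate_with_ci(wins: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Binomial win rate with a normal-approximation confidence interval."""
    p = wins / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, half_width

# Placeholder counts for illustration only.
rate, ci = win_rate_with_ci(wins=148, total=250)
print(f"Win rate: {rate:.2%} ± {ci:.2%}")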

Usage

I wouldn't actually recommend using this model for any practical application beyond evaluation and testing, but it's a nice proof of concept.

How to Use (if you insist on using it)

You can load GLEAM-Mixtral-8x7B-Instruct-v2.0 with the Hugging Face transformers library. Below is a Python snippet demonstrating how to load and run the model:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Txoka/GLEAM-Mixtral-8x7B-Instruct-v2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load in half precision and spread the weights across available devices
# (requires accelerate); the FP16 checkpoint needs roughly 90 GB of memory.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "[INST] Write a diary entry from the perspective of a cat who believes it's the ruler of the household. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Without max_new_tokens, generate() stops after the default 20 new tokens.
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
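
At 46.7B parameters, the FP16 checkpoint needs well over 90 GB of memory, so a quantized load is often more practical. Below is a minimal sketch using 4-bit quantization, assuming bitsandbytes and accelerate are installed:

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "Txoka/GLEAM-Mixtral-8x7B-Instruct-v2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Quantize the weights to 4-bit NF4 at load time to cut memory use roughly 4x.
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="auto",
)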