---
license: apache-2.0
datasets:
- Open-Orca/SlimOrca
- argilla/ultrafeedback-binarized-preferences
tags:
- synthetic-data
- self-training
- preference-alignment
---

# GLEAM-Mixtral-8x7B-Instruct

## Overview

GLEAM-Mixtral-8x7B-Instruct is an experimental preference-aligned model built on top of `mistralai/Mixtral-8x7B-Instruct-v0.1`.

## Model Description

The model was optimized with ORPO on a self-generated synthetic preference dataset, built from 300 examples from `argilla/ultrafeedback-binarized-preferences` and prompts from `Open-Orca/SlimOrca`. A rough sketch of this kind of training run is included at the end of this card.

## Prompt Format

The model uses the standard Mixtral prompt format:

**Prompt Example:**

```
[INST] {prompt-0} [/INST] {response} [INST] {prompt-1} [/INST]
```

## Benchmarks

Performance metrics on `tinyBenchmarks`, comparing GLEAM-Mixtral with the original Mixtral model:

| Benchmark  | Mixtral (5-shot) | GLEAM-Mixtral (5-shot) |
|------------|------------------|------------------------|
| MMLU       | 66.8             | 65.5                   |
| Hellaswag  | 87.4             | 77.8                   |
| ARC        | 69.0             | 50.6                   |
| WinoGrande | 80.7             | 79.5                   |

There is clear degradation on some tasks, most notably Hellaswag and ARC.

## Model Alignment

Preference-alignment tests show that GLEAM-Mixtral outperforms Mixtral when judged by a preference model trained on `argilla/ultrafeedback-binarized-preferences`, evaluated on prompts from the validation split of the prompt dataset. A sketch of this style of win-rate evaluation is included at the end of this card.

| Model         | Win Rate (95% CI) |
|---------------|-------------------|
| Mixtral       | 40.89% ± 6.13%    |
| GLEAM-Mixtral | 59.11% ± 6.13%    |

## Usage

I wouldn't actually recommend using this model for any practical application beyond evaluation and testing, but it's a nice proof of concept.

## How to Use (If you are insistent on using it)

You can access GLEAM-Mixtral-8x7B-Instruct-v2.0 via the Hugging Face Hub. Below is a Python snippet demonstrating how to load and use the model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Txoka/GLEAM-Mixtral-8x7B-Instruct-v2.0"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Mixtral-8x7B is large: load in bfloat16 and let accelerate spread the
# layers across whatever devices are available.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "[INST] Write a diary entry from the perspective of a cat who believes it's the ruler of the household. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Without max_new_tokens, generate() stops after its short default budget.
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
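If you would rather not assemble the `[INST]` tags by hand, the tokenizer's chat template should produce the same format. A minimal sketch, assuming the uploaded tokenizer inherits the standard Mixtral chat template:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Txoka/GLEAM-Mixtral-8x7B-Instruct-v2.0")

messages = [
    {"role": "user", "content": "Who really rules this household?"},
]

# add_generation_prompt=True asks for a prompt that ends ready for the
# model's reply (for Mixtral-style templates, right after the final [/INST]).
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```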
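## Appendix: ORPO Training Sketch

For reference, the kind of ORPO run described under Model Description looks roughly like the sketch below. This is a hypothetical reconstruction using TRL's `ORPOTrainer`, not the actual training script; the hyperparameters and the toy preference pairs are illustrative assumptions.

```python
# Illustrative ORPO sketch using TRL. Hyperparameters, dataset plumbing, and
# the preference pairs are assumptions, not GLEAM-Mixtral's real configuration.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_name = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Full fine-tuning of Mixtral-8x7B needs multiple GPUs; in practice a PEFT
# adapter would usually be trained instead.
model = AutoModelForCausalLM.from_pretrained(model_name)

# ORPOTrainer expects "prompt"/"chosen"/"rejected" columns. In the
# self-training setup described above, chosen/rejected would be the model's
# own generations ranked by a preference model.
pairs = Dataset.from_dict({
    "prompt": ["[INST] Summarize photosynthesis in one sentence. [/INST]"],
    "chosen": ["Plants use light, water, and CO2 to make sugar and oxygen."],
    "rejected": ["Photosynthesis is when plants eat sunlight for breakfast."],
})

config = ORPOConfig(
    output_dir="gleam-orpo",
    beta=0.1,  # weight of the odds-ratio penalty term
    learning_rate=5e-6,
    per_device_train_batch_size=1,
    max_length=1024,
    max_prompt_length=512,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=pairs,
    tokenizer=tokenizer,
)
trainer.train()
```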
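## Appendix: Win-Rate Evaluation Sketch

The win rates under Model Alignment come from judging paired generations with a preference model. The sketch below shows that style of evaluation; the judge model named here is a placeholder, and the normal-approximation confidence interval is an assumption about how the ± values were computed.

```python
# Hypothetical win-rate computation: score both models' responses to the same
# prompts with a reward model and count how often each one wins. The judge
# below is a placeholder, not the preference model used for this card.
import math

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

judge_name = "OpenAssistant/reward-model-deberta-v3-large-v2"  # placeholder
judge_tok = AutoTokenizer.from_pretrained(judge_name)
judge = AutoModelForSequenceClassification.from_pretrained(judge_name)

def score(prompt: str, response: str) -> float:
    """Scalar preference score for one (prompt, response) pair."""
    inputs = judge_tok(prompt, response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return judge(**inputs).logits[0].item()

def win_rate(prompts, responses_a, responses_b):
    """Fraction of prompts where model A outscores model B, with a 95%
    normal-approximation confidence interval."""
    wins = sum(
        score(p, a) > score(p, b)
        for p, a, b in zip(prompts, responses_a, responses_b)
    )
    n = len(prompts)
    p_hat = wins / n
    ci = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, ci
```

Under this approximation, a ±6.13% interval around a 59.11% win rate corresponds to roughly 250 evaluation prompts.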