---
license: apache-2.0
datasets:
- Open-Orca/SlimOrca
- argilla/ultrafeedback-binarized-preferences
tags:
- synthetic-data
- self-training
- preference-alignment
---

# GLEAM-Mixtral-8x7B-Instruct

## Overview

GLEAM-Mixtral-8x7B-Instruct is an experimental preference-aligned model built on top of `mistralai/Mixtral-8x7B-Instruct-v0.1`.

## Model Description

The model was optimized with ORPO on a self-generated synthetic preference dataset, built from 300 examples from `argilla/ultrafeedback-binarized-preferences` and prompts from `Open-Orca/SlimOrca`. A rough sketch of this kind of training run is included at the end of this card.

## Prompt Format

The model uses the standard Mixtral prompt format:

**Prompt Example:**

```
[INST] {prompt-0} [/INST] {response} [INST] {prompt-1} [/INST]
```

## Benchmarks

Performance metrics on `tinyBenchmarks`, comparing GLEAM-Mixtral with the original Mixtral model:

| Benchmark  | Mixtral (5-shot) | GLEAM-Mixtral (5-shot) |
|------------|------------------|------------------------|
| MMLU       | 66.8             | 65.5                   |
| Hellaswag  | 87.4             | 77.8                   |
| ARC        | 69.0             | 50.6                   |
| WinoGrande | 80.7             | 79.5                   |

There is clear degradation on some tasks, most notably Hellaswag and ARC.

## Model Alignment

Preference-alignment tests show that GLEAM-Mixtral outperforms Mixtral when judged by a preference model trained on `argilla/ultrafeedback-binarized-preferences`, evaluated on prompts from the validation split of the prompt dataset. A sketch of this style of win-rate evaluation is included at the end of this card.

| Model         | Win Rate (95% CI) |
|---------------|-------------------|
| Mixtral       | 40.89% ± 6.13%    |
| GLEAM-Mixtral | 59.11% ± 6.13%    |

## Usage

I wouldn't actually recommend using this model for any practical application beyond evaluation and testing, but it's a nice proof of concept.

## How to Use (If you are insistent on using it)

You can access GLEAM-Mixtral-8x7B-Instruct-v2.0 via the Hugging Face Hub. Below is a Python snippet demonstrating how to load and use the model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Txoka/GLEAM-Mixtral-8x7B-Instruct-v2.0"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Mixtral-8x7B is large: load in bfloat16 and let accelerate spread the
# layers across whatever devices are available.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "[INST] Write a diary entry from the perspective of a cat who believes it's the ruler of the household. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Without max_new_tokens, generate() stops after its short default budget.
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
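If you would rather not assemble the `[INST]` tags by hand, the tokenizer's chat template should produce the same format. A minimal sketch, assuming the uploaded tokenizer inherits the standard Mixtral chat template:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Txoka/GLEAM-Mixtral-8x7B-Instruct-v2.0")

messages = [
    {"role": "user", "content": "Who really rules this household?"},
]

# add_generation_prompt=True asks for a prompt that ends ready for the
# model's reply (for Mixtral-style templates, right after the final [/INST]).
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```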
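## Appendix: ORPO Training Sketch

For reference, the kind of ORPO run described under Model Description looks roughly like the sketch below. This is a hypothetical reconstruction using TRL's `ORPOTrainer`, not the actual training script; the hyperparameters and the toy preference pairs are illustrative assumptions.

```python
# Illustrative ORPO sketch using TRL. Hyperparameters, dataset plumbing, and
# the preference pairs are assumptions, not GLEAM-Mixtral's real configuration.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_name = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Full fine-tuning of Mixtral-8x7B needs multiple GPUs; in practice a PEFT
# adapter would usually be trained instead.
model = AutoModelForCausalLM.from_pretrained(model_name)

# ORPOTrainer expects "prompt"/"chosen"/"rejected" columns. In the
# self-training setup described above, chosen/rejected would be the model's
# own generations ranked by a preference model.
pairs = Dataset.from_dict({
    "prompt": ["[INST] Summarize photosynthesis in one sentence. [/INST]"],
    "chosen": ["Plants use light, water, and CO2 to make sugar and oxygen."],
    "rejected": ["Photosynthesis is when plants eat sunlight for breakfast."],
})

config = ORPOConfig(
    output_dir="gleam-orpo",
    beta=0.1,  # weight of the odds-ratio penalty term
    learning_rate=5e-6,
    per_device_train_batch_size=1,
    max_length=1024,
    max_prompt_length=512,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=pairs,
    tokenizer=tokenizer,
)
trainer.train()
```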
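## Appendix: Win-Rate Evaluation Sketch

The win rates under Model Alignment come from judging paired generations with a preference model. The sketch below shows that style of evaluation; the judge model named here is a placeholder, and the normal-approximation confidence interval is an assumption about how the ± values were computed.

```python
# Hypothetical win-rate computation: score both models' responses to the same
# prompts with a reward model and count how often each one wins. The judge
# below is a placeholder, not the preference model used for this card.
import math

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

judge_name = "OpenAssistant/reward-model-deberta-v3-large-v2"  # placeholder
judge_tok = AutoTokenizer.from_pretrained(judge_name)
judge = AutoModelForSequenceClassification.from_pretrained(judge_name)

def score(prompt: str, response: str) -> float:
    """Scalar preference score for one (prompt, response) pair."""
    inputs = judge_tok(prompt, response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return judge(**inputs).logits[0].item()

def win_rate(prompts, responses_a, responses_b):
    """Fraction of prompts where model A outscores model B, with a 95%
    normal-approximation confidence interval."""
    wins = sum(
        score(p, a) > score(p, b)
        for p, a, b in zip(prompts, responses_a, responses_b)
    )
    n = len(prompts)
    p_hat = wins / n
    ci = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, ci
```

Under this approximation, a ±6.13% interval around a 59.11% win rate corresponds to roughly 250 evaluation prompts.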