jondurbin/airoboros-7b

	---
	license: cc-by-nc-4.0
	---

	# Overview

	This is a fine-tuned 7B-parameter LLaMA model, trained on completely synthetic data created by https://github.com/jondurbin/airoboros

	__I don't recommend using this model! The outputs aren't particularly great, and the training data may contain "harmful" content because it was generated with a jailbreak prompt.__

	Please see one of the updated airoboros models for a much better experience.

	### Training data

	This was an experiment to see if a "jailbreak" prompt could be used to generate a broader range of data that would otherwise have been filtered by OpenAI's alignment efforts.

	The jailbreak did indeed work with a high success rate: the OpenAI models covered a broader range of topics and refused fewer questions/instructions on sensitive topics.

	### Usage and License Notices

	All airoboros models and datasets are intended and licensed for research use only. I've used the 'cc-by-nc-4.0' license, but really it is subject to a custom/special license because:

	- the base model is LLaMA, which has its own special research license
	- the dataset(s) were generated with OpenAI (gpt-4 and/or gpt-3.5-turbo), which has a clause saying the data can't be used to create models that compete with OpenAI

	So, to reiterate: this model (and datasets) cannot be used commercially.

	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_jondurbin__airoboros-7b)

	| Metric              | Value |
	|---------------------|-------|
	| Avg.                | 44.01 |
	| ARC (25-shot)       | 53.07 |
	| HellaSwag (10-shot) | 77.65 |
	| MMLU (5-shot)       | 37.23 |
	| TruthfulQA (0-shot) | 43.39 |
	| Winogrande (5-shot) | 70.96 |
	| GSM8K (5-shot)      | 2.12  |
	| DROP (3-shot)       | 23.66 |
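
The "Avg." row appears to be the plain unweighted mean of the seven benchmark scores in the table. That is an assumption based on the arithmetic, not something stated in the results. A minimal Python sketch to check it, with the scores copied from the table and the variable names (`scores`, `average`) chosen only for illustration:

```python
# Benchmark scores copied from the table above.
scores = {
    "ARC (25-shot)": 53.07,
    "HellaSwag (10-shot)": 77.65,
    "MMLU (5-shot)": 37.23,
    "TruthfulQA (0-shot)": 43.39,
    "Winogrande (5-shot)": 70.96,
    "GSM8K (5-shot)": 2.12,
    "DROP (3-shot)": 23.66,
}

# Assumption: "Avg." is the unweighted mean over all seven benchmarks.
average = sum(scores.values()) / len(scores)

print(f"{average:.2f}")  # prints 44.01
```

The result matches the reported 44.01, so the very low GSM8K score and the low DROP score pull the average well below the ARC, HellaSwag, and Winogrande numbers.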