NeuralDaredevil-7B is a DPO fine-tune of mlabonne/Daredevil-7B using the argilla/distilabel-intel-orca-dpo-pairs preference dataset and my DPO notebook from this article.

Thanks to Argilla for providing the dataset and the training recipe here. 💪
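For readers unfamiliar with DPO, the per-pair objective can be sketched in a few lines of plain Python. This is only an illustration of the standard DPO loss, not the code from the notebook mentioned above; the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for this example.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single (chosen, rejected) preference pair.

    Inputs are summed log-probabilities of each answer under the policy
    being trained and under the frozen reference model. The name and the
    default beta are hypothetical, chosen for illustration only.
    """
    # How much more the policy favors each answer than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Policy identical to the reference: the loss is log(2).
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
# Policy prefers the chosen answer more than the reference does: lower loss.
print(dpo_loss(-5.0, -15.0, -10.0, -10.0))
```

The loss falls as the policy raises the chosen answer's likelihood relative to the rejected one, measured against the reference model.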

🏆 Evaluation


The evaluation was performed using LLM AutoEval on the Nous benchmark suite.

| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|---|
| mlabonne/NeuralDaredevil-7B | 59.39 | 45.23 | 76.2 | 67.61 | 48.52 |
| mlabonne/Beagle14-7B | 59.4 | 44.38 | 76.53 | 69.44 | 47.25 |
| argilla/distilabeled-Marcoro14-7B-slerp | 58.93 | 45.38 | 76.48 | 65.68 | 48.18 |
| mlabonne/NeuralMarcoro14-7B | 58.4 | 44.59 | 76.17 | 65.94 | 46.9 |
| openchat/openchat-3.5-0106 | 53.71 | 44.17 | 73.72 | 52.53 | 44.4 |
| teknium/OpenHermes-2.5-Mistral-7B | 52.42 | 42.75 | 72.99 | 52.99 | 40.94 |

You can find the complete benchmark on YALL - Yet Another LLM Leaderboard.

Open LLM Leaderboard

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 74.12 |
| AI2 Reasoning Challenge (25-Shot) | 69.88 |
| HellaSwag (10-Shot) | 87.62 |
| MMLU (5-Shot) | 65.12 |
| TruthfulQA (0-shot) | 66.85 |
| Winogrande (5-shot) | 82.08 |
| GSM8k (5-shot) | 73.16 |
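As a quick sanity check, the reported average appears to be the plain arithmetic mean of the six benchmark scores. The snippet below recomputes it from the values listed above; the dictionary name `scores` is just an illustrative choice.

```python
# Open LLM Leaderboard scores copied from the results above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 69.88,
    "HellaSwag (10-Shot)": 87.62,
    "MMLU (5-Shot)": 65.12,
    "TruthfulQA (0-shot)": 66.85,
    "Winogrande (5-shot)": 82.08,
    "GSM8k (5-shot)": 73.16,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 74.12
```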

💻 Usage

!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/NeuralDaredevil-7B"
messages = [{"role": "user", "content": "What is a large language model?"}]

tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
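# Build a text-generation pipeline in half precision; device_map="auto"
# relies on accelerate to place the weights on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])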

Prompt Template

This model uses the same prompt template as mistralai/Mistral-7B-Instruct-v0.2.

See instruction-format for more details.
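To make the template concrete, here is a minimal sketch of how a conversation is laid out in the Mistral instruct style, with user turns wrapped in `[INST] ... [/INST]` and assistant turns closed by the end-of-sequence token. The helper `format_mistral_instruct` is hypothetical and written for illustration; exact whitespace may differ from the tokenizer's own template, so `tokenizer.apply_chat_template` remains the authoritative way to build prompts.

```python
def format_mistral_instruct(messages, bos="<s>", eos="</s>"):
    """Approximate the Mistral instruct prompt layout (illustrative only)."""
    parts = [bos]
    for i, message in enumerate(messages):
        # Roles are expected to alternate, starting with the user.
        expected_role = "user" if i % 2 == 0 else "assistant"
        if message["role"] != expected_role:
            raise ValueError("roles must alternate user/assistant, starting with user")
        if message["role"] == "user":
            parts.append(f"[INST] {message['content']} [/INST]")
        else:
            parts.append(f"{message['content']}{eos}")
    return "".join(parts)

messages = [{"role": "user", "content": "What is a large language model?"}]
print(format_mistral_instruct(messages))
# <s>[INST] What is a large language model? [/INST]
```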

Built with Distilabel

Model size: 7.24B params