	---
	license: cc-by-nc-4.0
	tags:
	- merge
	- mergekit
	- lazymergekit
	- dpo
	- rlhf
	- mlabonne/example
	base_model: mlabonne/Daredevil-7B
	model-index:
	- name: NeuralDaredevil-7B
	  results:
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: AI2 Reasoning Challenge (25-Shot)
	      type: ai2_arc
	      config: ARC-Challenge
	      split: test
	      args:
	        num_few_shot: 25
	    metrics:
	    - type: acc_norm
	      value: 69.88
	      name: normalized accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/NeuralDaredevil-7B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: HellaSwag (10-Shot)
	      type: hellaswag
	      split: validation
	      args:
	        num_few_shot: 10
	    metrics:
	    - type: acc_norm
	      value: 87.62
	      name: normalized accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/NeuralDaredevil-7B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: MMLU (5-Shot)
	      type: cais/mmlu
	      config: all
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 65.12
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/NeuralDaredevil-7B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: TruthfulQA (0-shot)
	      type: truthful_qa
	      config: multiple_choice
	      split: validation
	      args:
	        num_few_shot: 0
	    metrics:
	    - type: mc2
	      value: 66.85
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/NeuralDaredevil-7B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: Winogrande (5-shot)
	      type: winogrande
	      config: winogrande_xl
	      split: validation
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 82.08
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/NeuralDaredevil-7B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: GSM8k (5-shot)
	      type: gsm8k
	      config: main
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 73.16
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/NeuralDaredevil-7B
	      name: Open LLM Leaderboard
	---

	![](https://i.imgur.com/D80Ua7T.png)

	# NeuralDaredevil-7B

	NeuralDaredevil-7B is a DPO fine-tune of [mlabonne/Daredevil-7B](https://huggingface.co/mlabonne/Daredevil-7B) using the [argilla/distilabel-intel-orca-dpo-pairs](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs) preference dataset and my DPO notebook from [this article](https://towardsdatascience.com/fine-tune-a-mistral-7b-model-with-direct-preference-optimization-708042745aac).

	Thanks to [Argilla](https://huggingface.co/argilla) for providing the dataset and the training recipe [here](https://huggingface.co/argilla/distilabeled-Marcoro14-7B-slerp). 💪
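
For readers new to DPO, here is a minimal sketch of the per-pair objective that Direct Preference Optimization minimizes. It is not the training code used for this model (that lives in the linked notebook). The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions; the inputs stand for summed log-probabilities of a chosen and a rejected answer under the model being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen_gain - rejected_gain))).

    All names and the default beta are hypothetical, chosen for illustration.
    """
    # How much more (in log-probability) the policy likes each answer
    # than the frozen reference model does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Policy identical to the reference: the margin is 0 and the loss is log(2).
baseline = dpo_loss(-12.0, -15.0, -12.0, -15.0)
# Policy has shifted probability toward the chosen answer: lower loss.
improved = dpo_loss(-10.0, -17.0, -12.0, -15.0)
print(baseline, improved)
```

The loss falls as the policy raises the chosen answer's likelihood relative to the rejected one, measured against the reference model, which is what a preference dataset of chosen/rejected pairs is used for.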

	## 🏆 Evaluation

	### Nous

	The evaluation was performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoeval) on the Nous benchmark suite.

	| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
	|---|---:|---:|---:|---:|---:|
	| [mlabonne/NeuralDaredevil-7B](https://huggingface.co/mlabonne/NeuralDaredevil-7B) [📄](https://gist.github.com/mlabonne/cbeb077d1df71cb81c78f742f19f4155) | 59.39 | 45.23 | 76.2 | 67.61 | 48.52 |
	| [mlabonne/Beagle14-7B](https://huggingface.co/mlabonne/Beagle14-7B) [📄](https://gist.github.com/mlabonne/f5a5bf8c0827bbec2f05b97cc62d642c) | 59.4 | 44.38 | 76.53 | 69.44 | 47.25 |
	| [argilla/distilabeled-Marcoro14-7B-slerp](https://huggingface.co/argilla/distilabeled-Marcoro14-7B-slerp) [📄](https://gist.github.com/mlabonne/9082c4e59f4d3f3543c5eda3f4807040) | 58.93 | 45.38 | 76.48 | 65.68 | 48.18 |
	| [mlabonne/NeuralMarcoro14-7B](https://huggingface.co/mlabonne/NeuralMarcoro14-7B) [📄](https://gist.github.com/mlabonne/b31572a4711c945a4827e7242cfc4b9d) | 58.4 | 44.59 | 76.17 | 65.94 | 46.9 |
	| [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106) [📄](https://gist.github.com/mlabonne/1afab87b543b0717ec08722cf086dcc3) | 53.71 | 44.17 | 73.72 | 52.53 | 44.4 |
	| [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) [📄](https://gist.github.com/mlabonne/88b21dd9698ffed75d6163ebdc2f6cc8) | 52.42 | 42.75 | 72.99 | 52.99 | 40.94 |

	You can find the complete benchmark on [YALL - Yet Another LLM Leaderboard](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard).

	### [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_mlabonne__NeuralDaredevil-7B).

	| Metric | Value |
	|---------------------------------|----:|
	| Avg. | 74.12 |
	| AI2 Reasoning Challenge (25-Shot) | 69.88 |
	| HellaSwag (10-Shot) | 87.62 |
	| MMLU (5-Shot) | 65.12 |
	| TruthfulQA (0-shot) | 66.85 |
	| Winogrande (5-shot) | 82.08 |
	| GSM8k (5-shot) | 73.16 |
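
As a quick sanity check, the "Avg." row can be reproduced from the six benchmark scores. The sketch below assumes the average is a plain unweighted mean, which matches the reported value; the variable names are illustrative.

```python
# Open LLM Leaderboard scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 69.88,
    "HellaSwag (10-Shot)": 87.62,
    "MMLU (5-Shot)": 65.12,
    "TruthfulQA (0-shot)": 66.85,
    "Winogrande (5-shot)": 82.08,
    "GSM8k (5-shot)": 73.16,
}

# Assumption: the leaderboard average is the unweighted mean of all tasks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 74.12
```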

	## 💻 Usage

	```python
	# In a notebook (Jupyter/Colab), install the dependencies first:
	# !pip install -qU transformers accelerate

	from transformers import AutoTokenizer
	import transformers
	import torch

	model = "mlabonne/NeuralDaredevil-7B"
	messages = [{"role": "user", "content": "What is a large language model?"}]

	# Format the conversation with the model's own chat template.
	tokenizer = AutoTokenizer.from_pretrained(model)
	prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

	# Load the model in half precision and let accelerate place it on the available devices.
	pipeline = transformers.pipeline(
	    "text-generation",
	    model=model,
	    torch_dtype=torch.float16,
	    device_map="auto",
	)

	# Sample up to 256 new tokens; the returned text includes the prompt.
	outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
	print(outputs[0]["generated_text"])
	```
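
By default the text-generation pipeline returns the prompt followed by the completion in `generated_text`. If only the model's reply is wanted, one option is to pass `return_full_text=False` to the pipeline call; another is to strip the prompt afterwards, as in the small sketch below. The helper name `completion_only` and the sample strings are made up for illustration and do not reflect this model's actual chat template.

```python
def completion_only(generated_text: str, prompt: str) -> str:
    """Return the reply without the echoed prompt (hypothetical helper)."""
    if generated_text.startswith(prompt):
        # Drop the prompt and any whitespace that separates it from the reply.
        return generated_text[len(prompt):].lstrip()
    # The prompt was not echoed, so there is nothing to strip.
    return generated_text


# Made-up strings standing in for `prompt` and `outputs[0]["generated_text"]`.
example_prompt = "User: What is a large language model?\nAssistant:"
example_output = example_prompt + " A neural network trained on text."
print(completion_only(example_output, example_prompt))
```

With the usage snippet above, this would be called as `completion_only(outputs[0]["generated_text"], prompt)`.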

	<p align="center">
	<a href="https://github.com/argilla-io/distilabel">
	<img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-light.png" alt="Built with Distilabel" width="200" height="32"/>
	</a>
	</p>