Quantization made by Richard Erkhov.

Github

Discord

Request more models

Daredevil-8B - GGUF

Original model description:

license: other
tags:
- merge
- mergekit
- lazymergekit
base_model:
- nbeerbower/llama-3-stella-8B
- Hastagaras/llama-3-8b-okay
- nbeerbower/llama-3-gutenberg-8B
- openchat/openchat-3.6-8b-20240522
- Kukedlc/NeuralLLaMa-3-8b-DT-v0.1
- cstr/llama3-8b-spaetzle-v20
- mlabonne/ChimeraLlama-3-8B-v3
- flammenai/Mahou-1.1-llama3-8B
- KingNish/KingNish-Llama3-8b
model-index:
- name: Daredevil-8B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 68.86
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Daredevil-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 84.5
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Daredevil-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 69.24
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Daredevil-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 59.89
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Daredevil-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 78.45
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Daredevil-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 73.54
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Daredevil-8B
      name: Open LLM Leaderboard

Daredevil-8B

Daredevil-8B is a mega-merge designed to maximize MMLU. As of 27 May 2024, it is the Llama 3 8B model with the highest MMLU score. From my experience, a high MMLU score is all you need with Llama 3 models.

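It is a merge of the following models using LazyMergekit:

- nbeerbower/llama-3-stella-8B
- Hastagaras/llama-3-8b-okay
- nbeerbower/llama-3-gutenberg-8B
- openchat/openchat-3.6-8b-20240522
- Kukedlc/NeuralLLaMa-3-8b-DT-v0.1
- cstr/llama3-8b-spaetzle-v20
- mlabonne/ChimeraLlama-3-8B-v3
- flammenai/Mahou-1.1-llama3-8B
- KingNish/KingNish-Llama3-8b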

Thanks to nbeerbower, Hastagaras, openchat, Kukedlc, cstr, flammenai, and KingNish for their merges. Special thanks to Charles Goddard and Arcee.ai for MergeKit.

🔎 Applications

You can use it as an improved version of meta-llama/Meta-Llama-3-8B-Instruct.

This is a censored model. For an uncensored version, see mlabonne/Daredevil-8B-abliterated.

Tested on LM Studio using the "Llama 3" preset.
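For clients that lack a built-in preset, the prompt can be assembled by hand. The sketch below assumes Daredevil-8B follows the standard Llama 3 Instruct chat format; the special tokens come from Meta's published Llama 3 template, not from this model card, and `build_llama3_prompt` is a hypothetical helper name. When the tokenizer is available, `tokenizer.apply_chat_template` is the safer choice because it reads the template shipped with the model.

```python
def build_llama3_prompt(messages):
    """Format chat messages with the Llama 3 Instruct template (assumed here)."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    # Open an assistant turn so the model generates the reply next.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


prompt = build_llama3_prompt(
    [{"role": "user", "content": "What is a large language model?"}]
)
print(prompt)
```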

⚡ Quantization

🏆 Evaluation

Open LLM Leaderboard

Daredevil-8B is the best-performing 8B model on the Open LLM Leaderboard in terms of MMLU score (as of 27 May 2024).

Nous

Daredevil-8B is the best-performing 8B model on Nous' benchmark suite (evaluation performed using LLM AutoEval, as of 27 May 2024). See the entire leaderboard here.

| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|---|
| mlabonne/Daredevil-8B | 55.87 | 44.13 | 73.52 | 59.05 | 46.77 |
| mlabonne/Daredevil-8B-abliterated | 55.06 | 43.29 | 73.33 | 57.47 | 46.17 |
| mlabonne/Llama-3-8B-Instruct-abliterated-dpomix | 52.26 | 41.6 | 69.95 | 54.22 | 43.26 |
| meta-llama/Meta-Llama-3-8B-Instruct | 51.34 | 41.22 | 69.86 | 51.65 | 42.64 |
| failspy/Meta-Llama-3-8B-Instruct-abliterated-v3 | 51.21 | 40.23 | 69.5 | 52.44 | 42.69 |
| mlabonne/OrpoLlama-3-8B | 48.63 | 34.17 | 70.59 | 52.39 | 37.36 |
| meta-llama/Meta-Llama-3-8B | 45.42 | 31.1 | 69.95 | 43.91 | 36.7 |
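
The Average column appears to be the unweighted mean of the four benchmark scores. The card does not state this explicitly, so treat it as an inference from the numbers. A short check, with the scores copied from the table above (`nous_scores` and `mean_score` are names chosen for this sketch):

```python
# (reported average, AGIEval, GPT4All, TruthfulQA, Bigbench), copied from the table
nous_scores = {
    "mlabonne/Daredevil-8B": (55.87, 44.13, 73.52, 59.05, 46.77),
    "mlabonne/Daredevil-8B-abliterated": (55.06, 43.29, 73.33, 57.47, 46.17),
    "mlabonne/Llama-3-8B-Instruct-abliterated-dpomix": (52.26, 41.6, 69.95, 54.22, 43.26),
    "meta-llama/Meta-Llama-3-8B-Instruct": (51.34, 41.22, 69.86, 51.65, 42.64),
    "failspy/Meta-Llama-3-8B-Instruct-abliterated-v3": (51.21, 40.23, 69.5, 52.44, 42.69),
    "mlabonne/OrpoLlama-3-8B": (48.63, 34.17, 70.59, 52.39, 37.36),
    "meta-llama/Meta-Llama-3-8B": (45.42, 31.1, 69.95, 43.91, 36.7),
}


def mean_score(scores):
    """Unweighted arithmetic mean of a sequence of benchmark scores."""
    return sum(scores) / len(scores)


for name, (reported, *benchmarks) in nous_scores.items():
    computed = mean_score(benchmarks)
    print(f"{name}: reported {reported:.2f}, computed {computed:.4f}")
```

Each computed mean agrees with the reported average to within rounding of the second decimal place.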

🌳 Model family tree

[Image: Daredevil-8B model family tree]

🧩 Configuration

models:
  - model: NousResearch/Meta-Llama-3-8B
    # No parameters necessary for base model
  - model: nbeerbower/llama-3-stella-8B
    parameters:
      density: 0.6
      weight: 0.16
  - model: Hastagaras/llama-3-8b-okay
    parameters:
      density: 0.56
      weight: 0.1
  - model: nbeerbower/llama-3-gutenberg-8B
    parameters:
      density: 0.6
      weight: 0.18
  - model: openchat/openchat-3.6-8b-20240522
    parameters:
      density: 0.56
      weight: 0.12
  - model: Kukedlc/NeuralLLaMa-3-8b-DT-v0.1
    parameters:
      density: 0.58
      weight: 0.18
  - model: cstr/llama3-8b-spaetzle-v20
    parameters:
      density: 0.56
      weight: 0.08
  - model: mlabonne/ChimeraLlama-3-8B-v3
    parameters:
      density: 0.56
      weight: 0.08
  - model: flammenai/Mahou-1.1-llama3-8B
    parameters:
      density: 0.55
      weight: 0.05
  - model: KingNish/KingNish-Llama3-8b
    parameters:
      density: 0.55
      weight: 0.05
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
dtype: bfloat16
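
In this configuration, `density` is the fraction of each model's delta (its difference from the base model) that is kept, and `weight` is that model's contribution to the merge; the nine weights sum to 1.0. The toy sketch below illustrates the general idea behind `dare_ties` on plain Python lists: random drop-and-rescale of deltas (DARE), followed by a per-parameter sign election (TIES). It is a simplified illustration under those assumptions, not mergekit's implementation, which works on tensors and differs in details such as normalization. The function names are hypothetical.

```python
import random


def dare_sparsify(delta, density, rng):
    """DARE step: keep each entry with probability `density`, rescale survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, finetuned, densities, weights, seed=0):
    """Merge fine-tuned parameter lists into `base` (toy version, no normalization)."""
    rng = random.Random(seed)
    # Weighted, sparsified task vectors: one per fine-tuned model.
    deltas = [
        [w * x for x in dare_sparsify([f - b for f, b in zip(model, base)], density, rng)]
        for model, density, w in zip(finetuned, densities, weights)
    ]
    merged = []
    for i, b in enumerate(base):
        column = [delta[i] for delta in deltas]
        # TIES-style sign election: the sign of the summed weighted deltas wins.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # Only deltas that agree with the elected sign contribute.
        merged.append(b + sum(v for v in column if v * sign > 0))
    return merged


base = [0.0, 0.0, 0.0, 0.0]
model_a = [0.4, -0.2, 0.1, 0.0]
model_b = [0.3, 0.5, -0.1, 0.2]
merged = dare_ties_merge(base, [model_a, model_b], densities=[0.6, 0.56], weights=[0.16, 0.1])
print(merged)

# Weights taken from the configuration above.
config_weights = [0.16, 0.1, 0.18, 0.12, 0.18, 0.08, 0.08, 0.05, 0.05]
print(round(sum(config_weights), 2))
```

With `density=1.0` nothing is dropped, and the merge reduces to a sign-filtered weighted sum of the deltas, which makes the function easy to check by hand.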

💻 Usage

!pip install -qU transformers accelerate

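from transformers import AutoTokenizer
import transformers
import torch

# Hugging Face repo ID of the original (unquantized) model
model = "mlabonne/Daredevil-8B"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the chat messages into a single prompt string using the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in bfloat16 and let accelerate place it on the available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Sample up to 256 new tokens; the returned text contains the prompt followed by the completion
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])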