metadata

license: other
library_name: transformers
base_model: meta-llama/Meta-Llama-3-8B
datasets:
  - mlabonne/orpo-dpo-mix-40k
  - Open-Orca/SlimOrca-Dedup
  - jondurbin/airoboros-3.2
  - microsoft/orca-math-word-problems-200k
  - m-a-p/Code-Feedback
  - MaziyarPanahi/WizardLM_evol_instruct_V2_196k
model-index:
  - name: llama-3-neural-chat-v1-8b
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 60.84
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/llama-3-neural-chat-v1-8b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 84.13
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/llama-3-neural-chat-v1-8b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 64.69
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/llama-3-neural-chat-v1-8b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 56.34
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/llama-3-neural-chat-v1-8b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 78.22
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/llama-3-neural-chat-v1-8b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 54.81
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/llama-3-neural-chat-v1-8b
          name: Open LLM Leaderboard

llama-3-neural-chat-v1-8b

Model Details

Model Description

I fine-tuned llama-3 8B on an approach similar to Intel's neural chat language model. I have slightly modified the data sources so it is stronger in coding, math, and writing. I use both SFT and DPO.

Developed by: Locutusque
Model type: Built with Meta Llama 3
Language(s) (NLP): Many?
License: Llama 3 license https://huggingface.co/meta-llama/Meta-Llama-3-8B/blob/main/LICENSE

Quants

EXL2 @bartowski

https://huggingface.co/bartowski/llama-3-neural-chat-v1-8b-exl2

GGUF @bartowski

https://huggingface.co/bartowski/llama-3-neural-chat-v1-8b-GGUF

Uses

This model has great performance in writing and coding.

Training Data

Open-Orca/SlimOrca-Dedup
jondurbin/airoboros-3.2
microsoft/orca-math-word-problems-200k
m-a-p/Code-Feedback
MaziyarPanahi/WizardLM_evol_instruct_V2_196k
mlabonne/orpo-dpo-mix-40k

Direct Use

Conversational AI.

Evaluations

Tasks	Version	Filter	n-shot	Metric	Value		Stderr
truthfulqa_mc2	2	none	0	acc	0.5627	±	0.0154
gsm8k	3	strict-match	5	exact_match	0.5481	±	0.0137
		flexible-extract	5	exact_match	0.5557	±	0.0137
agieval_nous	N/A	none	0	acc	0.3763	±	0.0093
		none	0	acc_norm	0.3665	±	0.0093
- agieval_aqua_rat	1	none	0	acc	0.2087	±	0.0255
		none	0	acc_norm	0.2047	±	0.0254
- agieval_logiqa_en	1	none	0	acc	0.3456	±	0.0187
		none	0	acc_norm	0.3594	±	0.0188
- agieval_lsat_ar	1	none	0	acc	0.1826	±	0.0255
		none	0	acc_norm	0.1783	±	0.0253
- agieval_lsat_lr	1	none	0	acc	0.3549	±	0.0212
		none	0	acc_norm	0.3451	±	0.0211
- agieval_lsat_rc	1	none	0	acc	0.5242	±	0.0305
		none	0	acc_norm	0.5130	±	0.0305
- agieval_sat_en	1	none	0	acc	0.6650	±	0.0330
		none	0	acc_norm	0.6505	±	0.0333
- agieval_sat_en_without_passage	1	none	0	acc	0.4175	±	0.0344
		none	0	acc_norm	0.3738	±	0.0338
- agieval_sat_math	1	none	0	acc	0.4227	±	0.0334
		none	0	acc_norm	0.3682	±	0.0326

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	66.50
AI2 Reasoning Challenge (25-Shot)	60.84
HellaSwag (10-Shot)	84.13
MMLU (5-Shot)	64.69
TruthfulQA (0-shot)	56.34
Winogrande (5-shot)	78.22
GSM8k (5-shot)	54.81