metadata
language:
  - en
license: apache-2.0
tags:
  - merge
model-index:
  - name: NeuralHermes-MoE-2x7B
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 62.12
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 84.21
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 64.56
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 43.61
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 78.14
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 51.86
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B
          name: Open LLM Leaderboard

NeuralHermes-MoE-2x7B

NeuralHermes-MoE-2x7B is a Mixture of Experts (MoE) model that combines teknium/OpenHermes-2.5-Mistral-7B and Intel/neural-chat-7b-v3-3, using mistralai/Mistral-7B-v0.1 as the base model.

The merge was done with mergekit.

Getting Started

import torch
from transformers import pipeline

# Load the model in bfloat16; device_map="auto" places the weights on the available devices.
pipe = pipeline(
    "text-generation",
    model="ibndias/NeuralHermes-MoE-2x7B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = """<|system|> You are a helpful assistant.
<|user|>
Write me bash script to scan ip 192.3.1.4 with nmap only port that ends with 9 from 1-100.
<|assistant|>
"""
outputs = pipe(prompt, max_new_tokens=512, do_sample=True, temperature=0.2, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])

Output:

<|system|> You are a helpful assistant.
<|user|>
Write me bash script to scan ip 192.3.1.4 with nmap only port that ends with 9 from 1-100.
<|assistant|>
Sure, here's a bash script that scans the specified IP address with nmap for open ports that end with 9 from 1 to 100:

#!/bin/bash
IP_ADDRESS="192.3.1.4"
START_PORT=1
END_PORT=100
for ((i=$START_PORT; i<=$END_PORT; i++)); do
   PORT=$i
   if [[ $PORT % 10 == 9 ]]; then
       nmap -p $PORT $IP_ADDRESS
   fi
done

Save the script with a .sh extension (e.g., scan_ports.sh) and make it executable by running chmod +x scan_ports.sh. Then, run the script by executing ./scan_ports.sh. ...
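
The example above writes the prompt by hand with <|system|>, <|user|>, and <|assistant|> markers, and the printed text repeats the prompt before the reply. The following is a minimal sketch of two helpers for that workflow. The function names `build_prompt` and `extract_reply` are hypothetical and not part of the model or of transformers, and the sketch assumes the marker layout shown in the example is the one the model expects.

```python
ASSISTANT_MARKER = "<|assistant|>"


def build_prompt(user_message: str,
                 system_message: str = "You are a helpful assistant.") -> str:
    """Assemble a prompt in the marker layout used in the Getting Started example."""
    return (
        f"<|system|> {system_message}\n"
        f"<|user|>\n"
        f"{user_message}\n"
        f"{ASSISTANT_MARKER}\n"
    )


def extract_reply(prompt: str, generated_text: str) -> str:
    """Return only the text that follows the prompt in the generated output."""
    # Common case: the output starts with the exact prompt string.
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Fallback: keep whatever follows the last assistant marker, if one is present.
    _, found, tail = generated_text.rpartition(ASSISTANT_MARKER)
    return tail.strip() if found else generated_text.strip()


if __name__ == "__main__":
    prompt = build_prompt("Say hello.")
    # Simulated pipeline output: the prompt followed by a reply.
    print(extract_reply(prompt, prompt + "Hello!"))
```

With the pipeline from the example, this would be called as `extract_reply(prompt, outputs[0]["generated_text"])`. Passing `return_full_text=False` to the pipeline call is an alternative that should return the reply without the prompt.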

Open LLM Leaderboard Evaluation Results

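Detailed results can be found on the Open LLM Leaderboard: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ibndias/NeuralHermes-MoE-2x7B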

| Metric | Value |
|---|---|
| Avg. | 64.08 |
| AI2 Reasoning Challenge (25-Shot) | 62.12 |
| HellaSwag (10-Shot) | 84.21 |
| MMLU (5-Shot) | 64.56 |
| TruthfulQA (0-shot) | 43.61 |
| Winogrande (5-shot) | 78.14 |
| GSM8k (5-shot) | 51.86 |
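
The Avg. row appears to be the unweighted mean of the six benchmark scores. The short check below recomputes it from the values listed above; the variable names are illustrative only.

```python
# Benchmark scores copied from the results listed above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 62.12,
    "HellaSwag (10-Shot)": 84.21,
    "MMLU (5-Shot)": 64.56,
    "TruthfulQA (0-shot)": 43.61,
    "Winogrande (5-shot)": 78.14,
    "GSM8k (5-shot)": 51.86,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")  # prints "Avg. 64.08"
```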