---
language:
  - en
license: apache-2.0
datasets:
  - 0-hero/Matter-0.2-alpha
model-index:
  - name: Matter-0.2-7B-DPO
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: IFEval (0-Shot)
          type: HuggingFaceH4/ifeval
          args:
            num_few_shot: 0
        metrics:
          - type: inst_level_strict_acc and prompt_level_strict_acc
            value: 33.03
            name: strict accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=0-hero/Matter-0.2-7B-DPO
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: BBH (3-Shot)
          type: BBH
          args:
            num_few_shot: 3
        metrics:
          - type: acc_norm
            value: 10.06
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=0-hero/Matter-0.2-7B-DPO
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MATH Lvl 5 (4-Shot)
          type: hendrycks/competition_math
          args:
            num_few_shot: 4
        metrics:
          - type: exact_match
            value: 0.83
            name: exact match
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=0-hero/Matter-0.2-7B-DPO
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GPQA (0-shot)
          type: Idavidrein/gpqa
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 1.23
            name: acc_norm
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=0-hero/Matter-0.2-7B-DPO
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MuSR (0-shot)
          type: TAUR-Lab/MuSR
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 5.87
            name: acc_norm
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=0-hero/Matter-0.2-7B-DPO
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU-PRO (5-shot)
          type: TIGER-Lab/MMLU-Pro
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 1.82
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=0-hero/Matter-0.2-7B-DPO
          name: Open LLM Leaderboard
---

# Matter 7B - 0.2 - DPO (Mistral 7B Finetune)

DPO version of Matter 7B, fine-tuned on the Matter dataset, which was curated from over 35 datasets by analyzing more than 6B tokens.

## Training

Prompt format: This model uses the ChatML prompt format.

```
<|im_start|>system
You are a helpful AI assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
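
For convenience, here is a minimal usage sketch (not part of the original card) that builds this ChatML prompt by hand with `transformers`; the loading options and generation settings are illustrative, not prescriptive.

```python
# Minimal sketch: format a ChatML prompt as documented above and generate a reply.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "0-hero/Matter-0.2-7B-DPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

def build_chatml(system: str, prompt: str) -> str:
    # Mirrors the template above; the trailing "<|im_start|>assistant\n"
    # cues the model to produce the assistant turn.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

text = build_chatml("You are a helpful AI assistant.", "Summarize DPO in one sentence.")
inputs = tokenizer(text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```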

## Function Calling

The model also supports function calling, using the additional special tokens listed below (a short sketch for composing a function-calling prompt follows the list).

Model function call tokens

- `<|begin_func|>` - Function call start token
- `<|end_func|>` - Function call end token

Function call response tokens

- `<|begin_func_response|>` - Function response start token
- `<|end_func_response|>` - Function response end token
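
To expose a function to the model, its JSON schema is embedded directly in the system message, as the example further down shows. A small sketch (not from the card; the helper name is illustrative) for composing such a system turn:

```python
# Sketch: build a ChatML system message that embeds one or more function schemas,
# matching the layout of the example below.
import json

def system_with_functions(instructions: str, functions: list) -> str:
    schemas = "\n".join(json.dumps(fn, indent=2) for fn in functions)
    return f"<|im_start|>system\n{instructions}\n{schemas}<|im_end|>\n"

get_news_headlines = {
    "name": "get_news_headlines",
    "description": "Get the latest news headlines",
    "parameters": {
        "type": "object",
        "properties": {
            "country": {
                "type": "string",
                "description": "The country for which to fetch news",
            }
        },
        "required": ["country"],
    },
}

print(system_with_functions(
    "You are a helpful assistant with access to the following functions. Use them if required -",
    [get_news_headlines],
))
```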

### Example

```
<|im_start|>system
  You are a helpful assistant with access to the following functions. Use them if required -
  { "name": "get_news_headlines",
    "description": "Get the latest news headlines",
    "parameters":
      { "type": "object",
        "properties":
        { "country":
          { "type": "string",
            "description": "The country for which to fetch news"
          }
        },
      "required": [ "country" ]
    }
  }
<|im_end|>
<|im_start|>user
   Can you tell me the latest news headlines for the United States?<|im_end|>
<|im_start|>assistant
  <|begin_func|>{"name": "get_news_headlines", "arguments": '{"country": "United States"}'}<|end_func|><|im_end|>
<|im_start|>user
  <|begin_func_response|>{
    "headlines":
      [
        "Biden announces new vaccine mandates",
        "Hurricane Ida devastates Louisiana",
        "Apple unveils new iPhone",
        "NASA's Perseverance rover collects first Mars rock sample"
      ]
  }<|end_func_response|>
<|im_end|>
<|im_start|>assistant
  Here are the latest news headlines for the United States:
  1. Biden announces new vaccine mandates
  2. Hurricane Ida devastates Louisiana
  3. Apple unveils new iPhone
  4. NASA's Perseverance rover collects first Mars rock sample
<|im_end|>
```
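
On the application side, the function call can be pulled out of the generated text by matching the tokens above, and the tool result is sent back wrapped in the response tokens. A rough sketch (not part of the original card; helper names and the sample payload are illustrative):

```python
# Sketch: extract a <|begin_func|> ... <|end_func|> call from the model's reply and
# wrap a tool result in <|begin_func_response|> ... <|end_func_response|> for the next turn.
import json
import re

FUNC_RE = re.compile(r"<\|begin_func\|>(.*?)<\|end_func\|>", re.DOTALL)

def extract_function_call(reply: str):
    """Return (name, arguments) if the reply contains a function call, else None."""
    match = FUNC_RE.search(reply)
    if match is None:
        return None
    call = json.loads(match.group(1))
    args = call.get("arguments", {})
    if isinstance(args, str):  # arguments may also arrive as a JSON-encoded string
        args = json.loads(args)
    return call["name"], args

def function_response_turn(result) -> str:
    """Format a tool result as the next user turn, per the tokens described above."""
    return (
        "<|im_start|>user\n"
        f"<|begin_func_response|>{json.dumps(result)}<|end_func_response|>\n"
        "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

# Example round trip with an illustrative reply:
reply = '<|begin_func|>{"name": "get_news_headlines", "arguments": {"country": "United States"}}<|end_func|>'
name, args = extract_function_call(reply)
print(name, args)  # get_news_headlines {'country': 'United States'}
print(function_response_turn({"headlines": ["..."]}))
```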

## Open LLM Leaderboard Evaluation Results

Detailed results can be found on the [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=0-hero/Matter-0.2-7B-DPO).

| Metric              | Value |
|---------------------|------:|
| Avg.                |  8.81 |
| IFEval (0-Shot)     | 33.03 |
| BBH (3-Shot)        | 10.06 |
| MATH Lvl 5 (4-Shot) |  0.83 |
| GPQA (0-shot)       |  1.23 |
| MuSR (0-shot)       |  5.87 |
| MMLU-PRO (5-shot)   |  1.82 |
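
The reported average appears to be the simple mean of the six normalized benchmark scores; a quick arithmetic check (illustrative, not part of the card):

```python
# Verify that the "Avg." row is the mean of the six benchmark scores above.
scores = {
    "IFEval (0-Shot)": 33.03,
    "BBH (3-Shot)": 10.06,
    "MATH Lvl 5 (4-Shot)": 0.83,
    "GPQA (0-shot)": 1.23,
    "MuSR (0-shot)": 5.87,
    "MMLU-PRO (5-shot)": 1.82,
}
print(round(sum(scores.values()) / len(scores), 2))  # 8.81
```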