metadata

base_model:
  - uukuguy/speechless-code-mistral-7b-v1.0
  - upaya07/Arithmo2-Mistral-7B
library_name: transformers
tags:
  - mergekit
  - merge
license: apache-2.0
model-index:
  - name: sethuiyer/CodeCalc-Mistral-7B
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 61.95
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/CodeCalc-Mistral-7B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 83.64
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/CodeCalc-Mistral-7B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 62.78
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/CodeCalc-Mistral-7B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 47.79
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/CodeCalc-Mistral-7B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 78.3
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/CodeCalc-Mistral-7B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 63.53
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/CodeCalc-Mistral-7B
          name: Open LLM Leaderboard
language:
  - en
pipeline_tag: text-generation

CodeCalc-Mistral-7B

This is a merge of pre-trained language models created using mergekit.

Usage

Use the Alpaca instruction format with the Divine Intellect sampling preset.

You are an intelligent programming assistant.

### Instruction:
Implement a linked list in C++

### Response:

Preset:

temperature: 1.31
top_p: 0.14
repetition_penalty: 1.17
top_k: 49
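A minimal sketch of how the prompt format and preset above could be wired into the transformers library. The helper names `build_alpaca_prompt` and `generate`, the `max_new_tokens` default, and the choice of bfloat16 for loading are assumptions made for illustration; only the prompt layout, the sampling values, and the model id come from this card.

```python
# Sketch only: helper names and defaults below are illustrative assumptions.
MODEL_ID = "sethuiyer/CodeCalc-Mistral-7B"

# Sampling values copied from the preset listed above.
GENERATION_PRESET = {
    "temperature": 1.31,
    "top_p": 0.14,
    "repetition_penalty": 1.17,
    "top_k": 49,
}

DEFAULT_SYSTEM = "You are an intelligent programming assistant."


def build_alpaca_prompt(instruction: str, system: str = DEFAULT_SYSTEM) -> str:
    """Lay out the system line, instruction, and empty response slot."""
    return (
        f"{system}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


def generate(instruction: str, max_new_tokens: int = 512) -> str:
    """Load the model and sample one completion (downloads the weights)."""
    # Imported lazily so the prompt helper works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(build_alpaca_prompt(instruction), return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        do_sample=True,  # the preset values only take effect when sampling
        max_new_tokens=max_new_tokens,
        **GENERATION_PRESET,
    )
    # Drop the prompt tokens and decode only the newly generated text.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Implement a linked list in C++")` would reproduce the example prompt shown above; the output varies between runs because sampling is enabled.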

Configuration

The following YAML configuration was used to produce this model:


base_model: uukuguy/speechless-code-mistral-7b-v1.0
dtype: bfloat16
merge_method: ties
models:
- model: uukuguy/speechless-code-mistral-7b-v1.0
- model: upaya07/Arithmo2-Mistral-7B
  parameters:
    density: [0.25, 0.35, 0.45, 0.35, 0.25]
    weight: [0.1, 0.25, 0.5, 0.25, 0.1]
parameters:
  int8_mask: true
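
The list-valued density and weight entries describe values that vary with depth rather than a single value for the whole model. The sketch below shows one way to read them, assuming the listed values are linearly interpolated across the layer stack and that the model has 32 transformer layers, as Mistral 7B does. `interpolate_gradient` is a hypothetical helper written for illustration, not a mergekit function.

```python
# Sketch only: assumes linear interpolation of the listed values over layers.
NUM_LAYERS = 32  # assumption: Mistral 7B layer count

DENSITY_ANCHORS = [0.25, 0.35, 0.45, 0.35, 0.25]
WEIGHT_ANCHORS = [0.1, 0.25, 0.5, 0.25, 0.1]


def interpolate_gradient(anchors, num_layers):
    """Spread the anchor values evenly over num_layers by linear interpolation."""
    if num_layers == 1:
        return [anchors[0]]
    values = []
    for layer in range(num_layers):
        # Position of this layer on the anchor axis, from 0 to len(anchors) - 1.
        pos = layer / (num_layers - 1) * (len(anchors) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(anchors) - 1)
        frac = pos - lo
        values.append(anchors[lo] * (1 - frac) + anchors[hi] * frac)
    return values


if __name__ == "__main__":
    densities = interpolate_gradient(DENSITY_ANCHORS, NUM_LAYERS)
    weights = interpolate_gradient(WEIGHT_ANCHORS, NUM_LAYERS)
    for layer, (d, w) in enumerate(zip(densities, weights)):
        print(f"layer {layer:2d}: density={d:.3f} weight={w:.3f}")
```

Under this reading, the contribution from upaya07/Arithmo2-Mistral-7B is largest in the middle layers and tapers toward the first and last layers.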

Evaluation

| T | Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
|---|---|---|---|---|---|---|---|---|
| 🔍 | sethuiyer/CodeCalc-Mistral-7B | 66.33 | 61.95 | 83.64 | 62.78 | 47.79 | 78.3 | 63.53 |
| 📉 | uukuguy/speechless-code-mistral-7b-v1.0 | 63.6 | 61.18 | 83.77 | 63.4 | 47.9 | 78.37 | 47.01 |

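The merge appears to be successful. GSM8K rises from 47.01 to 63.53 (+16.52 points) over the base model, which lifts the average from 63.6 to 66.33, while every other benchmark stays within one point of the base score (ARC +0.77, HellaSwag -0.13, MMLU -0.62, TruthfulQA -0.11, Winogrande -0.07).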
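
The comparison can be checked with a few lines of arithmetic. This is a small sketch using the scores copied from the evaluation table; the dictionary and function names are illustrative, and the average is assumed to be the plain mean of the six benchmark scores.

```python
# Scores copied from the evaluation table; names are illustrative.
BENCHMARKS = ["ARC", "HellaSwag", "MMLU", "TruthfulQA", "Winogrande", "GSM8K"]

MERGED = {"ARC": 61.95, "HellaSwag": 83.64, "MMLU": 62.78,
          "TruthfulQA": 47.79, "Winogrande": 78.3, "GSM8K": 63.53}
BASE = {"ARC": 61.18, "HellaSwag": 83.77, "MMLU": 63.4,
        "TruthfulQA": 47.9, "Winogrande": 78.37, "GSM8K": 47.01}


def average(scores):
    """Plain mean over the six benchmarks (assumed averaging rule)."""
    return sum(scores[name] for name in BENCHMARKS) / len(BENCHMARKS)


def deltas(new, old):
    """Per-benchmark change of `new` relative to `old`, in points."""
    return {name: round(new[name] - old[name], 2) for name in BENCHMARKS}


if __name__ == "__main__":
    print(f"merged average: {average(MERGED):.2f}")
    print(f"base average:   {average(BASE):.2f}")
    for name, change in deltas(MERGED, BASE).items():
        print(f"{name:<11}{change:+.2f}")
```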