metadata
license: other
license_name: gemma-terms-of-use
license_link: https://ai.google.dev/gemma/terms
base_model: google/gemma-2b
tags:
  - alignment-handbook
  - trl
  - sft
  - generated_from_trainer
datasets:
  - HuggingFaceH4/deita-10k-v0-sft
model-index:
  - name: gemma-2b-zephyr-sft
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 51.88
            name: normalized accuracy
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 72.63
            name: normalized accuracy
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 42.2
            name: accuracy
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 41.96
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 63.85
            name: accuracy
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 20.09
            name: accuracy

Model Card for Gemma 2B Zephyr SFT

We fine-tuned google/gemma-2b on the HuggingFaceH4/deita-10k-v0-sft dataset. We carefully selected the hyperparameters and masked the user tokens during training, so that user turns do not contribute to the loss, to get the best supervised fine-tuning performance.
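The user-token masking mentioned above can be sketched as follows. This is a minimal illustration, not the training code used for this model: the `mask_user_tokens` helper, the role names, and the token ids are all hypothetical, and it assumes that only assistant tokens are kept as training targets. The value `-100` is the default `ignore_index` of PyTorch's cross-entropy loss, so positions carrying that label are skipped when the loss is computed.

```python
# Hypothetical sketch of user-token masking for supervised fine-tuning.
# Assumption: only assistant tokens are used as training targets.

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_user_tokens(segments):
    """Build (input_ids, labels) from a list of (role, token_ids) pairs.

    Tokens from non-assistant turns keep their ids in input_ids but get
    IGNORE_INDEX in labels, so they are excluded from the loss.
    """
    input_ids, labels = [], []
    for role, token_ids in segments:
        input_ids.extend(token_ids)
        if role == "assistant":
            labels.extend(token_ids)
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return input_ids, labels


# Made-up token ids for a two-turn conversation.
conversation = [
    ("user", [11, 12, 13]),
    ("assistant", [21, 22]),
    ("user", [14]),
    ("assistant", [23, 24, 25]),
]
input_ids, labels = mask_user_tokens(conversation)
```

Here `input_ids` still contains the full conversation, so the model sees the user turns as context, while `labels` marks them as ignored.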

Model description

  • Model type: A 2.5B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
  • Language(s) (NLP): Primarily English
  • License: Gemma Terms of Use
  • Finetuned from model: google/gemma-2b

License

This model is released under the same license as the original Gemma model collection, the Gemma Terms of Use (https://ai.google.dev/gemma/terms).

Open LLM Leaderboard Performance

| Models | Avg. | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8k |
|---|---|---|---|---|---|---|---|
| google/gemma-2b | 46.37 | 48.38 | 71.77 | 41.77 | 33.08 | 66.77 | 16.91 |
| google/gemma-2b-it | 42.75 | 43.94 | 62.70 | 37.65 | 45.82 | 60.93 | 5.46 |
| wandb/gemma-2b-zephyr-sft | 47.18 | 49.74 | 72.38 | 41.37 | 34.42 | 66.93 | 18.27 |
| wandb/gemma-2b-zephyr-dpo | 46.92 | 49.66 | 72.23 | 41.13 | 34.47 | 66.54 | 17.51 |
| Columbia-NLP/gemma-2b-zephyr-sft | 48.75 | 51.80 | 72.63 | 42.20 | 41.96 | 63.85 | 20.09 |
| Columbia-NLP/gemma-2b-zephyr-dpo | 49.14 | 52.22 | 73.11 | 42.55 | 42.64 | 64.40 | 19.94 |
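
For reference, the Avg. column appears to be the unweighted mean of the six benchmark scores. The sketch below checks this for the Columbia-NLP/gemma-2b-zephyr-sft row only; the variable names are illustrative and the scores are copied from the table above.

```python
# Scores for Columbia-NLP/gemma-2b-zephyr-sft, copied from the table above.
scores = {
    "ARC": 51.80,
    "HellaSwag": 72.63,
    "MMLU": 42.20,
    "TruthfulQA": 41.96,
    "Winogrande": 63.85,
    "GSM8k": 20.09,
}

# Unweighted mean over the six benchmarks (assumed definition of "Avg.").
average = sum(scores.values()) / len(scores)
print(f"{average:.3f}")
```

The mean comes out at roughly 48.755, in line with the 48.75 listed for this row.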

MT-Bench

GPT-4-0125-preview as Judge

| Model | Total | Coding | Extraction | Humanities | Math | Reasoning | Roleplay | STEM | Writing |
|---|---|---|---|---|---|---|---|---|---|
| google/gemma-2b-it | 4.71 | 2.95 | 4.35 | 6.15 | 2.90 | 3.50 | 5.60 | 5.50 | 6.70 |
| wandb/gemma-2b-zephyr-sft | 4.03 | 3.10 | 3.15 | 5.00 | 2.70 | 2.65 | 5.10 | 4.80 | 5.75 |
| wandb/gemma-2b-zephyr-dpo | 4.06 | 2.80 | 2.90 | 5.55 | 2.65 | 2.70 | 5.20 | 4.80 | 5.85 |
| Columbia-NLP/gemma-2b-zephyr-sft | 4.34 | 3.10 | 3.70 | 6.25 | 2.65 | 2.70 | 5.55 | 5.25 | 5.50 |
| Columbia-NLP/gemma-2b-zephyr-dpo | 4.75 | 3.50 | 4.05 | 6.75 | 3.30 | 3.70 | 5.85 | 5.40 | 5.53 |