metadata
tags:
  - generated_from_trainer
license: mit
datasets:
  - HuggingFaceH4/ultrachat_200k
  - HuggingFaceH4/ultrafeedback_binarized
language:
  - en
base_model: mistralai/Mistral-7B-v0.1
widget:
  - text: |
      <|system|>
      You are a pirate chatbot who always responds with Arr!</s>
      <|user|>
      There's a llama on my lawn, how can I get rid of him?</s>
      <|assistant|>
    output:
      text: >-
        Arr! 'Tis a puzzlin' matter, me hearty! A llama on yer lawn be a rare
        sight, but I've got a plan that might help ye get rid of 'im. Ye'll need
        to gather some carrots and hay, and then lure the llama away with the
        promise of a tasty treat. Once he's gone, ye can clean up yer lawn and
        enjoy the peace and quiet once again. But beware, me hearty, for there
        may be more llamas where that one came from! Arr!
pipeline_tag: text-generation
model-index:
  - name: zephyr-7b-beta
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            name: normalized accuracy
            value: 62.03071672354948
        source:
          name: Open LLM Leaderboard
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            name: normalized accuracy
            value: 84.35570603465445
        source:
          name: Open LLM Leaderboard
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Drop (3-Shot)
          type: drop
          split: validation
          args:
            num_few_shot: 3
        metrics:
          - type: f1
            name: f1 score
            value: 9.66243708053691
        source:
          name: Open LLM Leaderboard
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 57.44916942762855
        source:
          name: Open LLM Leaderboard
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            name: accuracy
            value: 12.736921910538287
        source:
          name: Open LLM Leaderboard
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            name: accuracy
            value: 61.07
        source:
          name: Open LLM Leaderboard
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            name: accuracy
            value: 77.7426992896606
        source:
          name: Open LLM Leaderboard
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AlpacaEval
          type: tatsu-lab/alpaca_eval
        metrics:
          - type: unknown
            name: win rate
            value: 0.906
        source:
          url: https://tatsu-lab.github.io/alpaca_eval/
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MT-Bench
          type: unknown
        metrics:
          - type: unknown
            name: score
            value: 7.34
        source:
          url: https://huggingface.co/spaces/lmsys/mt-bench

Neuronx model for Zephyr 7B β

This repository contains AWS Inferentia2 and neuronx-compatible checkpoints for HuggingFaceH4/zephyr-7b-beta. Detailed information about the base model is available on its Model Card.

This model has been exported to the neuron format using the input_shapes and compiler_args listed under "Arguments passed during export" below.

Please refer to the 🤗 optimum-neuron documentation for an explanation of these parameters.

Usage on Amazon SageMaker

Coming soon.

Usage with 🤗 optimum-neuron

from optimum.neuron import pipeline

pipe = pipeline('text-generation', 'aws-neuron/zephyr-7b-seqlen-2048-bs-4-cores-2')
# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
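
To make the prompt format concrete, here is a minimal sketch of what the chat template call above is expected to produce, inferred from the widget example in this card's metadata. The helper `format_zephyr_prompt` is hypothetical and not part of optimum-neuron or transformers. The tokenizer's own template remains authoritative, and its exact whitespace may differ from this approximation.

```python
from typing import Dict, List

# End-of-turn marker as it appears in the widget example of this card.
EOS = "</s>"


def format_zephyr_prompt(
    messages: List[Dict[str, str]], add_generation_prompt: bool = True
) -> str:
    """Hypothetical helper: approximate the Zephyr chat format by hand."""
    parts = []
    for message in messages:
        # Each turn is a role tag on its own line, then the content and EOS.
        parts.append(f"<|{message['role']}|>\n{message['content']}{EOS}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = format_zephyr_prompt(messages)
print(prompt)
```

Printing the prompt this way is mainly useful for debugging, for example to confirm that the system message and the open assistant turn are present before calling the pipeline.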

This repository contains tags specific to versions of neuronx. When using 🤗 optimum-neuron, load the repo revision that matches the version of neuronx you are running, so that the right serialized checkpoints are loaded.

Arguments passed during export

input_shapes

{
  "batch_size": 4,
  "sequence_length": 2048
}

compiler_args

{
  "auto_cast_type": "fp16",
  "num_cores": 2
}
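
Because the checkpoints were compiled with static shapes, it can help to check a request against them before sending it. The sketch below assumes that a batch may hold at most `batch_size` prompts and that prompt tokens plus newly generated tokens must fit within `sequence_length`. Both are assumptions about how the compiled shapes constrain generation, so consult the optimum-neuron documentation for the exact behavior. The helper name and the token counts are hypothetical.

```python
from typing import List

# Values copied from the input_shapes listed in this card.
BATCH_SIZE = 4
SEQUENCE_LENGTH = 2048


def fits_compiled_shapes(prompt_token_counts: List[int], max_new_tokens: int) -> bool:
    """Hypothetical check: does a batch fit the compiled static shapes?"""
    # Assumption: an empty batch or one larger than the compiled batch size is rejected.
    if not prompt_token_counts or len(prompt_token_counts) > BATCH_SIZE:
        return False
    # Assumption: the longest prompt plus generated tokens must fit the window.
    return max(prompt_token_counts) + max_new_tokens <= SEQUENCE_LENGTH


print(fits_compiled_shapes([120, 340], max_new_tokens=256))  # True
print(fits_compiled_shapes([1900], max_new_tokens=256))      # False: 2156 > 2048
print(fits_compiled_shapes([10] * 5, max_new_tokens=16))     # False: 5 prompts > 4
```

In practice the token counts would come from the tokenizer, for example the length of the encoded prompt, rather than from hard-coded numbers.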