metadata
language:
  - zh
  - en
license: apache-2.0
model-index:
  - name: tigerbot-70b-base
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 62.46
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TigerResearch/tigerbot-70b-base
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 83.61
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TigerResearch/tigerbot-70b-base
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 65.49
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TigerResearch/tigerbot-70b-base
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 52.76
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TigerResearch/tigerbot-70b-base
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 80.19
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TigerResearch/tigerbot-70b-base
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 37.76
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TigerResearch/tigerbot-70b-base
          name: Open LLM Leaderboard

TigerBot

A cutting-edge foundation for your very own LLM.

💻 GitHub • 🌐 TigerBot • 🤗 Hugging Face

Quick Start

  • Method 1: use through transformers

    • Clone the TigerBot repo

      git clone https://github.com/TigerResearch/TigerBot.git
      
    • Run the inference script

      python infer.py --model_path TigerResearch/tigerbot-70b-base-v1 --model_type base
      
  • Method 2: download the weights locally

    • Clone the TigerBot repo

      git clone https://github.com/TigerResearch/TigerBot.git
      
    • Install Git LFS: git lfs install

    • Download the weights from Hugging Face or ModelScope (either one is enough; both commands clone into the same directory name)

      git clone https://huggingface.co/TigerResearch/tigerbot-70b-base-v1
      git clone https://www.modelscope.cn/TigerResearch/tigerbot-70b-base-v1.git
      
    • Run the inference script

      python infer.py --model_path tigerbot-70b-base-v1 --model_type base
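
For readers who want to call the model from their own code rather than through infer.py, the following is a minimal sketch of loading the checkpoint directly with transformers. It is not the repository's infer.py. The function name `generate`, its parameters, the bf16 dtype, and `device_map="auto"` (which needs the accelerate package) are assumptions made for illustration, as is the premise that the checkpoint loads with the stock Auto classes.

```python
# Hub repo id taken from the Quick Start commands; a local clone path also works.
MODEL_ID = "TigerResearch/tigerbot-70b-base-v1"


def generate(prompt: str, model_path: str = MODEL_ID, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and return the prompt followed by the model's continuation."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        torch_dtype=torch.bfloat16,  # assumption: hardware with bf16 support
        device_map="auto",           # assumption: accelerate is installed
    )

    # Tokenize the prompt and move the tensors to the model's first device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the full sequence (prompt plus newly generated tokens).
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` with a prompt of your choice downloads the weights on first use; for a local clone, pass `model_path="tigerbot-70b-base-v1"`. Because this is a base model, it continues the text rather than following chat-style instructions.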
      

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 62.1 |
| ARC (25-shot) | 62.46 |
| HellaSwag (10-shot) | 83.61 |
| MMLU (5-shot) | 65.49 |
| TruthfulQA (0-shot) | 52.76 |
| Winogrande (5-shot) | 80.19 |
| GSM8K (5-shot) | 37.76 |
| DROP (3-shot) | 52.45 |

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 63.71 |
| AI2 Reasoning Challenge (25-Shot) | 62.46 |
| HellaSwag (10-Shot) | 83.61 |
| MMLU (5-Shot) | 65.49 |
| TruthfulQA (0-shot) | 52.76 |
| Winogrande (5-shot) | 80.19 |
| GSM8k (5-shot) | 37.76 |
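
The two results sections report different averages (62.1 and 63.71) even though the six shared scores are identical. A short check, using only the numbers listed in this card, suggests the difference comes from whether DROP is included in the mean. The variable names below are illustrative.

```python
# Scores shared by both results tables in this card.
SCORES = {
    "ARC (25-shot)": 62.46,
    "HellaSwag (10-shot)": 83.61,
    "MMLU (5-shot)": 65.49,
    "TruthfulQA (0-shot)": 52.76,
    "Winogrande (5-shot)": 80.19,
    "GSM8K (5-shot)": 37.76,
}
DROP_SCORE = 52.45  # DROP (3-shot), listed only in the first table


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


avg_six = mean(SCORES.values())                            # six benchmarks
avg_seven = mean(list(SCORES.values()) + [DROP_SCORE])     # six benchmarks plus DROP

print(f"six benchmarks:   {avg_six:.2f}")    # 63.71
print(f"seven benchmarks: {avg_seven:.2f}")  # 62.10
```

The six-benchmark mean matches the 63.71 in the second table, and adding DROP gives the 62.1 in the first.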