Yi-1.5-9B-Chat / README.md
anonymitaet's picture
Update README.md
22b0ac8 verified
specify theme context for images

πŸ€— HuggingFace β€’ πŸ€– ModelScope β€’ ✑️ WiseModel
πŸ™ GitHub β€’ πŸ‘Ύ Discord β€’ 🐀 Twitter β€’ πŸ’¬ WeChat
πŸ“ Paper β€’ πŸ™Œ FAQ β€’ πŸ“— Learning Hub

Intro

Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples.

Compared with Yi, Yi-1.5 delivers stronger performance in coding, math, reasoning, and instruction-following capability, while still maintaining excellent capabilities in language understanding, commonsense reasoning, and reading comprehension.

Model Context Length Pre-trained Tokens
Yi-1.5 4K 3.6T

Models

  • Chat models
Model Download
Yi-1.5-34B-Chat β€’ πŸ€— Hugging Face
Yi-1.5-9B-Chat β€’ πŸ€— Hugging Face
Yi-1.5-6B-Chat β€’ πŸ€— Hugging Face
  • Base models
Model Download
Yi-1.5-34B β€’ πŸ€— Hugging Face
Yi-1.5-9B β€’ πŸ€— Hugging Face
Yi-1.5-6B β€’ πŸ€— Hugging Face

Benchmarks

  • Chat models

    Waiting for benchmark results.

  • Base models

    • Yi-1.5-34B excels beyond or is on par with some larger models in overall performance.

      Model MMLU CMMLU BBH AGIEval HumanEva(+) MBPP(+) GSM8k MATH
      Mistral 8*22B 77.8 60.4 61.9 58.6 45.1(34.1) 71.2(-) 81.7 41.8
      DeepSeek-V2 78.5 84.0 78.9 - 48.8(-) 66.6(-) 79.2 43.6
      Qwen1.5-32B 73.7 83 66.8 72.6 37.8(35.4) 49.4(44.4) 79.5 37.7
      Qwen1.5-72B 77.5 84.2 65.5 71.7 41.5(39.0) 53.4(43.6) 81.4 41.8
      Llama3_70B_base 78.7 68.9 65.0 52.8 38.4(34.8) 69.7(52.9) 82.4 42.4
      Yi 1.5-34B 77.1 84.8 76.4 71.1 46.3(40.2) 65.5(55.4) 82.7 41.0
    • Yi-1.5-9B is a strong performer among similarly sized open-source models.

      Model MMLU CMMLU BBH AGIEval HumanEval(+) MBPP(+) GSM8k Math
      Gemma-7B 64.3 48.4 41.1 46.0 33.5(28.0) 45.8(32.8) 55.7 24.8
      Qwen1.5-7B 61.0 73.4 33.4 61.6 36.0(31.1) 46.1(37.6) 70.1 20.3
      Mistral-7B 62.5 44.6 45.0 42.4 29.3(22.6) 50.2(32.1) 47.5 15.5
      Mistral 8*7B 70.6 53.0 52.4 49.5 40.2(31.1) 60.7(31.1) 65.7 28.4
      Llama3-8B_Base 66.6 50.9 47.9 44.7 34.7(31.7) 48.0(44.9) 54.7 21.16
      Yi 1.5-6B 63.5 70.8 45.7 56.0 36.5(28.7) 56.8(46.9) 62.2 28.42
      Yi 1.5-9B 69.5 74.8 50.9 62.7 41.4(34.1) 61.1(53.6) 73.7 32.6

Quick Start

For getting up and running with Yi-1.5 models quickly, see README.