---
license: other
license_name: yi-license
license_link: LICENSE
widget:
- text: 你好! 你叫什么名字!
  output:
    text: 你好,我的名字叫聚言,很高兴见到你。
pipeline_tag: text-generation
model-index:
- name: OrionStar-Yi-34B-Chat-Llama
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 64.93
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=OrionStarAI/OrionStar-Yi-34B-Chat-Llama
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 84.34
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=OrionStarAI/OrionStar-Yi-34B-Chat-Llama
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 73.67
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=OrionStarAI/OrionStar-Yi-34B-Chat-Llama
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 53.35
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=OrionStarAI/OrionStar-Yi-34B-Chat-Llama
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 78.85
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=OrionStarAI/OrionStar-Yi-34B-Chat-Llama
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 53.9
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=OrionStarAI/OrionStar-Yi-34B-Chat-Llama
      name: Open LLM Leaderboard
---

[OrionStarAI/OrionStar-Yi-34B-Chat-Llama](https://huggingface.co/OrionStarAI/OrionStar-Yi-34B-Chat-Llama/tree/main)

*This model is identical to [OrionStarAI/OrionStar-Yi-34B](https://huggingface.co/OrionStarAI/OrionStar-Yi-34B/tree/main), except that the tensors have been renamed to follow the LLaMA format for automatic evaluation on the HF leaderboard.*

# Model Introduction

- OrionStar-Yi-34B-Chat from OrionStarAI is based on the open-source Yi-34B model and fine-tuned on a high-quality corpus of over 15 million sentences. It aims to provide an excellent interactive experience for users in the large model community.

- The Yi series models, open-sourced by the 01-ai team, have shown impressive performance on various benchmarks in Chinese, English, and general domains. OrionStar-Yi-34B-Chat further explores the potential of Yi-34B: after extensive fine-tuning on a large, high-quality corpus, it performs exceptionally well on evaluation data. We strive to make it an outstanding open-source alternative to ChatGPT.

- Our fine-tuned model is completely open for academic research, but please adhere to the [agreement](#license) and the [Yi License](https://github.com/01-ai/Yi/blob/main/MODEL_LICENSE_AGREEMENT.txt).

- Model Evaluation Results

We used [opencompass](https://opencompass.org.cn) to run 5-shot evaluation on the following general-domain datasets. The results for the other models are taken from the [opencompass leaderboard](https://opencompass.org.cn/leaderboard-llm).

|                           | C-Eval    | MMLU   | CMMLU     |
|---------------------------|-----------|--------|-----------|
| **GPT-4**                 | 69.9      | **83** | 71        |
| **ChatGPT**               | 52.5      | 69.1   | 53.9      |
| **Claude-1**              | 52        | 65.7   | -         |
| **TigerBot-70B-Chat-V2**  | 57.7      | 65.9   | 59.9      |
| **WeMix-LLaMA2-70B**      | 55.2      | 71.3   | 56        |
| **LLaMA-2-70B-Chat**      | 44.3      | 63.8   | 43.3      |
| **Qwen-14B-Chat**         | 71.7      | 66.4   | 70        |
| **Baichuan2-13B-Chat**    | 56.7      | 57     | 58.4      |
| **OrionStar-Yi-34B-Chat** | **77.71** | 78.32  | **73.52** |

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_OrionStarAI__OrionStar-Yi-34B-Chat-Llama)

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |68.17|
|AI2 Reasoning Challenge (25-Shot)|64.93|
|HellaSwag (10-Shot)              |84.34|
|MMLU (5-Shot)                    |73.67|
|TruthfulQA (0-shot)              |53.35|
|Winogrande (5-shot)              |78.85|
|GSM8k (5-shot)                   |53.90|
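
The "Avg." row in the leaderboard table above appears to be the unweighted mean of the six benchmark scores. The sketch below checks that arithmetic; the scores are copied from the table, while the helper name `leaderboard_average` and the assumption of an unweighted mean rounded to two decimals are illustrative, not taken from the leaderboard's own code.

```python
# Scores copied from the Open LLM Leaderboard table in this model card.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 64.93,
    "HellaSwag (10-Shot)": 84.34,
    "MMLU (5-Shot)": 73.67,
    "TruthfulQA (0-shot)": 53.35,
    "Winogrande (5-shot)": 78.85,
    "GSM8k (5-shot)": 53.90,
}


def leaderboard_average(values):
    """Hypothetical helper: unweighted mean, rounded to two decimals.

    Assumes every benchmark carries equal weight.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. {average:.2f}")
```

Under these assumptions the computed value matches the 68.17 reported in the table.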