sharp committed
Commit b371102 • 1 Parent(s): 9b62647

Update README.md

Files changed (1)
  1. README.md +7 -8

README.md CHANGED
@@ -11,9 +11,7 @@ pipeline_tag: text-generation
 
 <!-- markdownlint-disable first-line-h1 -->
 <!-- markdownlint-disable html -->
-<div align="center">
-<img src="./assets/imgs/orion_start.PNG" alt="logo" width="50%" />
-</div>
+![](./assets/imgs/orion_start.PNG)
 
 <div align="center">
 <h1>
@@ -28,7 +26,7 @@ pipeline_tag: text-generation
 <p>
 <b>🌐English</b> |
 <a href="https://huggingface.co/OrionStarAI/Orion-14B-Chat-Int4/blob/main/README_zh.md">🇨🇳中文</a><br><br>
-🤗 <a href="https://huggingface.co/OrionStarAI" target="_blank">HuggingFace Mainpage</a> | 🤖 <a href="https://modelscope.cn/organization/OrionStarAI" target="_blank">ModelScope Mainpage</a><br>🎬 <a href="https://huggingface.co/spaces/OrionStarAI/Orion-14B-App-Demo" target="_blank">HuggingFace Demo</a> | 🎫 <a href="https://modelscope.cn/studios/OrionStarAI/Orion-14B-App-Demo/summary" target="_blank">ModelScope Demo</a><br>📖 <a href="https://github.com/OrionStarAI/Orion/blob/master/doc/Orion14B_v3.pdf" target="_blank">Tech Report</a>
+🤗 <a href="https://huggingface.co/OrionStarAI" target="_blank">HuggingFace Mainpage</a> | 🤖 <a href="https://modelscope.cn/organization/OrionStarAI" target="_blank">ModelScope Mainpage</a><br>🎬 <a href="https://huggingface.co/spaces/OrionStarAI/Orion-14B-App-Demo" target="_blank">HuggingFace Demo</a> | 🎫 <a href="https://modelscope.cn/studios/OrionStarAI/Orion-14B-App-Demo/summary" target="_blank">ModelScope Demo</a>
 <p>
 </h4>
@@ -42,12 +40,14 @@ pipeline_tag: text-generation
 - [🔗 Model Download](#model-download)
 - [🔖 Model Benchmark](#model-benchmark)
 - [📊 Model Inference](#model-inference)
-- [📜 Declarations & License](#declarations-license)
 - [🥇 Company Introduction](#company-introduction)
+- [📜 Declarations & License](#declarations-license)
 
 # 1. Model Introduction
 
-- Orion-14B series models are open-source multilingual large language models trained from scratch by OrionStarAI. The base model is trained on a 2.5T multilingual corpus, including Chinese, English, Japanese, and Korean, and it exhibits superior performance in these languages. For details, please refer to the [tech report](https://github.com/OrionStarAI/Orion/blob/master/doc/Orion14B_v3.pdf).
+- Orion-14B-Chat is fine-tuned from Orion-14B-Base on a high-quality corpus of approximately 850,000 entries (SFT only), and it likewise supports Chinese, English, Japanese, and Korean. It performs exceptionally well on the MT-Bench and AlignBench evaluation sets, significantly surpassing other models of the same parameter scale on multiple metrics. For details, please refer to the [tech report](https://github.com/OrionStarAI/Orion/blob/master/doc/Orion14B_v3.pdf).
+
+- The 850,000-entry fine-tuning corpus comprises two parts: approximately 220,000 manually curated high-quality entries and 630,000 entries selected from open-source data through model filtering and semantic deduplication. Of these, the Japanese and Korean data (70,000 entries in total) have undergone only basic cleaning and deduplication.
 
 - The Orion-14B series models exhibit the following features:
   - Among models at the 20B-parameter scale, the Orion-14B-Base model shows outstanding performance in comprehensive evaluations.
@@ -174,8 +174,7 @@ Model release and download links are provided in the table below:
 | Llama2-13B-Chat    | 3.05 | 3.79 | 5.43 | 4.40 | 6.76 | 6.63 | 6.99 | 5.65 | 4.70 |
 | InternLM-20B-Chat  | 3.39 | 3.92 | 5.96 | 5.50 |**7.18**| 6.19 | 6.49 | 6.22 | 4.96 |
 | **Orion-14B-Chat** | 4.00 | 4.24 | 6.18 |**6.57**| 7.16 |**7.36**|**7.16**|**6.99**| 5.51 |
-
-\* use vllm for inference
+\* use vLLM for inference
 
 ## 3.3. LongChat Model Orion-14B-LongChat Benchmarks
 ### 3.3.1. LongChat evaluation of LongBench
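The footnote restored in the final hunk notes that the chat benchmark numbers were obtained with vLLM. As a minimal sketch of that kind of setup, assuming vLLM's offline `LLM`/`SamplingParams` API (the model id and sampling values below are illustrative assumptions, not taken from this commit):

```python
# Minimal vLLM offline-inference sketch; model id and sampling values
# are illustrative assumptions, not part of this commit.
from vllm import LLM, SamplingParams

# Orion models ship custom modeling code, so trust_remote_code is needed.
llm = LLM(model="OrionStarAI/Orion-14B-Chat", trust_remote_code=True)

params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=128)
outputs = llm.generate(["Hello! What can you do?"], params)

for out in outputs:
    print(out.outputs[0].text)
```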