Update README.md
README.md
CHANGED
```diff
@@ -45,16 +45,7 @@ pipeline_tag: text-generation
 
 # 1. Model Introduction
 
-- Orion-14B-Chat is a chat model fine-tuned on a high-quality corpus of approximately 850,000 entries.
-
-- The 850,000-entry fine-tuning corpus comprises two parts: approximately 220,000 manually curated high-quality entries, and 630,000 entries selected and semantically deduplicated from open-source data through model filtering. Among these, the Japanese and Korean data, totaling 70,000 entries, have undergone only basic cleaning and deduplication.
-
-- The Orion-14B series models exhibit the following features:
-  - Among models at the 20B-parameter scale, the Orion-14B-Base model shows outstanding performance in comprehensive evaluations.
-  - Strong multilingual capabilities, significantly outperforming on Japanese and Korean test sets.
-  - The fine-tuned models demonstrate strong adaptability, excelling in human-annotated blind tests.
-  - The long-chat version supports extremely long texts, extending up to 200K tokens.
-  - The quantized versions reduce model size by 70% and improve inference speed by 30%, with a performance loss of less than 1%.
+- Orion-14B-Chat-Int4 is quantized with AWQ from Orion-14B-Chat, reducing model size by 70% and improving inference speed by 30%, with a performance loss of less than 1%.
 <div align="center">
 <img src="./assets/imgs/model_cap_en.png" alt="model_cap_en" width="50%" />
 </div>
```
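Since the new intro line says the checkpoint was produced with AWQ from Orion-14B-Chat, a sketch of that workflow may help readers. Below is a minimal, hypothetical sketch using the AutoAWQ library (`pip install autoawq`); the quantization settings shown are AutoAWQ's common defaults, and whether this exact tool and configuration produced the released checkpoint is an assumption, not something the diff states.

```python
# Hypothetical sketch: producing an int4 AWQ checkpoint from the full-precision
# chat model with the AutoAWQ library. quant_config values are AutoAWQ's common
# defaults, not confirmed settings for the released Orion-14B-Chat-Int4 weights.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "OrionStarAI/Orion-14B-Chat"
quant_path = "./Orion-14B-Chat-Int4"  # illustrative local output directory
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# trust_remote_code=True is needed because the Orion repos ship custom modeling code.
model = AutoAWQForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

model.quantize(tokenizer, quant_config=quant_config)  # runs AWQ calibration
model.save_quantized(quant_path)                      # writes int4 weights + config
tokenizer.save_pretrained(quant_path)
```

To consume the published int4 checkpoint, the usual pattern from the OrionStarAI model cards applies; note that the `chat()` helper comes from the repo's remote code, so this assumes `trust_remote_code=True` and an installed `autoawq` so transformers can load the AWQ weights:

```python
# Minimal usage sketch: loading the int4 AWQ checkpoint with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OrionStarAI/Orion-14B-Chat-Int4"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",          # place weights on available GPU(s)
    torch_dtype=torch.float16,  # activations in fp16; weights stay int4
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Hello, what is your name?"}]
response = model.chat(tokenizer, messages, streaming=False)  # repo-provided helper
print(response)
```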