wenbopan committed on
Commit
5e10907
1 Parent(s): 40e6674

Update README.md

  1. README.md +6 -1
README.md CHANGED
@@ -16,7 +16,12 @@ library_name: transformers
16
 
17
  Gigi is fine-tuned on over 1.3 million pieces of high-quality Chinese-English bilingual corpus screened with the state-of-the-art Llama-3-8B-Instruct. It can better handle various downstream tasks and provide you with high-quality Chinese-English bilingual results. We incorporated high-quality fine-tuning data, such as Hermes and glaive-function-calling instructions, into the training, as well as a large amount of GPT4 data translated using GPT3.5. Gigi can meet your needs well in Chinese-English bilingual contexts.
18
 
19
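- Gigi is fine-tuned using the state-of-the-art Llama-3-8B-Instruct on over 1.3 million screened, high-quality Chinese-English bilingual samples. It handles a variety of downstream tasks better and provides high-quality Chinese-English bilingual results. The training included high-quality instruction fine-tuning data such as Hermes and glaive-function-calling, as well as a large amount of GPT4 data translated with GPT3.5. Gigi can serve your needs well in Chinese-English bilingual contexts.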
 
 
 
 
 
20
 
21
  # Gigi-Llama-3-8B-zh
22
 
 
16
 
17
  Gigi is fine-tuned on over 1.3 million pieces of high-quality Chinese-English bilingual corpus screened with the state-of-the-art Llama-3-8B-Instruct. It can better handle various downstream tasks and provide you with high-quality Chinese-English bilingual results. We incorporated high-quality fine-tuning data, such as Hermes and glaive-function-calling instructions, into the training, as well as a large amount of GPT4 data translated using GPT3.5. Gigi can meet your needs well in Chinese-English bilingual contexts.
18
 
19
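+ `Gigi` is a fine-tune of [Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on over 1.3 million screened, high-quality Chinese-English bilingual samples, with markedly improved Chinese capability.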
20
+
21
+ Training data sources:
22
+
23
+ - **English**: [OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5), containing over 1 million GPT-4-generated fine-tuning samples
24
+ - **Chinese**: over 200,000 samples, drawn from several high-quality Chinese SFT datasets and GPT-4-generated data with corrected translations.
25
 
26
  # Gigi-Llama-3-8B-zh
27