Commit 0a6d4bf (1 parent: 299589a) by lee0ray: Update README.md

README.md CHANGED
@@ -4,4 +4,60 @@ language:
 - en
 - zh
 pipeline_tag: text-generation
----
+---
+
+# Unichat-llama3-Chinese-8B
+
+
+## Introduction
+
+* This model is fine-tuned from Meta-Llama-3-8B-Instruct with additional Chinese data, addressing the weak Chinese-language ability of the base Llama 3 model
+* Base model: [**Meta-Llama-3-8B-Instruct**](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
+
+## Quick Start
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+
+model_id = "UnicomLLM/Unichat-llama3-Chinese-8B"
+
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+)
+
+messages = [
+    {"role": "system", "content": "You are a helpful assistant"},
+    {"role": "user", "content": "Who are you?"},
+]
+
+input_ids = tokenizer.apply_chat_template(
+    messages,
+    add_generation_prompt=True,
+    return_tensors="pt"
+).to(model.device)
+
+terminators = [
+    tokenizer.eos_token_id,
+    tokenizer.convert_tokens_to_ids("<|eot_id|>")
+]
+
+outputs = model.generate(
+    input_ids,
+    max_new_tokens=256,
+    eos_token_id=terminators,
+    do_sample=True,
+    temperature=0.6,
+    top_p=0.9,
+)
+response = outputs[0][input_ids.shape[-1]:]
+print(tokenizer.decode(response, skip_special_tokens=True))
+```
+
+## Resources
+For more models, datasets, and training details, see:
+* GitHub: [**Unichat-llama3-Chinese**](https://github.com/UnicomAI/Unichat-llama3-Chinese)
+
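One step in the quick-start snippet worth a note: `model.generate` returns the prompt token ids followed by the newly generated ones, so `outputs[0][input_ids.shape[-1]:]` keeps only the model's reply. A minimal stand-alone sketch of that slice, with made-up token ids standing in for real tokenizer output:

```python
# Sketch of the prompt-stripping slice from the quick-start snippet.
# The token ids below are invented for illustration only.
prompt_ids = [128000, 9906, 1917]         # hypothetical prompt token ids
generated = prompt_ids + [40, 1097, 264]  # generate() echoes the prompt first
response = generated[len(prompt_ids):]    # drop the prompt portion
print(response)  # [40, 1097, 264]
```

The same indexing works on the batched tensor in the snippet because `outputs[0]` is a single 1-D sequence whose first `input_ids.shape[-1]` positions are the prompt.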