Commit eb30cff by FrankC0st1e (verified) · 1 Parent(s): 7d01e83

Create README.md

Files changed (1)
  1. README.md +31 -0
README.md ADDED
@@ -0,0 +1,31 @@
+ # Introduction
+
+ MiniCPM-MoE-8x2B is a decoder-only, transformer-based generative language model.
+
+ MiniCPM-MoE-8x2B adopts a Mixture-of-Experts (MoE) architecture with 8 experts per layer, of which 2 are activated for each token.
+
+ # Usage
+ This version of the model has been instruction-tuned but has not been trained with other methods such as RLHF. The chat template is applied automatically.
+ ``` python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+ torch.manual_seed(0)  # fix the seed so sampling is reproducible
+
+ path = 'openbmb/MiniCPM-MoE-8x2B'
+ tokenizer = AutoTokenizer.from_pretrained(path)
+ # trust_remote_code=True loads the model code shipped with the repo
+ model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map='cuda', trust_remote_code=True)
+
+ # Prompt (in Chinese): "Which is the highest mountain in Shandong Province? Is it taller or shorter than Huangshan, and by how much?"
+ response, history = model.chat(tokenizer, "山东省最高的山是哪座山, 它比黄山高还是矮?差距多少?", temperature=0.8, top_p=0.8)
+ print(response)
+ ```
+
+ # Note
+ 1. You can also run inference with [vLLM](https://github.com/vllm-project/vllm), which will be compatible with this repo and offers much higher inference throughput.
+ 2. The model weights in this repo are stored in bfloat16. Manual conversion is needed for other dtypes.
+
+ # Statement
+ 1. As a language model, MiniCPM-MoE-8x2B generates content by learning from a vast amount of text.
+ 2. However, it does not possess the ability to comprehend or express personal opinions or value judgments.
+ 3. Any content generated by MiniCPM-MoE-8x2B does not represent the viewpoints or positions of the model developers.
+ 4. Therefore, when using content generated by MiniCPM-MoE-8x2B, users should take full responsibility for evaluating and verifying it on their own.
+
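
The Introduction in the README above states that each layer has 8 experts and that 2 of them are activated per token. The following is a minimal sketch of that top-2 routing idea in plain Python, not the model's actual implementation. The function names (`route_token`, `moe_layer`), the toy experts, and the choice to renormalize the two selected softmax weights are all assumptions made for illustration; the README does not describe how MiniCPM-MoE-8x2B computes or normalizes its router weights.

```python
import math

NUM_EXPERTS = 8  # experts per layer, as stated in the README
TOP_K = 2        # experts activated per token, as stated in the README


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k experts for one token (hypothetical helper).

    Returns (expert_index, weight) pairs; the weights are renormalized
    to sum to 1, which is an assumption of this sketch.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    mass = sum(probs[i] for i in chosen)
    return [(i, probs[i] / mass) for i in chosen]


def moe_layer(token, experts, router_logits):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](token) for i, w in route_token(router_logits))


# Toy experts: expert i simply scales its input by (i + 1).
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]
logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]  # made-up router scores

selected = route_token(logits)
output = moe_layer(1.0, experts, logits)
print(selected, output)
```

With these made-up logits the router selects experts 1 and 4, so only 2 of the 8 toy experts are evaluated for the token. That is the source of the compute savings of a sparse MoE layer relative to running every expert.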