Gong Baitao committed on
Commit
8a24b4e
1 Parent(s): e556329

Add README.md

  1. README.md +70 -0
README.md ADDED
@@ -0,0 +1,70 @@
+ ---
+ language:
+ - en
+ - zh
+ ---
+
+ # CPM-Bee
+
+ **CPM-Bee** is a fully open-source, commercially usable Chinese-English bilingual base model with ten billion parameters. It is the second milestone reached in the training process of [**CPM-live**](https://live.openbmb.org/).
+ Built on an auto-regressive Transformer architecture, CPM-Bee has been pre-trained on a trillion-token-scale corpus, which gives it strong foundational capabilities.
+
+ ## Model description
+
+ - **Open-source and Commercially Usable**: OpenBMB adheres to the spirit of open source and aims to make large-scale models accessible to everyone. CPM-Bee, as a foundation model, is fully open-source and available for commercial use, contributing to the advancement of the field of large-scale models.
+
+ - **Excellent Performance in Chinese and English**: The pre-training data for the CPM-Bee base model has been rigorously selected and balanced, resulting in strong performance in both Chinese and English. For details on evaluation tasks and results, please refer to the evaluation documentation.
+
+
+ - **Vast and High-quality Corpus**: CPM-Bee, as a base model, has been trained on a corpus of over a trillion tokens, making it one of the models with the highest volume of training data in the open-source community. We have also applied stringent selection, cleaning, and post-processing procedures to the pre-training corpus to ensure its quality.
+
+ - **Support for OpenBMB System**: The OpenBMB system provides a comprehensive ecosystem of tools and scripts for high-performance pre-training, adaptation, compression, deployment, and tool development. CPM-Bee, as a base model, is accompanied by all the necessary tool scripts, so developers can use and explore its advanced functionality efficiently.
+
+
+ - **Conversational and Tool Usage Capabilities**: Building on OpenBMB's exploration of instruction fine-tuning and tool learning, we have fine-tuned the CPM-Bee base model to obtain an instance model with strong conversational and tool usage capabilities. The API and beta testing for this model will be made available in the near future.
+
+ ## Intended uses & limitations
+
+ You can use the raw model for many NLP tasks, such as text generation, or fine-tune it for a downstream task.
+
+ ### How to use
+
+ ```python
+ >>> from transformers import AutoModelForCausalLM, AutoTokenizer
+ >>> tokenizer = AutoTokenizer.from_pretrained("openbmb/cpm-bee-5b", trust_remote_code=True)
+ >>> model = AutoModelForCausalLM.from_pretrained("openbmb/cpm-bee-5b", trust_remote_code=True).cuda()
+ >>> result = model.generate({"input": "今天天气不错,", "<ans>": ""}, tokenizer)
+ >>> print(result)
+ ```
+
+ To run inference on multiple GPUs, you can use `accelerate` as follows:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from accelerate import dispatch_model
+ from accelerate.utils import get_balanced_memory, infer_auto_device_map
+
+ tokenizer = AutoTokenizer.from_pretrained("openbmb/cpm-bee-5b", trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained("openbmb/cpm-bee-5b", trust_remote_code=True).cuda()
+
+ max_memory = get_balanced_memory(
+     model,
+     no_split_module_classes=["CpmBeeTransformerBlock"]
+ )
+ device_map = infer_auto_device_map(model, max_memory=max_memory, no_split_module_classes=["CpmBeeTransformerBlock"])
+ # Make sure the data is on the same device when projecting hidden states to logits.
+ device_map["cpmbee.encoder.output_layernorm"] = device_map["cpmbee.input_embedding"] = 0
+
+ model = dispatch_model(model, device_map=device_map)
+
+ res = model.generate(
+     [
+         {"input": "今天天气是真的", "<ans>": ""},
+         {"input": "NGC 6231是一个位于天蝎座的疏散星团,天球座标为赤经16时54分,赤纬-41度48分,视觉观测大小约45角分,亮度约2.6视星等,距地球5900光年。NGC 6231年龄约为三百二十万年,是一个非常年轻的星团,星团内的最亮星是5等的天蝎座 ζ1星。用双筒望远镜或小型望远镜就能看到个别的行星。NGC 6231在1654年被意大利天文学家乔瓦尼·巴蒂斯特·霍迪尔纳(Giovanni Battista Hodierna)以Luminosae的名字首次纪录在星表中,但是未见记载于夏尔·梅西耶的天体列表和威廉·赫歇尔的深空天体目录。这个天体在1678年被爱德蒙·哈雷(I.7)、1745年被夏西亚科斯(Jean-Phillippe Loys de Cheseaux)(9)、1751年被尼可拉·路易·拉卡伊(II.13)分别再次独立发现。", "question": "NGC 6231的经纬度是多少?", "<ans>": ""}
+     ],
+     tokenizer,
+     max_new_tokens=100
+ )
+ print(res)
+
+ ```
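
Both examples above pass `generate` plain dicts whose `"<ans>"` field is left empty for the model to fill, optionally with extra keys such as `"question"`. As a minimal sketch of that input shape (not part of the commit), the snippet below builds such dicts with a hypothetical helper, `build_request`. The helper name, the `ANSWER_KEY` constant, and the second sample text are assumptions made for illustration; only the key names `"input"`, `"question"`, and `"<ans>"` come from the examples above.

```python
import json

# Key that the examples above leave empty so the model can fill it in.
ANSWER_KEY = "<ans>"


def build_request(text, question=None):
    """Build one input dict shaped like the examples above.

    Hypothetical helper for illustration; it is not part of the CPM-Bee code.
    """
    request = {"input": text}
    if question is not None:
        # Optional extra field, as in the question-answering example.
        request["question"] = question
    # Empty answer slot, placed last to mirror the examples.
    request[ANSWER_KEY] = ""
    return request


# A batch in the same shape as the list passed to `generate` above.
requests = [
    build_request("今天天气是真的"),
    # Illustrative text and question, not taken from the README.
    build_request("NGC 6231是一个位于天蝎座的疏散星团。", question="NGC 6231位于哪个星座?"),
]

print(json.dumps(requests, ensure_ascii=False, indent=2))
```

A list built this way could then be passed in place of the literal list in the multi-GPU example, for instance `model.generate(requests, tokenizer, max_new_tokens=100)`.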