---
license: apache-2.0
datasets:
- BelleGroup/train_1M_CN
- BelleGroup/multiturn_chat_0.8M
- jeffwan/sharegpt_vicuna
language:
- zh
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- chat
widget:
- text: "<Human>: Hello <eoh> <Assistant>: "
  example_title: "Hello"
- text: "<Human>: 你好 <eoh> <Assistant>: "
  example_title: "你好"
- text: "<Human>: What should I do if I can't sleep at night? <eoh> <Assistant>: "
  example_title: "insomnia"
- text: "<Human>: 晚上睡不着应该怎么办? <eoh> <Assistant>: "
  example_title: "失眠"
inference:
  parameters:
    temperature: 0.8
    max_new_tokens: 128
---

# ChatBLOOM

ChatBLOOM is a Chinese-English bilingual dialogue language model trained on top of [BLOOM](https://huggingface.co/bigscience/bloom-1b7) (1.7 billion parameters). This model is the SFT (supervised fine-tuning) version.

See the [GitHub repository](https://github.com/NicholasCao/ChatBloom) for details.

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_id = 'nicholascao/chatbloom-1b7-sft'
device = torch.cuda.current_device()  # this example requires a CUDA GPU

# Load the tokenizer, the model (cast to half precision), and the
# generation settings published with the checkpoint.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).half().to(device)
generation_config = GenerationConfig.from_pretrained(model_id)

# Prompts follow the format "<Human>: ... <eoh> <Assistant>: ".
inputs = tokenizer('<Human>: Hello <eoh> <Assistant>: ', return_tensors='pt').to(device)

# Generate a continuation and decode it (the decoded text includes the prompt).
output = model.generate(**inputs, generation_config=generation_config)
output = tokenizer.decode(output[0], skip_special_tokens=True)
print(output)
```
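
The prompt in the example above is written by hand, and the decoded output contains the prompt as well as the reply. The sketch below shows one possible way to build the single-turn prompt and to keep only the newly generated tokens. The helper names `build_prompt` and `continuation_ids` are hypothetical and not part of this repository or of `transformers`. The template is copied from the example and the widget prompts in this card; multi-turn formatting is not described here, so it is not covered.

```python
from typing import Sequence

# Template pieces copied from the single-turn prompts shown in this card.
HUMAN_TAG = "<Human>: "
END_OF_HUMAN = " <eoh> "
ASSISTANT_TAG = "<Assistant>: "


def build_prompt(user_message: str) -> str:
    """Wrap one user message in the single-turn prompt format (hypothetical helper).

    Assumption: surrounding whitespace in the message carries no meaning,
    so it is stripped before the message is placed in the template.
    """
    return f"{HUMAN_TAG}{user_message.strip()}{END_OF_HUMAN}{ASSISTANT_TAG}"


def continuation_ids(output_ids: Sequence[int], prompt_length: int) -> Sequence[int]:
    """Drop the first `prompt_length` token ids from a generated sequence (hypothetical helper).

    For a causal language model, `generate` returns the prompt tokens
    followed by the new tokens, so slicing at the prompt length leaves
    only the continuation. Works on lists and on 1-D tensors alike.
    """
    return output_ids[prompt_length:]


if __name__ == "__main__":
    print(repr(build_prompt("Hello")))
    print(continuation_ids([11, 12, 13, 14, 15], 3))
```

With the objects from the example above, the reply alone could then be decoded with `tokenizer.decode(continuation_ids(output[0], inputs['input_ids'].shape[1]), skip_special_tokens=True)`, applied to the raw result of `model.generate` before it is overwritten by the decoded string.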

## Limitations and Usage Limits

The datasets we used (e.g. [BELLE](https://github.com/LianjiaTech/BELLE)) require that developers use the data for research purposes only.
Commercial use and other potentially harmful uses of our models are therefore not allowed.