---
license: apache-2.0
datasets:
- BelleGroup/train_1M_CN
- BelleGroup/multiturn_chat_0.8M
- jeffwan/sharegpt_vicuna
language:
- zh
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- chat
widget:
- text: "<Human>: Hello <eoh> <Assistant>:"
  example_title: "Hello"
- text: "<Human>: 你好 <eoh> <Assistant>:"
  example_title: "你好"
- text: "<Human>: What should I do if I can't sleep at night? <eoh> <Assistant>:"
  example_title: "insomnia"
- text: "<Human>: 晚上睡不着应该怎么办? <eoh> <Assistant>:"
  example_title: "失眠"
inference:
  parameters:
    temperature: 0.8
    max_new_tokens: 128
---
# ChatBLOOM

ChatBLOOM is a Chinese-English bilingual dialogue language model fine-tuned from [BLOOM](https://huggingface.co/bigscience/bloom-1b7) (1.7 billion parameters); this model is the SFT (supervised fine-tuning) version.
See [GitHub](https://github.com/NicholasCao/ChatBloom) for details.
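
The model expects the single-turn prompt format shown in the widget examples above: the user message is wrapped as `<Human>: ... <eoh>`, and the model completes the text after `<Assistant>:`. A minimal sketch of a hypothetical helper (the name `build_prompt` is ours, not part of the repository) that assembles such a prompt:

```python
def build_prompt(user_message: str) -> str:
    # Wrap the user turn with the markers from the widget examples;
    # the model generates its reply after '<Assistant>:'.
    return f'<Human>: {user_message} <eoh> <Assistant>:'

print(build_prompt('Hello'))  # -> '<Human>: Hello <eoh> <Assistant>:'
```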

## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained('nicholascao/chatbloom-1b7-sft')
tokenizer.pad_token_id = tokenizer.eos_token_id

# Load in half precision and move the model to the current GPU.
model = AutoModelForCausalLM.from_pretrained('nicholascao/chatbloom-1b7-sft').half()
model.to(torch.cuda.current_device())

inputs = tokenizer('<Human>: Hello <eoh> <Assistant>:', return_tensors='pt').to(torch.cuda.current_device())

# Generate a response by sampling.
output = model.generate(**inputs, max_length=768, do_sample=True, temperature=0.8, top_k=50, early_stopping=True, repetition_penalty=1.1)
output = tokenizer.decode(output[0], skip_special_tokens=True)
print(output)
```
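
Note that `generate` returns the prompt tokens followed by the reply, so the decoded string above still contains the prompt. A minimal sketch, assuming the `tokenizer`, `model`, and `inputs` variables from the snippet above, that bundles the same sampling settings in a reusable `GenerationConfig` and decodes only the assistant's reply:

```python
from transformers import GenerationConfig

# The same sampling settings as above, bundled for reuse.
generation_config = GenerationConfig(
    max_length=768,
    do_sample=True,
    temperature=0.8,
    top_k=50,
    repetition_penalty=1.1,
    pad_token_id=tokenizer.eos_token_id,
)

output_ids = model.generate(**inputs, generation_config=generation_config)

# Slice off the prompt tokens so only the generated reply is decoded.
prompt_length = inputs['input_ids'].shape[1]
reply = tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
print(reply)
```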

## Limitations and Usage Restrictions

The datasets we used (e.g. [BELLE](https://github.com/LianjiaTech/BELLE)) require that developers use the data for research purposes only.
Therefore, commercial and other potentially harmful uses of our models are not allowed.