metadata
license: apache-2.0
datasets:
  - BelleGroup/train_1M_CN
  - BelleGroup/multiturn_chat_0.8M
  - jeffwan/sharegpt_vicuna
language:
  - zh
  - en
library_name: transformers
pipeline_tag: text-generation
tags:
  - chat
widget:
  - text: '<Human>: Hello <eoh> <Assistant>:'
    example_title: Hello
  - text: '<Human>: 你好 <eoh> <Assistant>:'
    example_title: 你好
  - text: '<Human>: What should I do if I can''t sleep at night? <eoh> <Assistant>:'
    example_title: insomnia
  - text: '<Human>: 晚上睡不着应该怎么办? <eoh> <Assistant>:'
    example_title: 失眠
inference:
  parameters:
    temperature: 0.8
    max_new_tokens: 128

ChatBLOOM

ChatBLOOM is a Chinese-English bilingual dialogue language model built on BLOOM (1.7 billion parameters). This checkpoint is the supervised fine-tuned (SFT) version. See GitHub for details.

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = 'nicholascao/chatbloom-1b7-sft'
device = torch.cuda.current_device()  # requires a CUDA-capable GPU

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Reuse the end-of-sequence token as the padding token
tokenizer.pad_token_id = tokenizer.eos_token_id

# Cast the weights to half precision and move the model to the GPU
model = AutoModelForCausalLM.from_pretrained(model_name).half().to(device)

# Prompt format: '<Human>: ... <eoh> <Assistant>:'
inputs = tokenizer('<Human>: Hello <eoh> <Assistant>:', return_tensors='pt').to(device)

# Sample a reply; max_length counts the prompt tokens plus the generated tokens.
# early_stopping is omitted because it only affects beam search, not sampling.
output = model.generate(**inputs, max_length=768, do_sample=True, temperature=0.8, top_k=50, repetition_penalty=1.1)
output = tokenizer.decode(output[0], skip_special_tokens=True)
print(output)
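
The prompt string above follows the same single-turn pattern as the widget examples. A minimal sketch of two helpers for working with that pattern is shown below. The names `build_prompt` and `strip_prompt_tokens` are hypothetical and not part of this repository, and the sketch assumes a single-turn prompt only, since the multi-turn format is not documented here.

```python
def build_prompt(user_message: str) -> str:
    """Wrap one user message in the single-turn format used by the widget examples.

    Hypothetical helper; assumes the '<Human>: ... <eoh> <Assistant>:' pattern.
    """
    return f'<Human>: {user_message.strip()} <eoh> <Assistant>:'


def strip_prompt_tokens(output_ids, prompt_length: int):
    """Drop the first prompt_length token ids from a generated sequence.

    Causal language models return the prompt followed by the new tokens,
    so slicing off the prompt leaves only the generated part.
    """
    return output_ids[prompt_length:]


if __name__ == '__main__':
    print(build_prompt('Hello'))
```

With the usage code above, the ids returned by `model.generate` could be passed through `strip_prompt_tokens(ids[0], inputs['input_ids'].shape[1])` before decoding, so that only the assistant's reply is printed rather than the prompt plus the reply.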

Limitations and Usage Restrictions

The datasets we used (e.g. BELLE) require that developers use the data for research purposes only. Commercial use and other potentially harmful uses of our models are therefore not allowed.