---
license: apache-2.0
datasets:
- BelleGroup/train_1M_CN
- BelleGroup/multiturn_chat_0.8M
- jeffwan/sharegpt_vicuna
language:
- zh
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- chat
widget:
- text: "<Human>: Hello <eoh> <Assistant>:"
  example_title: "Hello"
- text: "<Human>: 你好 <eoh> <Assistant>:"
  example_title: "你好"
- text: "<Human>: What should I do if I can't sleep at night? <eoh> <Assistant>:"
  example_title: "insomnia"
- text: "<Human>: 晚上睡不着应该怎么办？ <eoh> <Assistant>:"
  example_title: "失眠"
inference:
  parameters:
    temperature: 0.8
    max_new_tokens: 128
---
# ChatBLOOM

ChatBLOOM is a Chinese-English bilingual dialogue language model fine-tuned from [BLOOM](https://huggingface.co/bigscience/bloom-1b7) (1.7 billion parameters). This checkpoint is the supervised fine-tuning (SFT) version.
See the [GitHub repository](https://github.com/NicholasCao/ChatBloom) for details.

## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and reuse the EOS token as the padding token.
tokenizer = AutoTokenizer.from_pretrained('nicholascao/chatbloom-1b7-sft')
tokenizer.pad_token_id = tokenizer.eos_token_id

# Load the model in half precision and move it to the current GPU (CUDA is required).
device = torch.cuda.current_device()
model = AutoModelForCausalLM.from_pretrained('nicholascao/chatbloom-1b7-sft').half().to(device)

# Prompt format: "<Human>: {message} <eoh> <Assistant>:"
inputs = tokenizer('<Human>: Hello <eoh> <Assistant>:', return_tensors='pt').to(device)

# Sample a reply. max_length counts prompt tokens plus generated tokens.
# early_stopping is omitted because it only affects beam search.
output = model.generate(**inputs, max_length=768, do_sample=True, temperature=0.8, top_k=50, repetition_penalty=1.1)
output = tokenizer.decode(output[0], skip_special_tokens=True)
print(output)
```
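
The usage example decodes the full sequence, so the printed text contains the prompt as well as the reply. Below is a minimal sketch of two helpers for the single-turn prompt format shown above. `build_prompt` and `extract_reply` are hypothetical names, not part of this repository or of `transformers`. The sketch assumes the `<Human>:`, `<eoh>`, and `<Assistant>:` markers survive decoding as plain text. The multi-turn format is not documented here, so it is not covered.

```python
# Marker strings copied from the prompt in the usage example.
HUMAN_TAG = "<Human>:"
END_OF_HUMAN = "<eoh>"
ASSISTANT_TAG = "<Assistant>:"


def build_prompt(message: str) -> str:
    """Wrap one user message in the single-turn prompt format."""
    return f"{HUMAN_TAG} {message.strip()} {END_OF_HUMAN} {ASSISTANT_TAG}"


def extract_reply(decoded: str) -> str:
    """Return the text after the last '<Assistant>:' marker.

    Falls back to the whole string if the marker is missing, for example
    when the tokenizer strips it as a special token during decoding
    (an assumption to check against the actual tokenizer).
    """
    _, marker, reply = decoded.rpartition(ASSISTANT_TAG)
    return reply.strip() if marker else decoded.strip()


if __name__ == "__main__":
    print(build_prompt("Hello"))
    print(extract_reply("<Human>: Hello <eoh> <Assistant>: Hi there!"))
```

If the markers do not survive decoding, a more robust alternative is to decode only the newly generated tokens, i.e. the output IDs after the first `inputs['input_ids'].shape[1]` positions.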

## Limitations and Usage Restrictions

The datasets we used (e.g. [BELLE](https://github.com/LianjiaTech/BELLE)) require that developers use the data for research purposes only.
Commercial use and other potentially harmful uses of our models are therefore not allowed.