This model is based on bigscience/bloom-1b1.

We pruned its vocabulary from 250880 to 46145 with Chinese corpus to reduce GPU memory usage. So the total parameter is 800m now.

How to use

from transformers import BloomTokenizerFast, BloomForCausalLM

tokenizer = BloomTokenizerFast.from_pretrained('Langboat/bloom-800m-zh')
model = BloomForCausalLM.from_pretrained('Langboat/bloom-800m-zh')

print(tokenizer.batch_decode(model.generate(tokenizer.encode('中国的首都是', return_tensors='pt'))))

Downloads last month: 111

Inference Providers NEW

Text Generation

This model is not currently available via any of the supported Inference Providers.

Langboat
/

bloom-800m-zh

How to use

Spaces using Langboat/bloom-800m-zh 2