metadata

license: bigscience-bloom-rail-1.0
language:
  - ak
  - ar
  - as
  - bm
  - bn
  - ca
  - code
  - en
  - es
  - eu
  - fon
  - fr
  - gu
  - hi
  - id
  - ig
  - ki
  - kn
  - lg
  - ln
  - ml
  - mr
  - ne
  - nso
  - ny
  - or
  - pa
  - pt
  - rn
  - rw
  - sn
  - st
  - sw
  - ta
  - te
  - tn
  - ts
  - tum
  - tw
  - ur
  - vi
  - wo
  - xh
  - yo
  - zh
  - zhs
  - zht
  - zu
pipeline_tag: text-generation

BLOOM LM - 8bit

BigScience Large Open-science Open-access Multilingual Language Model - 8bit

Model Card

Version 1.0 / 26.May.2022

Related paper: https://arxiv.org/abs/2208.07339

TL;DR

This repository contains the 8bit weights of the bloom-1b7 model. You can load this model out of the box using transformers==4.28.0 and bitsandbytes>0.37.2:

# pip install accelerate bitsandbytes
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("ybelkada/bloom-1b7-8bit")
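
The version requirements stated above can be checked before loading. The following is a minimal sketch that uses only the standard library. The helper names `installed_version`, `parse_release` and `check_requirements` are hypothetical, and the parser only handles plain numeric releases such as 4.28.0 (suffixes like `.dev0` or `.post1` are ignored), so treat the result as a hint rather than a strict check.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def parse_release(version_string):
    """Turn '4.28.0' into (4, 28, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_requirements():
    """Report whether the versions named in this model card are installed."""
    tf = installed_version("transformers")
    bnb = installed_version("bitsandbytes")
    # The card pins transformers==4.28.0 and requires bitsandbytes>0.37.2.
    return {
        "transformers": tf is not None and parse_release(tf) == (4, 28, 0),
        "bitsandbytes": bnb is not None and parse_release(bnb) > (0, 37, 2),
    }


if __name__ == "__main__":
    print(check_requirements())
```

If either entry is reported as False, installing the versions named above should be the first thing to try.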

How to push 8bit weights?

First, make sure you are using the transformers and bitsandbytes versions stated above. Then load your 8bit model as usual by passing load_in_8bit=True:

# pip install accelerate bitsandbytes
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-1b7", device_map="auto", load_in_8bit=True)

Then call the push_to_hub method, or the save_pretrained method if you want to save your 8bit model locally:

model.push_to_hub("{your_username}/bloom-1b7-8bit")
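
For the local route, a minimal sketch could look like the following. It assumes a CUDA GPU and the package versions stated above. The directory name `bloom-1b7-8bit-local` and the function name `save_and_reload_8bit` are illustrative only. Reloading mirrors the TL;DR snippet, on the assumption that the saved config records the 8bit quantization settings so that load_in_8bit does not need to be passed again.

```python
def save_and_reload_8bit(save_dir="bloom-1b7-8bit-local"):
    """Quantize bloom-1b7 to 8bit, save it to a local directory, and load it back."""
    # Imported lazily so the function can be defined without the GPU stack installed.
    from transformers import AutoModelForCausalLM

    # Quantize on load, exactly as in the snippet above.
    model = AutoModelForCausalLM.from_pretrained(
        "bigscience/bloom-1b7", device_map="auto", load_in_8bit=True
    )

    # Write the 8bit weights and config to disk instead of the Hub.
    model.save_pretrained(save_dir)

    # Assumption: the saved config carries the quantization settings,
    # so the local directory loads like the Hub repository does.
    return AutoModelForCausalLM.from_pretrained(save_dir, device_map="auto")
```

Calling `save_and_reload_8bit()` downloads the base checkpoint, so it is best run once on the machine that will serve the model.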

That's it!

What is inside the model's `state_dict`?

Inside the model's state dict (the pytorch_model.bin file) you will find:

- the quantized int8 weights
- the quantization statistics in float16
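
To make these two kinds of entries concrete, here is a rough NumPy sketch of row-wise absmax quantization, the general idea behind storing int8 weights together with float16 statistics. It is an illustration under assumptions, not the bitsandbytes implementation: the function names, the per-row scaling and the toy matrix are all hypothetical.

```python
import numpy as np


def quantize_rowwise(weight):
    """Quantize a 2-D float matrix to int8 with one absmax statistic per row."""
    absmax = np.abs(weight).max(axis=1, keepdims=True)
    absmax = np.where(absmax == 0, 1.0, absmax)  # avoid dividing by zero on all-zero rows
    scaled = np.round(weight / absmax * 127)
    q = np.clip(scaled, -127, 127).astype(np.int8)
    # The statistic is what lets us map the int8 values back to floats later.
    return q, absmax.astype(np.float16)


def dequantize_rowwise(q, stats):
    """Recover an approximate float32 matrix from int8 values and row statistics."""
    return q.astype(np.float32) * stats.astype(np.float32) / 127


# Toy stand-in for one weight matrix of the model.
weight = np.array([[0.5, -1.0, 0.25],
                   [2.0, 0.0, -4.0]], dtype=np.float32)

q, stats = quantize_rowwise(weight)
restored = dequantize_rowwise(q, stats)

print(q.dtype, stats.dtype)            # dtypes of the two stored tensors
print(np.abs(restored - weight).max())  # worst-case rounding error
```

Each row keeps a single float16 number next to its int8 values, which is why the quantized checkpoint is roughly half the size of a float16 one while remaining recoverable up to rounding error.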