# bloomz
---
license: bigscience-bloom-rail-1.0
datasets:
  - bigscience/xP3
language:
  - ak
  - ar
  - as
  - bm
  - bn
  - ca
  - code
  - en
  - es
  - eu
  - fon
  - fr
  - gu
  - hi
  - id
  - ig
  - ki
  - kn
  - lg
  - ln
  - ml
  - mr
  - ne
  - nso
  - ny
  - or
  - pa
  - pt
  - rn
  - rw
  - sn
  - st
  - sw
  - ta
  - te
  - tn
  - ts
  - tum
  - tw
  - ur
  - vi
  - wo
  - xh
  - yo
  - zh
  - zu
programming_language:
  - C
  - C++
  - C#
  - Go
  - Java
  - JavaScript
  - Lua
  - PHP
  - Python
  - Ruby
  - Rust
  - Scala
  - TypeScript
pipeline_tag: text-generation
widget:
  - text: >-
      一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。Would you rate the
      previous review as positive, neutral or negative?
    example_title: zh-en sentiment
  - text: 一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。你认为这句话的立场是赞扬、中立还是批评?
    example_title: zh-zh sentiment
  - text: Suggest at least five related search terms to "Mạng neural nhân tạo".
    example_title: vi-en query
  - text: >-
      Proposez au moins cinq mots clés concernant «Réseau de neurones
      artificiels».
    example_title: fr-fr query
  - text: >-
      Explain in a sentence in Telugu what is backpropagation in neural
      networks.
    example_title: te-en qa
  - text: Why is the sky blue?
    example_title: en-en qa
  - text: >-
      Write a fairy tale about a troll saving a princess from a dangerous
      dragon. The fairy tale is a masterpiece that has achieved praise worldwide
      and its moral is "Heroes Come in All Shapes and Sizes". Story (in
      Spanish):
    example_title: es-en fable
  - text: >-
      Write a fable about wood elves living in a forest that is suddenly invaded
      by ogres. The fable is a masterpiece that has achieved praise worldwide
      and its moral is "Violence is the last refuge of the incompetent". Fable
      (in Hindi):
    example_title: hi-en fable
---

Repository: bigscience-workshop/bloomz

## Models

bloomz is a multilingual model capable of following user instructions in a variety of languages. Together with our paper [TODO: LINK], we release the following models:

- bloomz-p3: a 176B-parameter version of bloom multitask-finetuned on P3. Released for research purposes; its performance is inferior to that of bloomz.
- bloomz-7b1-p3: a 7.1B-parameter version of bloom-7b1 multitask-finetuned on P3. Released for research purposes; its performance is inferior to that of bloomz-7b1.

## Intended uses

You can use the models to perform inference on tasks by specifying your query in natural language, and the models will generate a prediction. For instance, you can ask "Translate this to Chinese: Je t'aime.", and the model will hopefully generate "我爱你".

## How to use

Here is how to use the model in PyTorch:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "bigscience/bloomz-560m"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer.encode("Is this review positive or negative? Review: this is the best cast iron skillet you will ever buy", return_tensors="pt")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```

To use another checkpoint, replace the checkpoint name passed to `AutoTokenizer.from_pretrained` and `AutoModelForCausalLM.from_pretrained`.

Note: the 176B models were trained with bfloat16, while the smaller models were trained with fp16. We recommend using the same precision at inference time, or fp32.
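As a minimal sketch of the note above, the training precision can be requested at load time via the `torch_dtype` argument of `from_pretrained`; the 560M checkpoint and fp16 choice here are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM

checkpoint = "bigscience/bloomz-560m"  # trained in fp16; the 176B checkpoints use bfloat16

# Load the weights in the precision they were trained in; pass
# torch.float32 instead as a safe default if half precision is
# poorly supported on your hardware.
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.float16)
print(model.dtype)  # torch.float16
```

On a GPU, move the model with `model.cuda()` (or pass `device_map="auto"` with accelerate installed) before generating.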

## Limitations

- Large model sizes may require substantial computational resources.
- Performance can vary considerably depending on the prompt.

## BibTeX entry and citation info

TODO