# BLOOM, a version for Petals

This model is a version of [bigscience/bloom-7b1](https://huggingface.co/bigscience/bloom-7b1) post-processed to be run at home using the [Petals](https://github.com/bigscience-workshop/petals#readme) swarm.

**Note:** Petals is developed to run 100B+ models like the [full-scale BLOOM](https://huggingface.co/bigscience/bloom-petals) or [BLOOMZ](https://huggingface.co/bigscience/bloomz-petals). This model is provided for testing purposes only; it may be more efficient to run the original model locally.

Please check out:

- The [original model card](https://huggingface.co/bigscience/bloom-7b1) to learn about the model's capabilities, specifications, and terms of use.
- The [Petals repository](https://github.com/bigscience-workshop/petals#readme) to learn how to install Petals and run this model over the Petals swarm.

We provide minimal code examples below.

## Using the model

```python
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bigscience/bloom-7b1-petals"
tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)
# Embeddings & prompts are on your device, BLOOM blocks are distributed across the Internet

inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0]))  # A cat sat on a mat...
```

## Serving the model blocks

```bash
python -m petals.cli.run_server bigscience/bloom-7b1-petals
```
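
If your GPU does not have enough memory to hold all of the model's blocks, you can serve only a subset of them. The sketch below uses the server's `--num_blocks` option (see the Petals repository for the full list of server options); the value `8` is only illustrative and should be adjusted to your available GPU memory.

```bash
# Serve a subset of the model's transformer blocks
# (adjust --num_blocks to fit your GPU memory; 8 is an illustrative value)
python -m petals.cli.run_server bigscience/bloom-7b1-petals --num_blocks 8
```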