No longer available, why?

#21
by micole66 - opened

Whyyyyyyyyyyyyyyyyyyyyyyyyyy?

BigScience Workshop org

It costs ~30K USD/month to keep the inference widget running, so we decided to turn it off after the first month. Really sorry :(
You can of course still download the model and run it on your own hardware if you have the resources available.
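For reference, a minimal sketch of what local inference can look like with transformers, using the small bigscience/bloomz-560m checkpoint for illustration (the full 176B bigscience/bloomz follows the same API but needs multi-GPU hardware; swapping the checkpoint name is the only change):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Small BLOOMZ variant used here for illustration only
checkpoint = "bigscience/bloomz-560m"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("Translate to English: Je t'aime.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))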

I like it more than bloom

Same

NOOOOOOOOOO 😭

BigScience Workshop org

:(
On the bright side mt0-xxl & mt0-xxl-mt can still be used via the inference widget. 🤗

Definitely share if you find them more / less useful & if so why 🧐
In my experiments I found them better at following instructions requiring short answers & worse at instructions requiring long answers.
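If you'd rather query them programmatically than through the widget, here is a minimal sketch against the hosted Inference API (same endpoint pattern as the streaming example further down; an HF API token may be needed depending on rate limits):

import requests

# Hosted Inference API endpoint for mt0-xxl; mt0-xxl-mt works the same way
API_URL = "https://api-inference.huggingface.co/models/bigscience/mt0-xxl"
payload = {"inputs": "Translate to English: Je t'aime."}

r = requests.post(API_URL, json=payload)
print(r.json())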

Bloomz knows when to stop; Bloom doesn't.

I also found that Bloomz almost always stops too soon. When summarizing text, it ended after a single sentence, and since it only generated one sentence, it never got the chance to follow the full prompt. I honestly found Bloom more helpful: it handled longer prompts well, especially few-shot prompts, whereas Bloomz seems to only work with short Q&A prompts. I do hope that as it keeps improving, Bloomz will become more diverse in capability.

Probably because of the xP3 dataset, I think. Most of the answers in that dataset are short.

TimeRobber changed discussion status to closed
TimeRobber changed discussion status to open
BigScience Workshop org
edited Jan 18, 2023

Now you can run inference and fine-tune BLOOMZ (the 176B English version) using the Petals swarm.

You can use BLOOMZ via this Colab notebook to get an inference speed of 1-2 sec/token for a single sequence. Running the notebook on a local machine also works; you'd only need 10+ GB of GPU memory or 12+ GB of RAM (though it will be slower without a GPU).

Note: Don't forget to replace bigscience/bloom-petals with bigscience/bloomz-petals in the model name.

As an example, there is a chatbot app running BLOOMZ this way.
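For reference, a rough sketch of what Petals inference looked like at the time; treat the class name DistributedBloomForCausalLM as a snapshot of the early-2023 petals API, which may have changed since:

from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

# bloomz-petals is the Petals-ready copy of the 176B BLOOMZ checkpoint
model_name = "bigscience/bloomz-petals"
tokenizer = BloomTokenizerFast.from_pretrained(model_name)
# Model blocks are served by the public swarm, not downloaded whole
model = DistributedBloomForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Why is the sky blue?", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))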

BigScience Workshop org
edited Mar 2, 2023

Bloomz is back and even stronger than before. You can now do token streaming:

pip install sseclient-py (do NOT install sseclient, be sure to install sseclient-py)

import sseclient
import requests

prompt = "Why is the sky blue? Explain in a detailed paragraph."
parameters = {"max_new_tokens": 200, "top_p": 0.9, "seed": 0}
options = {"use_cache": False}

# stream=True asks the Inference API to return server-sent events (SSE)
payload = {"inputs": prompt, "stream": True, "parameters": parameters, "options": options}

# Keep the HTTP connection open so tokens arrive as they are generated
r = requests.post("https://api-inference.huggingface.co/models/bigscience/bloomz", stream=True, json=payload)
sse_client = sseclient.SSEClient(r)

# Each SSE event carries the next generated token
for i, event in enumerate(sse_client.events()):
    print(i, event.data)
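
If the anonymous endpoint rate-limits you, you can authenticate the request by passing your HF API token, e.g. headers={"Authorization": "Bearer <your token>"} in the requests.post call.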
