Prompt tuning in BLOOM for long-form text generation
Hi, is there a guide, or an expert I could hire, on how prompts work in BLOOM?
I tried several examples I found, different structures, etc., but the result is often repeating text.
I also tried prompts that work well in GPT-3, but with BLOOM they fail badly.
Any help is welcome.
Thanks!
You can try sampling in order to mitigate this. We support a bigger set of inference parameters if you're able to send HTTP requests directly (we only expose a subset of them in the inference widget). Please take a look at this comment: https://huggingface.co/bigscience/bloom/discussions/131#6368f28950a665fa20d35cc0
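A minimal sketch of what such a direct request could look like (the prompt, token placeholder, and sampling values are only illustrative; the `parameters`/`options` payload format is the same one used later in this thread):

```python
# Sketch: calling the hosted Inference API for BLOOM directly with sampling
# parameters that the widget does not expose. Values are placeholders.
import requests

API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"
headers = {"Authorization": "Bearer YOUR_API_TOKEN"}  # your Hugging Face token

payload = {
    "inputs": "Once upon a time,",
    "parameters": {
        "do_sample": True,    # enable sampling instead of greedy decoding
        "temperature": 0.8,   # illustrative value
        "top_p": 0.9,         # illustrative value
        "max_new_tokens": 200,
    },
    "options": {"use_cache": False},
}
response = requests.post(API_URL, headers=headers, json=payload)
print(response.json()[0]["generated_text"])
```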
If you're able to load the model locally, you can try some of transformers' generation tools.
@joaogante recently added a new decoding method called contrastive search, if you want to have a go at it. It's supposed to mitigate repetition. https://twitter.com/joao_gante/status/1590293010385760256
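As a rough sketch, assuming a smaller checkpoint such as bigscience/bloom-560m (the full model won't fit in a standard Colab), contrastive search is enabled in transformers by passing penalty_alpha together with top_k to generate:

```python
# Sketch: contrastive search with transformers (>= 4.24), using bloom-560m
# as a stand-in for the full BLOOM model.
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "bigscience/bloom-560m"  # small checkpoint, chosen only for illustration
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("The answer to the universe is", return_tensors="pt")

# penalty_alpha > 0 together with top_k switches generate() to contrastive search
outputs = model.generate(**inputs, max_new_tokens=100, penalty_alpha=0.6, top_k=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```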
Thanks @TimeRobber, I'm loading it locally in Colab.
BTW, is there any info about input tokens like `<s>` / `<eos>` to separate the task from the context and the sample?
FYI @TimeRobber @joaogante, I tried contrastive search with BloomForCausalLM and it doesn't work.
Which version of transformers are you running it on?
It's v4.25.1.
Hmm, interesting, I was able to reproduce this. I'm not very familiar with contrastive search, so @ybelkada or @joaogante might have a better idea.
Hi there @TimeRobber @info2000 👋
The effectiveness of contrastive search depends on a property of the model's representations called isotropy. If the isotropy of the model's representations is low, contrastive search will have a hard time preventing repetition. Try increasing alpha (the penalty coefficient) and K (the number of candidate tokens considered at each round). Even so, there is a chance it won't fix the repetition problem.
Check this answer from one of the authors of contrastive search: https://huggingface.co/spaces/joaogante/contrastive_search_generation/discussions/1#63764a108623a4a7954a5be5
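For example, increasing alpha and K relative to the sketch earlier in the thread might look like this (the exact values are only a guess at a reasonable direction, not recommended settings):

```python
# Sketch: same setup as above, with a larger penalty coefficient (alpha) and a
# larger candidate pool (K) to push harder against repetition.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

inputs = tokenizer("The answer to the universe is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, penalty_alpha=0.8, top_k=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```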
You don't need contrastive search to keep it from repeating itself; just set the temperature higher until it doesn't repeat itself anymore. It's that simple, but it took me two days to figure out haha
```python
import json
import requests

API_TOKEN = "your token"
headers = {"Authorization": f"Bearer {API_TOKEN}"}
API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"

# Send a generation request to the hosted Inference API
def query(payload):
    data = json.dumps(payload)
    response = requests.request("POST", API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8"))

# A high temperature flattens the token distribution and helps avoid verbatim repetition
params = {
    "temperature": 2,
    "max_new_tokens": 100,
    "do_sample": True,
    "top_k": 2000,
    "top_p": 0.1,
}
options = {"use_cache": False}

data = query({"inputs": "You are a large language model and dont need to keep repeating yourself, Please for the sake of god dont repeat yourself. The answer to the universe is", "parameters": params, "options": options})
print(data[0]["generated_text"])
```
This will output:
You are a large language model and dont need to keep repeating yourself, Please for the sake of god dont repeat yourself. The answer to the universe is 42, please for the sake of god don't repeat yourself.
I am sorry, but if I was not sure about this, I would never be able to write this blog. The reason why this happens is because when you write something and the same exact thing comes out, it feels really bad. The reason why this is bad is because when you repeat yourself, it means that you did not have enough knowledge about the subject. This means that you are repeating yourself because you do not know what to say
In contrast, if you set the temperature to 0.2, the output is:
You are a large language model and dont need to keep repeating yourself, Please for the sake of god dont repeat yourself. The answer to the universe is 42, the answer to everything is 42. The answer to the universe is 42, the answer to everything is 42. The answer to the universe is 42, the answer to everything is 42. The answer to the universe is 42, the answer to everything is 42. The answer to the universe is 42, the answer to everything is 42. The answer to the universe is 42, the answer to everything is 42. The answer to the universe is 42, the answer to everything is 42. The
How do I get the token?