Unable to edit max_length

#27
by amuhak - opened

When using longer prompts I get:
input length of input_ids is 33, but `max_length` is set to 20. This can lead to unexpected behavior. You should consider increasing `max_new_tokens`.

Setting max_length=1000 doesn't seem to do anything.

And setting max_new_tokens=100 gives me this:

_batch_encode_plus() got an unexpected keyword argument 'max_new_tokens'

What am I doing wrong?

max_new_tokens is an argument of model.generate(), not of the tokenizer (which is why passing it to the tokenizer raised the _batch_encode_plus() error). You can change it per call for different prompts while reusing the same tokenizer and model.

See the snippet below as an example:

from transformers import BloomForCausalLM, BloomTokenizerFast

checkpoint = "bigscience/bloomz-7b1-mt"

tokenizer = BloomTokenizerFast.from_pretrained(checkpoint)
model = BloomForCausalLM.from_pretrained(
    checkpoint, torch_dtype="auto", device_map="auto"
)

prompt = "your prompt"
# tokenize the prompt and move the input ids to the GPU the model was placed on
inputs = tokenizer.encode(prompt, return_tensors="pt").to("cuda")
# generate at most 20 new tokens on top of the prompt
outputs = model.generate(inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))

See max_new_tokens=20. This is equivalent to setting max_length to the prompt length in tokens plus 20, which lets the model output an additional 20 tokens no matter how long your prompt is.
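For reference, here is a minimal sketch of the equivalent call using max_length, reusing the inputs tensor from the snippet above (the 20-token budget is just an example):

input_length = inputs.shape[1]  # number of tokens in the prompt
# same budget as max_new_tokens=20, expressed as a total sequence length
outputs = model.generate(inputs, max_length=input_length + 20)
print(tokenizer.decode(outputs[0]))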

I had the same error message, but since I was training I wasn't directly calling model.generate as above.
To get around this, I changed the value of model.config.max_length before beginning training.

I am not sure if this is technically correct, but it did get rid of the same warning you mentioned.
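Roughly, the workaround was just the following (a sketch; 512 is an arbitrary illustrative value, not something prescribed by the thread):

# raise the default maximum generation length stored on the model config
# before starting training; 512 is only an example value
model.config.max_length = 512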

amuhak changed discussion status to closed
