Generate a little bit of content at a time
#26
by loong · opened
You need to increase `max_length`. For example:

```python
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b-it",
    quantization_config=quantization_config,
    max_length=200,
)
```
Indeed, make sure to pass a large enough `max_new_tokens` to `generate()`.
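A minimal sketch of the distinction, assuming the `transformers` library is installed: `max_length` caps the prompt plus the generated tokens together, while `max_new_tokens` caps only the newly generated tokens. Setting the limit on a `GenerationConfig` (rather than at `from_pretrained` time) keeps it next to the `generate()` call; loading the actual Gemma weights is omitted here since it requires gated access.

```python
from transformers import GenerationConfig

# max_length counts prompt + completion, so a long prompt can leave almost
# no room for new text. max_new_tokens bounds only the continuation.
config = GenerationConfig(max_new_tokens=512)
print(config.max_new_tokens)

# With a loaded model and tokenized inputs, this config is then passed as:
#   outputs = model.generate(**inputs, generation_config=config)
# so generation is no longer cut off by the default length limit.
```

Passing `max_new_tokens=512` directly as a keyword argument to `generate()` has the same effect.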
osanseviero changed discussion status to closed